Merged
Conversation
Introduce twinID parameter across pedigree segmentation functions (ped2fam/.ped2id/ped2graph/ped2maternal/ped2paternal) and thread it through calls so twin IDs are considered when building family/graph structures. Add mz_twins logical option to ped2com (default FALSE) and, when TRUE and twinID is present, call addMZtwins(ped) to treat MZ twins as an additional parent–child relationship for relatedness computations. Also fix a minor typo in a comment.
* Fix MZ twins coded as 0.5 instead of 1.0 in relatedness matrix Implement addMZtwins() which redirects one MZ co-twin's parent links to point to the other twin before adjacency matrix construction. This produces isPar[twin2, twin1] = 1.0 in the sparse matrix (two 0.5 entries summed), so path tracing yields relatedness 1 between MZ pairs. Users provide a twinID column (and optionally zygosity) and pass mz_twins=TRUE to ped2add()/ped2com(). DZ twins are left unchanged when zygosity column is present. https://claude.ai/code/session_01P3RQTYpWtAtheSqi4aPjR5 * Replace pedigree redirection with r2 column merge for MZ twins The previous approach redirected twin2's parents to twin1 in the pedigree, which inflated twin2's diagonal (1.5 instead of 1.0) and twin2-to-child relatedness (0.75 instead of 0.5). New approach: after path tracing but before tcrossprod, merge twin2's column into twin1's in the r2 matrix. MZ twins share the same genetic source, so this correctly produces: - MZ twin relatedness = 1.0 - Self-relatedness = 1.0 (no inflation) - Parent-child and all downstream values correct - No post-hoc diagonal patching needed https://claude.ai/code/session_01P3RQTYpWtAtheSqi4aPjR5 * Use symmetric column merge for MZ twins instead of zeroing Both twin columns now get the same normalized values (r2_merged = (col1 + col2) / sqrt(2)) so both twins remain present and contribute equally. Produces the same final relatedness matrix as the zero approach but without erasing one twin from the genetic source matrix. https://claude.ai/code/session_01P3RQTYpWtAtheSqi4aPjR5 * Merge-then-restore approach for MZ twins Temporarily absorb twin2's column into twin1's before tcrossprod, then copy twin1's row/col back to twin2 afterward. This keeps the computation correct while ensuring neither twin is erased from the final relatedness matrix. https://claude.ai/code/session_01P3RQTYpWtAtheSqi4aPjR5 * Accept lowercase 'mz' and add MZ twin tests Treat both "mz" and "MZ" as monozygotic in findMZtwins (zygosity check now uses %in% c("mz","MZ")). Minor formatting tweak to the verbose message. Added unit tests (tests/testthat/test-buildComponent.R) verifying that MZ twins are coded with relatedness 1 when mz_twins=TRUE, that siblings remain 0.5 when mz_twins=FALSE, self-relatedness stays 1, and parent-child relatedness is unchanged. * fix tests --------- Co-authored-by: Claude <noreply@anthropic.com>
Introduce mz_method (default "merge_before_tcrossprod") to ped2com and gate MZ twin column-merge/restore logic behind this option. Restrict the merge/restore behavior to the additive component, adjust verbose messages, and add a TODO outlining an alternative MZ handling approach. Update tests to cover MZ twin child relatedness cases and add clarifying comments for ped2fam string/numeric ID behavior.
Expose a new mz_method argument in ped2add (default 'addtwins') and forward it to ped2com. Update tests to pass mz_method = 'merging' when mz_twins is used so alternative MZ-twin handling is exercised. This lets callers choose the method for handling MZ twins without changing the default behavior.
Contributor
There was a problem hiding this comment.
Pull request overview
Adds support for handling monozygotic (MZ) twins in relatedness-matrix generation (and associated documentation/tests), along with minor test/doc updates related to simulation “beta/optimized” modes.
Changes:
- Introduce
mz_twins/mz_methodplumbing inped2com()/ped2add()and add an internalfindMZtwins()helper. - Extend pedigree segmentation wrappers to accept a
twinIDargument (documentation updated accordingly). - Expand/adjust test coverage for MZ twin behavior and update simulation beta-mode expectations/docs.
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 12 comments.
Show a summary per file
| File | Description |
|---|---|
| R/buildComponent.R | Adds MZ twin handling options to relatedness computation (mz_twins, mz_method). |
| R/constructAdjacency.R | Adds internal findMZtwins() helper used to locate MZ co-twin pairs. |
| R/segmentPedigree.R | Adds twinID parameter plumbed through family/line segmentation wrappers and ped2graph(). |
| tests/testthat/test-buildComponent.R | Adds tests asserting MZ co-twin relatedness = 1 (and downstream effects) for two methods. |
| tests/testthat/test-simulatePedigree.R | Adjusts simulation expectations for beta/optimized behavior and formatting. |
| tests/testthat/test-segmentPedigree.R | Adds clarifying comments for numeric vs string ID behavior in ped2fam(). |
| man/ped2com.Rd | Documents new mz_twins argument (but not mz_method). |
| man/ped2add.Rd | Documents new mz_twins argument (but not mz_method). |
| man/findMZtwins.Rd | Adds internal documentation page for findMZtwins(). |
| man/ped2fam.Rd | Adds twinID parameter documentation. |
| man/ped2graph.Rd | Adds twinID parameter documentation. |
| man/ped2maternal.Rd | Adds twinID parameter documentation. |
| man/ped2paternal.Rd | Adds twinID parameter documentation. |
| man/simulatePedigree.Rd | Expands beta documentation to include character aliases and reproducibility notes. |
| man/buildWithinGenerations.Rd | Expands beta documentation similarly for within-generation building. |
| man/buildBetweenGenerations.Rd | Expands beta documentation similarly for between-generation building. |
| man/adjustKidsPerCouple.Rd | Expands beta documentation similarly for kids-per-couple adjustment. |
Add cross-method validations for MZ twin handling. In data-raw/optimizing.R create r_mz1 and r_mz2 using ped2add with mz_method 'merging' and 'addtwins' and assert their sparse matrix internals (@i and @x) match. Update tests/testthat/test-buildComponent.R to loop over mz_method options when verifying parent/child relatedness for MZ twins and add an explicit equality assertion between the two method results. These changes ensure different mz_method implementations produce equivalent relatedness outputs.
fb20cde to
4d841f0
Compare
56479d0 to
8c0d792
Compare
Allow mz_method to be either "merging" or "addtwins" in key pedigree-processing branches and only perform parent redirection when mz_method == "merging". Move verbose message into the merging branch and simplify restoration checks for MZ pairs. Add a new test that verifies equivalence of additive matrices for merging vs addtwins (including sparse-matrix slot checks). Also fix an expectation bound in simulatePedigree tests. Files changed: R/buildComponent.R, tests/testthat/test-buildComponent.R, tests/testthat/test-simulatePedigree.R.
Create mz_id_pairs earlier when MZ twin pairs exist and streamline the merging branch to only run for mz_method == "merging". Remove redundant mz_id_pairs construction, keep logic that makes the second twin a founder and redirects its children to the first twin, and preserve the verbose merge message. Add object.size() calls in the test to inspect memory usage of intermediate matrices.
) Two issues caused the sparse matrices from the two mz_method paths to differ structurally: 1. The merging method iterated over mz_pairs (row indices) but treated the values as IDs. Changed to iterate over mz_id_pairs which holds the actual pedigree IDs. Currently masked because test data has sequential IDs matching row positions, but would break with non-sequential IDs. 2. The different computational paths (column-merge-before-tcrossprod vs pedigree-modify-then-post-copy) leave different explicit zeros in the sparse matrix. Added Matrix::drop0() before returning to normalize the sparse representation so both methods produce structurally identical results. https://claude.ai/code/session_01PXbgV7bdA9sg7St6xBbg3T Co-authored-by: Claude <noreply@anthropic.com>
Only restore MZ twin rows/cols in buildComponent when mz_method is merging/addtwins AND the component is 'additive' (buildComponent.R). In constructAdjacency.R adjust the .adjIndexed parameter layout and add a new isTwin helper to detect twins from the pedigree. Update tests to call ped2add without explicit mz_method, comment out object.size checks, and compare sparse adjacency matrices by converting to dense (as.matrix) and asserting their difference sums to zero to avoid incorrect sparse-matrix summation.
* Fix twin matrix size mismatch between addtwins and merging methods Two issues caused the sparse matrices from the two mz_method paths to differ structurally: 1. The merging method iterated over mz_pairs (row indices) but treated the values as IDs. Changed to iterate over mz_id_pairs which holds the actual pedigree IDs. Currently masked because test data has sequential IDs matching row positions, but would break with non-sequential IDs. 2. The different computational paths (column-merge-before-tcrossprod vs pedigree-modify-then-post-copy) leave different explicit zeros in the sparse matrix. Added Matrix::drop0() before returning to normalize the sparse representation so both methods produce structurally identical results. https://claude.ai/code/session_01PXbgV7bdA9sg7St6xBbg3T * Fix sparse matrix class mismatch between MZ twin methods The merging method's post-copy (r[idx2,] <- r[idx1,]) causes Matrix to coerce from dsCMatrix (symmetric, one triangle) to dgCMatrix (general, both triangles), doubling @i/@x slot lengths. Add forceSymmetric() after the post-copy to restore symmetric storage. Also guard the merging pedigree modification with a component check (additive only) to match the restoration block, and move the verbose message inside the merging if-block so it only prints when merging actually occurs. https://claude.ai/code/session_01PXbgV7bdA9sg7St6xBbg3T * Fix test assertion: use sum(A - B) instead of sum(A, B) sum(A, B) sums all elements of both matrices, which can never be 0 for non-negative relatedness values. Use sum(A - B) to check that the element-wise differences are zero. https://claude.ai/code/session_01PXbgV7bdA9sg7St6xBbg3T --------- Co-authored-by: Claude <noreply@anthropic.com>
Refactor and expand the optimize/benchmark script to better compare mz handling and beta variants: changed seed, increased kpc, marR and reps; pre-generate pedigrees for 1/low/mid/high generation sizes; add cases for mz_method ("merging" vs "addtwins") and beta_null; adjust gen factor mapping and use beta_factor in regressions and plots. Also added assertions in simulatePedigree tests to verify that parents exist for all generations >1 (and none in generation 1), fixed a spacing/logic expression in an existing length check, and kept existing sex-ratio and generation checks. These changes improve performance analysis coverage and strengthen structural tests.
7970283 to
16ae62c
Compare
Fix call in buildComponent.R to pass mz_row_pairs through to fuseTwins. Enhance fuseTwins by adding a test_df_twins parameter to return the intermediate df_twins for testing, refactoring the fusion-condition logic, validating inputs, and emitting clearer messages when no MZ twin pairs are found. Add unit tests (tests/testthat/test-helpTwins.R) to exercise findMZtwins/fuseTwins with different return modes and ensure consistent outputs.
b725942 to
cd677af
Compare
Introduce a beta benchmarking flag and streamline MZ twin handling across the package: remove per-pair loops in ped2com in favor of batch operations, expose beta in documentation (R and Rd updates), and ensure rows/cols are copied in batch. Add a new fuseTwins helper (with man page) to support fusing MZ pairs for path tracing, and expand findMZtwins/help docs to describe return formats and the beta option. Update data-raw/optimizing.R to incorporate the beta factor into models/plots, comment out some high-gen benchmark cases, switch to boxplots, and adjust plot scales. Remove an unused vignette figure PNG.
ec6e074 to
9d45d96
Compare
Apply code-style and whitespace cleanup across twin-related code and tests for readability and consistency. Changes include reflowing long function calls onto multiple lines, normalizing comma/paren placement, adding braces around single-line if/next for clarity, and removing extraneous blank lines. Updated files: R/buildComponent.R, R/helpTwins.R, and tests/testthat/test-buildComponent.R and test-helpTwins.R. These edits are non-functional and intended only to improve maintainability and style.
b4e3b8e to
db03121
Compare
db03121 to
ba12485
Compare
6825c1e to
51c5f50
Compare
b72be1f to
9dccf8a
Compare
d05f664 to
10fcfc2
Compare
Allow specifying only one twin or mate ID and auto-find a suitable partner. makeTwins now builds an ID-to-row map for O(1) lookups and supports cases where only ID_twin1 or only ID_twin2 is provided (selects sibling by zygosity/sex). makeInbreeding gains symmetric logic to auto-select ID_mate2/ID_mate1 when only one mate is given. dropLink standardizes pedigree column names (with an optional verbose flag) before processing. Updated tests to cover these single-ID behaviors and adjusted an existing MZ twin test; NEWS updated to mention the tweakPedigree change.
7c99169 to
6ae78ef
Compare
Allow specifying only one twin or mate ID and auto-find a suitable partner. makeTwins now builds an ID-to-row map for O(1) lookups and supports cases where only ID_twin1 or only ID_twin2 is provided (selects sibling by zygosity/sex). makeInbreeding gains symmetric logic to auto-select ID_mate2/ID_mate1 when only one mate is given. dropLink standardizes pedigree column names (with an optional verbose flag) before processing. Updated tests to cover these single-ID behaviors and adjusted an existing MZ twin test; NEWS updated to mention the tweakPedigree change.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Pull Request Template
Description
Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.
Fixes # (reference issue number here, if any)
Type of change
Please delete options that are not relevant (including this line).
How Has This Been Tested?
Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration
Test Configuration:
Checklist:
Requirements Met:
Recommendations Met:
While these are not absolutely required for a contribution via pull request, we
do encourage these items are completed.