Remap individual ids after simplification #1186
|
Er, yes, actually, I think that basically does it. There'll be some subtleties, but that should be the fundamental change we need, since we're already remapping the individual IDs. What we really want here to test this stuff out is a way of simulating individuals in the simple forward simulator we have. I think we should be able to add an option. That way we can get some decent-sized test data without having to hand-craft tree sequences. |
Codecov Report
@@ Coverage Diff @@
## main #1186 +/- ##
=======================================
Coverage 93.72% 93.72%
=======================================
Files 26 26
Lines 21507 21511 +4
Branches 904 904
=======================================
+ Hits 20157 20161 +4
Misses 1312 1312
Partials 38 38
|
I've had a stab at that in 0ad085b adding the same level of testing that migrations have - which seems to just be counting the rows. |
Nice! I'd need to take the code for a spin to be sure I know what's happening, but can't right now (should have started this presentation yesterday...) |
No worries - I'm working on the C side now. |
|
Working on it now - will tack in a commit with some updates to the Python code. |
|
@benjeffery I pushed a commit with some updates. There was a slight logic bug in the individual ID remapping and I fixed up the tests a bit. The test refactoring could probably be done better, but it's a big improvement over what it was anyway. |
|
So the tests to be done are now to see if the identity relationships between nodes and individuals are retained through simplification. The easiest way to do that I think is to bung in some metadata into both, simplify, and then see if every node's individual has the same metadata as the node. Then, for those parents that happen to survive simplification (which probably won't be many, most will be associated with unary nodes), the parent relationships should be the same. For |
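A minimal sketch of the metadata check described above, in plain Python rather than against the real tables API. The function name `mismatched_nodes` and the flat-list layout (one entry per node or individual, `-1` for "no individual") are assumptions made for illustration only. The idea is that each node and its individual were given identical metadata before simplification, so any node whose individual carries different metadata afterwards points at a remapping bug.

```python
NULL = -1  # stand-in for a "no individual" reference (assumption of this sketch)


def mismatched_nodes(node_individual, node_metadata, individual_metadata):
    """Return the IDs of nodes whose individual has different metadata.

    node_individual[u] is the individual ID of node u (or NULL),
    node_metadata[u] is the metadata written to node u before simplifying,
    individual_metadata[i] is the metadata of individual i after simplifying.
    """
    bad = []
    for u, ind in enumerate(node_individual):
        if ind == NULL:
            continue  # nodes without an individual have nothing to compare
        if individual_metadata[ind] != node_metadata[u]:
            bad.append(u)
    return bad


# Toy "post-simplify" tables: two nodes per individual, one node with none.
node_individual = [0, 0, 1, 1, NULL]
node_metadata = [b"ind-a", b"ind-a", b"ind-b", b"ind-b", b""]
individual_metadata = [b"ind-a", b"ind-b"]

consistent = mismatched_nodes(node_individual, node_metadata, individual_metadata)

# The same tables with node 0 wrongly pointing at individual 1.
broken = mismatched_nodes([1, 0, 1, 1, NULL], node_metadata, individual_metadata)
```

The same pattern would extend to parent relationships: store a label for each individual's parents in its metadata up front, then compare against the metadata of whichever parents survive.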
14ae365 to e78192c
|
This is going to be so much easier to test now that we have metadata! |
4acb3a5 to 678b6b3
|
I've added some tests using the forward sim. |
|
Ok, there is one last thing to decide here - in #1192 @jeromekelleher suggested simplify should cope with unordered individuals. The current re-referencing code in this PR only works with sorted individuals, because it remaps the references in the same pass that builds the new table. It's no biggie to add another pass over the individual table though. |
|
@petrelharp, your call here. A second linear pass through the individuals is no biggie IMO, and probably worth not enforcing a full topological sort on the input data for. |
|
Sounds good - let's do the second pass. SLiM currently might have individuals before their parents (because the 'remembered' ones come up top of the table, so if a child is remembered and its parent is not...), and we could change that, but it might actually be a can of worms (because (a) if a child gets remembered before a parent, they'd be out of order; or (b) when we output, we stick the current generation on to the individual table, with the remembered individuals up top; to enforce the order in these two situations we'd have to do the topological sort...). |
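A rough sketch of the two-pass idea in plain Python, not the code in this PR. The name `remap_individuals`, the list-of-tuples layout for parents, and the choice to map parents that are not retained to `-1` are all assumptions of the sketch. The first pass assigns new IDs to the retained individuals; the second rewrites parent references once every new ID is known, so a child listed before its parent is handled without any topological sort.

```python
NULL = -1  # stand-in for a null individual reference (assumption of this sketch)


def remap_individuals(parents, keep):
    """Remap individual IDs and parent references in two linear passes.

    parents[i] is a tuple of parent IDs for individual i (NULL if unknown).
    keep lists the old IDs to retain, in the order they should be output.
    Returns (id_map, new_parents), where id_map[old] is the new ID or NULL.
    """
    # Pass 1: assign new IDs to retained individuals, in output order.
    id_map = [NULL] * len(parents)
    kept = []
    for old_id in keep:
        if id_map[old_id] == NULL:
            id_map[old_id] = len(kept)
            kept.append(old_id)

    # Pass 2: rewrite parent references now that the full map exists.
    new_parents = []
    for old_id in kept:
        row = tuple(NULL if p == NULL else id_map[p] for p in parents[old_id])
        new_parents.append(row)
    return id_map, new_parents


# Individual 0 is a child of individual 2, but comes first in the table.
parents = [(2, NULL), (NULL, NULL), (NULL, NULL)]

id_map, new_parents = remap_individuals(parents, keep=[0, 2])
dropped_map, dropped_parents = remap_individuals(parents, keep=[0])
```

A single pass that remaps parents while appending rows would see individual 0 before individual 2 had been given a new ID, which is the ordering problem discussed above.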
fdaa464 to 307226e
|
@jeromekelleher I've added tests with shuffled individuals (that triggered |
jeromekelleher
left a comment
LGTM, some minor suggestions to reduce the amount of code.
8ac388f to 9e62d17
|
@jeromekelleher Comments addressed, thanks for pointing out the existing shuffle method, I should have had a look for one! |
jeromekelleher
left a comment
LGTM!
9e62d17 to 40c4b80
@jeromekelleher Just trying to get my head round this by adding it to the Python simplify first. Is it this simple if we aren't worrying about consistency with the genetic info?