-
Notifications
You must be signed in to change notification settings - Fork 78
Mutation inherited state #3277
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mutation inherited state #3277
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3277 +/- ##
=======================================
Coverage 89.80% 89.81%
=======================================
Files 29 29
Lines 32612 32665 +53
Branches 5974 5982 +8
=======================================
+ Hits 29288 29338 +50
- Misses 1884 1886 +2
- Partials 1440 1441 +1
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
617903d to
e594e8e
Compare
jeromekelleher
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Slightly concerned about tacking in a ref to the tree sequence to a dataclass here. These lightweight objects might want to be pickled for parallel processing, e.g.. Perhaps better to just compute and store the value at instantiation time?
python/tskit/trees.py
Outdated
| time=time, | ||
| edge=edge, | ||
| metadata_decoder=self.table_metadata_schemas.mutation.decode_row, | ||
| tree_sequence=self, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm - what does this mean for serialisation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I copied the pattern from Individual, which we've been using since 8a8e61e in 2022. These can still be pickled as the reference is followed, and the TS pickled too. If you are pickling a collection of these, pickle is smart enough to only store the ts once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would find it surprising that the whole ts gets pickled if I just wanted to pass a simple mutation though. It it that bad if we just store an extra attribute on the object?
I'm not sure the individual thing was such a good idea looking back - we don't have to fix it now, but I don't want to reinforce a questionable decision there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I get what you're saying. Let me think about the options and I'll make a proposal here.
e594e8e to
d9fec2c
Compare
|
@jeromekelleher Hows this? We pre-compute, delaying any errors from that to when the property is actually accessed. I need to add a couple more tests for coverage but I'm pretty happy with this. |
d9fec2c to
e3f2068
Compare
|
Ack, it all gets very messy! Maybe it's easier just do this right, starting at the bottom like this? #3279 Let's back out of the changes to Individual, it's not worth getting into |
2830801 to
b335fe7
Compare
|
@jeromekelleher I've pushed it down to the C API. Now that it is pre-computed in C, we might as well pass the array versionup too? |
jeromekelleher
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Let's update the C API changelog as we're here so we don't forget
c/tskit/trees.c
Outdated
| - sites_ancestral_state_offset[site_id]; | ||
| } else { | ||
| /* Has parent: inherited state is parent's derived state */ | ||
| tsk_id_t parent_id = mutation_parent[mutation_id]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Push variable declaration to the top of the function for consistency
If you like - I doubt it makes any difference though as we'll essentially be doing the same thing (it's still making a copy of the data for the numpy C string dtype). |
24abce7 to
e4e7bff
Compare
|
@jeromekelleher I've moved the array down to CPython, and I've also added it to the JIT class. |
|
Out of curiosity, could you do a timing test on the numpy vs C Python way of computing this array? |
For chr21 of the Quebequios: |
jeromekelleher
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Very nice
e4e7bff to
dec1fa3
Compare
dec1fa3 to
f210492
Compare
Fixes #2631