Refactor python vargen #2172
Codecov Report
@@ Coverage Diff @@
## main #2172 +/- ##
=======================================
Coverage 93.28% 93.28%
=======================================
Files 27 27
Lines 26059 26077 +18
Branches 1163 1165 +2
=======================================
+ Hits 24308 24325 +17
- Misses 1721 1722 +1
Partials 30 30
Here are some timings for iterating through all the variants in this tree:
Somewhat disappointing; I'm not sure where the extra time is coming from yet.
jeromekelleher
left a comment
Basically LGTM
We should either nuke the low level VariantGenerator object as part of this, or register an issue for it so we don't forget.
Oh, I missed this. Hmm, interesting. Try running through
On reflection, I realised this isn't a fair comparison as [...] So this means that we've clawed back to much closer to the 8-bit genotypes, which is nice.
Aha, great.
@jeromekelleher I think this is ready to go - the breakage in the test suite was more than I expected, mostly due to removing
jeromekelleher
left a comment
Looks great! Some minor comments.
python/CHANGELOG.rst
  that applying a schema to an existing table will no longer necessitate modifying the
  existing rows. (:user:`benjeffery`, :issue:`2064`, :pr:`2104`)

- ``tree.mrca`` now takes 2 or more arguments.
Not related, but Tree.mrca isn't a breaking change, it's a new feature, so we may as well move it while we're updating.
- ``tree.mrca`` now takes 2 or more arguments.
  (:user:`savitakartik`, :issue:`1340`, :pr:`2121`)

- Remove the previously deprecated ``as_bytes`` argument to ``TreeSequence.variants``.
I think it's super old, but it would be good to be quantitative about when it was deprecated (version or date, I guess)?
Changing something like this does make this a major version bump, so I guess we should update our milestones accordingly (either 0.5 or 1.0 I guess)
We should give some guidance on what people should do as well I suppose. If you do use as_bytes, how do you fix your code now?
It's from the msprime days! 983d969
OK, we don't need to worry too much then.
I guess the functionality is actually pretty handy though, so I've opened a new issue #2181. We should update this note to tell users to use this method instead. The as_macs method can then use this function too.
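For anyone still relying on as_bytes in the meantime, here is a rough sketch of a stopgap. It assumes the removed option returned one ASCII digit per sample, which is not confirmed anywhere in this thread, and `genotypes_to_bytes` is a hypothetical helper name rather than anything in tskit or the method proposed in #2181.

```python
def genotypes_to_bytes(genotypes):
    """Encode integer allele indexes as ASCII digits, e.g. [0, 1, 1] -> b"011".

    Hypothetical helper, not part of tskit. Only allele indexes 0-9 fit in a
    single digit, so anything else (including the -1 used for missing data)
    raises ValueError.
    """
    encoded = bytearray()
    for g in genotypes:
        g = int(g)  # also accepts NumPy integer scalars
        if not 0 <= g <= 9:
            raise ValueError(f"cannot encode genotype {g} as a single digit")
        encoded.append(ord("0") + g)
    return bytes(encoded)


# Stand-in for the genotypes array of a single variant (assumed values).
example = [0, 1, 1, 0, 2]
haplotype_column = genotypes_to_bytes(example)
```

In real code the input would be the genotypes array of each variant yielded by TreeSequence.variants; the plain list above only keeps the sketch self-contained.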
@jeromekelleher All fixed up - needs a squash before merge.
jeromekelleher
left a comment
LGTM!
See new issue for the proposed replacement for as_bytes
Stacked on #2169
A proposal for how TreeSequence.variants and the Variants class will look. No docs or any changes to tests yet. Note that the as_bytes option has been removed for now - we can easily reinstate it. The other breaking change is that the genotype array on Variant is now read-only. I think this makes sense, it technically could be writeable though.
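To illustrate what the read-only change means for calling code, here is a small sketch using plain NumPy. The array below is only a stand-in for the genotype array on Variant (its values and dtype are assumptions for illustration), marked read-only via `setflags`; it does not call tskit.

```python
import numpy as np

# Stand-in for the genotype array on a Variant; values and dtype are
# chosen for illustration only.
genotypes = np.array([0, 1, 1, 0], dtype=np.int32)
genotypes.setflags(write=False)  # mimic a read-only array

# In-place assignment to a read-only NumPy array raises ValueError.
try:
    genotypes[0] = 1
    write_failed = False
except ValueError:
    write_failed = True

# Code that needs to modify genotypes can take a writeable copy instead.
editable = genotypes.copy()
editable[0] = 1
```

Code that previously edited the array in place would need the explicit copy; code that only reads genotypes should be unaffected.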