-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Modular incremental trees #2786
Modular incremental trees #2786
Conversation
Sounds great. IMO, it is important to have that core logic separated from updating all those tree arrays! |
Codecov Report
@@ Coverage Diff @@
## main #2786 +/- ##
==========================================
+ Coverage 89.93% 89.96% +0.02%
==========================================
Files 30 30
Lines 29171 29274 +103
Branches 5697 5706 +9
==========================================
+ Hits 26236 26335 +99
- Misses 1666 1669 +3
- Partials 1269 1270 +1
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report in Codecov by Sentry.
|
80649a6
to
158ed1a
Compare
158ed1a
to
3cfab54
Compare
I think this is ready for review and merging now. The idea is to add a new class I've implemented the basic next/prev operations here in Python and C, and used the core code to replace the bits of the library code that use the One key thing that needs to be mentioned here is that I've incorporated quite a clean way of seeking to arbitrary locations along the sequence while touching only the edges that need to be changed in order to transform the current tree into the target tree. However this has raise this issue that these edges are not necessarily inserted in time-increasing order. This needs to be thought about a bit, which is why I haven't gone ahead and switched over to using this code for the @molpopgen - any chance you could take a look and see what you think? |
I'll take a look, but it'll be next week. |
@duncanMR good to get your eyes on this also please |
3cfab54
to
3843e87
Compare
@jeromekelleher I will start examining the code today. Have you found a test case where the edges aren't inserted in time-decreasing order? |
Excellent. I haven't picked out specific examples where the edges aren't time sorted in seek_forward, and that would be a great thing to do in a follow up. I stopped working on the random seek aspect once I realised the sortedness problem, and decided to focus on getting the basic code merged so we could start using it. So it's in a half finished state at the moment |
Let's see if I got this:
|
Yep - but without having to allocate any extra memory because we're using the existing index arrays to represent the ranges.
This is subtle. The input/output arrays are sorted by position and time, which guarantees that we insert/remove the edges in the correct order for adjacent trees. However, when we're seeking to non-adjacent trees (or randomly into the middle of the sequence, like seek_from_null) we lose this guarantee. |
Can I get a review here please @benjeffery? I want to merge this so downstream stuff can go in too |
@Mergifyio rebase |
✅ Branch has been successfully rebased |
3843e87
to
2c264be
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice, just a couple of small comments.
|
||
|
||
@dataclasses.dataclass | ||
class Interval: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use the existing Interval
class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just simpler to keep track of what it does for this kind of prototype/test code, and not have to worry about external APIs and so on
left_current_index = self->in.stop; | ||
right_current_index = self->out.stop; | ||
} else { | ||
left_current_index = self->out.stop + 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some missing coverage here and below.
Sure, will do! |
Try to centralise our tree positioning code by bringing the logic into
TreePosition
class. See #2778 for discussion.@molpopgen - the idea here is to try and pull out all the of the code that we use in the
tsk_tree_t
class for tree positioning into this class, so that it can then be shared by incremental algorithms as well. We should then be able to do efficient skipping to a particular tree, which will be useful for parallel algorithms.WIP