-
Notifications
You must be signed in to change notification settings - Fork 232
Conversation
roaring/containers_btree.go
Outdated
if btc.tree.Len() == 0 { | ||
return 0, nil | ||
} | ||
return btc.tree.First() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this will subtly be broken in some contexts by the rowcache code, because I think there are some circumstances where the rowcache code lets a nil container sneak into the btree for a while. I don't think that's necessarily a bug in this code; probably it means that the rowcache code has to be stricter about ensuring that nil containers don't end up in the tree. (It turns out that the thing where an in-place operation can result in a nil container somewhat breaks iterators, though, if we try to delete the nil containers instead of allowing them in temporarily.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I see most bTreeContainers
methods assume btc.tree
is not nil
.First
method has almost the same code with Last
. Doesn't Last
cause any problems with the rowcache code?
EDITED
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure. We should possibly file a separate issue to check through that. And actually it might have gotten fixed, it might be the SliceContainers that wasn't dealing with it well. My memory is fuzzy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this code is wrong, but it will clash with the rowcache code which I just merged; in some cases, such as removing a bit from the first container, I think First() will be able to produce a nil container, when we really almost certainly want it to produce the first non-empty container.
I should probably fix the rowcache code, in this case.
I've added |
I've implemented filter support, will push it if it useful:
|
Pushed the filter change. We can disable/remove it if necessary. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could avoid a lot of the new code in roaring by taking advantage of the existing Bitmap.Iterator. (e.g. Bitmap.Iterator().Next())
I'd prefer that we not add new fields to fragment unless absolutely necessary (it seems to be more of an optimization in this case).
This does bring up the unfortunate omission that we need a fast way to iterate over a bitmap's containers in reverse in order to support Max with filters. Looping from max down to min will be incredibly inefficient for high cardinality, sparse data.
executor.go
Outdated
}, nil | ||
} | ||
|
||
// executeMaxRowShard returns the minimum row ID for a shard. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/min/max
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think my Neighbors patch included adding Prev to bitmap iterators, and allowed iterating down through containers, but I didn't implement the corresponding iteration down through bits.
updated. |
roaring/roaring.go
Outdated
if v != 0 { | ||
r := bits.LeadingZeros64(v) | ||
return uint16((i-1)*64 + 63 - r) | ||
return uint16(i*64 + 63 - r) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are the changes to this function purely stylistic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think that actually changes the computed value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's exactly the same except that it goes down to 1 instead of 0. I think the loop condition needs to be >=0
now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ohhh. i missed the "-1" up above, and was just looking at the "i-1" here. nevermind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've reversed these changes.
fragment.go
Outdated
@@ -118,6 +118,8 @@ type fragment struct { | |||
|
|||
// Stats reporting. | |||
maxRowID uint64 | |||
minRowID uint64 | |||
hasRowID bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see much benefit to caching hasRowID
and minRowID
on the fragment. They shouldn't be very expensive to look up (without a filter), and this brings up the opportunity for cache invalidation bugs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved minRowID
to a function.
fragment.go
Outdated
return f.maxRowID, 1 | ||
} | ||
// iterate back from max row ID and return the first that intersects with filter. | ||
for i := f.maxRowID; i >= f.minRowID; i-- { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's add a TODO to implement reverse container iteration to improve performance here for sparse data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i have that implemented already in one of my branches. it wasn't very hard but it also wasn't tested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks great, thanks!
great! |
Overview
Fixes #1984
Pull request checklist
Code review checklist
This is the checklist that the reviewer will follow while reviewing your pull request. You do not need to do anything with this checklist, but be aware of what the reviewer will be looking for.