Sparse global order reader: refactor merge algorithm. #3173
While running tests with a real-world array, I noticed that the optimization that tries to find the length of a cell slab once a cell to be merged has been found didn't work. Using a binary search across the whole tile resulted in more comparisons than a linear search, since cell slabs are never that long. Also, next_cell in ResultCoords didn't use the bitmap, which caused some issues, so ResultCoords was split into two classes: ResultCoords and GlobalOrderResultCoords (which has access to the bitmap). This also made it possible to push some of the merge logic down into that class, which greatly simplifies the merge function.
---
TYPE: IMPROVEMENT
DESC: Sparse global order reader: refactor merge algorithm.
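To illustrate the comparison-count argument above, here is a minimal sketch, not the TileDB implementation. It assumes a tile can be modelled as a sorted vector of 1D coordinates and that the slab ends at the first cell greater than a bound taken from another tile. The function names `slab_length_linear` and `slab_length_binary` are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a tile: a sorted vector of 1D coordinates.
// Both functions return the number of cells, starting at `start`, whose
// coordinate is <= `bound`, and add the comparisons they made to
// `comparisons`.

// Linear scan: costs (slab length + 1) comparisons, or exactly the slab
// length when the slab runs to the end of the tile.
uint64_t slab_length_linear(const std::vector<int64_t>& tile, uint64_t start,
                            int64_t bound, uint64_t& comparisons) {
  uint64_t len = 0;
  while (start + len < tile.size()) {
    ++comparisons;
    if (tile[start + len] > bound) break;
    ++len;
  }
  return len;
}

// Binary search over the rest of the tile: costs roughly
// log2(tile.size() - start) comparisons regardless of the slab length.
uint64_t slab_length_binary(const std::vector<int64_t>& tile, uint64_t start,
                            int64_t bound, uint64_t& comparisons) {
  auto first = tile.begin() + static_cast<std::ptrdiff_t>(start);
  auto it = std::upper_bound(first, tile.end(), bound,
                             [&comparisons](int64_t b, int64_t cell) {
                               ++comparisons;
                               return b < cell;
                             });
  return static_cast<uint64_t>(it - first);
}
```

With slabs of only a few cells inside a tile of around a thousand cells, the linear scan needs a handful of comparisons while the binary search needs about ten, which matches the behaviour described above.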
Thanks for the refactoring and the fixes! Readability has improved a lot!
A few comments to verify my understanding/correctness and a few about adding UTs for those nice testable new classes you have added.
std::vector<BitmapType> bitmap_;

/** Number of cells in this bitmap. */
uint64_t bitmap_result_num_;
Can we make those attributes private and access/update through member functions so that we can actually follow/test the behavior of that class?
Same for GlobalOrderResultTile
Once this is done, we should also consider adding UTs; again, there are methods that are safer to validate by UT than by reading.
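A minimal sketch of what such encapsulation could look like, purely for illustration: the class name `TileBitmapSketch` and all of its methods are hypothetical and are not the actual ResultTileWithBitmap or GlobalOrderResultTile interface. The bitmap and its result count stay private, so the count can only change through `clear()`, and each method is small enough to unit test directly.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical illustration only; not the TileDB class.
class TileBitmapSketch {
 public:
  explicit TileBitmapSketch(std::vector<uint8_t> bitmap)
      : bitmap_(std::move(bitmap)), result_num_(0) {
    // Count the cells that are initially part of the result.
    for (uint8_t b : bitmap_) result_num_ += (b != 0) ? 1 : 0;
  }

  /** Total number of cells covered by the bitmap. */
  uint64_t cell_num() const { return bitmap_.size(); }

  /** Number of cells still set in the bitmap. */
  uint64_t result_num() const { return result_num_; }

  /** Clears one cell and keeps the cached count consistent. */
  void clear(uint64_t pos) {
    if (bitmap_[pos] != 0) {
      bitmap_[pos] = 0;
      --result_num_;
    }
  }

  /**
   * Returns the first set position strictly after `pos`, or cell_num()
   * if there is none. Assumes pos < cell_num().
   */
  uint64_t next_set(uint64_t pos) const {
    uint64_t p = pos + 1;
    while (p < bitmap_.size() && bitmap_[p] == 0) ++p;
    return p;
  }

 private:
  std::vector<uint8_t> bitmap_;
  uint64_t result_num_;
};
```

Because callers cannot write to the bitmap directly, a test can check invariants such as "result_num() always equals the number of set cells" without reading every call site.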
I did some, but for ResultTileWithBitmap it would be too big a change for a patch release, so I filed the following: https://app.shortcut.com/tiledb-inc/story/17815/make-resulttilewithbtimap-more-opaque
Also note that some unit tests are already present in unit-result-tile.cc.
41c3a0a to 21c3d7d
21c3d7d to 10182eb
The backport to
To backport manually, run these commands in your terminal:
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-release-2.9 release-2.9
# Navigate to the new working tree
cd .worktrees/backport-release-2.9
# Create a new branch
git switch --create backport-3173-to-release-2.9
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick --mainline 1 1b9a20e85416f7c8fd8286d2352306b652a91fda
# Push it to GitHub
git push --set-upstream origin backport-3173-to-release-2.9
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-release-2.9
Then, create a pull request where the
* Sparse global order reader: refactor merge algorithm.
  While running tests with a real-world array, I noticed that the optimization that tries to find the length of a cell slab once a cell to be merged has been found didn't work. Using a binary search across the whole tile resulted in more comparisons than a linear search, since cell slabs are never that long. Also, next_cell in ResultCoords didn't use the bitmap, which caused some issues, so ResultCoords was split into two classes: ResultCoords and GlobalOrderResultCoords (which has access to the bitmap). This also made it possible to push some of the merge logic down into that class, which greatly simplifies the merge function.
  ---
  TYPE: IMPROVEMENT
  DESC: Sparse global order reader: refactor merge algorithm.
* Addressing feedback from @ypatia.
* Addressing feedback from @ypatia, part 2.