Merged
Conversation
chrisvittal (Collaborator) requested changes on Dec 12, 2024:

Looks great, I have one suggestion to simplify the parameters you added to the fixture.
Force-pushed from e831a77 to 0e05095.
Author (Member) replied:

Done!
ehigham (Member) approved these changes on Dec 12, 2024:

Well done for finding and fixing this so quickly.
chrisvittal approved these changes on Dec 13, 2024.
grohli pushed a commit to grohli/hail that referenced this pull request on Jan 27, 2025.

Change Description
Fixes a bug which was reported in https://discuss.hail.is/t/arrayindexoutofboundsexception-using-cdf-combine/4008/6.
The bug was caused by an unenforced invariant in the ApproxCDF aggregator state. Roughly, the state consists of a number of "levels" (stored flattened into a single array), where each level contains a number of samples from the data. There is also a notion of the "capacity" of each level. The data structure assumes that whenever the array containing the flattened levels becomes full, there must be at least one level which is above capacity, and can therefore be compacted to free space. In other words, the size of the array must be at least the sum of the capacities of the present levels.
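To make the invariant concrete, here is a minimal toy model in Python of the state shape described above. This is not Hail's actual implementation; all names (`ToySketchState`, `level_sizes`, `level_capacities`) are hypothetical, and it only models the bookkeeping, not the sampling itself:

```python
class ToySketchState:
    """Toy model of a levels-based sketch state: levels are stored
    flattened in one backing array, each level has a capacity, and
    compaction frees space when the array fills up."""

    def __init__(self, level_sizes, level_capacities, array_size):
        # The invariant the PR enforces: the backing array must be big
        # enough to hold every present level at full capacity. If it
        # isn't, the array can fill up while no level is compactable.
        if array_size < sum(level_capacities):
            raise ValueError("array too small: compaction not guaranteed")
        self.level_sizes = list(level_sizes)
        self.level_capacities = list(level_capacities)
        self.array_size = array_size

    def is_full(self):
        return sum(self.level_sizes) >= self.array_size

    def has_compactable_level(self):
        # Given the invariant, a full array implies (by pigeonhole) that
        # some level has reached its capacity and can be compacted.
        return any(s >= c for s, c in
                   zip(self.level_sizes, self.level_capacities))


# A full state always has a compactable level when the invariant holds:
st = ToySketchState([4, 2], [4, 2], 6)
assert st.is_full() and st.has_compactable_level()
```

The point of the pigeonhole argument: if every level were strictly below capacity, the total number of samples would be strictly below the sum of capacities, so a valid array (at least that large) could not be full.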
In #13935, we changed the ApproxCDF aggregator to return the raw internal state, and moved the computation of the cdf from the internal state to python. The problem is that we "compress" the states before returning them as hail values by reducing the size of the array of samples to only the number of present samples. But then, when the exposed "combine" function recreates ApproxCDF aggregator states from the returned values, the array of samples is no longer large enough to satisfy the invariant.
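The compress/uncompress mismatch can be sketched in a few lines. Again this is an illustrative model, not Hail's code; `compress` and `uncompress` here are hypothetical stand-ins for trimming the samples array before returning it and re-padding it when rebuilding the aggregator state:

```python
def compress(samples, n_present):
    # Before returning the state as a hail value, keep only the
    # samples that are actually present (the occupied prefix).
    return samples[:n_present]

def uncompress(compressed, total_capacity, fill=0.0):
    # The fix: pad the samples array back out to the sum of the level
    # capacities before reconstructing the aggregator state, so the
    # size invariant holds again.
    return compressed + [fill] * (total_capacity - len(compressed))

samples = [1.0, 3.0, 2.0, 0.0, 0.0, 0.0]   # capacity 6, 3 samples present
compressed = compress(samples, 3)            # trimmed to 3 entries
restored = uncompress(compressed, 6)         # padded back to capacity 6
assert len(restored) == 6
```

Rebuilding a state directly from `compressed` (length 3) is the bug: the array is smaller than the sum of the capacities, so a later fill-up can find no level to compact and indexing past the end throws, matching the reported `ArrayIndexOutOfBoundsException`.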
In this PR, I
* add a test case in python which fails in main, by running approx_cdf over a very small dataset
* fix `ApproxCDFStateManager.fromData` to "uncompress" the aggregator state
* add a check for the violated invariant in the `ApproxCDFStateManager` constructor. (This isn't a perfect check, as the underlying state can be changed after construction, but I didn't see a way to make a more complete check without significant refactoring, and this would have caught the current bug.)

Security Assessment

This change has no security impact.

Impact Description

Low-level refactoring of non-security code.