Accumulo Bitmask Optimization #742
…ividual bit masks
*/
public static void setFieldIds(
    final IteratorSetting setting,
    final DataAdapter<?> adapterAssociatedWithFieldIds,
Stupid question ... if someone adds a new field in the future, it will always be added at the end, right? Like, there's no chance that it could be added somewhere in the middle due to its type or name or ordering or something, and mess up the bitmasks?
Also, what happens to the bitmask AND operation earlier in this file, if the length of the bitmasks is different?
Another question: how about MSB/LSB and endianness? If I have a client on Windows and an Accumulo iterator running on Linux, does the bitmask implementation you're using translate everything to the same endianness?
Regarding the AND operation on byte arrays of different lengths: the result has the length of the shorter array. Essentially, bytes missing at the end are treated as 0, i.e. missing fields, which should accurately reflect the behavior.
If someone were to add fields then, correct, they would have to be added at the end.
I don't think endianness comes into play here. It's always bitwise operations on raw bytes, without any translation to numbers. As long as the bytes come across consistently in serialization/deserialization (which they will), after that point it's just bitwise operations.
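To make the length behavior concrete, here is a minimal sketch of a byte-wise AND that truncates to the shorter array. The class name `BitmaskAndSketch` and the method `and` are hypothetical and are not taken from this PR; the actual implementation may differ in detail.

```java
import java.util.Arrays;

public class BitmaskAndSketch {

    // Hypothetical helper (not the PR's actual method): byte-wise AND of two
    // bitmasks. The result is as long as the shorter input, so bytes that are
    // missing from the end of one mask behave as if they were 0.
    static byte[] and(final byte[] a, final byte[] b) {
        final int len = Math.min(a.length, b.length);
        final byte[] result = new byte[len];
        for (int i = 0; i < len; i++) {
            // Each byte is combined independently; no multi-byte number is
            // ever assembled, so byte order (endianness) is not involved.
            result[i] = (byte) (a[i] & b[i]);
        }
        return result;
    }

    public static void main(final String[] args) {
        // One-byte mask, e.g. written before extra fields were appended.
        final byte[] older = {(byte) 0b1010_0000};
        // Two-byte mask that also covers fields added at the end.
        final byte[] newer = {(byte) 0b1110_0000, (byte) 0b1000_0000};
        System.out.println(Arrays.toString(and(older, newer))); // prints [-96]
    }
}
```

The single result byte is `0b1010_0000` (printed as -96 because Java bytes are signed). The second byte of the longer mask is dropped, which matches the "missing fields" interpretation above.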
Did you re-test the other DistributableQueryFilters, like the DateRange or TextQuery, etc.? Do they still work?
...Reviewed ... Didn't see anything.
#563