Accumulo Bitmask Optimization #742

Merged
merged 11 commits into from
Apr 30, 2016

Conversation

dcy2003

@dcy2003 dcy2003 commented Apr 28, 2016

*/
public static void setFieldIds(
final IteratorSetting setting,
final DataAdapter<?> adapterAssociatedWithFieldIds,

Stupid question: if someone adds a new field in the future, it will always be added at the end, right? There's no chance that it could be inserted somewhere in the middle because of its type, name, or ordering, and mess up the bitmasks?

Also, what happens to the bitmask AND operation earlier in this file if the bitmasks have different lengths?

Another question: how about MSB/LSB and endianness? If I have a client on Windows and an Accumulo iterator running on Linux, does the bitmask implementation you're using translate everything to the same endianness?

Regarding the AND operation on byte arrays of different lengths: the result has the length of the shorter array. Bytes missing at the end are effectively treated as 0, i.e. as missing fields, which accurately reflects the intended behavior.

If someone were to add fields then, correct, they would have to be added at the end.

I don't think endianness comes into play here. It's always bitwise operations on raw bytes, without any translation to numbers. As long as the bytes come across consistently in serialization/deserialization (which they will), after that point it's just bitwise operations.
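
For anyone following the thread, here is a minimal sketch of the AND behavior described above. It is not the code from this PR: the class and method names (`BitmaskAndSketch`, `and`) are hypothetical, and it only assumes what the comment states, namely that the result takes the shorter length and that the work is done byte by byte.

```java
// Hypothetical illustration only; names are not taken from the PR.
class BitmaskAndSketch {

    // ANDs two bitmasks byte by byte. The result is as long as the shorter
    // input, so trailing bytes present in only one mask are dropped, which
    // is equivalent to treating the missing bytes as 0.
    static byte[] and(final byte[] a, final byte[] b) {
        final int length = Math.min(a.length, b.length);
        final byte[] result = new byte[length];
        for (int i = 0; i < length; i++) {
            // Each byte is combined with the byte at the same index; no
            // multi-byte integer is ever assembled, so byte order on the
            // host machine does not affect the outcome.
            result[i] = (byte) (a[i] & b[i]);
        }
        return result;
    }

    public static void main(final String[] args) {
        final byte[] twoBytes = {(byte) 0xB6, (byte) 0xF0};
        final byte[] oneByte = {(byte) 0xCC};
        final byte[] result = and(twoBytes, oneByte);
        // 0xB6 & 0xCC == 0x84, and the second byte of twoBytes is dropped.
        System.out.printf("length=%d first=0x%02X%n", result.length, result[0] & 0xFF);
    }
}
```

Under these assumptions, a mask written before a field was appended simply ends earlier, and the AND ignores the newer trailing bits. That only holds if new fields are added at the end, as noted above.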

@datasedai

Did you re-test the other DistributableQueryFilters, such as DateRange, TextQuery, etc.? Do they still work?

@datasedai

Reviewed. Didn't see any problems.

@rfecher rfecher merged commit dfe997d into 0.9.1 Apr 30, 2016
@rfecher rfecher deleted the bitmask-rebase-0.9.1 branch April 30, 2016 13:41
3 participants