-
Notifications
You must be signed in to change notification settings - Fork 6.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a new int64 add merge operator, also with Java API #11202
base: main
Are you sure you want to change the base?
Add a new int64 add merge operator, also with Java API #11202
Conversation
f99d536
to
568db6a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We would appreciate if reviewers could help decide whether this is a useful library for customer API or whether other facilities can do what this does. Can clients always merge() (i.e. never put()) and fetch a merged value back via get() ? I don't know the details of RocksDB well enough to answer this.
The RocksDB user needs to encode their data for going into Merge (or Put) and coming out from Get, so the user does need serialization / deserialization compatible with the operator.
b09dd98
to
fbbb6f8
Compare
@dannyhchen has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
fbbb6f8
to
47d81d1
Compare
@alanpaxton has updated the pull request. You must reimport the pull request before landing. |
@anand1976 @adamretter rebased and once again ready for review/import. |
@anand1976 rebased and once again ready for review/import. |
47d81d1
to
f7abeed
Compare
@alanpaxton has updated the pull request. You must reimport the pull request before landing. |
f7abeed
to
cde1f97
Compare
@alanpaxton has updated the pull request. You must reimport the pull request before landing. |
@anand1976 I have once again rebased this and it is ready for review/import. |
cde1f97
to
f5e7bd9
Compare
@alanpaxton has updated the pull request. You must reimport the pull request before landing. |
2 similar comments
@alanpaxton has updated the pull request. You must reimport the pull request before landing. |
@alanpaxton has updated the pull request. You must reimport the pull request before landing. |
d228702
to
84424f4
Compare
@alanpaxton has updated the pull request. You must reimport the pull request before landing. |
84424f4
to
cfad4ca
Compare
@alanpaxton has updated the pull request. You must reimport the pull request before landing. |
cfad4ca
to
f0e9a37
Compare
@alanpaxton has updated the pull request. You must reimport the pull request before landing. |
1 similar comment
@alanpaxton has updated the pull request. You must reimport the pull request before landing. |
Build a Java API for the new Int64AddMergeOperator that looks like all the other merge operators. Add tests. Individual operator tests which are part of merge_operators_test shouldn’t be in TEST_MAIN_SOURCES Add int64 merge operator to the tests which mention the other merge operators.
Include the header that says `extern “C” { … `
Implement encoding and zigzag in Java so that we can test it more thoroughly.
Exporting the int64 merge operator to Java requires support for encoding to make it useful. It also requires that support in order to test it from Java.
Accidentally re-introduced deleted loading code when rebasing. Now removed that again, and added the loading of the new int64add merge operator into the new loading code. Moved the new int64add operator into ROCKSDB_NAMESPACE because that’s where the other merge operators are. Split the class into .h and .cc
In the context in which we use them (values in RocksDB (key,value)-pairs, value lengths are always known, so they do not need a continuation bit and a 7-bit encoding, but can be encoded more efficiently as an 8-bit encoding. This has the added advantage of being consistent with fixed size encodings, which can be seen as having trailing-zeros in the most significant bytes, which are truncated in this encoding.
Fix the test inheritance
$ python3 buckifier/buckify_rocksdb.py
The problem only shows up when ASSERT_STATUS_CHECKED=1, i.e. running changed merge_operators_test under CI
make format doesn’t format this but does complain
Don’t need to turn them into zigzag first. There end up being 2 branches in encoding anyway, but there’s perhaps a little less actual code all told.
Changed format in core C++ 8 bit varint encoding, so the Java shadow implementation (used for testing) needs to work the same way.
stdout or stderr was causing a decoding error. Make the error mode “backslashreplace” instead of the default “strict” so we have a chance of seeing the output.
7e762e8
to
16502aa
Compare
@alanpaxton has updated the pull request. You must reimport the pull request before landing. |
Supercedes #6884 and #6877 (cc @adamretter )
This branch/PR is based on a copy of the code in #6884, which given the age of the PR we chose to restart, though the core of the merge operator is carried over.
This merge operator adds int64_t values. At a cursory glance, this may appear similar to UInt64AddOperator, but this merge operator handles signed numbers. It can also handle both Addition or Subtraction by merging either positive or negative integers.
Values are stored into the database using 8-bit variable-byte encoding (as suggested by @pdillinger). Because the length is known, we do not require a bit flag for continuation. So for smaller positive numbers the value stored has the advantage of also consuming less bytes than UInt64AddOperator. The most significant stored byte is a 2s complement value which is sign-extended on decoding.
Therefore the range -128..127 can be stored in 1 bytes, -32768..32767 in 2 bytes, etc.
Zig-zag encoded values are in turn encoded as varints in the data. Varints now implement the 8-bit variable-length encoding (known length) suggested by @pdillinger.
A Java API to this merge operator has been added. In addition, a small Java utility is provided which implements the same encoding as the merge operator.
Changing the UInt64MergeOperator to use a similar varint encoding would require that a "new" operator be created, in order to avoid backward compatibility breaking.
merge_operators_test
binary which combines the GTEST for this merge operator with that of stringappend. This addresses your concerns re additional test binaries.