Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Java API consistency between RocksDB.put() , .merge() and Transaction.put() , .merge() #11019

Closed

Conversation

alanpaxton
Copy link
Contributor

@alanpaxton alanpaxton commented Dec 6, 2022

Implement new Java API get()/put()/merge() methods, and transactional variants.

The Java API methods are very inconsistent in terms of how they pass parameters (byte[], ByteBuffer), and what variants and defaulted parameters they support. We try to bring some consistency to this.

  • All APIs should support calls with ByteBuffer parameters.
  • Similar methods (RocksDB.get() vs Transaction.get()) should support as similar as possible sets of parameters for predictability.
  • get()-like methods should provide variants where the caller supplies the target buffer, for the sake of efficiency. Allocation costs in Java can be significant when large buffers are repeatedly allocated and freed.

API Additions

  1. RockDB.get implement indirect ByteBuffers. Added indirect ByteBuffers and supporting native methods for get().
  2. RocksDB.Iterator implement missing (byte[], offset, length) variants for key() and value() parameters.
  3. Transaction.get() implement missing methods, based on RocksDB.get. Added ByteBuffer.get with and without column family. Added byte[]-as-target get.
  4. Transaction.iterator() implement a getIterator() which defaults ReadOptions; as per RocksDB.iterator(). Rationalize support API for this and RocksDB.iterator()
  5. RocksDB.merge implement ByteBuffer methods; both direct and indirect buffers. Shadow the methods of RocksDB.put; RocksDB.put only offers ByteBuffer API with explicit WriteOptions. Duplicated this with RocksDB.merge
  6. Transaction.merge implement methods as per RocksDB.merge methods. Transaction is already constructed with WriteOptions, so no explicit WriteOptions methods required.
  7. Transaction.mergeUntracked implement the same API methods as Transaction.merge except the ones that use assumeTracked, because that’s not a feature of merge untracked.

Support Changes (C++)

The current JNI code in C++ supports multiple variants of methods through a number of helper functions. There are numerous TODO suggestions in the code proposing that the helpers be re-factored/shared.

We have taken a different approach for the new methods; we have created wrapper classes JDirectBufferSlice, JDirectBufferPinnableSlice, JByteArraySlice and JByteArrayPinnableSlice RAII classes which construct slices from JNI parameters and can then be passed directly to RocksDB methods. For instance, the Java_org_rocksdb_Transaction_getDirect method is implemented like this:

  try {
    ROCKSDB_NAMESPACE::JDirectBufferSlice key(env, jkey_bb, jkey_off,
                                              jkey_part_len);
    ROCKSDB_NAMESPACE::JDirectBufferPinnableSlice value(env, jval_bb, jval_off,
                                                        jval_part_len);
    ROCKSDB_NAMESPACE::KVException::ThrowOnError(
        env, txn->Get(*read_options, column_family_handle, key.slice(),
                      &value.pinnable_slice()));
    return value.Fetch();
  } catch (const ROCKSDB_NAMESPACE::KVException& e) {
    return e.Code();
  }

Notice the try/catch mechanism with the KVException class, which combined with RAII and the wrapper classes means that there is no ad-hoc cleanup necessary in the JNI methods.

We propose to extend this mechanism to existing JNI methods as further work.

Support Changes (Java)

Where there are multiple parameter-variant versions of the same method, we use fewer or just one supporting native method for all of them. This makes maintenance a bit easier and reduces the opportunity for coding errors mixing up (untyped) object handles.

In order to support this efficiently, some classes need to have default values for column families and read options added and cached so that they are not re-constructed on every method call.

This PR closes #9776

@alanpaxton alanpaxton marked this pull request as draft December 6, 2022 14:30
@alanpaxton alanpaxton force-pushed the eb-1634-put-merge-txn-consistency branch 6 times, most recently from ca49c01 to 7db31d6 Compare December 19, 2022 11:06
@alanpaxton alanpaxton force-pushed the eb-1634-put-merge-txn-consistency branch from 9e8129c to 27eaf01 Compare January 9, 2023 08:30
@alanpaxton alanpaxton marked this pull request as ready for review January 9, 2023 11:16
@alanpaxton alanpaxton force-pushed the eb-1634-put-merge-txn-consistency branch from 848d1e6 to 4b92030 Compare January 26, 2023 12:44
@alanpaxton alanpaxton force-pushed the eb-1634-put-merge-txn-consistency branch from 4b92030 to b3c2e98 Compare March 6, 2023 11:05
@anand1976
Copy link
Contributor

@alanpaxton Please rebase. A lot of new merge conflicts due to #10951 getting merged, I think.

@alanpaxton alanpaxton force-pushed the eb-1634-put-merge-txn-consistency branch from b3c2e98 to 9c9fa99 Compare May 30, 2023 07:51
@alanpaxton alanpaxton force-pushed the eb-1634-put-merge-txn-consistency branch from 9c9fa99 to 2bd1093 Compare July 6, 2023 11:24
@alanpaxton
Copy link
Contributor Author

@anand1976 @adamretter rebased and once again ready for review/import.

@adamretter
Copy link
Collaborator

@anand1976 rebased and once again ready for review/import.

@adamretter adamretter force-pushed the eb-1634-put-merge-txn-consistency branch from 2bd1093 to e5ec8ce Compare July 28, 2023 14:34
@adamretter adamretter force-pushed the eb-1634-put-merge-txn-consistency branch from e5ec8ce to a57d1eb Compare August 16, 2023 16:55
@adamretter
Copy link
Collaborator

@anand1976 rebased and once again ready for review/import.

@alanpaxton alanpaxton force-pushed the eb-1634-put-merge-txn-consistency branch from a57d1eb to f06e2b7 Compare September 12, 2023 14:22
@alanpaxton
Copy link
Contributor Author

@anand1976 rebased and once again ready for review/import.

@adamretter
Copy link
Collaborator

Hi @ajkr @ltamasi would you be able to get this one merged pleased?

@alanpaxton alanpaxton force-pushed the eb-1634-put-merge-txn-consistency branch 6 times, most recently from 76c6b46 to eeb799c Compare October 30, 2023 08:43
@alanpaxton
Copy link
Contributor Author

@anand1976 I have rebased, and this is now passing except for the known vs2019 issue on main which @rhubner is addressing. Are you happy to review/merge it at this point ?

MS compiler doesn’t like it.
Wrap steps in try { … } catch (ROCKSDB_NAMESPACE::KVException&) {
}
A KVException is thrown if something goes wrong at any stage, RAII takes care of cleanup in which case we return the appropriate type of error value.
Implement testing for put() to indirect ByteBuffer which needs default CF handle.
Missing get(ro, cf, byte[], byte[]) so add that.
Test that, and add tests for other new get() methods missing tests.
Re-order params in 1 method for consistency.
Fix some holes in RocksDB.defaultColumnFamily; initialization was missed by some open() calls and TransactionDB.
- there’s no assumeTracked flag in mergeUntracked
- Add ByteBuffer methods
- rationalise and clean up native support methods
To conform to the general pattern of get() / merge() … methods we are trying to enforce across the codebase.
Methods to return key / value in provided byte[], in addition to existing method where the API creates a new one.
Move shared CheckBounds method to new BufferUtil class.
Replace with new-style slice wrappers.
Having changed some of the Java APIs to store the default column family handle as a singleton from the DB, it turns out we missed a couple of places where this needed to be copied. Thanks, tests.
Add missing API methods to Transaction.getIterator() class to match those in RocksDB.newIterator(..)
Fix all the C++ support to go through a single JNI method for the Transaction class.
Fix all the C++ support to go through a single JNI method for the RocksDB class.
Writing HISTORY.md for public facing APIs

I discovered that I had not implemented the orthogonal set of methods for Transaction.getForUpdate().
Add consequent tests and documentation comments.

Testing showed some ByteBuffer methods were not updating position/limit appropriately, or at all. Fix them. Fix consequently broken tests.
Allow key / value size parameters
Add byte buffer tests as well as byte[]
Implemented/extended PutBenchmark java/jmh sanity test to confirm that the branch/PR changes do not materially alter put performance.

Remove questionable/unrepro short Mac benchmark from documentation of java/jmh performance tests.

Add BEFORE results
Add AFTER results
happens in quite obvious patterns
@alanpaxton alanpaxton force-pushed the eb-1634-put-merge-txn-consistency branch from 7726049 to 30cd7a6 Compare December 11, 2023 10:12
@facebook-github-bot
Copy link
Contributor

@jowlyzhang has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@jowlyzhang merged this pull request in 5a063ec.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement ByteBuffer methods for Transaction in Java API
5 participants