JMH microbenchmarks for RocksJava #6241

adamretter · 2019-12-23T17:16:57Z

This is the start of some JMH microbenchmarks for RocksJava.

Such benchmarks can help us decide on performance improvements of the Java API.

At the moment, I have only added benchmarks for various Comparator options, as that is one of the first areas where I want to improve performance. I plan to expand this to many more tests.

Details of how to compile and run the benchmarks are in the README.md.

A run of these on a XEON 3.5 GHz 4vCPU (QEMU Virtual CPU version 2.5+) / 8GB RAM KVM with Ubuntu 18.04, OpenJDK 1.8.0_232, and gcc 8.3.0 produced the following:

# Run complete. Total time: 01:43:17

REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
experiments, perform baseline and negative tests that provide experimental control, make sure
the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
Do not assume the numbers tell you what you want them to tell.

Benchmark                                         (comparatorName)   Mode  Cnt       Score       Error  Units
ComparatorBenchmarks.put                           native_bytewise thrpt   25   122373.920 ±  2200.538  ops/s
ComparatorBenchmarks.put              java_bytewise_adaptive_mutex thrpt   25    17388.201 ±  1444.006  ops/s
ComparatorBenchmarks.put          java_bytewise_non-adaptive_mutex thrpt   25    16887.150 ±  1632.204  ops/s
ComparatorBenchmarks.put       java_direct_bytewise_adaptive_mutex thrpt   25    15644.572 ±  1791.189  ops/s
ComparatorBenchmarks.put   java_direct_bytewise_non-adaptive_mutex thrpt   25    14869.601 ±  2252.135  ops/s
ComparatorBenchmarks.put                   native_reverse_bytewise thrpt   25   116528.735 ±  4168.797  ops/s
ComparatorBenchmarks.put      java_reverse_bytewise_adaptive_mutex thrpt   25    10651.975 ±   545.998  ops/s
ComparatorBenchmarks.put  java_reverse_bytewise_non-adaptive_mutex thrpt   25    10514.224 ±   930.069  ops/s

This indicates a ~7x difference between comparators implemented natively (C++) and those implemented in Java. Let's see if we can improve on that in the near future...
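As a quick sanity check on that figure, the ratios can be recomputed from the scores in the table above. This is only an illustrative sketch: the class and method names are made up here and are not part of the PR.

```java
import java.util.Locale;

public class ComparatorSlowdown {
    // Ratio of native throughput to Java throughput (both in ops/s).
    public static double slowdown(double nativeOpsPerSec, double javaOpsPerSec) {
        return nativeOpsPerSec / javaOpsPerSec;
    }

    public static void main(String[] args) {
        // Scores copied from the ComparatorBenchmarks.put table above.
        double nativeBytewise = 122373.920;
        double javaBytewiseAdaptive = 17388.201;
        double nativeReverse = 116528.735;
        double javaReverseAdaptive = 10651.975;

        System.out.println(String.format(Locale.ROOT, "bytewise: %.1fx",
                slowdown(nativeBytewise, javaBytewiseAdaptive)));
        System.out.println(String.format(Locale.ROOT, "reverse bytewise: %.1fx",
                slowdown(nativeReverse, javaReverseAdaptive)));
    }
}
```

With these inputs the bytewise ratio comes out at about 7.0, while the reverse bytewise ratio is closer to 10.9, so the ~7x figure describes the forward bytewise case.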

adamretter · 2019-12-23T17:17:14Z

@koldat these may be of interest to you too ;-)

pdillinger

Please just clarify "JMH", as I suggest below.

pdillinger · 2020-01-02T21:21:26Z

java/jmh/README.md

@@ -0,0 +1,18 @@
+# JMH Benchmarks for RocksJava
+
+These are JMH micro-benchmarks for RocksJava functionality.


I (and probably others) am not familiar with JMH. Suggest:

These are micro-benchmarks for RocksJava functionality, using JMH (Java Microbenchmark Harness - https://openjdk.java.net/projects/code-tools/jmh/).

Okay, all done :-)

pdillinger · 2020-01-06T17:12:27Z

java/jmh/README.md

@@ -0,0 +1,18 @@
+# JMH Benchmarks for RocksJava
+
+These are micro-benchmarks for RocksJava functionality, using [JMH](Java Microbenchmark Harness - https://openjdk.java.net/projects/code-tools/jmh/).


Nit: malformed markdown link

I see you have imported it to Phabricator. Should I still push an update to fix the markdown?

I guess I'm sending mixed signals. Sorry. Yes, please update.

Phabricator import is required for our internal validation, and that's more useful to run early enough to share results with the author, saving round trips. We (or at least I) treat the PR as source of truth until closed, but it probably makes sense to assume we might be trying to land the PR if it's accepted and imported. So I'll try to be decisive about accepted vs. request changes.

For this PR, internal checks saw that LICENSE-HEADER.txt is missing a trailing newline. Please fix.

If you generally don't mind me pushing cosmetic things to your PR branches, I can probably fix them that way faster than round-trip. Let me know.

Thanks @pdillinger. I just fixed those two small things and pushed.
I don't mind you pushing cosmetic commits at all, after all open-source is a team sport ;-)

facebook-github-bot

@pdillinger has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot · 2020-01-07T20:48:12Z

@adamretter has updated the pull request. Re-import the pull request

facebook-github-bot

@pdillinger has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot · 2020-01-08T03:14:15Z

@pdillinger merged this pull request in 6477075.

Summary: This is a redesign of the API for RocksJava comparators with the aim of improving performance. It also simplifies the class hierarchy.

**NOTE**: This breaks backwards compatibility for existing 3rd party Comparators implemented in Java... so we need to consider carefully which release branches this goes into.

Previously when implementing a comparator in Java the developer had a choice of subclassing either `DirectComparator` or `Comparator`, which would use direct and non-direct byte-buffers respectively (via `DirectSlice` and `Slice`).

In this redesign we have eliminated the overhead of using the Java Slice classes, and just use `ByteBuffer`s. The `ComparatorOptions` supplied when constructing a Comparator allow you to choose between direct and non-direct byte buffers by setting `useDirect`.

In addition, the `ComparatorOptions` now allow you to choose whether a ByteBuffer is reused over multiple comparator calls, by setting `maxReusedBufferSize > 0`. When buffers are reused, ComparatorOptions provides a choice of mutex type by setting `useAdaptiveMutex`.

---

[JMH benchmarks previously indicated](#6241 (comment)) that the difference between C++ and Java for implementing a comparator was ~7x slowdown in Java.

With these changes, when reusing buffers and guarding access to them via mutexes the slowdown is approximately the same. However, these changes offer a new facility to not reuse mutexes, which reduces the slowdown to ~5.5x in Java. We also offer a `thread_local` mechanism for reusing buffers, which reduces slowdown to ~5.2x in Java (closes #4425).

These changes also form a good base for further optimisation work such as further JNI lookup caching, and JNI critical.

---

These numbers were captured without jemalloc. With jemalloc, the performance improves for all tests, and the Java slowdown reduces to between 4.8x and 5.x.

```
ComparatorBenchmarks.put native_bytewise thrpt 25 124483.795 ± 2032.443 ops/s
ComparatorBenchmarks.put native_reverse_bytewise thrpt 25 114414.536 ± 3486.156 ops/s
ComparatorBenchmarks.put java_bytewise_non-direct_reused-64_adaptive-mutex thrpt 25 17228.250 ± 1288.546 ops/s
ComparatorBenchmarks.put java_bytewise_non-direct_reused-64_non-adaptive-mutex thrpt 25 16035.865 ± 1248.099 ops/s
ComparatorBenchmarks.put java_bytewise_non-direct_reused-64_thread-local thrpt 25 21571.500 ± 871.521 ops/s
ComparatorBenchmarks.put java_bytewise_direct_reused-64_adaptive-mutex thrpt 25 23613.773 ± 8465.660 ops/s
ComparatorBenchmarks.put java_bytewise_direct_reused-64_non-adaptive-mutex thrpt 25 16768.172 ± 5618.489 ops/s
ComparatorBenchmarks.put java_bytewise_direct_reused-64_thread-local thrpt 25 23921.164 ± 8734.742 ops/s
ComparatorBenchmarks.put java_bytewise_non-direct_no-reuse thrpt 25 17899.684 ± 839.679 ops/s
ComparatorBenchmarks.put java_bytewise_direct_no-reuse thrpt 25 22148.316 ± 1215.527 ops/s
ComparatorBenchmarks.put java_reverse_bytewise_non-direct_reused-64_adaptive-mutex thrpt 25 11311.126 ± 820.602 ops/s
ComparatorBenchmarks.put java_reverse_bytewise_non-direct_reused-64_non-adaptive-mutex thrpt 25 11421.311 ± 807.210 ops/s
ComparatorBenchmarks.put java_reverse_bytewise_non-direct_reused-64_thread-local thrpt 25 11554.005 ± 960.556 ops/s
ComparatorBenchmarks.put java_reverse_bytewise_direct_reused-64_adaptive-mutex thrpt 25 22960.523 ± 1673.421 ops/s
ComparatorBenchmarks.put java_reverse_bytewise_direct_reused-64_non-adaptive-mutex thrpt 25 18293.317 ± 1434.601 ops/s
ComparatorBenchmarks.put java_reverse_bytewise_direct_reused-64_thread-local thrpt 25 24479.361 ± 2157.306 ops/s
ComparatorBenchmarks.put java_reverse_bytewise_non-direct_no-reuse thrpt 25 7942.286 ± 626.170 ops/s
ComparatorBenchmarks.put java_reverse_bytewise_direct_no-reuse thrpt 25 11781.955 ± 1019.843 ops/s
```

Pull Request resolved: #6252
Differential Revision: D19331064
Pulled By: pdillinger
fbshipit-source-id: 1f3b794e6a14162b2c3ffb943e8c0e64a0c03738
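To illustrate what comparing keys held in `ByteBuffer`s (direct or non-direct) can look like, here is a minimal sketch of an unsigned bytewise comparison. It uses only `java.nio` and deliberately does not use the RocksJava API; the class and method names are hypothetical and this is not the code from the commit.

```java
import java.nio.ByteBuffer;

public class BytewiseCompareSketch {
    // Unsigned lexicographic comparison of the remaining bytes of two buffers.
    // Uses absolute gets, so it works for both heap and direct buffers and
    // leaves their positions untouched.
    public static int compare(ByteBuffer a, ByteBuffer b) {
        int minLen = Math.min(a.remaining(), b.remaining());
        for (int i = 0; i < minLen; i++) {
            int x = a.get(a.position() + i) & 0xFF; // treat byte as unsigned
            int y = b.get(b.position() + i) & 0xFF;
            if (x != y) {
                return x - y;
            }
        }
        // All shared bytes are equal: the shorter key sorts first.
        return a.remaining() - b.remaining();
    }

    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.wrap(new byte[] {1, 2, 3});
        ByteBuffer direct = ByteBuffer.allocateDirect(3);
        direct.put(new byte[] {1, 2, (byte) 0x80});
        direct.flip();
        // 0x03 < 0x80 when bytes are treated as unsigned.
        System.out.println(compare(heap, direct) < 0);
    }
}
```

The masking with `0xFF` matters because Java bytes are signed; without it, `0x80` would compare as negative and sort before `0x03`.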

adamretter added enhancement java-api labels Dec 23, 2019

adamretter requested review from gfosco, siying and pdillinger December 23, 2019 17:16

facebook-github-bot added the CLA Signed label Dec 23, 2019

adamretter force-pushed the feature/rocksdbjni-jmh branch 2 times, most recently from 8f5fcf8 to 0e55e04 Compare January 2, 2020 17:48

pdillinger approved these changes Jan 2, 2020

adamretter mentioned this pull request Jan 3, 2020

Improve RocksJava Comparator #6252

Closed

adamretter added 6 commits January 3, 2020 10:56

JMH microbenchmarks for RocksJava

afb0960

Put benchmarks

212fce6

Get benchmarks

cb294d3

MultiGet benchmarks

6a967da

Correct directory for Put benchmarks

9e92ae8

Use implicit blackholes

9e41917

adamretter force-pushed the feature/rocksdbjni-jmh branch from 0e55e04 to 9e41917 Compare January 3, 2020 10:56

pdillinger approved these changes Jan 6, 2020

facebook-github-bot reviewed Jan 6, 2020

adamretter added 2 commits January 7, 2020 20:46

Fix broken link

dd9e70c

Add missing newline

0e7447a

facebook-github-bot reviewed Jan 7, 2020

facebook-github-bot closed this in 6477075 Jan 7, 2020

facebook-github-bot added the Merged label Jan 8, 2020

adamretter deleted the feature/rocksdbjni-jmh branch April 29, 2020 18:41

This pull request was closed.
