Skip to content

Conversation

@adutra
Copy link
Contributor

@adutra adutra commented Feb 8, 2017

Motivation:

RPTokenFactory.md5() is called for almost every query plan by TokenAwarePolicy.
This method in turn calls MessageDigest.getInstance(String),
which is known to create lock contentions (see https://bugs.openjdk.java.net/browse/JDK-7092821).

Modification:

Clone a prototype instance of MessageDigest instead of creating a new one from scratch.

Result:

RPTokenFactory.md5() has an improved performance as no more lock contentions arrive.

@adutra adutra added this to the 3.2.0 milestone Feb 8, 2017
@adutra
Copy link
Contributor Author

adutra commented Feb 8, 2017

Note to reviewers: com.google.common.hash.Hashing is marked beta. Do you think this could be of a concern for us?

@adutra
Copy link
Contributor Author

adutra commented Feb 8, 2017

Nevermind, after a quick benchmark, it appears that Guava's Hashing is not that performant compared to a simple prototype implementation.

Benchmark code here.

Results on my local machine:

Benchmark                                              Mode  Cnt         Score        Error  Units
TokenBenchmark.benchmarkGuavaHashing                  thrpt   10   8585051.684 ± 338078.968  ops/s
TokenBenchmark.benchmarkMessageDigestCustomPrototype  thrpt   10  10545521.389 ± 640569.143  ops/s
TokenBenchmark.benchmarkMessageDigestGetInstance      thrpt   10   4465755.653 ± 274247.054  ops/s

So I will push a simplified implementation shortly.

@tleach
Copy link

tleach commented Feb 8, 2017

I'd suggest adding a a test case in RPTokenFactoryTest.java along the lines of this:

    @Test(groups = "unit")
    public void should_hash_consistently() {
        ByteBuffer byteBuffer = Bytes.fromHexString("0xCAFEBABE");
        Token tokenA = factory.hash(byteBuffer);
        Token tokenB = factory.hash(byteBuffer);
        assertThat(tokenA).isEqualTo(factory.fromString("59959303159920881837560881824507314222"));
        assertThat(tokenA).isEqualTo(tokenB);
    }

I'd added this as a commit to my local dev branch before making any changes to the factory itself to ensure that any changes I did make didn't accidentally result in a different hash value being returned.

@adutra
Copy link
Contributor Author

adutra commented Feb 10, 2017

I'd suggest adding a a test case in RPTokenFactoryTest.java

Added thanks!

Copy link
Contributor

@olim7t olim7t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

ByteBuffer byteBuffer = Bytes.fromHexString("0xCAFEBABE");
Token tokenA = factory.hash(byteBuffer);
Token tokenB = factory.hash(byteBuffer);
assertThat(tokenA).isEqualTo(factory.fromString("59959303159920881837560881824507314222"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trivial: you could have chained the isEqualTo calls.

Copy link
Contributor

@tolbertam tolbertam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it was brought up on the mailing list I updated the Benchmark to include a test using a ThreadLocal to reuse MessageDigest. So I am 👍 on the proposed implementation.

It did appear to give a slight throughput improvement over the clone implementation, but I don't think it offers enough of an improvement to justify possible issues around classloader leaks. I think we can't be certain what threads call hash since both Metadata.getReplicas() and Metadata.newToken() call it and are public API methods, so it probably isn't a good idea to create a ThreadLocal for this.

8 threads (number of cores on my local laptop):

Benchmark                                                      Mode  Cnt         Score        Error  Units
MessageDigestBenchmark.benchmarkGuavaHashing                  thrpt   10  10473072.107 ± 713321.359  ops/s
MessageDigestBenchmark.benchmarkMessageDigestCustomPrototype  thrpt   10  12769089.493 ± 246295.850  ops/s
MessageDigestBenchmark.benchmarkMessageDigestGetInstance      thrpt   10   5334760.325 ± 616163.000  ops/s
MessageDigestBenchmark.benchmarkMessageDigestThreadLocal      thrpt   10  13531501.777 ± 410707.323  ops/s

4 threads (cores / 2):

Benchmark                                                      Mode  Cnt         Score        Error  Units
MessageDigestBenchmark.benchmarkGuavaHashing                  thrpt   10   8977372.306 ± 643934.523  ops/s
MessageDigestBenchmark.benchmarkMessageDigestCustomPrototype  thrpt   10  10462846.827 ± 806385.871  ops/s
MessageDigestBenchmark.benchmarkMessageDigestGetInstance      thrpt   10   7134434.241 ± 116276.480  ops/s
MessageDigestBenchmark.benchmarkMessageDigestThreadLocal      thrpt   10  11669175.419 ± 867734.739  ops/s

16 threads (cores * 2):

Benchmark                                                      Mode  Cnt         Score         Error  Units
MessageDigestBenchmark.benchmarkGuavaHashing                  thrpt   10  11907004.499 ±  263184.075  ops/s
MessageDigestBenchmark.benchmarkMessageDigestCustomPrototype  thrpt   10  12413481.943 ± 1031950.103  ops/s
MessageDigestBenchmark.benchmarkMessageDigestGetInstance      thrpt   10   5652411.878 ±  467668.584  ops/s
MessageDigestBenchmark.benchmarkMessageDigestThreadLocal      thrpt   10  14715318.554 ±  317316.420  ops/s

1 thread:

Benchmark                                                      Mode  Cnt        Score        Error  Units
MessageDigestBenchmark.benchmarkGuavaHashing                  thrpt   10  3123045.280 ±  47916.632  ops/s
MessageDigestBenchmark.benchmarkMessageDigestCustomPrototype  thrpt   10  3437793.199 ± 181377.439  ops/s
MessageDigestBenchmark.benchmarkMessageDigestGetInstance      thrpt   10  2856639.462 ±  56669.009  ops/s
MessageDigestBenchmark.benchmarkMessageDigestThreadLocal      thrpt   10  3832478.922 ±  55020.823  ops/s

Motivation:

RPTokenFactory.md5() is called for almost every query plan by TokenAwarePolicy.
This method in turn calls MessageDigest.getInstance(String),
which is known to create lock contentions (see https://bugs.openjdk.java.net/browse/JDK-7092821).

Modification:

Clone a prototype instance of MessageDigest
instead of creating a new one from scratch, thus avoiding lock contention.
See google/guava#1197.

Result:

RPTokenFactory.md5() has an improved performance as no more lock contentions arrive.
@adutra adutra merged commit a4294ca into 3.x Feb 16, 2017
@adutra adutra deleted the java1392 branch February 16, 2017 10:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants