
Adding microbenchmarks for encoder/decoder. #24

Closed
wants to merge 10 commits

Conversation

nmittler
Contributor

Fixes #23

@nmittler
Contributor Author

@jpinner I went with the multi-module approach ... just thought it was simpler than profiles with a custom artifact.

This code was taken from the work @Scottmitch had done in the Netty Http2FrameWriterBenchmark.

/cc @Scottmitch @louiscryan

@nmittler
Contributor Author

Early results on my machine:

*** ENCODER ***
Benchmark                (duplicates)  (limitToAscii)  (maxTableSize)  (sensitive)  (size)   Mode  Cnt        Score       Error  Units
EncoderBenchmark.encode          true            true            4096         true   SMALL  thrpt    5   144598.269 ±  3421.345  ops/s
EncoderBenchmark.encode          true            true            4096         true  MEDIUM  thrpt    5    19030.893 ±   571.609  ops/s
EncoderBenchmark.encode          true            true            4096         true   LARGE  thrpt    5     1211.194 ±    33.147  ops/s
EncoderBenchmark.encode          true            true            4096        false   SMALL  thrpt    5   533033.716 ±  6770.979  ops/s
EncoderBenchmark.encode          true            true            4096        false  MEDIUM  thrpt    5   176330.303 ±  5114.456  ops/s
EncoderBenchmark.encode          true            true            4096        false   LARGE  thrpt    5    16089.896 ±   175.859  ops/s
EncoderBenchmark.encode          true           false            4096         true   SMALL  thrpt    5   876345.143 ± 21624.669  ops/s
EncoderBenchmark.encode          true           false            4096         true  MEDIUM  thrpt    5   167799.886 ±  3948.852  ops/s
EncoderBenchmark.encode          true           false            4096         true   LARGE  thrpt    5    15147.187 ±   254.053  ops/s
EncoderBenchmark.encode          true           false            4096        false   SMALL  thrpt    5  1287017.653 ± 29359.893  ops/s
EncoderBenchmark.encode          true           false            4096        false  MEDIUM  thrpt    5   298737.613 ±  5634.058  ops/s
EncoderBenchmark.encode          true           false            4096        false   LARGE  thrpt    5    18372.766 ±   222.920  ops/s
EncoderBenchmark.encode         false            true            4096         true   SMALL  thrpt    5   147270.968 ±  2546.198  ops/s
EncoderBenchmark.encode         false            true            4096         true  MEDIUM  thrpt    5    19334.192 ±   386.885  ops/s
EncoderBenchmark.encode         false            true            4096         true   LARGE  thrpt    5     1225.367 ±    17.731  ops/s
EncoderBenchmark.encode         false            true            4096        false   SMALL  thrpt    5   130446.563 ±  3996.027  ops/s
EncoderBenchmark.encode         false            true            4096        false  MEDIUM  thrpt    5    17788.704 ±   280.622  ops/s
EncoderBenchmark.encode         false            true            4096        false   LARGE  thrpt    5     1118.390 ±    27.787  ops/s
EncoderBenchmark.encode         false           false            4096         true   SMALL  thrpt    5   837626.917 ± 20311.898  ops/s
EncoderBenchmark.encode         false           false            4096         true  MEDIUM  thrpt    5   160997.243 ±  3107.243  ops/s
EncoderBenchmark.encode         false           false            4096         true   LARGE  thrpt    5    12217.239 ±   339.904  ops/s
EncoderBenchmark.encode         false           false            4096        false   SMALL  thrpt    5   513390.074 ± 17306.385  ops/s
EncoderBenchmark.encode         false           false            4096        false  MEDIUM  thrpt    5    85446.719 ±  1965.551  ops/s
EncoderBenchmark.encode         false           false            4096        false   LARGE  thrpt    5     7098.081 ±   153.516  ops/s

*** DECODER ***
Benchmark                (limitToAscii)  (maxHeaderSize)  (maxTableSize)  (sensitive)  (size)   Mode  Cnt        Score        Error  Units
DecoderBenchmark.decode            true             8192            4096         true   SMALL  thrpt    5   412684.628 ±  14461.568  ops/s
DecoderBenchmark.decode            true             8192            4096         true  MEDIUM  thrpt    5    62778.412 ±   1304.433  ops/s
DecoderBenchmark.decode            true             8192            4096         true   LARGE  thrpt    5     3328.122 ±     86.049  ops/s
DecoderBenchmark.decode            true             8192            4096        false   SMALL  thrpt    5   429827.618 ±  13236.084  ops/s
DecoderBenchmark.decode            true             8192            4096        false  MEDIUM  thrpt    5    61995.807 ±   1717.544  ops/s
DecoderBenchmark.decode            true             8192            4096        false   LARGE  thrpt    5     3305.783 ±    101.331  ops/s
DecoderBenchmark.decode           false             8192            4096         true   SMALL  thrpt    5  1734749.592 ±  40199.111  ops/s
DecoderBenchmark.decode           false             8192            4096         true  MEDIUM  thrpt    5   494409.765 ±  13761.175  ops/s
DecoderBenchmark.decode           false             8192            4096         true   LARGE  thrpt    5    70768.210 ±   1877.377  ops/s
DecoderBenchmark.decode           false             8192            4096        false   SMALL  thrpt    5  1659467.543 ± 100223.534  ops/s
DecoderBenchmark.decode           false             8192            4096        false  MEDIUM  thrpt    5   464148.755 ±   8314.758  ops/s
DecoderBenchmark.decode           false             8192            4096        false   LARGE  thrpt    5    67834.973 ±   1510.951  ops/s

decoder.decode(new ByteArrayInputStream(input), new HeaderListener() {
    @Override
    public void addHeader(byte[] name, byte[] value, boolean sensitive) {
        // Do nothing.
    }
});


I wonder if this will get optimized away? Perhaps we should do something in here...BlackHole or something?

Contributor Author


Done.

@Scottmitch

@nmittler - Not sure if you saw netty/netty#3503 (comment), but it describes some things that may have been limiting in what I had in Netty. It looks like you're already exercising the sensitive parameter, but some of the other items may still be relevant.

@nmittler
Contributor Author

@Scottmitch good point. In particular your points 1 and 2 might be worth exploring in these benchmarks.

@buchgr in your benchmarking for grpc have you put any thought into "representational" header name/values?

@buchgr

buchgr commented Apr 24, 2015

@nmittler I collected some grpc headers for my headers benchmarks https://github.com/netty/netty/pull/3681/files#diff-ca4f23f3c28ec164487afb498dc26e5dR37 other than that I don't have any, but if you collect a few headers I am interested to also include them in the Netty PR :-).

@buchgr

buchgr commented Apr 24, 2015

@nmittler at least in grpc we are not spending much time in hpack in our load tests (~2%), but then again we also don't use many headers.
The decode and encode methods are definitely a good starting point for benchmarking.

Before starting on micro-optimizations, I think it would make sense to do some load tests with a small client/server that uses lots of headers and is built on top of Netty's HTTP/2 codec. We could then look at it with a profiler and see what the total optimization potential is and where the time is going.

@tatsuhiro-t

Since the Encoder only uses Huffman coding when it makes the string strictly shorter, it will rarely use Huffman encoding in the Decoder benchmark: each byte is drawn from the full 0-255 range, and infrequently used bytes outside the ASCII range map to very long Huffman codes. To get a more realistic benchmark, perhaps we can limit the random values to printable ASCII?

@jpinner
Collaborator

jpinner commented Apr 26, 2015

When I did some benchmarking I used two sets of random strings -- one across the full range of bytes and one limited to ASCII characters.
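That two-set approach can be sketched like this (a sketch only: the printable-ASCII range and the names here are illustrative, not the PR's actual generator):

```java
import java.util.Random;

class RandomStrings {
    // Hypothetical printable-ASCII range; the PR's actual limitToAscii set is
    // upper/lowercase letters plus a couple of other characters.
    static final int ASCII_LOW = 0x20;
    static final int ASCII_HIGH = 0x7e;

    static byte[] randomBytes(Random rnd, int length, boolean limitToAscii) {
        byte[] out = new byte[length];
        if (limitToAscii) {
            for (int i = 0; i < length; i++) {
                out[i] = (byte) (ASCII_LOW + rnd.nextInt(ASCII_HIGH - ASCII_LOW + 1));
            }
        } else {
            rnd.nextBytes(out); // full 0-255 byte range
        }
        return out;
    }
}
```

The limited set matters for Huffman: printable-ASCII bytes have short codes, so the encoder will actually choose Huffman encoding, whereas full-range bytes rarely compress.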

@nmittler
Contributor Author

PTAL, I've added an extra parameter, limitToAscii, which uses a reduced ASCII character set (upper/lowercase letters and a couple of other characters). I've updated the results with these changes.

@nmittler
Contributor Author

@buchgr I understand that grpc isn't spending a lot of time in hpack, but I still think it's useful to have the benchmarks to get a basic feel for how well it performs overall and so that we have a basis for comparison going forward.

@buchgr

buchgr commented Apr 28, 2015

@nmittler yeah absolutely. I think those benchmarks are great. I guess what I wanted to say is that, before you and @Scottmitch spend time on micro-optimizations as a next step, it might make sense to see what the optimization potential is, to gauge whether this is high priority or not. My guess would be that, since this stuff runs in production at Twitter, it likely performs pretty decently.

@Scottmitch

@buchgr - +1.

@jpinner - Are you using this library with the HTTP/2 codec you open-sourced a while back, or something else? I'm curious what your interface to this looks like (do you have InputStream, OutputStream, and byte[] handy, or is there some amount of copying going on)?

@jpinner
Collaborator

jpinner commented Apr 28, 2015

@Scottmitch we use this with the http/2 codec. For decoding there is some copying into a "cumulation" buffer, as in most of the Netty frame decoders, and then we wrap the buffer in a ByteBufInputStream. For encoding we create a ByteBufOutputStream. See:

https://github.com/twitter/netty-http2/blob/master/src/main/java/com/twitter/http2/HttpHeaderBlockDecoder.java

https://github.com/twitter/netty-http2/blob/master/src/main/java/com/twitter/http2/HttpHeaderBlockEncoder.java
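The wrap-rather-than-copy pattern described here can be sketched with plain JDK streams; Netty's ByteBufInputStream/ByteBufOutputStream play the same role over a ByteBuf (echo is a hypothetical stand-in for the actual decode/encode call):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

class StreamWrap {
    // Sketch: decode reads from a stream wrapping the cumulation buffer;
    // encode writes into a stream wrapping the output buffer. The copy loop
    // stands in for the hpack Decoder/Encoder calls.
    static byte[] echo(byte[] cumulation) {
        ByteArrayInputStream in = new ByteArrayInputStream(cumulation);  // ~ ByteBufInputStream
        ByteArrayOutputStream out = new ByteArrayOutputStream();         // ~ ByteBufOutputStream
        int b;
        while ((b = in.read()) != -1) {
            out.write(b);
        }
        return out.toByteArray();
    }
}
```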

@nmittler
Contributor Author

@buchgr +1 :)

@jpinner thanks for the links!

@Scottmitch I've raised netty/netty#3700 to compare Netty against the Twitter encoder/decoder.

@nmittler
Contributor Author

@buchgr @Scottmitch @jpinner any other changes you'd like to see?

public void addHeader(byte[] name, byte[] value, boolean sensitive) {
    bh.consume(name);
    bh.consume(value);
    bh.consume(sensitive);
}


@nmittler nit: it should be enough to consume just one, i.e. the boolean?

Contributor Author


done.

@buchgr

buchgr commented Apr 28, 2015

@nmittler LGTM.


private static final Map<HeadersKey, List<Header>> headersMap;
static {
    headersMap = new HashMap<HeadersKey, List<Header>>();


Initialize the size to HeadersSize.values().length?

Contributor Author


done.
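Pre-sizing the map as suggested could look like this (a sketch with a stand-in HeadersSize enum; note that values() returns an array, so the constant count is .length, not .size()):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PreSized {
    // Stand-in for the benchmark's HeadersSize enum.
    enum HeadersSize { SMALL, MEDIUM, LARGE }

    // Size the map up front from the number of enum constants, so it never
    // needs to rehash as the static initializer fills it in.
    static Map<HeadersSize, List<String>> newMap() {
        return new HashMap<HeadersSize, List<String>>(HeadersSize.values().length);
    }
}
```

An EnumMap would also fit here, since the key type is an enum.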

OutputStream outputStream = new ByteArrayOutputStream(1048576);
for (int i = 0; i < headers.size(); ++i) {
    // If duplicates is set, re-add the same header each time.
    Header header = duplicates ? headers.get(0) : headers.get(i);


Maybe have 2 loops based upon this condition? We don't need the conditional overhead in each loop iteration to be part of the benchmark, and duplicates shouldn't be changing in the benchmark.

Contributor Author


done.
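The two-loop restructuring suggested above might look like this (a sketch; headers, duplicates, and process are stand-ins for the benchmark's fields and per-header work):

```java
import java.util.List;

class LoopSplit {
    // Hoist the duplicates check out of the hot loop so the benchmark
    // doesn't measure a per-iteration branch that never changes.
    static int encodeAll(List<String> headers, boolean duplicates) {
        int encoded = 0;
        if (duplicates) {
            String first = headers.get(0);
            for (int i = 0; i < headers.size(); ++i) {
                encoded += process(first); // re-encode the same header each time
            }
        } else {
            for (int i = 0; i < headers.size(); ++i) {
                encoded += process(headers.get(i));
            }
        }
        return encoded;
    }

    // Stand-in for the real per-header encode call.
    static int process(String header) {
        return header.length();
    }
}
```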

@Scottmitch

@nmittler - LGTM!

* See the License for the specific language governing permissions and
* limitations under the License.
*/
package com.twitter.hpack.microbench;
Collaborator


is there a reason these files don't follow the maven directory hierarchy?

the netty version uses the src/test/java/package-name.../microbench directory structure

Contributor Author


@jpinner oof, good catch! I've replaced com.twitter.hpack.microbench with com/twitter/hpack/microbench.

Collaborator


does it also need to be moved under src/test like in netty?


Contributor Author


Actually, Netty has it under src/main: https://github.com/netty/netty/tree/master/microbench/src

Collaborator


so we need to move it under "src" still?


Contributor Author


Ah missed that ... done.

@BenchmarkMode(Mode.Throughput)
public void encode(Blackhole bh) throws IOException {
    Encoder encoder = new Encoder(maxTableSize);
    OutputStream outputStream = new ByteArrayOutputStream(1048576);


@nmittler - Is this number significant, or is it just 1 << 20 which is assumed to be big enough? Maybe add a comment?

Contributor Author


See previous reply.
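For reference, 1048576 is just 1 << 20 (1 MiB); a named constant would make that intent explicit. A sketch (Buffers and MAX_OUTPUT_SIZE are hypothetical names, not the PR's code):

```java
import java.io.ByteArrayOutputStream;

class Buffers {
    // 1 MiB scratch buffer; assumed (not verified here) to be large enough
    // for any encoded header block the benchmark produces.
    static final int MAX_OUTPUT_SIZE = 1 << 20; // == 1048576

    static ByteArrayOutputStream newOutput() {
        return new ByteArrayOutputStream(MAX_OUTPUT_SIZE);
    }
}
```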

@nmittler
Contributor Author

@Scottmitch @jpinner PTAL

<parent>
    <groupId>com.twitter</groupId>
    <artifactId>hpack-parent</artifactId>
    <version>0.10.2-SNAPSHOT</version>
Collaborator


let's update this to 0.11.0-SNAPSHOT

Contributor Author


done.

@jpinner
Collaborator

jpinner commented Apr 29, 2015

@nmittler looks great! just a few nits above -- thanks for getting this all set up!

/**
* Enum that indicates the size of the headers to be used for the benchmark.
*/
public enum HeadersSize {


The enum is private but all its methods are package-protected. Should we just make the class package-protected and the methods public?

Contributor Author


It has to be public for use with JMH. I can make the methods public for consistency.


I see. Yah just make the methods public too I guess.

@Scottmitch

@nmittler - LGTM! Just a few cleanup questions.

@nmittler
Contributor Author

@Scottmitch @jpinner committed changes ... anything else?

jpinner closed this Apr 30, 2015
@nmittler
Contributor Author

Great! thanks for the review, guys! :)

Successfully merging this pull request may close these issues.

Benchmarks for hpack
5 participants