
Adding microbenchmarks for encoder/decoder. #24

Closed
wants to merge 10 commits

Conversation

nmittler
Contributor

Fixes #23

@nmittler
Contributor Author

@jpinner I went with the multi-module approach ... just thought it was simpler than profiles with a custom artifact.

This code was taken from the work @Scottmitch had done in the Netty Http2FrameWriterBenchmark.

/cc @Scottmitch @louiscryan

@nmittler
Contributor Author

Early results on my machine:

*** ENCODER ***
Benchmark                (duplicates)  (limitToAscii)  (maxTableSize)  (sensitive)  (size)   Mode  Cnt        Score       Error  Units
EncoderBenchmark.encode          true            true            4096         true   SMALL  thrpt    5   144598.269 ±  3421.345  ops/s
EncoderBenchmark.encode          true            true            4096         true  MEDIUM  thrpt    5    19030.893 ±   571.609  ops/s
EncoderBenchmark.encode          true            true            4096         true   LARGE  thrpt    5     1211.194 ±    33.147  ops/s
EncoderBenchmark.encode          true            true            4096        false   SMALL  thrpt    5   533033.716 ±  6770.979  ops/s
EncoderBenchmark.encode          true            true            4096        false  MEDIUM  thrpt    5   176330.303 ±  5114.456  ops/s
EncoderBenchmark.encode          true            true            4096        false   LARGE  thrpt    5    16089.896 ±   175.859  ops/s
EncoderBenchmark.encode          true           false            4096         true   SMALL  thrpt    5   876345.143 ± 21624.669  ops/s
EncoderBenchmark.encode          true           false            4096         true  MEDIUM  thrpt    5   167799.886 ±  3948.852  ops/s
EncoderBenchmark.encode          true           false            4096         true   LARGE  thrpt    5    15147.187 ±   254.053  ops/s
EncoderBenchmark.encode          true           false            4096        false   SMALL  thrpt    5  1287017.653 ± 29359.893  ops/s
EncoderBenchmark.encode          true           false            4096        false  MEDIUM  thrpt    5   298737.613 ±  5634.058  ops/s
EncoderBenchmark.encode          true           false            4096        false   LARGE  thrpt    5    18372.766 ±   222.920  ops/s
EncoderBenchmark.encode         false            true            4096         true   SMALL  thrpt    5   147270.968 ±  2546.198  ops/s
EncoderBenchmark.encode         false            true            4096         true  MEDIUM  thrpt    5    19334.192 ±   386.885  ops/s
EncoderBenchmark.encode         false            true            4096         true   LARGE  thrpt    5     1225.367 ±    17.731  ops/s
EncoderBenchmark.encode         false            true            4096        false   SMALL  thrpt    5   130446.563 ±  3996.027  ops/s
EncoderBenchmark.encode         false            true            4096        false  MEDIUM  thrpt    5    17788.704 ±   280.622  ops/s
EncoderBenchmark.encode         false            true            4096        false   LARGE  thrpt    5     1118.390 ±    27.787  ops/s
EncoderBenchmark.encode         false           false            4096         true   SMALL  thrpt    5   837626.917 ± 20311.898  ops/s
EncoderBenchmark.encode         false           false            4096         true  MEDIUM  thrpt    5   160997.243 ±  3107.243  ops/s
EncoderBenchmark.encode         false           false            4096         true   LARGE  thrpt    5    12217.239 ±   339.904  ops/s
EncoderBenchmark.encode         false           false            4096        false   SMALL  thrpt    5   513390.074 ± 17306.385  ops/s
EncoderBenchmark.encode         false           false            4096        false  MEDIUM  thrpt    5    85446.719 ±  1965.551  ops/s
EncoderBenchmark.encode         false           false            4096        false   LARGE  thrpt    5     7098.081 ±   153.516  ops/s

*** DECODER ***
Benchmark                (limitToAscii)  (maxHeaderSize)  (maxTableSize)  (sensitive)  (size)   Mode  Cnt        Score        Error  Units
DecoderBenchmark.decode            true             8192            4096         true   SMALL  thrpt    5   412684.628 ±  14461.568  ops/s
DecoderBenchmark.decode            true             8192            4096         true  MEDIUM  thrpt    5    62778.412 ±   1304.433  ops/s
DecoderBenchmark.decode            true             8192            4096         true   LARGE  thrpt    5     3328.122 ±     86.049  ops/s
DecoderBenchmark.decode            true             8192            4096        false   SMALL  thrpt    5   429827.618 ±  13236.084  ops/s
DecoderBenchmark.decode            true             8192            4096        false  MEDIUM  thrpt    5    61995.807 ±   1717.544  ops/s
DecoderBenchmark.decode            true             8192            4096        false   LARGE  thrpt    5     3305.783 ±    101.331  ops/s
DecoderBenchmark.decode           false             8192            4096         true   SMALL  thrpt    5  1734749.592 ±  40199.111  ops/s
DecoderBenchmark.decode           false             8192            4096         true  MEDIUM  thrpt    5   494409.765 ±  13761.175  ops/s
DecoderBenchmark.decode           false             8192            4096         true   LARGE  thrpt    5    70768.210 ±   1877.377  ops/s
DecoderBenchmark.decode           false             8192            4096        false   SMALL  thrpt    5  1659467.543 ± 100223.534  ops/s
DecoderBenchmark.decode           false             8192            4096        false  MEDIUM  thrpt    5   464148.755 ±   8314.758  ops/s
DecoderBenchmark.decode           false             8192            4096        false   LARGE  thrpt    5    67834.973 ±   1510.951  ops/s

decoder.decode(new ByteArrayInputStream(input), new HeaderListener() {
    @Override
    public void addHeader(byte[] name, byte[] value, boolean sensitive) {
        // Do nothing.
    }
});


I wonder if this will get optimized away? Perhaps we should do something in here...BlackHole or something?

Contributor Author


Done.

@Scottmitch

@nmittler - Not sure if you saw netty/netty#3503 (comment), but it describes some things that may have been limiting in what I had in Netty. It looks like you're already exercising the sensitive parameter, but some of the other items may still be relevant.

@nmittler
Contributor Author

@Scottmitch good point. In particular your points 1 and 2 might be worth exploring in these benchmarks.

@buchgr in your benchmarking for grpc have you put any thought into "representational" header name/values?

@buchgr

buchgr commented Apr 24, 2015

@nmittler I collected some grpc headers for my headers benchmarks https://github.com/netty/netty/pull/3681/files#diff-ca4f23f3c28ec164487afb498dc26e5dR37 other than that I don't have any, but if you collect a few headers I am interested to also include them in the Netty PR :-).

@buchgr

buchgr commented Apr 24, 2015

@nmittler at least in grpc we are not spending much time in hpack in our load tests (~2%), but then again we also don't use many headers.
The decode and encode methods are definitely a good starting point for benchmarking.

Before starting on micro-optimizations, I think it would make sense to do some load tests with a small client/server that uses lots of headers and is built on top of Netty's HTTP/2 codec. We could then look at it with a profiler and see what the total optimization potential is and where the time is going.

@tatsuhiro-t

Since the Encoder only uses Huffman coding when it makes the string strictly shorter, it will rarely use Huffman encoding in the Decoder benchmark: each byte is drawn from the full 0-255 range, and infrequently used bytes outside the ASCII range map to very long Huffman codes. To get a more realistic benchmark, perhaps we can limit the random values to printable ASCII?

@jpinner
Collaborator

jpinner commented Apr 26, 2015

When I did some benchmarking I used two sets of random strings -- one across the full range of bytes and one limited to ASCII characters.
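That two-set approach can be sketched like this (a sketch only: the printable-ASCII range and the names here are illustrative, not the PR's actual generator):

```java
import java.util.Random;

class RandomStrings {
    // Hypothetical printable-ASCII range; the PR's actual limitToAscii set is
    // upper/lowercase letters plus a couple of other characters.
    static final int ASCII_LOW = 0x20;
    static final int ASCII_HIGH = 0x7e;

    static byte[] randomBytes(Random rnd, int length, boolean limitToAscii) {
        byte[] out = new byte[length];
        if (limitToAscii) {
            for (int i = 0; i < length; i++) {
                out[i] = (byte) (ASCII_LOW + rnd.nextInt(ASCII_HIGH - ASCII_LOW + 1));
            }
        } else {
            rnd.nextBytes(out); // full 0-255 byte range
        }
        return out;
    }
}
```

The limited set matters for Huffman: printable-ASCII bytes have short codes, so the encoder will actually choose Huffman encoding, whereas full-range bytes rarely compress.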

@nmittler
Contributor Author

PTAL, I've added an extra parameter, limitToAscii, which uses a reduced ASCII character set (upper/lowercase letters and a couple of other characters). I've updated the results with these changes.

@nmittler
Contributor Author

@buchgr I understand that grpc isn't spending a lot of time in hpack, but I still think it's useful to have the benchmarks to get a basic feel for how well it performs overall and so that we have a basis for comparison going forward.

@buchgr

buchgr commented Apr 28, 2015

@nmittler yeah absolutely. I think those benchmarks are great. I guess what I wanted to say is that, before you and @Scottmitch spend time on micro-optimizations as a next step, it might make sense to see what the optimization potential is, to gauge whether this is high priority or not. My guess would be that, since this stuff runs in production at Twitter, it likely performs pretty decently.

@Scottmitch

@buchgr - +1.

@jpinner - Are you using this library with the HTTP/2 codec you open-sourced a while back, or something else? I'm curious what your interface to this looks like (do you have InputStream, OutputStream, and byte[] handy, or is there some amount of copying going on)?

@jpinner
Collaborator

jpinner commented Apr 28, 2015

@Scottmitch we use this with the http/2 codec. For decoding there is some copying into a "cumulation" buffer, as in most of the Netty frame decoders, and then we wrap the buffer in a ByteBufInputStream. For encoding we create a ByteBufOutputStream. See:

https://github.com/twitter/netty-http2/blob/master/src/main/java/com/twitter/http2/HttpHeaderBlockDecoder.java

https://github.com/twitter/netty-http2/blob/master/src/main/java/com/twitter/http2/HttpHeaderBlockEncoder.java
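The wrap-rather-than-copy pattern described here can be sketched with plain JDK streams; Netty's ByteBufInputStream/ByteBufOutputStream play the same role over a ByteBuf (echo is a hypothetical stand-in for the actual decode/encode call):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

class StreamWrap {
    // Sketch: decode reads from a stream wrapping the cumulation buffer;
    // encode writes into a stream wrapping the output buffer. The copy loop
    // stands in for the hpack Decoder/Encoder calls.
    static byte[] echo(byte[] cumulation) {
        ByteArrayInputStream in = new ByteArrayInputStream(cumulation);  // ~ ByteBufInputStream
        ByteArrayOutputStream out = new ByteArrayOutputStream();         // ~ ByteBufOutputStream
        int b;
        while ((b = in.read()) != -1) {
            out.write(b);
        }
        return out.toByteArray();
    }
}
```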

@nmittler
Contributor Author

@buchgr +1 :)

@jpinner thanks for the links!

@Scottmitch I've raised netty/netty#3700 to compare Netty against the Twitter encoder/decoder.

@nmittler
Contributor Author

@buchgr @Scottmitch @jpinner any other changes you'd like to see?

public void addHeader(byte[] name, byte[] value, boolean sensitive) {
    bh.consume(name);
    bh.consume(value);
    bh.consume(sensitive);
}


@nmittler nit: it should be enough to consume just one, i.e. the boolean?

Contributor Author


done.

@buchgr

buchgr commented Apr 28, 2015

@nmittler LGTM.


private static final Map<HeadersKey, List<Header>> headersMap;
static {
    headersMap = new HashMap<HeadersKey, List<Header>>();


Initialize the size to HeadersSize.values().length?

Contributor Author


done.
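Pre-sizing the map as suggested could look like this (a sketch with a stand-in HeadersSize enum; note that values() returns an array, so the constant count is .length, not .size()):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PreSized {
    // Stand-in for the benchmark's HeadersSize enum.
    enum HeadersSize { SMALL, MEDIUM, LARGE }

    // Size the map up front from the number of enum constants, so it never
    // needs to rehash as the static initializer fills it in.
    static Map<HeadersSize, List<String>> newMap() {
        return new HashMap<HeadersSize, List<String>>(HeadersSize.values().length);
    }
}
```

An EnumMap would also fit here, since the key type is an enum.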

OutputStream outputStream = new ByteArrayOutputStream(1048576);
for (int i = 0; i < headers.size(); ++i) {
    // If duplicates is set, re-add the same header each time.
    Header header = duplicates ? headers.get(0) : headers.get(i);


Maybe have 2 loops based upon this condition? We don't need the conditional overhead in each loop iteration to be part of the benchmark, and duplicates shouldn't be changing in the benchmark.

Contributor Author


done.
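The two-loop restructuring suggested above might look like this (a sketch; headers, duplicates, and process are stand-ins for the benchmark's fields and per-header work):

```java
import java.util.List;

class LoopSplit {
    // Hoist the duplicates check out of the hot loop so the benchmark
    // doesn't measure a per-iteration branch that never changes.
    static int encodeAll(List<String> headers, boolean duplicates) {
        int encoded = 0;
        if (duplicates) {
            String first = headers.get(0);
            for (int i = 0; i < headers.size(); ++i) {
                encoded += process(first); // re-encode the same header each time
            }
        } else {
            for (int i = 0; i < headers.size(); ++i) {
                encoded += process(headers.get(i));
            }
        }
        return encoded;
    }

    // Stand-in for the real per-header encode call.
    static int process(String header) {
        return header.length();
    }
}
```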

@Scottmitch

@nmittler - LGTM!

* See the License for the specific language governing permissions and
* limitations under the License.
*/
package com.twitter.hpack.microbench;
Collaborator


is there a reason these files don't follow the maven directory hierarchy?

the netty version uses the src/test/java/package-name.../microbench directory structure

Contributor Author


@jpinner oof, good catch! I've replaced com.twitter.hpack.microbench with com/twitter/hpack/microbench.

Collaborator


does it also need to be moved under src/test like in netty?


Contributor Author


Actually, Netty has it under src/main: https://github.com/netty/netty/tree/master/microbench/src

Collaborator


so we need to move it under "src" still?


Contributor Author


Ah missed that ... done.

@BenchmarkMode(Mode.Throughput)
public void encode(Blackhole bh) throws IOException {
    Encoder encoder = new Encoder(maxTableSize);
    OutputStream outputStream = new ByteArrayOutputStream(1048576);


@nmittler - Is this number significant, or is it just 1 << 20 which is assumed to be big enough? Maybe add a comment?

Contributor Author


See previous reply.
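For reference, 1048576 is just 1 << 20 (1 MiB); a named constant would make that intent explicit. A sketch (Buffers and MAX_OUTPUT_SIZE are hypothetical names, not the PR's code):

```java
import java.io.ByteArrayOutputStream;

class Buffers {
    // 1 MiB scratch buffer; assumed (not verified here) to be large enough
    // for any encoded header block the benchmark produces.
    static final int MAX_OUTPUT_SIZE = 1 << 20; // == 1048576

    static ByteArrayOutputStream newOutput() {
        return new ByteArrayOutputStream(MAX_OUTPUT_SIZE);
    }
}
```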

@nmittler
Contributor Author

@Scottmitch @jpinner PTAL

<parent>
    <groupId>com.twitter</groupId>
    <artifactId>hpack-parent</artifactId>
    <version>0.10.2-SNAPSHOT</version>
Collaborator


let's update this to 0.11.0-SNAPSHOT

Contributor Author


done.

@jpinner
Collaborator

jpinner commented Apr 29, 2015

@nmittler looks great! just a few nits above -- thanks for getting this all set up!

/**
* Enum that indicates the size of the headers to be used for the benchmark.
*/
public enum HeadersSize {


The enum is private but all its methods are package-protected. Should we just make the class package-protected and the methods public?

Contributor Author


It has to be public for use with JMH. I can make the methods public for consistency.


I see. Yah just make the methods public too I guess.

@Scottmitch

@nmittler - LGTM! Just a few cleanup questions.

@nmittler
Contributor Author

@Scottmitch @jpinner committed changes ... anything else?

jpinner closed this Apr 30, 2015
@nmittler
Contributor Author

Great! thanks for the review, guys! :)

Successfully merging this pull request may close these issues.

Benchmarks for hpack
5 participants