Commits on May 4, 2015
  1. @louiscryan @nmittler

    Have Http2LocalFlowController.consumeBytes indicate whether a WINDOW_UPDATE was written

    louiscryan authored nmittler committed
Commits on Apr 25, 2015
  1. @Scottmitch

    ByteString arrayOffset method

    Scottmitch authored
    Motivation:
    The ByteString class currently assumes the underlying array is a complete representation of the data. This is limiting because it does not allow a subsection of another array to be used. This forces copy operations to take place to compensate for the lack of API support.
    
    Modifications:
    - add arrayOffset method to ByteString
    - modify all ByteString and AsciiString methods that loop over or index into the underlying array to use this offset
    - update all code that uses ByteString.array to ensure it accounts for the offset
    - add unit tests to test the implementation respects the offset
    
    Result:
    ByteString and AsciiString can represent a sub region of a byte[].
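The offset idea can be sketched in a few lines. The class below is a hypothetical stand-in (`ByteSlice` is not a Netty type); it only illustrates how every index into the backing array has to be shifted by `arrayOffset()` once a view may start in the middle of a shared `byte[]`.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for a ByteString-like view; not Netty's actual class. */
final class ByteSlice {
    private final byte[] array;
    private final int offset;   // index of the first byte of this view in `array`
    private final int length;   // number of bytes in this view

    ByteSlice(byte[] array, int offset, int length) {
        if (offset < 0 || length < 0 || offset > array.length - length) {
            throw new IndexOutOfBoundsException("offset=" + offset + ", length=" + length);
        }
        this.array = array;
        this.offset = offset;
        this.length = length;
    }

    byte[] array() { return array; }
    int arrayOffset() { return offset; }
    int length() { return length; }

    /** The index is relative to the view, so the offset must be added. */
    byte byteAt(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index=" + index);
        }
        return array[offset + index];
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ByteSlice)) return false;
        ByteSlice other = (ByteSlice) o;
        if (length != other.length) return false;
        for (int i = 0; i < length; i++) {
            // Each side applies its own offset.
            if (array[offset + i] != other.array[other.offset + i]) return false;
        }
        return true;
    }

    @Override
    public int hashCode() {
        int h = 1;
        for (int i = offset; i < offset + length; i++) h = 31 * h + array[i];
        return h;
    }

    @Override
    public String toString() {
        return new String(array, offset, length, StandardCharsets.US_ASCII);
    }
}
```

Two views over one request line, for example, can then share the same array without copying.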
Commits on Apr 23, 2015
  1. @nmittler

    Optimizing user-defined stream properties.

    nmittler authored
    Motivation:
    
    Streams currently maintain a hash map of user-defined properties, which has been shown to add significant memory overhead as well as being a performance bottleneck for lookup of frequently used properties.
    
    Modifications:
    
    Modifying the connection/stream to use an array as the storage of user-defined properties, indexed by the class that identifies the index into the array where the property is stored.
    
    Result:
    
    Stream processing performance should be improved.
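As a rough illustration of array-backed property storage, here is a minimal sketch. `PropertyKey` and `PropertyStore` are hypothetical names chosen for this example and are not the types used in the commit; the point is only that a key carries a pre-assigned slot index, so a lookup is a single array access instead of a hash map probe.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical key type: each key reserves one slot index when it is created. */
final class PropertyKey<T> {
    private static final AtomicInteger NEXT_INDEX = new AtomicInteger();
    final int index = NEXT_INDEX.getAndIncrement();

    static int keyCount() { return NEXT_INDEX.get(); }
}

/** Hypothetical per-stream storage backed by a plain array instead of a hash map. */
final class PropertyStore {
    private Object[] slots = new Object[Math.max(1, PropertyKey.keyCount())];

    /** Stores the value and returns the previous one (or null). */
    <T> T set(PropertyKey<T> key, T value) {
        if (key.index >= slots.length) {
            // The key was created after this store: grow just enough to hold it.
            slots = Arrays.copyOf(slots, key.index + 1);
        }
        @SuppressWarnings("unchecked")
        T previous = (T) slots[key.index];
        slots[key.index] = value;
        return previous;
    }

    @SuppressWarnings("unchecked")
    <T> T get(PropertyKey<T> key) {
        return key.index < slots.length ? (T) slots[key.index] : null;
    }
}
```

The trade-off in this sketch is that every store pays for one slot per key ever created, which is cheap when the set of keys is small and fixed.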
Commits on Apr 22, 2015
  1. @Scottmitch

    Compile error introduced in ee9233d

    Scottmitch authored
    Motivation:
    Commit ee9233d introduced a compile error in microbench.
    
    Modifications:
    Fix compile error.
    
    Result:
    Code now builds.
Commits on Apr 21, 2015
  1. @Scottmitch

    HTTP/2 Flow Controller interface updates

    Scottmitch authored
    Motivation:
    Flow control is a required part of the HTTP/2 specification but it is currently structured more like an optional item. It must be accessed through the property map which is time consuming and does not represent its required nature. This access pattern does not give any insight into flow control outside of the codec (or flow controller implementation).
    
    Modifications:
    1. Create a read only public interface for LocalFlowState and RemoteFlowState.
    2. Add a LocalFlowState localFlowState(); and RemoteFlowState remoteFlowState(); to Http2Stream.
    
    Result:
    Flow control is now part of the Http2Stream interface. This clarifies its responsibility and logical relationship to other interfaces. The flow controller no longer has to be acquired through a map lookup.
Commits on Apr 17, 2015
  1. @Scottmitch

    HTTP/2 Priority Tree Benchmark

    Scottmitch authored
    Motivation:
    There is no benchmark to measure the priority tree implementation performance.
    
    Modifications:
    Introduce a new benchmark which will populate the priority tree, and then shuffle parent/child links around.
    
    Result:
    A simple benchmark to get a baseline for the HTTP/2 codec's priority tree implementation.
  2. @louiscryan @Scottmitch

    Have microbenchmarks produce a deployable artifact. Fix some minor miscellaneous issues.

    louiscryan authored Scottmitch committed
    
    Motivation:
    Allows for running benchmarks from built jars which is useful in development environments that only take released artifacts.
    
    Modifications:
    Move benchmarks into 'main' from 'test'
    Add @State annotations to benchmarks that are missing them
    Fix timing issue grabbing context during channel initialization
    
    Result:
    Users can run benchmarks more easily.
  3. @buchgr @Scottmitch

    Improve performance of AsciiString.equals(Object).

    buchgr authored Scottmitch committed
    Motivation:
    
    The current implementation does byte by byte comparison, which we have seen
    can be a performance bottleneck when the AsciiString is used as the key in
    a Map.
    
    Modifications:
    
    Use sun.misc.Unsafe (on supporting platforms) to compare up to eight bytes at a time
    and get closer to the performance of String.equals(Object).
    
    Result:
    
    Significant improvement (2x - 6x) in performance over the current implementation.
    
    Benchmark                                             (size)   Mode   Samples        Score  Score error    Units
    i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual       10  thrpt        10 118843477.518 2347259.347    ops/s
    i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual       50  thrpt        10 43910319.773   198376.996    ops/s
    i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual      100  thrpt        10 26339969.001   159599.252    ops/s
    i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual     1000  thrpt        10  2873119.030    20779.056    ops/s
    i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual    10000  thrpt        10   306370.450     1933.303    ops/s
    i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual   100000  thrpt        10    25750.415      108.391    ops/s
    i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual       10  thrpt        10 248077563.510  635320.093    ops/s
    i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual       50  thrpt        10 128198943.138  614827.548    ops/s
    i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual      100  thrpt        10 86195621.349  1063959.307    ops/s
    i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual     1000  thrpt        10 16920264.598    61615.365    ops/s
    i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual    10000  thrpt        10  1687454.747     6367.602    ops/s
    i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual   100000  thrpt        10   153717.851      586.916    ops/s
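The word-at-a-time idea can be illustrated without sun.misc.Unsafe. The sketch below swaps in `java.nio.ByteBuffer.getLong` for the Unsafe reads, so it shows the technique rather than the commit's implementation, and it makes no claim about matching the throughput numbers above. `WideEquals` is a hypothetical name.

```java
import java.nio.ByteBuffer;

final class WideEquals {
    private WideEquals() { }

    /** Compares two arrays one 8-byte word at a time, then the remaining tail byte by byte. */
    static boolean bytesEqual(byte[] a, byte[] b) {
        if (a == b) return true;
        if (a == null || b == null || a.length != b.length) return false;
        ByteBuffer x = ByteBuffer.wrap(a);
        ByteBuffer y = ByteBuffer.wrap(b);
        int wordEnd = a.length & ~7;   // largest multiple of 8 that fits
        int i = 0;
        for (; i < wordEnd; i += 8) {
            // Absolute reads: one comparison covers eight bytes.
            if (x.getLong(i) != y.getLong(i)) return false;
        }
        for (; i < a.length; i++) {
            if (a[i] != b[i]) return false;
        }
        return true;
    }
}
```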
Commits on Apr 15, 2015
  1. @Scottmitch

    Commit b823bfa missed an include change

    Scottmitch authored
    Motivation:
    b823bfa introduced a compile error for the microbench package.
    
    Modifications:
    change AsciiString import to new package.
    
    Result:
    No more compile error.
Commits on Apr 13, 2015
  1. @nmittler

    Adding basic benchmarks for IntObjectHashMap

    nmittler authored
    Motivation:
    
    It needs to be fast :)
    
    Modifications:
    
    Added a simple benchmark to the microbench module.
    
    Result:
    
    Yay, benchmarks!
  2. @Scottmitch

    HTTP/2 Frame Writer Microbenchmark Fix

    Scottmitch authored
    Motivation:
    The Http2FrameWriterBenchmark JMH harness class name was not updated for the JVM arguments. The number of forks is 0, which means JMH will share a JVM with the benchmarks. Sharing the JVM may lead to less reliable benchmarking results and doesn't allow the command line arguments to be applied for each benchmark.
    
    Modifications:
    - Update the JMH version from 0.9 to 1.7.1.  Benchmarks wouldn't run on old version.
    - Increase the number of forks from 0 to 1.
    - Move allocation of the environment out of static initialization and cleanup out of AfterClass, using the Setup and Teardown methods instead. The forked JVM would not shut down correctly otherwise (and would wait for 30+ seconds before timing out).
    
    Result:
    Benchmarks that run as intended.
Commits on Mar 30, 2015
  1. @nmittler

    Cleaning up the initialization of Http2ConnectionHandler

    nmittler authored
    Motivation:
    
    It currently takes a builder for the encoder and decoder, which makes it difficult to decorate them.
    
    Modifications:
    
    Removed the builders from the interfaces entirely. Left the builder for the decoder impl but removed it from the encoder since its constructor only takes 2 parameters. Also added decorator base classes for the encoder and decoder and made the CompressorHttp2ConnectionEncoder extend the decorator.
    
    Result:
    
    Fixes #3530
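The decorator arrangement can be sketched as follows. All type names here (`FrameEncoder`, `PlainEncoder`, `DecoratingFrameEncoder`, `UpperCasingEncoder`) are hypothetical and heavily reduced; they are not the Netty interfaces. The sketch only shows why a forwarding base class makes decoration easy once the handler accepts ready-made instances instead of builders.

```java
import java.util.Locale;

/** Hypothetical, heavily reduced encoder interface; the real one has many more methods. */
interface FrameEncoder {
    String writeData(int streamId, String payload);
}

final class PlainEncoder implements FrameEncoder {
    @Override
    public String writeData(int streamId, String payload) {
        return "DATA[" + streamId + "]:" + payload;
    }
}

/** Decorator base class: forwards every call to the delegate. */
class DecoratingFrameEncoder implements FrameEncoder {
    private final FrameEncoder delegate;

    DecoratingFrameEncoder(FrameEncoder delegate) {
        if (delegate == null) throw new NullPointerException("delegate");
        this.delegate = delegate;
    }

    @Override
    public String writeData(int streamId, String payload) {
        return delegate.writeData(streamId, payload);
    }
}

/** Overrides only the method it cares about, as a compressing encoder might. */
final class UpperCasingEncoder extends DecoratingFrameEncoder {
    UpperCasingEncoder(FrameEncoder delegate) { super(delegate); }

    @Override
    public String writeData(int streamId, String payload) {
        // Transform the payload, then let the wrapped encoder do the actual write.
        return super.writeData(streamId, payload.toUpperCase(Locale.ROOT));
    }
}
```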
Commits on Mar 27, 2015
  1. @Scottmitch

    Http2DefaultFrameWriter microbenchmark

    Scottmitch authored Scottmitch committed
    Motivation:
    A microbenchmark will be useful to get a baseline for performance.
    
    Modifications:
    - Introduce a new microbenchmark which tests the Http2DefaultFrameWriter.
    - Allow benchmarks to run without thread context switching between JMH and Netty.
    
    Result:
    Microbenchmark exists to test performance.
Commits on Mar 3, 2015
  1. @normanmaurer
  2. @normanmaurer
Commits on Dec 31, 2014
  1. @daschl @trustin

    Fix ByteBufUtilBenchmark on utf8 encodings.

    daschl authored trustin committed
    Motivation
    ----------
    The performance tests for utf8 also used the getBytes on ASCII,
    which is incorrect and also provides different performance numbers.
    
    Modifications
    -------------
    Use CharsetUtil.UTF_8 instead of US_ASCII for the getBytes calls.
    
    Result
    ------
    Accurate and semantically correct benchmarking results on utf8
    comparisons.
Commits on Dec 26, 2014
  1. @normanmaurer @trustin

    Provide helper methods in ByteBufUtil to write UTF-8/ASCII CharSequences. Related to [#909]

    normanmaurer authored trustin committed
    
    Motivation:
    
    We expose no methods in ByteBuf to directly write a CharSequence into it. This forces the user to either convert the CharSequence to a byte array first or use a CharsetEncoder. Both approaches have some overhead, and we can do a lot better for well-known Charsets like UTF-8 and ASCII.
    
    Modifications:
    
    Add ByteBufUtil.writeAscii(...) and ByteBufUtil.writeUtf8(...), which can do the task in an optimized way. This is especially true if the passed-in ByteBuf extends AbstractByteBuf, which is the case for all of our implementations that do not wrap another ByteBuf.
    
    Result:
    
    Writing an ASCII or UTF-8 CharSequence into an AbstractByteBuf is a lot faster than what users could do themselves, as we can make use of some package-private methods and so eliminate reference and range checks. When the CharSequence is not ASCII or UTF-8 we can still do a very good job and are on par in most cases with what the user would do.
    
    The following benchmark shows the improvements:
    
    Result: 2456866.966 ±(99.9%) 59066.370 ops/s [Average]
      Statistics: (min, avg, max) = (2297025.189, 2456866.966, 2586003.225), stdev = 78851.914
      Confidence interval (99.9%): [2397800.596, 2515933.336]
    
    Benchmark                                                        Mode   Samples        Score  Score error    Units
    i.n.m.b.ByteBufUtilBenchmark.writeAscii                         thrpt        50  9398165.238   131503.098    ops/s
    i.n.m.b.ByteBufUtilBenchmark.writeAsciiString                   thrpt        50  9695177.968   176684.821    ops/s
    i.n.m.b.ByteBufUtilBenchmark.writeAsciiStringViaArray           thrpt        50  4788597.415    83181.549    ops/s
    i.n.m.b.ByteBufUtilBenchmark.writeAsciiStringViaArrayWrapped    thrpt        50  4722297.435    98984.491    ops/s
    i.n.m.b.ByteBufUtilBenchmark.writeAsciiStringWrapped            thrpt        50  4028689.762    66192.505    ops/s
    i.n.m.b.ByteBufUtilBenchmark.writeAsciiViaArray                 thrpt        50  3234841.565    91308.009    ops/s
    i.n.m.b.ByteBufUtilBenchmark.writeAsciiViaArrayWrapped          thrpt        50  3311387.474    39018.933    ops/s
    i.n.m.b.ByteBufUtilBenchmark.writeAsciiWrapped                  thrpt        50  3379764.250    66735.415    ops/s
    i.n.m.b.ByteBufUtilBenchmark.writeUtf8                          thrpt        50  5671116.821   101760.081    ops/s
    i.n.m.b.ByteBufUtilBenchmark.writeUtf8String                    thrpt        50  5682733.440   111874.084    ops/s
    i.n.m.b.ByteBufUtilBenchmark.writeUtf8StringViaArray            thrpt        50  3564548.995    55709.512    ops/s
    i.n.m.b.ByteBufUtilBenchmark.writeUtf8StringViaArrayWrapped     thrpt        50  3621053.671    47632.820    ops/s
    i.n.m.b.ByteBufUtilBenchmark.writeUtf8StringWrapped             thrpt        50  2634029.071    52304.876    ops/s
    i.n.m.b.ByteBufUtilBenchmark.writeUtf8ViaArray                  thrpt        50  3397049.332    57784.119    ops/s
    i.n.m.b.ByteBufUtilBenchmark.writeUtf8ViaArrayWrapped           thrpt        50  3318685.262    35869.562    ops/s
    i.n.m.b.ByteBufUtilBenchmark.writeUtf8Wrapped                   thrpt        50  2473791.249    46423.114    ops/s
    Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1,387.417 sec - in io.netty.microbench.buffer.ByteBufUtilBenchmark
    
    Results :
    
    Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
    
    The *ViaArray* benchmarks basically do a toString().getBytes(Charset), while the others use ByteBufUtil.write*(...).
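The two styles being compared can be sketched with plain JDK types. This is an illustration on `java.nio.ByteBuffer` under the assumption of ASCII-only input, not the ByteBufUtil implementation; `AsciiWriter` and its method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class AsciiWriter {
    private AsciiWriter() { }

    /** Copies through a temporary byte[] (the "ViaArray" style). */
    static int writeViaArray(ByteBuffer dst, CharSequence seq) {
        byte[] bytes = seq.toString().getBytes(StandardCharsets.US_ASCII);
        dst.put(bytes);
        return bytes.length;
    }

    /** Writes each character straight into the buffer; any char above 0x7F becomes '?'. */
    static int writeDirect(ByteBuffer dst, CharSequence seq) {
        int len = seq.length();
        for (int i = 0; i < len; i++) {
            char c = seq.charAt(i);
            dst.put((byte) (c < 0x80 ? c : '?'));
        }
        return len;
    }
}
```

The direct variant avoids both the intermediate String (when the input is some other CharSequence) and the temporary array, which is the overhead the commit message refers to.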
Commits on Nov 12, 2014
  1. @idelpivnitskiy @normanmaurer

    Benchmark for HttpRequestDecoder

    idelpivnitskiy authored normanmaurer committed
Commits on Sep 23, 2014
  1. @Scottmitch

    IPv6 address to string rfc5952

    Scottmitch authored
    Motivation:
    The Java implementations of Inet6Address.getHostName() do not follow the string representation recommended by RFC 5952 (http://tools.ietf.org/html/rfc5952#section-4). This introduces inconsistencies when integrating with other technologies that do follow the RFC.
    
    Modifications:
    - Add another public static method to NetUtil.java that converts an InetAddress to a string. Inet4Address uses the Java InetAddress.getHostAddress() implementation, and new code implements the RFC 5952 IPv6 string conversion.
    - Add new unit tests for the new method
    
    Result:
    Netty provides an RFC 5952 compliant string conversion method for IPv6 addresses.
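For reference, the core RFC 5952 section 4 rules (lowercase hex, no leading zeros, "::" for the longest run of at least two zero groups, first run wins a tie) fit in a short routine. This is a minimal sketch under those rules, not the NetUtil code; `Ipv6Text` is a hypothetical name, and the optional mixed notation for embedded IPv4 addresses is deliberately left out.

```java
final class Ipv6Text {
    private Ipv6Text() { }

    /** Formats 16 raw address bytes following the RFC 5952 section 4 rules. */
    static String format(byte[] addr) {
        if (addr.length != 16) throw new IllegalArgumentException("expected 16 bytes");
        int[] words = new int[8];
        for (int i = 0; i < 8; i++) {
            words[i] = ((addr[2 * i] & 0xff) << 8) | (addr[2 * i + 1] & 0xff);
        }
        // Find the longest run of zero words; the first one wins a tie (4.2.3).
        int bestStart = -1;
        int bestLen = 0;
        for (int i = 0; i < 8; ) {
            if (words[i] != 0) { i++; continue; }
            int j = i;
            while (j < 8 && words[j] == 0) j++;
            if (j - i > bestLen) { bestStart = i; bestLen = j - i; }
            i = j;
        }
        if (bestLen < 2) bestStart = -1;   // a single zero word is not shortened (4.2.2)

        StringBuilder sb = new StringBuilder(39);
        for (int i = 0; i < 8; i++) {
            if (i == bestStart) {
                sb.append("::");
                i += bestLen - 1;          // skip the rest of the compressed run
                continue;
            }
            if (sb.length() > 0 && sb.charAt(sb.length() - 1) != ':') sb.append(':');
            sb.append(Integer.toHexString(words[i]));   // lowercase, no leading zeros
        }
        return sb.toString();
    }
}
```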
Commits on Jun 21, 2014
  1. @trustin

    Fix the inconsistencies between performance tests in ByteBufAllocatorBenchmark

    trustin authored
    
    Motivation:
    
    The default*() tests perform their test in a different way from the other tests; they must do the same thing as the others.
    
    Modification:
    
    Make sure the default*() tests are the same as the others
    
    Result:
    
    Easier to compare default and non-default allocators
Commits on Jun 19, 2014
  1. @trustin

    Refactor FastThreadLocal to simplify TLV management

    trustin authored
    Motivation:
    
    When Netty runs in a managed environment such as a web application server,
    Netty needs to provide an explicit way to remove the thread-local
    variables it created to prevent class loader leaks.
    
    FastThreadLocal uses different execution paths for storing a
    thread-local variable depending on the type of the current thread.
    It increases the complexity of thread-local removal.
    
    Modifications:
    
    - Moved FastThreadLocal and FastThreadLocalThread out of the internal
      package so that a user can use it.
    - FastThreadLocal now keeps track of all thread local variables it has
      initialized, and calling FastThreadLocal.removeAll() will remove all
      thread-local variables of the caller thread.
    - Added FastThreadLocal.size() for diagnostics and tests
    - Introduce InternalThreadLocalMap which is a mixture of hard-wired
      thread local variable fields and extensible indexed variables
    - FastThreadLocal now uses InternalThreadLocalMap to implement a
      thread-local variable.
    - Added ThreadDeathWatcher.unwatch() so that PooledByteBufAllocator
      tells it to stop watching when its thread-local cache has been freed
      by FastThreadLocal.removeAll().
    - Added FastThreadLocalTest to ensure that removeAll() works
    - Added microbenchmark for FastThreadLocal and JDK ThreadLocal
    - Upgraded to JMH 0.9
    
    Result:
    
    - A user can remove all thread-local variables Netty created, as long as
      he or she did not exit from the current thread. (Note that there's no
      way to remove a thread-local variable from outside of the thread.)
    - FastThreadLocal exposes more useful operations such as isSet() because
      we always implement a thread local variable via InternalThreadLocalMap
      instead of falling back to JDK ThreadLocal.
    - FastThreadLocalBenchmark shows that this change improves the
      performance of FastThreadLocal even more.
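The indexed-variable idea can be sketched as below. `SlotMap`, `SlotThread` and `IndexedLocal` are hypothetical names for this example and the code is a simplification, not Netty's InternalThreadLocalMap: every variable gets a fixed index, every thread gets one map (held in a field on the special thread type, or in a JDK ThreadLocal otherwise), and removing everything for the caller thread amounts to dropping that one map.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical per-thread container of indexed slots. */
final class SlotMap {
    private static final ThreadLocal<SlotMap> SLOW = new ThreadLocal<>();
    Object[] slots = new Object[8];

    /** Returns the caller thread's map, creating it on first use. */
    static SlotMap current() {
        Thread t = Thread.currentThread();
        if (t instanceof SlotThread) {
            SlotThread st = (SlotThread) t;
            if (st.map == null) st.map = new SlotMap();
            return st.map;                       // fast path: a plain field read
        }
        SlotMap m = SLOW.get();                  // ordinary threads fall back to the JDK
        if (m == null) { m = new SlotMap(); SLOW.set(m); }
        return m;
    }

    /** Drops every variable of the caller thread in one step. */
    static void removeAll() {
        Thread t = Thread.currentThread();
        if (t instanceof SlotThread) ((SlotThread) t).map = null;
        else SLOW.remove();
    }
}

/** Hypothetical thread type that carries its map in a field. */
class SlotThread extends Thread {
    SlotMap map;
    SlotThread(Runnable r) { super(r); }
}

/** Hypothetical thread-local variable identified by a fixed slot index. */
final class IndexedLocal<T> {
    private static final AtomicInteger NEXT = new AtomicInteger();
    private final int index = NEXT.getAndIncrement();

    @SuppressWarnings("unchecked")
    T get() {
        Object[] s = SlotMap.current().slots;
        return index < s.length ? (T) s[index] : null;
    }

    void set(T value) {
        SlotMap m = SlotMap.current();
        if (index >= m.slots.length) m.slots = Arrays.copyOf(m.slots, index + 1);
        m.slots[index] = value;
    }
}
```

As in the commit's note, the removal in this sketch only works from inside the owning thread.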
Commits on Jun 13, 2014
  1. @belliottsmith @normanmaurer

    Introduce FastThreadLocal which uses an EnumMap and a predefined fixed set of possible thread locals

    belliottsmith authored normanmaurer committed
    
    Motivation:
    Provide a faster ThreadLocal implementation
    
    Modification:
    Add a "FastThreadLocal" which uses an EnumMap and a predefined fixed set of possible thread locals (all of the static instances created by netty). It is around 10-20% faster than the standard ThreadLocal in my benchmarks. The effect can be seen in the direct PooledByteBufAllocator benchmark that uses the DEFAULT ByteBufAllocator, which uses this FastThreadLocal (as opposed to normal instantiations, which do not), and in the new RecyclableArrayList benchmark.
    
    Result:
    Improved performance
Commits on Jun 5, 2014
  1. @normanmaurer

    [#2436] Unsafe*ByteBuf implementation should only invert bytes if ByteOrder differ from native ByteOrder

    normanmaurer authored
    
    Motivation:
    Our Unsafe*ByteBuf implementation always inverts bytes when the native ByteOrder is LITTLE_ENDIAN (this is true on Intel), even when the user calls order(ByteOrder.LITTLE_ENDIAN). This is not optimal for performance, as the user should be able to set the ByteOrder to LITTLE_ENDIAN and so write bytes without the extra inverting.
    
    Modification:
    - Introduce a new special SwappedByteBuf (called UnsafeDirectSwappedByteBuf) that is used by all the Unsafe*ByteBuf implementations and allows writing without inverting the bytes.
    - Add benchmark
    - Upgrade jmh to 0.8
    
    Result:
    The user is able to get the maximum performance even on servers that have ByteOrder.LITTLE_ENDIAN as their native ByteOrder.
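The rule "only swap when the requested order differs from the native order" can be sketched with JDK types. This is an illustration on `java.nio.ByteBuffer`, not the Unsafe-based code from the commit; `NativeOrderWriter` and its methods are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class NativeOrderWriter {
    private NativeOrderWriter() { }

    /** Value to hand to a native-order store so that memory ends up in `requested` order. */
    static long prepare(long value, ByteOrder requested) {
        // No swap at all when the caller asks for the platform's own order.
        return requested == ByteOrder.nativeOrder() ? value : Long.reverseBytes(value);
    }

    /** Writes `value` at index 0 using only native-order stores (changes the buffer's order). */
    static void putLong(ByteBuffer buf, long value, ByteOrder requested) {
        buf.order(ByteOrder.nativeOrder()).putLong(0, prepare(value, requested));
    }
}
```

On a little-endian machine a LITTLE_ENDIAN write therefore costs no `reverseBytes` call in this sketch, which is the saving the commit describes.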
Commits on May 29, 2014
  1. @trustin

    More realistic ByteBuf allocation benchmark

    trustin authored
    Motivation:
    
    Allocating a single buffer and releasing it repetitively for a benchmark will not involve the realistic execution path of the allocators.
    
    Modifications:
    
    Keep the last 8192 allocations and release them randomly.
    
    Result:
    
    We are now getting the result close to what we got with caliper.
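A window of live allocations with random release could look roughly like this. The sketch uses plain `byte[]` instead of a ByteBuf allocator, and `AllocationWindow` is a hypothetical helper rather than the benchmark's code; the commit's window size of 8192 would be passed as `size`.

```java
import java.util.Random;

/** Hypothetical harness helper: holds a window of live buffers and evicts a random one. */
final class AllocationWindow {
    private final byte[][] live;
    private final Random random;
    private int released;

    AllocationWindow(int size, long seed) {
        live = new byte[size][];
        random = new Random(seed);
    }

    /** Allocates a new buffer into a random slot, releasing whatever was there. */
    void allocate(int bytes) {
        int slot = random.nextInt(live.length);
        if (live[slot] != null) {
            released++;           // a pooled allocator would call release() here
        }
        live[slot] = new byte[bytes];
    }

    int released() { return released; }

    int liveCount() {
        int n = 0;
        for (byte[] b : live) {
            if (b != null) n++;
        }
        return n;
    }
}
```

Because the victim slot is random, buffers of different ages stay alive at the same time, which exercises the allocator differently from an allocate-then-release loop on a single buffer.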
Commits on Feb 23, 2014
  1. @daschl @normanmaurer

    Upgrade JMH to 0.4.1 and make use of @Params.

    daschl authored normanmaurer committed
Commits on Feb 14, 2014
  1. @daschl @trustin

    Update JMH to 0.3.2

    daschl authored trustin committed
  2. @trustin

    Fix wiki link

    trustin authored
Commits on Jan 15, 2014
  1. @daschl @normanmaurer

    Using SystemPropertyUtil for property parsing.

    daschl authored normanmaurer committed
  2. @daschl @normanmaurer
  3. @trustin
Commits on Jan 14, 2014
  1. @daschl @trustin

    microbench: move from Caliper to JMH

    daschl authored trustin committed
Commits on Dec 22, 2013
  1. @trustin
  2. @trustin
Commits on Nov 4, 2013
  1. @trustin
Commits on Aug 26, 2013
  1. @normanmaurer