Skip to content

Performance

artsiomkorzun edited this page Sep 2, 2021 · 14 revisions

Performance

Intro

How to measure a logging library performance? It is a good question. Here we are going to make a relative comparison with log4j well-known among java community. We will take one logger configuration for gflog and log4j and see how they perform from 3 different angles: sustained throughput, service latency and memory footprint. We will use log4j async garbage-free mode and gflog almost-default configuration. For sanity's sake we will consider only one benchmark case with 5 params message. But you can always check out the sources to change, build and run any benchmark, as well as the results to different benchmark cases tested on the same environment (you need to clone wiki repo and go to benchmark directory).

Environment

Software/Hardware:

  • Java 11 (OpenJDK 11.0.10)
  • CentOS 7
  • Kernel Boot Parameters: isolcpus=2-7 nohz_full=2-7 rcu_nocbs=2-7
  • Intel(R) Xeon(R) CPU E5-2687W v2 @ 3.40GHz (8 cores, HT disabled)
  • Samsung SSD 850 (500 GB capacity, 540 MB/sec sequential read, 520 MB/sec sequential write)

Hyper-threading is disabled. 2-7 cpus are isolated from the kernel scheduler. Application threads use thread affinity to pin themselves to a specific cpu in sequential order (0 -> 2, 1 -> 3, etc). The background logger threads are not pinned and allowed to migrate from 0-1 cpus.

Configuration

Versions:

  • gflog 3.0.1
  • log4j 2.14.0, disruptor 3.4.2.

gflog config:

<config>
    <appender name="noop" factory="com.epam.deltix.gflog.benchmark.util.NoOpAppenderFactory"/>
    <!-- noop or file (see the value in chart's brackets)-->
    <appender name="file" factory="com.epam.deltix.gflog.core.appender.FileAppenderFactory" file="${temp-file}">
        <layout template="%d %p '%c' [%t] %m%n"/>
    </appender>
    
    <logger>
        <appender-ref ref="noop"/>
        <!-- or -->
        <appender-ref ref="file"/>
    </logger>
    <service entryEncoding="UTF-8"/>
</config>

The basic idea is not to change gflog defaults to provide optimal performance out of the box. So only one changed setting is entry encoding. ASCII encoding is default one and a subset of UTF-8 encoding. It is changed to make an apple-to-apple comparison, though only ASCII characters are used in benchmarks.

log4j config:

<Configuration name="Benchmark" status="info">
    <Appenders>
        <Null name="noop">
        </Null>
        <!-- noop or file (see the value in chart's brackets)-->
        <File name="file" fileName="${sys:temp-file}" immediateFlush="false">
            <PatternLayout pattern="%d %p '%c' [%t] %m%n"/>
        </File>
    </Appenders>
    <Loggers>
        <Root level="INFO">
            <AppenderRef ref="noop"/>
            <!-- or -->
            <AppenderRef ref="file"/>
        </Root>
    </Loggers>
</Configuration>

log4j system properties:

Log4jContextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector
AsyncLogger.RingBufferSize=262144
AsyncLogger.WaitStrategy=Yield
log4j2.enable.threadlocals=true
log4j2.enable.direct.encoders=true

The same log4j configuration is used as here.

Benchmark parameters:

  • Logger Name (30 chars): "123456789012345678901234567890"
  • Thread Name (23 chars): "12345678901234567890123"
  • Log Message (52 chars): "Some array: [string,c,1234567,12345678901234,string]"
  • Log Record (122 chars): "2021-03-30 17:48:43.390 INFO '123456789012345678901234567890' [main] Some array: [string,c,1234567,12345678901234,string]\n"
  • Log Message Params (5): string, int, long, char, string

Throughput

Java Microbenchmark Harness (JMH) library is used to measure sustained throughput:

  • Warmup Iterations: 2
  • Warmup Interval: 10 sec
  • Measurement Iterations: 5
  • Measurement Interval: 10 sec
  • Forks: 3
  • Threads: 1-6
  • Affinity: 2-7

A no-operation (noop) appender is used to measure the max achievable sustained throughput. Log records are formatted and passed to the background thread, but the appender does nothing.

A file appender is used to measure the sustained throughput when logging to a file.

Disk write throughput utilization from the declared (520 MB/s):

  • gflog - 71.55% - (3.05 * 122 / 520)
  • log4j - 29.56% - (1.26 * 122 / 520)

Latency

The following code is used to measure service latency:

  • Warmup Duration: 1 min
  • Measurement Duration: 3 min
  • Thread Interval: 50 us
  • Thread Throughput: 10K
  • Overall Throughput: 50K
  • Threads: 5
  • Affinity: 2-6

A file appender is used to measure the service latency. The duration (3 min) is split into intervals (50 us per a thread) at which a message is logged. An application thread catches up instantly if the next point is missed. Note that it might cause a batch of messages to be logged at once on a context switch.

baseline is an empty method and represents the overhead of the harness which runs latency benchmarks.

timestamp is System.currentTimeMillis() and represents the cost of getting the current time within the harness.

Memory

Java Object Layout (JOL) library is used to estimate memory footprint. It is run to traverse object reachability graph from a logger instance after logging 1/1M/10M/100M messages with the following random distribution:

  • 50% - a message with a length from 0 to 100
  • 20% - a message with a length from 0 to 200
  • 10% - a message with a length from 0 to 300
  • 10% - a message with a length from 0 to 400
  • 9% - a message with a length from 0 to 500
  • 1% - a message with a length from 0 to 1000

gflog top 5 (java: 11, messages: 1)

Instances Footprint/Instances (bytes) Footprint (bytes) Type
3512 200 702416 byte[]
2443 36 88256 java.lang.Object[]
3390 24 81360 java.lang.String
2338 32 74816 java.lang.invoke.LambdaForm$Name
148 504 74624 int[]
24960 - 1654232 (total)

gflog top 5 (java: 11, messages: 1 million)

Instances Footprint/Instances (bytes) Footprint (bytes) Type
3738 190 710792 byte[]
1235 72 88920 java.lang.reflect.Field
2443 36 88256 java.lang.Object[]
3602 24 86448 java.lang.String
632 120 75840 java.lang.Class
25984 - 1702872 (total)

log4j top 7 (java: 11, messages: 1)

Instances Footprint/Instances (bytes) Footprint (bytes) Type
262144 104 27262976 org.apache.logging.log4j.core.async.RingBufferLogEvent
262143 32 8388576 org.apache.logging.log4j.util.SortedArrayStringMap
262144 24 6291456 org.apache.logging.log4j.core.time.MutableInstant
437 2583 1128808 int[]
601 1812 1089336 java.lang.Object[]
5486 148 816392 byte[]
5331 24 127944 java.lang.String
815846 - 45866000 (total)

log4j top 7 (java: 11, messages: 1 million)

Instances Footprint/Instances (bytes) Footprint (bytes) Type
268263 283 76023096 byte[]
262144 104 27262976 org.apache.logging.log4j.core.async.RingBufferLogEvent
262745 60 15769368 java.lang.Object[]
262144 24 6291456 java.lang.StringBuilder
262144 24 6291456 org.apache.logging.log4j.core.time.MutableInstant
452 2497 1129048 int[]
5958 24 142992 java.lang.String
1342827 - 133755584 (total)

Note that JOL does not count off-heap memory and static roots. 10 MB is added to gflog result. 8 MB is off-heap ring buffer. 2 MB is extra space in reserve.

Clone this wiki locally