
Issue #1138: Replacement of sliding time window. #1139

Merged (13 commits) into dropwizard:3.2-development on Jun 14, 2017

Conversation

@storozhukBM (Contributor) commented Jun 12, 2017

Hi metrics team,

This PR provides a drop-in replacement for SlidingTimeWindowReservoir called SlidingTimeWindowArrayReservoir. The implementation is based upon the ChunkedAssociativeLongArray structure, which stores all values in long arrays that are allocated and freed in big chunks. This way of storing reservoir values gives drastically lower memory overhead than the ConcurrentSkipListMap used by the current implementation.
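For readers unfamiliar with the chunked layout, a heavily simplified sketch of the idea (hypothetical names and sizes, not the PR's actual ChunkedAssociativeLongArray) might look like this:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified sketch of chunk-based storage: keys and values
// live in plain long[] chunks, so memory is allocated and freed in big
// blocks instead of one node per measurement as in ConcurrentSkipListMap.
public class ChunkedStoreSketch {
    private static final int CHUNK_SIZE = 1024;

    private static final class Chunk {
        final long[] keys = new long[CHUNK_SIZE];
        final long[] values = new long[CHUNK_SIZE];
        int cursor; // next free slot in this chunk
    }

    private final List<Chunk> chunks = new ArrayList<>();

    public void put(long key, long value) {
        Chunk tail = chunks.isEmpty() ? null : chunks.get(chunks.size() - 1);
        if (tail == null || tail.cursor == CHUNK_SIZE) {
            tail = new Chunk(); // allocate a whole chunk at once
            chunks.add(tail);
        }
        tail.keys[tail.cursor] = key;
        tail.values[tail.cursor] = value;
        tail.cursor++;
    }

    // Drop whole chunks whose newest key is older than startKey.
    public void trim(long startKey) {
        while (!chunks.isEmpty()) {
            Chunk head = chunks.get(0);
            if (head.cursor > 0 && head.keys[head.cursor - 1] >= startKey) {
                break; // head still holds live measurements
            }
            chunks.remove(0); // free an entire chunk in one step
        }
    }

    public int size() {
        int size = 0;
        for (Chunk c : chunks) {
            size += c.cursor;
        }
        return size;
    }
}
```

The key point is that eviction and allocation work on whole 1024-slot chunks, so the garbage collector sees a handful of large arrays instead of millions of small map nodes.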

There is also a new metrics-jcstress module introduced within this PR. I've used OpenJDK Java Concurrency Stress tests to prove the thread safety of the new reservoir.

As for benchmarks:

  • On the already existing benchmark, ReservoirBenchmark, the new implementation has 4.2x lower latency than the old one. Most importantly, SlidingTimeWindowArrayReservoir spends only 271 ms in GC during the benchmark, while the previous implementation spends 13428 ms. With all these changes, SlidingTimeWindowArrayReservoir now has performance comparable to ExponentiallyDecayingReservoir.

  • I also created a new benchmark called SlidingTimeWindowReservoirsBenchmark with a different scenario for these reservoirs. The new benchmark has 3 threads that update the target reservoir and 1 thread that makes snapshots. In this scenario the new implementation has 2.5x faster writes, 3.5x faster snapshots, and spends 40x less time in GC.

I also ran small (1-hour-long) load tests for the two implementations. The load test likewise has 3 writers and 1 reader. The frequency of writes changes according to a periodic function, and the periods become shorter over time.

Results below:
[1h_comparison chart]

If you're OK with these changes, we basically have two ways to merge this PR. We can keep the two separate reservoirs in the library and give our users the possibility to choose one, or we can completely replace the old implementation with the new one.

newTick = getTick();
boolean longOverflow = newTick < lastTick;
if (longOverflow) {
    measurements.clear();

@vladimir-bukhtoyarov (Contributor) commented Jun 13, 2017

The previous implementation was able to handle nanotime overflow without losing anything.
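For context, the standard wraparound-safe idiom from the System.nanoTime() javadoc compares ticks by subtraction rather than directly, which is one way an implementation can survive overflow. A sketch (not the reservoir's actual code):

```java
// Sketch of the wraparound-safe comparison idiom recommended by the
// System.nanoTime() javadoc: never compare two tick values directly
// with '<'; subtract them instead, so a long overflow between the two
// readings still yields the right ordering (assuming the readings are
// less than ~292 years apart).
public class TickCompare {
    // Wrong near overflow: direct comparison.
    public static boolean isNewerNaive(long newTick, long lastTick) {
        return newTick > lastTick;
    }

    // Correct: the subtraction itself wraps, restoring the ordering.
    public static boolean isNewerSafe(long newTick, long lastTick) {
        return newTick - lastTick > 0;
    }
}
```

The `newTick < lastTick` check in the diff above is exactly the naive form, which is why overflow forces the chunked structure to clear instead of carrying on.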

if (currentChunk.tailChunk == null) {
break;
}
currentChunk = currentChunk.tailChunk;

@vladimir-bukhtoyarov (Contributor) commented Jun 13, 2017

This reads mutable fields without synchronization. trim and put mutate:

  • cursor
  • startIndex
  • tailChunk

So the result can be totally wrong, even negative.
@vladimir-bukhtoyarov (Contributor) commented Jun 13, 2017

General comment:
ChunkedAssociativeLongArray has only one properly synchronized method: put. The rest of the methods (traverse, values, out, trim, clear, and others) are not properly synchronized, because they access mutable fields of the Chunk class without any synchronization.

Commit: Changes to tick calculations in order to delay overflow and make it occur once per 416 days from reservoir creation.
@storozhukBM (Contributor, Author) commented Jun 13, 2017

Hi @vladimir-bukhtoyarov,
Thank you for the review. I understand all the concerns that you have, but that is only after a first look.
All operations in ChunkedAssociativeLongArray are designed to operate without additional synchronization and to recover from possible partial reads.
JCStress tests were created to reproduce these concurrent writes and reads and to prove that SlidingTimeWindowArrayReservoir behaves consistently.
So if you are able to reproduce any situation that leads to data loss or runtime errors, that would be really helpful, but I can assure you that I've done my best to prevent any such case.

As for long overflows:
Yes, that is definitely a disadvantage of the ChunkedAssociativeLongArray structure, because we can't re-sort all the values inside it to recover from overflow, so we basically have to clear all values from the reservoir. But I have a compromise solution where we can delay such an overflow until 416 days after reservoir creation.
I changed comparisonResultsTest to reproduce the overflow situation and then made the overflow-delay change to fix this test.
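The 416-day figure is consistent with a tick that packs a 256-slot collision buffer into the timestamp, as the existing SlidingTimeWindowReservoir does (assumption: the array variant keeps the same COLLISION_BUFFER of 256). Counting ticks from reservoir creation rather than from the arbitrary nanoTime() origin then leaves Long.MAX_VALUE / 256 nanoseconds before the first overflow:

```java
import java.util.concurrent.TimeUnit;

// Back-of-the-envelope check of the "416 days" figure, assuming the tick
// stream multiplies nanotime by a 256-slot collision buffer (the
// COLLISION_BUFFER used by SlidingTimeWindowReservoir). With ticks
// counted from reservoir creation, the signed long range is exhausted
// only after Long.MAX_VALUE / 256 nanoseconds have elapsed.
public class OverflowBudget {
    static final long COLLISION_BUFFER = 256;

    public static long daysUntilOverflow() {
        long nanosAvailable = Long.MAX_VALUE / COLLISION_BUFFER;
        return TimeUnit.NANOSECONDS.toDays(nanosAvailable);
    }
}
```

Evaluating this gives roughly 416 full days, matching the delay claimed in the commit message.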

@vladimir-bukhtoyarov (Contributor) commented Jun 14, 2017

Hi Bogdan,

You wrote that ChunkedAssociativeLongArray is designed to operate without additional synchronization. But after a detailed look, I did not find any design solution in this class that eliminates the negative effects of the evil data races present in ChunkedAssociativeLongArray in multiple places. Nothing prevents concurrent mutation of the same chunk, or concurrent reading and writing of the same chunk, so I continue to regard ChunkedAssociativeLongArray as improperly synchronized code which will eventually crash monitoring. Just run the following benchmark to observe how evil data races are:

package com.codahale.metrics.benchmarks;

import com.codahale.metrics.SlidingTimeWindowArrayReservoir;
import com.codahale.metrics.Snapshot;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.profile.GCProfiler;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import org.openjdk.jmh.runner.options.TimeValue;

import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;


@State(Scope.Benchmark)
public class SlidingTimeWindowReservoirsDataRaceBenchmark {

    private final SlidingTimeWindowArrayReservoir reservoir = new SlidingTimeWindowArrayReservoir(200, TimeUnit.NANOSECONDS);
    private AtomicLong sequence = new AtomicLong(1);

    @Benchmark
    @Group("contention")
    @GroupThreads(2)
    public void update() {
        reservoir.update(sequence.incrementAndGet());
    }

    @Benchmark
    @Group("contention")
    @GroupThreads(1)
    public int getSize() {
        int size = reservoir.size();
        if (size < 0) {
            throw new IllegalStateException("size must not be negative");
        }
        return size;
    }

    @Benchmark
    @Group("contention")
    @GroupThreads(1)
    public Snapshot getSnapshot() {
        Snapshot snapshot = reservoir.getSnapshot();
        if (snapshot.size() != 0 && snapshot.getMin() == 0) {
            throw new IllegalStateException("Zero was never recorded and should not present in snapshot");
        }
        return snapshot;
    }

    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
            .include(".*" + SlidingTimeWindowReservoirsDataRaceBenchmark.class.getSimpleName() + ".*")
            .threads(4)
            .warmupIterations(10)
            .measurementIterations(10)
            .addProfiler(GCProfiler.class)
            .measurementTime(TimeValue.seconds(10))
            .timeUnit(TimeUnit.MICROSECONDS)
            .mode(Mode.AverageTime)
            .forks(1)
            .build();

        new Runner(opt).run();
    }
}

The benchmark crashes within a few milliseconds.

# VM invoker: C:\Program Files\Java\jdk1.8.0_101\jre\bin\java.exe
# VM options: -Didea.launcher.port=7534 -Didea.launcher.bin.path=D:\dev\bin\gui-tools\IDEA\IntelliJ IDEA 2016.3\bin -Dfile.encoding=UTF-8
# Warmup: 10 iterations, 1 s each
# Measurement: 10 iterations, 10 s each
# Threads: 4 threads, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: com.codahale.metrics.benchmarks.SlidingTimeWindowReservoirsDataRaceBenchmark.contention

# Run progress: 0,00% complete, ETA 00:01:50
# Fork: 1 of 1
# Warmup Iteration   1: 0,619 ±(99.9%) 1,188 us/op
# Warmup Iteration   2: <failure>

java.lang.NegativeArraySizeException
	at com.codahale.metrics.ChunkedAssociativeLongArray.values(ChunkedAssociativeLongArray.java:76)
	at com.codahale.metrics.SlidingTimeWindowArrayReservoir.getSnapshot(SlidingTimeWindowArrayReservoir.java:73)
	at com.codahale.metrics.benchmarks.SlidingTimeWindowReservoirsDataRaceBenchmark.getSnapshot(SlidingTimeWindowReservoirsDataRaceBenchmark.java:47)
	at com.codahale.metrics.benchmarks.generated.SlidingTimeWindowReservoirsDataRaceBenchmark_contention.getSnapshot_avgt_jmhStub(SlidingTimeWindowReservoirsDataRaceBenchmark_contention.java:371)
	at com.codahale.metrics.benchmarks.generated.SlidingTimeWindowReservoirsDataRaceBenchmark_contention.contention_AverageTime(SlidingTimeWindowReservoirsDataRaceBenchmark_contention.java:282)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.openjdk.jmh.runner.LoopBenchmarkHandler$BenchmarkTask.call(LoopBenchmarkHandler.java:210)
	at org.openjdk.jmh.runner.LoopBenchmarkHandler$BenchmarkTask.call(LoopBenchmarkHandler.java:192)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

# Run complete. Total time: 00:00:02

Regarding jcstress: it is an additional tool which can be helpful sometimes, but the primary tool of any Java programmer who writes concurrent code should be his brain and the formal logic described in the JMM specification. If jcstress does not show errors, but the code is not well-formed synchronized according to JMM rules, then the final decision about code correctness should be made according to the JMM; violations of the formal rules are quite enough to condemn any code.

And a general note about data races:
In a very few, very limited cases, data races are acceptable, for example the hashCode method of java.lang.String. But each place where a data race exists should be well documented by plenty of comments which describe why that particular data race does not damage your code, especially in open source; otherwise nobody can maintain the library in the future.

Sorry if I'm not polite.

@storozhukBM (Contributor, Author) commented Jun 14, 2017

Hi Vladimir,
Thank you one more time for the review and the exhaustive comment.
I must admit that it was stupid to try to hack the system 😶.
My latest PR makes everything simple and straightforward.

Of course this change brings somewhat different numbers, but they are still impressive, and GC pauses are even lower (ReentrantLock creates synchronization helper objects).
In ReservoirBenchmark: updates are 2.7x faster, with 80x less time spent in GC.
In SlidingTimeWindowReservoirsBenchmark: updates are 2.4x faster, reads 5.6x faster, with 57x less time spent in GC.
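A minimal sketch of the lock-based approach (hypothetical code, not the actual PR), where a single ReentrantLock guards every access to the window, so correctness follows from mutual exclusion rather than from data-race reasoning:

```java
import java.util.ArrayDeque;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of a lock-guarded sliding window: one ReentrantLock
// protects both updates and reads, trading a little latency under
// contention for code whose thread safety is straightforward to verify.
public class LockedWindowSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final ArrayDeque<long[]> measurements = new ArrayDeque<>();
    private final long windowNanos;

    public LockedWindowSketch(long windowNanos) {
        this.windowNanos = windowNanos;
    }

    public void update(long tick, long value) {
        lock.lock();
        try {
            trim(tick);
            measurements.addLast(new long[] { tick, value });
        } finally {
            lock.unlock();
        }
    }

    public int size(long tick) {
        lock.lock();
        try {
            trim(tick);
            return measurements.size(); // never negative: reads are exclusive
        } finally {
            lock.unlock();
        }
    }

    // Must be called while holding the lock.
    private void trim(long tick) {
        long windowStart = tick - windowNanos;
        while (!measurements.isEmpty() && measurements.peekFirst()[0] < windowStart) {
            measurements.removeFirst();
        }
    }
}
```

Under this design the negative-size and phantom-zero failures from the data-race benchmark above become impossible, since no reader can ever observe a chunk mid-mutation.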

@vladimir-bukhtoyarov (Contributor) commented Jun 14, 2017

Now it looks correct!

@storozhukBM (Contributor, Author) commented Jun 14, 2017

OK, so should we replace the existing SlidingTimeWindowReservoir?

@vladimir-bukhtoyarov (Contributor) commented Jun 14, 2017

@arteam please take a look at this request.

@arteam (Member) commented Jun 14, 2017

I will review tonight. Thank you!

@arteam (Member) commented Jun 14, 2017

> OK, so should we replace the existing SlidingTimeWindowReservoir?

I reviewed it and the implementation looks good (or at least I can't spot obvious concurrency bugs). Let's replace SlidingTimeWindowReservoir, and I will merge the pull request!

@ryantenney (Member) commented Jun 14, 2017

I would have suggested adding it as part of 3.2 and letting users opt into it, and then replace the other one in 3.3 or 4.

@arteam (Member) commented Jun 14, 2017

On second thought, that's probably the best option from the point of view of reducing the risk of production bugs for users of 3.2. The users who want to test this reservoir can use it in 3.2.3. If everything goes right, then we will make it the default one.

@storozhukBM (Contributor, Author) commented Jun 14, 2017

> I reviewed and the implementation looks good (or I can't spot obvious concurrency bugs).

Don't worry, everything should be OK now. 😄

@arteam merged commit 0e7718b into dropwizard:3.2-development on Jun 14, 2017

1 check passed: continuous-integration/travis-ci/pr (the Travis CI build passed)
@arteam (Member) commented Jun 14, 2017

Thank you very much for the contribution!

@storozhukBM (Contributor, Author) commented Jun 14, 2017

🎉
Thank you too, guys. Do you have any plans for a release date?

@arteam (Member) commented Jun 14, 2017

I was thinking of making a release on Monday. There are a couple of useful contributions in the 3.2-development branch which it would be nice to release.

@jplock jplock added this to the 3.2.3 milestone Jun 14, 2017

@jplock jplock added the improvement label Jun 14, 2017

@arteam (Member) commented Jun 20, 2017

Sorry for the delay; I am busy this week. I will try to do the code reviews and cut a release this week.
