
Remove list creation from benchmark #1

nipafx opened this issue Mar 12, 2017 · 2 comments

nipafx commented Mar 12, 2017

Many implementations remove elements from the list by mutating it (others create a new one), so each invocation has to run on a fresh list. Creating a large list takes time of its own, which considerably influences the overall runtime. How can the effect of creating the list be separated from the actual removal?

Include list creation - subtract baseline

The current benchmarks include list creation. To make up for that, one benchmark measures only the creation time (called the baseline), so the actual removal time can be computed by subtracting it:

@Benchmark
public void baseline(Blackhole bh) {
	// measures list creation only
	List<Integer> list = createArrayList();
	bh.consume(list);
}

@Benchmark
public void iterativeAt(Blackhole bh) {
	// measures list creation plus the removal under test
	List<Integer> list = createArrayList();
	List<Integer> removed = IterativeAtRemover.remove(list, removeAts);
	bh.consume(removed);
}
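
With these two benchmarks in place, the removal cost can be approximated as score(iterativeAt) - score(baseline).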

This makes it cumbersome to compare results quickly and also feels a little messy. Is there no cleaner approach?

Lifecycle per invocation

With @Setup(Level.Invocation) it would be possible to set the list up in a lifecycle method, but the Javadoc warns against using it and I do not understand all the implications.

/**
 * Invocation level: to be executed for each benchmark method execution.
 *
 * <p><b>WARNING: HERE BE DRAGONS! THIS IS A SHARP TOOL.
 * MAKE SURE YOU UNDERSTAND THE REASONING AND THE IMPLICATIONS
 * OF THE WARNINGS BELOW BEFORE EVEN CONSIDERING USING THIS LEVEL.</b></p>
 *
 * <p>This level is only usable for benchmarks taking more than a millisecond
 * per single {@link Benchmark} method invocation. It is a good idea to validate
 * the impact for your case on ad-hoc basis as well.</p>
 *
 * <p>WARNING #1: Since we have to subtract the setup/teardown costs from
 * the benchmark time, on this level, we have to timestamp *each* benchmark
 * invocation. If the benchmarked method is small, then we saturate the
 * system with timestamp requests, which introduce artificial latency,
 * throughput, and scalability bottlenecks.</p>
 *
 * <p>WARNING #2: Since we measure individual invocation timings with this
 * level, we probably set ourselves up for (coordinated) omission. That means
 * the hiccups in measurement can be hidden from timing measurement, and
 * can introduce surprising results. For example, when we use timings to
 * understand the benchmark throughput, the omitted timing measurement will
 * result in lower aggregate time, and fictionally *larger* throughput.</p>
 *
 * <p>WARNING #3: In order to maintain the same sharing behavior as other
 * Levels, we sometimes have to synchronize (arbitrage) the access to
 * {@link State} objects. Other levels do this outside the measurement,
 * but at this level, we have to synchronize on *critical path*, further
 * offsetting the measurement.</p>
 *
 * <p>WARNING #4: Current implementation allows the helper method execution
 * at this Level to overlap with the benchmark invocation itself in order
 * to simplify arbitrage. That matters in multi-threaded benchmarks, when
 * one worker thread executing {@link Benchmark} method may observe other
 * worker thread already calling {@link TearDown} for the same object.</p>
 */

Regarding the first line:

This level is only usable for benchmarks taking more than a millisecond per single Benchmark method invocation

I could achieve that by creating a separate benchmark for larger lists, which might be a good idea anyway. But does that mean I can ignore the other warnings?
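
For reference, a minimal sketch of what that could look like, assuming a thread-scoped state holder and a static version of the createArrayList() helper from the snippets above (illustration only, not code from this repository):

@State(Scope.Thread)
public static class FreshList {

	List<Integer> list;

	// rebuilt before every single invocation, so creation stays
	// outside the measurement, at the price of the caveats above
	@Setup(Level.Invocation)
	public void create() {
		list = createArrayList();
	}
}

@Benchmark
public void iterativeAtWithInvocationSetup(FreshList state, Blackhole bh) {
	bh.consume(IterativeAtRemover.remove(state.list, removeAts));
}

JMH then excludes the create() call from the measured time, subject to the warnings quoted above.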

What else?

Are there other solutions for this problem?

@shipilev

First, you have to understand that there is no good solution. There are several approaches one can use to approximate what is wanted:

a. Keep it non-steady-state: run down the removals until the collection is empty, then refill it in one go. This amortizes the replenishment over the number of elements in the collection. The downside is the lack of steady state: you get the average performance across different cases, i.e. the entire spectrum from a completely full collection to a completely empty one.
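
A minimal sketch of (a), assuming for simplicity that removing the last element is representative and that createArrayList() is accessible as a static helper (hypothetical code):

@State(Scope.Thread)
public static class DrainableList {
	List<Integer> list = new ArrayList<>();
}

@Benchmark
public Integer removeUntilEmpty(DrainableList state) {
	// refill in one go; the cost is amortized over all removals,
	// but the score averages over every fill level
	if (state.list.isEmpty())
		state.list = createArrayList();
	return state.list.remove(state.list.size() - 1);
}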

b. Replenish the collection outside of the measurement, i.e. with setup at Level.Invocation. On the plus side, it keeps the code under measurement the same. On the minus side, it gets all the caveats of Level.Invocation.

c. Replenish the collection inside the measurement, i.e. in the same @Benchmark method. On the plus side, there are no caveats from @Setup(Level.Invocation). On the minus side, the measurement now includes the setup costs. It might be worthwhile to explore the quickest way to reinstantiate the collection to offset the setup costs.
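
The iterativeAt benchmark from the issue text is already an instance of (c). One hypothetical way to cheapen the refill, assuming a template array built once per trial, might look like this:

private Integer[] template;

@Setup(Level.Trial)
public void buildTemplate() {
	template = createArrayList().toArray(new Integer[0]);
}

@Benchmark
public void iterativeAtWithCheapRefill(Blackhole bh) {
	// a bulk copy is probably cheaper than element-wise creation,
	// but its cost is still part of the measurement
	List<Integer> list = new ArrayList<>(Arrays.asList(template));
	bh.consume(IterativeAtRemover.remove(list, removeAts));
}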

d. Pair the removal with an addition. That is, for the workload that removes an element, have a paired operation that inserts it back. While this mixes the workloads, it does keep you at steady state.
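
A minimal sketch of (d), where list and removeAt are assumed to be pre-built state fields (hypothetical code); the score then covers a remove/add pair rather than a bare removal:

@Benchmark
public void removeAndReAdd(Blackhole bh) {
	Integer removed = list.remove(removeAt);
	bh.consume(removed);
	// restore the element, so the list stays at steady state
	list.add(removeAt, removed);
}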

e. Create a series of lists to remove from and loop over them in the benchmark method. This also offsets the replenishment costs. Hopefully you can run down the entire series in some decent time, avoiding replenishment altogether.
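
A minimal sketch of (e), assuming a static createArrayList() helper and a series long enough that an iteration rarely wraps around to an already-drained list (hypothetical code):

@State(Scope.Thread)
public static class ListSeries {

	static final int SERIES_LENGTH = 1_000;

	List<List<Integer>> lists;
	int next;

	@Setup(Level.Iteration)
	public void refill() {
		lists = new ArrayList<>();
		for (int i = 0; i < SERIES_LENGTH; i++)
			lists.add(createArrayList());
		next = 0;
	}
}

@Benchmark
public void iterativeAtOverSeries(ListSeries state, Blackhole bh) {
	// ideally an iteration ends before wrapping around to a drained list
	List<Integer> list = state.lists.get(state.next++ % ListSeries.SERIES_LENGTH);
	bh.consume(IterativeAtRemover.remove(list, removeAts));
}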

f. Give up. Accept that some things you just cannot accurately measure.


nipafx commented Mar 15, 2017

Thank you very much for your input 👍. I will think it over and try some things out.
