Add support for profiling benchmarks and reporting results to ReBenchDB #166

smarr opened this issue Nov 3, 2021 · 2 comments

smarr commented Nov 3, 2021

Looking at changes in benchmark numbers alone is unfortunately rarely very insightful.

To understand what benchmarks spend their time on, it would be useful to add support for profiling.
Once upon a time, we had support for this already (for details see #18 and #9; the code was removed in 6e6e251).

At this point, I am looking to add support for function-level profiling of interpreters with perf, Xcode Instruments, or perhaps Java's Flight Recorder.

Most urgent for me is the ability to profile the executors. One may also want to profile the benchmarks themselves. The difference is the level at which profiling is done: at the VM level or at the application level.

Desired Features

  • make profiling information available where we analyze performance
  • define profiling commands/parameters for executors and benchmark suites (a rough sketch of what this could look like follows below)
  • add a profiling execution mode or experiment setup to use the profiling commands/parameters
  • collect profiling information, extract the basic data, and send it to ReBenchDB for storage
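
As a rough illustration of the second item, here is what a parsed configuration fragment might correspond to, written as a Python dict. The profiler section and the profile action are purely hypothetical; the surrounding keys only mimic the existing configuration structure, and the actual schema still needs to be designed.

# Hypothetical only: none of the profiling-related keys exist in ReBench yet.
profiling_config = {
    "executors": {
        "som-native-interp-ast": {
            "path": ".",
            "executable": "som-native-interp-ast",
            "profiler": {                        # hypothetical new section
                "perf": {
                    "record_args": "-g -F 9999 --call-graph lbr",
                    "report_args": "-g graph --no-children --stdio",
                },
            },
        },
    },
    "experiments": {
        "profile-micro": {
            "suites": ["micro"],
            "action": "profile",                 # hypothetical profiling mode switch
        },
    },
}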

ReBenchDB Mockup

An integration in ReBenchDB could include a new unfoldable section that shows the basic profile. In this case, it shows the result of:

perf record -g -F 9999 --call-graph lbr ./som-native-interp-ast -cp Smalltalk:Examples/Benchmarks/LanguageFeatures Examples/Benchmarks/BenchmarkHarness.som Dispatch 10 0 20
perf report -g graph --no-children --stdio

[Screenshot: ReBenchDB mockup showing the perf profile in an unfoldable section]

Once we have the data, we may also want a feature to compare profiles, similar to how we compare performance. Since the collected profiling data may be relative to the overall run time, it might be necessary to take the actual run time into account when judging differences, for instance to avoid reporting an increase when the overall time actually decreased but a function's relative share increased.

[Screenshot: mockup of a profile comparison in ReBenchDB]
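
To make the issue with relative data concrete, a minimal sketch with made-up numbers: a function's share of the run time can increase even though its absolute time decreased, simply because the total run time went down.

# Made-up numbers, purely illustrative.
base_total_ms, change_total_ms = 1000.0, 700.0   # overall run time
base_share, change_share = 0.30, 0.35            # share of run time in one function

base_abs_ms = base_share * base_total_ms         # 300 ms
change_abs_ms = change_share * change_total_ms   # 245 ms

print(f"relative change: {change_share - base_share:+.0%}")       # +5%
print(f"absolute change: {change_abs_ms - base_abs_ms:+.0f} ms")  # -55 ms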

Design Considerations

Integration with Benchmarking

For the seamless integration with benchmarking, we need to be able to match benchmark data with profiling data.
This means that, internally, things need to end up with the same RunId.
That is, a specific profiling Run needs to be identified by the command line of the original benchmark Run.

Currently, we use those RunIds also to store data, track progress, etc.

It seems like I should probably leave the handling of RunIds alone, and track completion differently, if at all.

One way of doing this would be to have a different way of executing things.
rebench.executor.Executor works together with the RunScheduler to identify the runs to be executed and to compose the final command line.

When composing the final command line for profiling, we need to consider the details of the profiler. This could perhaps be realized as a gauge_adapter?

Though, how do I track completion? One way might simply be a different data store, where only the details needed for completion are tracked, and possibly the profiling results.
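
As a very rough sketch of this direction (the function below is made up, it is not existing ReBench code): the profiling execution mode could wrap the benchmark run's command line with perf record, while the RunId keeps being derived from the original, unwrapped command line.

def compose_profiling_command(benchmark_cmdline: str, run_id: str,
                              perf_args: str = "-g -F 9999 --call-graph lbr") -> str:
    # run_id is only used to pick a distinct output file per run;
    # the run itself stays identified by benchmark_cmdline.
    out_file = f"profile-{run_id}.perf"
    return f"perf record {perf_args} -o {out_file} {benchmark_cmdline}"

original = ("./som-native-interp-ast -cp Smalltalk:Examples/Benchmarks/LanguageFeatures "
            "Examples/Benchmarks/BenchmarkHarness.som Dispatch 10 0 20")
print(compose_profiling_command(original, run_id="dispatch-example"))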

Machine Setup, Denoise

For benchmarking, we may want to reduce interference, including possible profiling interrupts, as much as possible.
For profiling, on the other hand, we may want to configure the machine specifically for profiling.
I don't know whether these settings make a practical difference for benchmarks when no profiling is actually done, though I guess there might be one.

So, in the unlikely event that there is a difference, one may want to run benchmarks and profiling with different machine setups.

At the moment, we run denoise at the start, before running benchmarks, and then disable it afterwards. Thus, we don't do it before every benchmark.

To keep it like this, we need to keep profiling and benchmarking separate.
But since the benchmarking and profiling configurations would likely result in the same experiments, which ReBench currently doesn't handle, this is probably a good idea anyway.

TODO

  • add a basic implementation supporting perf to ReBench
  • parse data and send a compact representation to ReBenchDB (a rough parsing sketch follows below)
  • instead of a stats summary, perhaps show the first 3-4 lines of the profile in the summary after a ReBench run
  • add support for other profilers, perhaps just for running them; this may need ways to define output files, as well as profiler selection
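
For the second and third items above, a minimal sketch of extracting a compact representation from the textual output of perf report --stdio. The line format can vary between perf versions, and what exactly gets sent to ReBenchDB is still to be defined, so this is only illustrative.

import re

# Top-level entries in `perf report --stdio` output start with a percentage,
# followed by command, shared object, and symbol; '#' lines are comments.
ENTRY = re.compile(r"^\s*(\d+\.\d+)%\s+(\S+)\s+(\S+)\s+(.+)$")

def compact_profile(report_text: str, top_n: int = 20):
    """Return the top-n flat entries as small dicts, e.g. for sending as JSON."""
    entries = []
    for line in report_text.splitlines():
        if line.startswith("#"):
            continue
        m = ENTRY.match(line)
        if m:
            entries.append({"percent": float(m.group(1)),
                            "binary": m.group(3),
                            "symbol": m.group(4).strip()})
    entries.sort(key=lambda e: e["percent"], reverse=True)
    return entries[:top_n]

# The first few entries could also serve as the short summary printed
# after a ReBench run, instead of a stats summary.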

smarr commented Nov 3, 2021

Notes on invoking profiling with tools other than perf:

  • Xcode: xcrun xctrace record --template 'Time Profiler' --output tr2.trace --launch -- /Users/smarr/Projects/FastStart/truffleruby/mxbuild/truffleruby-native/languages/ruby/bin/ruby --experimental-options --engine.Compilation=false harness.rb MicroDispatchBase 200 40

  • Java's Flight Recorder needs the following parameters: -XX:+UnlockCommercialFeatures -XX:+FlightRecorder -XX:StartFlightRecording=delay=10s,duration=10d,name=fr-recording2,filename=fr-recording2.jfr,settings=profile

Compilation Changes

  • GraalVM native image compilation may or may not need some of the following arguments: -H:-DeleteLocalSymbols -g


smarr commented Nov 3, 2021

Some useful links, also to web-based profile inspectors:

One may want to keep the raw profile data around for inspection in an IDE or with local tools. Though, for longer-term archival, we probably need to keep a more compact representation.
