Add support for profiling benchmarks and reporting results to ReBenchDB #166

smarr opened this issue Nov 3, 2021 · 2 comments

smarr commented Nov 3, 2021

Looking at changes in benchmark numbers alone is unfortunately rarely very insightful.

To understand what benchmarks spend their time on, it would be useful to add support for profiling.
Once upon a time, we had support for this already (for details see #18 and #9; the code was removed in 6e6e251).

At this point, I am looking to add support for function-level profiling of interpreters with perf, Xcode Instruments, or perhaps Java's Flight Recorder.

Most urgent for me is the ability to profile the executors. One may also want to profile the benchmarks themselves. The difference is the level at which profiling is done: at the VM level or at the application level.

Desired Features

  • make profiling information available where we analyze performance
  • define profiling commands/parameters for executors and benchmark suites (a rough sketch of what this could look like follows below)
  • add a profiling execution mode or experiment setup to use the profiling commands/parameters
  • collect profiling information, extract the basic data, and send it to ReBenchDB for storage
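
As a rough illustration of the second item, here is what a parsed configuration fragment might correspond to, written as a Python dict. The profiler section and the profile action are purely hypothetical; the surrounding keys only mimic the existing configuration structure, and the actual schema still needs to be designed.

# Hypothetical only: none of the profiling-related keys exist in ReBench yet.
profiling_config = {
    "executors": {
        "som-native-interp-ast": {
            "path": ".",
            "executable": "som-native-interp-ast",
            "profiler": {                        # hypothetical new section
                "perf": {
                    "record_args": "-g -F 9999 --call-graph lbr",
                    "report_args": "-g graph --no-children --stdio",
                },
            },
        },
    },
    "experiments": {
        "profile-micro": {
            "suites": ["micro"],
            "action": "profile",                 # hypothetical profiling mode switch
        },
    },
}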

ReBenchDB Mockup

An integration in ReBenchDB could include a new unfoldable section that shows the basic profile. In this case, it shows the result of:

perf record -g -F 9999 --call-graph lbr ./som-native-interp-ast -cp Smalltalk:Examples/Benchmarks/LanguageFeatures Examples/Benchmarks/BenchmarkHarness.som Dispatch 10 0 20
perf report -g graph --no-children --stdio

[Screenshot: ReBenchDB mockup showing the perf profile in an unfoldable section]

Once we have the data, we may also want a feature to compare profiles, similar to how we compare performance. Since the collected profiling data may be relative to the overall run time, it might be necessary to take the actual run time into account when judging differences, for instance to avoid reporting an increase when the overall time actually decreased but a function's relative share increased.

[Screenshot: mockup of a profile comparison in ReBenchDB]
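
To make the issue with relative data concrete, a minimal sketch with made-up numbers: a function's share of the run time can increase even though its absolute time decreased, simply because the total run time went down.

# Made-up numbers, purely illustrative.
base_total_ms, change_total_ms = 1000.0, 700.0   # overall run time
base_share, change_share = 0.30, 0.35            # share of run time in one function

base_abs_ms = base_share * base_total_ms         # 300 ms
change_abs_ms = change_share * change_total_ms   # 245 ms

print(f"relative change: {change_share - base_share:+.0%}")       # +5%
print(f"absolute change: {change_abs_ms - base_abs_ms:+.0f} ms")  # -55 ms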

Design Considerations

Integration with Benchmarking

For the seamless integration with benchmarking, we need to be able to match benchmark data with profiling data.
This means that, internally, things need to end up with the same RunId.
That is, a specific profiling Run needs to be identified by the command line of the original benchmark Run.

Currently, we use those RunIds also to store data, track progress, etc.

It seems like I should probably leave the handling of RunIds alone, and track completion differently, if at all.

One way of doing this would be to have a different way of executing things.
rebench.executor.Executor works together with the RunScheduler to identify the runs to be executed and to compose the final command line.

When composing the final command line for profiling, we need to consider the details of the profiler. This could perhaps be realized as a gauge_adapter?

Though, how do I track completion? One way might simply be a different data store, where only the details needed for completion are tracked, and possibly the profiling results.
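
As a very rough sketch of this direction (the function below is made up, it is not existing ReBench code): the profiling execution mode could wrap the benchmark run's command line with perf record, while the RunId keeps being derived from the original, unwrapped command line.

def compose_profiling_command(benchmark_cmdline: str, run_id: str,
                              perf_args: str = "-g -F 9999 --call-graph lbr") -> str:
    # run_id is only used to pick a distinct output file per run;
    # the run itself stays identified by benchmark_cmdline.
    out_file = f"profile-{run_id}.perf"
    return f"perf record {perf_args} -o {out_file} {benchmark_cmdline}"

original = ("./som-native-interp-ast -cp Smalltalk:Examples/Benchmarks/LanguageFeatures "
            "Examples/Benchmarks/BenchmarkHarness.som Dispatch 10 0 20")
print(compose_profiling_command(original, run_id="dispatch-example"))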

Machine Setup, Denoise

For benchmarking, we may want to reduce interference, including possible profiling interrupts, as much as possible.
For profiling, on the other hand, we may want to configure the machine specifically for profiling.
I don't know whether these settings make a practical difference for benchmarks when no profiling is actually done, though I guess there might be one.

So, in the unlikely event that there is a difference, one may want to run benchmarks and profiling with different machine setups.

At the moment, we run denoise at the start, before running benchmarks, and then disable it afterwards. Thus, we don't do it before every benchmark.

To keep it like this, we need to keep profiling and benchmarking separate.
But since the benchmarking and profiling configurations would likely result in the same experiments, which ReBench currently doesn't handle, this is probably a good idea anyway.

TODO

  • add a basic implementation supporting perf to ReBench
  • parse data and send a compact representation to ReBenchDB (a rough parsing sketch follows below)
  • instead of a stats summary, perhaps show the first 3-4 lines of the profile in the summary after a ReBench run
  • add support for other profilers, perhaps just for running them; this may need ways to define output files, as well as profiler selection
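
For the second and third items above, a minimal sketch of extracting a compact representation from the textual output of perf report --stdio. The line format can vary between perf versions, and what exactly gets sent to ReBenchDB is still to be defined, so this is only illustrative.

import re

# Top-level entries in `perf report --stdio` output start with a percentage,
# followed by command, shared object, and symbol; '#' lines are comments.
ENTRY = re.compile(r"^\s*(\d+\.\d+)%\s+(\S+)\s+(\S+)\s+(.+)$")

def compact_profile(report_text: str, top_n: int = 20):
    """Return the top-n flat entries as small dicts, e.g. for sending as JSON."""
    entries = []
    for line in report_text.splitlines():
        if line.startswith("#"):
            continue
        m = ENTRY.match(line)
        if m:
            entries.append({"percent": float(m.group(1)),
                            "binary": m.group(3),
                            "symbol": m.group(4).strip()})
    entries.sort(key=lambda e: e["percent"], reverse=True)
    return entries[:top_n]

# The first few entries could also serve as the short summary printed
# after a ReBench run, instead of a stats summary.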

smarr commented Nov 3, 2021

Notes on invoking profiling with tools other than perf:

  • Xcode: xcrun xctrace record --template 'Time Profiler' --output tr2.trace --launch -- /Users/smarr/Projects/FastStart/truffleruby/mxbuild/truffleruby-native/languages/ruby/bin/ruby --experimental-options --engine.Compilation=false harness.rb MicroDispatchBase 200 40

  • Java's Flight Recorder needs the following parameters: -XX:+UnlockCommercialFeatures -XX:+FlightRecorder -XX:StartFlightRecording=delay=10s,duration=10d,name=fr-recording2,filename=fr-recording2.jfr,settings=profile

Compilation Changes

  • GraalVM native image compilation may or may not need some of the following arguments: -H:-DeleteLocalSymbols -g


smarr commented Nov 3, 2021

Some useful links, also to web-based profile inspectors:

One may want to keep the raw profile data around for inspection in an IDE or with local tools. Though, for longer-term archival, we probably need to keep a more compact representation.
