Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom logging/visualization during the benchmark run #2054

Closed
caaavik-msft opened this issue Jul 28, 2022 · 9 comments · Fixed by #2420
Closed

Custom logging/visualization during the benchmark run #2054

caaavik-msft opened this issue Jul 28, 2022 · 9 comments · Fixed by #2420
Assignees
Milestone

Comments

@caaavik-msft
Copy link
Contributor

Currently, if we want to do some custom visualization or export of benchmark results, we have to wait until all the benchmarks have been completed to be able to interpret the results by using a custom Exporter. As an example, the only way to see the results of benchmarks as they run is by what is passed through to the logger by BDN (link to relevant source). In addition to this, the benchmark runner in BDN does lots of logging at various stages and there is no way to customize these logs or to use them meaningfully outside the context of a console/terminal.

For large projects which have lots of benchmarks that take a long time to run, it would be good to be able to hook into the various stages of the benchmark running so that we don't have to wait until the whole run is completed to be able to get custom exported information.

For this issue, I wanted to suggest a new feature to hook into the benchmark running to be able to subscribe to "events". An example of an event might be "RunStarted" or "RunCompleted" and those events would have read-only data attached to them containing the additional data about the event. I think as well, that all the logging code that is done today by BDN could be moved out to use this same abstraction, that way we can replace or extend it with our own custom logging. Not only would this be useful for customising the log output, but it could also provide the ability to build tools on top of BDN that display live execution results graphically rather than textually.

Diagnosers do somewhat implement this functionality with the Handle(HostSignal, DiagnoserActionParameters) method, but they are only run within the context of a single benchmark execution. If we want to subscribe to events such as when a benchmark case has completed running and the BenchmarkReport is available such as the example linked above, then it would be difficult to see how IDiagnoser could be extended to accommodate such an event.

I am willing to implement this but wanted to get some confirmation that this is a desirable feature for the project. If given the go-ahead, I can follow up with a proposal for how it could be implemented. Alternatively, if this feature is a bit too ambitious with wanting to have events for everything that needs to be logged, I think I still would like to look into something that lets me handle BenchmarkReport objects as they are created.

@adamsitnik
Copy link
Member

I am willing to implement this but wanted to get some confirmation that this is a desirable feature for the project. If given the go-ahead, I can follow up with a proposal for how it could be implemented.

@caaavik-msft I confirm that this is a desirable feature. Please go ahead and send the proposal.

@caaavik-msft
Copy link
Contributor Author

caaavik-msft commented Aug 2, 2022

I spent some time writing up a proposal and realised it was getting a bit complex, and also that during the functioning of a benchmark run there are lots of cases where logs are done so it wouldn't be possible to replace every log with something that can be dispatched to an event handler. Instead, I just brought it down to the most common events and I think it will cover 99% of use cases for someone wanting to build any tools on top of BDN. I think as part of implementing this we would leave the logging in as it is, and it would not be brought out into the event handler.

I have a commit on my fork implementing this proposal to show it in practice: caaavik-msft@ab86077

Proposal

My proposal is to add a new interface called an IBenchmarkEventHandler. This interface defines handle methods for all the events that occur during the benchmark running process. Since there isn't any concept of events currently in BDN, I had to make them myself based off my understanding of how BDN works. The result is the following interface:

public interface IBenchmarkEventHandler
{
    void HandleStartValidationStage();
    void HandleUnsupportedBenchmark(BenchmarkCase benchmarkCase);
    void HandleValidationError(ValidationError validationError);
    void HandleStartBuildStage();
    void HandleBuildFailed(BenchmarkCase benchmarkCase, BuildResult buildResult);
    void HandleStartRunStage();
    void HandleRunBenchmarksInType(Type type, IReadOnlyList<BenchmarkCase> benchmarks);
    void HandleCompletedBenchmarksInType(Type type, Summary summary);
    void HandleRunBenchmark(BenchmarkCase benchmarkCase);
    void HandleCompletedBenchmark(BenchmarkCase benchmarkCase, BenchmarkReport report);
}

Here is an explanation of what each event represents:

  • HandleStartValidationStage: called before the validators and GetSupportedBenchmarks is called
  • HandleUnsupportedBenchmark: called for every BenchmarkCase that is not supported by the toolchain
  • HandleValidationError: called for every validation error
  • HandleStartBuildStage: called before the benchmarks are split up into partitions and built
  • HandleBuildFailed: called for every BenchmarkCase that had a failed build, along with the build result
  • HandleStartRunStage: called after the build stage, and before the loop over all the BenchmarkRunInfos
  • HandleRunBenchmarksInType: called before running the benchmarks for a single BenchmarkRunInfo
  • HandleCompletedBenchmarksInType: called after running the benchmarks for a single BenchmarkRunInfo and we have a Summary object
  • HandleRunBenchmark: called before running a given BenchmarkCase
  • HandleCompletedBenchmark: called after running a given BenchmarkCase and passed in the resulting BenchmarkReport

This type will be registered the same way as the other types like IExporter, IDiagnoser in that they are stored on the IConfig and can be added using a AddEventHandler(IBenchmarkEventHandler eventHandler) method on ManualConfig or could be set using attributes. Multiple event handlers can be set at once, and they can also be configured to be enabled for specific benchmark classes.

The commit I linked above is more of a proof-of-concept at the moment, so I wouldn't bother reviewing it just yet until we have had some time to review this proposal and see if we want to change the names of anything or add/remove any events. There are some other events I was considering adding such as events for each iteration (Pilot, OverheadWarmup, etc.) inside the running of a BenchmarkCase but didn't want to go crazy with lots of events when I am not sure if there will be a need to get that level of detail. Let me know what you think!

@adamsitnik
Copy link
Member

@albahari based on the integration that you have recently implemented for LINQPad (https://twitter.com/linqpad/status/1576879508078366720), is there anything you would like to add/change in this proposal?

@albahari
Copy link
Contributor

albahari commented Oct 6, 2022

That's a very good question. To create the live results visualizer in LINQPad, I implemented BenchmarkDotNet.Loggers.ILogger (I also ended up creating an event listener). In the Write/WriteLine implementations of ILogger, I look for lines that start with "OverheadActual", WorkloadActual", "WorkloadPilot", "WorkloadWarmup" and "GC:", and parse them by calling Measurement.Parse and GcStats.Parse. In order to know which case the results apply to, I ended up using the EventListener enabled for "BenchmarkDotNet.EngineEventSource", matching on eventData.EventName == "BenchmarkStart".

An interface such as the one you describe would make this considerably simpler and more robust, although I guess it would require a method to capture measurements and GC stats. Most of all, it would be great if it had a hook that fired when an exception was thrown by the code being benchmarked. Detecting this, right now, is the most brittle part of the LINQPad implementation. BenchmarkDotNet currently writes exceptions to ILogger with a LogKind.Default (and without a textual prefix) and then later writes another entry (without the exception text) with a LogKind.Error and text that contains "ExitCode != 0". Detecting and parsing this is unreliable as you can imagine, and I'm not sure that it can work with non-English locales. A HandleUserException(Exception) method would solve this nicely.

Also, might it be an idea to define BenchmarkEventHandler as a class with virtual (non-abstract) methods rather than an interface, given that it consists of hook methods?

@caaavik-msft
Copy link
Contributor Author

Completed in #2420

@timcassell timcassell linked a pull request Sep 21, 2023 that will close this issue
@timcassell timcassell added this to the v0.13.9 milestone Sep 21, 2023
@adamsitnik
Copy link
Member

Thank you for your contribution @caaavik-msft !

@adamsitnik
Copy link
Member

@albahari We won't be able to change this API easily in the future (we expect that more tooling will start depending on it very soon). It would be great if you could take a look at what we have right now, before we ship it and let us know if it meets LINQPad requirements.

The main type is define here: https://github.com/dotnet/BenchmarkDotNet/blob/master/src/BenchmarkDotNet/EventProcessors/EventProcessor.cs

You should be able to get the preview NuGet package from

<add key="bdn-nightly" value="https://www.myget.org/F/benchmarkdotnet/api/v3/index.json" />

@albahari
Copy link
Contributor

LINQPad updates its visualization by responding to each measurement, immediately after the measurement is logged. This enables early feedback on execution times and allows you see in real time how the performance figures stabilize as data from more iterations comes in. As far as I can see, none of the methods on EventProcessor fire until all measurements have been averaged and the test case has ended. Is there something that I'm missing here?

Also, I don't see a way to capture exception data, which is the most fragile part of parsing the output log. Maybe that belongs in a separate request.

@timcassell
Copy link
Collaborator

@caaavik-msft Do you want to extend the functionality to cover @albahari's needs?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants