Add a by-benchmark longitudinal plot #429
Pull Request Overview
This PR adds support for generating benchmark longitudinal plots to show performance changes over time per benchmark. Key changes include:
- Updates to documentation in README files to introduce the new longitudinal plot.
- Modifications in generate_results.py to include the benchmark longitudinal plot in the plot generation process.
- Implementation of the new benchmark_longitudinal_plot function and its configuration in plot.py.
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| bench_runner/templates/README.md | Documentation update to add a link to the new longitudinal plot. |
| bench_runner/scripts/generate_results.py | Updated to include benchmark_longitudinal_plot plot generation. |
| bench_runner/plot.py | Added new function and configuration for benchmark longitudinal plots. |
| README.md | Updated documentation sections to include benchmark longitudinal plot. |
Comments suppressed due to low confidence (1)
bench_runner/plot.py:561
- Ensure that 'r.flags' is consistently sorted prior to comparing with 'cfg["head_flags"]' so that order discrepancies do not lead to unexpected mismatches; consider sorting both values or using set comparison if order is not significant.
if r.version.startswith(cfg["version"]) and r.flags == cfg["head_flags"]:
    base = None
    for r in results:
        if r.version == cfg["base"] and r.flags == cfg["base_flags"]:
Consider ensuring that 'r.flags' is consistently sorted before comparison with 'cfg["base_flags"]' to guarantee reliable equality checking; if the order is not guaranteed, using a sorted version or a set comparison might be more robust.
Suggested change:

    - if r.version == cfg["base"] and r.flags == cfg["base_flags"]:
    + if r.version == cfg["base"] and set(r.flags) == set(cfg["base_flags"]):
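
A minimal sketch of why ordering matters in this comparison, assuming `flags` is a list of strings. The flag values below are made up for illustration and are not taken from bench_runner:

```python
# Hypothetical flag lists; real values would come from the results and
# from the plot configuration.
result_flags = ["JIT", "NOGIL"]
config_flags = ["NOGIL", "JIT"]

# List equality is order-sensitive, so the same flags in a different
# order do not compare equal.
lists_equal = result_flags == config_flags

# Set equality ignores order (and also ignores duplicates).
sets_equal = set(result_flags) == set(config_flags)

# Comparing sorted copies ignores order but still distinguishes duplicates.
sorted_equal = sorted(result_flags) == sorted(config_flags)

print(lists_equal, sets_equal, sorted_equal)
```

If duplicate flags are never meaningful, the set comparison is the simpler choice; if they could be, comparing sorted lists is the safer one.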
    cfg = get_benchmark_longitudinal_plot_config()

    results = [r for r in results if r.fork == "python" and r.nickname == cfg["runner"]]
Maybe this is better suited for a follow-up PR, but how about making 'runners' a list so we can plot different runners in a single graph?
Yeah, that's a good idea.
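
One possible shape for that follow-up, sketched under assumptions: the `Result` class and the `filter_by_runners` helper are hypothetical stand-ins, not part of bench_runner, and the config key for the list of runners is left open.

```python
from dataclasses import dataclass


@dataclass
class Result:
    # Hypothetical stand-in for bench_runner's result objects; only the
    # two attributes used by the existing filter are modelled here.
    fork: str
    nickname: str


def filter_by_runners(results, runners):
    """Keep results from the "python" fork whose runner is in `runners`."""
    wanted = set(runners)
    return [r for r in results if r.fork == "python" and r.nickname in wanted]


# Made-up data for illustration only.
results = [
    Result("python", "linux"),
    Result("python", "darwin"),
    Result("other", "linux"),
    Result("python", "windows"),
]
selected = filter_by_runners(results, ["linux", "darwin"])
print([r.nickname for r in selected])
```

Each selected runner could then be drawn as its own series in the same axes, which is a plotting decision this sketch does not cover.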
This produces by-benchmark longitudinal plots, for example:
[Image: Plot by benchmark]