
Usability improvements: transparent levels, compute(), better errors#834

Merged
blooop merged 7 commits into main from feature/usability-improvements on Apr 5, 2026

Conversation

@blooop
Owner

@blooop blooop commented Mar 25, 2026

Summary

Addresses downstream user feedback that bencher is unintuitive. Four targeted changes:

  • Transparent level parameter: extract a LEVEL_SAMPLES constant, add a BenchRunCfg.level_to_samples() lookup, improve the level docstring with a mapping table, add a samples_per_var parameter as a direct alternative to level, and log effective sample counts at INFO level
  • Simpler benchmark definitions: __init_subclass__ auto-wrapping lets subclasses define compute() instead of the update_params_from_kwargs + super().__call__() boilerplate. Classic __call__ pattern still works unchanged.
  • Helpful error messages: String/dict variable lookups now raise KeyError listing available parameter names with "Did you mean?" suggestions via difflib.get_close_matches
  • BenchRunCfg docs & factories: Restructured docstring with grouped parameter sections and quick-start examples. New for_time_series() and for_ci() factory classmethods.
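The compute() auto-wrapping in the second bullet can be sketched as follows. This is an illustrative reconstruction, not the actual bencher implementation: the base-class internals here (a minimal update_params_from_kwargs and __call__) are stand-ins for the real ParametrizedSweep machinery.

```python
# Hypothetical sketch of __init_subclass__ auto-wrapping a compute() method.
# ParametrizedSweep here is a minimal stand-in for the real bencher class.
class ParametrizedSweep:
    def update_params_from_kwargs(self, **kwargs):
        for name, value in kwargs.items():
            setattr(self, name, value)

    def __call__(self, **kwargs):
        self.update_params_from_kwargs(**kwargs)
        return {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Opt-in only: wrap subclasses that define compute() but not __call__,
        # so the classic __call__ pattern keeps working unchanged.
        if "compute" in cls.__dict__ and "__call__" not in cls.__dict__:
            def __call__(self, **kw):
                self.update_params_from_kwargs(**kw)
                return self.compute()
            cls.__call__ = __call__


class MyBench(ParametrizedSweep):
    theta = 0.0

    def compute(self):
        return {"out": self.theta * 2}


print(MyBench()(theta=3.0))  # {'out': 6.0}
```

With this wrapping, a subclass body shrinks to just parameters plus compute(), replacing the update_params_from_kwargs + super().__call__() sandwich.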

Test plan

  • pixi run ci passes (945 tests, format, lint all clean)
  • New test/test_usability.py covers compute() method, LEVEL_SAMPLES, level_to_samples(), samples_per_var, and factory classmethods (16 tests)
  • New tests in test/test_sweep_executor.py cover helpful error messages on bad string/dict var names (2 tests)
  • All existing tests unaffected — compute() wrapping is opt-in only

🤖 Generated with Claude Code

Summary by Sourcery

Improve benchmark configuration usability, sample control, and error feedback for sweep variables.

New Features:

  • Add a LEVEL_SAMPLES constant and BenchRunCfg.level_to_samples() helper for querying sample counts from a sampling level.
  • Introduce a samples_per_var run configuration option that directly sets the sample count for all sweep variables, taking precedence over level.
  • Add BenchRunCfg.for_time_series() and BenchRunCfg.for_ci() factory helpers for common time-series and CI benchmarking setups.
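The level_to_samples() helper can be illustrated with a small sketch. The LEVEL_SAMPLES values below follow the 2^(n-1)+1 progression implied by the PR's docstring example (level 5 → 9 samples); the authoritative table lives in bencher/variables/sweep_base.py and may differ.

```python
# Illustrative level -> samples table; indices 1..12 are the valid levels,
# matching the review's note that index 0 is unused padding.
LEVEL_SAMPLES = [0, 1, 2, 3, 5, 9, 17, 33, 65, 129, 257, 513, 1025]

def level_to_samples(level: int, max_level: int = 12) -> int:
    """Return the samples-per-variable for a sampling level, capped at max_level."""
    if level < 1 or level >= len(LEVEL_SAMPLES):
        raise ValueError(
            f"level must be between 1 and {len(LEVEL_SAMPLES) - 1}, got {level}"
        )
    return LEVEL_SAMPLES[min(max_level, level)]

print(level_to_samples(5))  # 9
```

This makes the previously opaque level parameter queryable: callers can ask exactly how many samples a given level will produce before running a sweep.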

Bug Fixes:

  • Make string and dict-based variable lookups in the sweep executor raise KeyError with available parameter names and "Did you mean?" suggestions on typos.

Enhancements:

  • Rewrite BenchRunCfg documentation with quick-start examples and grouped parameter sections to clarify usage.
  • Adjust sweep execution to apply samples_per_var or level consistently and log the effective sample counts.

Tests:

  • Add test_usability.py covering LEVEL_SAMPLES, level_to_samples(), samples_per_var behavior, and BenchRunCfg factory helpers.
  • Extend sweep executor tests to validate the new helpful error messages for invalid string or dict variable names.

…errors, BenchRunCfg factories

Address downstream feedback that bencher is unintuitive:

1. Make `level` transparent:
   - Extract LEVEL_SAMPLES constant with documented mapping
   - Add BenchRunCfg.level_to_samples() for programmatic lookup
   - Improve level docstring with full mapping table
   - Add samples_per_var parameter as direct alternative to level
   - Log effective sample counts at INFO level

2. Reduce __call__ boilerplate:
   - Add __init_subclass__ auto-wrapping so subclasses can define
     compute() instead of the update_params_from_kwargs + super().__call__
     sandwich. Classic __call__ pattern still works unchanged.

3. Better error messages for string var refs:
   - Wrap dict lookup with helpful KeyError listing available params
   - Use difflib.get_close_matches for "Did you mean?" suggestions

4. BenchRunCfg documentation & factories:
   - Restructure docstring with grouped parameter sections and examples
   - Add for_time_series() and for_ci() factory classmethods
@sourcery-ai
Contributor

sourcery-ai bot commented Mar 25, 2026

Reviewer's Guide

Makes benchmark configuration and usage more transparent by centralizing level→sample mapping, adding an explicit samples_per_var override and convenience factory methods on BenchRunCfg, improving documentation, logging and error messages, and tightening tests around these behaviors.

Sequence diagram for plot_sweep applying samples_per_var and level

sequenceDiagram
    actor User
    participant PlotAPI as plot_sweep
    participant RunCfg as BenchRunCfg
    participant Inputs as SweepVariables

    User->>PlotAPI: call plot_sweep(input_vars_in, run_cfg)
    PlotAPI->>RunCfg: read samples_per_var
    alt samples_per_var is not None
        PlotAPI->>Inputs: with_samples(run_cfg.samples_per_var) on each variable
        PlotAPI-->>User: log info "samples_per_var applied"
    else samples_per_var is None
        PlotAPI->>RunCfg: read level
        alt level > 0
            PlotAPI->>Inputs: with_level(run_cfg.level) on each variable
            PlotAPI->>RunCfg: BenchRunCfg.level_to_samples(run_cfg.level)
            PlotAPI-->>User: log info "level -> samples per variable"
        else level == 0
            PlotAPI-->>User: use each variable's own samples
        end
    end
    PlotAPI-->>User: continue with sweep execution

Sequence diagram for SweepExecutor parameter lookup with helpful KeyError

sequenceDiagram
    participant Caller
    participant Exec as SweepExecutor
    participant Worker as ParametrizedSweep

    Caller->>Exec: convert_vars_to_params("var_name", Worker, var_type)
    Exec->>Exec: _lookup_param_by_name(Worker, "var_name", var_type)
    Exec->>Worker: param.objects(instance=False)
    Worker-->>Exec: params dict
    alt name in params
        Exec-->>Exec: return matching param.Parameter
        Exec-->>Caller: converted param.Parameter
    else name not in params
        Exec-->>Exec: build available list and close matches
        Exec-->>Caller: raise KeyError with available names and "Did you mean" suggestions
    end

Class diagram for updated BenchRunCfg, SweepBase, and SweepExecutor usability features

classDiagram
    class BenchPlotSrvCfg {
    }

    class BenchRunCfg {
        +int level
        +int samples_per_var
        +BenchRunCfg deep()
        +int level_to_samples(int level, int max_level)
        +BenchRunCfg for_time_series(str time_event, **kwargs)
        +BenchRunCfg for_ci(str time_event, **kwargs)
        +BenchRunCfg with_defaults(BenchRunCfg run_cfg, **defaults)
    }

    class SweepBase {
        +int samples
        +SweepBase with_samples(int samples)
        +SweepBase with_sample_values(list values)
        +tuple~SweepBase, Any~ with_const(Any const_value)
        +SweepBase with_level(int level, int max_level)
    }

    class LEVEL_SAMPLES {
        <<constant>>
        +list~int~ values
    }

    class SweepExecutor {
        +int cache_size
        +FutureCache sample_cache
        +SweepExecutor __init__(int cache_size)
        +param.Parameter _lookup_param_by_name(ParametrizedSweep worker_class_instance, str name, str var_type)
        +param.Parameter convert_vars_to_params(param.Parameter|str|dict|tuple variable, ParametrizedSweep worker_class_instance, str var_type)
    }

    BenchPlotSrvCfg <|-- BenchRunCfg

    LEVEL_SAMPLES --> SweepBase : used for level mapping
    BenchRunCfg --> LEVEL_SAMPLES : used by level_to_samples
    SweepExecutor --> ParametrizedSweep : operates on
    SweepExecutor --> param.Parameter : returns and manipulates
    BenchRunCfg ..> SweepExecutor : consumed via run_cfg in execution path

File-Level Changes

Change Details Files
Centralize and expose level-to-samples mapping and add helpers/factories on BenchRunCfg for clearer configuration.
  • Refactor BenchRunCfg docstring into grouped parameter sections with quick-start examples and an explicit level-to-samples table.
  • Add samples_per_var parameter that, when set, overrides level for all sweep variables.
  • Introduce BenchRunCfg.level_to_samples() static helper with validation and tests.
  • Add BenchRunCfg.for_time_series() and BenchRunCfg.for_ci() classmethods that preconfigure common benchmarking scenarios.
bencher/bench_cfg.py
test/test_usability.py
Use a shared LEVEL_SAMPLES constant for sweep variables and log effective sampling when applying level or samples_per_var.
  • Define LEVEL_SAMPLES in sweep_base.py as the single source of truth for level-to-samples mapping and update SweepBase.with_level() to use it.
  • Export LEVEL_SAMPLES from the bencher package and add tests to verify monotonicity and package exposure.
  • Update plot_sweep() to honor samples_per_var (with INFO logging) before level, and log level→samples mapping when level is applied.
bencher/variables/sweep_base.py
bencher/bencher.py
bencher/__init__.py
test/test_usability.py
Improve error messages when resolving sweep variables by name in the SweepExecutor.
  • Add SweepExecutor._lookup_param_by_name() helper that raises KeyError with available parameter names and difflib-based "Did you mean" suggestions on typos.
  • Use the new lookup helper for both string and dict variable specifications in convert_vars_to_params().
  • Add tests to assert helpful KeyError content for invalid string and dict variable names.
bencher/sweep_executor.py
test/test_sweep_executor.py


@github-actions

Performance Report for 4759383

Metric Value
Total tests 950
Total time 77.00s
Mean 0.0811s
Median 0.0020s
Top 10 slowest tests
Test Time (s)
test.test_bench_examples.TestBenchExamples::test_example_meta 20.674
test.test_hash_persistent.TestCrossProcessDeterminism::test_hash_stable_across_two_processes[ResultBool] 3.404
test.test_over_time_save_perf::test_save_faster_without_aggregated_tab 3.121
test.test_generated_examples::test_generated_example[result_types/result_image/result_image_to_video.py] 3.103
test.test_over_time_repeats.TestMaxSliderPoints::test_default_subsampling_caps_at_max 1.850
test.test_generated_examples::test_generated_example[1_float/over_time_repeats/sweep_1_float_3_cat_over_time_repeats.py] 1.172
test.test_result_bool.TestVolumeResult::test_volume_3float_multi_repeat 1.121
test.test_generated_examples::test_generated_example[1_float/over_time/sweep_1_float_3_cat_over_time.py] 0.920
test.test_optuna_result.TestOptunaResult::test_collect_optuna_plots_with_repeats 0.918
test.test_generated_examples::test_generated_example[3_float/over_time/sweep_3_float_2_cat_over_time.py] 0.874

Full report

Updated by Performance Tracking workflow

The __init_subclass__/compute() pattern is superseded by the benchmark()
method already on ParametrizedSweep, which solves the same boilerplate
problem more simply via runtime dispatch.
@blooop blooop marked this pull request as ready for review April 5, 2026 13:15
Contributor

@sourcery-ai sourcery-ai bot left a comment


Hey - I've found 3 issues and left some high-level feedback:

  • The LEVEL_SAMPLES table and BenchRunCfg.level bounds are effectively capped at 12, but LEVEL_SAMPLES defines a 13th entry and level_to_samples() validates against len(LEVEL_SAMPLES) - 1, which means level=13 is technically accepted there but not elsewhere; consider aligning the array length, validation, and documentation so the supported range is consistent across the codebase.
  • In BenchRunCfg.level_to_samples(), the validation and error message are tied to len(LEVEL_SAMPLES) rather than the max_level argument, which can be confusing when callers pass a custom max_level; it would be clearer to validate and phrase the error in terms of max_level or to remove max_level if clamping beyond the table is not required.
## Individual Comments

### Comment 1
<location path="bencher/bench_cfg.py" line_range="387-396" />
<code_context>
         return BenchRunCfg(**vars(parser.parse_args()))

+    @staticmethod
+    def level_to_samples(level: int, max_level: int = 12) -> int:
+        """Return the number of samples-per-variable for a given *level*.
+
+        Args:
+            level: Sampling level (1-12).
+            max_level: Cap applied before lookup. Defaults to 12.
+
+        Returns:
+            The sample count for this level.
+
+        Raises:
+            ValueError: If *level* is out of range.
+
+        Example::
+
+            >>> BenchRunCfg.level_to_samples(5)
+            9
+        """
+        if level < 1 or level >= len(LEVEL_SAMPLES):
+            raise ValueError(f"level must be between 1 and {len(LEVEL_SAMPLES) - 1}, got {level}")
+        return LEVEL_SAMPLES[min(max_level, level)]
</code_context>
<issue_to_address>
**issue:** Guard against `max_level` values below 1 to avoid returning 0 samples.

`level_to_samples` validates `level` but not `max_level`. With `max_level=0`, `min(max_level, level)` is 0, so the function returns `LEVEL_SAMPLES[0] == 0`, conflicting with the documented guarantee of a positive sample count. Please either clamp `max_level` into `[1, len(LEVEL_SAMPLES) - 1]` or raise if `max_level < 1` to avoid this silent misconfiguration.
</issue_to_address>

### Comment 2
<location path="test/test_usability.py" line_range="55-73" />
<code_context>
+        cfg = BenchRunCfg()
+        self.assertIsNone(cfg.samples_per_var)
+
+    def test_samples_per_var_overrides_level(self):
+        """When samples_per_var is set, the bench should use that count regardless of level."""
+        bench = BenchFloat().to_bench(bn.BenchRunCfg(headless=True, samples_per_var=7))
+        result = bench.plot_sweep()
+        # The sweep should have used 7 samples for theta
+        ds = result.ds
+        self.assertEqual(len(ds.coords["theta"]), 7)
</code_context>
<issue_to_address>
**suggestion (testing):** Also test interaction when both level and samples_per_var are set

This only covers the case where `samples_per_var` is set alone. To cover the documented precedence, please add a test where both `level` and `samples_per_var` are set, e.g.:

```python
cfg = BenchRunCfg(headless=True, level=5, samples_per_var=7)
bench = BenchFloat().to_bench(cfg)
result = bench.plot_sweep()
self.assertEqual(len(result.ds.coords["theta"]), 7)
```

This ensures `samples_per_var` overrides `level` when both are provided.

```suggestion
class TestSamplesPerVar(unittest.TestCase):
    def test_default_is_none(self):
        cfg = BenchRunCfg()
        self.assertIsNone(cfg.samples_per_var)

    def test_samples_per_var_overrides_level_when_only_samples_per_var_set(self):
        """When samples_per_var is set, the bench should use that count."""
        bench = BenchFloat().to_bench(bn.BenchRunCfg(headless=True, samples_per_var=7))
        result = bench.plot_sweep()
        # The sweep should have used 7 samples for theta
        ds = result.ds
        self.assertEqual(len(ds.coords["theta"]), 7)

    def test_samples_per_var_overrides_level_when_both_set(self):
        """When both level and samples_per_var are set, samples_per_var takes precedence."""
        cfg = BenchRunCfg(headless=True, level=5, samples_per_var=7)
        bench = BenchFloat().to_bench(cfg)
        result = bench.plot_sweep()
        # Even though level=5, we should still get 7 samples because samples_per_var overrides level.
        ds = result.ds
        self.assertEqual(len(ds.coords["theta"]), 7)

    def test_level_still_works(self):
        bench = BenchFloat().to_bench(bn.BenchRunCfg(headless=True, level=3))
        result = bench.plot_sweep()
        ds = result.ds
        # level 3 → 3 samples
        self.assertEqual(len(ds.coords["theta"]), 3)
```
</issue_to_address>

### Comment 3
<location path="test/test_sweep_executor.py" line_range="178-187" />
<code_context>
         self.assertEqual(result.name, "theta")
         # The parameter should have been processed with level adjustment

+    def test_convert_vars_to_params_bad_string_gives_helpful_error(self):
+        """Test that a typo in a string variable name gives a helpful KeyError."""
+        with self.assertRaises(KeyError) as ctx:
+            self.executor.convert_vars_to_params(
+                "thetaa",
+                "input",
+                None,
+                worker_class_instance=self.worker_instance,
+                worker_input_cfg=ExampleBenchCfg,
+            )
+        msg = str(ctx.exception)
+        self.assertIn("thetaa", msg)
+        self.assertIn("not found", msg)
+        self.assertIn("Available parameters", msg)
+        self.assertIn("theta", msg)  # "Did you mean" suggestion
+
+    def test_convert_vars_to_params_bad_dict_name_gives_helpful_error(self):
</code_context>
<issue_to_address>
**suggestion (testing):** Add a test for when there are no close matches to avoid brittle assumptions about suggestions

To better exercise `_lookup_param_by_name`, please add a case where there are no close matches (e.g. `"zzzzzz"`). That will verify the error still mentions the missing name and available parameters, and that the "Did you mean" line is omitted (or at least doesn’t surface unrelated suggestions), keeping behavior stable if the suggestion logic or `difflib` tuning changes.
</issue_to_address>


Resolve conflicts:
- bench_cfg.py: keep PR's restructured docstring with parameter groups
- sweep_executor.py: use main's _resolve_param (drop redundant _lookup_param_by_name)
- sweep_base.py: use PR's LEVEL_SAMPLES constant + main's list() pickle fix
- Update tests to match _resolve_param error format
@github-actions

github-actions bot commented Apr 5, 2026

Performance Report for b061ca7

Metric Value
Total tests 1224
Total time 109.65s
Mean 0.0896s
Median 0.0020s
Top 10 slowest tests
Test Time (s)
test.test_bench_examples.TestBenchExamples::test_example_meta 21.832
test.test_over_time_save_perf::test_save_faster_without_aggregated_tab 5.444
test.test_hash_persistent.TestCrossProcessDeterminism::test_hash_stable_across_two_processes[ResultBool] 3.966
test.test_over_time_repeats.TestMaxSliderPoints::test_default_subsampling_caps_at_max 3.065
test.test_generated_examples::test_generated_example[cartesian_animation/example_cartesian_animation.py] 3.047
test.test_generated_examples::test_generated_example[result_types/result_image/example_result_image_to_video.py] 2.894
test.test_bencher.TestBencher::test_combinations_over_time 1.484
test.test_time_event_curve.TestTimeEventCurvePlot::test_curve_with_string_time_src_and_cat 1.154
test.test_optuna_result.TestOptunaResult::test_collect_optuna_plots_with_repeats 1.083
test.test_optuna_result.TestOptunaReportRouting::test_optuna_plots_per_sweep_tab 1.082

Full report

Updated by Performance Tracking workflow

- Remove for_ci() and for_time_series() factory classmethods (trivial
  constructor wrappers that don't justify extra API surface)
- Restore full Attributes docstring on BenchRunCfg, merged with the new
  quick-start examples and level-to-samples table
- Add samples_per_var to Attributes list
- Bump holobench version to 1.79.0
- Update CHANGELOG with all changes since 1.75.2
@github-actions

github-actions bot commented Apr 5, 2026

Performance Report for daba469

Metric Value
Total tests 1220
Total time 107.30s
Mean 0.0880s
Median 0.0020s
Top 10 slowest tests
Test Time (s)
test.test_bench_examples.TestBenchExamples::test_example_meta 21.047
test.test_over_time_save_perf::test_save_faster_without_aggregated_tab 5.174
test.test_hash_persistent.TestCrossProcessDeterminism::test_hash_stable_across_two_processes[ResultBool] 3.930
test.test_generated_examples::test_generated_example[cartesian_animation/example_cartesian_animation.py] 3.048
test.test_over_time_repeats.TestMaxSliderPoints::test_default_subsampling_caps_at_max 3.043
test.test_generated_examples::test_generated_example[result_types/result_image/example_result_image_to_video.py] 2.872
test.test_bencher.TestBencher::test_combinations_over_time 1.498
test.test_optuna_result.TestOptunaReportRouting::test_optuna_plots_per_sweep_tab 1.122
test.test_time_event_curve.TestTimeEventCurvePlot::test_curve_with_string_time_src_and_cat 1.090
test.test_optuna_result.TestOptunaResult::test_collect_optuna_plots_with_repeats 1.035

Full report

Updated by Performance Tracking workflow

@github-actions

github-actions bot commented Apr 5, 2026

Performance Report for 648afa6

Metric Value
Total tests 1220
Total time 105.84s
Mean 0.0868s
Median 0.0020s
Top 10 slowest tests
Test Time (s)
test.test_bench_examples.TestBenchExamples::test_example_meta 21.060
test.test_over_time_save_perf::test_save_faster_without_aggregated_tab 5.239
test.test_hash_persistent.TestCrossProcessDeterminism::test_hash_stable_across_two_processes[ResultBool] 3.790
test.test_generated_examples::test_generated_example[cartesian_animation/example_cartesian_animation.py] 3.012
test.test_over_time_repeats.TestMaxSliderPoints::test_default_subsampling_caps_at_max 2.991
test.test_generated_examples::test_generated_example[result_types/result_image/example_result_image_to_video.py] 2.802
test.test_bencher.TestBencher::test_combinations_over_time 1.442
test.test_time_event_curve.TestTimeEventCurvePlot::test_curve_with_string_time_src_and_cat 1.078
test.test_optuna_result.TestOptunaReportRouting::test_optuna_plots_per_sweep_tab 1.049
test.test_optuna_result.TestOptunaResult::test_collect_optuna_plots_with_repeats 1.036

Full report

Updated by Performance Tracking workflow

blooop added 2 commits April 5, 2026 14:21
…ersion bump

The CHANGELOG was missing entries for releases 1.76.0, 1.77.0, and
1.78.0. Add per-release sections based on git tag ranges. Move
post-1.78.0 changes (not yet released) into [Unreleased]. Revert the
version bump to 1.79.0 since that will happen at release time.
@github-actions

github-actions bot commented Apr 5, 2026

Performance Report for d2fc588

Metric Value
Total tests 1220
Total time 106.05s
Mean 0.0869s
Median 0.0020s
Top 10 slowest tests
Test Time (s)
test.test_bench_examples.TestBenchExamples::test_example_meta 20.930
test.test_over_time_save_perf::test_save_faster_without_aggregated_tab 5.234
test.test_hash_persistent.TestCrossProcessDeterminism::test_hash_stable_across_two_processes[ResultBool] 3.835
test.test_generated_examples::test_generated_example[cartesian_animation/example_cartesian_animation.py] 3.038
test.test_over_time_repeats.TestMaxSliderPoints::test_default_subsampling_caps_at_max 3.018
test.test_generated_examples::test_generated_example[result_types/result_image/example_result_image_to_video.py] 2.777
test.test_bencher.TestBencher::test_combinations_over_time 1.421
test.test_time_event_curve.TestTimeEventCurvePlot::test_curve_with_string_time_src_and_cat 1.111
test.test_optuna_result.TestOptunaReportRouting::test_optuna_plots_per_sweep_tab 1.079
test.test_optuna_result.TestOptunaResult::test_collect_optuna_plots_with_repeats 1.045

Full report

Updated by Performance Tracking workflow

@github-actions

github-actions bot commented Apr 5, 2026

Performance Report for 1a9b2fd

Metric Value
Total tests 1220
Total time 105.12s
Mean 0.0862s
Median 0.0020s
Top 10 slowest tests
Test Time (s)
test.test_bench_examples.TestBenchExamples::test_example_meta 20.833
test.test_over_time_save_perf::test_save_faster_without_aggregated_tab 5.100
test.test_hash_persistent.TestCrossProcessDeterminism::test_hash_stable_across_two_processes[ResultBool] 3.823
test.test_generated_examples::test_generated_example[cartesian_animation/example_cartesian_animation.py] 3.028
test.test_over_time_repeats.TestMaxSliderPoints::test_default_subsampling_caps_at_max 2.984
test.test_generated_examples::test_generated_example[result_types/result_image/example_result_image_to_video.py] 2.768
test.test_bencher.TestBencher::test_combinations_over_time 1.423
test.test_time_event_curve.TestTimeEventCurvePlot::test_curve_with_string_time_src_and_cat 1.084
test.test_optuna_result.TestOptunaReportRouting::test_optuna_plots_per_sweep_tab 1.069
test.test_optuna_result.TestOptunaResult::test_collect_optuna_plots_with_repeats 1.030

Full report

Updated by Performance Tracking workflow

@blooop blooop merged commit 006dad0 into main Apr 5, 2026
7 checks passed
@blooop blooop deleted the feature/usability-improvements branch April 5, 2026 15:26