feat(sdk): add BenchmarkRun and AsyncBenchmarkRun classes by sid-rl · Pull Request #712 · runloopai/api-client-python

sid-rl · 2025-12-17T01:38:26Z

Add new SDK classes for managing benchmark runs:

- BenchmarkRun: Synchronous class for benchmark run operations
  - get_info(): Retrieve run status and metadata
  - cancel(): Cancel the benchmark run
  - complete(): Mark the run as completed
  - list_scenario_runs(): List scenario runs with filtering
- AsyncBenchmarkRun: Async version with the same interface
- SDKBenchmarkRunListScenarioRunsParams: TypedDict for list params
- Unit tests for both sync and async classes
- E2E smoketests validating against the real API
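
For readers unfamiliar with the object-style wrapper described above, here is a minimal sketch of the shape: an object that holds a run ID and delegates each call to a resource client. This is not the PR's code. `FakeRunsResource`, `RunView`, and the `retrieve`/`cancel`/`complete` method names on the resource are hypothetical stand-ins, and the state strings are placeholders.

```python
from dataclasses import dataclass


@dataclass
class RunView:
    """Hypothetical stand-in for the API's benchmark run view."""

    id: str
    state: str


class FakeRunsResource:
    """Hypothetical in-memory stand-in for the generated runs resource."""

    def __init__(self) -> None:
        self._states = {"run_123": "running"}

    def retrieve(self, run_id: str) -> RunView:
        return RunView(id=run_id, state=self._states[run_id])

    def cancel(self, run_id: str) -> RunView:
        self._states[run_id] = "canceled"
        return self.retrieve(run_id)

    def complete(self, run_id: str) -> RunView:
        self._states[run_id] = "completed"
        return self.retrieve(run_id)


class BenchmarkRunSketch:
    """Object-style wrapper: stores the run ID and delegates to the resource."""

    def __init__(self, runs: FakeRunsResource, run_id: str) -> None:
        self._runs = runs
        self._id = run_id

    def get_info(self) -> RunView:
        return self._runs.retrieve(self._id)

    def cancel(self) -> RunView:
        return self._runs.cancel(self._id)

    def complete(self) -> RunView:
        return self._runs.complete(self._id)
```

The point of the pattern is that callers pass the ID once, at construction, rather than on every call.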

james-rl

There are a few areas of potential inconsistency, or where the naming looks a bit weird, but this looks generally OK.

james-rl · 2025-12-17T17:58:26Z

tests/smoketests/sdk/test_benchmark_run.py

+            # Get info
+            info = benchmark_run.get_info()
+            assert info.id == run_data.id
+            assert info.state in ["queued", "running", "completed", "canceled"]


queued doesn't exist as a state?

tests/smoketests/sdk/test_async_benchmark_run.py

james-rl · 2025-12-17T18:09:59Z

src/runloop_api_client/sdk/benchmark_run.py

+    def list_scenario_runs(
+        self,
+        **params: Unpack[SDKBenchmarkRunListScenarioRunsParams],
+    ) -> List[ScenarioRunView]:
+        """List all scenario runs for this benchmark run.
+
+        :param params: See :typeddict:`~runloop_api_client.sdk._types.SDKBenchmarkRunListScenarioRunsParams` for available parameters
+        :return: List of scenario run views
+        :rtype: List[ScenarioRunView]
+        """
+        page = self._client.benchmarks.runs.list_scenario_runs(
+            self._id,
+            **params,
+        )
+        return list(page)


Consider improving the interaction pattern to provide something like a paged iterator (or adding a separate iterator method to do this instead).
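
One way to read the suggestion is a generator that fetches pages on demand instead of building the full list up front. The sketch below assumes a cursor-style page shape of `(items, next_cursor)`; `fetch_page`, `fake_fetch_page`, and the ID strings are hypothetical and do not reflect the SDK's actual pagination types.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# Hypothetical page shape: the items on this page plus a cursor for the
# next page, or None when there are no more pages.
Page = Tuple[List[str], Optional[str]]


def iter_scenario_runs(
    fetch_page: Callable[[Optional[str]], Page],
) -> Iterator[str]:
    """Yield items one at a time, requesting the next page only when needed."""
    cursor: Optional[str] = None
    while True:
        items, cursor = fetch_page(cursor)
        yield from items
        if cursor is None:
            return


def fake_fetch_page(cursor: Optional[str]) -> Page:
    """In-memory stand-in for a paginated API call (two pages)."""
    pages = {
        None: (["scr_1", "scr_2"], "page_2"),
        "page_2": (["scr_3"], None),
    }
    return pages[cursor]
```

A caller that only needs the first few results can stop iterating early and never request the later pages, while `list(iter_scenario_runs(fake_fetch_page))` still gives the eager behaviour when that is wanted.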

james-rl · 2025-12-17T18:12:14Z

src/runloop_api_client/sdk/async_benchmark_run.py

+    async def get_info(
+        self,
+        **options: Unpack[BaseRequestOptions],
+    ) -> BenchmarkRunView:


Can we call this something else instead? get_state or refresh?


james-rl

It's a shame that get_info is now a widely supported convention, but I agree that it's worse to make a change in only one place.

A minor nit for you in the comments, but this generally looks good

james-rl · 2025-12-17T20:32:26Z

src/runloop_api_client/sdk/async_benchmark_run.py

+        self._id = run_id
+        self._benchmark_id = benchmark_id
+
+    @override


I don't think you want @override without a base class

james-rl · 2025-12-17T20:32:42Z

src/runloop_api_client/sdk/benchmark_run.py

+    @override
+    def __repr__(self) -> str:
+        return f"<BenchmarkRun id={self._id!r}>"


same thing here
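
A minimal sketch of what the two `@override` comments seem to be asking for, assuming the class declares no explicit base class: define `__repr__` plainly, without the decorator. The constructor here is simplified for illustration and is not the PR's actual signature.

```python
class BenchmarkRun:
    """Simplified stand-in; the real class also holds a client reference."""

    def __init__(self, run_id: str, benchmark_id: str) -> None:
        self._id = run_id
        self._benchmark_id = benchmark_id

    # No @override: the class has no explicit base class.
    def __repr__(self) -> str:
        return f"<BenchmarkRun id={self._id!r}>"
```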

* update requirements-dev
* pyproject formatting nit
* feat(sdk): add BenchmarkRun and AsyncBenchmarkRun classes
* fixed smoketests
* `list_scenario_runs()` now returns a list of ScenarioRun/AsyncScenarioRun objects

* fix(types): allow pyright to infer TypedDict types within SequenceNotStr
* chore: add missing docstrings
* feat(devbox): added stdin streaming endpoint
* chore(internal): add missing files argument to base client
* feat(benchmarks): add `update_scenarios` method to benchmarks resource
* fix(benchmarks): `update()` for benchmarks and scenarios replaces all provided fields and does not modify unspecified fields (#6702)
* feat(sdk): add BenchmarkRun and AsyncBenchmarkRun classes (#712)
* update requirements-dev
* pyproject formatting nit
* feat(sdk): add BenchmarkRun and AsyncBenchmarkRun classes
* fixed smoketests
* `list_scenario_runs()` now returns a list of ScenarioRun/AsyncScenarioRun objects
* cleanup(agents): unified version parameter across agent sources (#713)
* cleanup(agents): unified version parameter across agent sources
* increase snapshot test timeout
* reinsert version parameter into example code
* fix: use async_to_httpx_files in patch method
* codegen metadata
* feat(sdk): add Benchmark and AsyncBenchmark classes (#714)
* feat(sdk): add Benchmark and AsyncBenchmark classes (with some import and test id cleanup)
* raise exceptions instead of skipping, more defensively run scenario
* rename benchmark `run()` to `start_run()`
* more helpful example docstrings
* comments about params type splitting for developer clarity
* remove low value unit tests
* add smoketest TODOs
* skip list_runs() smoketest when no available benchmark runs
* create/update custom benchmark and scenarios for smoketest, remove benchmark retrieval smoketest
* feat(sdk): add BenchmarkOps and AsyncBenchmarkOps to SDK (#716)
* chore(internal): add `--fix` argument to lint script
* chore(internal): codegen related update
* feat(client): add support for binary request streaming
* feat(devbox): remove this one
* feat(network-policy): add network policies to api
* chore(internal): update `actions/checkout` version
* feat(blueprint): Set cilium network policy on blueprint build (#7006)
* chore(devbox): Remove network policy from devbox view; use launch params instead (#7025)
* refactor(benchmark): Deprecate /benchmark/{id}/runs in favor of /benchmark_runs (#7019)
* release: 1.3.0-alpha
* cp dines

---------

Co-authored-by: stainless-app[bot] <142633134+stainless-app[bot]@users.noreply.github.com>
Co-authored-by: sid-rl <siddarth@runloop.ai>
Co-authored-by: Alexander Dines <alex@runloop.ai>

sid-rl requested a review from james-rl December 17, 2025 01:38

james-rl requested changes Dec 17, 2025


sid-rl added 3 commits December 17, 2025 10:17

update requirements-dev

db486de

pyproject formatting nit

3a88bc3

feat(sdk): add BenchmarkRun and AsyncBenchmarkRun classes

4650641

sid-rl force-pushed the siddarth/benchmark-sdk branch from 04b9a9c to 4650641 December 17, 2025 18:17

sid-rl added 2 commits December 17, 2025 11:06

fixed smoketests

3dbc3ab

list_scenario_runs() now returns a list of ScenarioRun/AsyncScenarioRun objects

24b5387

sid-rl requested a review from james-rl December 17, 2025 20:22

james-rl approved these changes Dec 17, 2025


sid-rl merged commit 1021c5a into next Dec 17, 2025
6 checks passed

sid-rl deleted the siddarth/benchmark-sdk branch December 17, 2025 20:37

stainless-app bot mentioned this pull request Dec 17, 2025

release: 1.3.0-alpha #708

Merged

sid-rl merged 5 commits into next from siddarth/benchmark-sdk
