feat(sdk): add BenchmarkRun and AsyncBenchmarkRun classes #712
james-rl left a comment
There are a few areas of potential inconsistency, or where the naming looks a bit weird, but this looks generally OK.
    # Get info
    info = benchmark_run.get_info()
    assert info.id == run_data.id
    assert info.state in ["queued", "running", "completed", "canceled"]
queued doesn't exist as a state?
    def list_scenario_runs(
        self,
        **params: Unpack[SDKBenchmarkRunListScenarioRunsParams],
    ) -> List[ScenarioRunView]:
        """List all scenario runs for this benchmark run.

        :param params: See :typeddict:`~runloop_api_client.sdk._types.SDKBenchmarkRunListScenarioRunsParams` for available parameters
        :return: List of scenario run views
        :rtype: List[ScenarioRunView]
        """
        page = self._client.benchmarks.runs.list_scenario_runs(
            self._id,
            **params,
        )
        return list(page)
Consider improving the interaction pattern to provide something like a paged iterator (or adding a separate iterator method to do this instead).
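As a rough illustration of the paged-iterator idea in this comment, here is a minimal sketch of a generator that fetches pages lazily instead of materializing everything with `list(page)`. The names `Page`, `iter_pages`, `fake_fetch`, and the cursor scheme are all hypothetical stand-ins, not part of the SDK; the real client's pagination objects may already offer similar behavior.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Optional


@dataclass
class Page:
    """Hypothetical page shape: a batch of items plus a cursor for the next page."""

    items: List[str]
    next_cursor: Optional[str]  # None means there are no further pages


def iter_pages(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[str]:
    """Yield items one at a time, fetching the next page only when it is needed."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page.items
        if page.next_cursor is None:
            return
        cursor = page.next_cursor


# Stand-in for the API (an assumption for this sketch): three IDs, two per page.
_FAKE_RUNS = ["scr_1", "scr_2", "scr_3"]
fetch_calls: List[Optional[str]] = []  # records each cursor requested


def fake_fetch(cursor: Optional[str], page_size: int = 2) -> Page:
    fetch_calls.append(cursor)
    start = 0 if cursor is None else int(cursor)
    end = start + page_size
    next_cursor = str(end) if end < len(_FAKE_RUNS) else None
    return Page(items=_FAKE_RUNS[start:end], next_cursor=next_cursor)
```

With this shape, a caller who stops after the first item triggers only one fetch, while `list(iter_pages(fake_fetch))` still gives the eager behavior when that is what is wanted.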
    async def get_info(
        self,
        **options: Unpack[BaseRequestOptions],
    ) -> BenchmarkRunView:
Can we call this something else instead? get_state or refresh?
04b9a9c to 4650641
james-rl left a comment
It's a shame that get_info is now a widely supported convention, but I agree that it's worse to make a change in only one place.
A minor nit for you in the comments, but this generally looks good.
        self._id = run_id
        self._benchmark_id = benchmark_id

    @override
I don't think you want @override without a base class
    @override
    def __repr__(self) -> str:
        return f"<BenchmarkRun id={self._id!r}>"
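One way to address the `@override` nit above is to drop the decorator, since it is a marker for type checkers and does not change what `__repr__` returns. The sketch below uses a hypothetical stand-in class, `BenchmarkRunSketch`, that reproduces only the constructor fields and `__repr__` shown in the diff; it is not the SDK's actual class, and the ID values are made up.

```python
class BenchmarkRunSketch:
    """Hypothetical stand-in: only the fields and __repr__ from the diff are reproduced."""

    def __init__(self, run_id: str, benchmark_id: str) -> None:
        self._id = run_id
        self._benchmark_id = benchmark_id

    # No @override here: the class declares no explicit base class.
    def __repr__(self) -> str:
        return f"<BenchmarkRun id={self._id!r}>"


run = BenchmarkRunSketch("run_123", "bmd_456")  # made-up IDs for illustration
print(repr(run))  # prints <BenchmarkRun id='run_123'>
```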
* update requirements-dev
* pyproject formatting nit
* feat(sdk): add BenchmarkRun and AsyncBenchmarkRun classes
* fixed smoketests
* `list_scenario_runs()` now returns a list of ScenarioRun/AsyncScenarioRun objects
* fix(types): allow pyright to infer TypedDict types within SequenceNotStr
* chore: add missing docstrings
* feat(devbox): added stdin streaming endpoint
* chore(internal): add missing files argument to base client
* feat(benchmarks): add `update_scenarios` method to benchmarks resource
* fix(benchmarks): `update()` for benchmarks and scenarios replaces all provided fields and does not modify unspecified fields (#6702)
* feat(sdk): add BenchmarkRun and AsyncBenchmarkRun classes (#712)
* update requirements-dev
* pyproject formatting nit
* feat(sdk): add BenchmarkRun and AsyncBenchmarkRun classes
* fixed smoketests
* `list_scenario_runs()` now returns a list of ScenarioRun/AsyncScenarioRun objects
* cleanup(agents): unified version parameter across agent sources (#713)
* cleanup(agents): unified version parameter across agent sources
* increase snapshot test timeout
* reinsert version parameter into example code
* fix: use async_to_httpx_files in patch method
* codegen metadata
* feat(sdk): add Benchmark and AsyncBenchmark classes (#714)
* feat(sdk): add Benchmark and AsyncBenchmark classes (with some import and test id cleanup)
* raise exceptions instead of skipping, more defensively run scenario
* rename benchmark `run()` to `start_run()`
* more helpful example docstrings
* comments about params type splitting for developer clarity
* remove low value unit tests
* add smoketest TODOs
* skip list_runs() smoketest when no available benchmark runs
* create/update custom benchmark and scenarios for smoketest, remove benchmark retrieval smoketest
* feat(sdk): add BenchmarkOps and AsyncBenchmarkOps to SDK (#716)
* chore(internal): add `--fix` argument to lint script
* chore(internal): codegen related update
* feat(client): add support for binary request streaming
* feat(devbox): remove this one
* feat(network-policy): add network policies to api
* chore(internal): update `actions/checkout` version
* feat(blueprint): Set cilium network policy on blueprint build (#7006)
* chore(devbox): Remove network policy from devbox view; use launch params instead (#7025)
* refactor(benchmark): Deprecate /benchmark/{id}/runs in favor of /benchmark_runs (#7019)
* release: 1.3.0-alpha
* cp dines

---------

Co-authored-by: stainless-app[bot] <142633134+stainless-app[bot]@users.noreply.github.com>
Co-authored-by: sid-rl <siddarth@runloop.ai>
Co-authored-by: Alexander Dines <alex@runloop.ai>
Add new SDK classes for managing benchmark runs:
* BenchmarkRun: Synchronous class for benchmark run operations
* AsyncBenchmarkRun: Async version with the same interface
* SDKBenchmarkRunListScenarioRunsParams: TypedDict for list params
* Unit tests for both sync and async classes
* E2E smoketests validating against the real API