feat(sdk): ScenarioRun class #681

Merged
sid-rl merged 10 commits into main from siddarth/sdk-catchup
Jan 20, 2026

@sid-rl (Contributor) commented Jan 19, 2026

User description

⚠️ PR Title Must Follow Conventional Commits

Format: feat[optional scope]: <description>

Examples: feat: add new SDK method · feat(storage): support file uploads · feat!: breaking API change

Description

Motivation

Changes

Testing

  • Unit tests added
  • Integration tests added
  • Smoke Tests added/updated
  • Tested locally

Breaking Changes

Checklist

  • PR title follows Conventional Commits format (feat: or feat(scope):)
  • Documentation updated (if needed)
  • Breaking changes documented (if applicable)

CodeAnt-AI Description

Add ScenarioRun SDK class for managing scenario runs

What Changed

  • Introduces a ScenarioRun object in the SDK that represents a single scenario execution and exposes user-facing operations: view run info, wait for environment readiness, access the associated devbox, submit for scoring, wait for scoring, score+await, score+complete, complete, cancel, and retrieve scoring results.
  • Adds a downloadLogs action that saves the run's log archive to a file path provided by the caller.
  • Provides lazy access to the associated devbox so users can run commands against the devbox from the ScenarioRun instance.
  • Adds comprehensive unit and smoke tests that exercise lifecycle flows: environment readiness, separate and combined scoring flows, completion, cancellation, score retrieval, and log download.
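
The lifecycle described in these bullets can be pictured as a small state machine with a lazily built devbox handle. The sketch below is an in-memory model only, not the SDK's implementation: the class names `ScenarioRunModel` and `DevboxHandle`, the method names, the state names, and the rule that a run must be scored before it is completed are all assumptions made for illustration.

```typescript
// Hypothetical model of the run lifecycle; names, states, and allowed
// transitions are illustrative assumptions, not the SDK's actual API.
type RunState = "running" | "scoring" | "scored" | "completed" | "canceled";

class DevboxHandle {
  constructor(readonly id: string) {}
}

class ScenarioRunModel {
  private state: RunState = "running";
  private cachedDevbox?: DevboxHandle;

  constructor(readonly id: string, private readonly devboxId: string) {}

  // Lazy access: the handle is built on first use and reused afterwards.
  get devbox(): DevboxHandle {
    if (this.cachedDevbox === undefined) {
      this.cachedDevbox = new DevboxHandle(this.devboxId);
    }
    return this.cachedDevbox;
  }

  getState(): RunState {
    return this.state;
  }

  // Separate flow, step 1: submit the run for scoring.
  score(): void {
    this.transition("running", "scoring");
  }

  // Separate flow, step 2: a real client would poll; the model just advances.
  awaitScored(): void {
    this.transition("scoring", "scored");
  }

  // Combined flow: submit and wait in one call.
  scoreAndAwait(): void {
    this.score();
    this.awaitScored();
  }

  complete(): void {
    this.transition("scored", "completed");
  }

  // Combined flow: score, wait, then complete.
  scoreAndComplete(): void {
    this.scoreAndAwait();
    this.complete();
  }

  cancel(): void {
    if (this.state === "completed" || this.state === "canceled") {
      throw new Error(`cannot cancel a run that is already ${this.state}`);
    }
    this.state = "canceled";
  }

  // Guard shared by all forward transitions.
  private transition(from: RunState, to: RunState): void {
    if (this.state !== from) {
      throw new Error(`expected state ${from}, got ${this.state}`);
    }
    this.state = to;
  }
}
```

In this model the separate flow is `score()` followed by `awaitScored()` and `complete()`, while `scoreAndComplete()` collapses the same three steps into one call.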

Impact

✅ Easier devbox interaction for scenario runs
✅ Clearer scoring and completion workflows
✅ Easier retrieval of run logs to local files


@codeant-ai bot (Contributor) commented Jan 19, 2026

CodeAnt AI is reviewing your PR.


@codeant-ai bot added the size:XXL label (This PR changes 1000+ lines, ignoring generated files) Jan 19, 2026
@codeant-ai bot (Contributor) commented Jan 19, 2026

CodeAnt AI finished reviewing your PR.

@github-actions

⚠️ Object Smoke Tests & Coverage Report

Test Results

✅ All smoke tests passed

Coverage Results

| Metric | Coverage | Required | Status |
|---|---|---|---|
| Functions | 98.18% | 100% | |
| Lines | 85.41% | - | ℹ️ |
| Branches | 46.29% | - | ℹ️ |
| Statements | 83.97% | - | ℹ️ |

Coverage Requirement: 100% function coverage (all public methods must be called in smoke tests)

⚠️ Some object methods are not covered in smoke tests. Please add tests that call all public methods.

View detailed coverage report

Coverage reports are available in the workflow artifacts. Lines/branches/statements coverage is tracked but not required to be 100%.

📋 View workflow run

@james-rl (Contributor) left a comment

Some questions on the use of async and await here. You could return promises to some of the objects, but in some of these cases it seems more natural to return the awaited object instead.
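
One way to read this comment is as a choice between two accessor shapes. The sketch below contrasts them under stated assumptions: `DevboxRef`, `PromiseReturningRun`, and `ValueReturningRun` are hypothetical names, and neither class is the ScenarioRun code under review.

```typescript
// Hypothetical types for illustration only.
interface DevboxRef {
  id: string;
}

// Shape A: the accessor returns a promise, so every caller must await it
// even though building the handle needs no I/O.
class PromiseReturningRun {
  constructor(private readonly devboxId: string) {}

  getDevbox(): Promise<DevboxRef> {
    return Promise.resolve({ id: this.devboxId });
  }
}

// Shape B: the handle is returned directly; only calls that really hit
// the network would stay async.
class ValueReturningRun {
  constructor(private readonly devboxId: string) {}

  get devbox(): DevboxRef {
    return { id: this.devboxId };
  }
}

// Call sites side by side: A needs an await, B does not.
async function demo(): Promise<string[]> {
  const a = await new PromiseReturningRun("dbx_1").getDevbox();
  const b = new ValueReturningRun("dbx_1").devbox;
  return [a.id, b.id];
}
```

Shape B keeps call sites simpler when the value is already known locally; Shape A is the better fit when producing the object requires a request.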

@codeant-ai bot (Contributor) commented Jan 20, 2026

CodeAnt AI is running Incremental review


@codeant-ai bot added and removed the size:XXL label (This PR changes 1000+ lines, ignoring generated files) Jan 20, 2026
@codeant-ai bot (Contributor) commented Jan 20, 2026

CodeAnt AI Incremental review completed.

@sid-rl force-pushed the siddarth/sdk-catchup branch from 1c2d54d to 5e936de, January 20, 2026 19:08
@github-actions

⚠️ Object Smoke Tests & Coverage Report

Test Results

✅ All smoke tests passed

Coverage Results

| Metric | Coverage | Required | Status |
|---|---|---|---|
| Functions | 98.86% | 100% | |
| Lines | 87.09% | - | ℹ️ |
| Branches | 65.11% | - | ℹ️ |
| Statements | 85.63% | - | ℹ️ |

Coverage Requirement: 100% function coverage (all public methods must be called in smoke tests)

⚠️ Some object methods are not covered in smoke tests. Please add tests that call all public methods.

View detailed coverage report

Coverage reports are available in the workflow artifacts. Lines/branches/statements coverage is tracked but not required to be 100%.

📋 View workflow run

@github-actions

⚠️ Object Smoke Tests & Coverage Report

Test Results

✅ All smoke tests passed

Coverage Results

| Metric | Coverage | Required | Status |
|---|---|---|---|
| Functions | 99.43% | 100% | |
| Lines | 87.45% | - | ℹ️ |
| Branches | 65.11% | - | ℹ️ |
| Statements | 85.98% | - | ℹ️ |

Coverage Requirement: 100% function coverage (all public methods must be called in smoke tests)

⚠️ Some object methods are not covered in smoke tests. Please add tests that call all public methods.

View detailed coverage report

Coverage reports are available in the workflow artifacts. Lines/branches/statements coverage is tracked but not required to be 100%.

📋 View workflow run

@sid-rl requested review from dines-rl and james-rl, January 20, 2026 20:00
@github-actions

⚠️ Object Smoke Tests & Coverage Report

Test Results

✅ All smoke tests passed

Coverage Results

| Metric | Coverage | Required | Status |
|---|---|---|---|
| Functions | 99.43% | 100% | |
| Lines | 86.84% | - | ℹ️ |
| Branches | 65.11% | - | ℹ️ |
| Statements | 85.41% | - | ℹ️ |

Coverage Requirement: 100% function coverage (all public methods must be called in smoke tests)

⚠️ Some object methods are not covered in smoke tests. Please add tests that call all public methods.

View detailed coverage report

Coverage reports are available in the workflow artifacts. Lines/branches/statements coverage is tracked but not required to be 100%.

📋 View workflow run

@james-rl (Contributor) left a comment

question for you about ScenarioRun.fromId but other than that this looks good

Comment thread src/sdk/scenario-run.ts Outdated
Comment on lines +57 to +70
  /**
   * Create a ScenarioRun instance from an ID.
   *
   * See the {@link ScenarioOps.fromId} method for calling this
   * @private
   *
   * @param {Runloop} client - The Runloop client instance
   * @param {string} id - The scenario run ID
   * @param {string} devboxId - The associated devbox ID
   * @returns {ScenarioRun} A {@link ScenarioRun} instance
   */
  static fromId(client: Runloop, id: string, devboxId: string): ScenarioRun {
    return new ScenarioRun(client, id, devboxId);
  }
Contributor

I think this is confusing for a few reasons:

  • it's called fromId but you're passing in more than just an ID
  • more significantly, there's no link between the scenario run and the devbox that's passed in, but the scenario run is for a specific devbox ID. If the devbox ID isn't actually the same one the scenario run is using under the hood, the behavior is going to be unpredictable and really hard to understand

Contributor Author

yeah this method shouldn't exist, and doesn't on the python side. i must have confused this for the scenario.fromId that will be later implemented
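
The mismatch risk raised in this thread can be illustrated with a factory that takes both IDs from a single record, so a caller cannot pair a run with the wrong devbox. This is a sketch under stated assumptions, not the change that was merged: `RunRecord`, `RunHandle`, `fromRecord`, `lookupRun`, and the in-memory `records` table are all hypothetical.

```typescript
// Hypothetical shapes for illustration; not the SDK's real types.
interface RunRecord {
  id: string;
  devboxId: string;
}

class RunHandle {
  private constructor(readonly id: string, readonly devboxId: string) {}

  // Both IDs come from one record, so they cannot drift apart.
  static fromRecord(record: RunRecord): RunHandle {
    return new RunHandle(record.id, record.devboxId);
  }
}

// Stand-in for a server lookup: maps a run ID to its stored record.
const records: { [id: string]: RunRecord | undefined } = {
  run_1: { id: "run_1", devboxId: "dbx_1" },
};

// Resolve a handle from the run ID alone; the devbox ID is looked up,
// never supplied by the caller.
function lookupRun(id: string): RunHandle {
  const record = records[id];
  if (record === undefined) {
    throw new Error(`unknown scenario run: ${id}`);
  }
  return RunHandle.fromRecord(record);
}
```

With this shape, a true "from ID" entry point takes only the run ID, which also addresses the naming objection in the first bullet.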

expect(result).toBeDefined();
expect(['canceled', 'completed', 'failed']).toContain(result.state);
},
THIRTY_SECOND_TIMEOUT,
Contributor

I realize that you didn't add this, but I would discourage this kind of name for constants. It's better to give constants a name that reflects semantic meaning, like DEFAULT_TIMEOUT, so that if 30 seconds is no longer a sensible default, the value is easy to change without the constant contradicting its name (e.g. THIRTY_SECOND_TIMEOUT = 45s).

Contributor Author

makes sense!
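
The naming suggestion in this thread might look like the following. It is a minimal sketch: the names `SECOND_MS`, `DEFAULT_TIMEOUT_MS`, and `describeTimeout` are hypothetical, and the assumption that the existing constant is expressed in milliseconds is not confirmed by the thread.

```typescript
// Hypothetical constant names; the value matches the 30 seconds discussed above.
const SECOND_MS = 1000;

// Named for its role, not its value: changing this to 45 * SECOND_MS
// would not make the name misleading.
const DEFAULT_TIMEOUT_MS = 30 * SECOND_MS;

// A test would pass the constant where THIRTY_SECOND_TIMEOUT is used today, e.g.
//   test("cancels a run", async () => { /* ... */ }, DEFAULT_TIMEOUT_MS);

// Small helper for log or error messages that mention the timeout.
function describeTimeout(ms: number): string {
  return `${ms / SECOND_MS}s`;
}
```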

@github-actions

⚠️ Object Smoke Tests & Coverage Report

Test Results

✅ All smoke tests passed

Coverage Results

| Metric | Coverage | Required | Status |
|---|---|---|---|
| Functions | 99.42% | 100% | |
| Lines | 86.82% | - | ℹ️ |
| Branches | 65.11% | - | ℹ️ |
| Statements | 85.39% | - | ℹ️ |

Coverage Requirement: 100% function coverage (all public methods must be called in smoke tests)

⚠️ Some object methods are not covered in smoke tests. Please add tests that call all public methods.

View detailed coverage report

Coverage reports are available in the workflow artifacts. Lines/branches/statements coverage is tracked but not required to be 100%.

📋 View workflow run

@james-rl (Contributor) left a comment

looks good to me!

@sid-rl merged commit 52a9275 into main Jan 20, 2026
9 checks passed
@sid-rl deleted the siddarth/sdk-catchup branch January 20, 2026 23:37
