chore(tests): remove flaky 10-turn travel-planning example test #423

Merged
Aryansharma28 merged 1 commit into main from chore/remove-flaky-multiturn-test
May 4, 2026

Conversation

@Aryansharma28
Contributor

Summary

  • Removes `javascript/examples/vitest/tests/multiturn-10-scripted.test.ts`.
  • The test runs ~21 sequential live LLM calls (10 user-sim + 10 assistant + 1 judge) under a 180s timeout.
  • It has been timing out on CI for unrelated PRs (e.g. dependency-pin PRs like #416, "fix(security): patch path-to-regexp DoS via sequential optional groups", which don't touch the execution path).
  • It's an example-suite test, not a correctness gate for the SDK.
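
For a sense of how tight that budget is, here is a rough sketch of the average latency headroom per call. It assumes the calls are strictly sequential and equally weighted, which is an illustrative simplification rather than a measurement; `perCallBudgetMs` is a hypothetical helper, not part of the SDK.

```typescript
// Hypothetical helper (not from the repository): average time available per
// call when `callCount` sequential calls must finish within `totalTimeoutMs`.
function perCallBudgetMs(totalTimeoutMs: number, callCount: number): number {
  if (callCount <= 0) {
    throw new Error("callCount must be positive");
  }
  return totalTimeoutMs / callCount;
}

// Figures from the summary above: 10 user-sim + 10 assistant + 1 judge calls.
const calls = 10 + 10 + 1;

// 180s test timeout, expressed in milliseconds.
const budget = perCallBudgetMs(180000, calls);

console.log(`${calls} calls, ~${Math.round(budget)} ms each on average`);
```

Under these assumptions each live call has under nine seconds on average, so a single slow response can consume the slack for several turns.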

Test plan

  • CI green without this test.

The test makes ~21 sequential live LLM calls under a 180s timeout and
intermittently breaks CI on unrelated PRs (e.g. dependency-pin PRs that
don't touch any execution path it exercises). It's an example-suite
test, not a correctness gate for the SDK.
@github-actions github-actions Bot added the `low-risk-change` label (PR qualifies as low-risk per policy and can be merged without manual review) May 4, 2026
@github-actions
Contributor

github-actions Bot commented May 4, 2026

Automated low-risk assessment

This PR was evaluated against the repository's Low-Risk Pull Requests procedure.

  • Scope: Deleted file javascript/examples/vitest/tests/multiturn-10-scripted.test.ts — removes a 10-turn travel-planning example test that performed ~21 sequential live LLM calls under a 180s timeout.
  • Exclusions confirmed: no changes to auth, security settings, database schema, business-critical logic, or external integrations.
  • Classification: low-risk-change under the documented policy.

This PR only deletes an example-suite test file that ran many live LLM calls; it does not modify authentication/authorization, secrets, database schemas, business logic, or production integrations. Removing a flaky test is a test-configuration/example change that is low risk and easy to revert if needed. CI behavior may change (fewer timeouts), but no runtime or security surface is affected by this diff.

This classification allows merging without manual review once all required CI checks are passing and branch protection rules are satisfied.

@Aryansharma28 Aryansharma28 merged commit bbe86de into main May 4, 2026
7 checks passed
@Aryansharma28 Aryansharma28 deleted the chore/remove-flaky-multiturn-test branch May 4, 2026 13:29
Aryansharma28 added a commit that referenced this pull request May 4, 2026
chore(tests): remove flaky live-LLM travel-agent example test

The test runs 9 scripted live LLM turns with sequential weather +
accommodation tool calls under a 180s timeout. It has been timing out
or failing tool-call assertions on unrelated PRs (e.g. #394's
dependency overrides change).

Coverage is preserved: multi-tool agent flow is already demonstrated
deterministically by `database-tool-mocking.test.ts`, and the
live-LLM single-tool pattern is in `weather-agent.test.ts`.

Follow-up to #423 and #424.

Labels

low-risk-change PR qualifies as low-risk per policy and can be merged without manual review
