chore(tests): remove no-op example tests + audit notes #424

Merged
Aryansharma28 merged 1 commit into main from chore/remove-noop-example-tests on May 4, 2026

Aryansharma28 commented May 4, 2026

Summary

Removes two example tests that produce no signal but consume CI time on every PR:

  • `javascript/examples/vitest/tests/vegetarian-recipe-realtime.test.ts` — entire file is a single `it.skip`. Nothing runs.
  • `multilingual-agent.test.ts` > `handles adversarial users and conversational chaos` — wrapped in `try { expect(...).toBe(true) } catch { expect(true).toBe(true) }` with a `// TODO: this test is flaky, let it pass for now` comment. Always passes; up to 360s timeout per run; ~12 turns of live LLM calls.
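
To illustrate why the second test could never fail, here is a minimal sketch of the swallowed-assertion pattern. It does not use Vitest: `assertTrue`, `runTest`, `swallowedTest`, and `honestTest` are hypothetical stand-ins, assuming only that a test fails when its body throws.

```typescript
type Outcome = "pass" | "fail";

// Hypothetical stand-in for `expect(value).toBe(true)`: throws on failure.
function assertTrue(value: boolean): void {
  if (!value) throw new Error("expected true, got false");
}

// Hypothetical minimal runner: a test passes unless its body throws.
function runTest(body: () => void): Outcome {
  try {
    body();
    return "pass";
  } catch {
    return "fail";
  }
}

// The removed pattern: the catch branch replaces the real failure
// with a trivially true assertion, so nothing ever propagates.
function swallowedTest(scenarioSucceeded: boolean): () => void {
  return () => {
    try {
      assertTrue(scenarioSucceeded);
    } catch {
      assertTrue(true);
    }
  };
}

// The same check without the wrapper: a failed scenario fails the test.
function honestTest(scenarioSucceeded: boolean): () => void {
  return () => assertTrue(scenarioSucceeded);
}

const swallowedOutcome = runTest(swallowedTest(false));
const honestOutcome = runTest(honestTest(false));
console.log(swallowedOutcome, honestOutcome);
```

With a failing scenario, the wrapped version still reports a pass while the unwrapped one reports a failure, which is why the wrapped test carried no signal.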

Follow-up to #423 (which removed the flaky `multiturn-10-scripted.test.ts`).

Audit notes — flagged for discussion (not in this PR)

These came up in the audit but I'm not removing them unilaterally:

  • `prefill-error-repro.test.ts` — bug-repro test sitting in examples. Skipped without `ANTHROPIC_API_KEY`. Probably belongs in a regression suite inside the SDK, not in user-facing examples.
  • Several live-LLM example tests (`travel-agent`, `vegetarian-recipe-agent`, `false-assumptions`, and the `multilingual-agent` happy-path test) run on every PR via the `Test (Examples)` job. They're useful as docs but introduce flake risk on unrelated PRs (e.g. dependency-pin PRs). Worth considering moving the example job to a nightly/manual workflow.
  • ~30 `it.todo(...)` placeholders across the multimodal test files — harmless, but most have been there indefinitely with no implementation. Could prune for tidiness.
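
One way the gating discussed above could look, as a rough sketch only: live-LLM examples run when an opt-in flag is set and an API key is present. The flag name `RUN_LIVE_EXAMPLES` and the helper names are assumptions for illustration, not something this repository defines; only `ANTHROPIC_API_KEY` comes from the notes above.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical opt-in flag; a nightly or manual workflow would set it.
const LIVE_FLAG = "RUN_LIVE_EXAMPLES";

// True when an API key is present and non-empty.
function hasApiKey(env: Env): boolean {
  const key = env["ANTHROPIC_API_KEY"];
  return key !== undefined && key.length > 0;
}

// Live examples run only when explicitly requested AND a key is available.
function shouldRunLiveExamples(env: Env): boolean {
  return env[LIVE_FLAG] === "1" && hasApiKey(env);
}

// Illustrative environments: an ordinary PR run versus a nightly run.
const prEnv: Env = { ANTHROPIC_API_KEY: "example-key" };
const nightlyEnv: Env = { ANTHROPIC_API_KEY: "example-key", RUN_LIVE_EXAMPLES: "1" };

console.log(shouldRunLiveExamples(prEnv), shouldRunLiveExamples(nightlyEnv));
```

In a Vitest suite, a predicate like this could feed `describe.skipIf(...)` so that ordinary PR runs skip the live tests while a scheduled job opts in.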

Test plan

  • CI green.
  • No reduction in meaningful coverage (both removed tests were pure no-ops).

- Drop `vegetarian-recipe-realtime.test.ts` — its only test is `it.skip`,
  so the whole file ran nothing.
- Remove the `handles adversarial users and conversational chaos` test
  in `multilingual-agent.test.ts` — its assertion was wrapped in a
  try/catch with `expect(true).toBe(true)` and a "TODO: flaky, let it
  pass for now" comment, so it always passed regardless of behaviour
  and consumed up to 360s of CI time on every PR.
github-actions bot added the `low-risk-change` label (PR qualifies as low-risk per policy and can be merged without manual review) on May 4, 2026

github-actions Bot commented May 4, 2026

Automated low-risk assessment

This PR was evaluated against the repository's Low-Risk Pull Requests procedure.

  • Scope: Deletes javascript/examples/vitest/tests/vegetarian-recipe-realtime.test.ts and removes the 'handles adversarial users and conversational chaos' test block from javascript/examples/vitest/tests/multilingual-agent.test.ts (test/examples-only changes).
  • Exclusions confirmed: no changes to auth, security settings, database schema, business-critical logic, or external integrations.
  • Classification: low-risk-change under the documented policy.

The PR only removes example test code: a fully skipped realtime example and an always-passing/flaky adversarial scenario that consumed CI time. It does not modify auth/authorization, secrets, DB schemas, business logic, or external integrations and is limited to test/examples cleanup, so it meets the low-risk criteria.

This classification allows merging without manual review once all required CI checks are passing and branch protection rules are satisfied.

Aryansharma28 merged commit 947f219 into main on May 4, 2026
8 of 9 checks passed
Aryansharma28 deleted the chore/remove-noop-example-tests branch on May 4, 2026, 13:36
Aryansharma28 added a commit that referenced this pull request May 4, 2026
chore(tests): remove flaky live-LLM travel-agent example test

The test runs 9 scripted live LLM turns with sequential weather +
accommodation tool calls under a 180s timeout. It has been timing out
or failing tool-call assertions on unrelated PRs (e.g. #394's
dependency overrides change).

Coverage is preserved: multi-tool agent flow is already demonstrated
deterministically by `database-tool-mocking.test.ts`, and the
live-LLM single-tool pattern is in `weather-agent.test.ts`.

Follow-up to #423 and #424.