
[Improve] Add deterministic xAI provider e2e coverage#149

Merged
edelauna merged 5 commits into main from feature/add-xai-provider-e2e-3p85k8m13gigu
May 18, 2026

Contributor

@roomote roomote Bot commented May 16, 2026

Opened on behalf of Elliott de Launay. View the task or mention @roomote for follow-up asks.

Related GitHub Issue

Not currently linked in GitHub.

Description

This PR adds deterministic mocked VS Code e2e coverage for the xAI Responses API provider and lands the follow-up fixes that were required to make that coverage trustworthy in CI.

The main implementation changes are:

  • add a dedicated xAI provider e2e suite that intercepts POST https://api.x.ai/v1/responses and verifies the read_file -> attempt_completion tool-use loop plus the expected request shape for grok-4.20
  • support optional local recording/replay of real xAI Responses API SSE events through the gitignored apps/vscode-e2e/fixtures/xai.json fixture file
  • fix the xAI mocked e2e request-capture leak by scoping probe diagnostics and assertions to the current probe tag, so delayed retries from earlier probes cannot contaminate the fast-model checks
  • document the xAI recording/replay workflow and fetch-interceptor hermetic-state guidance in apps/vscode-e2e/AGENTS.md
  • fix the Responses API stream fallback so a call_id-only delta does not incorrectly suppress the later response.output_item.done event that carries the first usable tool name
  • fix the latent Z.ai mocked e2e state leak by switching model/provider changes through profile activation so stale provider settings are cleared between cases
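
The request-capture side of the first bullet can be sketched roughly as follows. This is a minimal illustration, not the suite's actual code: `CapturedRequest`, `isXaiResponsesCall`, and `captureRequest` are hypothetical names, and the body shape (a `model` field plus `input` items of type `function_call_output` carrying a `call_id`) is assumed from the Responses API request format rather than taken from this PR's diff.

```typescript
// Hypothetical shapes; the real suite's types are not shown in this PR.
interface CapturedRequest {
  model: string;
  probeTag: string | null;
  functionCallOutputIds: string[];
}

interface InputItem {
  type?: string;
  call_id?: string;
}

const XAI_RESPONSES_URL = "https://api.x.ai/v1/responses";
const captured: CapturedRequest[] = [];

// True only for POST requests aimed at the xAI Responses endpoint.
function isXaiResponsesCall(url: string, method: string): boolean {
  return method.toUpperCase() === "POST" && url === XAI_RESPONSES_URL;
}

// Parses the JSON body and records the fields later assertions would inspect.
function captureRequest(rawBody: string, probeTag: string | null): CapturedRequest {
  const body = JSON.parse(rawBody) as { model?: string; input?: InputItem[] };
  const input = Array.isArray(body.input) ? body.input : [];
  const entry: CapturedRequest = {
    model: body.model ?? "",
    probeTag,
    // Assumed: tool results are sent back as function_call_output items.
    functionCallOutputIds: input
      .filter((item) => item.type === "function_call_output" && typeof item.call_id === "string")
      .map((item) => item.call_id as string),
  };
  captured.push(entry);
  return entry;
}
```

A fetch interceptor would call `isXaiResponsesCall` first, pass everything else through untouched, and answer matching requests with a canned SSE stream after recording them.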

Reviewer focus:

  • the current CI fix is the xAI probe-isolation change in apps/vscode-e2e/src/suite/providers/xai.test.ts
  • the responses-api-stream.ts fix is still grounded in the current xAI provider path, which is the only caller of that helper today
  • xAI’s documented Responses API behavior says streamed function calls are returned whole in a single chunk, which supports the stricter fallback logic here
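
The stricter fallback described above could look something like the sketch below. It assumes a flattened, hypothetical event shape (`StreamEvent`) and a hypothetical `collectToolCalls` helper; the real logic lives in src/api/transform/responses-api-stream.ts and is not reproduced in this conversation. The point it illustrates is that a call is only marked as handled once an event supplies a usable tool name.

```typescript
// Hypothetical, simplified event shape; real Responses API events nest
// more fields than this and the actual helper may differ.
interface StreamEvent {
  type: string;
  call_id?: string;
  name?: string;
  arguments?: string;
}

interface ToolCall {
  id: string;
  name: string;
  arguments: string;
}

function collectToolCalls(events: StreamEvent[]): ToolCall[] {
  const emitted = new Set<string>();
  const calls: ToolCall[] = [];
  for (const event of events) {
    // Skip events without a call_id and calls that were already emitted.
    if (!event.call_id || emitted.has(event.call_id)) continue;
    // A delta carrying only a call_id has no usable tool name yet, so it
    // must not mark the call as handled.
    if (!event.name) continue;
    emitted.add(event.call_id);
    calls.push({ id: event.call_id, name: event.name, arguments: event.arguments ?? "" });
  }
  return calls;
}
```

With this ordering, a name-less delta followed by a `response.output_item.done` event yields exactly one tool call, and a repeated done event for the same `call_id` is still deduplicated.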

Test Procedure

I validated the current branch with:

  • pnpm --filter @roo-code/vscode-e2e test:run -- --file xai.test
  • pnpm --filter @roo-code/vscode-e2e test:ci:mock
  • pnpm --dir apps/web-roo-code test
  • curl -I http://127.0.0.1:3000

Pre-Submission Checklist

  • Issue Linked: This PR is linked to an approved GitHub Issue (see "Related GitHub Issue" above).
  • Scope: My changes are focused on the linked issue (one major feature/fix per PR).
  • Self-Review: I have performed a thorough self-review of my code.
  • Testing: New and/or updated tests have been added to cover my changes (if applicable).
  • Documentation Impact: I have considered if my changes require documentation updates (see "Documentation Updates" section below).
  • Contribution Guidelines: I have read and agree to the Contributor Guidelines.

Screenshots / Videos

Not applicable. This PR changes mocked e2e coverage, stream-processing logic, and contributor guidance, but it does not ship a rendered UI change.

Documentation Updates

  • No documentation updates are required for user-facing docs.
  • Yes, documentation updates are required.

Contributor-facing e2e guidance was updated in apps/vscode-e2e/AGENTS.md for xAI recording/replay and hermetic fetch-interceptor suites.

Additional Notes

The latest CI failure on this PR was not a broad provider outage. It was an order-dependent xAI test-harness bug where delayed grok-4.20 follow-up requests could leak into later fast-model assertions. The fix here narrows the captured requests to the current probe instead of accepting every follow-up call globally.

I do not have a live xAI API key in this sandbox, so the provider grounding in this PR remains based on the checked-in xAI handler/test shape plus xAI’s current docs rather than a fresh live capture from this task.

Get in Touch

Discord username not provided in this task context.

@roomote roomote Bot added the roomote:auto-resolve-conflicts Allow Roomote to auto-resolve merge conflicts for this PR label May 16, 2026
Contributor Author

roomote Bot commented May 16, 2026

2 checks still pending after the review wait. See task

  • Assert that the xAI probe returns the exact marker from the workspace file instead of only checking for any completion_result event.
  • Responses API dedupe now drops tool calls when argument deltas include a call_id but omit the tool name.
  • e2e-mock is failing: the Z.ai GLM provider test still expects glm-5-turbo max_tokens 40,551, but the run reports 40,000.
  • CI still pending after the 10-minute wait: platform-unit-test (windows-latest). — stale: this sync review stopped once e2e-mock failed, before the pending-check timeout path applied.
  • The xAI request capture still leaks earlier probe traffic into the fast-model assertions, so e2e-mock fails when a stray grok-4.20 request is collected before the current fast-model probe.
  • CI still pending after the review wait: platform-unit-test (windows-latest).
  • CI still pending after the review wait: platform-unit-test (ubuntu-latest).


codecov Bot commented May 16, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Comment thread apps/vscode-e2e/src/suite/providers/xai.test.ts Outdated
@edelauna edelauna changed the title [Chore] Add xAI provider e2e coverage Fix Responses API duplicate streamed tool calls May 17, 2026
@edelauna edelauna force-pushed the feature/add-xai-provider-e2e-3p85k8m13gigu branch from aa2085a to 39fb2c6 Compare May 17, 2026 19:42
@edelauna edelauna marked this pull request as ready for review May 17, 2026 19:42
Comment thread src/api/transform/responses-api-stream.ts
@roomote roomote Bot changed the title Fix Responses API duplicate streamed tool calls [Improve] Add deterministic xAI provider e2e coverage May 17, 2026
Contributor Author

roomote Bot commented May 17, 2026

Fixed on the PR branch in 94f5535.

Latest shipped changes:

  • isolated the xAI probe request assertions to the current probe tag so delayed earlier retries cannot contaminate the fast-model checks
  • reset xAI request capture in setup() and documented hermetic-state guidance for fetch-interceptor suites in apps/vscode-e2e/AGENTS.md
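
As a rough illustration of the hermetic-state point, a capture buffer with an explicit reset might look like this. `RequestCapture` and `xaiCapture` are hypothetical names for this sketch only; the suite's actual capture state and setup hook are not shown in this conversation.

```typescript
// Hypothetical capture buffer; cleared before every test so requests from
// an earlier case cannot show up in a later case's assertions.
class RequestCapture<T> {
  private items: T[] = [];

  record(item: T): void {
    this.items.push(item);
  }

  // Returns a copy so callers cannot mutate the internal buffer.
  snapshot(): T[] {
    return [...this.items];
  }

  reset(): void {
    this.items = [];
  }
}

const xaiCapture = new RequestCapture<{ model: string }>();

// In a Mocha TDD-style suite the reset would run from the per-test hook,
// for example: setup(() => xaiCapture.reset());
```

Resetting in the per-test hook rather than once per suite keeps each case independent of execution order.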

Validation:

  • pnpm --filter @roo-code/vscode-e2e test:run -- --file xai.test
  • pnpm --filter @roo-code/vscode-e2e test:ci:mock
  • pnpm --dir apps/web-roo-code test
  • curl -I http://127.0.0.1:3000

Browser proof was not applicable because the shipped diff is mocked e2e coverage plus contributor guidance, not rendered UI.

edelauna previously approved these changes May 18, 2026
Contributor

@edelauna edelauna left a comment

Wanted to increase e2e provider coverage. Along the way, noticed a small bug in the response stream and also highlighted an issue with the Z.ai test-harness state management.

@edelauna edelauna force-pushed the feature/add-xai-provider-e2e-3p85k8m13gigu branch from 9dd2fff to 1edbebe Compare May 18, 2026 14:13
Comment thread apps/vscode-e2e/src/suite/providers/xai.test.ts Outdated
Contributor Author

roomote Bot commented May 18, 2026

I validated a narrow fix locally for the failing e2e-mock job.

Root cause:
runXAIToolProbe() in apps/vscode-e2e/src/suite/providers/xai.test.ts was including any request with a non-empty functionCallOutputIds list, even if that request belonged to an earlier probe. That let delayed grok-4.20 follow-up traffic leak into later fast-model assertions.

Patch I validated locally:

requests: requests.filter(
  (request) =>
    request.probeTag === probeTag ||
    (request.model === modelId && request.functionCallOutputIds.length > 0),
),
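
To make the effect of that filter concrete, here is a self-contained sketch of the same predicate applied to sample data. The `CapturedRequest` shape is inferred from the field names in the patch, `requestsForProbe` is a hypothetical wrapper, and `fast-model-placeholder` stands in for the fast model's ID, which is not stated in this conversation.

```typescript
// Shape inferred from the fields used in the patch; not the suite's real type.
interface CapturedRequest {
  model: string;
  probeTag: string | null;
  functionCallOutputIds: string[];
}

// Keeps requests tagged for this probe, plus untagged follow-ups that carry
// tool output and target the same model as the probe.
function requestsForProbe(
  requests: CapturedRequest[],
  probeTag: string,
  modelId: string,
): CapturedRequest[] {
  return requests.filter(
    (request) =>
      request.probeTag === probeTag ||
      (request.model === modelId && request.functionCallOutputIds.length > 0),
  );
}

const sample: CapturedRequest[] = [
  // Delayed follow-up from an earlier grok-4.20 probe.
  { model: "grok-4.20", probeTag: null, functionCallOutputIds: ["call_old"] },
  // Initial request of the current fast-model probe.
  { model: "fast-model-placeholder", probeTag: "probe-fast", functionCallOutputIds: [] },
  // Follow-up of the current probe, returning tool output.
  { model: "fast-model-placeholder", probeTag: null, functionCallOutputIds: ["call_new"] },
];

const scoped = requestsForProbe(sample, "probe-fast", "fast-model-placeholder");
```

Here `scoped` holds the two fast-model requests, and the stray grok-4.20 follow-up is dropped because it matches neither the probe tag nor the model.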

Validation run in a disposable PR worktree:

  • TEST_FILE=xai.test pnpm --filter @roo-code/vscode-e2e test:ci:mock
  • pnpm --filter @roo-code/vscode-e2e test:ci:mock

Results:

  • xAI-only suite: 3 passing
  • full mocked e2e package: 50 passing, 7 pending

So this looks like the right fix for the CI failure rather than just masking the assertion.

@edelauna edelauna force-pushed the feature/add-xai-provider-e2e-3p85k8m13gigu branch from 94f5535 to 55bef60 Compare May 18, 2026 18:21
@edelauna edelauna merged commit ec204f9 into main May 18, 2026
9 checks passed
@edelauna edelauna deleted the feature/add-xai-provider-e2e-3p85k8m13gigu branch May 18, 2026 19:05

Labels

roomote:auto-resolve-conflicts Allow Roomote to auto-resolve merge conflicts for this PR

3 participants