fix(e2e): make D10 coding task resilient to context overflow #74

Merged
emal-avala merged 1 commit into main from fix/e2e-d10-flaky
Apr 6, 2026

@emal-avala
Member

Summary

  • D10 consistently fails with code=400 on small models (gpt-5-nano) because the session context accumulated across D1-D9 exceeds the model's context limit
  • Restart serve before D10 to get a fresh session with empty context
  • Retry once on failure with another fresh session restart
  • Simplify prompt — provide exact script content instead of describing the logic
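
The restart-then-retry behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `run_with_fresh_session`, `restart_session`, and `run_task` are hypothetical names, and the real E2E harness may be structured differently (for example as a shell script).

```python
from typing import Callable, Tuple


def run_with_fresh_session(
    restart_session: Callable[[], None],  # hypothetical: restarts serve, yielding an empty context
    run_task: Callable[[], bool],         # hypothetical: runs the D10 task, True on success
    max_attempts: int = 2,                # first try plus one retry
) -> Tuple[bool, int]:
    """Restart the session before every attempt and report (passed, attempts_used)."""
    for attempt in range(1, max_attempts + 1):
        # Each attempt starts from a fresh session so that context
        # accumulated by earlier tasks cannot overflow the model.
        restart_session()
        if run_task():
            return True, attempt
    return False, max_attempts


if __name__ == "__main__":
    # Stand-in callables: the first attempt fails, the retry succeeds.
    restarts = []
    outcomes = iter([False, True])
    passed, attempts = run_with_fresh_session(
        restart_session=lambda: restarts.append("restart"),
        run_task=lambda: next(outcomes),
    )
    print(passed, attempts, len(restarts))
```

Restarting on every attempt, rather than only on the retry, keeps the first attempt independent of whatever D1-D9 left in the session.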

This has been failing on v0.13.0, v0.13.1, and v0.14.0 E2E runs.

Test plan

  • Add run-e2e label to trigger E2E suite — D10 should pass

🤖 Generated with Claude Code

@emal-avala added the run-e2e label (Trigger E2E test suite on this PR) on Apr 6, 2026
@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

D10 consistently fails with code=400 on small models (gpt-5-nano)
because the accumulated session context from D1-D9 exceeds limits.

Fix:
- Restart serve before D10 to get a fresh session with empty context
- Retry once on failure with another fresh session restart
- Simplify the prompt to be more direct (provide exact script content
  instead of describing the logic)

This was failing on v0.13.0, v0.13.1, and v0.14.0 releases.
@emal-avala merged commit ff6210f into main on Apr 6, 2026
13 of 14 checks passed
@emal-avala deleted the fix/e2e-d10-flaky branch on April 6, 2026 08:41
@emal-avala mentioned this pull request on Apr 6, 2026