test: separate e2e tests and add backend adapter e2e coverage#16
test: separate e2e tests and add backend adapter e2e coverage#16
Conversation
Split e2e tests into a dedicated vitest config so `npm test` runs only unit tests (~7s) while `npm run test:e2e` exercises integration flows. Added `test:all` script for full verification. New e2e tests for the three untested backend adapters: - ACP: 8 tests (conversation flow, multi-turn, permissions, resume, crash) - Agent SDK: 8 tests (rich content, results, permissions, abort, errors) - Codex: 8 tests (streaming, multi-turn, approvals, cancel, WS errors) Shared test infrastructure in backend-test-utils.ts with MessageReader, mock helpers, and message factories extracted from compliance tests.
Summary of ChangesHello @teng-lin, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the testing suite by restructuring how unit and end-to-end tests are executed and by expanding e2e coverage for critical backend adapters. The changes ensure that developers can run fast unit tests without e2e overhead, while also providing robust validation for the integration and behavior of the ACP, Agent SDK, and Codex adapters through comprehensive new e2e scenarios. This improves test maintainability and reliability across the project. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
The pull request successfully separates e2e tests from unit tests and adds comprehensive e2e coverage for the ACP, Agent SDK, and Codex backend adapters. The introduction of shared test infrastructure in backend-test-utils.ts is a positive step towards maintainability and consistency across e2e tests. The new test:e2e and test:all scripts provide clear execution paths for different testing needs.
| const { child, stdin, stdout } = createMockChild(); | ||
| setupStdin(stdin, stdout); | ||
| return child; | ||
| }) as unknown as SpawnFn; |
There was a problem hiding this comment.
|
|
||
| session.send(createUserMessage("trigger crash")); | ||
|
|
||
| const messages = await reader.collect(2, 2000); |
There was a problem hiding this comment.
|
|
||
| it("subprocess crash during handshake rejects connect()", async () => { | ||
| const adapter = createAdapter((_stdin, stdout) => { | ||
| setTimeout(() => stdout.emit("close"), 5); |
| } | ||
| abortSignal?.addEventListener("abort", () => resolve()); | ||
| // Also resolve after a timeout to not hang the test | ||
| setTimeout(resolve, 2000); |
|
|
||
| // First call throws — give a moment for error to propagate | ||
| session.send(createUserMessage("Trigger error")); | ||
| await new Promise((r) => setTimeout(r, 50)); |
| ws.emit("close"); | ||
|
|
||
| // Should get the partial message, then stream should end | ||
| const messages = await collectUnifiedMessages(session, 2, 1000); |
|
|
||
| // Stream should end | ||
| const messages = await collectUnifiedMessages(session, 1, 500); | ||
| // May or may not collect any messages, but stream shouldn't hang |
| timeoutMs, | ||
| ), | ||
| ), |
There was a problem hiding this comment.
The use of undefined as unknown as UnifiedMessage in the setTimeout callback for Promise.race is a common workaround for TypeScript's type system when a promise resolves with an undefined value on timeout. While functional, it could be slightly cleaner if the IteratorResult type could explicitly handle undefined for value when done is true, or if the timeout promise resolved with a distinct sentinel value that is then handled.
| () => r({ value: undefined as unknown as UnifiedMessage, done: true }), | ||
| timeoutMs, | ||
| ), |
There was a problem hiding this comment.
The use of undefined as unknown as UnifiedMessage in the setTimeout callback for Promise.race is a common workaround for TypeScript's type system when a promise resolves with an undefined value on timeout. While functional, it could be slightly cleaner if the IteratorResult type could explicitly handle undefined for value when done is true, or if the timeout promise resolved with a distinct sentinel value that is then handled.
| type: "user_message", | ||
| role: "user", | ||
| content: [{ type: "text", text }], | ||
| metadata: { sessionId, session_id: sessionId }, |
There was a problem hiding this comment.
The createUserMessage function includes both sessionId and session_id in the metadata. If sessionId is the canonical key, session_id could be redundant or potentially lead to confusion. If both are necessary for distinct purposes (e.g., one for internal use, one for external API), a clarifying comment would be beneficial.
Update docs/architecture.md: - Replace Tier 1 table with resolved summary (#1–4 all fixed) - Update Tier 2 #6 (pendingInitialize now set in correct post-reducer hook) - Update Tier 3 #14 (handleInboundCommand rejections fixed via SEND_TO_CONSUMER) - Update Tier 3 #16 (trySendRawToBackend now runtime-internal only) Also narrow SLASH_LOCAL_RESULT signal source type to "emulated" | "cli" to match ConsumerMessage contract (caught by typecheck).
Summary
npm testnow excludes*.e2e.test.tsfiles, keeping the fast feedback loop (~7s). Newtest:e2eandtest:allscripts added.backend-test-utils.tswithMessageReader, mock helpers, and message factories extracted from compliance tests.New e2e test coverage
Test plan
npm testexcludes all*.e2e.test.ts— 79 files, 1495 testsnpm run test:e2eincludes all e2e — 8 files, 59 testsnpm run test:allruns both — all green