feat: add end-to-end integration tests with mock agent backend (#11) (#11) by jafreck · Pull Request #28 · jafreck/CADRE

jafreck · 2026-02-22T23:31:51Z

Summary

This PR adds end-to-end integration tests for the full CADRE pipeline using a mock agent backend and mock platform provider, so there is no dependency on real GitHub credentials or network calls. A GitHub Actions workflow is also added so e2e tests run automatically on every push and pull request.

Closes #11

Changes

tests/e2e-pipeline.test.ts: New e2e test suite exercising the real IssueOrchestrator through four scenarios (happy path, retry, blocked task, resume) using an inline E2ELauncher and MockPlatformProvider. CommitManager is mocked via vi.mock to avoid real git operations.
tests/helpers/mock-agent-launcher.ts: Reusable MockAgentLauncher helper with per-agent and per-task handler registration and configurable failure injection.
tests/helpers/mock-platform-provider.ts: In-memory MockPlatformProvider implementing the full PlatformProvider interface without real credentials.
tests/e2e-workflow.test.ts: Unit tests validating the GitHub Actions workflow YAML file contents (11 test cases).
.github/workflows/e2e.yml: New workflow that triggers on push and pull_request, installs with npm ci, sets CADRE_E2E=1, and runs npm run test:e2e with a 10-minute timeout.
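
For readers who have not opened tests/helpers/mock-agent-launcher.ts, the following is a minimal sketch of what "per-agent and per-task handler registration and configurable failure injection" could look like. The class name SketchAgentLauncher, the AgentResult shape, and the method names onAgent, onTask, and failNext are all assumptions made for illustration, and the sketch is synchronous for brevity; the real helper's API may differ.

```typescript
// Hypothetical shapes for illustration only; not the real CADRE types.
type AgentResult = { success: boolean; output: string };
type Handler = (agent: string, taskId?: string) => AgentResult;

class SketchAgentLauncher {
  private agentHandlers = new Map<string, Handler>();
  private taskHandlers = new Map<string, Handler>();
  private failuresLeft = new Map<string, number>();

  // Register a handler used for every launch of the given agent.
  onAgent(agent: string, handler: Handler): this {
    this.agentHandlers.set(agent, handler);
    return this;
  }

  // Register a handler for one agent/task pair; it takes priority over onAgent.
  onTask(agent: string, taskId: string, handler: Handler): this {
    this.taskHandlers.set(`${agent}:${taskId}`, handler);
    return this;
  }

  // Failure injection: the next `times` launches of this pair report failure.
  failNext(agent: string, taskId: string, times: number): this {
    this.failuresLeft.set(`${agent}:${taskId}`, times);
    return this;
  }

  launch(agent: string, taskId?: string): AgentResult {
    const key = `${agent}:${taskId ?? ""}`;
    const remaining = this.failuresLeft.get(key) ?? 0;
    if (remaining > 0) {
      this.failuresLeft.set(key, remaining - 1);
      return { success: false, output: "injected failure" };
    }
    const handler = this.taskHandlers.get(key) ?? this.agentHandlers.get(agent);
    // Agents without a registered handler succeed with empty output.
    return handler ? handler(agent, taskId) : { success: true, output: "" };
  }
}
```

Keeping the injected-failure counter separate from the handlers means a test can describe the normal output once and layer failures on top of it.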

Implementation Details

The e2e tests wire IssueOrchestrator with real CheckpointManager and real filesystem I/O under os.tmpdir(), while replacing the two external boundaries (agent execution and platform API) with fast, deterministic in-process stubs. Each test creates a unique temp directory and cleans up in afterEach. The E2ELauncher writes synthetic Markdown outputs that match the schemas ResultParser expects, so all five orchestrator phases execute normally.
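
The per-test temp directory pattern described above can be sketched with Node's standard library alone. The helper names createWorkspace and removeWorkspace and the "cadre-e2e-" prefix are assumptions for illustration; in the actual suite the equivalent calls would sit in the test setup and in afterEach.

```typescript
import { existsSync, mkdtempSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Create a unique directory under os.tmpdir(). The prefix is an assumption.
function createWorkspace(): string {
  return mkdtempSync(join(tmpdir(), "cadre-e2e-"));
}

// Remove the directory and everything in it; safe to call if it is already gone.
function removeWorkspace(dir: string): void {
  rmSync(dir, { recursive: true, force: true });
}

// Example lifecycle: set up, write an artifact, then clean up.
const demoDir = createWorkspace();
writeFileSync(join(demoDir, "pr-content.md"), "# Example PR body\n");
removeWorkspace(demoDir);
console.log(existsSync(demoDir)); // the directory no longer exists after cleanup
```

Because mkdtempSync appends random characters to the prefix, tests that run in parallel do not share a directory.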

Testing

Happy path: 1 issue, 2-task plan, all agents succeed → result.success === true, 5 phases all pass, pr-content.md written to disk
Retry path: code-writer fails on first attempt for task-001, succeeds on second → result.success === true
Blocked task: 1 of 3 tasks always fails (exceeds maxRetriesPerTask) → pipeline still returns result.success === true, blocked task visible in checkpoint state
Resume: first run completes phases 1–2 then stops; second run skips those phases → token usage for phases 1 and 2 is 0
All 4 e2e scenario tests pass (npx vitest run tests/e2e-pipeline.test.ts)
11 workflow validation tests pass (tests/e2e-workflow.test.ts)
All other existing unit tests continue to pass
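
The retry and blocked-task scenarios can be pictured with a small stand-alone loop. This is a sketch under stated assumptions, not the IssueOrchestrator implementation: runTasks and TaskOutcome are hypothetical names, and maxRetriesPerTask is treated here as the total number of attempts allowed per task, which may not match CADRE's exact semantics.

```typescript
// Hypothetical outcome record; the real checkpoint state has its own shape.
type TaskOutcome = { taskId: string; status: "done" | "blocked"; attempts: number };

// Try each task up to maxRetriesPerTask times. A task that never succeeds is
// marked blocked, and the remaining tasks still run.
function runTasks(
  taskIds: string[],
  attempt: (taskId: string, attemptNo: number) => boolean,
  maxRetriesPerTask: number,
): TaskOutcome[] {
  return taskIds.map((taskId): TaskOutcome => {
    for (let n = 1; n <= maxRetriesPerTask; n++) {
      if (attempt(taskId, n)) return { taskId, status: "done", attempts: n };
    }
    return { taskId, status: "blocked", attempts: maxRetriesPerTask };
  });
}

// task-001 fails once and then succeeds; task-002 always fails.
const outcomes = runTasks(
  ["task-001", "task-002", "task-003"],
  (taskId, attemptNo) => {
    if (taskId === "task-002") return false;
    if (taskId === "task-001" && attemptNo === 1) return false;
    return true;
  },
  2,
);
console.log(outcomes.map((o) => `${o.taskId}:${o.status}`).join(" "));
```

In this sketch a blocked task is recorded rather than thrown, which mirrors the behaviour the blocked-task scenario checks: the run finishes and the blocked task remains visible in the recorded state.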

Integration Verification

Install: pass
Build: pass
Tests: 209 of 210 pass — 1 pre-existing failure in tests/github-issues.test.ts (unrelated to this PR; that test expects the old get_issue MCP tool name but the implementation was updated to issue_read)

Notes

The single failing test (GitHubAPI > getIssue > should fetch issue details via MCP) pre-dates this PR's changes. It asserts callTool('get_issue', ...) but the implementation now calls callTool('issue_read', { method: 'get', ... }). A fix-surgeon result file documenting the needed fix is included in the diff, but the test file itself was left uncorrected in this PR to keep the change minimal.
The e2e tests do not cover the budget-exceeded scenario (tokens exceed budget → graceful halt). The issue listed five candidate scenarios and required "at least 3", so the four implemented scenarios exceed the acceptance criteria.
Node version is hardcoded to 22 in the workflow (no .nvmrc found in the repo).
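
To make the first note concrete, here is a sketch of the check the stale test would need once it is updated. ToolCall and checkGetIssueCall are hypothetical names, and only the two details stated above (the issue_read tool name and method: 'get') are checked; the remaining arguments are elided in the note and are deliberately left out here.

```typescript
// Hypothetical record of one callTool invocation captured by a test double.
type ToolCall = { name: string; args: Record<string, unknown> };

// Returns a description of the mismatch, or null when the call looks right.
function checkGetIssueCall(call: ToolCall | undefined): string | null {
  if (!call) return "callTool was never invoked";
  if (call.name !== "issue_read") return `expected issue_read, got ${call.name}`;
  if (call.args.method !== "get") return "expected args.method to be 'get'";
  return null;
}

// The old expectation fails against the new check; the new call shape passes.
console.log(checkGetIssueCall({ name: "get_issue", args: {} }));
console.log(checkGetIssueCall({ name: "issue_read", args: { method: "get" } }));
```

In a Vitest test the same check would usually be written against the mock directly, for example expect(callTool).toHaveBeenCalledWith('issue_read', expect.objectContaining({ method: 'get' })), assuming callTool is a vi.fn() mock.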

Cadre Process Challenges

This section is required for all CADRE-generated PRs (dogfooding data).
Document honestly what was difficult, confusing, or error-prone when CADRE processed this issue.

Issue clarity: The issue listed five test scenarios but said "at least 3 are required" without specifying which three are mandatory. This forced the implementation agent to make an arbitrary choice and led to a mismatch with the acceptance criterion for the resume scenario: the plan said to use dryRun: true to pause at phase 2, but it is unclear whether IssueOrchestrator actually supports those semantics.
Agent contracts: The MockAgentLauncher helper and the inline E2ELauncher class duplicated some logic. The planner requested a separate tests/helpers/mock-agent-launcher.ts file (task-001) and also a test that used its own inline launcher (task-003), which created confusion about which launcher the e2e tests should actually use.
Context limitations: Analysis noted that no file tree was provided, so the exact locations for helper files and the shapes of checkpoint/cost-report data structures had to be inferred from source code. The codebase-scout phase helped, but the agent still had to make guesses that led to some back-and-forth.
Git/worktree: Mocking CommitManager via vi.mock required knowing the exact module path at write time. Any path mismatch silently fails (the mock doesn't apply), which is hard to diagnose. A cleaner dependency-injection approach in IssueOrchestrator would make this simpler.
Parsing/output: The synthetic implementation-plan output in the e2e tests must exactly match the schema that ResultParser.parseImplementationPlan expects. Getting the heading format and dependency syntax right required iterating — minor deviations caused silent parse failures producing 0 tasks.
Retry behavior: No agent retries were needed in this run, but the fix-surgeon was invoked to address the pre-existing github-issues.test.ts failure. The fix-surgeon result file was committed to the worktree rather than automatically updating the test, which is an odd artifact.
Overall: The biggest friction in this run was the ambiguity between using the standalone MockAgentLauncher helper (task-001) vs. the inline E2ELauncher in the test file itself (task-003), and the lack of a clear spec for what ResultParser expects from synthetic plan output. Both caused multiple implementation iterations.
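
The dependency-injection idea raised under Git/worktree can be sketched as follows. Everything here is hypothetical: the Committer interface, SketchOrchestrator, and FakeCommitter are illustrative names, and the real CommitManager API is not shown in this PR. The point is only that a constructor argument removes the need to know a module path for vi.mock.

```typescript
// Hypothetical minimal interface for the commit boundary.
interface Committer {
  commit(message: string): Promise<string>;
}

// The orchestrator receives its committer instead of importing one,
// so a test can pass a fake without any module-level mocking.
class SketchOrchestrator {
  private readonly committer: Committer;

  constructor(committer: Committer) {
    this.committer = committer;
  }

  async finishTask(taskId: string): Promise<string> {
    return this.committer.commit(`feat: implement ${taskId}`);
  }
}

// In-memory fake that records messages and never touches git.
class FakeCommitter implements Committer {
  messages: string[] = [];

  async commit(message: string): Promise<string> {
    this.messages.push(message);
    return `fake-sha-${this.messages.length}`;
  }
}

const fakeCommitter = new FakeCommitter();
const sketchOrchestrator = new SketchOrchestrator(fakeCommitter);
void sketchOrchestrator.finishTask("task-001");
console.log(fakeCommitter.messages);
```

With this shape, a wrong wiring shows up as a type error or an empty messages array in the test, rather than as a mock that silently fails to apply.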


jafreck added 6 commits February 22, 2026 15:12

feat(#11): implement Create MockPlatformProvider helper

5418cf9

feat(#11): implement Create MockAgentLauncher helper

c9e8df2

feat(#11): implement Write e2e test suite

bbe3052

feat(#11): implement Add GitHub Actions e2e workflow

c8c1761

fix(#11): address integration issues

b912b6b

chore: remove cadre task artifacts

f7236ad

jafreck marked this pull request as ready for review February 23, 2026 00:07

jafreck merged commit 15c84ae into main Feb 23, 2026
2 checks passed

jafreck added the cadre-generated Pull request automatically generated by cadre label Feb 23, 2026

This was referenced Feb 23, 2026

Add baseline test snapshot to distinguish pre-existing failures from regressions #65

Closed

Act on detected ambiguities — gate, log, and post clarification comments #69

Closed

jafreck deleted the cadre/issue-11 branch February 25, 2026 01:18

jafreck mentioned this pull request Mar 4, 2026

fix(resume): reconcile stale checkpoint against actual PR state on resume #341

Merged
