
refactor(test): extract shared SessionPrompt layer to test/lib/prompt-harness (F2) #5

Closed
tesdal wants to merge 1 commit into phase-ab-base from audit-f2-prompt-harness

Conversation

tesdal (Owner) commented Apr 23, 2026

Summary

Audit finding F2 from the 2026-04-22 diamond audit: prompt.test.ts and subagent-hang-regression.test.ts both composed the same ~85-line Effect Layer graph with identical MCP/LSP/Summary stubs. The regression test explicitly called this out with a "Copied verbatim" comment. Task 2 extracts the shared graph into a single helper.

Changes

  • New: packages/opencode/test/lib/prompt-harness.ts exporting makePromptLayer(). Contains the three inline stubs (summary/mcp/lsp) plus module-scope status/runState/infra constants and the full SessionPrompt.layer composition.
  • packages/opencode/test/session/prompt.test.ts: deleted inline stubs + constants + makeHttp() function; now imports makePromptLayer from the new harness.
  • packages/opencode/test/session/subagent-hang-regression.test.ts: same deletions, "Copied verbatim" comment removed, now imports makePromptLayer.

Net −120 LOC (147 added in harness, 267 removed across both test files).
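
As a rough illustration of the pattern described above (stubs defined once, composed by a single factory that both test files call), here is a dependency-free TypeScript sketch. It deliberately does not use the Effect Layer API or the real harness code: the interfaces, the stub bodies, and the name makePromptLayerSketch are all hypothetical stand-ins chosen for this example.

```typescript
// Hypothetical stand-ins for the stubbed services. These are NOT the real
// opencode interfaces and NOT Effect Layers; they only mirror the shape of
// "three stubs plus one composition" described in the PR.
interface SummaryStub { summarize(text: string): string }
interface McpStub { tools(): string[] }
interface LspStub { diagnostics(file: string): string[] }

interface PromptHarnessSketch {
  summary: SummaryStub
  mcp: McpStub
  lsp: LspStub
  prompt(input: string): string
}

// Stubs are defined once at module scope, so every caller sees the same behaviour.
const summaryStub: SummaryStub = { summarize: () => "" }
const mcpStub: McpStub = { tools: () => [] }
const lspStub: LspStub = { diagnostics: () => [] }

// Single factory: each call returns a fresh object wired from the shared stubs,
// so two test files cannot drift apart by editing their own copies.
function makePromptLayerSketch(): PromptHarnessSketch {
  return {
    summary: summaryStub,
    mcp: mcpStub,
    lsp: lspStub,
    prompt(input: string): string {
      // Illustrative behaviour only: report how many stub tools are visible.
      return `${input} [tools=${mcpStub.tools().length}]`
    },
  }
}
```

Each test file would then call the factory instead of repeating the wiring; the point of the design is that a change to a stub happens in one place.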

Verification

  • bun typecheck — clean
  • bun test test/session/subagent-hang-regression.test.ts — 2/2 pass, verified 4/4 clean runs on this branch + 1 clean run on base for sanity
  • bun test test/session/prompt.test.ts: 42 pass / 2 fail. Both failures are in the known-flaky PT0 pattern (cancel tests spawning sleep 30 shells with a 3s it.live timeout). The failure count (2) is below the documented 3–11 range. The refactor cannot introduce new failures: it is pure Layer-composition reuse with no test-body changes. See anomalyco/opencode#24060.

Diamond review status

  • Spec compliance (codex-5.3): 9/9 ✅. Initially flagged a regression-test hook-timeout blocker; investigation (controller ran test 4 more times on branch + once on base) confirmed it was a non-reproducing runner artifact unrelated to the refactor. Verdict overridden to APPROVE.
  • Code quality (Opus): APPROVE WITH NITS. Zero semantic drift (byte-equivalent Layer wiring), clean import trimming, correct file location in test/lib/. One actionable nit (duplicate import from "../../src/tool") — fixed and amended into the commit. Other nits (naming, module-scope constant inlining) are judgement calls, left as specified.
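
The reproducibility check mentioned above (rerunning the suspect test several times and counting failures) can be sketched as a small tally helper. This is a minimal illustration, not the controller's actual tooling: the rerun function, the Tally shape, and the synchronous check callback are assumptions made for this example.

```typescript
// Hypothetical helper: run a check several times and record how often it fails.
// A "check" is any function that throws on failure (a stand-in for one test run).
interface Tally {
  runs: number
  failures: number
  errors: string[]
}

function rerun(check: () => void, runs: number): Tally {
  const tally: Tally = { runs, failures: 0, errors: [] }
  for (let i = 0; i < runs; i++) {
    try {
      check()
    } catch (e) {
      // Keep the message so a recurring failure can be told apart from a one-off.
      tally.failures++
      tally.errors.push(e instanceof Error ? e.message : String(e))
    }
  }
  return tally
}
```

A tally with zero failures over repeated runs is evidence, not proof, that an earlier failure was a runner artifact; comparing against a run on the base branch, as was done here, helps separate flakiness from a regression.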

Context

Part of the 14-task audit remediation plan (docs/superpowers/plans/2026-04-23-audit-remediation.md). Base phase-ab-base is 2309cc89d (F1 merge on local/integration-v2). Draft PR opened for Copilot review only; the merge target is the local integration branch, not upstream.

…-harness

Both prompt.test.ts and subagent-hang-regression.test.ts composed the
same ~85-line Layer graph with identical MCP/LSP/Summary stubs. The
regression test's 'copied verbatim' comment explicitly called out the
duplication. Extracts to makePromptLayer() in test/lib so the
composition can't drift.

Addresses audit finding F2 (both diamond reviewers, 2026-04-22).

Copilot AI left a comment

Pull request overview

This PR addresses audit finding F2 by extracting the shared Effect/Layer composition used by SessionPrompt loop tests into a single reusable test harness, eliminating duplicated stub/service wiring across multiple test suites.

Changes:

  • Added makePromptLayer() in a new shared test harness (test/lib/prompt-harness.ts) containing the common SessionPrompt Layer graph and MCP/LSP/Summary stubs.
  • Refactored prompt.test.ts to remove the inline Layer graph and import makePromptLayer() instead.
  • Refactored subagent-hang-regression.test.ts similarly, removing the prior “Copied verbatim” duplication and using the shared harness.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| packages/opencode/test/lib/prompt-harness.ts | New shared helper exporting makePromptLayer() with consolidated stubs and SessionPrompt Layer composition. |
| packages/opencode/test/session/prompt.test.ts | Removes duplicated Layer/stub setup; now uses makePromptLayer() from the harness. |
| packages/opencode/test/session/subagent-hang-regression.test.ts | Removes duplicated Layer/stub setup; now uses makePromptLayer() from the harness. |

tesdal (Owner, Author) commented Apr 23, 2026

Copilot review: no comments. All three reviewers (codex-5.3 spec 9/9 after reproducibility override, Opus quality APPROVE WITH NITS fixed, Copilot clean) agree. Merging --no-ff into local/integration-v2. PR closed as review-only.

@tesdal tesdal closed this Apr 23, 2026
tesdal added a commit that referenced this pull request Apr 23, 2026
…harness

Diamond review: codex-5.3 spec 9/9 (initial false-positive hook-timeout flake verified non-reproducing 4/4 runs), Opus quality APPROVE WITH NITS (duplicate import merge applied), Copilot clean.
Closes audit finding F2. Copilot review PR #5 (closed as review-only).