feat(mcp): pause_session tool + MCP-aware pause() yield mode by DavertMik · Pull Request #5544 · codeceptjs/CodeceptJS

DavertMik · 2026-04-29T23:53:45Z

Summary

pause() now detects MCP context (CODECEPTJS_MCP=1, non-TTY stdin) and adapts: a skip mode that resolves immediately so leftover pause() calls don't deadlock CI runs invoked through MCP, and a yield mode (CODECEPTJS_MCP_PAUSE=1) that reads JSON-line commands on stdin and emits {__mcpPause:true,...} responses on stdout (paused / result / resumed / exited / error). Each run/snapshot response carries the same artifact bundle as run_code / snapshot (URL, ARIA, HTML, screenshot, console, storage).
New MCP server tool pause_session with sub-actions start / run / snapshot / step / resume / exit / status. Spawns a test subprocess in yield mode, multiplexes commands by id, and queues consumers waiting for the next paused event.
TTY path (npx codeceptjs run --debug at a terminal) is unchanged.
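
The mode selection described in this summary can be sketched as a small pure function. This is only an illustration: `detectPauseMode` and its return values are hypothetical names, and the sketch keys off the two environment variables alone because the description does not say how the non-TTY stdin check combines with them.

```javascript
// Hypothetical helper: the name and return values are illustrative,
// not taken from lib/pause.js.
function detectPauseMode(env) {
  // Assumption: yield mode is opted into explicitly and takes priority.
  if (env.CODECEPTJS_MCP_PAUSE === '1') return 'yield';
  // MCP context without the yield flag: pause() should resolve immediately.
  if (env.CODECEPTJS_MCP === '1') return 'skip';
  // Otherwise keep the interactive readline REPL.
  return 'tty';
}
```

In a real process this would presumably be called as `detectPauseMode(process.env)`.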

Why

Before this change, an agent driving CodeceptJS through MCP couldn't tolerate pause() in a test — readline blocked on stdin the agent couldn't supply, the subprocess hung, and MCP eventually timed out. There was also no way for the agent to drive the REPL itself. This PR makes both work without affecting the human TTY workflow.

Files

lib/pause.js — context detection, yield-mode session, persistent readline across pauseSession entries.
bin/mcp-server.js — pause_session tool, JSON-line subprocess multiplexer, line-buffered stdout/stderr classifier.
docs/mcp.md, docs/debugging.md — documented pause_session and pause()'s three modes.
test/unit/pause_test.js (new) — 10 cases: env detection, JSON envelope shape, protocol round-trip (paused/resumed/snapshot/invalid-JSON/unknown-type/exit-rejects).
test/unit/mcpServer_test.js — 6 new cases for the line classifier.
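
A line-buffered classifier of the kind listed for bin/mcp-server.js might look roughly like the sketch below. The function name, the callbacks, and the `type` field are assumptions made for illustration; only the `__mcpPause: true` marker comes from the PR text.

```javascript
// Hypothetical sketch: splits subprocess output into complete lines and
// separates protocol envelopes from ordinary test output.
function createLineClassifier(onProtocol, onOutput) {
  let buffer = '';
  return function push(chunk) {
    buffer += chunk.toString();
    const lines = buffer.split('\n');
    // The last element is an incomplete line (or ''); keep it for the next chunk.
    buffer = lines.pop();
    for (const line of lines) {
      let message = null;
      if (line.startsWith('{')) {
        try {
          message = JSON.parse(line);
        } catch {
          message = null; // not valid JSON, treat as plain output
        }
      }
      if (message && message.__mcpPause === true) onProtocol(message);
      else onOutput(line);
    }
  };
}
```

Buffering matters because a single JSON line can arrive split across two `data` events, so classification has to wait for the newline.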

Test plan

npm run test:unit — 685 passed, 0 failed
npx mocha test/unit/pause_test.js test/unit/mcpServer_test.js — 48 passed
Syntax check on lib/pause.js and bin/mcp-server.js
Manual: npx codeceptjs run --debug with an in-test pause() still drops to the human REPL exactly as today
Manual: CODECEPTJS_MCP=1 npx codeceptjs run with a pause() test prints the skip notice and continues
Manual: through the MCP server, pause_session.start on a test with pause() returns a paused event; run calls return artifact bundles; resume lets the test finish

🤖 Generated with Claude Code

In-test pause() calls hung subprocess runs invoked through the MCP server because readline blocked on stdin that an agent can't supply.

pause() now detects MCP context (CODECEPTJS_MCP=1, non-TTY stdin) and adapts:

- Skip mode (CODECEPTJS_MCP=1 only): pause() prints a notice and resolves immediately so leftover pause() calls don't deadlock CI runs.
- Yield mode (CODECEPTJS_MCP_PAUSE=1): pause() reads JSON-line commands on stdin and emits {__mcpPause:true,...} responses on stdout (paused, result, resumed, exited, error). Each run/snapshot response includes the artifact bundle from captureSnapshot.

The new MCP server pause_session tool spawns a test subprocess in yield mode and multiplexes start/run/snapshot/step/resume/exit/status sub-actions over the JSON-line protocol. TTY behavior at a terminal is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
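
To make the yield-mode exchange concrete, here is a minimal sketch of how one stdin line could be turned into one stdout envelope. The command shape (`id`, `type`, `code`) and the `type`/`message` response fields are assumptions; the commit message only fixes the `__mcpPause: true` marker and the response kinds (paused, result, resumed, exited, error).

```javascript
// Hypothetical envelope builder; only the __mcpPause marker is from the commit.
function envelope(type, fields = {}) {
  return JSON.stringify({ __mcpPause: true, type, ...fields });
}

// Assumed command shape: {"id": 1, "type": "run", "code": "..."}.
function handleCommandLine(line) {
  let command;
  try {
    command = JSON.parse(line);
  } catch (err) {
    return envelope('error', { message: 'invalid JSON' });
  }
  if (command === null || typeof command !== 'object') {
    return envelope('error', { message: 'command must be a JSON object' });
  }
  const { id, type } = command;
  switch (type) {
    case 'resume':
      return envelope('resumed', { id });
    case 'exit':
      return envelope('exited', { id });
    case 'run':
    case 'snapshot':
      // A real session would execute the code and attach the artifact bundle here.
      return envelope('result', { id });
    default:
      return envelope('error', { id, message: `unknown type: ${type}` });
  }
}
```

Echoing the `id` back is what would let a server match responses to the commands it multiplexed.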

Drops the id-keyed message multiplexer and 7-action enum (start/run/snapshot/step/resume/exit/status). The yield-mode subprocess now reads plain text lines from stdin (same shape as the TTY readline REPL) and emits one JSON line per input on stdout.

The MCP server pause_session tool exposes only "start" and "run". A run takes a code string with the same conventions as the TTY pause REPL: "" steps, "resume" continues, "exit" aborts, and anything else is treated as I.<expr> or =>raw_js. Each run returns the next protocol message.

Net: 237 lines removed, 159 added.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
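
The code-string conventions above can be sketched as a small interpreter. `interpretReplInput` and its return shape are hypothetical; the sketch also does not try to handle input that already starts with `I.`, since the commit message does not say how that case is treated.

```javascript
// Hypothetical mapping from a REPL input string to an action.
function interpretReplInput(code) {
  const input = code.trim();
  if (input === '') return { action: 'step' };         // empty line: run the next step
  if (input === 'resume') return { action: 'resume' }; // continue the test
  if (input === 'exit') return { action: 'exit' };     // abort the test
  // "=>" prefix: evaluate the rest as raw JavaScript.
  if (input.startsWith('=>')) return { action: 'eval', js: input.slice(2).trim() };
  // Default: treat the input as a method call on the actor I.
  return { action: 'eval', js: `I.${input}` };
}
```

For example, `interpretReplInput('click("Login")')` produces the expression `I.click("Login")`, while `interpretReplInput('=> 1 + 1')` produces `1 + 1`.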

run_test now spawns its subprocess in pause yield mode and returns early with {status:"paused"} when the test hits pause(). The agent then drives the REPL through the new "pause" tool, which only takes a code string.

Drops the standalone pause_session.start action, since pause only makes sense when a test is already running. Resume / step / exit are just code values (matching the TTY pause REPL conventions).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…container

Previously pause yield mode spawned a test subprocess and shuttled JSON-line messages through stdin/stdout. That was a lot of plumbing for something the existing run_step_by_step tool already does cleanly: run codecept in-process in the MCP server itself.

Now lib/pause.js exposes setPauseHandler/setNextStep. The MCP server installs a handler at startup that turns pause() into a Promise the agent controls. run_test races bootstrap+run() vs that paused promise; on pause it returns {status:"paused"} with the test promise stashed at module level. The pause tool drives the REPL by running code through the same I that the test is using, no IPC. resume/exit await the test promise and return the final reporter result.

Drops: pauseChild, pauseProtocolWaiters, pauseProcessChunk, mcpYieldSession, emitMcpProtocol, ensureMcpReadline, the CODECEPTJS_MCP* env detection in lib/pause.js. The TTY readline path is unchanged.

Net: 270 added, 526 removed across pause/mcp files.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
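
The handler-plus-race idea can be illustrated in isolation. Everything below is a self-contained stand-in, not the code from lib/pause.js or bin/mcp-server.js: only the name `setPauseHandler` and the `{status:"paused"}` shape come from the commit message, while `runTest`, `resume`, and the module-level variables are hypothetical.

```javascript
// Stand-ins for lib/pause.js.
let pauseHandler = null;
function setPauseHandler(handler) { pauseHandler = handler; }
function pause() {
  // Without a handler the real pause() would use the TTY readline REPL.
  return pauseHandler ? pauseHandler() : Promise.resolve();
}

// Hypothetical server-side state.
let notifyPaused = null; // resolves the "paused" side of the race
let releasePause = null; // resolves the promise pause() is awaiting
let pendingRun = null;   // test promise stashed while paused

// Installed once at startup: pause() becomes a promise the agent controls.
setPauseHandler(() => new Promise((resolve) => {
  releasePause = resolve;
  if (notifyPaused) notifyPaused();
}));

async function runTest(testFn) {
  const paused = new Promise((resolve) => { notifyPaused = resolve; });
  const run = testFn().then((result) => ({ status: 'completed', result }));
  // Whichever settles first decides what the tool call returns.
  const outcome = await Promise.race([run, paused.then(() => ({ status: 'paused' }))]);
  pendingRun = outcome.status === 'paused' ? run : null;
  return outcome;
}

async function resume() {
  if (!pendingRun) throw new Error('no paused test');
  const run = pendingRun;
  pendingRun = null;
  releasePause(); // let the test continue past pause()
  return run;     // resolves with the final result
}
```

Because the test runs in the same process, the paused test and any code the agent runs share the same objects, which is what removes the need for IPC.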

The pause tool was duplicating the TTY pause REPL (empty/resume/exit magic strings, => prefix, default I.<expr>) when MCP already has run_code for running code against the live container. Both tools share the same I, so during a paused test, run_code is the right surface for code execution.

Replace pause with a simple "continue" tool that just releases the paused test and returns the final reporter result. Drop setNextStep: there is no step-by-step mode for MCP (use run_step_by_step if needed).

Net: 55 added, 152 removed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The previous patch hijacked process.stdout.write at the start of run_test and only restored it inside collectRunCompletion (i.e., on continue). That muted the MCP SDK's own protocol writes during the pause window, so any run_code or continue response would be lost.

Reuse the existing withSilencedIO helper instead. Wrap run_test's race and continue's await-pending-run inside it, so stdout is muted while codecept is producing step output and restored before the tool returns its MCP response. The MCP SDK writes responses on a clean stdout.

While paused, the test is suspended (handler promise unresolved), so no test output is being produced and there is no need to mute. run_code calls during pause go through the existing run_code handler, which has its own isolation pattern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
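
The scoping fix relies on muting stdout only for the duration of a wrapped call. The sketch below shows one way such a helper could be written; it is an assumption about the general pattern, not the actual `withSilencedIO` from bin/mcp-server.js, and it handles stdout only.

```javascript
// Assumed shape of a withSilencedIO-style helper; the real one may differ.
async function withSilencedIO(fn) {
  const originalWrite = process.stdout.write;
  // Swallow writes, but still call the optional callback so writers don't hang.
  process.stdout.write = (chunk, encoding, callback) => {
    const done = typeof encoding === 'function' ? encoding : callback;
    if (typeof done === 'function') done();
    return true;
  };
  try {
    return await fn();
  } finally {
    // Always restore, even on failure, so the MCP response goes out normally.
    process.stdout.write = originalWrite;
  }
}
```

Restoring in `finally` is the key difference from the previous patch: the mute can never outlive the call that installed it.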

run_test now accepts an optional pauseAt (1-based step index). The MCP server tracks step.after events; when stepIndex matches pauseAt, it schedules pauseNow() through the recorder so the test pauses between steps. Useful as a programmatic breakpoint without editing the test; the agent gets step indices via the list CLI or run_step_by_step.

The paused response now includes:

- pausedAfter: { index, name, status } of the last completed step
- page: { url, title, contentSize } via the live helper
- suggestions: which tool to call next (snapshot / run_code / continue)

lib/pause.js gains pauseNow(), which schedules a one-shot pauseSession via recorder.add. This is the same mechanism as the in-test pause() but without re-attaching the global event listeners.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
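
The pauseAt bookkeeping amounts to counting completed steps and triggering a pause on a match. A minimal sketch, assuming the tracker is fed from step.after events; `createStepTracker`, `onStepAfter`, and `pausedPayload` are hypothetical names, and the `page` and `suggestions` fields are left out because they depend on the live helper.

```javascript
// Hypothetical tracker; in the server this would be driven by step.after events.
function createStepTracker(pauseAt, pauseNow) {
  let stepIndex = 0;
  let pausedAfter = null;
  return {
    onStepAfter(step) {
      stepIndex += 1; // 1-based, matching pauseAt
      if (stepIndex === pauseAt) {
        pausedAfter = { index: stepIndex, name: step.name, status: step.status };
        pauseNow(); // stand-in for scheduling a pause through the recorder
      }
    },
    pausedPayload() {
      return { status: 'paused', pausedAfter };
    },
  };
}
```

Since the counter only matches once, the pause fires a single time and the rest of the test runs to the end after it is released.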

Previously run_step_by_step ran the whole test to completion in one call and returned a fat blob of per-step artifacts. That's the aiTrace plugin's job, not an interactive tool's.

Now it pauses after every step using the same pauseNow + handler machinery as run_test's pauseAt: agent calls run_step_by_step, gets back a paused payload after step 1, calls continue to advance to step 2, and so on. At any pause they can run_code / snapshot to inspect state.

continue is unified: it races "test paused again" vs "test completed", so the same call works for run_step_by_step (re-pauses each time), pauseAt (runs to end), and explicit pause() in the test (runs to end). Module-level pendingTestFile / pendingStepInfo carry the paused-payload data through repeated continue cycles.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
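
What makes continue work for repeated pauses is that each call arms a fresh "paused" promise before releasing the test. The sketch below shows that re-arming in isolation; `createSession`, `start`, and `continueRun` are hypothetical names, and the session object stands in for the module-level state the commit mentions.

```javascript
// Hypothetical in-process session: every cycle races a fresh "paused"
// promise against the single long-lived test promise.
function createSession() {
  let onPaused = null; // resolves the currently armed "paused" promise
  let release = null;  // resumes the test from its current pause
  let run = null;      // settles when the test finishes

  const armPaused = () => new Promise((resolve) => { onPaused = resolve; });

  return {
    // Called from the test side whenever it should pause.
    pauseNow: (info) => new Promise((resolve) => {
      release = resolve;
      onPaused({ status: 'paused', pausedAfter: info });
    }),
    start(testFn) {
      const paused = armPaused();
      run = testFn().then((result) => ({ status: 'completed', result }));
      return Promise.race([paused, run]);
    },
    continueRun() {
      const paused = armPaused(); // arm before releasing so no pause is missed
      release();
      return Promise.race([paused, run]);
    },
  };
}
```

With a test that pauses after every step, each `continueRun()` returns another paused payload until the final call returns the completed result; with a single pause, the first `continueRun()` already runs to the end.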

DavertMik and others added 8 commits April 30, 2026 02:53

DavertMik merged commit 7aef4e5 into 4.x Apr 30, 2026
10 checks passed

DavertMik deleted the feat/pause-mcp-yield-mode branch April 30, 2026 20:34

DavertMik merged 8 commits into 4.x from feat/pause-mcp-yield-mode
