feat(mcp): per-test plugin overrides + shell session lifecycle by DavertMik · Pull Request #5547 · codeceptjs/CodeceptJS

DavertMik · 2026-04-30T23:17:47Z

Summary

Plugins per test run. run_test and run_step_by_step now accept a plugins object mirroring the CLI -p flag — e.g. { "screencast": { "saveScreenshots": true }, "aiTrace": { "on": "fail" }, "pause": true }. Keys are plugin names from lib/plugin/, values are options (or true / {} for defaults). Plugins are merged into config.plugins[name] with enabled: true and the container is torn down + re-initialized whenever the plugin set changes between calls.
Real shell session lifecycle. start_browser now does what codeceptjs shell does: container init → codecept.bootstrap() → recorder.start() → emit suite.before and test.before. stop_browser emits the matching after events and runs codecept.teardown(). Plugins and listeners that hook into per-suite / per-test setup now actually fire during MCP usage.
run_code / snapshot require a session. Calling either without an active shell session (or a paused test) now returns a clear error pointing the agent at start_browser or run_test. Avoids the silent-broken-state issue where these tools used to "work" but with no plugin/listener setup behind them.
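For readers who want to see the merge rule concretely, here is a minimal sketch of how a per-call `plugins` payload could be folded into `config.plugins`. The helper names `mergePluginOverrides` and `pluginSetKey` are hypothetical and are not taken from the PR. Comparing the full payload (rather than only the plugin names) to decide on a re-init is also an assumption.

```javascript
// Hypothetical helper: fold a per-call `plugins` payload into a config object.
// `true` and `{}` both mean "enable with default options".
function mergePluginOverrides(config, plugins = {}) {
  const merged = { ...config, plugins: { ...(config.plugins || {}) } }
  for (const [name, value] of Object.entries(plugins)) {
    const options = value === true ? {} : value
    merged.plugins[name] = { ...(merged.plugins[name] || {}), ...options, enabled: true }
  }
  return merged
}

// Hypothetical change detector: a stable string key for a plugin payload,
// so two calls with the same payload (in any key order) compare equal.
function pluginSetKey(plugins = {}) {
  const sortKeys = value =>
    value && typeof value === 'object' && !Array.isArray(value)
      ? Object.fromEntries(Object.keys(value).sort().map(k => [k, sortKeys(value[k])]))
      : value
  return JSON.stringify(sortKeys(plugins))
}

const base = { plugins: { screenshotOnFail: { enabled: true } } }
const merged = mergePluginOverrides(base, {
  screencast: { saveScreenshots: true },
  aiTrace: { on: 'fail' },
  pause: true,
})
console.log(merged.plugins)
```

A caller could store the key from the previous call and re-initialize the container only when the new key differs.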

Test plan

Cold start → run_code errors with the session-required hint.
start_browser → run_code works; plugins enabled in config (e.g. screenshotOnFail) get their suite.before / test.before setup.
run_test with plugins: { screencast: { saveScreenshots: true } } produces a video and per-step screenshots in output/.
run_test with plugins: { aiTrace: { on: "fail" } } against a failing test produces an aiTrace trace.md with HTML/ARIA/screenshot.
Two sequential run_test calls with the same plugins payload do NOT re-init; calls with a different set do re-init (browser restarts).
start_browser followed by run_test (mocha-driven) does not double-emit suite.before / test.before; bootstrap hook runs once.
run_test with pause() in the test → run_code works (paused test path counts as active session).
No-arg run_test (no plugins) behaves as before.
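The cold-start and paused-test cases in the plan above can be illustrated with a small guard. This is a sketch only: the `state` object, the `requireSession` helper, and the error wording are hypothetical stand-ins, not the server's actual code.

```javascript
// Hypothetical session state; the real server tracks this internally.
const state = { shellSessionActive: false, pausedTest: false }

// Throws unless a shell session or a paused test is active.
function requireSession(toolName) {
  if (state.shellSessionActive || state.pausedTest) return
  throw new Error(`${toolName} requires an active session: call start_browser or run_test first`)
}

function runCode(code) {
  requireSession('run_code')
  return `executed: ${code}` // placeholder for real execution
}

// Cold start: no session yet, so the guard rejects the call.
let coldStartError = null
try {
  runCode("I.see('Welcome')")
} catch (err) {
  coldStartError = err.message
}

// After start_browser (simulated here by flipping the flag) the call goes through.
state.shellSessionActive = true
const result = runCode("I.see('Welcome')")
console.log(coldStartError, '|', result)
```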

🤖 Generated with Claude Code

- run_test / run_step_by_step accept a `plugins` object that mirrors the CLI `-p` flag (e.g. `{ screencast: { saveScreenshots: true }, aiTrace: { on: 'fail' }, pause: true }`). Container is re-initialized when the plugin set changes between calls.
- start_browser / stop_browser now drive a full shell session like `codeceptjs shell`: bootstrap, recorder.start, suite.before / test.before on start; matching after events plus codecept.teardown on stop.
- run_code / snapshot now require an active session (shell or paused test) and return a clear error pointing the agent at start_browser or run_test otherwise.

Plugins and listeners that depend on suite.before / test.before now fire correctly during MCP usage.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
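The start/stop ordering described in this commit can be sketched with stand-ins that only record their names. Everything here is illustrative: `codecept`, `recorder`, and `emit` are stubs, container init is omitted, and the `test.after` / `suite.after` order on stop is an assumption (the commit only says "matching after events").

```javascript
// Stand-ins for the codecept instance, recorder, and event dispatcher.
// Each one just records its name so the call order is visible.
const calls = []
const codecept = {
  bootstrap: async () => calls.push('bootstrap'),
  teardown: async () => calls.push('teardown'),
}
const recorder = { start: () => calls.push('recorder.start') }
const emit = name => calls.push(`emit ${name}`)

async function startSession() {
  await codecept.bootstrap()
  recorder.start()
  emit('suite.before')
  emit('test.before')
}

async function stopSession() {
  // Assumed order: "after" events mirror the "before" events in reverse.
  emit('test.after')
  emit('suite.after')
  await codecept.teardown()
}

const done = startSession().then(stopSession).then(() => calls)
done.then(order => console.log(order.join(' -> ')))
```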

Move artifact-on-disk reading from mcp-server.js into a TraceReader class in lib/utils/trace.js. Python-style indexing via first / last / nth, kept generic across kinds (aria / html / screenshot / console / storage). Sort by filename: aiTrace's zero-padded step prefix means a lexical sort is chronological.

run_code uses it to diff ARIA between the last aiTrace capture and the new one produced by the steps inside this call:

    const reader = new TraceReader(currentAiTraceDir)
    const before = reader.last('aria')
    // run code, aiTrace captures per step
    const after = reader.last('aria')
    if (before !== after) result.ariaDiff = ariaDiff(before, after)

initCodecept now force-enables aiTrace whenever the MCP server initializes the container. It is the canonical per-step capture, so there is no point in MCP doing its own grabAriaSnapshot.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
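To show why the lexical sort plus first / last / nth is enough, here is a simplified stand-in for the reader. It is not the class from lib/utils/trace.js: it works on an in-memory list of file names and returns names rather than file contents, and the `<step>_<kind>.<ext>` naming is a hypothetical format chosen for the example.

```javascript
// Simplified stand-in for the TraceReader described above.
class TraceReaderSketch {
  constructor(fileNames) {
    // Zero-padded step prefixes make a plain lexical sort chronological.
    this.files = [...fileNames].sort()
  }

  // All files of one kind (aria, html, screenshot, ...), oldest first.
  ofKind(kind) {
    return this.files.filter(name => name.includes(`_${kind}.`))
  }

  first(kind) {
    return this.nth(kind, 0)
  }

  last(kind) {
    return this.nth(kind, -1)
  }

  // Python-style index: negative values count from the end.
  nth(kind, index) {
    const matches = this.ofKind(kind)
    const i = index < 0 ? matches.length + index : index
    return matches[i] ?? null
  }
}

const reader = new TraceReaderSketch([
  '0010_aria.txt',
  '0002_aria.txt',
  '0002_screenshot.png',
  '0001_aria.txt',
])
console.log(reader.first('aria'), reader.last('aria'), reader.nth('aria', -2))
```

Without the zero padding the same sort would place step 10 before step 2, which is why the padded prefix matters.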

- docs/agents.md: new top-level page covering the MCP loop (open the page → read → run a CodeceptJS command → check → commit), how the agent reads page artifacts, and where MCP fits relative to pause().
- lib/aria.js: trim INTERACTIVE_ROLES to roles that actually take user input (drop container roles like grid/tablist/menubar); remove IGNORED_ROLES unwrap, icon-button auto-naming, and bool/null coercion in attribute values. Names are always emitted; attribute values are passed through as plain strings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

DavertMik and others added 12 commits April 26, 2026 22:27

update docs

f572b8e

updated docs, added browser plugin

1ae964b

Merge branch '4.x' of github.com:codeceptjs/CodeceptJS into 4.x

4326bcc

Merge branch '4.x' of github.com:codeceptjs/CodeceptJS into 4.x

82e760f

Merge branch '4.x' of github.com:codeceptjs/CodeceptJS into 4.x

a14f976

Merge branch '4.x' of github.com:codeceptjs/CodeceptJS into 4.x

3adc3f1

Merge branch '4.x' of github.com:codeceptjs/CodeceptJS into 4.x

867f7e3

Merge branch '4.x' of github.com:codeceptjs/CodeceptJS into 4.x

a57d587

Merge branch '4.x' of github.com:codeceptjs/CodeceptJS into 4.x

af8de03

DavertMik merged commit 9a39deb into 4.x May 1, 2026
9 of 11 checks passed

DavertMik deleted the feat/mcp-plugins-and-shell-session branch May 1, 2026 02:09
