
feat(mcp): per-test plugin overrides + shell session lifecycle#5547

Merged
DavertMik merged 12 commits into 4.x from feat/mcp-plugins-and-shell-session
May 1, 2026

@DavertMik
Contributor

Summary

  • Plugins per test run. run_test and run_step_by_step now accept a plugins object mirroring the CLI -p flag — e.g. { "screencast": { "saveScreenshots": true }, "aiTrace": { "on": "fail" }, "pause": true }. Keys are plugin names from lib/plugin/, values are options (or true / {} for defaults). Plugins are merged into config.plugins[name] with enabled: true and the container is torn down + re-initialized whenever the plugin set changes between calls.
  • Real shell session lifecycle. start_browser now does what codeceptjs shell does: container init → codecept.bootstrap() → recorder.start() → emit suite.before and test.before. stop_browser emits the matching after events and runs codecept.teardown(). Plugins and listeners that hook into per-suite / per-test setup now actually fire during MCP usage.
  • run_code / snapshot require a session. Calling either without an active shell session (or a paused test) now returns a clear error pointing the agent at start_browser or run_test. Avoids the silent-broken-state issue where these tools used to "work" but with no plugin/listener setup behind them.
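
The plugin merge and the "re-initialize only when the plugin set changes" rule from the first bullet can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: mergePlugins, pluginSignature and needsReinit are hypothetical names, and comparing a key-sorted JSON signature is just one plausible way to detect a changed plugin set.

```javascript
// Hypothetical helpers for illustration; not the actual mcp-server.js code.

// Merge a per-call `plugins` payload into a copy of the config.
// A value of `true` or `{}` means "enable the plugin with default options".
function mergePlugins(config, plugins = {}) {
  const merged = { ...config, plugins: { ...(config.plugins || {}) } }
  for (const [name, value] of Object.entries(plugins)) {
    const options = value === true ? {} : value
    merged.plugins[name] = { ...(merged.plugins[name] || {}), ...options, enabled: true }
  }
  return merged
}

// Build a stable signature of a plugin set so that key order alone
// never counts as a change.
function pluginSignature(plugins = {}) {
  const sortKeys = value => {
    if (Array.isArray(value)) return value.map(sortKeys)
    if (value && typeof value === 'object') {
      return Object.fromEntries(Object.keys(value).sort().map(key => [key, sortKeys(value[key])]))
    }
    return value
  }
  return JSON.stringify(sortKeys(plugins))
}

// Signature of the plugin set used by the previous call (assumed module state).
let lastSignature = pluginSignature({})

// Returns true when the container would have to be torn down and re-initialized.
function needsReinit(plugins = {}) {
  const signature = pluginSignature(plugins)
  const changed = signature !== lastSignature
  lastSignature = signature
  return changed
}
```

Under this sketch, two consecutive calls with the same payload leave the container alone, while a call that adds, removes or reconfigures a plugin triggers a re-init.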

Test plan

  • Cold start → run_code errors with the session-required hint.
  • start_browser → run_code works; plugins enabled in config (e.g. screenshotOnFail) get their suite.before / test.before setup.
  • run_test with plugins: { screencast: { saveScreenshots: true } } produces a video and per-step screenshots in output/.
  • run_test with plugins: { aiTrace: { on: "fail" } } against a failing test produces an aiTrace trace.md with HTML/ARIA/screenshot.
  • Two sequential run_test calls with the same plugins payload do NOT re-init; calls with a different set do re-init (browser restarts).
  • start_browser followed by run_test (mocha-driven) does not double-emit suite.before / test.before; bootstrap hook runs once.
  • run_test with pause() in the test → run_code works (paused test path counts as active session).
  • No-arg run_test (no plugins) behaves as before.
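
The session requirement exercised by the cold-start and paused-test items above could look roughly like the guard below. It is a minimal sketch under assumptions: the session object, requireSession and runCode are hypothetical stand-ins, and the error wording is invented for illustration rather than copied from the server.

```javascript
// Hypothetical session state; the real server may track this differently.
const session = { shellActive: false, pausedTest: false }

// Throw a descriptive error unless a shell session or a paused test is active.
function requireSession(toolName) {
  if (session.shellActive || session.pausedTest) return
  throw new Error(
    `${toolName} requires an active session: call start_browser first, ` +
      'or use run_test with a test that calls pause().'
  )
}

// Stand-in for the run_code tool handler: check the session, then "execute".
function runCode(code) {
  requireSession('run_code')
  return `executed: ${code}`
}
```

Failing early with a message that names start_browser and run_test gives the agent a concrete next step instead of a tool call that appears to succeed with no plugin or listener setup behind it.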

🤖 Generated with Claude Code

DavertMik and others added 12 commits April 26, 2026 22:27
- run_test / run_step_by_step accept a `plugins` object that mirrors
  the CLI `-p` flag (e.g. `{ screencast: { saveScreenshots: true },
  aiTrace: { on: 'fail' }, pause: true }`). Container is re-initialized
  when the plugin set changes between calls.
- start_browser / stop_browser now drive a full shell session like
  `codeceptjs shell`: bootstrap, recorder.start, suite.before /
  test.before on start; matching after events plus codecept.teardown
  on stop.
- run_code / snapshot now require an active session (shell or paused
  test) and return a clear error pointing the agent at start_browser
  or run_test otherwise. Plugins and listeners that depend on
  suite.before / test.before now fire correctly during MCP usage.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move artifact-on-disk reading from mcp-server.js into a TraceReader class
in lib/utils/trace.js. Python-style indexing via first / last / nth, kept
generic across kinds (aria / html / screenshot / console / storage). Sort
by filename — aiTrace's zero-padded step prefix means a lexical sort is
chronological.

run_code uses it to diff ARIA between the last aiTrace capture and the
new one produced by the steps inside this call:

  const reader = new TraceReader(currentAiTraceDir)
  const before = reader.last('aria')
  // run code, aiTrace captures per step
  const after = reader.last('aria')
  if (before !== after) result.ariaDiff = ariaDiff(before, after)

initCodecept now force-enables aiTrace whenever the MCP server initializes
the container — it's the canonical per-step capture, no point in MCP doing
its own grabAriaSnapshot.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
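
For orientation, the first / last / nth indexing over lexically sorted artifact files described in the commit above could be sketched as below. This is not the real lib/utils/trace.js: the class name TraceReaderSketch and the file naming scheme "<zero-padded step>_<kind>.<ext>" are assumptions made for the example, and it reads files as UTF-8 text, so it only suits text kinds such as aria or html.

```javascript
const fs = require('node:fs')
const path = require('node:path')

// Illustrative only; not the actual TraceReader from lib/utils/trace.js.
// Assumes artifacts are named "<zero-padded step>_<kind>.<ext>", e.g. "0003_aria.txt".
class TraceReaderSketch {
  constructor(dir) {
    this.dir = dir
  }

  // List the files of one kind in chronological order.
  files(kind) {
    if (!fs.existsSync(this.dir)) return []
    return fs
      .readdirSync(this.dir)
      .filter(name => name.includes(`_${kind}.`))
      .sort() // zero-padded step prefixes make lexical order chronological
  }

  // Python-style indexing: negative indexes count from the end.
  nth(kind, index) {
    const list = this.files(kind)
    const i = index < 0 ? list.length + index : index
    if (i < 0 || i >= list.length) return null
    return fs.readFileSync(path.join(this.dir, list[i]), 'utf8')
  }

  first(kind) {
    return this.nth(kind, 0)
  }

  last(kind) {
    return this.nth(kind, -1)
  }
}
```

Returning null for a missing capture keeps the before / after comparison in run_code simple: if no new ARIA snapshot was produced, both reads return the same content and no diff is attached.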
- docs/agents.md: new top-level page covering the MCP loop (open the
  page → read → run a CodeceptJS command → check → commit), how the
  agent reads page artifacts, and where MCP fits relative to pause().
- lib/aria.js: trim INTERACTIVE_ROLES to roles that actually take
  user input (drop container roles like grid/tablist/menubar);
  remove IGNORED_ROLES unwrap, icon-button auto-naming, and
  bool/null coercion in attribute values. Names are always
  emitted; attribute values are passed through as plain strings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@DavertMik DavertMik merged commit 9a39deb into 4.x May 1, 2026
9 of 11 checks passed
@DavertMik DavertMik deleted the feat/mcp-plugins-and-shell-session branch May 1, 2026 02:09