Skip to content

refactor(revive): replace headless runner with copy-prompt UX#33

Merged
justi merged 8 commits into
mainfrom
feat/revive-copy-prompt
May 1, 2026
Merged

refactor(revive): replace headless runner with copy-prompt UX#33
justi merged 8 commits into
mainfrom
feat/revive-copy-prompt

Conversation

@justi
Copy link
Copy Markdown
Owner

@justi justi commented Apr 29, 2026

Summary

  • Drop the headless Claude Code orchestrator (revive_runner) and replace it with a copy-prompt UX.
  • "Scaffold (revive init)" for missing projects, "Copy suggest prompt" / "Copy audit prompt" for everything else; prompts render in st.code with Streamlit's built-in copy button so the user pastes them into their own Claude Code session.
  • Net removal of ~600 lines of subprocess + backup/rollback machinery and the opt-in e2e harness; 429 unit tests pass with no e2e tail.

Why

The headless path tried to imitate what Claude Code already does well (read repo, edit files, ask for approval) and misaligned with revive's "fresh agent session" requirement for audit. Running the prompt in the user's own session gives full visibility, no surprise cost, and matches revive's design.

Test plan

  • .venv/bin/python -m pytest -q — 429 passed
  • .venv/bin/ruff check . — clean
  • .venv/bin/ruff format --check . — clean
  • Open dashboard, click through Revive section in detail page for a missing / placeholder / stub / configured project; verify prompt renders with copy button

The headless Claude Code path (revive_runner) tried to imitate what
Claude Code itself already does well: read a repo, edit files, ask
the user for approval. It also misaligned with revive's documented
"fresh agent session" requirement for the audit pass — every headless
call started fresh anyway, but the user lost interactive visibility.

Switch to copy-prompt as the only UX:
- "Scaffold (revive init)" for missing projects — pure file op, no
  LLM call.
- "Copy suggest prompt" for placeholder/stub/configured — generates
  the prompt via revive suggest, renders it inline in st.code with
  Streamlit's built-in copy button. User pastes it into a fresh
  Claude Code session in the project, edits happen with the user's
  own approval flow.
- "Copy audit prompt" for configured projects — same pattern, but
  for the second-pass gap audit. The prompt is meant to be pasted
  into a NEW Claude Code session because revive's design depends on
  the clean context window to surface non-inferable facts.
- "Preview brief" remains for configured/stub projects.

Removed:
- src/armillary/revive_runner.py — headless orchestrator and rollback
  machinery (backup file, was_missing marker, accept/reject lifecycle)
- tests/test_revive_runner.py — 22 tests for runner internals
- tests/test_revive_runner_e2e.py — 3 opt-in e2e tests
- pyproject [tool.pytest.ini_options] e2e marker config
- CLAUDE.md e2e usage section

Added:
- generate_suggest_prompt(path) and generate_audit_prompt(path) in
  revive_service — return the prompt string for UI rendering
- run_revive_init(path) in revive_service — pure file-op scaffold
- 9 new unit tests covering the three new helpers (success, missing
  binary, nonzero exit, timeout)

Net: 429 tests pass, full removal of subprocess Claude orchestration
and ~600 lines of runner/test scaffolding. Bulk auto-fill is no
longer in scope for v1; if it returns later it should build on the
prompt-generation primitives instead of re-spawning headless agents.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 29, 2026 05:21
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the “Revive” integration away from a headless Claude Code runner to a copy-prompt workflow, where the dashboard scaffolds .revive/static.md (via revive init) and then renders revive suggest / revive audit prompts for the user to paste into their own Claude Code sessions.

Changes:

  • Remove the revive_runner headless orchestration module and its unit/e2e test coverage.
  • Add generate_suggest_prompt, generate_audit_prompt, and run_revive_init helpers to revive_service.
  • Update the Streamlit detail UI to scaffold and copy prompts (stored in st.session_state) instead of spawning Claude.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
tests/test_revive_service.py Adds unit tests for the new prompt/init helpers in revive_service.
tests/test_revive_runner_e2e.py Removes opt-in e2e tests tied to the deleted headless runner.
tests/test_revive_runner.py Removes unit tests tied to the deleted headless runner.
src/armillary/ui/detail_revive.py Reworks Revive UI to scaffold + copy prompts and display them via st.code.
src/armillary/revive_service.py Introduces prompt generation helpers and revive init runner; factors shared subprocess logic.
src/armillary/revive_runner.py Deletes the headless Claude orchestration implementation.
pyproject.toml Removes pytest e2e marker/addopts configuration (no longer needed).
CLAUDE.md Removes documentation about e2e revive_runner tests.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +116 to +121
success, output = run_revive_init(project_path)
if success:
st.success("Scaffolded `.revive/static.md`.")
else:
st.error(f"`revive init` failed: {output}")
st.rerun()
Comment thread src/armillary/revive_service.py Outdated
except OSError as exc:
raise ReviveError(f"revive {subcommand} failed: {exc}") from exc
if result.returncode != 0:
raise ReviveError(f"revive {subcommand} failed: {result.stderr.strip()}")
Comment thread src/armillary/ui/detail_revive.py Outdated
Comment on lines +119 to +120
else:
st.error(f"`revive init` failed: {output}")
justi and others added 4 commits April 29, 2026 14:29
Two helper actions that turn the copy-prompt UX into a one-two click
flow: click Copy to write the prompt to the system clipboard, click
Launch Claude (yolo) to spawn an iTerm tab in the project running
`claude --dangerously-skip-permissions`. Paste happens immediately
in the freshly opened session.

revive_service:
- copy_to_clipboard(text) — pbcopy subprocess wrapper, returns
  False (never raises) when pbcopy is missing so non-mac users still
  see the inline st.code block as a fallback.
- launch_claude_yolo(project_path) — osascript invocation that
  mirrors the iterm-claude-yolo builtin launcher. Stays Project-free
  so the revive UI module does not need Config / Project plumbing.

UI:
- _render_prompt_actions adds a horizontal row under each prompt
  block: Copy to clipboard (primary), Launch Claude (yolo)
  (secondary), Dismiss (tertiary). Replaces the old standalone
  Dismiss button.

Tests: 5 new — pbcopy success / missing binary, osascript
invocation captures the project path and the dangerously-skip-
permissions flag, missing osascript, nonzero exit. 434 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three feedback items, all valid:

1. Inline `st.success` / `st.error` followed by `st.rerun()` in
   `_render_init_button` was wiped before the user could read the
   message. Switch to `st.toast`, which Streamlit persists across
   the rerun for ~4 seconds.

2. `_run_prompt_command` only put `result.stderr.strip()` into the
   ReviveError message. revive subcommands occasionally write
   diagnostics to stdout instead of stderr, and stderr can also be
   empty on some failure modes. Stitch stderr → stdout → exit code
   so the surfaced error is never blank.

3. `_render_init_button` prepended its own "`revive init` failed:"
   prefix on top of `run_revive_init`'s already-prefixed message
   ("revive init failed: ..." in the OSError path), producing
   awkward double-prefixed text. Surface the service message
   verbatim instead.

Two new tests pin the stdout-fallback and exit-code-fallback paths
in the prompt error stitching. 436 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous flow rendered the prompt body as a big st.code block and
required a second click on a Copy / Launch / Dismiss row beneath it.
The intermediate block dominated the section and made it unclear
which button to press. Replace it with three top-level buttons that
each do one thing in one click:

- Copy suggest prompt → fetch via `revive suggest` AND write to
  clipboard in the same handler. Toast confirms.
- Copy audit prompt → same shape for `revive audit`.
- Launch Claude (yolo) → open an iTerm tab running
  `claude --dangerously-skip-permissions`. Independent button,
  shown for any non-missing state, paired with the copy buttons.

The session_state-stored prompt content and the dedicated
_render_pending_prompts / _render_prompt_actions helpers are gone;
the prompt no longer lingers in the UI. User flow is now: click Copy
suggest, click Launch Claude, paste in the new tab. Two clicks, no
visual clutter.

Stub-state info banner copy follows: drop "below", clarify that the
prompt goes via clipboard.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Six findings from codex review, applied:

HIGH 1. iTerm command was unquoted and AppleScript-injectable.
  launch_claude_yolo now shlex-quotes the project path inside the
  inner shell command and escapes backslashes / double quotes for
  the surrounding AppleScript string literal. New test exercises a
  path containing spaces, quotes, semicolons and a $HOME expansion.

MED 2. Capability probe accepted a revive missing the very subcommands
  this PR introduces. _REQUIRED_SUBCOMMANDS now includes init,
  suggest and audit; the cache test fixture is updated and a new
  test pins the regression.

MED 3. Clipboard fallback dropped the prompt when pbcopy was missing.
  When copy_to_clipboard returns False, the UI now stores the prompt
  in session_state and renders it in a manual-select text_area with
  a Dismiss button, so non-mac users still have a way to grab it.

LOW 5. unknown brief state used to render only the Launch button.
  Add the Preview button for unknown plus a warning explaining that
  static.md exists but could not be parsed (encoding / permissions).

LOW 6. Launch said success even when claude was missing on PATH.
  Preflight shutil.which("claude") in launch_claude_yolo and bail
  with a clear message if it is missing. Toast text now says "Sent
  to iTerm — paste the prompt in the new tab" so it does not imply
  the agent actually started, and a new test covers the preflight.

MED 7. Test gaps around the riskiest new paths. Three new tests:
  - launch_claude_yolo with a path full of metacharacters
  - launch_claude_yolo with claude missing
  - probe_capability requiring init/suggest/audit

Deferred (not in this commit):
  - codex MED 4 (toast vs persistent status). Streamlit toasts
    survive a single rerun by design and stay on screen ~4s; the
    inline-status alternative bloats the section. Revisit if real
    users miss feedback.
  - codex LOW 8 (revive_service is 486 lines, ADR target is 400).
    Splitting into status / actions modules is a clean follow-up
    but expands this PR's diff; tracked as separate work.

Tests: 439 pass, ruff check + format clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Refactors the Revive integration to remove the headless Claude Code runner and replace it with a UI that scaffolds via revive init and generates copy/paste prompts (revive suggest / revive audit) for the user to run in their own Claude Code session.

Changes:

  • Removes the revive_runner headless orchestrator and its unit/e2e tests.
  • Adds prompt-generation, clipboard, and iTerm-launch helpers to revive_service, plus corresponding unit tests.
  • Updates the Streamlit detail-page Revive section to drive the new scaffold/copy/audit flow.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
src/armillary/ui/detail_revive.py Replaces “generate brief” runner UX with scaffold/copy/audit/preview actions.
src/armillary/revive_service.py Adds suggest/audit prompt generation, revive init runner, clipboard helper, and iTerm “yolo” launcher.
tests/test_revive_service.py Adds unit coverage for new revive_service functions and updated capability probing.
src/armillary/revive_runner.py Removed (no longer orchestrating headless Claude runs).
tests/test_revive_runner.py Removed (runner tests no longer applicable).
tests/test_revive_runner_e2e.py Removed (opt-in paid e2e runner tests removed with runner).
pyproject.toml Removes pytest e2e marker/addopts config (no e2e suite remains).
CLAUDE.md Removes documentation about the now-deleted e2e revive runner tests.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +210 to +217
def generate_suggest_prompt(project_path: Path, *, timeout: float = 10.0) -> str:
"""Run `revive suggest` and return its stdout (an LLM-ready prompt).

The user pastes this into a fresh Claude Code session in the target
project to fill INVARIANTS / GOTCHAS interactively. Raises
``ReviveError`` on missing binary, timeout, or nonzero exit.
"""
return _run_prompt_command(project_path, "suggest", timeout=timeout)
Comment on lines +269 to +299
if shutil.which("osascript") is None:
return False, "osascript not on PATH (macOS-only feature)"
if shutil.which("claude") is None:
# The iTerm tab will still open, but `claude` will fail there.
# Surface the missing binary up front so the user does not see
# a 🚀 toast followed by a `command not found` in their tab.
return False, "`claude` CLI not on PATH — install Claude Code first"
quoted_path = shlex.quote(str(project_path))
inner_shell = f"cd {quoted_path} && claude --dangerously-skip-permissions"
inner_applescript = _escape_applescript_string(inner_shell)
args = [
"osascript",
"-e",
'tell application "iTerm"',
"-e",
"activate",
"-e",
"tell current window",
"-e",
"create tab with default profile",
"-e",
"tell current session",
"-e",
f'write text "{inner_applescript}"',
"-e",
"end tell",
"-e",
"end tell",
"-e",
"end tell",
]
Comment on lines +171 to +179
def _generate_and_copy(project_path: Path, fetch, label: str) -> None:
"""Run a prompt-producing revive subcommand, then push to clipboard.

if result.skipped_reason:
reason_label = _SKIP_REASON_LABELS.get(
result.skipped_reason, result.skipped_reason
)
st.warning(f"Skipped: {reason_label}", icon=":material/warning:")
if st.button("Dismiss", key=f"revive_dismiss_skip_{project_path}"):
st.session_state.pop(key, None)
st.rerun()
Single-click UX: no intermediate render of the prompt, no second
button. Toast on success/failure so the message survives Streamlit
reruns. When ``pbcopy`` is unavailable (Linux/Windows), we keep the
generated prompt in session_state so the user still has a way to
grab it from a fallback text area instead of losing the result.
"""
justi and others added 2 commits April 29, 2026 16:12
The status caption used to read "✓ configured · no hook" — fine on
its own, but it gave the user no signal about how stale the brief
was. To know whether the saved brief was a week old or a year old
they had to click Preview, scan the markdown, and infer.

Add `last_modified` (datetime | None) to ReviveStatus, populated
from `static_path.stat().st_mtime` whenever the file exists. Render
it inline as the third dot-separated chunk: e.g.

    ✓ configured · global hook · updated 3d ago

Reuses the existing `_format_age` helper from detail_work for
consistent formatting (just-now / Xmin / Xh / Xd / Xmo).

Tests cover both the no-static-md case (last_modified is None) and
a populated-static-md case with a pinned mtime so the assertion is
deterministic. 440 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previously the entire actions row was a single horizontal container
with up to four buttons crammed side-by-side. In the configured
state that meant Copy suggest, Copy audit, Launch Claude, and
Preview brief all sharing one row, each button squeezed narrow
enough that the labels truncated and the parent column scrolled
horizontally.

Restructure the layout so each Copy button sits next to its own
Launch Claude (yolo) on a 2-column row, then Preview brief lives on
its own full-width row beneath:

  configured  → [ Copy suggest | Launch ]
                [ Copy audit   | Launch ]
                [ Preview brief        ]
  stub        → [ Copy suggest | Launch ]
                [ Preview brief        ]
  placeholder → [ Copy suggest | Launch ]
  unknown     → [ Preview brief        ]
  missing     → [ Scaffold             ]

Each button uses width="stretch" so it fills its column. The two
Launch buttons get distinct slot keys ("suggest" / "audit") to
avoid Streamlit's duplicate-widget-key error.

No behavior change beyond layout.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Refactors the Revive integration to remove the headless claude subprocess orchestrator and replace it with a copy/paste prompt workflow driven from the Streamlit UI (plus a convenience “launch Claude” helper).

Changes:

  • Replaces revive_runner headless generation/backup lifecycle with revive init scaffolding + revive suggest/revive audit prompt generation in revive_service.
  • Updates the project detail Revive UI to scaffold, copy prompts, optionally launch Claude in iTerm, and show brief age.
  • Removes the old runner unit tests + opt-in e2e runner tests, and drops the pytest e2e marker config.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
tests/test_revive_service.py Expands coverage for new prompt/init/clipboard/launch helpers and last-modified status.
tests/test_revive_runner_e2e.py Deletes opt-in e2e coverage tied to the removed headless runner.
tests/test_revive_runner.py Deletes unit tests for the removed headless runner module.
src/armillary/ui/detail_revive.py Replaces “Generate brief” headless UX with scaffold + copy-prompt + launch actions and age display.
src/armillary/revive_service.py Adds new Revive actions: suggest/audit prompt generation, init scaffolding, clipboard copy, and iTerm launch helper.
src/armillary/revive_runner.py Removes the headless Claude orchestration implementation entirely.
pyproject.toml Removes pytest e2e marker/addopts config since e2e tests are removed.
CLAUDE.md Removes documentation about the deleted e2e runner tests.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/armillary/ui/detail_revive.py Outdated
revive_show,
run_revive_init,
)
from armillary.ui.detail_work import _format_age
Comment on lines +228 to +233
if copy_to_clipboard(prompt):
st.session_state.pop(fallback_key, None)
st.toast(
f"{label.capitalize()} prompt copied — paste in a Claude tab.",
icon="📋",
)
Comment thread src/armillary/revive_service.py Outdated
Comment on lines +248 to +251
Returns True on success, False if pbcopy is missing or errors. The
caller is responsible for surfacing that to the UI; this helper
intentionally never raises so a missing pbcopy on Linux/Windows
gracefully degrades to "select and copy from st.code".
Comment on lines +293 to +296
args = [
"osascript",
"-e",
'tell application "iTerm"',
Two findings worth applying from review 4207441936:

- detail_revive imported the underscore-prefixed `_format_age` from
  detail_work, coupling two UI modules through a private helper.
  Move it to ui/helpers.py where shared formatters belong (CLAUDE.md
  already lists helpers.py as the home for cross-module utilities).
- copy_to_clipboard's docstring claimed the fallback was "select and
  copy from st.code", but the UI renders a st.text_area in
  _render_fallback_prompt. Update the docstring to point at the real
  fallback so future readers don't chase a non-existent st.code path.

Skipped: the st.code-vs-pbcopy comment (deliberate single-click UX,
already discussed in the codex round) and the osascript which()
versus argv hardcoding nit (single-threaded TOCTOU, premature
hardening).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@justi justi merged commit b6f2199 into main May 1, 2026
3 checks passed
@justi justi deleted the feat/revive-copy-prompt branch May 1, 2026 06:38
justi added a commit that referenced this pull request May 1, 2026
PR #33 shipped the Revive integration as a first-class dashboard
surface (detail page section with scaffold + copy-prompt + iTerm
launch actions) but the README still listed only the pre-Revive
feature set. Add a Features bullet pointing at the dashboard surface
and noting the `revive` CLI prerequisite, and bump the test-count
comment from 375 to 440+ (current local + CI run) with `revive`
appended to the covered-areas list so future readers see Revive in
the same scope as the other integration domains.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
justi added a commit that referenced this pull request May 4, 2026
All five inline comments from copilot's review were valid:

1. revive_enhanced.py:55 — `block.path` is absolute in production
   (scanner stores `str(abs_path)` in code_block_service.py:210), so
   the bullet line was rendering `invoicer//Users/.../invoicer/src/x`.
   New `_display_path` helper strips the `repo_path` prefix when
   present, falling back to the absolute string on any structural
   surprise so we never lose the file pointer.

2. revive_enhanced.py:44 — `steal()` can raise on missing/corrupt
   code_index.db or a SQLite build without FTS5. Vanilla revive
   (PR #33) is a complete tier on its own, so the helper now wraps
   the steal call in `try/except` and returns the brief verbatim if
   STEAL_HITS computation fails. Reverses the MN8 line in the
   original ADR 0030 contract — graceful degradation matters more
   than learning about steal-side incidents through a broken MCP
   call.

3. mcp_tools.py:379 — `armillary_revive` was the only MCP tool
   without a response-size budget. A pathological brief or a giant
   indexed code block could blow up the transport. Added a clamp at
   the wrapper boundary to `_RESPONSE_MAX_CHARS` (the same 20k limit
   the other tools use) with a clear `[truncated…]` marker.

4. tests/test_revive_enhanced.py:316 — the happy-path e2e was
   storing `path="src/price.py"` (relative), which masked the bug
   in #1. Switched to an absolute path under the temp `other_repo`
   dir (with the file actually created on disk) so the test exercises
   the same prefix-stripping logic that prod hits.

5. mcp_server.py:48 — module docstring still said "Six tools".
   Updated to seven and added the `armillary_revive` bullet.

New regression tests:
- `test_generate_enhanced_brief_falls_back_when_steal_raises` —
  proves vanilla brief survives a steal-side exception.
- `test_generate_enhanced_brief_renders_relative_block_path` —
  feeds an absolute `block.path` and asserts only the relative
  segment appears in the bullet.
- `test_armillary_revive_clamps_to_response_budget` — feeds a
  helper output > 20k and asserts truncation + marker.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants