Skip to content

feat: add resume session feature (Claude Code-inspired append-only JSONL)#467

Open
ahmedk20 wants to merge 1 commit into
usestrix:mainfrom
ahmedk20:feat/resume-session
Open

feat: add resume session feature (Claude Code-inspired append-only JSONL)#467
ahmedk20 wants to merge 1 commit into
usestrix:mainfrom
ahmedk20:feat/resume-session

Conversation

@ahmedk20
Copy link
Copy Markdown

Summary

This PR adds a resume session feature that lets users continue a past penetration test scan from where it left off — including the full LLM conversation history, iteration state, and
agent context.

Rather than a snapshot checkpoint (which loses progress on crashes), this uses an append-only conversation.jsonl file that writes every LLM message to disk in real-time, inspired by how
Claude Code persists sessions.

  • strix --list-sessions — list all past scans in a table
  • strix --continue / strix -c — resume the most recent session
  • strix --resume <run_name> — resume a specific session by name
  • strix --resume — open an interactive picker (TUI modal or CLI numbered list)
  • Resuming a completed scan reopens it and asks what to investigate next

Why not the checkpoint approach (from test-resume branch)?

The previous test-resume attempt saved a checkpoint.json snapshot periodically. The fatal flaw: a crash between saves loses all progress since the last checkpoint. It also blocked
resuming completed scans and only worked in --non-interactive mode.

Architecture

Two new files per run:
strix_runs/{run_name}/conversation.jsonl │ Full LLM conversation, appended in real-time
strix_runs/{run_name}/session_meta.json │ Lightweight sidecar for fast listing (no need to parse the full log)
conversation.jsonl entry types:
{"type": "session_start", "scan_config": {...}, "schema_version": 1, ...}
{"type": "message", "role": "user", "content": "...", "iteration": 1, ...}
{"type": "message", "role": "assistant", "content": [...], "iteration": 1, ...}
{"type": "iteration_end", "iteration": 1, "context": {...}, "completed": false, ...}
{"type": "session_end", "completed": true, ...}

On resume: replay message entries → AgentState.messages, latest iteration_end → iteration/context.

Crash safety: even a hard kill (kill -9) leaves all messages written. The worst case is a missing iteration_end marker — the iteration count is inferred from message count instead.

Changes

New files:

  • strix/telemetry/conversation_log.py — ConversationLog class: real-time JSONL append + replay() for reconstruction
  • strix/telemetry/session_meta.py — atomic read/write of session_meta.json sidecar
  • strix/sessions/listing.py — enumerate strix_runs/, return sorted SessionRow list with legacy fallback for old runs
  • strix/sessions/resume.py — load_resume_bundle() + apply_resume_to_args() + merge_into_agent_config()
  • strix/sessions/init.py — public API surface
  • strix/interface/session_picker_cli.py — rich.Table + Prompt.ask interactive picker for --non-interactive mode
  • strix/interface/session_picker_tui.py — Textual ModalScreen with live search, DataTable, Resume/Cancel buttons

Modified files:

  • strix/agents/state.py — added PrivateAttr _conversation_log; hooked add_message to append to JSONL in real-time
  • strix/agents/base_agent.py — wires ConversationLog at init (root agent only); calls append_iteration_end after each loop iteration
  • strix/telemetry/tracer.py — writes session_meta.json on set_scan_config and finalizes it in cleanup(); calls write_session_end on shutdown
  • strix/interface/main.py — adds --resume [RUN_NAME], -c/--continue, --list-sessions; makes --target optional when resume flags present
  • strix/interface/cli.py — injects restored AgentState into agent_config["state"]; skips startup banner on resume
  • strix/interface/tui.py — pushes SessionPickerScreen on mount when --resume (no arg); skips splash on direct resume

Test plan

  • 19 unit tests added in tests/test_resume_feature.py — all passing
    • ConversationLog: roundtrip, thinking_blocks, corrupt line skipping, crash-safe partial replay, empty file
    • session_meta: write/read, merge preserves user fields (title/tags), atomic write, corrupt/missing file
    • listing: empty root, returns rows, most_recent with/without conv_log, query filter, get by name

Manual QA checklist:

  • strix --list-sessions with 0 / 1 / many runs
  • strix -c resumes most recent; works after completed scan
  • strix --resume <run_name> resumes directly
  • strix --resume opens TUI picker; search filters, Enter resumes
  • Hard-kill mid-scan then resume — all messages recovered
  • Completed scan reopens and asks for follow-up
  • Old runs (no session_meta.json) appear via legacy fallback
  • Corrupt conversation.jsonl shows error, doesn't crash

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 26, 2026

Greptile Summary

This PR introduces an append-only JSONL session-persistence layer (conversation.jsonl + session_meta.json) and the corresponding CLI/TUI commands (--resume, -c/--continue, --list-sessions) for resuming past penetration test scans. The architecture is well-thought-out and crash-safe by design, with good unit test coverage for the new modules.

  • P1 – _scan_mode_explicit guard is never set (strix/sessions/resume.py): the branch meant to respect an explicit --scan-mode override on resume is always taken, so the bundle's saved mode silently overwrites whatever the user passes at the command line.
  • Three additional issues carried over from prior review threads remain open: the "reopen" synthetic message not being persisted to JSONL, the broken iteration_count expression in _finalize_session_meta, and the non-atomic read-modify-write race in write_session_meta.

Confidence Score: 3/5

Not ready to merge — one new P1 plus three unresolved P1s from prior review rounds.

A new P1 (_scan_mode_explicit never set, so --scan-mode on resume is silently ignored) joins three previously identified P1s (broken iteration_count, 'reopen' message not persisted, non-atomic session-meta write) that are still open. Multiple P1s push the score below the 4/5 ceiling.

strix/sessions/resume.py (new P1), strix/telemetry/tracer.py (broken iteration_count), strix/telemetry/session_meta.py (race condition)

Important Files Changed

Filename Overview
strix/sessions/resume.py Core resume logic; _scan_mode_explicit guard is never set so the bundle's scan_mode always silently overwrites the user's CLI choice (new P1 finding).
strix/telemetry/tracer.py Adds session-meta writing and conversation-log finalisation; iteration_count generator is broken (flagged in prior thread), otherwise additions look correct.
strix/telemetry/session_meta.py Atomic metadata sidecar; the read→merge→write cycle has a known race condition under concurrent callers (flagged in prior review thread).
strix/telemetry/conversation_log.py New append-only JSONL logger with thread-safe writes and crash-safe replay; implementation is clean and well-tested.
strix/interface/main.py Adds resume/list-sessions CLI flags and bootstrap logic; "__PICK__" magic sentinel can collide with a real run name.
strix/interface/session_picker_tui.py New Textual modal for session picking; contains dead code in compose() that should be removed.
strix/interface/session_picker_cli.py Interactive CLI picker with rich table; straightforward and correct.
strix/agents/base_agent.py Wires ConversationLog at init and appends iteration_end after each loop; correct for non-resume path, but the 'reopen' synthetic message ordering issue is noted in prior thread.
strix/agents/state.py Adds PrivateAttr _conversation_log and hooks add_message to write to JSONL in real-time; clean implementation.
strix/interface/cli.py Skips startup banner on resume and merges restored AgentState into agent_config; correct.
strix/interface/tui.py TUI session picker integration; scan_config is rebuilt after picker resolves and set_run_name exists on Tracer.
strix/sessions/init.py Public API surface re-export; correct.
strix/sessions/listing.py Clean session enumeration with legacy fallback; filter, sort, and limit logic is correct.
tests/test_resume_feature.py 19 unit tests covering roundtrip, crash safety, corrupt-line skipping, and listing; good coverage of the new modules.

Comments Outside Diff (1)

  1. strix/sessions/resume.py, line 933-935 (link)

    P1 _scan_mode_explicit is never set — user's --scan-mode override silently ignored

    _scan_mode_explicit is never written anywhere in the codebase, so getattr(args, "_scan_mode_explicit", False) always returns False, making not False always True. The branch therefore always overwrites args.scan_mode with the bundle's saved value — even when the user explicitly passes --scan-mode quick on a resume command. The user's intent is silently discarded.

    The fix is to set args._scan_mode_explicit = True in parse_args() when --scan-mode is present, or to drop the guard entirely if overriding with the bundle's mode is always the desired behaviour.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: strix/sessions/resume.py
    Line: 933-935
    
    Comment:
    **`_scan_mode_explicit` is never set — user's `--scan-mode` override silently ignored**
    
    `_scan_mode_explicit` is never written anywhere in the codebase, so `getattr(args, "_scan_mode_explicit", False)` always returns `False`, making `not False` always `True`. The branch therefore **always** overwrites `args.scan_mode` with the bundle's saved value — even when the user explicitly passes `--scan-mode quick` on a resume command. The user's intent is silently discarded.
    
    The fix is to set `args._scan_mode_explicit = True` in `parse_args()` when `--scan-mode` is present, or to drop the guard entirely if overriding with the bundle's mode is always the desired behaviour.
    
    How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: strix/sessions/resume.py
Line: 933-935

Comment:
**`_scan_mode_explicit` is never set — user's `--scan-mode` override silently ignored**

`_scan_mode_explicit` is never written anywhere in the codebase, so `getattr(args, "_scan_mode_explicit", False)` always returns `False`, making `not False` always `True`. The branch therefore **always** overwrites `args.scan_mode` with the bundle's saved value — even when the user explicitly passes `--scan-mode quick` on a resume command. The user's intent is silently discarded.

The fix is to set `args._scan_mode_explicit = True` in `parse_args()` when `--scan-mode` is present, or to drop the guard entirely if overriding with the bundle's mode is always the desired behaviour.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: strix/interface/main.py
Line: 247-248

Comment:
**`"__PICK__"` sentinel blocks resuming a session literally named `__PICK__`**

`nargs="?"` with `const="__PICK__"` means `--resume __PICK__` sets `args.resume = "__PICK__"`, which is then treated as the "open picker" signal rather than a run name. Any user who happened to have a session folder named `__PICK__` cannot resume it by name and would get an unexpected interactive picker instead.

Consider using a dedicated flag (e.g. `--resume-pick` / `--resume-interactive`) instead of a magic sentinel to avoid the collision entirely.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: strix/interface/session_picker_tui.py
Line: 507-508

Comment:
**Dead code block in `compose`**

The `with self.app.focused if False else self:` construct is always `with self:` (the condition is a compile-time constant `False`) and the body is `pass`, so this block is entirely inert. The `# noqa: SIM210` comment suppresses the linter warning that would otherwise flag the pointless ternary. The two lines should be removed.

```suggestion
    def compose(self) -> ComposeResult:
        from textual.containers import Container, Horizontal
```

How can I resolve this? If you propose a fix, please make it concise.

Reviews (2): Last reviewed commit: "feat: add Claude Code-inspired resume se..." | Re-trigger Greptile

Comment thread strix/sessions/resume.py
Comment on lines +58 to +63
if mode == "reopen":
state.add_message(
"user",
"The previous scan session has been reopened. Please summarize the key "
"findings so far and ask what to investigate or test next.",
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 "Reopen" synthetic message not persisted to JSONL

When mode == "reopen", state.add_message(...) is called on a freshly-constructed AgentState whose _conversation_log is still Noneset_conversation_log isn't called until BaseAgent.__init__ later. The synthetic "user" message is added to the in-memory state.messages list but never written to conversation.jsonl.

The LLM then responds to it, and that assistant message IS written (the log is set by then). On a subsequent resume of this reopened session, replay() reconstructs a conversation that starts with an assistant message with no preceding user turn — an invalid message sequence that will likely cause an LLM API error or confuse the model.

Fix: write the synthetic message to the log explicitly after the log is wired up (e.g., in BaseAgent.__init__ after set_conversation_log, check for a pending synthetic message), or pass a flag so ConversationLog.append_message can be called immediately after the log is attached.

Prompt To Fix With AI
This is a comment left during a code review.
Path: strix/sessions/resume.py
Line: 58-63

Comment:
**"Reopen" synthetic message not persisted to JSONL**

When `mode == "reopen"`, `state.add_message(...)` is called on a freshly-constructed `AgentState` whose `_conversation_log` is still `None``set_conversation_log` isn't called until `BaseAgent.__init__` later. The synthetic `"user"` message is added to the in-memory `state.messages` list but never written to `conversation.jsonl`.

The LLM then responds to it, and that assistant message IS written (the log is set by then). On a subsequent resume of this reopened session, `replay()` reconstructs a conversation that starts with an assistant message with no preceding user turn — an invalid message sequence that will likely cause an LLM API error or confuse the model.

Fix: write the synthetic message to the log explicitly after the log is wired up (e.g., in `BaseAgent.__init__` after `set_conversation_log`, check for a pending synthetic message), or pass a flag so `ConversationLog.append_message` can be called immediately after the log is attached.

How can I resolve this? If you propose a fix, please make it concise.

Comment thread strix/telemetry/tracer.py
Comment on lines +896 to +905
"status": "completed" if completed else "errored",
"ended_at": datetime.now(UTC).isoformat(),
"iteration_count": max(
(a.get("iteration", 0) for ag in self.agents.values()
for a in ag.get("tool_executions", [])), default=0
),
"vulnerability_count": len(self.vulnerability_reports),
"agent_count": len(self.agents),
},
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 iteration_count expression crashes silently — always writes 0

self.agents[agent_id]["tool_executions"] is a list[int] (execution IDs appended in log_tool_execution_start), not a list of dicts. The generator expression calls .get("iteration", 0) on each int, which raises AttributeError. The surrounding try/except Exception: pass swallows the error silently, so iteration_count is never written to session_meta.json and the listing table always shows "?" for every session.

# current (broken) — iterates integers, not dicts
"iteration_count": max(
    (a.get("iteration", 0) for ag in self.agents.values()
     for a in ag.get("tool_executions", [])), default=0
),

The tracer doesn't store AgentState.iteration directly, so a straightforward fix is to read the final iteration from the last iteration_end entry written to the conversation log, or store the iteration on the tracer when append_iteration_end is called in BaseAgent.

Prompt To Fix With AI
This is a comment left during a code review.
Path: strix/telemetry/tracer.py
Line: 896-905

Comment:
**`iteration_count` expression crashes silently — always writes 0**

`self.agents[agent_id]["tool_executions"]` is a `list[int]` (execution IDs appended in `log_tool_execution_start`), not a list of dicts. The generator expression calls `.get("iteration", 0)` on each `int`, which raises `AttributeError`. The surrounding `try/except Exception: pass` swallows the error silently, so `iteration_count` is never written to `session_meta.json` and the listing table always shows `"?"` for every session.

```python
# current (broken) — iterates integers, not dicts
"iteration_count": max(
    (a.get("iteration", 0) for ag in self.agents.values()
     for a in ag.get("tool_executions", [])), default=0
),
```

The tracer doesn't store `AgentState.iteration` directly, so a straightforward fix is to read the final iteration from the last `iteration_end` entry written to the conversation log, or store the iteration on the tracer when `append_iteration_end` is called in `BaseAgent`.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +10 to +23
def write_session_meta(run_dir: Path, meta: dict[str, Any]) -> None:
"""Atomically merge *meta* into session_meta.json, preserving user fields."""
path = run_dir / "session_meta.json"
existing = _read_raw(path) or {}

merged = {**existing, **meta}
# Always preserve user-editable fields from existing file
merged["title"] = existing.get("title", merged.get("title"))
merged["tags"] = existing.get("tags", merged.get("tags", []))
merged["last_updated"] = datetime.now(UTC).isoformat()

tmp = path.with_suffix(".tmp")
tmp.write_text(json.dumps(merged, ensure_ascii=False, indent=2), encoding="utf-8")
os.replace(tmp, path)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Non-atomic read-modify-write in write_session_meta

os.replace makes the final rename atomic, but the read→merge→write cycle is not protected by a lock. If two threads call write_session_meta concurrently (e.g., _write_initial_session_meta from set_scan_config and _finalize_session_meta from cleanup racing on shutdown), both can read the same old state and the later write silently discards the earlier one's updates. Consider adding a per-file lock (similar to the one used in ConversationLog._append) around the entire read-merge-write block.

Prompt To Fix With AI
This is a comment left during a code review.
Path: strix/telemetry/session_meta.py
Line: 10-23

Comment:
**Non-atomic read-modify-write in `write_session_meta`**

`os.replace` makes the final rename atomic, but the read→merge→write cycle is not protected by a lock. If two threads call `write_session_meta` concurrently (e.g., `_write_initial_session_meta` from `set_scan_config` and `_finalize_session_meta` from `cleanup` racing on shutdown), both can read the same old state and the later write silently discards the earlier one's updates. Consider adding a per-file lock (similar to the one used in `ConversationLog._append`) around the entire read-merge-write block.

How can I resolve this? If you propose a fix, please make it concise.

@ahmedk20 ahmedk20 closed this Apr 28, 2026
@ahmedk20 ahmedk20 reopened this Apr 28, 2026
seanturner83 added a commit to seanturner83/strix that referenced this pull request Apr 29, 2026
Pulls usestrix#467 (feat: add resume
session feature) ahead of upstream merge. Enables unattended scans
to survive crashes or forced termination (e.g. AWS STS session
expiry mid-scan) by appending every LLM message to
strix_runs/<run>/conversation.jsonl and replaying on --resume.

New CLI:
  strix --list-sessions         # tabulate past scans
  strix --continue (or -c)      # resume most recent
  strix --resume <run_name>     # resume by name
  strix --resume                # open interactive picker

Motivates:
  https://github.com/seedcx/strix-scan-workflow/actions/runs/25116247499
  — trade-api flag-aware scan exited at 71min with
    APIConnectionError ("security token included in the request is
    expired"). Session had produced 10 findings; all lost once the
    report upload also hit ExpiredToken. With this commit merged,
    the CI composite action can wrap strix in a re-assume + --continue
    loop so each sub-session fits inside the 60min STS budget and
    total scan duration is bounded only by the overall workflow
    timeout.

Upstream integration:
  - Once usestrix#467 merges, drop from the seedcx-build recipe
  - Until then, include alongside usestrix#454 (memory compression), usestrix#460
    (truncate retry), usestrix#468 (thinking retry), this one, and the
    adaptive-thinking commit on this branch

Clean 3-way merge, no conflicts. Smoke tests:
  - uv sync succeeds
  - strix --help surfaces --resume / --continue / --list-sessions
  - imports (strix.sessions.resume, strix.telemetry.conversation_log) OK

Follow-up (separate): strix-scan-workflow README recipe + composite
action consumer changes to invoke resume on re-dispatch.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant