Skip to content

Fix checkpoint/session path-traversal: validate IDs at read/dispatch boundaries#1365

Merged
Soph merged 13 commits into
mainfrom
soph/fix-checkpoint-path-traversal
Jun 4, 2026
Merged

Fix checkpoint/session path-traversal: validate IDs at read/dispatch boundaries#1365
Soph merged 13 commits into
mainfrom
soph/fix-checkpoint-path-traversal

Conversation

@Soph
Copy link
Copy Markdown
Collaborator

@Soph Soph commented Jun 4, 2026

https://entire.io/gh/entireio/cli/trails/512

Summary

Closes a path-traversal / arbitrary-file-write class across the checkpoint, session, and agent-lifecycle paths. The root issue: identifiers read from the shared entire/checkpoints/v1 branch (attacker-influenceable) or from agent hook input flowed into filesystem paths without validation at the read/restore boundary.

Original vulnerability: entire session resume / entire checkpoint rewind used the metadata SessionID to build the transcript write path with no validation. A crafted absolute or ../-laden session ID (or, for Codex/Pi, an absolute ID returned verbatim) let attacker-controlled transcript bytes overwrite arbitrary files (e.g. ~/.bashrc) → RCE on the next resume, with no prompt.

What changed (11 commits)

Core fix

  • Validate session IDs at the two restore choke points (resolveTranscriptPath, RestoreLogsOnly) before any path construction.

Validator hardening

  • ValidateSessionID now also rejects ., .., and absolute paths (closes the bare-.. dir-component vector and the Windows drive edge).

Lifecycle dispatch (right altitude)

  • Validate SessionID, ToolUseID, and SubagentID once in DispatchLifecycleEvent, before routing — so no handler (e.g. handleLifecycleTurnEnd's .entire/metadata/<id>/ write, handleLifecycleSubagentEnd's subagent-transcript read) can forget.

os.Root containment (defense in depth)

  • Rewind working-tree restore creates dirs via new osroot.MkdirAll (kernel-enforced containment), matching the already-rooted file write.
  • .git/entire-sessions/ writes (state Save + hint/marker files) scoped to os.Root.

Other boundaries

  • Pi captureTranscript validates before writing.
  • GetNextCheckpointSequence (posttodo hook) validates before the directory read.
  • ExtractSpawnedAgentIDs (claudecode + factoryaidroid) drops non-path-safe subagent IDs, so agent-<id>.jsonl reads are path-safe by construction.
  • Documented the ResolveSessionFile security contract on the Agent interface + Codex/Copilot contract regression tests.

Audit results

  • All CommittedMetadata fields read from checkpoints were traced: CheckpointID (12-hex at unmarshal) and InvestigateRunID (validateRunID) are validated at read; every other field is display/logic-only. ToolUseID has a write/read validation asymmetry but no live read→path sink.
  • All os.WriteFile/Open/OpenFile/Rename/Symlink/Create sinks were swept: every attacker-influenceable path is now validated, hash-derived, or os.Root-contained.

Testing

  • New regression tests at each boundary (resume/rewind, dispatcher, Pi, osroot, validators, agent contracts).
  • mise run check green (fmt + lint + unit + integration + Vogon canary). 388 integration tests pass; all touched packages lint clean.

🤖 Generated with Claude Code


Note

High Risk
Changes authentication-adjacent filesystem behavior (resume/rewind transcript writes, lifecycle metadata paths) using attacker-influenceable checkpoint metadata; mistakes could break legitimate sessions or leave traversal gaps.

Overview
This PR closes a path-traversal / arbitrary-file-write class of bugs where session, tool-use, and subagent IDs from checkpoint metadata (entire/checkpoints/v1) or agent hooks were joined into filesystem paths without checks—enabling crafted IDs to overwrite files outside agent/session directories (e.g. on resume or rewind).

Validation is tightened at read/dispatch boundaries: ValidateSessionID now rejects ., .., and absolute paths; DispatchLifecycleEvent validates SessionID, ToolUseID, and SubagentID once before any handler; restore paths resolveTranscriptPath and RestoreLogsOnly reject unsafe IDs from metadata; Pi captureTranscript, GetNextCheckpointSequence, and subagent ID extraction (Claude Code / Factory Droid) add matching guards. The ResolveSessionFile security contract is documented, with Codex/Copilot contract tests.

Defense in depth: rewind working-tree restore uses osroot.MkdirAll under a repo os.Root; .git/entire-sessions/ state save, hints, markers, and cleanup route through os.Root. Regression tests cover resume, dispatcher, Pi, osroot, and validators.

Reviewed by Cursor Bugbot for commit 98f8e37. Configure here.

Soph and others added 11 commits June 4, 2026 12:16
…write

Session IDs read from checkpoint metadata on the shared
entire/checkpoints/v1 branch flowed into agent.ResolveSessionFile +
WriteSession during `entire session resume` and `entire checkpoint
rewind` with no validation. A crafted absolute or "../"-laden session
ID escaped the agent session directory (and for Codex/Pi, which return
absolute IDs verbatim, landed anywhere), letting attacker-controlled
transcript bytes overwrite arbitrary files such as ~/.bashrc — RCE on
the next resume, with no prompt shown to the victim.

Validate the session ID with validation.ValidateSessionID at the two
restore choke points before any path construction:

  - resolveTranscriptPath (covers resume-single, rewind restore, attach)
  - RestoreLogsOnly write + status loops (multi-session resume/rewind)

This mirrors the invariant already enforced when checkpoints are
written, so it cannot reject a legitimately-created checkpoint while
closing every separator/absolute traversal. Unsafe IDs are rejected
(resolveTranscriptPath) or skipped with a warning (RestoreLogsOnly).

Adds end-to-end and choke-point regression tests proving a traversal
session ID writes nothing outside the agent session directory.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 6963a4a931b1
ValidateSessionID previously rejected only path separators and empty
input. A bare "." or ".." is separator-free yet still traverses when an
agent uses the ID as a path segment (e.g. Copilot CLI builds
<dir>/<id>/events.jsonl), and the separator check can miss
platform-specific absolute forms (Windows drive paths). Reject both.

Because the same validator guards both checkpoint writes and the
resume/rewind restore boundaries, this tightens the whole class without
affecting UUID-style IDs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 388c3d8a3b31
When `entire checkpoint rewind` restores files from a checkpoint tree,
the file write already used os.Root (kernel-enforced containment), but
the preceding directory creation used a raw
os.MkdirAll(filepath.Join(repoRoot, f.Name)). A crafted tree entry name
containing ".." from an untrusted checkpoint could thus create empty
directories outside the repo before the guarded write refused to escape.

Add osroot.MkdirAll, which creates each level via os.Root.Mkdir so the
kernel rejects any path that escapes the root, and use it in the restore
loop. Both filesystem operations in that loop are now containment-safe.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: f52cf53648bf
The PostToolUse (TodoWrite) hook passed input.SessionID and the task
tool-use ID straight into SessionMetadataDirFromSessionID /
TaskMetadataDir and then os.ReadDir, with no validation — unlike every
other hook entry point. A crafted "../.." could redirect the directory
listing. Validate both IDs at this choke point; an invalid value simply
starts the checkpoint sequence at 1.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: ddf2fe4a50fe
…tests

Document on the Agent.ResolveSessionFile interface that some agents use
agentSessionID as a directory component or (Codex/Pi) return it verbatim
when absolute, so callers sourcing the ID from untrusted data MUST
validate it first — the resume/rewind choke points do.

Add contract-pinning tests for the two unusual agents: Codex (absolute
returned verbatim) and Copilot (ID as directory component). Each asserts
both the agent behavior and that ValidateSessionID rejects the dangerous
shape, so the guard cannot regress out from under these agents.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 383105db9469
The session-state writers key their filenames on the session ID. The ID
is already validated, so this is belt-and-suspenders: route the writes
through an os.Root scoped to the entire-sessions directory so the kernel
makes escaping the directory impossible even if a validation gap were
ever introduced.

  - session/state.go: Save's atomic rename now goes through os.Root
    (Load and Clear already did). Drops the now-unused stateFilePath.
  - strategy/session_state.go: StoreModelHint, StoreAgentTypeHint,
    ClaimSessionStartBanner, LoadModelHint, LoadAgentTypeHint and
    ClearSessionState now operate via os.Root. Two small helpers
    (openSessionStateRoot / ...ForRead) remove the repeated preamble.

Go 1.26's os.Root natively supports Rename/OpenFile(O_EXCL)/WriteFile,
so the atomic-rename and first-writer-wins semantics are preserved.

The sibling entire-session-locks/ dir is intentionally left as-is: its
flock semantics are inode-bound and it is a separate directory.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: f95ae94b9edc
Go 1.26's os.Root has a native MkdirAll, so the hand-rolled per-segment
Mkdir loop (with its ToSlash/Trim/path.Join handling and extra imports)
collapses to a one-line delegation — battle-tested and stats before
creating. Also:

  - drop the dead `dir != ""` guard in the rewind restore loop
    (path.Dir never returns "")
  - drop TestMkdirAll_EmptyOrDotIsNoop, which exercised the removed
    Trim/no-op logic; the call site never passes ""/"."
  - correct the now-stale osroot package doc that listed MkdirAll as
    unsupported

Pure cleanup from a /simplify review; no behavior change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 38a921fb7470
Pi's ParseHookEvent calls captureTranscript on agent_end, which builds
the destination path from the hook-supplied session ID and writes the
transcript — all before the lifecycle dispatcher validates the ID in
DispatchLifecycleEvent. A "../"-laden ID could thus write outside the
.entire/tmp/pi cache directory.

Validate the session ID at the top of captureTranscript (the choke point
where it becomes a path, covering both call sites); an unsafe ID returns
"" — the function's existing "no capture" signal — so dispatch behavior
is unchanged. Also corrects the gosec comment on the write that wrongly
claimed the ID was already validated.

Taint is the local Pi hook payload (same-privilege), so severity is low;
this is defense-in-depth consistent with the other hook hardening in this
branch. The sibling cacheSessionID writes a fixed filename, not an
ID-derived path, so it needs no guard.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: ddfd40f73b22
handleLifecycleTurnEnd builds .entire/metadata/<session_id>/ via
os.MkdirAll + os.WriteFile directly from the hook-supplied event.SessionID
with no validation — unlike its siblings (SessionStart/TurnStart/ToolUse),
which each validate. A "../"-laden ID could write the transcript outside
the metadata directory. The ModelUpdate/Compaction/SubagentEnd handlers
were also unguarded, surviving only because the strategy layer
(MutateSessionState/StoreModelHint) validates internally.

Per-handler validation is the wrong altitude: it is exactly the kind of
check a new (or existing) handler forgets, which is what happened here.
Validate non-empty event.SessionID once in DispatchLifecycleEvent, before
routing, so every handler — current and future — is covered uniformly.
Empty IDs still pass through to each handler's own empty-handling (e.g.
TurnEnd's fallback to the "unknown" constant).

The existing per-handler validations are now redundant but left in place
as harmless defense-in-depth; they could be consolidated separately.

Taint is local hook input (same-privilege), so severity is low; this
completes the hook-path hardening theme on this branch.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 0a614e131bab
CalculateTotalTokenUsage and ExtractAllModifiedFiles build
agent-<id>.jsonl from subagent IDs and read that file, with no local
validation of the ID. Today the IDs are path-safe only because their
sole source, extractAgentIDFromText, happens to accept just [a-zA-Z0-9].
That makes the path-safety of a file read depend on the incidental
character set of a parser two calls away — a fragile coupling: relaxing
extractAgentIDFromText (e.g. to accept hyphenated UUIDs) would silently
open a traversal-read of arbitrary agent-*.jsonl-shaped paths.

Enforce the invariant at the choke point: ExtractSpawnedAgentIDs now
drops any ID that fails validation.ValidateAgentID, so every downstream
agent-<id>.jsonl consumer (claudecode + factoryaidroid, token usage and
modified-files) is path-safe by construction.

Not a live vulnerability — the current parser already constrains the ID,
so the guard is a no-op for today's inputs and adds no behavior change.
No test is added: the guarded branch is unreachable through the real
extractor (it can't emit a non-path-safe ID), so a test would only
exercise dead input.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: c56a01dc268e
…cleEvent

handleLifecycleSubagentEnd builds a subagent transcript path from
event.SubagentID (AgentTranscriptPath -> filepath.Join), then stats and
reads it via ExtractModifiedFilesFromOffset — before the value is
validated. A "../"-laden SubagentID escapes transcriptDir for a file
read. The task-checkpoint *write* paths (CapturePreTaskState,
WriteTemporaryTask) already validate ToolUseID/AgentID, but this read
ran first and unguarded.

Extend the central dispatcher guard (added for SessionID) to also reject
path-unsafe event.ToolUseID (ValidateToolUseID) and event.SubagentID
(ValidateAgentID) before routing, so every handler that turns these
hook-supplied identifiers into a path is covered at one choke point.
Empty values pass through to each handler's own empty-handling.

Taint is local hook input (same-privilege) and the sink is a read, so
severity is low; this closes the last identifier in the lifecycle path
family.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 92ddb0963aa9
Copilot AI review requested due to automatic review settings June 4, 2026 14:02
@Soph Soph requested a review from a team as a code owner June 4, 2026 14:02
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens filesystem boundaries against path traversal/arbitrary write by validating attacker-influenceable identifiers (session/tool-use/subagent IDs) at restore/dispatch choke points and by routing more sensitive writes through os.Root for kernel-enforced containment.

Changes:

  • Tightens ID validation (notably SessionID) and applies it at transcript restore + lifecycle dispatch boundaries.
  • Adds os.Root-scoped helpers/usages to prevent directory-escape during rewind restore and session-state writes/cleanup.
  • Introduces regression tests for traversal attempts across resume/rewind, dispatcher, Pi transcript capture, osroot helpers, and agent contracts.

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
cmd/entire/cli/validation/validators.go Hardens ValidateSessionID to reject additional traversal primitives.
cmd/entire/cli/validation/validators_test.go Adds tests for new ValidateSessionID hardening cases.
cmd/entire/cli/transcript.go Validates session IDs at transcript restore path resolution.
cmd/entire/cli/transcript_test.go Adds regression tests ensuring transcript path resolution rejects traversal.
cmd/entire/cli/strategy/session_state.go Scopes session hint/marker IO via os.Root; updates cleanup logic.
cmd/entire/cli/strategy/manual_commit_rewind.go Uses os.Root-contained directory creation during rewind restore; validates IDs in restore paths.
cmd/entire/cli/state.go Validates IDs before using them as filesystem path components for checkpoint sequencing.
cmd/entire/cli/session/state.go Uses os.Root for the final atomic rename when saving session state.
cmd/entire/cli/resume_test.go Adds E2E-style regression test preventing arbitrary file writes during resume.
cmd/entire/cli/osroot/osroot.go Adds osroot.MkdirAll wrapper around rooted directory creation.
cmd/entire/cli/osroot/osroot_test.go Tests rooted MkdirAll behavior including traversal rejection.
cmd/entire/cli/lifecycle.go Centralizes lifecycle identifier validation before routing to handlers.
cmd/entire/cli/lifecycle_test.go Adds tests ensuring dispatcher rejects unsafe IDs across event types.
cmd/entire/cli/agent/pi/lifecycle.go Validates session ID before Pi transcript capture/write.
cmd/entire/cli/agent/pi/lifecycle_test.go Adds regression test preventing traversal via Pi capture path.
cmd/entire/cli/agent/factoryaidroid/transcript.go Filters extracted spawned-agent IDs through path-safe validation.
cmd/entire/cli/agent/copilotcli/security_contract_test.go Pins Copilot ResolveSessionFile layout assumptions to validator behavior.
cmd/entire/cli/agent/codex/security_contract_test.go Pins Codex “absolute path verbatim” behavior to validator behavior.
cmd/entire/cli/agent/claudecode/transcript.go Filters extracted spawned-agent IDs through path-safe validation.
cmd/entire/cli/agent/agent.go Documents the ResolveSessionFile security contract and required validation boundary.

Comment thread cmd/entire/cli/validation/validators.go
Comment thread cmd/entire/cli/validation/validators_test.go
Comment thread cmd/entire/cli/strategy/session_state.go Outdated
Three issues from PR #1365 review:

- ValidateSessionID accepted Windows drive-relative paths ("C:foo"):
  separator-free and not reported absolute by filepath.IsAbs, yet
  filepath.Join drops the base dir on a volume name. Reject ":" (the
  volume separator) — portable and testable; no legitimate session ID
  contains a colon.

- ValidateSessionID accepted glob metacharacters ("*", "?", "["). Session
  IDs are interpolated into filepath.Glob patterns (agent transcript
  lookup in gemini/pi/codex, session-state cleanup), so a "*" could match
  unrelated files. Reject them at the validator choke point.

- StateStore.Clear and ClearSessionState globbed "<sessionID>.*" to find
  files to DELETE. With the permissive validator a session ID of "*"
  would have matched and removed every session's state (DoS/data-loss
  within the state dir). Replace globbing with literal prefix matching so
  deletion can never depend on pattern interpretation, even if a future
  caller skips validation.

Adds regression tests for the drive-relative and glob-metacharacter cases.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: ed740e10abe5
pjbgf
pjbgf previously approved these changes Jun 4, 2026
Comment thread cmd/entire/cli/validation/validators.go Outdated
Co-authored-by: Paulo Gomes <paulo@entire.io>
@Soph Soph enabled auto-merge June 4, 2026 16:28
@Soph Soph merged commit 1cf0e84 into main Jun 4, 2026
9 checks passed
@Soph Soph deleted the soph/fix-checkpoint-path-traversal branch June 4, 2026 16:31
@cursor cursor Bot mentioned this pull request Jun 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants