Fix checkpoint/session path-traversal: validate IDs at read/dispatch boundaries#1365
Merged
Conversation
…write Session IDs read from checkpoint metadata on the shared entire/checkpoints/v1 branch flowed into agent.ResolveSessionFile + WriteSession during `entire session resume` and `entire checkpoint rewind` with no validation. A crafted absolute or "../"-laden session ID escaped the agent session directory (and for Codex/Pi, which return absolute IDs verbatim, landed anywhere), letting attacker-controlled transcript bytes overwrite arbitrary files such as ~/.bashrc — RCE on the next resume, with no prompt shown to the victim. Validate the session ID with validation.ValidateSessionID at the two restore choke points before any path construction: - resolveTranscriptPath (covers resume-single, rewind restore, attach) - RestoreLogsOnly write + status loops (multi-session resume/rewind) This mirrors the invariant already enforced when checkpoints are written, so it cannot reject a legitimately-created checkpoint while closing every separator/absolute traversal. Unsafe IDs are rejected (resolveTranscriptPath) or skipped with a warning (RestoreLogsOnly). Adds end-to-end and choke-point regression tests proving a traversal session ID writes nothing outside the agent session directory. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 6963a4a931b1
ValidateSessionID previously rejected only path separators and empty input. A bare "." or ".." is separator-free yet still traverses when an agent uses the ID as a path segment (e.g. Copilot CLI builds <dir>/<id>/events.jsonl), and the separator check can miss platform-specific absolute forms (Windows drive paths). Reject both. Because the same validator guards both checkpoint writes and the resume/rewind restore boundaries, this tightens the whole class without affecting UUID-style IDs. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 388c3d8a3b31
When `entire checkpoint rewind` restores files from a checkpoint tree, the file write already used os.Root (kernel-enforced containment), but the preceding directory creation used a raw os.MkdirAll(filepath.Join(repoRoot, f.Name)). A crafted tree entry name containing ".." from an untrusted checkpoint could thus create empty directories outside the repo before the guarded write refused to escape. Add osroot.MkdirAll, which creates each level via os.Root.Mkdir so the kernel rejects any path that escapes the root, and use it in the restore loop. Both filesystem operations in that loop are now containment-safe. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Entire-Checkpoint: f52cf53648bf
The PostToolUse (TodoWrite) hook passed input.SessionID and the task tool-use ID straight into SessionMetadataDirFromSessionID / TaskMetadataDir and then os.ReadDir, with no validation — unlike every other hook entry point. A crafted "../.." could redirect the directory listing. Validate both IDs at this choke point; an invalid value simply starts the checkpoint sequence at 1. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Entire-Checkpoint: ddf2fe4a50fe
…tests Document on the Agent.ResolveSessionFile interface that some agents use agentSessionID as a directory component or (Codex/Pi) return it verbatim when absolute, so callers sourcing the ID from untrusted data MUST validate it first — the resume/rewind choke points do. Add contract-pinning tests for the two unusual agents: Codex (absolute returned verbatim) and Copilot (ID as directory component). Each asserts both the agent behavior and that ValidateSessionID rejects the dangerous shape, so the guard cannot regress out from under these agents. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 383105db9469
The session-state writers key their filenames on the session ID. The ID
is already validated, so this is belt-and-suspenders: route the writes
through an os.Root scoped to the entire-sessions directory so the kernel
makes escaping the directory impossible even if a validation gap were
ever introduced.
- session/state.go: Save's atomic rename now goes through os.Root
(Load and Clear already did). Drops the now-unused stateFilePath.
- strategy/session_state.go: StoreModelHint, StoreAgentTypeHint,
ClaimSessionStartBanner, LoadModelHint, LoadAgentTypeHint and
ClearSessionState now operate via os.Root. Two small helpers
(openSessionStateRoot / ...ForRead) remove the repeated preamble.
Go 1.26's os.Root natively supports Rename/OpenFile(O_EXCL)/WriteFile,
so the atomic-rename and first-writer-wins semantics are preserved.
The sibling entire-session-locks/ dir is intentionally left as-is: its
flock semantics are inode-bound and it is a separate directory.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: f95ae94b9edc
Go 1.26's os.Root has a native MkdirAll, so the hand-rolled per-segment
Mkdir loop (with its ToSlash/Trim/path.Join handling and extra imports)
collapses to a one-line delegation — battle-tested and stats before
creating. Also:
- drop the dead `dir != ""` guard in the rewind restore loop
(path.Dir never returns "")
- drop TestMkdirAll_EmptyOrDotIsNoop, which exercised the removed
Trim/no-op logic; the call site never passes ""/"."
- correct the now-stale osroot package doc that listed MkdirAll as
unsupported
Pure cleanup from a /simplify review; no behavior change.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 38a921fb7470
Pi's ParseHookEvent calls captureTranscript on agent_end, which builds the destination path from the hook-supplied session ID and writes the transcript — all before the lifecycle dispatcher validates the ID in DispatchLifecycleEvent. A "../"-laden ID could thus write outside the .entire/tmp/pi cache directory. Validate the session ID at the top of captureTranscript (the choke point where it becomes a path, covering both call sites); an unsafe ID returns "" — the function's existing "no capture" signal — so dispatch behavior is unchanged. Also corrects the gosec comment on the write that wrongly claimed the ID was already validated. Taint is the local Pi hook payload (same-privilege), so severity is low; this is defense-in-depth consistent with the other hook hardening in this branch. The sibling cacheSessionID writes a fixed filename, not an ID-derived path, so it needs no guard. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Entire-Checkpoint: ddfd40f73b22
handleLifecycleTurnEnd builds .entire/metadata/<session_id>/ via os.MkdirAll + os.WriteFile directly from the hook-supplied event.SessionID with no validation — unlike its siblings (SessionStart/TurnStart/ToolUse), which each validate. A "../"-laden ID could write the transcript outside the metadata directory. The ModelUpdate/Compaction/SubagentEnd handlers were also unguarded, surviving only because the strategy layer (MutateSessionState/StoreModelHint) validates internally. Per-handler validation is the wrong altitude: it is exactly the kind of check a new (or existing) handler forgets, which is what happened here. Validate non-empty event.SessionID once in DispatchLifecycleEvent, before routing, so every handler — current and future — is covered uniformly. Empty IDs still pass through to each handler's own empty-handling (e.g. TurnEnd's fallback to the "unknown" constant). The existing per-handler validations are now redundant but left in place as harmless defense-in-depth; they could be consolidated separately. Taint is local hook input (same-privilege), so severity is low; this completes the hook-path hardening theme on this branch. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 0a614e131bab
CalculateTotalTokenUsage and ExtractAllModifiedFiles build agent-<id>.jsonl from subagent IDs and read that file, with no local validation of the ID. Today the IDs are path-safe only because their sole source, extractAgentIDFromText, happens to accept just [a-zA-Z0-9]. That makes the path-safety of a file read depend on the incidental character set of a parser two calls away — a fragile coupling: relaxing extractAgentIDFromText (e.g. to accept hyphenated UUIDs) would silently open a traversal-read of arbitrary agent-*.jsonl-shaped paths. Enforce the invariant at the choke point: ExtractSpawnedAgentIDs now drops any ID that fails validation.ValidateAgentID, so every downstream agent-<id>.jsonl consumer (claudecode + factoryaidroid, token usage and modified-files) is path-safe by construction. Not a live vulnerability — the current parser already constrains the ID, so the guard is a no-op for today's inputs and adds no behavior change. No test is added: the guarded branch is unreachable through the real extractor (it can't emit a non-path-safe ID), so a test would only exercise dead input. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Entire-Checkpoint: c56a01dc268e
…cleEvent handleLifecycleSubagentEnd builds a subagent transcript path from event.SubagentID (AgentTranscriptPath -> filepath.Join), then stats and reads it via ExtractModifiedFilesFromOffset — before the value is validated. A "../"-laden SubagentID escapes transcriptDir for a file read. The task-checkpoint *write* paths (CapturePreTaskState, WriteTemporaryTask) already validate ToolUseID/AgentID, but this read ran first and unguarded. Extend the central dispatcher guard (added for SessionID) to also reject path-unsafe event.ToolUseID (ValidateToolUseID) and event.SubagentID (ValidateAgentID) before routing, so every handler that turns these hook-supplied identifiers into a path is covered at one choke point. Empty values pass through to each handler's own empty-handling. Taint is local hook input (same-privilege) and the sink is a read, so severity is low; this closes the last identifier in the lifecycle path family. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 92ddb0963aa9
Contributor
There was a problem hiding this comment.
Pull request overview
This PR hardens filesystem boundaries against path traversal/arbitrary write by validating attacker-influenceable identifiers (session/tool-use/subagent IDs) at restore/dispatch choke points and by routing more sensitive writes through os.Root for kernel-enforced containment.
Changes:
- Tightens ID validation (notably
SessionID) and applies it at transcript restore + lifecycle dispatch boundaries. - Adds
os.Root-scoped helpers/usages to prevent directory-escape during rewind restore and session-state writes/cleanup. - Introduces regression tests for traversal attempts across resume/rewind, dispatcher, Pi transcript capture, osroot helpers, and agent contracts.
Reviewed changes
Copilot reviewed 20 out of 20 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| cmd/entire/cli/validation/validators.go | Hardens ValidateSessionID to reject additional traversal primitives. |
| cmd/entire/cli/validation/validators_test.go | Adds tests for new ValidateSessionID hardening cases. |
| cmd/entire/cli/transcript.go | Validates session IDs at transcript restore path resolution. |
| cmd/entire/cli/transcript_test.go | Adds regression tests ensuring transcript path resolution rejects traversal. |
| cmd/entire/cli/strategy/session_state.go | Scopes session hint/marker IO via os.Root; updates cleanup logic. |
| cmd/entire/cli/strategy/manual_commit_rewind.go | Uses os.Root-contained directory creation during rewind restore; validates IDs in restore paths. |
| cmd/entire/cli/state.go | Validates IDs before using them as filesystem path components for checkpoint sequencing. |
| cmd/entire/cli/session/state.go | Uses os.Root for the final atomic rename when saving session state. |
| cmd/entire/cli/resume_test.go | Adds E2E-style regression test preventing arbitrary file writes during resume. |
| cmd/entire/cli/osroot/osroot.go | Adds osroot.MkdirAll wrapper around rooted directory creation. |
| cmd/entire/cli/osroot/osroot_test.go | Tests rooted MkdirAll behavior including traversal rejection. |
| cmd/entire/cli/lifecycle.go | Centralizes lifecycle identifier validation before routing to handlers. |
| cmd/entire/cli/lifecycle_test.go | Adds tests ensuring dispatcher rejects unsafe IDs across event types. |
| cmd/entire/cli/agent/pi/lifecycle.go | Validates session ID before Pi transcript capture/write. |
| cmd/entire/cli/agent/pi/lifecycle_test.go | Adds regression test preventing traversal via Pi capture path. |
| cmd/entire/cli/agent/factoryaidroid/transcript.go | Filters extracted spawned-agent IDs through path-safe validation. |
| cmd/entire/cli/agent/copilotcli/security_contract_test.go | Pins Copilot ResolveSessionFile layout assumptions to validator behavior. |
| cmd/entire/cli/agent/codex/security_contract_test.go | Pins Codex “absolute path verbatim” behavior to validator behavior. |
| cmd/entire/cli/agent/claudecode/transcript.go | Filters extracted spawned-agent IDs through path-safe validation. |
| cmd/entire/cli/agent/agent.go | Documents the ResolveSessionFile security contract and required validation boundary. |
Three issues from PR #1365 review: - ValidateSessionID accepted Windows drive-relative paths ("C:foo"): separator-free and not reported absolute by filepath.IsAbs, yet filepath.Join drops the base dir on a volume name. Reject ":" (the volume separator) — portable and testable; no legitimate session ID contains a colon. - ValidateSessionID accepted glob metacharacters ("*", "?", "["). Session IDs are interpolated into filepath.Glob patterns (agent transcript lookup in gemini/pi/codex, session-state cleanup), so a "*" could match unrelated files. Reject them at the validator choke point. - StateStore.Clear and ClearSessionState globbed "<sessionID>.*" to find files to DELETE. With the permissive validator a session ID of "*" would have matched and removed every session's state (DoS/data-loss within the state dir). Replace globbing with literal prefix matching so deletion can never depend on pattern interpretation, even if a future caller skips validation. Adds regression tests for the drive-relative and glob-metacharacter cases. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Entire-Checkpoint: ed740e10abe5
pjbgf
previously approved these changes
Jun 4, 2026
pjbgf
reviewed
Jun 4, 2026
Co-authored-by: Paulo Gomes <paulo@entire.io>
pjbgf
approved these changes
Jun 4, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
https://entire.io/gh/entireio/cli/trails/512
Summary
Closes a path-traversal / arbitrary-file-write class across the checkpoint, session, and agent-lifecycle paths. The root issue: identifiers read from the shared
entire/checkpoints/v1branch (attacker-influenceable) or from agent hook input flowed into filesystem paths without validation at the read/restore boundary.Original vulnerability:
entire session resume/entire checkpoint rewindused the metadataSessionIDto build the transcript write path with no validation. A crafted absolute or../-laden session ID (or, for Codex/Pi, an absolute ID returned verbatim) let attacker-controlled transcript bytes overwrite arbitrary files (e.g.~/.bashrc) → RCE on the next resume, with no prompt.What changed (11 commits)
Core fix
resolveTranscriptPath,RestoreLogsOnly) before any path construction.Validator hardening
ValidateSessionIDnow also rejects.,.., and absolute paths (closes the bare-..dir-component vector and the Windows drive edge).Lifecycle dispatch (right altitude)
SessionID,ToolUseID, andSubagentIDonce inDispatchLifecycleEvent, before routing — so no handler (e.g.handleLifecycleTurnEnd's.entire/metadata/<id>/write,handleLifecycleSubagentEnd's subagent-transcript read) can forget.os.Root containment (defense in depth)
osroot.MkdirAll(kernel-enforced containment), matching the already-rooted file write..git/entire-sessions/writes (state Save + hint/marker files) scoped toos.Root.Other boundaries
captureTranscriptvalidates before writing.GetNextCheckpointSequence(posttodo hook) validates before the directory read.ExtractSpawnedAgentIDs(claudecode + factoryaidroid) drops non-path-safe subagent IDs, soagent-<id>.jsonlreads are path-safe by construction.ResolveSessionFilesecurity contract on theAgentinterface + Codex/Copilot contract regression tests.Audit results
CommittedMetadatafields read from checkpoints were traced:CheckpointID(12-hex at unmarshal) andInvestigateRunID(validateRunID) are validated at read; every other field is display/logic-only.ToolUseIDhas a write/read validation asymmetry but no live read→path sink.os.WriteFile/Open/OpenFile/Rename/Symlink/Createsinks were swept: every attacker-influenceable path is now validated, hash-derived, or os.Root-contained.Testing
mise run checkgreen (fmt + lint + unit + integration + Vogon canary). 388 integration tests pass; all touched packages lint clean.🤖 Generated with Claude Code
Note
High Risk
Changes authentication-adjacent filesystem behavior (resume/rewind transcript writes, lifecycle metadata paths) using attacker-influenceable checkpoint metadata; mistakes could break legitimate sessions or leave traversal gaps.
Overview
This PR closes a path-traversal / arbitrary-file-write class of bugs where session, tool-use, and subagent IDs from checkpoint metadata (
entire/checkpoints/v1) or agent hooks were joined into filesystem paths without checks—enabling crafted IDs to overwrite files outside agent/session directories (e.g. on resume or rewind).Validation is tightened at read/dispatch boundaries:
ValidateSessionIDnow rejects.,.., and absolute paths;DispatchLifecycleEventvalidatesSessionID,ToolUseID, andSubagentIDonce before any handler; restore pathsresolveTranscriptPathandRestoreLogsOnlyreject unsafe IDs from metadata; PicaptureTranscript,GetNextCheckpointSequence, and subagent ID extraction (Claude Code / Factory Droid) add matching guards. TheResolveSessionFilesecurity contract is documented, with Codex/Copilot contract tests.Defense in depth: rewind working-tree restore uses
osroot.MkdirAllunder a repoos.Root;.git/entire-sessions/state save, hints, markers, and cleanup route throughos.Root. Regression tests cover resume, dispatcher, Pi, osroot, and validators.Reviewed by Cursor Bugbot for commit 98f8e37. Configure here.