image: harden host-side walks against symlink components#70
Merged
Conversation
Introduces image.SafeWalk(root, rel) which resolves rel under root while Lstat-ing every parent component and refusing symlink links or non-directory intermediates. The leaf itself is not inspected — callers restrict leaf type as needed (for example before ReadDir, or not at all before RemoveAll, which never follows a symlink leaf). This is the primitive Phase 2 will use to guard host-side operations (whiteout RemoveAll/ReadDir) against a malicious OCI layer that plants a symlink as an intermediate directory component in the rootfs. Table-driven tests cover: root itself, normal paths, nonexistent leaves (allowed), symlink leaves (allowed and not inspected), mid-path symlinks (refused), parent symlinks even when leaf exists through them (refused), missing parents (refused), non-directory parents (refused), and paths escaping the root (refused). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Route applyWhiteout's parent-directory validation through image.SafeWalk so a symlink planted as an intermediate rootfs component cannot redirect the subsequent os.RemoveAll onto the host filesystem. RemoveAll itself does not follow a symlink leaf, so whiteout-on-symlink — a legitimate OCI pattern — still works. Also wrap SafeWalk's "missing parent" error with fs.ErrNotExist so callers can use errors.Is to detect the no-op case without string matching. applyWhiteout treats a missing parent as a no-op since the target cannot exist. New regression test stages etc as a symlink to an outside directory containing a victim file, applies a whiteout on etc/.wh.passwd, and asserts both the error and that the victim is untouched. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
os.ReadDir follows symlinks, so an opaque whiteout whose target path resolves to a symlink (leaf or via a parent component) would enumerate and subsequently remove files outside the rootfs. Route the path through SafeWalk for parent validation, then Lstat the leaf and refuse if it is a symlink or not a directory. Add two regression tests: one where an intermediate component is a symlink, and one where the leaf itself is a symlink pointing at an outside directory containing a victim file. Both assert an error is returned and the victim file is untouched. The original "no-op for nonexistent" behavior is preserved by checking errors.Is(err, fs.ErrNotExist) at the SafeWalk and Lstat call sites. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extend the existing decompression-bomb defenses (maxExtractSize byte cap) with a complementary entry-count cap. Millions of zero-byte files fit trivially under the 30 GiB byte budget but exhaust the host's inode allocator, dentry cache, and walk-time memory during extraction. Default cap of 1,000,000 entries is well above the largest legitimate container images observed in practice (a few hundred thousand). Applied to both extractTar and extractTarSharedLimit so single-layer and layered extraction share the same bound. maxExtractEntries is declared as a var so tests can scope-override it via t.Cleanup; no production caller should mutate it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Phase 2 of a multi-phase hardening pass on
image/. Host-sideoperations that walk a path inside an extracted rootfs now validate
every parent component via a new
SafeWalkhelper, so a symlinkplanted as an intermediate rootfs component cannot redirect the
operation onto the host filesystem.
Scope of Phase 2 is deliberately narrow: we keep absolute symlinks
in the extracted rootfs (they are common in real images —
bin/ash -> /bin/busyboxand the like in every Alpine layer). Thedefense lives in the host-side operations that would otherwise follow
those symlinks, not in the extraction step.
Commits are bisectable; each lands with its own tests and
task verifyis green at every commit:Add SafeWalk helper that rejects symlink components— introducesthe primitive and exercises it with a table-driven test.
Use SafeWalk in applyWhiteout to block symlink parents— routesthe whiteout
os.RemoveAllthroughSafeWalk;RemoveAllitselfdoes not follow a symlink leaf, so whiteout-on-symlink continues to
work. Also wraps the "missing parent" error with
fs.ErrNotExistso callers can
errors.Isthe no-op case.Harden applyOpaqueWhiteout against symlink leaf and parents—os.ReadDirfollows symlinks, so the opaque-whiteout path additionallyLstats the leaf and refuses if it is a symlink or not a directory.Cap tar entry count to bound inode exhaustion— complements theexisting
maxExtractSizebyte cap with a 1,000,000-entry cap acrossextractTarandextractTarSharedLimit. Bounds inode exhaustionvia many small files.
Test plan
task verifygreen at every commitapplyWhiteoutandapplyOpaqueWhiteout:each stages a symlink parent/leaf pointing at an outside directory
with a victim file, and asserts the victim is untouched after the
whiteout errors out.
maxExtractEntriesvar override inboth
extractTarandextractTarSharedLimit.brood-boxwith a localreplace:bbox rebuilt, VM boot + hook injection + headless claude-code
workspace round-trip all succeed on Phase 2.
Notes
A stricter approach — rejecting absolute symlinks at extraction time —
was considered and dropped. It would break most real OCI images
(
alpine:3.19alone ships several hundred absolute symlinks). Thehost-side walk hardening achieves the same security property without
that compatibility cost.
🤖 Generated with Claude Code