Skip to content

image: harden host-side walks against symlink components#70

Merged
JAORMX merged 4 commits intomainfrom
jaosorior/image-symlink-safe
Apr 17, 2026
Merged

image: harden host-side walks against symlink components#70
JAORMX merged 4 commits intomainfrom
jaosorior/image-symlink-safe

Conversation

@JAORMX
Copy link
Copy Markdown
Contributor

@JAORMX JAORMX commented Apr 17, 2026

Summary

Phase 2 of a multi-phase hardening pass on image/. Host-side
operations that walk a path inside an extracted rootfs now validate
every parent component via a new SafeWalk helper, so a symlink
planted as an intermediate rootfs component cannot redirect the
operation onto the host filesystem.

Scope of Phase 2 is deliberately narrow: we keep absolute symlinks
in the extracted rootfs
(they are common in real images —
bin/ash -> /bin/busybox and the like in every Alpine layer). The
defense lives in the host-side operations that would otherwise follow
those symlinks, not in the extraction step.

Commits are bisectable; each lands with its own tests and task verify is green at every commit:

  • Add SafeWalk helper that rejects symlink components — introduces
    the primitive and exercises it with a table-driven test.
  • Use SafeWalk in applyWhiteout to block symlink parents — routes
    the whiteout os.RemoveAll through SafeWalk; RemoveAll itself
    does not follow a symlink leaf, so whiteout-on-symlink continues to
    work. Also wraps the "missing parent" error with fs.ErrNotExist
    so callers can errors.Is the no-op case.
  • Harden applyOpaqueWhiteout against symlink leaf and parents
    os.ReadDir follows symlinks, so the opaque-whiteout path additionally
    Lstats the leaf and refuses if it is a symlink or not a directory.
  • Cap tar entry count to bound inode exhaustion — complements the
    existing maxExtractSize byte cap with a 1,000,000-entry cap across
    extractTar and extractTarSharedLimit. Bounds inode exhaustion
    via many small files.

Test plan

  • task verify green at every commit
  • New regression tests for applyWhiteout and applyOpaqueWhiteout:
    each stages a symlink parent/leaf pointing at an outside directory
    with a victim file, and asserts the victim is untouched after the
    whiteout errors out.
  • Entry-count cap tested via maxExtractEntries var override in
    both extractTar and extractTarSharedLimit.
  • End-to-end verified against brood-box with a local replace:
    bbox rebuilt, VM boot + hook injection + headless claude-code
    workspace round-trip all succeed on Phase 2.

Notes

A stricter approach — rejecting absolute symlinks at extraction time —
was considered and dropped. It would break most real OCI images
(alpine:3.19 alone ships several hundred absolute symlinks). The
host-side walk hardening achieves the same security property without
that compatibility cost.

🤖 Generated with Claude Code

JAORMX and others added 4 commits April 17, 2026 08:57
Introduces image.SafeWalk(root, rel) which resolves rel under
root while Lstat-ing every parent component and refusing symlink
links or non-directory intermediates. The leaf itself is not
inspected — callers restrict leaf type as needed (for example
before ReadDir, or not at all before RemoveAll, which never
follows a symlink leaf).

This is the primitive Phase 2 will use to guard host-side
operations (whiteout RemoveAll/ReadDir) against a malicious OCI
layer that plants a symlink as an intermediate directory
component in the rootfs.

Table-driven tests cover: root itself, normal paths,
nonexistent leaves (allowed), symlink leaves (allowed and not
inspected), mid-path symlinks (refused), parent symlinks even
when leaf exists through them (refused), missing parents
(refused), non-directory parents (refused), and paths escaping
the root (refused).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Route applyWhiteout's parent-directory validation through
image.SafeWalk so a symlink planted as an intermediate rootfs
component cannot redirect the subsequent os.RemoveAll onto the
host filesystem. RemoveAll itself does not follow a symlink leaf,
so whiteout-on-symlink — a legitimate OCI pattern — still works.

Also wrap SafeWalk's "missing parent" error with fs.ErrNotExist
so callers can use errors.Is to detect the no-op case without
string matching. applyWhiteout treats a missing parent as a no-op
since the target cannot exist.

New regression test stages etc as a symlink to an outside
directory containing a victim file, applies a whiteout on
etc/.wh.passwd, and asserts both the error and that the victim
is untouched.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
os.ReadDir follows symlinks, so an opaque whiteout whose target
path resolves to a symlink (leaf or via a parent component) would
enumerate and subsequently remove files outside the rootfs. Route
the path through SafeWalk for parent validation, then Lstat the
leaf and refuse if it is a symlink or not a directory.

Add two regression tests: one where an intermediate component is
a symlink, and one where the leaf itself is a symlink pointing at
an outside directory containing a victim file. Both assert an
error is returned and the victim file is untouched.

The original "no-op for nonexistent" behavior is preserved by
checking errors.Is(err, fs.ErrNotExist) at the SafeWalk and Lstat
call sites.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extend the existing decompression-bomb defenses (maxExtractSize
byte cap) with a complementary entry-count cap. Millions of
zero-byte files fit trivially under the 30 GiB byte budget but
exhaust the host's inode allocator, dentry cache, and walk-time
memory during extraction.

Default cap of 1,000,000 entries is well above the largest
legitimate container images observed in practice (a few hundred
thousand). Applied to both extractTar and extractTarSharedLimit
so single-layer and layered extraction share the same bound.

maxExtractEntries is declared as a var so tests can scope-override
it via t.Cleanup; no production caller should mutate it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@JAORMX JAORMX merged commit df4d4a7 into main Apr 17, 2026
7 checks passed
@JAORMX JAORMX deleted the jaosorior/image-symlink-safe branch April 17, 2026 06:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant