Summary
The scan command currently has several boundary and resilience gaps that can cause unexpected file access, skipped data, or memory pressure.
Underlying problems covered by this issue:
- Symlinks are followed to targets outside the requested scan tree.
- Archive ingestion can bypass --max-file-size via a symlink target mismatch and then read large targets in full with os.ReadFile.
- A single WalkDir callback error returns filepath.SkipDir, which can prune an entire subtree.
Why this matters
Users expect scan <path> to stay inside scope and degrade gracefully. Current behavior can miss certificates, process unintended files, or consume excessive memory.
Evidence
- Symlinks are followed via os.Stat and the resolved path is processed: cmd/certkit/scan.go:121, cmd/certkit/scan.go:123, cmd/certkit/scan.go:162
- Archive read uses os.ReadFile after a size-gate path that can be symlink-sensitive: cmd/certkit/scan.go:122, cmd/certkit/scan.go:133, cmd/certkit/scan.go:141
- Directory walk error handling prunes the subtree: cmd/certkit/scan.go:109, cmd/certkit/scan.go:112
Acceptance criteria
- Scans are constrained to the requested root by default (or require explicit opt-in for following external symlink targets).
- Size checks are enforced against the actual file content source for archive ingestion.
- Walk errors skip only the affected entry where possible, and continue scanning siblings/subtrees safely.
- Tests cover: external symlink target, symlink-to-large-archive, and partial walk permission error behavior.
Suggested approach
- Canonicalize and validate resolved paths against the scan root.
- Use an open/stat strategy that validates the target size before a full read for archives.
- Replace subtree-pruning error behavior with per-entry skip + structured debug logging.
Dedupe notes
Checked existing issues before creating:
gh issue list --state open --limit 200 --json number,title,url,labels -> no open issues
gh search issues "is:open repo:sensiblebit/certkit" --limit 100 --json number,title,url,state -> no matches
Classified as: new.