
fix(parsers): apply ADR 0004 security hardening across 10 parsers #695

Merged
abraemer merged 3 commits into main from fix/adr-0004-security-hardening
Apr 15, 2026

abraemer commented Apr 15, 2026

Summary

Audited all ~90 parser source files against ADR 0004 (Security-First Parsing), remediated every divergence found, and extracted a shared RecursionGuard<K> utility now adopted across 17 parsers. Of the parser families reviewed, Ruby/Cocoa/Haskell was already fully compliant, and Python/Conda/Pixi is fully compliant after the fixes below.

Changes by Parser

bun_lockb.rs — Recursion depth tracking + cycle detection (Critical)

  • Introduced RecursionGuard struct encapsulating depth counter and HashSet<usize> visited set
  • build_dependencies_for_package and build_resolved_package now accept &mut RecursionGuard<usize> instead of raw depth/visited params
  • Both functions return early with warn!() when depth exceeds MAX_RECURSION_DEPTH (50)
  • Cycle detection: before recursing into a package index, guard.enter(pkg_idx) checks if already visited; if so, skips with warn!() and returns None
  • On return, guard.leave(pkg_idx) removes the index from visited, allowing diamond dependencies in different branches
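
The enter/leave behaviour described above can be sketched roughly as follows. This is an illustrative reconstruction, not the code from bun_lockb.rs: the `collect` function, the `deps` adjacency list, and the boolean return of `enter` are assumptions, and `eprintln!` stands in for `warn!()` so the sketch needs no external crate.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Depth cap named in the PR description.
const MAX_RECURSION_DEPTH: usize = 50;

/// Hypothetical shape of the guard: a depth counter plus a visited set.
struct RecursionGuard<K: Hash + Eq> {
    depth: usize,
    visited: HashSet<K>,
}

impl<K: Hash + Eq> RecursionGuard<K> {
    fn new() -> Self {
        Self { depth: 0, visited: HashSet::new() }
    }

    /// Returns false if `key` is already on the current path (a cycle).
    fn enter(&mut self, key: K) -> bool {
        if !self.visited.insert(key) {
            return false;
        }
        self.depth += 1;
        true
    }

    /// Removes `key` from the current path so sibling branches may revisit it.
    fn leave(&mut self, key: &K) {
        if self.visited.remove(key) {
            self.depth -= 1;
        }
    }

    fn exceeded(&self) -> bool {
        self.depth > MAX_RECURSION_DEPTH
    }
}

/// Records every package index reached from `pkg`, in visit order.
/// `deps[i]` lists the dependency indices of package `i` (made-up data model).
fn collect(
    pkg: usize,
    deps: &[Vec<usize>],
    guard: &mut RecursionGuard<usize>,
    out: &mut Vec<usize>,
) {
    if guard.exceeded() {
        eprintln!("recursion depth exceeded at package {}", pkg); // stand-in for warn!()
        return;
    }
    if !guard.enter(pkg) {
        eprintln!("cycle detected at package {}, skipping", pkg); // stand-in for warn!()
        return;
    }
    out.push(pkg);
    if let Some(children) = deps.get(pkg) {
        for &dep in children {
            collect(dep, deps, guard, out);
        }
    }
    guard.leave(&pkg);
}
```

Because `leave` removes the key again, a package shared by two branches (a diamond) is visited once per branch, while a package that depends on one of its own ancestors is skipped.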

clojure.rs — Unbounded recursion in AST conversion (High)

  • form_to_json() and format_license(): added depth tracking; return early with warn!() when depth exceeded
  • Refactored to use shared RecursionGuard<()> via descend()/ascend() in struct and free functions

gradle.rs — Unbounded nesting depth in delimiter matching (Low)

  • Added depth cap checks in 5 bracket/paren/brace matching functions, logging warn!() and breaking with partial result
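
A depth-capped delimiter matcher along these lines might look like the sketch below. The function name and signature are hypothetical (the PR does not show the five gradle.rs functions), and `eprintln!` stands in for `warn!()`.

```rust
/// Depth cap named in the PR description.
const MAX_RECURSION_DEPTH: usize = 50;

/// Returns the text between the '(' at byte offset `open` and its matching ')'.
/// If nesting passes the cap, or the input ends first, the partial text
/// scanned so far is returned. Returns None if `open` is not a '('.
fn matching_paren_body(input: &str, open: usize) -> Option<&str> {
    if input.as_bytes().get(open) != Some(&b'(') {
        return None;
    }
    let start = open + 1; // '(' is one byte, so this is a char boundary
    let mut depth = 1usize;
    for (offset, ch) in input[start..].char_indices() {
        match ch {
            '(' => {
                depth += 1;
                if depth > MAX_RECURSION_DEPTH {
                    eprintln!("nesting depth exceeded"); // stand-in for warn!()
                    return Some(&input[start..start + offset]);
                }
            }
            ')' => {
                depth -= 1;
                if depth == 0 {
                    return Some(&input[start..start + offset]);
                }
            }
            _ => {}
        }
    }
    // Unterminated input: hand back what was scanned.
    Some(&input[start..])
}
```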

python.rs / conda.rs — Unbounded iteration

  • Added .take(MAX_ITERATION_COUNT) to iteration loops in parse_installed_files_txt(), parse_sources_txt(), and extract_pip_dependencies()
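
As a rough illustration of the pattern, the sketch below caps a line-oriented parse with `.take()`. The value of `MAX_ITERATION_COUNT` and the `parse_file_list` helper are assumptions; the PR does not state the real constant or show the loop bodies.

```rust
/// Assumed value for illustration only.
const MAX_ITERATION_COUNT: usize = 100_000;

/// Collects trimmed, non-empty lines, reading at most MAX_ITERATION_COUNT
/// lines of input no matter how large `content` is.
fn parse_file_list(content: &str) -> Vec<String> {
    content
        .lines()
        .take(MAX_ITERATION_COUNT) // cap applies to input lines, before filtering
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .map(String::from)
        .collect()
}
```

Placing `.take()` before the filter bounds the work done on hostile input, rather than only bounding the size of the result.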

rpm_db.rs — panic!() in library code

  • Changed return type to Result<..., String>; replaced panic!() with Err(...)
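
The shape of that change can be sketched as below. The function name and the datasource strings are invented for illustration; only the `Result<..., String>` return type and the `Err(...)` replacement come from the PR.

```rust
/// Maps a datasource identifier to a native kind.
/// The identifiers here are hypothetical.
fn kind_for_datasource(datasource: &str) -> Result<&'static str, String> {
    match datasource {
        "rpm_bdb" => Ok("bdb"),
        "rpm_sqlite" => Ok("sqlite"),
        // Previously this arm would have been a panic!(); now the caller decides.
        other => Err(format!("unsupported RPM datasource: {}", other)),
    }
}
```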

rpm_specfile.rs / cpan_makefile_pl.rs — .unwrap() on compile-time regexes

  • Replaced .unwrap() with .expect("valid regex: ...")

nix.rs — .unwrap() on Vec::pop()

  • Replaced with unwrap_or_else(|| Expr::Symbol(String::new()))
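
A minimal sketch of the fallback, assuming a simplified `Expr` type (the real enum in nix.rs is not shown in the PR) and a hypothetical `pop_or_empty` helper:

```rust
/// Simplified stand-in for the parser's expression type.
#[derive(Debug, PartialEq)]
enum Expr {
    Symbol(String),
    List(Vec<Expr>),
}

/// Pops the top expression, or yields an empty symbol if the stack is empty
/// instead of panicking.
fn pop_or_empty(stack: &mut Vec<Expr>) -> Expr {
    stack.pop().unwrap_or_else(|| Expr::Symbol(String::new()))
}
```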

nuget.rs — Archive safety gaps

  • Added path-traversal check in read_nupkg_license_file
  • Added 1 GB uncompressed size accumulator in extract_nupkg_archive
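
Both checks can be sketched with the standard library alone. The helper names are hypothetical, and the cap is assumed to be 1024³ bytes; the PR says only "1 GB".

```rust
use std::path::{Component, Path};

/// Assumed interpretation of the 1 GB cap on total uncompressed bytes.
const MAX_TOTAL_UNCOMPRESSED: u64 = 1024 * 1024 * 1024;

/// True if the archive entry name cannot escape the extraction root:
/// no `..`, no root, no drive prefix.
fn is_safe_entry_path(name: &str) -> bool {
    Path::new(name)
        .components()
        .all(|c| matches!(c, Component::Normal(_) | Component::CurDir))
}

/// Sums entry sizes and fails as soon as the running total passes the cap.
fn check_total_size(entry_sizes: &[u64]) -> Result<u64, String> {
    let mut total: u64 = 0;
    for &size in entry_sizes {
        total = total.saturating_add(size);
        if total > MAX_TOTAL_UNCOMPRESSED {
            return Err(format!(
                "archive expands past {} bytes",
                MAX_TOTAL_UNCOMPRESSED
            ));
        }
    }
    Ok(total)
}
```

Accumulating across entries matters because a per-entry limit alone does not stop an archive made of many medium-sized files.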

RecursionGuard Extraction (follow-up commit)

Extracted the RecursionGuard from bun_lockb.rs into src/parsers/utils.rs as a generic shared utility and adopted it across 17 parsers:

  • RecursionGuard<K: Hash + Eq> — depth + cycle detection (keyed by usize, String, or PathBuf)
    • guard.enter(key) / guard.leave(key) — visited-set tracking + depth increment/decrement
    • guard.exceeded() — checks depth against MAX_RECURSION_DEPTH (50)
  • RecursionGuard<()> — depth-only tracking (no cycle detection)
    • guard.descend() / guard.ascend() — lightweight increment/decrement, returns true if exceeded
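
The depth-only variant can be illustrated with a small recursive renderer. `DepthGuard` is a simplified stand-in for `RecursionGuard<()>`, and `Form` and `render` are made up for the example; the assumption that `descend()` is the call returning `true` on overflow is an interpretation of the bullet above.

```rust
/// Depth cap named in the PR description.
const MAX_RECURSION_DEPTH: usize = 50;

/// Depth-only guard: no visited set, just a counter.
struct DepthGuard {
    depth: usize,
}

impl DepthGuard {
    fn new() -> Self {
        Self { depth: 0 }
    }

    /// Increments the depth and returns true if the cap is now exceeded.
    fn descend(&mut self) -> bool {
        self.depth += 1;
        self.depth > MAX_RECURSION_DEPTH
    }

    /// Decrements the depth when a recursive call returns.
    fn ascend(&mut self) {
        self.depth = self.depth.saturating_sub(1);
    }
}

/// Tiny nested data type, standing in for a parsed AST.
enum Form {
    Atom(String),
    List(Vec<Form>),
}

/// Renders a form as text, replacing anything nested past the cap with "...".
fn render(form: &Form, guard: &mut DepthGuard) -> String {
    if guard.descend() {
        guard.ascend();
        eprintln!("recursion depth exceeded"); // stand-in for warn!()
        return "...".to_string();
    }
    let text = match form {
        Form::Atom(s) => s.clone(),
        Form::List(items) => {
            let parts: Vec<String> = items.iter().map(|f| render(f, guard)).collect();
            format!("({})", parts.join(" "))
        }
    };
    guard.ascend();
    text
}
```

Every `descend()` is paired with an `ascend()` on each exit path, so the guard is back at depth zero once the top-level call returns.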

Parsers refactored to RecursionGuard<K> (depth + cycle detection): bun_lockb.rs (usize), nuget.rs (PathBuf), sbt.rs (String), swift_show_dependencies.rs (String + ()), requirements_txt.rs (PathBuf)

Parsers refactored to RecursionGuard<()> (depth-only): dart.rs, bazel.rs, julia.rs, cargo.rs, uv_lock.rs, pylock_toml.rs, license_normalization.rs, npm_lock.rs, clojure.rs (struct + free functions), hex_lock.rs, meson.rs, nix.rs

Removed 17 duplicate MAX_RECURSION_DEPTH constants; all parsers now use the shared MAX_RECURSION_DEPTH from utils.rs.

Skipped: maven.rs, conan.rs, python.rs — these have domain-specific extras (caches, node counts, resolution stacks) beyond simple depth+visited tracking.

Skill file update

Updated .opencode/skills/add-parser/SKILL.md to document RecursionGuard in the security utilities section so future parser implementations use it from the start.

Testing

  • cargo check passes
  • cargo clippy passes with zero warnings
  • Pre-commit hooks (clippy, rustfmt, generate-supported-formats) all pass

ADR Compliance

After this PR, all ~90 parser files comply with ADR 0004 across all 6 requirements:

  1. No code execution
  2. DoS protection (file size, recursion depth, iteration count, string length)
  3. Archive safety (where applicable)
  4. Input validation
  5. Circular dependency detection (where applicable)
  6. No .unwrap() in library code

Audit all ~90 parsers against ADR 0004 and remediate divergences:
- Add RecursionGuard with depth tracking and cycle detection to bun_lockb.rs
- Add depth parameters to clojure.rs form_to_json/format_license
- Add MAX_RECURSION_DEPTH caps to gradle.rs bracket/paren/brace matching
- Add .take(MAX_ITERATION_COUNT) to python.rs and conda.rs iteration loops
- Replace .unwrap() with .expect() in cpan_makefile_pl.rs and rpm_specfile.rs
- Replace panic! with Result return in rpm_db.rs native_kind_for_datasource
- Replace .unwrap() with safe alternative in nix.rs Vec::pop()
- Add path-traversal check to nuget.rs read_nupkg_license_file
- Add total uncompressed size accumulation to nuget.rs extract_nupkg_archive
abraemer force-pushed the fix/adr-0004-security-hardening branch from 98f30c1 to 0e1c187 (April 15, 2026 14:12)
abraemer force-pushed the fix/adr-0004-security-hardening branch from 0e1c187 to 7ef680e (April 15, 2026 14:18)
abraemer enabled auto-merge (rebase) April 15, 2026 14:40
abraemer merged commit e5befb9 into main Apr 15, 2026
14 checks passed
abraemer deleted the fix/adr-0004-security-hardening branch April 15, 2026 14:47