v1.0.6 — Security hardening release

@rocklambros rocklambros released this 27 Apr 13:49
· 22 commits to main since this release
336f3e7

Security hardening release. Closes 8 actionable findings from the 2026-04-27 multi-tool audit (semgrep + trivy + gitleaks + bandit + pip-audit + manual review). No public API changes; behavior unchanged for legitimate inputs.

Highlights

  • F1 (HIGH) — new any2md/_http.py does manual redirect walking with per-hop host revalidation; defeats DNS rebind and redirect-based SSRF. Applied to URL fetching, HEAD-for-Last-Modified, and arxiv lookup.
  • F5 (MED) — --meta integrity: reserved keys (content_hash, extracted_via, source_file, lane, token_estimate, recommended_chunk_level) passed via --meta are now dropped instead of overwriting generated frontmatter, and a WARN is printed to stderr.
  • F3 (MED) — TOML auto-discovery now bounded at project markers (.git, pyproject.toml, etc.). New --no-config flag.
  • F4 (MED) — DOCX zip-bomb guard: 1 MB cap on declared uncompressed size of core.xml/app.xml.
  • F-CVE — bumps lxml>=6.1.0 (CVE-2026-41066), pillow>=12.2.0 (CVE-2026-25990, CVE-2026-40192), urllib3>=2.6.3 (CVE-2026-21441). All dependency ranges now carry upper bounds.
  • F6/F2/F11 — atomic-write + pre-existing-symlink rejection across all converters; control-char sanitizer for forwarded Docling warnings; PDF stem hardening for image dir.
  • XXE defense-in-depth via defusedxml.
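
The per-hop revalidation described for F1 can be illustrated with a small standalone sketch. This is not the code in any2md/_http.py; the names `walk_redirects`, `check_host`, `resolve_host`, and `UnsafeURLError` are hypothetical, and the `fetch` callable is an assumed stand-in for an HTTP client configured not to follow redirects on its own.

```python
import ipaddress
import socket
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


class UnsafeURLError(ValueError):
    """Raised when a hop uses a disallowed scheme or a non-public address."""


def resolve_host(host):
    """Return every IP address the host currently resolves to."""
    infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    return [info[4][0] for info in infos]


def check_host(host, resolver=resolve_host):
    """Reject the host if any of its addresses is not globally routable."""
    for addr in resolver(host):
        ip = ipaddress.ip_address(addr)
        if not ip.is_global:
            raise UnsafeURLError(f"{host} resolves to non-public address {ip}")


def walk_redirects(url, fetch, resolver=resolve_host, max_hops=5):
    """Follow redirects one hop at a time, revalidating the host on each hop.

    `fetch(url)` is assumed to return (status_code, headers_dict) and to
    never follow redirects itself.
    """
    for _ in range(max_hops + 1):
        parts = urlsplit(url)
        if parts.scheme not in ("http", "https") or not parts.hostname:
            raise UnsafeURLError(f"unsupported URL: {url}")
        check_host(parts.hostname, resolver)  # validate before every request
        status, headers = fetch(url)
        if status not in REDIRECT_CODES:
            return url, status
        location = headers.get("Location")
        if not location:
            raise UnsafeURLError("redirect without a Location header")
        url = urljoin(url, location)  # resolve relative redirect targets
    raise UnsafeURLError(f"more than {max_hops} redirects")
```

A check like this narrows the DNS-rebind window but does not close it on its own: the name could resolve differently between the check and the actual connection. A fuller defense would connect to the exact address that was validated; whether any2md does so is not stated in these notes.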

Audit summary

| ID | Title | Severity |
|---|---|---|
| F1 | SSRF: DNS rebind + redirect bypass | High |
| F5 | --meta clobbers reserved frontmatter | Medium |
| F3 | TOML discovery walks above project root | Medium |
| F4 | DOCX zip-bomb amplification | Medium |
| F-CVE | Transitive deps with known CVEs | Medium |
| F6 | Symlink-following on output writes | Low |
| F2 | Control-char passthrough in Docling logs | Low |
| F11 | PDF stem .. corner case | Low |
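
As an illustration of the F4 mitigation, the following minimal sketch checks the declared uncompressed size of the DOCX metadata parts before reading them. It is an assumption-laden example, not the project's code: `read_docx_metadata` and `ZipBombError` are hypothetical names, and the part paths `docProps/core.xml` and `docProps/app.xml` are the standard OOXML locations rather than something these notes spell out.

```python
import io
import zipfile

MAX_METADATA_BYTES = 1024 * 1024  # 1 MB cap, matching the figure in these notes
METADATA_PARTS = ("docProps/core.xml", "docProps/app.xml")  # assumed paths


class ZipBombError(ValueError):
    """Raised when a metadata part declares or yields too many bytes."""


def read_docx_metadata(docx_bytes, limit=MAX_METADATA_BYTES):
    """Return {part_name: raw_bytes} for metadata parts within the size cap."""
    parts = {}
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        for name in METADATA_PARTS:
            try:
                info = archive.getinfo(name)
            except KeyError:
                continue  # both parts are optional in a DOCX package
            # file_size is the uncompressed size declared in the zip directory
            if info.file_size > limit:
                raise ZipBombError(
                    f"{name} declares {info.file_size} bytes (limit {limit})"
                )
            # Bound the read as well, so the cap holds independently of metadata
            with archive.open(info) as handle:
                data = handle.read(limit + 1)
            if len(data) > limit:
                raise ZipBombError(f"{name} exceeds {limit} bytes when read")
            parts[name] = data
    return parts
```

Checking the declared size first means an oversized part is rejected before any decompression work happens, which is the cheap path for the common attack shape.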

Verification

  • 348 tests pass (was 290; added 58 new tests across 7 unit-test files).
  • Bandit: 9 findings → 1 low (legitimate try/except/pass).
  • Real-world 5-DOCX regression: word counts identical to v1.0.5 (978/6726/15566/7811/1477); fallback fires correctly with sanitized warnings.
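
The "sanitized warnings" mentioned above refer to the F2 control-character fix. A sanitizer of that kind might look like the sketch below; the function name `sanitize_log_text` and the choice to emit visible `\xNN` escapes (rather than dropping the characters) are assumptions for illustration, not a description of what any2md actually does.

```python
import re

# C0 controls except tab, plus DEL and the C1 range. Newlines are included
# so a forwarded message cannot forge an extra log line.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0a-\x1f\x7f-\x9f]")


def sanitize_log_text(text):
    """Replace control characters with a visible \\xNN escape."""
    return _CONTROL_CHARS.sub(lambda m: f"\\x{ord(m.group()):02x}", text)


if __name__ == "__main__":
    # An ANSI colour sequence and an embedded newline become inert text.
    print(sanitize_log_text("parse failed\x1b[31m\nWARN: forged line"))
```

Escaping instead of stripping keeps the evidence that something unusual was in the upstream message, which can help when triaging a malformed input.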