v0.8.0

Latest

vthakore23 released this 26 May 14:58 · 17 commits to main since this release

DicomLock 0.8.0 brings the security fixes from the S8 and S9 falsification rounds into the published package. Running pip install dicomlock now installs 0.8.0.

Highlights in 0.8.0

  • CDR escape closed. A payload hidden under an allowlisted vendor creator (e.g. GEMS_IDEN_01) no longer survives disarm. The exe-signature override now matches across a 4 KiB window and applies a high-entropy gate, and the same classifier (scanner.file_security._private_payload_threat) is used by both the scanner and the CDR, so detection and disarm cannot disagree.
  • Polyglot detection broadened. Now covers OLE/CFBF (MSI / Office-macro), CAB, and Zstd, in addition to WASM, DEX, RAR, 7z, and Lua from the prior round.
  • File Meta length validation. A bomb in group 0002 used to push the byte-walk past EOF undetected; it is now caught.
  • Tiered T2 in the decompression-bomb check. Warns at 100x to 1000x amplification, blocks above 1000x.
  • Mixed-compression FP fixes (S8). 0 false positives on conformant files across 12 transfer syntaxes. Four real FPs were surfaced and fixed (explicit/implicit VR-mismatch length-walk desync, 1-bit packing, YBR_FULL_422 subsampling, legal trailing padding).
  • Sandboxed codec decode (S7). The one third-party codec step runs in a resource-limited subprocess; the tool quarantines on crash, OOM, or hang.
  • Statistical rigor in the bench (S9). Wilson 95% CIs, rule-of-three FP upper bound, McNemar paired test vs the parser matrix (pydicom, GDCM, dcmtk).
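The tiered decompression-bomb rule above can be illustrated with a minimal sketch. It is not DicomLock's implementation: the function name, the enum, and the handling of an empty compressed stream are assumptions made for illustration. Only the 100x and 1000x thresholds come from the note.

```python
from enum import Enum


class Verdict(Enum):
    PASS = "pass"
    WARN = "warn"
    BLOCK = "block"


# Thresholds quoted in the release note; the constant names are hypothetical.
WARN_RATIO = 100.0
BLOCK_RATIO = 1000.0


def classify_amplification(compressed_size: int, decompressed_size: int) -> Verdict:
    """Tier a decode result by its amplification ratio (output bytes / input bytes)."""
    if compressed_size <= 0:
        # Assumption: an empty stream that still claims output is treated as hostile.
        return Verdict.BLOCK if decompressed_size > 0 else Verdict.PASS
    ratio = decompressed_size / compressed_size
    if ratio > BLOCK_RATIO:
        return Verdict.BLOCK  # above 1000x
    if ratio >= WARN_RATIO:
        return Verdict.WARN   # 100x to 1000x inclusive
    return Verdict.PASS
```

In this sketch a ratio of exactly 1000x falls in the warning tier, which is one reading of "blocks above 1000x"; the real check may draw the boundary differently.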

Validated on the bench corpus

  • Detection 80/80, neutralization 80/80, CDR fidelity 72/72 bit-exact.
  • False positives 0/605 (575 real CTs plus 30 curated; one-sided 95% upper bound 0.50%).
  • Differentiation 39/62; McNemar chi-square 49.0, p < 1e-6 (51 files DicomLock flags that every toolkit accepts; 0 DicomLock blind spots vs the matrix).
  • Pinned codec: a fuzzer-found malformed JPEG 2000 file OOM-kills OpenJPEG 2.3.0 + ASan when run raw; the CDR quarantines the carrier DICOM file. This is a DoS-class issue, not memory corruption.
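The interval and test statistics quoted above can be re-derived with the standard library. This is a sketch, not the bench code: the function names are hypothetical, and the continuity-corrected form of McNemar's test is an assumption, chosen because it yields 49.0 for 51 versus 0 discordant files (the uncorrected form would give 51.0).

```python
import math

Z95 = 1.959963984540054  # two-sided 95% standard normal quantile


def wilson_interval(successes: int, n: int, z: float = Z95) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def rule_of_three_upper(n: int) -> float:
    """Approximate one-sided 95% upper bound on a rate when 0 of n events occur."""
    return 3.0 / n


def mcnemar_corrected(b: int, c: int) -> tuple[float, float]:
    """McNemar chi-square with continuity correction, plus its p-value (1 dof)."""
    chi2 = (abs(b - c) - 1) ** 2 / (b + c)
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square survival function, 1 dof
    return chi2, p_value
```

With these definitions, `rule_of_three_upper(605)` is about 0.50% and `mcnemar_corrected(51, 0)` gives a chi-square of about 49.0 with a p-value well below 1e-6, matching the figures in the list.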

Additions on main since 0.8.0 (not yet republished)

  • Diverse-modality validation (bench.diverse_check): 0 false positives, 270/270 bit-exact across 120 brain MR (UPENN-GBM) + 150 chest radiographs (LIDC-IDRI). Across all real public TCIA data on disk now: 0 FP across 845 files / 3 modalities.
  • CDR fidelity at scale (bench.fidelity): 623/623 native and lossless bit-exact across 13 transfer syntaxes; 20/20 lossy preserved as decoded.
  • De-identification is now first-class. CLI --deid renders a colored re-identification score bar with a per-channel breakdown; the web UI shows a teal re-identification risk card.
  • B1 (bench.reid_vs_anonymizer): on 60 brain MR, dicognito 0.19 re-pseudonymizes 120/120 direct identifiers but leaves the pixels byte-identical 60/60, so the facial-geometry and burned-in re-id channels are provably unchanged by tag anonymization.
  • B3 (bench.reid_audit): residual re-identification-risk audit across 845 public files. The facial-geometry channel fires on 96.7% of head MR (the Mayo concern); burned-in pixel text fires on 89.3% of chest radiographs; CT body imaging is low pixel-domain risk (facial geometry 0.3%, burned-in text 8%).
  • PREPRINT.md complete with 24 web-verified references and the diverse-modality numbers folded in. REPRODUCE.md maps each headline number to its exact command.
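The bit-exact and byte-identical counts above come down to comparing pixel bytes before and after a transformation. A minimal sketch of such a tally follows. The helper names are hypothetical and this is not the bench.fidelity or bench.reid_vs_anonymizer code; in practice the byte strings might come from a DICOM reader (for example the PixelData attribute of a pydicom dataset), which is an assumption rather than something the notes state.

```python
import hashlib
from typing import Iterable, Tuple


def pixel_digest(pixel_bytes: bytes) -> str:
    """SHA-256 digest of a raw pixel buffer."""
    return hashlib.sha256(pixel_bytes).hexdigest()


def count_bit_exact(pairs: Iterable[Tuple[bytes, bytes]]) -> Tuple[int, int]:
    """Return (matches, total) over (before, after) pixel buffers."""
    matches = total = 0
    for before, after in pairs:
        total += 1
        if pixel_digest(before) == pixel_digest(after):
            matches += 1
    return matches, total
```

A result such as 60/60 then means every pair hashed identically, so any re-identification signal carried by the pixels survives a tag-only transformation unchanged.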

Install

pip install dicomlock

License

Apache-2.0