Releases: vthakore23/dicomlock

v0.8.0

26 May 14:58

DicomLock 0.8.0 brings the security fixes from the S8 and S9 falsification rounds into the published package. Running pip install dicomlock now installs 0.8.0.

Highlights in 0.8.0

  • CDR (content disarm and reconstruction) escape closed. A payload hidden under an allowlisted vendor creator (e.g. GEMS_IDEN_01) no longer survives disarm. The exe-signature override now matches across a 4 KiB window and adds a high-entropy gate. The scanner and the CDR both use the same classifier (scanner.file_security._private_payload_threat), so detection and disarm cannot disagree.
  • Polyglot detection broadened. Now covers OLE/CFBF (MSI / Office-macro), CAB, and Zstd, in addition to WASM, DEX, RAR, 7z, and Lua from the prior round.
  • File Meta length validation. A bomb in group 0002 could previously push the byte-walk past EOF undetected; it is now caught.
  • Tiered T2 in the decompression-bomb check. Warns at 100x to 1000x amplification and blocks above 1000x.
  • Mixed-compression FP fixes (S8). Four real FPs surfaced and were fixed (explicit/implicit VR-mismatch length-walk desync, 1-bit packing, YBR_FULL_422 subsampling, legal trailing padding). Result: 0 false positives on conformant files across 12 transfer syntaxes.
  • Sandboxed codec decode (S7). The one third-party codec step runs in a resource-limited subprocess; the tool quarantines on crash, OOM, or hang.
  • Statistical rigor in the bench (S9). Wilson 95% CIs, rule-of-three FP upper bound, McNemar paired test vs the parser matrix (pydicom, GDCM, dcmtk).
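
The tiered decompression-bomb check described above can be illustrated with a minimal sketch. This is not DicomLock's code: the names classify_amplification, Verdict, WARN_RATIO, and BLOCK_RATIO are hypothetical, and the handling of the exact 100x and 1000x boundaries and of an empty compressed stream are assumptions made for the example. Only the two thresholds come from the release notes.

```python
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    WARN = "warn"
    BLOCK = "block"


# Thresholds taken from the release notes; the names are hypothetical.
WARN_RATIO = 100.0
BLOCK_RATIO = 1000.0


def classify_amplification(compressed_bytes: int, declared_bytes: int) -> Verdict:
    """Classify a stream by how much it claims to expand when decoded.

    compressed_bytes: size of the encoded data as stored in the file.
    declared_bytes:   size the decoded output would occupy.
    """
    if compressed_bytes <= 0:
        # Assumption: an empty stream that declares output is treated as
        # unbounded amplification.
        return Verdict.BLOCK if declared_bytes > 0 else Verdict.ALLOW

    ratio = declared_bytes / compressed_bytes
    if ratio > BLOCK_RATIO:
        return Verdict.BLOCK   # above 1000x
    if ratio >= WARN_RATIO:
        return Verdict.WARN    # 100x to 1000x (boundaries assumed inclusive)
    return Verdict.ALLOW


if __name__ == "__main__":
    for declared in (50_000, 100_000, 1_000_000, 1_000_001):
        print(declared, classify_amplification(1_000, declared).value)
```

Having a warn tier between allow and block lets highly compressible but legitimate images be flagged for review rather than rejected outright.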

Validated on the bench corpus

  • Detection 80/80, neutralization 80/80, CDR fidelity 72/72 bit-exact.
  • False positives 0/605 (575 real CTs plus 30 curated; one-sided 95% upper bound 0.50%).
  • Differentiation 39/62; McNemar chi-square 49.0, p < 1e-6 (51 files DicomLock flags that every toolkit accepts; 0 DicomLock blind spots vs the matrix).
  • Pinned codec: a fuzzer-found malformed JPEG 2000 stream OOM-kills OpenJPEG 2.3.0 (built with ASan) when decoded raw; the CDR quarantines the carrier DICOM file. This is a DoS-class issue, not memory corruption.
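
The statistical figures above can be approximated with textbook formulas. The sketch below is not the bench code, and its function names are hypothetical. It assumes the rule-of-three bound is 3/n, that the McNemar statistic uses the continuity-corrected form (which appears consistent with the reported 49.0 for 51 versus 0 discordant files), and that the Wilson interval uses the standard score formula with z = 1.959964.

```python
import math


def rule_of_three_upper(n: int) -> float:
    """Approximate one-sided 95% upper bound on a rate with 0 events in n trials."""
    return 3.0 / n


def wilson_interval(k: int, n: int, z: float = 1.959964) -> tuple[float, float]:
    """Wilson score interval for k successes out of n trials."""
    p = k / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def mcnemar_chi2(b: int, c: int) -> float:
    """McNemar chi-square with continuity correction; b and c are discordant counts."""
    return (abs(b - c) - 1) ** 2 / (b + c)


def chi2_sf_1df(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


if __name__ == "__main__":
    print(f"FP upper bound, 0/605: {rule_of_three_upper(605):.2%}")
    lo, hi = wilson_interval(80, 80)
    print(f"Wilson 95% CI, 80/80: [{lo:.3f}, {hi:.3f}]")
    stat = mcnemar_chi2(51, 0)
    print(f"McNemar chi-square: {stat:.1f}, p < 1e-6: {chi2_sf_1df(stat) < 1e-6}")
```

With 0 of 605 false positives, 3/605 rounds to 0.50%, which agrees with the bound quoted in the list.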

Additions on main since 0.8.0 (not yet republished)

  • Diverse-modality validation (bench.diverse_check): 0 false positives and 270/270 bit-exact across 120 brain MR (UPENN-GBM) + 150 chest radiographs (LIDC-IDRI). Across all real public TCIA data now on disk: 0 FP in 845 files spanning 3 modalities.
  • CDR fidelity at scale (bench.fidelity): 623/623 native and lossless bit-exact across 13 transfer syntaxes; 20/20 lossy preserved as decoded.
  • De-id first-class. CLI --deid renders a colored re-identification score bar with a per-channel breakdown; the web UI shows a teal re-identification risk card.
  • B1 (bench.reid_vs_anonymizer): on 60 brain MR, dicognito 0.19 re-pseudonymizes 120/120 direct identifiers but leaves the pixels byte-identical in 60/60 files, so the facial-geometry and burned-in re-id channels are provably unchanged by tag anonymization.
  • B3 (bench.reid_audit): residual re-identification-risk audit across 845 public files. Facial-geometry channel fires on 96.7% of head MR (the Mayo concern); burned-in pixel text on 89.3% of chest radiographs; CT body imaging is low pixel-domain risk (face 0.3%, burn 8%).
  • PREPRINT.md complete with 24 web-verified references and the diverse-modality numbers folded in. REPRODUCE.md maps each headline number to its exact command.

Install

pip install dicomlock

License

Apache-2.0