v2.0.0: 93% Recall at 8% Noise Rate (SecretBench), "Dark Matter" Coverage and Performance

@ntoskernel ntoskernel released this 04 Jun 15:26
c1e0202

DeepSecrets 2.0 is here! 🚀

Now tested against the SecretBench Benchmark with:

  • 93% Recall
  • 8% Noise Rate
  • ~9K Extra Findings outside the benchmark scope

Improvements

Coverage

  • Nested Formats Parsing: LexerTokenizer is now able to detect code nesting and correctly parse situations like "inline YAML inside YAML inside Markdown".
  • CheapVarDetector: Detects potential variable declarations even in "unlexable" code.
  • Edge-cases: Better variable extraction for edge cases in Shell, JavaScript, Markdown, PHP, and C#.
  • Deeper Language Support: Expanded native tracking surface for R(d), Ruby, and Nix configurations.
  • Tighter Regexes: Improved "classic" backup regexes for secrets from AWS, Stripe, MailChimp, and our favorite -----BEGIN constructions.
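To illustrate the idea behind detecting variable declarations without a working lexer, here is a minimal sketch. It is not the CheapVarDetector implementation: the pattern, the `ASSIGNMENT` name, and the `cheap_var_candidates` helper are all hypothetical, and the regex only covers simple `name = value` and `name: value` shapes.

```python
import re

# Hypothetical pattern (an assumption, not taken from DeepSecrets):
# a name, then '=' or ':', then an optionally quoted value without spaces.
ASSIGNMENT = re.compile(
    r"""(?P<name>[A-Za-z_][A-Za-z0-9_.-]*)\s*(?:=|:)\s*(?P<quote>["']?)(?P<value>[^\s"']+)(?P=quote)"""
)


def cheap_var_candidates(text):
    """Return (name, value) pairs that look like assignments in raw text."""
    return [(m.group("name"), m.group("value")) for m in ASSIGNMENT.finditer(text)]


if __name__ == "__main__":
    # Works on text that no language lexer would accept as a whole.
    print(cheap_var_candidates('export API_KEY="abc123" ; password: hunter2'))
```

A real detector would need to handle multi-line values, escaped quotes, and language-specific operators; this sketch deliberately ignores them.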

Precision

  • Confidence Scoring: Every finding candidate is dynamically scored by a system that evaluates naming layouts, value entropy, and semantic "naturalness" to reduce the false-positive rate.
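Value entropy is one of the signals mentioned above. As a rough illustration only, the sketch below computes Shannon entropy per character; how DeepSecrets actually measures entropy and combines it with the other signals is not described in these notes, so the `shannon_entropy` function is an assumption.

```python
import math
from collections import Counter


def shannon_entropy(value: str) -> float:
    """Shannon entropy of a string, in bits per character."""
    if not value:
        return 0.0
    counts = Counter(value)
    total = len(value)
    # Sum of -p * log2(p) over the observed character frequencies.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    # Repetitive values score low; random-looking values score higher.
    for sample in ("aaaaaaaa", "password", "q8Zr2LkX9vTm"):
        print(sample, round(shannon_entropy(sample), 2))
```

Entropy alone misclassifies hashes, UUIDs, and encoded blobs as secrets, which is presumably why it is paired with naming and "naturalness" checks.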

Performance and Stability

  • ~30% faster and more reliable for large files (up to 200 MB) with rich semantics.
  • The UI now shows progress and estimates for each file, as well as overall progress.

Important Changes

Switching to SARIF reports

Warning

The legacy JSON report format is now deprecated and will be removed in the next major release. For now, you can still select it via --outformat json, which will trigger a deprecation warning.

We are switching to the industry-standard SARIF (v2.1.0) format to provide seamless integration with orchestration, CI/CD pipelines, and ASPM systems like GitHub Security and DefectDojo:

  • Virtual Subrules (e.g., S105-LOW): Enforce proper precision and security-severity inside third-party dashboards.
  • Smart Tracking (partialFingerprints): Populates a fingerprint for each finding, so that even with automatic masking enabled, downstream systems can track moving secrets without duplicating them.
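To show how these two features fit into a SARIF 2.1.0 document, here is a minimal sketch. It is not the actual DeepSecrets output: the fingerprint key `secretHash/v1`, the SHA-256 hashing, the message text, and the `precision` and `security-severity` values are assumptions chosen for illustration. Only the `S105-LOW` rule ID and the use of `partialFingerprints` come from these notes.

```python
import hashlib
import json


def make_result(rule_id, uri, line, secret_value):
    """Build one SARIF result; the fingerprint scheme here is hypothetical."""
    # Hashing the value lets a consumer match findings without seeing the secret.
    fingerprint = hashlib.sha256(secret_value.encode("utf-8")).hexdigest()
    return {
        "ruleId": rule_id,
        "level": "warning",
        "message": {"text": "Potential hardcoded secret"},
        "locations": [{
            "physicalLocation": {
                "artifactLocation": {"uri": uri},
                "region": {"startLine": line},
            }
        }],
        "partialFingerprints": {"secretHash/v1": fingerprint},
    }


def make_report(results):
    """Wrap results in a minimal SARIF 2.1.0 log with one virtual subrule."""
    rule = {
        "id": "S105-LOW",
        # Property values below are illustrative assumptions.
        "properties": {"precision": "low", "security-severity": "3.0"},
    }
    return {
        "version": "2.1.0",
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "runs": [{
            "tool": {"driver": {"name": "DeepSecrets", "rules": [rule]}},
            "results": results,
        }],
    }


if __name__ == "__main__":
    before = make_result("S105-LOW", "config/old.yml", 12, "hunter2")
    after = make_result("S105-LOW", "config/new.yml", 40, "hunter2")
    # Same secret in a new location keeps the same fingerprint.
    print(before["partialFingerprints"] == after["partialFingerprints"])
    print(json.dumps(make_report([after]), indent=2))
```

Because the fingerprint in this sketch depends only on the value, a secret that moves between files keeps the same identity, which is the behavior the Smart Tracking bullet describes.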

Warning

Breaking Change for Existing Alerts:
If you have used DeepSecrets before and already have a set of deduplicated findings in GitHub or DefectDojo, switching to Virtual Subrules will dynamically alter rule IDs (e.g., changing them to S105-LOW or S105-HIGH). This will likely cause your platforms to treat them as new issues, creating a one-time wave of duplicate alerts. I am truly sorry for this temporary inconvenience, but this change is vital for proper semantic precision mapping going forward.

📦 Quick Upgrade

pip install --upgrade deepsecrets

Full Changelog: v1.4.0...v2.0.0