perf(detectors): quick-reject pre-screen on auth detectors (-31% detector CPU) #111

Merged
aksOps merged 1 commit into main from perf/auth-detector-pre-screen on Apr 29, 2026

@aksOps aksOps commented Apr 29, 2026

Summary

Three cross-cutting auth detectors (CertificateAuthDetector,
SessionHeaderAuthDetector, LdapAuthDetector) burn 55% of all detector CPU
on real-world polyglot scans because they run a lines × patterns double loop
on every supported-language file — even files with zero auth keywords.

This PR adds a per-detector PRE_SCREEN Pattern: one regex pass over file
content; if no distinctive literal substring of any underlying pattern is
present, the file cannot match — short-circuit before the line loop.
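A minimal sketch of the short-circuit described above. The class name, the three line patterns, and the literals in PRE_SCREEN are illustrative assumptions, not the project's actual ALL_PATTERNS or detector API; the point is only the shape: one whole-file find() before the lines × patterns loop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for one auth detector; names and patterns are assumptions.
final class PreScreenSketch {
    // Line-level patterns (illustrative, not the real ALL_PATTERNS).
    static final List<Pattern> ALL_PATTERNS = List.of(
            Pattern.compile("\\bLdapContext\\b"),
            Pattern.compile("ldap_bind\\s*\\("),
            Pattern.compile("ldaps?://[\\w.-]+"));

    // Alternation of literals; every pattern above requires at least one of them.
    static final Pattern PRE_SCREEN =
            Pattern.compile("LdapContext|ldap_bind|ldap://|ldaps://");

    // Returns the 1-based line numbers that match any pattern.
    static List<Integer> detect(String content) {
        // Quick reject: a single regex pass over the whole file content.
        if (!PRE_SCREEN.matcher(content).find()) {
            return List.of();
        }
        // Unchanged slow path: lines x patterns double loop.
        List<Integer> hits = new ArrayList<>();
        String[] lines = content.split("\n", -1);
        for (int i = 0; i < lines.length; i++) {
            for (Pattern p : ALL_PATTERNS) {
                if (p.matcher(lines[i]).find()) {
                    hits.add(i + 1);
                    break; // one hit per line is enough for this sketch
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(detect("int x = 1;\nreturn x;"));
        System.out.println(detect("a\nctx = new LdapContext();"));
    }
}
```

In this sketch a file with no listed literal never reaches the line loop, which is where the PR reports the detectors spending their time.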

Measured impact

JFR ExecutionSample profile, JDK 25 Temurin, on a retained 30K-file polyglot
fixture (12 repos under ~/projects/polyglot-bench/: spring-petclinic-ms,
airflow, istio, eShop, angular/components, nuxt, actix/examples, ktor-samples,
nlohmann/json, play-samples, PSScriptAnalyzer, terraform-aws-eks; 14 distinct
languages active, including Python, TS, Java, Go, C#, Rust, Kotlin, and Scala):

Detector                   Before  After  Δ samples  Δ ~CPU
CertificateAuthDetector       244    147     -39.8%  -0.97s
SessionHeaderAuthDetector     206     43     -79.1%  -1.63s
LdapAuthDetector               47     25     -46.8%  -0.22s
Auth subtotal                 497    215     -56.7%  -2.82s
All detectors                 902    624     -30.8%  -2.78s

(Each sample ≈ 10ms at JFR's profile setting.)
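The CPU and percentage columns follow directly from the sample counts. A small sketch of that arithmetic, assuming the 10 ms-per-sample figure stated above; the class and method names are hypothetical.

```java
// Hypothetical helper reproducing the table arithmetic from sample counts.
final class SampleMath {
    // Assumption taken from the note above: one ExecutionSample is about 10 ms.
    static final double SECONDS_PER_SAMPLE = 0.010;

    // Approximate CPU seconds saved between two sample counts.
    static double cpuSecondsSaved(int before, int after) {
        return (before - after) * SECONDS_PER_SAMPLE;
    }

    // Signed percentage change in samples (negative means fewer samples).
    static double percentChange(int before, int after) {
        return 100.0 * (after - before) / before;
    }

    public static void main(String[] args) {
        // SessionHeaderAuthDetector row: 206 -> 43 samples.
        System.out.printf("%.2f s saved, %.1f%%%n",
                cpuSecondsSaved(206, 43), percentChange(206, 43));
        // All detectors row: 902 -> 624 samples.
        System.out.printf("%.2f s saved, %.1f%%%n",
                cpuSecondsSaved(902, 624), percentChange(902, 624));
    }
}
```

The same helper gives the 55% share quoted in the summary: 497 auth samples out of 902 detector samples before the change.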

Why this is safe

PRE_SCREEN is constructed as a regex alternation of every distinctive literal
substring drawn from the existing patterns in ALL_PATTERNS / LANGUAGE_PATTERNS.
Files that don't contain any of those substrings cannot match any underlying
pattern by construction — so the early return DetectorResult.empty() is
identical in observable behavior to running the existing line loop and emitting
zero nodes.
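The safety argument rests on one invariant: whenever any underlying pattern matches, PRE_SCREEN matches too. A hedged sketch of how such an alternation could be built from literals and spot-checked against sample inputs follows; PreScreenBuilder and its methods are hypothetical names, not part of this PR. Note that the invariant only holds if the pre-screen is at least as permissive as the patterns, so a case-insensitive pattern would need a case-insensitive pre-screen.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper; not the project's API.
final class PreScreenBuilder {
    // Builds an alternation of quoted literals so regex metacharacters stay literal.
    static Pattern fromLiterals(List<String> literals) {
        String alternation = literals.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        return Pattern.compile(alternation);
    }

    // Spot check of the invariant: no sample that matches an underlying
    // pattern may be rejected by the pre-screen.
    static boolean neverRejectsAMatch(Pattern preScreen,
                                      List<Pattern> patterns,
                                      List<String> samples) {
        for (String s : samples) {
            boolean anyMatch = patterns.stream().anyMatch(p -> p.matcher(s).find());
            if (anyMatch && !preScreen.matcher(s).find()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Illustrative patterns and literals (assumptions for this sketch).
        List<Pattern> patterns = List.of(
                Pattern.compile("X-Session-\\w+"),
                Pattern.compile("Set-Cookie:\\s*\\w+"));
        Pattern pre = fromLiterals(List.of("X-Session", "Set-Cookie"));
        List<String> samples = List.of("X-Session-Token: abc", "Set-Cookie: sid", "int x;");
        System.out.println(neverRejectsAMatch(pre, patterns, samples));
    }
}
```

A check like this only samples the invariant; it does not prove it. The guarantee itself comes from choosing, for each pattern, a literal that every possible match must contain.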

Detection semantics unchanged for files that DO contain at least one keyword:
pre-screen passes, the existing line × patterns logic runs unmodified, same
nodes emitted with the same IDs/labels/properties/line numbers.

Tests

3689 tests run / 0 failures / 0 errors / 32 skipped, the same as baseline. All 65
auth detector tests pass without modification (they all use keyword-bearing
fixtures, which the pre-screen lets through). The "no match on plain code"
negative tests still pass: the pre-screen rejects those files on the faster
path, and the result is the same empty DetectorResult.

What's NOT in this PR

  • No changes to non-auth detectors. The other detectors in the top 15 are
    either AST-based (where the bottleneck is the tree walk, not regex) or
    already use single-pass Matcher.find(). A pre-screen would gain little
    there, and that gain does not justify the regression risk on AST code paths.
  • No abstract base class refactor. Per-detector PRE_SCREEN keeps blast
    radius minimal and the optimization explicit at each call site. If a
    pattern emerges across many regex detectors, a follow-up PR can hoist
    to AbstractRegexDetector.

Test plan

  • mvn test -Dtest='*Auth*Test,AuthDetectorsCoverageTest' — 65/65 pass
  • mvn test (full suite) — 3689/0/0/32 skipped
  • JFR re-profile on polyglot-bench — verifies the -31% detector CPU
  • CI green
  • Auto-merge on green

🤖 Generated with Claude Code

…ctor CPU)

Profiling on a 30K-file polyglot fixture (kept at ~/projects/polyglot-bench:
spring-petclinic-microservices, airflow, istio, eShop, angular/components,
nuxt, actix/examples, ktor-samples, nlohmann/json, play-samples,
PSScriptAnalyzer, terraform-aws-eks; 14 distinct languages) showed the three
cross-cutting auth detectors burning 55% of all detector CPU because they
ran the lines × patterns double loop on every supported-language file —
even files with zero auth keywords.

Fix: per-detector PRE_SCREEN Pattern with all distinctive literal substrings
of the underlying patterns. One regex pass over file content; if no keyword
present, the file cannot match — short-circuit before the line loop.

Measured impact (JFR ExecutionSample, JDK 25, polyglot fixture):

  CertificateAuthDetector:  244 → 147 samples  (-39.8%, -0.97s CPU)
  SessionHeaderAuthDetector: 206 →  43 samples  (-79.1%, -1.63s CPU)
  LdapAuthDetector:           47 →  25 samples  (-46.8%, -0.22s CPU)
  Auth subtotal:             497 → 215 samples  (-56.7%, -2.82s)
  All detectors total:       902 → 624 samples  (-30.8%, -2.78s)

Detection semantics unchanged — pre-screen rejects only files where no
underlying pattern can match (keyword absent). Tests covering keyword-bearing
fixtures pass through pre-screen and run the existing logic byte-for-byte.

Tests: 3689 / 0 failures / 0 errors / 32 skipped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@aksOps aksOps enabled auto-merge (squash) April 29, 2026 13:34
@aksOps aksOps merged commit 3bc3ebf into main Apr 29, 2026
13 checks passed
@aksOps aksOps deleted the perf/auth-detector-pre-screen branch April 29, 2026 13:38