perf(scorer): precompute log scores in NewRankScorer#16
Conversation
Native Math.log (libmLog) was 5.46% of CPU in the post-PR#15 Astral profile. The call sites in NewRankScorer.getErrorScore and getNodeScore / getMissingIonScore compute log(x/y) over frequency arrays that are immutable after scorer load; the denominator scale factor min(ionType.getCharge(), numSegments) is also load-time constant. Cache the resulting float values once at the end of readFromInputStream and replace the runtime Math.log calls with direct array indexing. Scoring results are bit-identical: same expressions, same operand ordering, same float rounding; the only difference is that the cast to float happens once per cell at load instead of per call. Both hot-path methods keep a fallback to the original computation so legacy callers that populate the tables without going through readFromInputStream still work. Benchmarks on the current machine state (baseline = dev HEAD jar, same run, same thermal state): PXD001819 LFQ: 122.7s -> 110.4s (-10.0%), 2410 -> 2292 MB (-4.9%) TMT: 295.7s -> 277.9s ( -6.0%), 2793 -> 2818 MB (+0.9%) Astral: 1002.9s -> 883.5s (-12.0%), 7707 -> 7351 MB (-4.6%) PSM@1%FDR and SII counts match exactly on all three datasets. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Pull request overview
This PR optimizes NewRankScorer hot-path scoring by precomputing the log-score tables at scorer load time, reducing repeated Math.log calls and repeated HashMap lookups during node/edge scoring.
Changes:
- Added precomputed log-score caches (
errorLogTable,nodeLogTable) and populated them viaprecomputeLogScoreTables()after parameter loading. - Updated
getNodeScore,getMissingIonScore, andgetErrorScoreto use the precomputed tables with a fallback to the original computation path. - Hooked precomputation into
readFromInputStreamto ensure caches are built for the standard loading path.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| for (Map.Entry<IonType, Float[]> ie : ionTable.entrySet()) { | ||
| IonType ionType = ie.getKey(); | ||
| Float[] frequencies = ie.getValue(); | ||
| if (frequencies == null) continue; | ||
| int n = Math.min(frequencies.length, noiseFrequencies.length); | ||
| int chargeOrSeg = Math.min(ionType.getCharge(), numSegments); | ||
| float[] logs = new float[n]; | ||
| for (int i = 0; i < n; i++) { | ||
| float ionFrequency = frequencies[i]; | ||
| float noiseFrequency = noiseFrequencies[i] * chargeOrSeg; | ||
| // Match getScoreFromTable semantics exactly: guard against non-positive only in assertions. | ||
| logs[i] = (float) Math.log(ionFrequency / noiseFrequency); | ||
| } |
Summary
libmLog(nativeMath.log) showed up at 5.46 % of CPU in the Astral profile taken after PR #15 merged. The call sites inNewRankScorer—getErrorScore,getNodeScore, andgetMissingIonScore— computelog(x/y)over frequency arrays that are immutable after scorer load. The denominator scale factormin(ionType.getCharge(), numSegments)is also load-time constant.This PR precomputes those logs once in
precomputeLogScoreTables()at the end ofreadFromInputStream, stores them asfloat[]per(partition, …), and replaces the runtimeMath.logcalls with direct array indexing. Each scored edge/node saves oneMath.logplus twoHashMap.getcalls (theNOISElookup and one of the per-type lookups collapse into the cache).Scoring is bit-identical: same expressions, same operand order, same float rounding; the cast to
floatjust happens once per cell at load time. Both hot-path methods keep a fallback to the original path, so any legacy callers that populate the raw tables without going throughreadFromInputStreamkeep working.This PR in isolation (same machine state,
devHEAD jar vs this branch)Cumulative impact — all three merged + pending perf PRs together (this branch vs pre-PR#15
dev)devPSM@1%FDR and SII remain bit-exact at every step. Cumulative contributions on Astral (for context):
b1a1498— port primitive CSR graph + flat-array GF (regressed when alone, absorbed by next step)9d07047— streaming mass-index GF merger → 2155.9 s → 1330.3 s (-38.3 %)8442f2c— dropjava.util.HashtableforHashMap/ConcurrentHashMap→ ~-28 pp further (removed monitor contention)cafbf73) — precompute log scores → another -12 % vs same-statedevHEADMachine: macOS, 11 CPUs, 18 GB RAM, Java 21.0.6,
-Xmx8192m(TMT/Astral) /-Xmx4096m(PXD001819), 4 threads.Test plan
mvn -B verify— 141 tests pass (57 skipped for missing external data)devHEAD; RSS neutral or betterdev): 37–59 % wall-time reduction, RSS within ±9 %Non-goals
getIonExistenceScore— its denominator depends on per-spectrumprobPeakand is not precomputable.Map.copyOf/ immutable-map conversion (tried on PR perf(msgf): CSR graph + streaming GF merge + drop Hashtable sync #15, regressed Astral 2.2×, reverted).🤖 Generated with Claude Code