stable_sort events in Result::Finalize for cpp-mirror parity (#375)#20
Open
matsen wants to merge 3 commits into
Open
stable_sort events in Result::Finalize for cpp-mirror parity (#375)#20matsen wants to merge 3 commits into
matsen wants to merge 3 commits into
Conversation
…gedQuery - glomerator.cc ReadCacheFile: stof→stod for log_probs_ and naive_hfracs_ (both maps are map<string,double>; cache files are written with setprecision(20)) - glomerator.cc GetMergedQuery: double(n)*mute_freq_ so products use double arithmetic instead of float (size_t*float→float was losing precision) - text.cc Floatify: stof→stod (return type is vector<double>) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…pace (#19) Replaces the glibc log + exp pair inside AddInLogSpace with a single 65 K-entry linear-interpolation lookup of softplus(d) = log(1+exp(d)). On modern glibc 2.29+ with FMA-optimized scalar log/exp, this saves 20% of bcrham wall and 15% of partis wall on a 5k paired partition (median-of-3 on AMD Zen 5; bigger wins on older Intel/AMD per cross-arch microbench at psathyrella/partis#366 item 3.2). Implementation in src/fast_math.c, declared via extern "C" in mathutils.h. Linear-interp arithmetic dominates over the table size in cycle count, so picking 64 K is essentially free; 64 K nodes give 6.5e-9 max abs error over [-30, 0] — five orders of magnitude under the partis-test 1e-5 gate. Verified bit-identical partis output vs glibc baseline at 5k partition (3797 + 3802 events). Built with -O3 -ffp-contract=off -fno-builtin so the linear-interp is bit-deterministic across compilers, matching the Zig mirror at packages/zig-core/src/ham/fast_math.c on the partis side. For psathyrella/partis#366 item 3.2.
5 tasks
RecoEvent::score_ is float (single-precision); two ksets whose f64 viterbi scores differ by ~1e-6 collapse to identical float values. The previous sort+reverse used std::sort (unstable introsort), which picks an algorithm-dependent winner among tied events. The Zig backend uses std.sort.block (stable), so the two backends could pick different best_events for the same query and parameter set. Switch to stable_sort with a descending comparator. Push order in DPHandler::Run is k_v=vmax-1..vmin, k_d=dmax-1..dmin, so the first-pushed tied event wins; both backends now agree. The divergence was visible in unbounded paired-loci partition at 30k on /tmp/paired-paper-all-seqs.fa — different best_events propagated through naive_seq -> hfrac -> cluster merge decisions, giving 919 different paired clusters per locus despite identical cluster counts. Smaller workloads (<=20k) only triggered the float-tie pattern rarely enough that the divergence didn't surface. See partis issue #375. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
d354b76 to
35bf4a9
Compare
This was referenced May 8, 2026
3 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Replace
sort + reversewithstable_sort+ descending comparator inResult::Finalize. Tied events (kets with f64 viterbi scores within ~1e-6 that collapse to identical f32 values when stored inRecoEvent::score_) are now resolved deterministically by push order — first-pushed kset wins.This is the C++ side of the cpp-mirror fix for partis issue #375. Companion partis PR: psathyrella/partis#376.
Why
RecoEvent::score_isfloat;Result::Finalizesorts events by score and picksevents_[0]asbest_event_. When two ksets compute f64 viterbi scores within ~1e-6 of each other, the f32 narrowing collapses them to identical float values.std::sortis unstable (libstdc++ uses introsort) and picks an algorithm-dependent winner. The Zig backend usesstd.sort.block(stable), so the two backends could pick differentbest_events for the same query and parameter set — different (v, d, j) gene calls, differentnaive_seq, differenthfrac, different cluster merges. Latent until ≥25k where float-tied ksets become frequent enough.Push order in DPHandler::Run
C++ pushes events in nested reverse-iteration order (
dphandler.cc:101-119):Zig matches this order. Both backends with stable sort + descending comparator pick
events_[0]= first-pushed tied event = highest-(k_v, k_d) tied kset.Test plan
scons bcrham)(d_3p_del=3, dj_insertion='')as Zig (was(4, 'A')pre-fix)partis-test.py --paired --no-simupasses (both with and without--zig)🤖 Generated with Claude Code
Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com