Phase 3 close-out: main merge, missing-kernel ports, fused annot/splice haps, seqpro 0.20 #246

Merged
d-laub merged 22 commits into rust-migration from phase-3-reconstruction
Jun 25, 2026

@d-laub d-laub commented Jun 25, 2026

Phase 3 close-out

Brings phase-3-reconstruction to an honest, fully-rust-default state. Closes out the Phase 3 reconstruction + track-realignment migration.

What changed (7 tasks, each parity-gated)

  1. Merged origin/main: pulls in the #242 intervals_to_tracks sub-query-start clip fix (PR #244, both backends) and the #243 SpliceIndexer subset double-apply fix (PR #243). The fused tracks kernel inherits the clip fix via the shared intervals::intervals_to_tracks core.
  2. Lifted the now-obsolete #242 xfails: removed 5 _REASON_242 constants + 10 @pytest.mark.xfail decorators; the max_jitter>0 interval domain now has real, asserted coverage on both backends. The genuine trailing-under-write assume(False) parity guards (a separate, still-live numba-undefined domain) were correctly retained.
  3. Rerouted Reference.fetch through the dispatched rust get_reference; deleted the 3 zero-caller _fetch_* numba functions. New tests/parity/test_reference_fetch_parity.py backstop.
  4. Fused the annotated-haps path — new reconstruct_annotated_haplotypes_fused (rust): the plain fused kernel + two i32 annotation buffers, reusing the shared reconstruct core. Byte-identical to the composed numba oracle (haps + var_idxs + ref_coords).
  5. Fused the spliced-haps path — new reconstruct_haplotypes_spliced_fused (rust): the plain fused kernel minus the diff/offset computation (takes the precomputed splice out_offsets). New tests/parity/test_spliced_haplotypes_parity.py. (The annotated+spliced intersection remains on the unfused dispatched rust core — parity-gated and rust-by-default — with fusion deferred to Phase 5.)
  6. Bumped seqpro 0.18 → 0.20.0 and adopted to_numpy(validate=False) at the one guaranteed-uniform read-path site (_reference.py fixed-length branch). The seqpro-core 0.1.0 Ragged layout remains compatible (cargo + parity green).
  7. Roadmap honesty pass + full-tree verification — checked off the reconciled Phase 3 items, resolved the ✅-header/unchecked-box contradiction, added a dated decisions-log entry.

Verification

Phase 5 carry-forward (non-blocking)

  • Thread an explicit output_length through ReconstructionRequest (retire the _out_per == hap_lengths ragged/fixed heuristic).
  • Route the three os.environ["GVL_BACKEND"] reads in _haps.py through a single dispatch-aware helper.
  • Complete-or-drop genvarloader.pyi (currently stubs only the 3 bigwig functions).
  • Fuse the annotated+spliced intersection.
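
As a rough illustration of the second carry-forward item, a single dispatch-aware helper could look like the sketch below. The name `get_backend`, the accepted values, and the error behavior are assumptions made for illustration; only the `GVL_BACKEND` variable and the rust default come from this PR.

```python
import os

# Assumed set of valid values, based on the two backends named in this PR.
_VALID_BACKENDS = ("rust", "numba")


def get_backend(default: str = "rust") -> str:
    """Hypothetical helper: the one place that reads GVL_BACKEND."""
    # .get() avoids the KeyError that os.environ["GVL_BACKEND"] raises when unset.
    value = os.environ.get("GVL_BACKEND", default).strip().lower()
    if value not in _VALID_BACKENDS:
        raise ValueError(
            f"GVL_BACKEND must be one of {_VALID_BACKENDS}, got {value!r}"
        )
    return value
```

Call sites would then branch on `get_backend()` instead of reading the environment themselves, so the default and the validation live in one spot.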

🤖 Generated with Claude Code

d-laub and others added 22 commits June 24, 2026 18:55
Kernel-clip both backends (numba + rust) so intervals starting before
the per-read jittered query window paint correctly. Rejects the two
upstream fixes in the issue (write-clip breaks left-jitter; read-side
expanded-query is a redesign). Adds oracle test + widens parity strategy
to cover negative interval starts.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Tracks + max_jitter>0 store intervals against a jitter-expanded window,
but the read path queries the original chromStart, so left-edge intervals
have start < query_start. numba wrapped the negative index (silent wrong
output); rust hit debug_assert / bounds panic. Both kernels now clip to
the query window (s=max(start,0), e=min(end,length)), which is correct and
jitter-preserving. Adds cross-backend oracle tests + rust unit tests.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
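
The clip rule in the commit message above (s=max(start,0), e=min(end,length)) can be sketched in plain Python. This is a simplified stand-in, not the numba or rust kernel: `paint_intervals` and its list-based signature are hypothetical, and interval coordinates are assumed to be relative to the query start.

```python
def paint_intervals(starts, ends, values, length):
    """Paint query-relative intervals into a zero-initialized track."""
    track = [0.0] * length
    for start, end, value in zip(starts, ends, values):
        s = max(start, 0)       # left clamp: interval begins before the query
        e = min(end, length)    # right clamp: interval runs past the query
        for i in range(s, e):   # empty range when the interval misses the window
            track[i] = value
    return track


# An interval starting 3 bases before the query still paints positions 0..1.
track = paint_intervals([-3, 4], [2, 9], [1.0, 2.0], length=6)
print(track)  # [1.0, 1.0, 0.0, 0.0, 2.0, 2.0]
```

Without the left clamp, a negative start used as an index would wrap to the end of the buffer in Python-style indexing, which matches the silent wrong output described for the numba path.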
…242)

Widen the hypothesis strategy so the first interval may start before the
query (negative relative start), exercising the case that previously
violated the kernel contract. numba↔rust parity holds.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The module doc-comment described end-clamping but not the new left-clamp
(s = max(start, 0)) added for sub-query interval starts. Sync it with the
inline comment so the documented semantics match the code.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…k-jitter-clip

fix(intervals): clip sub-query interval starts in both kernels (#242)
When a Dataset has been sample-subsetted (non-identity sample_subset_idxs)
AND output is spliced, SpliceIndexer.parse_idx applied the sample-subset
map twice:

1. Via self.dsi._s_idx[s_idx] within parse_idx, which maps output-space
   sample positions to on-disk positions.
2. Again via self.dsi.parse_idx((r_idx, s_idx)) at the end, which applies
   _s_idx a second time to what are already on-disk positions.

With the MMRF consensus dataset this caused sample MMRF_2702 (svar pos 54,
GVL sorted 625) to receive MMRF_1395's NRAS G12D mutation because
sample_subset_idxs[625] mapped to MMRF_1395.

Fix: after the unravel/exon-expand step, r_idx and s_idx already hold
full-dataset positions. Compute the flat storage index directly via
ravel_multi_index using full_region_idxs, bypassing the second _s_idx
application that parse_idx would otherwise inject.

Adds regression test test_splice_indexer_subset.py that constructs a
SpliceIndexer with sample_subset_idxs=[2,4] over 5 on-disk samples,
verifies all three index paths (slice×slice, scalar×scalar, no-subset)
return the correct on-disk sample positions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
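
The double-application bug described above can be reproduced in miniature. The map below is a made-up example (not the MMRF data or the `sample_subset_idxs=[2,4]` test fixture), and `to_disk` is a hypothetical stand-in for the `_s_idx` lookup.

```python
# Hypothetical subset map: output-space position -> on-disk sample position.
subset_map = [3, 0, 4, 1]


def to_disk(positions):
    """Translate positions through the subset map once."""
    return [subset_map[i] for i in positions]


requested = [0, 1]        # output-space samples the caller asked for
once = to_disk(requested)  # correct: on-disk positions
twice = to_disk(once)      # bug: on-disk positions treated as output-space

print(once)   # [3, 0]
print(twice)  # [1, 3]
```

Applying the map a second time returns valid-looking indices for different samples, which is why the failure shows up as scrambled samples rather than an error.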
…ramble

fix: SpliceIndexer double-applies sample-subset map (spliced+subset sample scramble)
…pro 0.20

Design for: merge origin/main (#242/#244 clip fix + #243 splice-subset fix)
into the branch, lift the now-obsolete #242 xfails, port Reference.fetch to
rust, fuse the annotated/splice haps paths, bump seqpro 0.18->0.20 with
to_numpy(validate=False) adoption, and reconcile the roadmap honestly.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
7 tasks: merge origin/main (#242/#243), lift obsolete #242 xfails, reroute
Reference.fetch through rust get_reference, fuse annotated + spliced haps
kernels, bump seqpro 0.20 + validate=False, roadmap honesty pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…op dead _fetch_* numba

Reroute Reference.fetch to build a (n,3) regions array and call the
module-level get_reference dispatcher (rust-default) instead of the
private _fetch_impl_par/_fetch_impl_ser numba pair. Delete the now-dead
_fetch_row, _fetch_impl_par, _fetch_impl_ser functions and update the
unit test that directly imported them.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…arity)

Adds `reconstruct_annotated_haplotypes_fused` Rust FFI entry that combines
diff-computation, output-length allocation, and reconstruction into one
crossing, returning (out_data, annot_v, annot_pos, out_offsets). Routes the
non-splice annotated haplotypes Python branch to this kernel when
GVL_BACKEND=rust (default); numba branch unchanged. Parity test updated to
spy the new fused entry and verify byte-identical (haps + var_idxs + ref_coords)
across both backends.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ity)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…orm read-path sites

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…splice fusion scope

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…imization targets

Replace the stale 500-batch-script numbers (~37 haps / ~20 tracks) with same-harness
pytest-benchmark e2e results at HEAD on both backends: rust now within ~10-17% of numba
on haps/tracks (0.85-0.90x), 0.65x on the new annotated path. py-spy --native profile of
the rust annotated ds[r,s] (43k samples) ranks Phase 5 targets: (1) hoist per-batch
ascontiguousarray of dataset-static arrays (~21%), (2) skip output-buffer zeroing (~8%),
(3) scratch-pool the per-call allocs (~6%), (4) fold reverse_complement into the kernel.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… is a rust-only scalability defect

Profiling + a per-batch ascontiguousarray copy-trace revealed the ~20% self-time leaf is NOT
static-array churn but the fused track path materializing the full per-sample-scale interval
record store every batch: intervals are an array-of-structs memmap ({start:i4,end:i4,value:f4},
itemsize 12), so .starts/.ends/.values are strided field views; np.ascontiguousarray copies the
whole store (GB-scale / OOM at >1M samples). The numba path reads the strided views with no copy,
so this is a rust regression. Fix: Rust reads the contiguous record buffer directly (zero-copy).
Genotype memmap is the same pattern but currently benign (contiguous int32 -> no-op). Per-variant
arrays (sub-linear in samples) may be cached; per-sample-scale memmaps must never be materialized.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
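
To make the array-of-structs layout concrete, here is a small standard-library sketch of reading {start:i4,end:i4,value:f4} records in place from one contiguous buffer. It only illustrates the idea behind the fix; the actual change is in Rust and is not shown in this PR text. The little-endian layout and the helper names are assumptions.

```python
import struct

# One interval record: start (int32), end (int32), value (float32).
# "<" means little-endian with no padding, so the record is 12 bytes.
RECORD = struct.Struct("<iif")


def pack_records(records):
    """Build a contiguous record buffer from (start, end, value) tuples."""
    return b"".join(RECORD.pack(*r) for r in records)


def weighted_sum_in_place(buffer, first, last):
    """Sum value * (end - start) over records [first, last) without copying the store."""
    view = memoryview(buffer)  # no copy of the underlying bytes
    total = 0.0
    for i in range(first, last):
        start, end, value = RECORD.unpack_from(view, i * RECORD.size)
        total += value * (end - start)
    return total


store = pack_records([(0, 10, 1.5), (10, 20, 2.0), (20, 25, 4.0)])
print(RECORD.size)                          # 12
print(weighted_sum_in_place(store, 1, 3))   # 40.0
```

Only the records a batch touches are decoded, so the cost scales with the batch rather than with the size of the whole interval store.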
@d-laub d-laub merged commit 100ee7b into rust-migration Jun 25, 2026
1 check passed
@d-laub d-laub deleted the phase-3-reconstruction branch June 25, 2026 18:17