feat: configurable track insertion fill strategies #160

Merged: d-laub merged 18 commits into main from feat/track-insertion-options on May 12, 2026
Conversation

d-laub (Collaborator) commented May 12, 2026

Summary

  • Adds 5 user-selectable strategies for filling track values at insertion sites: Repeat5p (default, byte-identical to prior behavior), Repeat5pNormalized, Constant, FlankSample, and Interpolate (linear/quadratic/cubic Lagrange).
  • New Dataset.with_insertion_fill(strategy_or_dict) lets users set a strategy globally or per active track. Returns a new lazy view; original dataset unchanged.
  • Strategy dispatch happens inside the numba kernel shift_and_realign_track_sparse. FlankSample uses an inline xorshift64 hash of (base_seed, query, hap, out_idx) — parallel-safe, no np.random globals. deterministic=True derives a reproducible base_seed from the idx array.
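
To make the FlankSample bullet concrete, here is a minimal pure-Python sketch of counter-based sampling: each output position hashes its own coordinates, so no shared RNG state is needed and results do not depend on thread scheduling. The shift triple (13, 7, 17) is the standard xorshift64 one; the mixing scheme in `mix`, the additive constant, and the names `flank_sample`, `site`, and `flank` are assumptions for illustration, not the kernel's actual code.

```python
MASK64 = (1 << 64) - 1  # emulate uint64 wraparound with Python ints


def xorshift64(x: int) -> int:
    """One round of xorshift64 with the shift triple (13, 7, 17)."""
    x &= MASK64
    x ^= (x << 13) & MASK64
    x ^= x >> 7
    x ^= (x << 17) & MASK64
    return x


def mix(base_seed: int, query: int, hap: int, out_idx: int) -> int:
    """Fold the four coordinates into one 64-bit hash (hypothetical scheme)."""
    golden = 0x9E3779B97F4A7C15  # arbitrary odd constant used to spread small inputs
    h = (base_seed & MASK64) or golden  # xorshift maps 0 to 0, so avoid a zero start
    for v in (query, hap, out_idx):
        h = xorshift64(h ^ ((v + golden) & MASK64))
    return h


def flank_sample(track, site, flank, base_seed, query, hap, out_idx):
    """Pick one value from the window around `site`, clamped to the track bounds."""
    lo = max(0, site - flank)
    hi = min(len(track), site + flank + 1)
    h = mix(base_seed, query, hap, out_idx)
    return track[lo + h % (hi - lo)]


track = [0.1, 0.4, 0.3, 0.9, 0.2, 0.7]
# Same coordinates always give the same fill value, with no global RNG involved.
first = flank_sample(track, site=2, flank=2, base_seed=42, query=0, hap=1, out_idx=7)
second = flank_sample(track, site=2, flank=2, base_seed=42, query=0, hap=1, out_idx=7)
print(first == second)
```

Because the hash is a pure function of `(base_seed, query, hap, out_idx)`, any position can be filled independently of the others, which is what makes this style of sampling safe inside a parallel loop.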

Design + Plan

  • Spec: docs/superpowers/specs/2026-05-11-track-insertion-options-design.md
  • Plan: docs/superpowers/plans/2026-05-11-track-insertion-options.md

Test Plan

  • Kernel-level unit test for every strategy (tests/dataset/test_insertion_fill.py)
  • Default-path regression guard — Repeat5p produces byte-identical output to pre-change behavior
  • FlankSample determinism across calls, plus sensitivity to (query, hap) so that different pairs produce different seeds
  • Edge clamping when flank pool extends beyond the track
  • Per-track dict form falls back to Repeat5p for unspecified tracks
  • with_tracks prunes insertion_fill to active tracks
  • End-to-end through Dataset[...] with the dummy dataset
  • Rejection when reconstructor isn't HapsTracks
  • Full tests/dataset/ suite — 381 passed, 5 skipped, 1 xfailed, no regressions
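
The per-track fallback and pruning behaviors in the test plan can be sketched as two small helpers. This is an illustration only: `resolve_fills` and `prune_fills` are hypothetical names, the two dataclasses are stand-ins for the real strategy classes, and the `value` parameter on `Constant` is an assumption about its constructor.

```python
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class Repeat5p:
    """Stand-in for the default strategy named in the PR."""


@dataclass(frozen=True)
class Constant:
    """Stand-in for the constant-fill strategy; `value` is a hypothetical field."""
    value: float = 0.0


def resolve_fills(strategy_or_dict, active_tracks):
    """Expand a single strategy or a per-track dict into one entry per active track."""
    if isinstance(strategy_or_dict, Mapping):
        # Tracks missing from the dict fall back to Repeat5p.
        # Keys that are not active tracks are ignored in this sketch.
        return {t: strategy_or_dict.get(t, Repeat5p()) for t in active_tracks}
    # A bare strategy applies to every active track.
    return {t: strategy_or_dict for t in active_tracks}


def prune_fills(fills, active_tracks):
    """Keep only the fills whose track is still active."""
    return {t: fills[t] for t in active_tracks if t in fills}


fills = resolve_fills({"atac": Constant(1.0)}, ["atac", "rna"])
print(fills)
print(prune_fills(fills, ["rna"]))
```

Returning a new mapping rather than mutating the input matches the PR's description of `with_insertion_fill` as producing a new view while leaving the original dataset unchanged.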

Follow-ups (non-blocking, called out in final review)

  • The deterministic base_seed uses bitwise_xor.reduce(idx), which is permutation-invariant. The per-position (query, hap, pos) mixing in the kernel mitigates this for fills, but a position-sensitive hash would be stronger.
  • The _REPEAT_5P arm inside _apply_insertion_fill is unreachable from the outer kernel (which short-circuits) and is commented as such.
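
The permutation-invariance point in the first follow-up is easy to demonstrate. The sketch below uses `functools.reduce` with `operator.xor`, which gives the same result as `np.bitwise_xor.reduce` for non-negative integers, and contrasts it with an order-sensitive fold. The fold borrows the FNV-1a 64-bit offset basis and prime but mixes whole integers rather than bytes, so it is an FNV-1a-style illustration and not a proposal for what the project should adopt; `xor_seed` and `ordered_seed` are hypothetical names.

```python
from functools import reduce
from operator import xor

MASK64 = (1 << 64) - 1  # emulate uint64 wraparound with Python ints


def xor_seed(idx):
    """XOR-reduce the indices; reordering the input cannot change the result."""
    return reduce(xor, idx, 0)


def ordered_seed(idx):
    """FNV-1a-style fold over the indices; the result depends on element order."""
    h = 0xCBF29CE484222325  # FNV-1a 64-bit offset basis
    for v in idx:
        h ^= v & MASK64
        h = (h * 0x100000001B3) & MASK64  # FNV 64-bit prime
    return h


a, b = [1, 2], [2, 1]
print(xor_seed(a) == xor_seed(b))          # XOR is commutative and associative
print(ordered_seed(a) == ordered_seed(b))  # the multiply step breaks the symmetry
```

With the XOR reduction, two batches that request the same indices in a different order share a base seed; an order-sensitive fold would give them different ones.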

🤖 Generated with Claude Code

d-laub and others added 18 commits May 11, 2026 22:19
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…-sample tests

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add `insertion_fill` field to `Tracks`, `with_insertion_fill` method, and prune
fills in sync with `with_tracks`. Defaults to `Repeat5p` for every active track.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add Dataset.with_insertion_fill() to configure per-track insertion fill
strategies, re-export all InsertionFill strategy classes from the top-level
package, and add e2e tests for the full plumbing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@d-laub d-laub force-pushed the feat/track-insertion-options branch from 165144e to dbdaca4 May 12, 2026 05:19
@d-laub d-laub merged commit 670d939 into main May 12, 2026
5 checks passed
@d-laub d-laub deleted the feat/track-insertion-options branch May 12, 2026 16:42