v0.7.0
v0.7.0 is the next minor release.
👀 Highlights
🦘 Three-set kangaroos for 3× collision pairs, --mod-step/--mod-start for modular constraints, and GPU-side optimizations (dual-point eval, branchless carries, negation map tuning). Fixed signed distance handling that silently dropped valid collisions on larger puzzles.
🦘 Multi-Set Kangaroos
Three kangaroo sets instead of two. A second wild herd starts from -pubkey, giving C(3,2) = 3 valid collision pairs instead of 1. Cross-wild collisions use scalar halving mod n.
- 3-set dispatch — tame + wild₁ + wild₂ with automatic third-split allocation (#58)
- Cross-wild resolution — compute_candidate_keys_cross_wild() with SCALAR_HALF (#58)
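The scalar halving step can be illustrated with a minimal sketch. It assumes a toy odd prime order in place of the secp256k1 group order, and it ignores the sign ambiguity that the negation map introduces. The names `N`, `HALF`, and `recover_key_cross_wild` are hypothetical and are not the project's API. If wild₁ sits at (k + d1)·G and wild₂ at (−k + d2)·G when they meet, then 2k ≡ d2 − d1 (mod n), so k is that difference times the inverse of 2.

```rust
/// Toy group order: a small odd prime standing in for the real curve order n.
const N: u64 = 1_000_003;

/// Inverse of 2 modulo N. For odd N this is (N + 1) / 2,
/// because 2 * (N + 1) / 2 = N + 1, which is 1 modulo N.
const HALF: u64 = (N + 1) / 2;

fn mul_mod(a: u64, b: u64) -> u64 {
    ((a as u128 * b as u128) % N as u128) as u64
}

fn sub_mod(a: u64, b: u64) -> u64 {
    (a % N + N - b % N) % N
}

/// Hypothetical helper: recover k from a wild1/wild2 collision.
/// wild1 is at (k + d1)*G and wild2 is at (-k + d2)*G, so
/// k + d1 = -k + d2 (mod N), hence k = (d2 - d1) * HALF (mod N).
fn recover_key_cross_wild(d1: u64, d2: u64) -> u64 {
    mul_mod(sub_mod(d2, d1), HALF)
}

fn main() {
    let k = 424_242;
    let d1 = 17;
    // Choose d2 so that both walks land on the same group element.
    let d2 = (2 * k + d1) % N;
    let recovered = recover_key_cross_wild(d1, d2);
    assert_eq!(recovered, k);
    println!("recovered k = {recovered}");
}
```

A tame/wild collision needs only a subtraction of distances. The halving is specific to the wild/wild pair because the unknown k appears on both sides with opposite signs.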
🎭 Modular Constraint Filtering
If you know k ≡ R (mod M), the search space shrinks by a factor of M. Kangaroo rescales the problem with base point H = M*G and solves for j, where k = M*j + R.
- --mod-step/--mod-start — specify the constraint from the CLI (#59)
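The rescaling can be sketched on plain integers. This is a minimal illustration, assuming an inclusive key range [start, end] and 0 ≤ R < M. The helpers `reduced_range` and `lift` are hypothetical names, not functions from the crate. On the curve, the same substitution corresponds to searching for j with target point Q − R*G and base point H = M*G.

```rust
/// Hypothetical helper: inclusive range of j such that k = m*j + r
/// lies in [start, end]. Returns None when no such k exists.
fn reduced_range(start: u128, end: u128, m: u128, r: u128) -> Option<(u128, u128)> {
    assert!(m > 0 && r < m, "expected 0 <= r < m");
    if start > end || end < r {
        return None;
    }
    // Smallest j with m*j + r >= start (ceiling division).
    let lo = if start <= r {
        0
    } else {
        let d = start - r;
        d / m + (d % m != 0) as u128
    };
    // Largest j with m*j + r <= end (floor division).
    let hi = (end - r) / m;
    if lo > hi { None } else { Some((lo, hi)) }
}

/// Map a solution j of the rescaled problem back to the original key.
fn lift(j: u128, m: u128, r: u128) -> u128 {
    m * j + r
}

fn main() {
    let (start, end, m, r) = (100u128, 200, 10, 3);
    let (lo, hi) = reduced_range(start, end, m, r)
        .expect("constraint has a solution in range");
    // 101 candidate keys shrink to 10 candidate values of j.
    println!("k candidates: {}, j candidates: {}", end - start + 1, hi - lo + 1);
    assert_eq!(lift(lo, m, r), 103);
    assert_eq!(lift(hi, m, r), 193);
}
```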
⚡ GPU Performance
Dual point evaluation probes both P+J and P−J for distinguished points (DPs), reusing the same inverse. Negation map tuning adds dp-bits-scaled cycle caps and point-based repeat tracking. Carry propagation in fe_add/fe_sub is now branchless.
- Dual point evaluation — 2× DP sampling density per step (#39)
- Negation map tuning — cycle cap, 2-cycle prevention, repeat tracking (#57)
- Branchless carries — unrolled carry chains, adaptive workgroup variants (#61)
- Hardened readback — fallible GPU buffer mapping instead of panics (#44)
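The inverse reuse behind dual point evaluation can be sketched on a toy curve. This is an illustration only: it uses y² = x³ + 7 over the 17-element field rather than secp256k1, affine coordinates, and hypothetical names (`FIELD_P`, `add_and_sub`, `finish`), and it says nothing about how the WGSL shader is actually structured. The point is that P+J and P−J share the denominator x_J − x_P, so one modular inversion serves both results.

```rust
/// Toy field prime. The curve is y^2 = x^3 + 7 over this field.
const FIELD_P: u64 = 17;

type Point = (u64, u64);

fn add(a: u64, b: u64) -> u64 { (a + b) % FIELD_P }
fn sub(a: u64, b: u64) -> u64 { (a + FIELD_P - b) % FIELD_P }
fn mul(a: u64, b: u64) -> u64 { (a * b) % FIELD_P }

/// Square-and-multiply exponentiation modulo FIELD_P.
fn pow_mod(mut base: u64, mut exp: u64) -> u64 {
    let mut acc = 1;
    base %= FIELD_P;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = mul(acc, base);
        }
        base = mul(base, base);
        exp >>= 1;
    }
    acc
}

/// Modular inverse via Fermat's little theorem (FIELD_P is prime).
fn inv(a: u64) -> u64 { pow_mod(a, FIELD_P - 2) }

fn on_curve((x, y): Point) -> bool {
    mul(y, y) == add(mul(mul(x, x), x), 7)
}

/// Finish an affine addition once the slope is known.
fn finish(lam: u64, (xp, yp): Point, xj: u64) -> Point {
    let x3 = sub(sub(mul(lam, lam), xp), xj);
    let y3 = sub(mul(lam, sub(xp, x3)), yp);
    (x3, y3)
}

/// Compute (P + J, P - J) with a single inversion. Requires x_P != x_J.
fn add_and_sub(p: Point, j: Point) -> (Point, Point) {
    let (xp, yp) = p;
    let (xj, yj) = j;
    let inv_dx = inv(sub(xj, xp)); // the only inversion
    // P + J uses slope (y_J - y_P) / (x_J - x_P).
    let lam_plus = mul(sub(yj, yp), inv_dx);
    // P - J = P + (x_J, -y_J) uses slope (-y_J - y_P) / (x_J - x_P).
    let lam_minus = mul(sub(sub(0, yj), yp), inv_dx);
    (finish(lam_plus, p, xj), finish(lam_minus, p, xj))
}

fn main() {
    let (sum, diff) = add_and_sub((1, 5), (2, 7));
    assert!(on_curve(sum) && on_curve(diff));
    println!("P+J = {sum:?}, P-J = {diff:?}");
}
```

Since the inversion dominates the cost of an affine addition, the second candidate point comes at the price of a few extra multiplications, which is what makes checking both for the DP property attractive.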
🩹 Correctness Fixes
Distances that wrap around as unsigned 256-bit values on the GPU were silently misinterpreted on the CPU side, causing valid collisions to be rejected on larger puzzles when the negation map was active.
- Signed distance handling — distance_to_scalar() for correct GPU→CPU distance conversion (#56)
- Unified 8-formula candidates — DPTable and CPU solver share collision resolution code (#53)
- CPU timeout — solver respects caller-provided time budget (#55)
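The kind of conversion involved in the signed distance fix can be sketched at reduced width. This is a hedged illustration, not the project's distance_to_scalar(): it assumes a 64-bit wrapping counter instead of 256 bits, a toy prime order `N`, and a hypothetical function name. A walk that steps backwards past zero leaves a huge unsigned value in the counter. Reading it as two's complement and negating modulo the group order yields the intended scalar, whereas reducing the raw value modulo the order does not.

```rust
/// Toy group order standing in for the real curve order n.
const N: u64 = 1_000_003;

/// Hypothetical sketch: interpret a wrapped unsigned counter as a signed
/// distance and map it to a scalar in [0, N).
fn toy_distance_to_scalar(raw: u64) -> u64 {
    if raw >> 63 == 0 {
        // Top bit clear: a non-negative distance.
        raw % N
    } else {
        // Top bit set: a negative distance stored in two's complement.
        let magnitude = raw.wrapping_neg() % N;
        (N - magnitude) % N
    }
}

fn main() {
    // A net distance of 5 - 12 = -7, accumulated with wrapping arithmetic.
    let raw = 5u64.wrapping_sub(12);
    let scalar = toy_distance_to_scalar(raw);
    assert_eq!(scalar, N - 7);
    // Reducing the raw unsigned value directly gives a different scalar.
    assert_ne!(raw % N, scalar);
    println!("raw = {raw}, scalar = {scalar}");
}
```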
🧪 Benchmark Improvements
--benchmark is now side-effect free by default — use --save-benchmarks to opt into file persistence.
- --save-benchmarks flag — explicit opt-in for file output (#51)
✅ Upgrading
cargo install kangaroo
👉 Changelog
🚀 Enhancements
- solver: Multi-set kangaroos — 3-set dispatch for higher collision probability (#58)
- modular: Mod constraint filtering for ECDLP (#59)
- benchmark: Optional file output with --save-benchmarks (#51)
⚡ Performance
- gpu: Dual point evaluation via inverse reuse (#39)
- gpu: Harden wgpu readback path (#44)
- shader: Benchmark stability improvements (#47)
- negmap: Optimize walk + cycle guards (#57)
- gpu: Branchless carry propagation in WGSL hot path (#61)
🩹 Fixes
- cpu: Unify 8-formula collision candidates (#53)
- cpu: Enforce timeout budget (#55)
- cpu: Signed distance interpretation for GPU unsigned wrapping (#56)
💅 Refactors
- shaders: Loop-based 256-bit add/sub paths (#45)
🏡 Chore
- Create FUNDING.yml
- Update Rust dependencies (#42)