v0.7.0

@oritwoen oritwoen released this 22 Feb 03:08
· 32 commits to main since this release

v0.7.0 is the next minor release.

👀 Highlights

🦘 Three-set kangaroos yield 3× the collision pairs, --mod-step/--mod-start add modular constraints, and the GPU side gains several optimizations (dual-point evaluation, branchless carries, negation map tuning). This release also fixes signed distance handling that silently dropped valid collisions on larger puzzles.

🦘 Multi-Set Kangaroos

Three kangaroo sets instead of two. A second wild herd starts from -pubkey, giving C(3,2) = 3 valid collision pairs instead of 1. Cross-wild collisions use scalar halving mod n.

  • 3-set dispatch — tame + wild₁ + wild₂ with automatic third-split allocation (#58)
  • Cross-wild resolution — compute_candidate_keys_cross_wild() with SCALAR_HALF (#58)
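
The scalar halving behind cross-wild resolution can be sketched as follows. This is an illustration only, not the project's compute_candidate_keys_cross_wild(): it assumes the secp256k1 group order, ignores the negation map, and the names N, HALF, and cross_wild_key are hypothetical. If wild₁ (started at pubkey) has walked distance d1 and wild₂ (started at -pubkey) has walked d2 when they land on the same point, then k + d1 ≡ -k + d2 (mod n), so k ≡ (d2 - d1) / 2 (mod n).

```python
# Assumption: secp256k1 group order (the release notes do not name the curve).
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

# N is odd, so (N + 1) / 2 is the inverse of 2 modulo N.
HALF = (N + 1) // 2


def cross_wild_key(d1: int, d2: int) -> int:
    """Recover k from a wild1/wild2 collision (hypothetical helper).

    wild1 sits at (k + d1) * G and wild2 at (-k + d2) * G.
    Equal points imply 2k = d2 - d1 (mod N), so k = (d2 - d1) * 2^-1 (mod N).
    """
    return (d2 - d1) * HALF % N


# Toy check: pick a key, fabricate distances that collide, recover the key.
k = 123456789
d1 = 1000
d2 = (2 * k + d1) % N
recovered = cross_wild_key(d1, d2)
```

With the negation map active, a collision may also match up to sign, which is why a real solver would check several candidate keys rather than only this one.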

🎭 Modular Constraint Filtering

If you know k ≡ R (mod M), the search space shrinks by a factor of M. Kangaroo rescales the problem to the base point H = M*G and solves for j, where k = M*j + R.

  • --mod-step / --mod-start — specify constraint from CLI (#59)
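
The rescaling can be sketched on plain integers. The helper names below (rescale_range, recover_key) are hypothetical and do not reflect the project's internals. On the curve, the same substitution means searching for j with Q - R*G = j*H, where Q is the target public key.

```python
def rescale_range(k_lo: int, k_hi: int, m: int, r: int) -> tuple[int, int]:
    """Map the key range [k_lo, k_hi] to the range of j with k = m*j + r."""
    j_lo = -((r - k_lo) // m)   # ceil((k_lo - r) / m) using floor division
    j_hi = (k_hi - r) // m      # floor((k_hi - r) / m)
    return j_lo, j_hi


def recover_key(j: int, m: int, r: int) -> int:
    """Undo the substitution once j has been found."""
    return m * j + r


# Toy example: keys in [1000, 2000] known to satisfy k = 37 (mod 60).
M, R = 60, 37
j_lo, j_hi = rescale_range(1000, 2000, M, R)
candidates = j_hi - j_lo + 1   # size of the reduced search space
```

Here the 1001 possible keys collapse to 16 candidate values of j, roughly the expected 1/M of the original range.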

⚡ GPU Performance

Dual point evaluation checks both P+J and P−J against the distinguished-point (DP) criterion, reusing the same modular inverse. Negation map tuning adds cycle caps scaled by dp-bits and point-based repeat tracking. fe_add/fe_sub now use branchless carry propagation.

  • Dual point evaluation — 2× DP sampling density per step (#39)
  • Negation map tuning — cycle cap, 2-cycle prevention, repeat tracking (#57)
  • Branchless carries — unrolled carry chains, adaptive workgroup variants (#61)
  • Hardened readback — fallible GPU buffer mapping instead of panics (#44)
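
Why P+J and P−J can share one inverse: in affine coordinates both slopes divide by the same quantity, x_J − x_P. The sketch below shows this in Python rather than WGSL. It assumes secp256k1 (y² = x³ + 7), and the function names are hypothetical, not taken from the shader code.

```python
# Assumption: secp256k1 field prime and generator point.
P_FIELD = 2**256 - 2**32 - 977
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def on_curve(pt):
    x, y = pt
    return (y * y - x * x * x - 7) % P_FIELD == 0


def double(pt):
    """Affine point doubling, used here only to build a second test point."""
    x, y = pt
    lam = 3 * x * x * pow(2 * y % P_FIELD, -1, P_FIELD) % P_FIELD
    x3 = (lam * lam - 2 * x) % P_FIELD
    return x3, (lam * (x - x3) - y) % P_FIELD


def add_and_sub(p, j):
    """Return (p + j, p - j) using a single modular inversion.

    Requires x_p != x_j. Subtracting j is adding (x_j, -y_j), so only the
    numerator of the slope changes; the denominator x_j - x_p is shared.
    """
    (x1, y1), (x2, y2) = p, j
    inv = pow((x2 - x1) % P_FIELD, -1, P_FIELD)   # the one shared inverse

    lam_add = (y2 - y1) * inv % P_FIELD
    xa = (lam_add * lam_add - x1 - x2) % P_FIELD
    ya = (lam_add * (x1 - xa) - y1) % P_FIELD

    lam_sub = (-y2 - y1) * inv % P_FIELD
    xs = (lam_sub * lam_sub - x1 - x2) % P_FIELD
    ys = (lam_sub * (x1 - xs) - y1) % P_FIELD

    return (xa, ya), (xs, ys)


# 2G + G and 2G - G from one inversion; the difference must be G again.
total, diff = add_and_sub(double(G), G)
```

Since the inversion dominates the cost of an affine addition, the second candidate point comes at the price of a few extra multiplications.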

🩹 Correctness Fixes

GPU unsigned 256-bit wrapping was silently misinterpreted on the CPU side, causing valid collisions to be rejected on larger puzzles with negation map active.

  • Signed distance handling — distance_to_scalar() for correct GPU→CPU distance conversion (#56)
  • Unified 8-formula candidates — DPTable and CPU solver share collision resolution code (#53)
  • CPU timeout — solver respects caller-provided time budget (#55)
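
The class of bug fixed in #56 can be illustrated with a short sketch. It assumes the GPU stores distances as 256-bit two's-complement values and that the curve is secp256k1. The function below is a hypothetical stand-in, not the project's distance_to_scalar().

```python
# Assumption: secp256k1 group order.
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def signed_distance_to_scalar(raw: int) -> int:
    """Interpret a wrapped 256-bit unsigned value as signed, then reduce mod N.

    A walk that steps "backwards" (as can happen with the negation map)
    underflows to a value near 2^256. Reducing that directly mod N gives
    the wrong scalar, because 2^256 is not a multiple of N.
    """
    if raw >= 1 << 255:          # top bit set: treat as negative
        raw -= 1 << 256
    return raw % N


wrapped = (1 << 256) - 5         # what the GPU holds for a distance of -5
naive = wrapped % N              # misreads the value as a huge positive number
fixed = signed_distance_to_scalar(wrapped)
```

The naive reduction yields a scalar that does not correspond to -5 mod n, so a candidate key derived from it fails verification even though the collision itself was valid.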

🧪 Benchmark Improvements

--benchmark is now side-effect free by default — use --save-benchmarks to opt into file persistence.

  • --save-benchmarks flag — explicit opt-in for file output (#51)

✅ Upgrading

cargo install kangaroo

👉 Changelog

🚀 Enhancements

  • solver: Multi-set kangaroos — 3-set dispatch for higher collision probability (#58)
  • modular: Mod constraint filtering for ECDLP (#59)
  • benchmark: Optional file output with --save-benchmarks (#51)

⚡ Performance

  • gpu: Dual point evaluation via inverse reuse (#39)
  • gpu: Harden wgpu readback path (#44)
  • shader: Benchmark stability improvements (#47)
  • negmap: Optimize walk + cycle guards (#57)
  • gpu: Branchless carry propagation in WGSL hot path (#61)

🩹 Fixes

  • cpu: Unify 8-formula collision candidates (#53)
  • cpu: Enforce timeout budget (#55)
  • cpu: Signed distance interpretation for GPU unsigned wrapping (#56)

💅 Refactors

  • shaders: Loop-based 256-bit add/sub paths (#45)

🏡 Chore

  • Create FUNDING.yml
  • Update Rust dependencies (#42)
