chore(release): prepare 0.21.0#566

Merged
michalharakal merged 2 commits into develop from release/0.21.0
Apr 29, 2026

Summary

Release prep for 0.21.0 — the JVM Vector half of milestone M5 (CPU backend dispatch). 10 PRs landed since 0.20.0:

| Area | PRs |
| --- | --- |
| Kernel SPI baseline + ServiceLoader | #554, #559 |
| Panama FP32 matmul + tile-blocking + routing | #557, #558, #560, #561 |
| Q4_K SIMD + sibling SPI + MemSeg + Q6_K + Q4_0 | #562, #563, #564, #565 |
| ScratchPool SPI, TensorOps.permute | #550, #552 |
| Q4_K/Q5_K canonical layout fix | #556 |

After this release every quantized format in JvmQuantizedVectorKernels (Q4_0, Q4_K, Q4_K MemSeg, Q6_K, Q8_0) is at least partially SIMD-accelerated (Q4_0 only partially). FP32 matmul through the SPI runs 8.6×–10.8× faster than scalar at sizes 256/512/1024 (Apple Silicon NEON, JMH); Q4_K matmul-vector reaches ≈30/55/73 GFLOPS at 1024²/4096×1024/4096² respectively.
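As a rough sanity check on the GFLOPS figures, assuming "matmul-vector" denotes a matrix-vector product $y = Wx$ with $W$ of shape $M \times K$, and assuming the conventional count of one multiply plus one add per weight (neither assumption is stated in this PR), throughput relates to the time per call $t$ in seconds as:

$$\text{FLOPs} = 2MK, \qquad \text{GFLOPS} = \frac{2MK}{t \cdot 10^{9}}$$

Under that reading, the 4096² case is $2 \cdot 4096^2 \approx 3.36 \times 10^{7}$ FLOPs, so ≈73 GFLOPS would correspond to roughly 0.46 ms per call. This is derived arithmetic, not a measured value.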

Deferred

The priority-100 native (FFM) kernel provider, which would close M5's literal milestone metric (≥2.5× for Q4_K), is deferred and captured as a PRD in NATIVE_FFM_KERNEL_PROVIDER.md, covering module layout, FFM binding pattern, staged delivery plan, success metrics, and risks. Trigger conditions for un-deferring are documented at the bottom of the PRD.
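To illustrate what a "priority-100" provider could mean in a ServiceLoader-style kernel SPI, here is a minimal Java sketch of priority-based selection. Everything in it is hypothetical: the `KernelProvider` interface, its method names, and the priorities 0 and 50 for the scalar and Panama providers are assumptions made for illustration, not SKaiNET's actual API. Only the value 100 for the native provider comes from this PR.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class KernelProviderSelection {

    /** Hypothetical SPI shape; the real interface in the project may differ. */
    public interface KernelProvider {
        String name();
        int priority();        // higher value wins
        boolean isAvailable(); // e.g. false if a native library failed to load
    }

    /** Simple value implementation used only for this sketch. */
    public record SimpleProvider(String name, int priority, boolean isAvailable)
            implements KernelProvider {}

    /** Picks the available provider with the highest priority, if any. */
    public static Optional<KernelProvider> select(List<KernelProvider> providers) {
        return providers.stream()
                .filter(KernelProvider::isAvailable)
                .max(Comparator.comparingInt(KernelProvider::priority));
    }

    public static void main(String[] args) {
        // In a real setup this list would come from
        // java.util.ServiceLoader.load(KernelProvider.class), which needs
        // META-INF/services registration; a fixed list keeps the sketch runnable.
        List<KernelProvider> providers = List.of(
                new SimpleProvider("scalar", 0, true),          // assumed priority
                new SimpleProvider("panama-vector", 50, true),  // assumed priority
                new SimpleProvider("native-ffm", 100, false));  // deferred, so unavailable

        String chosen = select(providers).map(KernelProvider::name).orElse("none");
        System.out.println(chosen); // prints "panama-vector"
    }
}
```

With this kind of scheme, shipping the deferred native provider later would need no change on the caller side: once it reports itself as available, it outranks the Panama provider automatically.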

Files changed

  • gradle.properties: VERSION_NAME 0.21.0-SNAPSHOT → 0.21.0
  • CHANGELOG.md — new 0.21.0 section
  • README.md — Quickstart coordinates 0.20.0 → 0.21.0; compact "What's New"
  • NATIVE_FFM_KERNEL_PROVIDER.md (new) — PRD for the deferred native provider

Test plan

  • ./gradlew :skainet-backends:skainet-backend-cpu:compileKotlinJvm — clean.
  • All M5 PRs already merged with their own test suites (213–218 cpu-backend jvmTest, kernel JMH benches).
  • Reviewer: spot-check Quickstart coordinates render correctly on a fresh ./gradlew build of a downstream consumer once 0.21.0 is published.

🤖 Generated with Claude Code

michalharakal and others added 2 commits April 28, 2026 23:12
- gradle.properties: drop -SNAPSHOT; RELEASE_SIGNING_ENABLED stays true.
- CHANGELOG: add 0.21.0 section covering the JVM Vector half of M5
  (kernel SPI + Panama FP32 + tile-blocking + production routing +
  ServiceLoader auto-discovery + Q4_K SIMD + sibling SPI + Q4_K MemSeg
  + Q6_K SIMD + Q4_0 partial SIMD), plus ScratchPool SPI,
  TensorOps.permute, and Q4_K/Q5_K canonical layout fix.
- README: bump Quickstart coordinates to 0.21.0; compact "What's New"
  section.
- NATIVE_FFM_KERNEL_PROVIDER.md: PRD for the deferred priority-100
  native FFM kernel provider — module layout, FFM binding pattern,
  staged delivery plan, success metrics, risks.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Removes NATIVE_FFM_KERNEL_PROVIDER.md from the release as a tracked
file, plus its references in README's "What's New" and the CHANGELOG
intro paragraph for the M5 section. The deferred native FFM kernel
provider is still mentioned by name in the CHANGELOG (since the
deferral is the actual release-relevant fact), just without a link to
a doc that's no longer in the tree.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@michalharakal michalharakal merged commit 5839da9 into develop Apr 29, 2026
9 checks passed
@michalharakal michalharakal deleted the release/0.21.0 branch May 2, 2026 17:35