feat(scratch): add ScratchPool SPI for runtime workspace allocation #550

Merged
michalharakal merged 1 commit into develop from feature/ISSUE-549-scratch-pool
Apr 28, 2026

@michalharakal
Contributor

Closes #549.

Summary

Adds a generic workspace allocator (ScratchPool) for short-lived FloatArray buffers used by attention scratch, RoPE tables, KV-cache slice copies, padding scratch, and any other nn workload that allocates intermediates per forward step.

Workspace allocation is generic across nn workloads: CNNs, encoders, embeddings, and training-time gradient buffers all need short-lived intermediates. Routing those through a pool reduces per-step allocation pressure on the GC heap.

What's added

  • sk.ainet.lang.tensor.scratch.ScratchPool — SPI with acquireFloat, acquireFloatZeroed, scope { ... }, and stats().
  • NoopScratchPool — default no-op pool; every acquire allocates fresh. Bit-for-bit equivalent to pre-pool behavior.
  • SizeClassedScratchPool — power-of-two slabs starting at 64 floats; scoped lifetime; per-class cap with surplus drop. Single-threaded by intent (one forward at a time per pool); concurrent forwards use separate pools.
  • ExecutionContext.scratch: ScratchPool — backward-compatible accessor with default NoopScratchPool. Existing impls keep working unchanged.
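
A minimal sketch of how the pieces listed above might fit together. The method names (acquireFloat, acquireFloatZeroed, scope, stats) come from this description; the parameter and return types, the ScratchStats fields, and the ToyExecutionContext / LegacyContext / sharedNoopPool names are assumptions made for illustration and may not match the merged code.

```kotlin
// Hypothetical sketch: method names are from the PR text, signatures are assumed.
data class ScratchStats(val acquires: Long, val freshAllocations: Long)

interface ScratchPool {
    /** Returns a buffer of at least [size] floats; contents are unspecified. */
    fun acquireFloat(size: Int): FloatArray

    /** Returns a buffer of at least [size] floats, all set to 0f. */
    fun acquireFloatZeroed(size: Int): FloatArray

    /** Runs [block]; a pooling implementation may recycle buffers once it returns. */
    fun <T> scope(block: () -> T): T

    fun stats(): ScratchStats
}

/** Allocates fresh on every acquire, so callers behave as they did without a pool. */
class NoopScratchPool : ScratchPool {
    private var acquires = 0L

    override fun acquireFloat(size: Int): FloatArray {
        acquires++
        return FloatArray(size) // a new FloatArray is already zero-filled
    }

    override fun acquireFloatZeroed(size: Int): FloatArray = acquireFloat(size)

    // Nothing to recycle, so the scope simply runs the block.
    override fun <T> scope(block: () -> T): T = block()

    override fun stats(): ScratchStats = ScratchStats(acquires, acquires)
}

// Assumed shared default instance for the accessor below.
val sharedNoopPool: ScratchPool = NoopScratchPool()

/** Stand-in for the real ExecutionContext; only the new accessor is shown. */
interface ToyExecutionContext {
    // A default getter lets existing implementations compile unchanged.
    val scratch: ScratchPool get() = sharedNoopPool
}

/** An implementation written before the accessor existed; it overrides nothing. */
class LegacyContext : ToyExecutionContext

/** Example call site: borrow a zeroed buffer for the duration of one step. */
fun forwardStep(ctx: ToyExecutionContext, width: Int): Float =
    ctx.scratch.scope {
        val buf = ctx.scratch.acquireFloatZeroed(width)
        buf[0] = 1.5f
        buf.sum()
    }
```

The default getter is what would make the accessor backward compatible in this sketch: LegacyContext never mentions scratch, yet forwardStep(LegacyContext(), 128) still works and allocates exactly as it would without a pool.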

Why upstream

Discussed in #549. The downstream call sites (KVCache, MHA, RoPE in SKaiNET-transformers) are transformer-specific, but the SPI itself is a generic tensor-workspace allocator. Any nn workload benefits — placing it next to sk.ainet.lang.tensor.data keeps it reusable.

Out of scope

  • Per-thread ambient carrier (ScratchPoolContext) — only needed where call sites don't take an ExecutionContext. With the upstream ctx.scratch field, downstream can pass it through naturally.
  • Direct-memory variants (MemorySegment-backed pool) — future iteration.

Version bump

gradle.properties bumped to 0.21.0-SNAPSHOT for downstream composite-build integration.

Test plan

  • SizeClassedScratchPoolTest — 11 tests covering size-class rounding, scope recycling, nested scopes, stats correctness, surplus drop, no-op pool. All passing on JVM.
  • Backward compat: existing tests in :skainet-lang:skainet-lang-core:jvmTest pass unchanged.
  • Downstream consumer validation: in progress at SKaiNET-developers/SKaiNET-transformers (composite build).
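
For illustration, the size-class rounding named in the first test bullet could be sketched as follows. The helper name sizeClassFor, the constant MIN_SLAB_FLOATS, and the exact rule (round up to the next power of two, with a floor of 64 floats) are assumptions inferred from the "power-of-two slabs starting at 64 floats" wording, not code taken from this PR.

```kotlin
// Hypothetical helper, not taken from the PR.
const val MIN_SLAB_FLOATS = 64

/** Rounds a requested length up to the nearest power of two, never below 64. */
fun sizeClassFor(requested: Int): Int {
    require(requested >= 0) { "requested must be non-negative" }
    require(requested <= (1 shl 30)) { "requested exceeds the largest Int power of two" }
    if (requested <= MIN_SLAB_FLOATS) return MIN_SLAB_FLOATS
    // takeHighestOneBit() keeps only the top set bit; doubling it rounds up
    // unless the request is already an exact power of two.
    val high = requested.takeHighestOneBit()
    return if (high == requested) high else high shl 1
}
```

Under these assumptions a request for 65 floats lands in the 128-float class, while a request for exactly 128 stays there, so buffers of nearby sizes can share a slab.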

🤖 Generated with Claude Code

Closes #549.

Adds a generic workspace allocator for short-lived FloatArray buffers used
by attention scratch, RoPE tables, KV-cache slice copies, padding scratch,
and any other nn workload that allocates intermediates per forward step.

* `sk.ainet.lang.tensor.scratch.ScratchPool` — SPI with `acquireFloat`,
  `acquireFloatZeroed`, `scope { ... }`, and `stats()`.
* `NoopScratchPool` — default no-op pool; every acquire allocates fresh.
  Bit-for-bit equivalent to pre-pool behavior.
* `SizeClassedScratchPool` — power-of-two slabs starting at 64 floats;
  scoped lifetime; per-class cap with surplus drop. Single-threaded by
  intent (one forward at a time per pool); concurrent forwards use
  separate pools.
* `ExecutionContext.scratch: ScratchPool` — backward-compatible accessor
  with default `NoopScratchPool`. Existing impls keep working unchanged.

Out of scope here: per-thread ambient carrier (only needed where call
sites don't take ExecutionContext); direct-memory variants.

Bumps version to 0.21.0-SNAPSHOT for downstream composite-build
integration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions

📖 Documentation Preview

The documentation has been built successfully for this PR.

Generated Files:

  • Operator documentation: docs/modules/operators/_generated_/
  • JSON schema output: operators.json

Artifacts:

  • Download the documentation-preview-550 artifact to view the complete documentation locally.

This comment will be updated automatically when the PR is updated.

@michalharakal michalharakal merged commit 380e7c5 into develop Apr 28, 2026
11 checks passed
@michalharakal michalharakal deleted the feature/ISSUE-549-scratch-pool branch April 28, 2026 20:57
@michalharakal michalharakal mentioned this pull request Apr 28, 2026
3 tasks