Skip to content

zenforks-v0.10.0 — vanilla rename + pinned-upload

Choose a tag to compare

@lilith lilith released this 28 May 00:05
· 22 commits to main since this release

First release of the zenforks-cubecl-* family on crates.io.

What this is

A maintained fork of tracel-ai/cubecl
v0.10.0, with 11 of its 16 crates renamed and published to crates.io
under the zenforks-cubecl-* namespace. The renamed crates carry a
small number of internal-use patches that downstream imazen
projects (zenmetrics, six *-gpu perceptual-metric crates) depend on
while waiting for upstream PRs to merge.

What's renamed vs not

Renamed in 0.10.0 (published from this tag):
zenforks-cubecl, zenforks-cubecl-runtime, zenforks-cubecl-core,
zenforks-cubecl-cuda, zenforks-cubecl-wgpu, zenforks-cubecl-cpu,
zenforks-cubecl-cpp, zenforks-cubecl-hip, zenforks-cubecl-spirv,
zenforks-cubecl-std, zenforks-cubecl-opt.

Stays upstream (consume directly from tracel-ai/cubecl's
crates.io publication at 0.10.0): cubecl-common, cubecl-ir,
cubecl-macros, cubecl-macros-internal, cubecl-zspace.

These are leaves of the dep graph; no transitive dep on a patched
crate, so no need to rename.

What patches ship in 0.10.0

Only the pinned-host-buffer fast path for
ComputeClient::create_from_slice and friends, in cubecl-runtime.
~4x HtoD speedup on CUDA workloads via direct DMA from pinned host
memory at 12-25 GB/s on PCIe 4.0 (vs ~5-6 GB/s pageable bounce).
Drafted as upstream PR
tracel-ai/cubecl#1334.

The PTX cache widening and Metal Atomic capability honesty
patches are coming in zenforks-v0.10.1.

Consumer pin convention

In your Cargo.toml, alias via the package field:

[dependencies]
cubecl-runtime = { package = "zenforks-cubecl-runtime", version = "0.10.0" }
cubecl-cuda    = { package = "zenforks-cubecl-cuda",    version = "0.10.0" }
# ...etc
# Non-renamed crates stay on upstream:
cubecl-common  = "0.10.0"

Then in source code, use cubecl_runtime::*; resolves to our
package because we keep [lib] name = "cubecl_runtime" unchanged.
No source rewrites needed.

Acknowledgement

Built on the great work of the upstream
tracel-ai/cubecl maintainers.
This fork exists to ship downstream patches without waiting on
upstream review cycles, not to replace upstream.