Release Summary
Published new crate versions to crates.io, adding BitNet integration to support the Craftsman Ultra 30b 1-bit model.
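Published Crates
- ruvllm
- ruvector-sona
- ruvector-temporal-tensor
- ruvector-crv
- rvlite
- ruvector-core
- ruvector-gnn
- ruvector-graph
- ruvector-mincut
- ruvector-raft
- ruvector-cluster
- ruvector-replication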
BitNet Integration Features (ruvllm v2.0.2)
From PR #151:
- TL1 Kernels: High-performance ternary linear kernels
  - AVX2 SIMD implementation (tl1_avx2.rs)
  - WASM portable implementation (tl1_wasm.rs)
  - Generic kernel interface (tl1_kernel.rs)
- Ternary Tensor Quantization
  - 1.58-bit quantization (ternary: -1, 0, +1)
  - Efficient bitpacked storage
  - Fast dequantization paths
- RLM (Reasoning Language Model) Components
  - rlm_embedder.rs - Embedding layer with ternary weights
  - rlm_refiner.rs - Refinement passes for improved accuracy
- Expert Cache with MoE Support
  - expert_cache.rs - Mixture-of-Experts caching
  - Dynamic expert loading/unloading
- GGUF Export
  - gguf_export.rs - Export to GGUF format for llama.cpp compatibility
- Evaluation & Tracing
  - eval.rs - Model evaluation utilities
  - trace.rs - Inference tracing and debugging
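
To make the ternary quantization and kernel items above more concrete, here is a minimal sketch of the general technique, not the ruvllm implementation. It assumes absmean scaling (the rule described for BitNet b1.58) and a 2-bit-per-value packing; the function names, the bit encoding, and the sample weights are all hypothetical and are not taken from PR #151.

```rust
/// Quantize f32 weights to ternary values {-1, 0, +1}.
/// Assumption: absmean scaling (scale = mean of |w|), then round and clamp.
fn quantize_ternary(weights: &[f32]) -> (Vec<i8>, f32) {
    let n = weights.len().max(1) as f32;
    let scale = (weights.iter().map(|w| w.abs()).sum::<f32>() / n).max(f32::EPSILON);
    let trits = weights
        .iter()
        .map(|w| (w / scale).round().clamp(-1.0, 1.0) as i8)
        .collect();
    (trits, scale)
}

/// Pack four ternary values per byte, 2 bits each.
/// Hypothetical encoding: 0b00 = 0, 0b01 = +1, 0b10 = -1.
fn pack_trits(trits: &[i8]) -> Vec<u8> {
    trits
        .chunks(4)
        .map(|chunk| {
            chunk.iter().enumerate().fold(0u8, |byte, (i, &t)| {
                let code = match t {
                    1 => 0b01,
                    -1 => 0b10,
                    _ => 0b00,
                };
                byte | (code << (2 * i))
            })
        })
        .collect()
}

/// Unpack `len` ternary values from the 2-bit encoding used by `pack_trits`.
fn unpack_trits(packed: &[u8], len: usize) -> Vec<i8> {
    (0..len)
        .map(|i| match (packed[i / 4] >> (2 * (i % 4))) & 0b11 {
            0b01 => 1,
            0b10 => -1,
            _ => 0,
        })
        .collect()
}

/// Ternary dot product: each weight either adds, subtracts, or skips an input,
/// so the only multiplication is the final rescale.
fn ternary_dot(trits: &[i8], scale: f32, x: &[f32]) -> f32 {
    let acc: f32 = trits
        .iter()
        .zip(x)
        .map(|(&t, &xi)| match t {
            1 => xi,
            -1 => -xi,
            _ => 0.0,
        })
        .sum();
    acc * scale
}

fn main() {
    let weights = [0.9_f32, -1.1, 0.05, 0.4, -0.7];
    let (trits, scale) = quantize_ternary(&weights);
    let packed = pack_trits(&trits);

    // Round trip: unpacking must recover the ternary values exactly.
    assert_eq!(unpack_trits(&packed, trits.len()), trits);

    let x = [1.0_f32, 2.0, 3.0, 4.0, 5.0];
    println!("trits = {:?}, scale = {}, bytes = {}", trits, scale, packed.len());
    println!("dot = {}", ternary_dot(&trits, scale, &x));
}
```

Two bits per value is simple to index but spends more than the 1.58 bits of information a ternary value carries; denser packings exist at the cost of more involved decoding. A SIMD kernel would typically process many packed values per instruction rather than one at a time as in this scalar loop.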
Other Updates
- ruvector-sona v0.1.5: Added Debug implementation for SonaEngine
- ruvector-crv v0.1.1: Added README for crates.io documentation
- rvlite v0.3.0: Standalone vector database with 22 WASM module integrations
- Workspace version: Bumped to 2.0.2
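
For readers unfamiliar with what adding a Debug implementation involves, the sketch below shows a hand-written one for a stand-in type. The struct EngineSketch and its fields are hypothetical; the actual SonaEngine definition is not shown in these notes, and its Debug implementation may simply be derived.

```rust
use std::fmt;

// Hypothetical stand-in type; not the real SonaEngine.
struct EngineSketch {
    name: String,
    buffer: Vec<f32>,
}

impl fmt::Debug for EngineSketch {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Summarize large internal state instead of dumping every element.
        f.debug_struct("EngineSketch")
            .field("name", &self.name)
            .field("buffer_len", &self.buffer.len())
            .finish_non_exhaustive()
    }
}

fn main() {
    let engine = EngineSketch {
        name: "demo".to_string(),
        buffer: vec![0.0; 1024],
    };
    // With Debug available, the type works with {:?}, dbg!, and assert macros.
    println!("{:?}", engine);
}
```

A manual implementation like this is mainly useful when a field is too large to print or does not itself implement Debug; otherwise `#[derive(Debug)]` is usually enough.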
Related PRs
Documentation
- ADR-017: Craftsman Ultra 30b 1-bit BitNet integration
- DDD: BitNet quantizer module design
- Research: Craftsman Ultra 30b 1-bit analysis