Release/0.18.0 #460

Merged
michalharakal merged 3 commits into develop from release/0.18.0
Apr 8, 2026

@michalharakal
Contributor

No description provided.

michalharakal and others added 3 commits April 8, 2026 14:50
Same class of bug as GGUF StreamingTensorInfo.nBytes overflow:
StreamingSafeTensorInfo.sizeInBytes was Int, silently truncating
via (dataOffsets.second - dataOffsets.first).toInt() for large tensors.

Changes:
- StreamingSafeTensorInfo.sizeInBytes: Int → Long
- Remove .toInt() from data offset subtraction
- Add >2GB guard on loadTensorData() with clear error message
- Fix ShardedSafeTensorsReader delegate return type
- Update test assertions for Long comparisons

Fixes #452

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
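
The truncation described in the commit message above can be reproduced in isolation. The following is a minimal sketch, not the project's actual code: `TensorInfoSketch`, `sizeInBytesTruncated`, and `requireLoadable` are hypothetical names invented for illustration, and only the offset subtraction and the `.toInt()` narrowing are taken from the commit text. It assumes the >2GB guard amounts to a check against `Int.MAX_VALUE`, which is the largest size a Kotlin `ByteArray` can be given.

```kotlin
// Hypothetical stand-in for StreamingSafeTensorInfo; only the size arithmetic
// mirrors what the commit message describes.
data class TensorInfoSketch(val name: String, val dataOffsets: Pair<Long, Long>) {
    // Pre-fix behaviour: narrowing the Long difference to Int keeps only the
    // low 32 bits, so sizes above Int.MAX_VALUE wrap around silently.
    val sizeInBytesTruncated: Int
        get() = (dataOffsets.second - dataOffsets.first).toInt()

    // Post-fix behaviour: the subtraction stays in Long.
    val sizeInBytes: Long
        get() = dataOffsets.second - dataOffsets.first
}

// Assumed shape of the guard: fail loudly instead of allocating a buffer
// with a wrapped-around size. Returns the size as an Int once it is safe.
fun requireLoadable(info: TensorInfoSketch): Int {
    require(info.sizeInBytes <= Int.MAX_VALUE.toLong()) {
        "Tensor '${info.name}' is ${info.sizeInBytes} bytes, which exceeds the " +
            "${Int.MAX_VALUE}-byte limit of a single ByteArray"
    }
    return info.sizeInBytes.toInt()
}

fun main() {
    val threeGiB = 3L * 1024 * 1024 * 1024
    val big = TensorInfoSketch("big", 0L to threeGiB)

    println(big.sizeInBytesTruncated) // prints -1073741824
    println(big.sizeInBytes)          // prints 3221225472

    // The guard turns the silent wrap-around into an explicit failure.
    println(runCatching { requireLoadable(big) }.isFailure) // prints true
}
```

A 3 GiB tensor is used because its size does not fit in 32 signed bits: the narrowed value comes out negative, which is the kind of silent corruption the `Int → Long` change removes.
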
Highlights:
- TurboQuant KV-cache compression (runtime ~8x at 4-bit)
- Memory architecture hardening (storage, placement, ownership)
- KV-cache subsystem with asymmetric K/V policies
- Quantization-preserving loaders (GGUF + SafeTensors)
- Int overflow fix for tensors > 2 GB (#452)
- Tekken tokenizer for Mistral models
- CPU SIMD kernels and JMH benchmarks

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@michalharakal michalharakal merged commit c419de7 into develop Apr 8, 2026
4 checks passed
@michalharakal michalharakal deleted the release/0.18.0 branch April 8, 2026 15:36