feat: safetensors support — BF16 streaming indexer #60
Parses the safetensors JSON header (no serde dependency) and produces GgufFile-compatible types so stream_index_gguf_bf16_with_header works unchanged on safetensors files. Safetensors stores full BF16 weights — no quantization noise. For the reasoning diff pipeline, BF16→Base17 gives cleaner fingerprints than Q8_0→f32→Base17. Includes test_stream_index_qwen35_safetensors for 11-shard Qwen3.5-27B indexing at full BF16 precision.
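For readers unfamiliar with the container layout, here is a minimal sketch of the first step a header parser has to perform. It relies on the published safetensors layout (an 8-byte little-endian length prefix, then a UTF-8 JSON header, then the raw tensor bytes). The function name `split_safetensors_header` is hypothetical and is not taken from this PR.

```rust
/// Splits a safetensors buffer into its JSON header and the offset at which
/// tensor data begins. Hypothetical helper, not the PR's actual code.
fn split_safetensors_header(bytes: &[u8]) -> Result<(&str, usize), String> {
    // The file starts with the header length as an unsigned little-endian u64.
    if bytes.len() < 8 {
        return Err("file shorter than the 8-byte length prefix".to_string());
    }
    let mut len_bytes = [0u8; 8];
    len_bytes.copy_from_slice(&bytes[..8]);
    let header_len = u64::from_le_bytes(len_bytes);

    // Reject a header that claims more bytes than the file contains.
    if header_len > (bytes.len() - 8) as u64 {
        return Err("header length runs past end of file".to_string());
    }
    let data_start = 8 + header_len as usize;

    // The header itself is UTF-8 JSON; `data_offsets` in it are relative
    // to `data_start`, not to the beginning of the file.
    let json = std::str::from_utf8(&bytes[8..data_start])
        .map_err(|e| format!("header is not valid UTF-8: {}", e))?;
    Ok((json, data_start))
}
```

Because `data_offsets` are relative to the end of the header, the returned `data_start` has to be added to every tensor offset before reading.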
…tic indexing
Splits stream_index_gguf_bf16 into:
- stream_index_gguf_bf16(): parses GGUF header, delegates to _with_header
- stream_index_gguf_bf16_with_header(): the core loop, works with any pre-parsed header (GGUF or safetensors)
No behavior change for existing callers.
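The split described in this commit can be sketched roughly as follows. This is an illustration of the pattern only: `Header`, `index_with_header`, and `index` are hypothetical stand-ins, the real `GgufFile` types are not shown here, and a function pointer takes the place of the PR's separate named wrappers per format.

```rust
/// Hypothetical stand-in for the shared, format-neutral header type.
struct Header {
    /// Byte position where tensor data begins.
    data_start: usize,
    /// (name, offset relative to data_start, byte length) per tensor.
    tensors: Vec<(String, usize, usize)>,
}

/// Core loop: knows nothing about the on-disk format, only the parsed header.
fn index_with_header(data: &[u8], header: &Header) -> Result<usize, String> {
    let mut bytes_indexed = 0;
    for (name, offset, len) in &header.tensors {
        let start = header.data_start + *offset;
        let end = start + *len;
        let slice = data
            .get(start..end)
            .ok_or_else(|| format!("tensor {} is out of bounds", name))?;
        // A real indexer would project rows here; the sketch only counts bytes.
        bytes_indexed += slice.len();
    }
    Ok(bytes_indexed)
}

/// Format-specific entry point: parse the header, then delegate.
fn index(
    data: &[u8],
    parse_header: fn(&[u8]) -> Result<Header, String>,
) -> Result<usize, String> {
    let header = parse_header(data)?;
    index_with_header(data, &header)
}
```

Existing callers keep calling the entry point; only a new format needs a new header parser.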
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7ab54615e1
    let shape = extract_json_array_u64(obj_str, "shape").unwrap_or_default();

    // Extract data_offsets
    let offsets = extract_json_array_u64(obj_str, "data_offsets").unwrap_or_default();
    let offset = if offsets.len() >= 1 { offsets[0] } else { 0 };
Fail fast when required tensor fields are missing
This code silently substitutes defaults ([] for shape, 0 for data_offsets) when parsing fails, then still emits a TensorInfo. On a truncated/corrupted or slightly non-conforming safetensors header, that can make multiple tensors read from the wrong byte range (often offset 0) and produce a seemingly successful but corrupted index instead of returning an error. Required fields for each tensor should be validated and parsing should fail if they are absent or invalid.
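One way to address this, sketched under assumptions: treat a missing `shape` or `data_offsets` as an error and cross-check the byte range against the shape before emitting anything. `TensorSpan` and `validate_tensor` are hypothetical names, the `Option` inputs stand in for whatever the PR's `extract_json_array_u64` returns, and the 2-bytes-per-element check assumes every tensor is BF16.

```rust
#[derive(Debug, PartialEq)]
struct TensorSpan {
    shape: Vec<u64>,
    start: u64,
    end: u64,
}

/// `None` means the field was absent or could not be parsed.
/// `data_len` is the size of the tensor data region after the header.
fn validate_tensor(
    name: &str,
    shape: Option<Vec<u64>>,
    offsets: Option<Vec<u64>>,
    data_len: u64,
) -> Result<TensorSpan, String> {
    let shape = shape.ok_or_else(|| format!("tensor {}: missing or invalid shape", name))?;
    let offsets =
        offsets.ok_or_else(|| format!("tensor {}: missing or invalid data_offsets", name))?;
    if offsets.len() != 2 {
        return Err(format!(
            "tensor {}: data_offsets must have 2 entries, got {}",
            name,
            offsets.len()
        ));
    }
    let (start, end) = (offsets[0], offsets[1]);
    if start > end || end > data_len {
        return Err(format!("tensor {}: byte range {}..{} is out of bounds", name, start, end));
    }
    // Assumption: BF16 only, so each element occupies 2 bytes.
    let expected = shape
        .iter()
        .try_fold(1u64, |acc, &d| acc.checked_mul(d))
        .and_then(|n| n.checked_mul(2))
        .ok_or_else(|| format!("tensor {}: shape overflows u64", name))?;
    if end - start != expected {
        return Err(format!(
            "tensor {}: range holds {} bytes, shape implies {}",
            name,
            end - start,
            expected
        ));
    }
    Ok(TensorSpan { shape, start, end })
}
```

With this shape of check, a truncated header produces an error naming the offending tensor instead of an index built from offset 0.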
What
Adds safetensors format support to the streaming indexer. Same pattern as GGUF: parse header → iterate tensors → project rows → write bgz7.
Files
safetensors.rs (new, 414 lines)
- read_safetensors_header() → parses the JSON header, produces GgufFile-compatible types
- stream_index_safetensors_bf16() → thin wrapper: parse header → delegate to stream_index_gguf_bf16_with_header
- test_stream_index_qwen35_safetensors (11 shards, ~55 GB)
gguf_indexer.rs (refactored)
- stream_index_gguf_bf16_with_header(): the core loop, format-agnostic
- stream_index_gguf_bf16() now just parses the GGUF header and delegates
mod.rs: registered safetensors module
Why safetensors for the reasoning diff
The Qwen3.5 base and distilled models are available as BOTH:
Indexing at BF16 gives cleaner Base17 fingerprints — Q8_0 introduces quantization noise before the golden-step projection. For causal diffing, less noise = sharper NARS truth values = more reliable reasoning scaffold detection.
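To make the noise argument concrete, here is a small sketch comparing the two input paths. BF16 is the upper 16 bits of an IEEE-754 f32, so widening it is exact. The Q8_0 round trip below is a simplified model (one shared scale per block and one int8 per value, with the scale kept as f32 for brevity) and is not ggml's actual code. All function names are hypothetical.

```rust
/// Widen BF16 bits to f32. Exact: BF16 is the high half of an f32.
fn bf16_to_f32(bits: u16) -> f32 {
    f32::from_bits((bits as u32) << 16)
}

/// Narrow f32 to BF16 by truncation (demo only; converters usually round).
fn f32_to_bf16_trunc(x: f32) -> u16 {
    (x.to_bits() >> 16) as u16
}

/// Simplified Q8_0-style round trip for one block of values:
/// quantize each value to int8 against a shared scale, then dequantize.
fn q8_0_round_trip(block: &[f32]) -> Vec<f32> {
    let amax = block.iter().fold(0.0f32, |m, &x| m.max(x.abs()));
    if amax == 0.0 {
        return vec![0.0; block.len()];
    }
    let scale = amax / 127.0;
    block
        .iter()
        .map(|&x| ((x / scale).round() as i8) as f32 * scale)
        .collect()
}
```

For a block such as `[1.0, 0.001]`, the shared scale is about 1/127, so the small value rounds to 0 under this Q8_0 model, while the BF16 path keeps it to within the format's relative precision. How much this matters for the resulting fingerprints depends on the projection and is not shown here.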
Architecture
One core pipeline, two header parsers. Syntax-checked with rustc 1.94.1.