
fix: streaming APR import + mmap reader (realizar#136) #418

Merged
noahgift merged 3 commits into main from fix/streaming-apr-reader-mmap on Mar 6, 2026
noahgift commented on Mar 6, 2026

Summary

  • AprV2StreamingWriter: writes tensor data to a temp file incrementally. Peak RAM is now one shard (~5GB), down from 134GB when importing the 67GB Qwen3.5-35B-A3B model.
  • mmap reader for apr tensors: list_tensors() now uses MappedFile + AprV2ReaderRef for APR v2 files. RSS is 10.9MB on a 67GB file (was 89GB, which caused a swap storm).
  • Fixes two pre-existing breakages: the architecture field in model_config.rs and the BrickProfiler API usage in gguf.rs.
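The idea behind the first two bullets can be sketched with the standard library alone. This is a minimal illustration, not the AprV2StreamingWriter or AprV2ReaderRef code from this PR. The file layout (tensor data first, then an index, then an 8-byte footer) is an assumption made for the example and is not the APR v2 format. `StreamingWriter`, `IndexEntry`, and this `list_tensors` are hypothetical names. The reader uses plain seeks instead of mmap so the example needs no external crate. Both approaches share one property: listing tensors touches only the index, never the tensor bytes.

```rust
use std::fs::File;
use std::io::{self, BufWriter, Read, Seek, SeekFrom, Write};
use std::path::Path;

/// Hypothetical index entry: tensor name plus its byte range in the data region.
#[derive(Debug, Clone, PartialEq)]
pub struct IndexEntry {
    pub name: String,
    pub offset: u64,
    pub len: u64,
}

/// Hypothetical streaming writer: tensor bytes go straight to disk and
/// only the small index is kept in memory.
pub struct StreamingWriter {
    out: BufWriter<File>,
    index: Vec<IndexEntry>,
    cursor: u64,
}

impl StreamingWriter {
    pub fn create(path: &Path) -> io::Result<Self> {
        Ok(Self { out: BufWriter::new(File::create(path)?), index: Vec::new(), cursor: 0 })
    }

    /// Append one tensor. The caller can drop `data` as soon as this returns,
    /// so peak memory is bounded by the largest single write.
    pub fn write_tensor(&mut self, name: &str, data: &[u8]) -> io::Result<()> {
        self.out.write_all(data)?;
        let len = data.len() as u64;
        self.index.push(IndexEntry { name: name.to_string(), offset: self.cursor, len });
        self.cursor += len;
        Ok(())
    }

    /// Write the index after the data, then an 8-byte footer holding the index offset.
    pub fn finish(mut self) -> io::Result<()> {
        let index_offset = self.cursor;
        self.out.write_all(&(self.index.len() as u32).to_le_bytes())?;
        for e in &self.index {
            self.out.write_all(&(e.name.len() as u32).to_le_bytes())?;
            self.out.write_all(e.name.as_bytes())?;
            self.out.write_all(&e.offset.to_le_bytes())?;
            self.out.write_all(&e.len.to_le_bytes())?;
        }
        self.out.write_all(&index_offset.to_le_bytes())?;
        self.out.flush()
    }
}

/// List tensors by reading only the footer and the index.
pub fn list_tensors(path: &Path) -> io::Result<Vec<IndexEntry>> {
    let mut f = File::open(path)?;
    let mut b8 = [0u8; 8];
    let mut b4 = [0u8; 4];
    // The footer is the last 8 bytes and points at the start of the index.
    f.seek(SeekFrom::End(-8))?;
    f.read_exact(&mut b8)?;
    f.seek(SeekFrom::Start(u64::from_le_bytes(b8)))?;
    f.read_exact(&mut b4)?;
    let count = u32::from_le_bytes(b4);
    let mut entries = Vec::new();
    for _ in 0..count {
        f.read_exact(&mut b4)?;
        let mut name = vec![0u8; u32::from_le_bytes(b4) as usize];
        f.read_exact(&mut name)?;
        f.read_exact(&mut b8)?;
        let offset = u64::from_le_bytes(b8);
        f.read_exact(&mut b8)?;
        let len = u64::from_le_bytes(b8);
        let name = String::from_utf8(name)
            .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?;
        entries.push(IndexEntry { name, offset, len });
    }
    Ok(entries)
}
```

In this sketch the writer's memory use is independent of total file size, and the reader's cost scales with the number of tensors rather than the number of bytes stored. That is the same shape as the RSS figures reported in this PR.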

Dogfood results

67GB APR (1,811 tensors, Qwen3.5-35B-A3B)
Before: 89 GB RSS → OOM/swap storm
After:  10.9 MB RSS, 0.00s wall clock, 0 swap

Contract

streaming-reader-v1.yaml — FALSIFY-MMAP-001 verified.

Test plan

  • apr tensors on 67GB .apr: 10.9MB RSS (FALSIFY-MMAP-001)
  • Library builds clean (cargo build --release --lib)
  • pv validate contracts/streaming-reader-v1.yaml passes
  • CI unified gate

🤖 Generated with Claude Code

noahgift and others added 3 commits March 6, 2026 15:41
…H-129)

- REALIZR_MAX_SEQ_LEN: override max sequence length (default 2048)
- REALIZR_FREE_CPU_WEIGHTS=1: free CPU weight copies after GPU preload
- Use with_max_seq_len instead of new() for flexible KV cache sizing
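As a rough illustration of how overrides like these could be read, here is a small sketch. The variable names and the default of 2048 come from the commit message above. Everything else is an assumption for the example: the helper names, the fallback on invalid or zero input, and treating only "1" as enabled. It is not the parsing code in realizar.

```rust
use std::env;

/// Default taken from the commit message above.
const DEFAULT_MAX_SEQ_LEN: usize = 2048;

/// Parse an optional override; fall back to the default when the value is
/// unset, not a number, or zero (the fallback policy is an assumption).
pub fn parse_max_seq_len(raw: Option<&str>) -> usize {
    raw.and_then(|s| s.trim().parse::<usize>().ok())
        .filter(|&n| n > 0)
        .unwrap_or(DEFAULT_MAX_SEQ_LEN)
}

/// Treat exactly "1" as enabled, matching the `=1` form in the commit message.
pub fn parse_flag(raw: Option<&str>) -> bool {
    raw.map(|s| s.trim() == "1").unwrap_or(false)
}

/// Read REALIZR_MAX_SEQ_LEN from the process environment.
pub fn max_seq_len_from_env() -> usize {
    parse_max_seq_len(env::var("REALIZR_MAX_SEQ_LEN").ok().as_deref())
}

/// Read REALIZR_FREE_CPU_WEIGHTS from the process environment.
pub fn free_cpu_weights_from_env() -> bool {
    parse_flag(env::var("REALIZR_FREE_CPU_WEIGHTS").ok().as_deref())
}
```

Keeping the parsing in pure functions that take `Option<&str>` makes the behavior testable without mutating the process environment.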

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…(Refs GH-176)

Replace benchmark_bricks() derived estimates (fugazi `*` suffixed scores)
with brick_scores_from_profiler() that reads actual GPU-synced timing
from trueno's BrickProfiler via all_brick_stats().

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ls (Refs realizar#136)

AprV2StreamingWriter writes tensor data to temp file incrementally (peak
RAM = 1 shard ~5GB, was 134GB). list_tensors() uses MappedFile +
AprV2ReaderRef for APR v2 files (10.9MB RSS on 67GB file, was 89GB).

Also fixes: pre-existing model_config.rs architecture field, gguf.rs
BrickProfiler API breakage.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@noahgift noahgift merged commit 4bd25a2 into main Mar 6, 2026