NeuroBrix 0.2.1

@benkelaya benkelaya released this 09 Jun 00:16

First publicly usable release of the universal inference runtime. It is not final, but it is usable. One engine, any model, any modality, zero model-specific code.

Highlights

  • 4-mode execution matrix green on orpheus-snac, hat-l-x4, Flex.1-alpha, and deepseek-moe-16b. Each model was validated in a clean-room venv (none of the 14 vendor libraries installed), with greedy, cross-mode-consistent outputs. The four modes are two independent branches × two variants: PyTorch (--sequential op-by-op oracle, --compiled fused torch+cuDNN/cuBLAS) and Triton (--triton-sequential and --triton: NeuroBrix @triton.jit kernels, zero torch).
  • Audio, LLM, image, and upscaler families run end-to-end.
  • New aten::im2col Triton kernel (HAT OCAB) → hat-l-x4 now 4/4.
  • Flex/FLUX CLIP-pooler advanced-index fix (_meta_index) → Flex.1-alpha triton coherent.
  • R30 triton-sequential cross-device (pipeline-parallel) → deepseek-moe-16b 4/4.
  • Zero Outsider: SNAC traced into orpheus's .nbx (no runtime HF download); internal tokenizer/mel/g2p/filterbank runners; engine runs from the .nbx with only torch (compiled) / triton+NBXTensor (triton).
  • Engine docs aligned to the real CLI (9 families, 4 modes, upscalers).
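
The "4/4" scores above count how many of the four modes pass for a given model. As a rough illustration of that bookkeeping, here is a minimal Python sketch. Only the four flag names come from these notes. The `run` callable, the variant labels, and the exact-equality comparison against the --sequential oracle are assumptions made for the example, not the engine's actual validation harness.

```python
from typing import Callable, Dict

# Flags as listed in the release notes, keyed by (branch, variant).
# The variant labels "sequential" / "non-sequential" are illustrative only.
MODE_FLAGS: Dict[tuple, str] = {
    ("pytorch", "sequential"): "--sequential",
    ("pytorch", "non-sequential"): "--compiled",
    ("triton", "sequential"): "--triton-sequential",
    ("triton", "non-sequential"): "--triton",
}


def check_matrix(run: Callable[[str], str]) -> Dict[str, bool]:
    """Run every mode once and compare each output to the --sequential oracle.

    `run` is a hypothetical callable that takes a mode flag and returns the
    model output as a string. Exact string equality is an assumption; a real
    harness for image or audio models would likely compare with a tolerance.
    """
    outputs = {flag: run(flag) for flag in MODE_FLAGS.values()}
    oracle = outputs["--sequential"]
    return {flag: out == oracle for flag, out in outputs.items()}


def score(results: Dict[str, bool]) -> str:
    """Format the pass count in the same style as the notes, e.g. '4/4'."""
    return f"{sum(results.values())}/{len(results)}"


if __name__ == "__main__":
    # Stand-in runner: every mode returns the same text, so all modes agree.
    print(score(check_matrix(lambda flag: "Paris.")))
```

With a stand-in runner that returns the same text in every mode, the score is 4/4. A runner whose output differs in one mode would score 3/4.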

Documented debt — Qwen3-30B-A3B --sequential

Validated by construction, not executed. Re-tracing eliminates the original frozen-seq-dim view crash (graph carries {mul,s0,s1}, 0 frozen views), and correctness is established via compiled mode (same graph, output "Paris.") plus the SymbolicShapeResolver shared by both paths. The op-by-op sequential oracle was not run to completion on 30B: ≈115k ops × per-op dispatch is impractically slow. This is honest debt and is not counted as a 4/4 pass. Qwen3 ships and is validated in compiled mode only. The other four matrix models are 4/4.

Install

pip install torch --index-url https://download.pytorch.org/whl/cu121
pip install neurobrix
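
After installing, a quick way to confirm that both distributions are visible to the active environment is to query package metadata from the standard library. This is a minimal sketch, not part of NeuroBrix. The helper names `installed_version` and `report` are made up for the example, and the distribution names are assumed to match the pip commands above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def report(dist_names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version (or None)."""
    return {name: installed_version(name) for name in dist_names}


if __name__ == "__main__":
    # Distribution names taken from the pip commands above.
    for name, ver in report(["torch", "neurobrix"]).items():
        print(f"{name}: {ver or 'not installed'}")
```

The same `report` helper can be pointed at any list of distribution names, for example to confirm that a clean-room venv does not contain packages it should not.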

Full details: CHANGELOG.md. Model licenses are each vendor's own (engine is Apache-2.0; it carries no model-license machinery).