Izwi v0.1.0-beta-17
Pre-release
Runtime Support
- macOS native artifacts are Metal-capable on Apple Silicon.
- Linux and Windows native artifacts are CPU-only and intentionally do not bundle CUDA runtime libraries.
- NVIDIA CUDA is supported through the Docker production-cuda target / docker compose --profile cuda and through source builds with --features cuda.
- Native artifacts are checked for CUDA runtime payloads and per-file size before release publication.
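As an illustration of the artifact check described above, here is a minimal sketch of a pre-publication audit. It is not the project's actual release script: the function name `audit_artifact`, the library name patterns, and the size limit are all assumptions chosen for the example.

```python
from fnmatch import fnmatchcase
from pathlib import Path

# Hypothetical name patterns for CUDA runtime libraries (assumption, not the
# project's real list). Patterns are lowercase; file names are lowercased
# before matching.
CUDA_PATTERNS = (
    "libcudart*", "libcublas*", "libcudnn*",
    "cudart64_*.dll", "cublas64_*.dll", "cudnn64_*.dll",
)

# Assumed per-file size limit; the real threshold is not stated in the notes.
MAX_FILE_BYTES = 100 * 1024 * 1024


def audit_artifact(root, max_file_bytes=MAX_FILE_BYTES):
    """Return (relative_path, reason) pairs for files that fail either check."""
    root = Path(root)
    problems = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        name = path.name.lower()
        # Check 1: no bundled CUDA runtime libraries in a CPU-only artifact.
        if any(fnmatchcase(name, pattern) for pattern in CUDA_PATTERNS):
            problems.append((rel, "cuda-runtime-payload"))
        # Check 2: no single file larger than the configured limit.
        if path.stat().st_size > max_file_bytes:
            problems.append((rel, "file-too-large"))
    return problems
```

A release job could call `audit_artifact("dist/")` on the unpacked artifact and refuse to publish if the returned list is non-empty.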
What's Changed
🚀 New Features & Model Support
- Granite Speech 4.1 2B Plus: Added native support for the new Granite Speech architecture, including standalone diarization, speaker-attributed ASR, and full support for long-audio workloads.
- Progress Tracking: Introduced end-to-end ASR transcription progress indicators.
- Summary Configuration: Switched ASR transcript summaries to a fully opt-in configuration.
⚡ Apple Silicon / Metal Optimizations
- Granite Speech: Optimized decode steps on Metal so that Granite Speech ASR runs well under real time, and stabilized desktop performance on long audio.
- Whisper: Accelerated the Whisper Metal decode pipeline; the unstable SDPA (Scaled Dot-Product Attention) path is gated off by default.
- Nemotron: Significantly optimized Nemotron ASR on Metal by implementing native depthwise kernels.
- Parakeet & Qwen3: Accelerated Parakeet ASR execution and optimized Qwen3 ASR GGUF inference pipelines on Apple Silicon.
🛠️ Stability & Performance Fixes
- Qwen Summaries: Transcription summaries now retry automatically when Qwen produces invalid logits.
- Model Management: Improved model selection and management workflows inside the UI's TTS job modal.
- Profiling: Enhanced Granite ASR decode profiling to better trace and isolate compute bottlenecks.
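The Qwen summary retry noted above can be pictured with a minimal sketch. This is an illustration under assumptions, not the project's implementation: it treats "invalid" logits as empty or non-finite (NaN or infinity), and `generate`, `summarize_with_retry`, and `InvalidLogitsError` are hypothetical names.

```python
import math


class InvalidLogitsError(Exception):
    """Raised when every attempt produced invalid logits."""


def logits_are_valid(logits):
    # Assumption: "invalid" means an empty vector or any NaN/inf value.
    return len(logits) > 0 and all(math.isfinite(x) for x in logits)


def summarize_with_retry(generate, transcript, max_attempts=3):
    """Call generate(transcript) until it returns valid logits.

    `generate` is a hypothetical callable returning (summary_text, logits).
    Returns (summary_text, attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        summary, logits = generate(transcript)
        if logits_are_valid(logits):
            return summary, attempt
    raise InvalidLogitsError(
        f"no valid logits after {max_attempts} attempts"
    )
```

Bounding the number of attempts keeps a persistently failing model from blocking the transcription job; the caller can catch `InvalidLogitsError` and skip the summary.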
Full Changelog: v0.1.0-beta-16...v0.1.0-beta-17