Izwi v0.1.0-beta-17

Pre-release

github-actions released this 22 Jun 12:20

618c225

Runtime Support

macOS native artifacts are Metal-capable on Apple Silicon.
Linux and Windows native artifacts are CPU-only and intentionally do not bundle CUDA runtime libraries.
NVIDIA CUDA is supported through the Docker production-cuda target (or docker compose --profile cuda) and through source builds with --features cuda.
Native artifacts are checked for CUDA runtime payloads and per-file size before release publication.
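
The pre-publication check mentioned above could look roughly like the following. This is only an illustrative sketch, not the project's actual release script: the function name `find_violations`, the list of library name prefixes, and the 100 MiB size limit are all assumptions made for the example.

```python
from pathlib import Path

# Assumed name prefixes of CUDA runtime libraries (illustrative, not exhaustive).
CUDA_PREFIXES = ("libcudart", "libcublas", "libcudnn", "cudart64_", "cublas64_")
# Hypothetical per-file size limit; the real threshold is not stated in these notes.
MAX_FILE_BYTES = 100 * 1024 * 1024


def find_violations(artifact_dir, max_bytes=MAX_FILE_BYTES):
    """Return (path, reason) pairs for files that should block publication."""
    violations = []
    for path in sorted(Path(artifact_dir).rglob("*")):
        if not path.is_file():
            continue
        name = path.name.lower()
        # A CPU-only artifact should not ship any CUDA runtime library.
        if name.startswith(CUDA_PREFIXES):
            violations.append((str(path), "cuda-runtime"))
        # Flag any single file larger than the configured limit.
        if path.stat().st_size > max_bytes:
            violations.append((str(path), "too-large"))
    return violations
```

A release job could call `find_violations` on each unpacked native artifact and refuse to publish if the returned list is non-empty.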

What's Changed

🚀 New Features & Model Support

Granite Speech 4.1 2B Plus: Added native support for the new Granite Speech architecture, including standalone diarization, speaker-attributed ASR, and full support for long-audio workloads.
Progress Tracking: Introduced end-to-end ASR transcription progress indicators.
Summary Configuration: Switched ASR transcript summaries to a fully opt-in configuration.

⚡ Apple Silicon / Metal Optimizations

Granite Speech: Optimized decode steps on Metal to bring Granite Speech ASR well below real-time latency while stabilizing desktop performance across long audio.
Whisper: Accelerated the Whisper decode pipeline on Metal while keeping the unstable SDPA (Scaled Dot-Product Attention) path gated by default.
Nemotron: Significantly optimized Nemotron ASR on Metal by implementing native depthwise kernels.
Parakeet & Qwen3: Accelerated Parakeet ASR execution and optimized Qwen3 ASR GGUF inference pipelines on Apple Silicon.

🛠️ Stability & Performance Fixes

Qwen Summaries: Added automatic retries for transcript summaries when Qwen produces invalid logits during generation.
Model Management: Improved model selection and management workflows inside the UI's TTS job modal.
Profiling: Enhanced Granite ASR decode profiling to better trace and isolate compute bottlenecks.
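
The summary retry behaviour listed under Qwen Summaries can be pictured with a small sketch. It assumes that "invalid logits" means NaN or infinite values and that three attempts are allowed. Neither detail is stated in these notes, and `InvalidLogitsError`, `check_logits`, and `summarize_with_retry` are hypothetical names rather than Izwi APIs.

```python
import math


class InvalidLogitsError(Exception):
    """Raised when a generation step yields NaN or infinite logits."""


def check_logits(logits):
    """Raise InvalidLogitsError if any logit is NaN or infinite."""
    if any(not math.isfinite(x) for x in logits):
        raise InvalidLogitsError("non-finite logit encountered")


def summarize_with_retry(generate, transcript, max_attempts=3):
    """Call generate(transcript) until it succeeds or attempts run out."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return generate(transcript)
        except InvalidLogitsError as err:
            # Retry only on invalid logits; other errors propagate immediately.
            last_error = err
    raise RuntimeError(f"summary failed after {max_attempts} attempts") from last_error
```

Here `generate` stands in for whatever callable produces the summary and is expected to raise `InvalidLogitsError` (for example via `check_logits`) when a decode step goes wrong.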

Full Changelog: v0.1.0-beta-16...v0.1.0-beta-17
