Izwi v0.1.0-beta-17

Pre-release

github-actions released this 22 Jun 12:20

Runtime Support

  • macOS native artifacts are Metal-capable on Apple Silicon.
  • Linux and Windows native artifacts are CPU-only and intentionally do not bundle CUDA runtime libraries.
  • NVIDIA CUDA is supported through the Docker production-cuda target (or docker compose --profile cuda) and through source builds with --features cuda.
  • Native artifacts are checked for CUDA runtime payloads and per-file size before release publication.
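
The pre-publication check described in the last bullet could look roughly like the sketch below. It is not the project's actual release script: the filename hints, the size limit, and the function name audit_artifact are all assumptions made for illustration.

```python
from pathlib import Path

# Hypothetical values: the release notes do not state the real patterns or limit.
CUDA_NAME_HINTS = ("cudart", "cublas", "cudnn", "nvrtc")
MAX_FILE_BYTES = 100 * 1024 * 1024


def audit_artifact(root, max_file_bytes=MAX_FILE_BYTES):
    """Return a list of problems found under an unpacked native artifact."""
    problems = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        # Flag anything whose name looks like a CUDA runtime library.
        name = path.name.lower()
        if any(hint in name for hint in CUDA_NAME_HINTS):
            problems.append(f"CUDA runtime payload: {path.name}")
        # Flag any single file above the per-file size limit.
        size = path.stat().st_size
        if size > max_file_bytes:
            problems.append(f"oversized file: {path.name} ({size} bytes)")
    return problems
```

A release job would unpack each CPU-only artifact, call audit_artifact on the directory, and refuse to publish if the returned list is non-empty.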

What's Changed

🚀 New Features & Model Support

  • Granite Speech 4.1 2B Plus: Added native support for the new Granite Speech architecture, including standalone diarization, speaker-attributed ASR, and full support for long-audio workloads.
  • Progress Tracking: Introduced end-to-end ASR transcription progress indicators.
  • Summary Configuration: Switched ASR transcript summaries to a fully opt-in configuration.

⚡ Apple Silicon / Metal Optimizations

  • Granite Speech: Optimized decode steps on Metal so that Granite Speech ASR runs well faster than real time, and stabilized desktop performance across long audio.
  • Whisper: Accelerated Whisper Metal decode pipelines while gating the unstable SDPA (Scaled Dot-Product Attention) path by default.
  • Nemotron: Significantly optimized Nemotron ASR on Metal by implementing native depthwise kernels.
  • Parakeet & Qwen3: Accelerated Parakeet ASR execution and optimized Qwen3 ASR GGUF inference pipelines on Apple Silicon.

🛠️ Stability & Performance Fixes

  • Qwen Summaries: Added automatic retries for transcription summaries when Qwen generates invalid logits.
  • Model Management: Improved model selection and management workflows inside the UI's TTS job modal.
  • Profiling: Enhanced Granite ASR decode profiling to better trace and isolate compute bottlenecks.
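
As an illustration of the Qwen summary retry described above, here is a minimal sketch. It assumes that "invalid" means NaN or infinite logit values and that a fixed attempt limit is used; neither detail is stated in these notes, and the names InvalidLogitsError, check_logits, and summarize_with_retry are hypothetical.

```python
import math


class InvalidLogitsError(ValueError):
    """Raised when a generation step yields NaN or infinite logits."""


def check_logits(logits):
    """Return logits unchanged, or raise if any value is non-finite."""
    if not all(math.isfinite(x) for x in logits):
        raise InvalidLogitsError("non-finite logit encountered")
    return logits


def summarize_with_retry(generate, transcript, max_attempts=3):
    """Call generate(transcript) until it succeeds or attempts run out."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return generate(transcript)
        except InvalidLogitsError as err:
            # Only invalid-logit failures are retried; other errors propagate.
            last_error = err
    raise RuntimeError(f"summary failed after {max_attempts} attempts") from last_error
```

Here generate stands in for whatever routine produces the summary and is expected to call check_logits on each decode step, so a bad step aborts that attempt and triggers another.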

Full Changelog: v0.1.0-beta-16...v0.1.0-beta-17