v0.3.0
WhisperForge 0.3.0: Quantization & Performance
Major Features
Phase C: INT8 Post-Training Quantization
- 4× model size reduction: Tiny (150 MB → 37 MB), ideal for edge deployment
- `--quantize int8` flag in model converter
- Transparent loading: no code changes required
- Full precision (FP32) and quantized (INT8) models interoperable
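
To illustrate where the roughly 4× reduction comes from, here is a minimal sketch of symmetric per-tensor INT8 quantization. These notes do not state which scheme WhisperForge uses, so the scheme, the `QuantizedTensor` type, and all function names below are assumptions for illustration, not the library's API.

```rust
/// Hypothetical container: INT8 values plus one FP32 scale for the whole tensor.
pub struct QuantizedTensor {
    pub values: Vec<i8>,
    pub scale: f32,
}

/// Symmetric quantization: scale = max|w| / 127, q = round(w / scale).
pub fn quantize_int8(weights: &[f32]) -> QuantizedTensor {
    let max_abs = weights.iter().fold(0.0_f32, |m, w| m.max(w.abs()));
    // Fall back to a scale of 1.0 for an all-zero tensor to avoid dividing by zero.
    let scale = if max_abs > 0.0 { max_abs / 127.0 } else { 1.0 };
    let values = weights
        .iter()
        .map(|w| (w / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    QuantizedTensor { values, scale }
}

/// Approximate reconstruction: w ≈ q * scale.
pub fn dequantize_int8(q: &QuantizedTensor) -> Vec<f32> {
    q.values.iter().map(|&v| v as f32 * q.scale).collect()
}

/// Payload size in bytes as (FP32, INT8), ignoring the single scale value.
pub fn payload_bytes(len: usize) -> (usize, usize) {
    (len * std::mem::size_of::<f32>(), len * std::mem::size_of::<i8>())
}
```

Each weight shrinks from 4 bytes to 1, which matches the 4× figure quoted above. The per-element reconstruction error of this scheme is at most about half the scale.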
Phase B.5: GPU-Accelerated Mel Spectrogram
- CubeCL DFT kernel for GPU mel filterbank matmul
- `--features cubecl-stft` enables GPU STFT pipeline
- Faster audio preprocessing on WGPU backend
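
For readers unfamiliar with the two steps being moved to the GPU, the following CPU-only sketch shows what a DFT power spectrum and a mel filterbank product compute for a single frame. It is a reference illustration under assumed names (`dft_power_spectrum`, `apply_filterbank`), not the CubeCL kernel shipped in this release, and it uses the direct O(N²) definition rather than an FFT.

```rust
use std::f32::consts::PI;

/// Power spectrum |X[k]|^2 for k = 0..=N/2 of one real-valued frame,
/// computed from the DFT definition X[k] = sum_t x[t] * exp(-2*pi*i*k*t/N).
pub fn dft_power_spectrum(frame: &[f32]) -> Vec<f32> {
    let n = frame.len();
    (0..=n / 2)
        .map(|k| {
            let (mut re, mut im) = (0.0_f32, 0.0_f32);
            for (t, &x) in frame.iter().enumerate() {
                let angle = -2.0 * PI * (k as f32) * (t as f32) / (n as f32);
                re += x * angle.cos();
                im += x * angle.sin();
            }
            re * re + im * im
        })
        .collect()
}

/// Apply a filterbank (one row of per-bin weights per mel band) to a power
/// spectrum. This is a matrix-vector product; over many frames it becomes a matmul.
pub fn apply_filterbank(filters: &[Vec<f32>], power: &[f32]) -> Vec<f32> {
    filters
        .iter()
        .map(|row| row.iter().zip(power).map(|(w, p)| w * p).sum::<f32>())
        .collect()
}
```

Both loops are independent across frequency bins and mel bands, which is why they map naturally onto a GPU.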
Burn 0.21 & burn-flex Migration
- Latest Burn with improved numerical stability
- CPU fallback (burn-flex) seamlessly handles CPU inference
- Better WGPU runtime integration
What's Changed
- Quantized models fully compatible with CLI and library API
- Streaming audio pipeline (Phase B) now fully integrated
- Fixed EOT suppression at step 0 for robustness
- Improved error handling across all crates
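
The EOT item above refers to preventing the decoder from emitting end-of-text as its very first token, which would yield an empty transcript. The notes do not describe the actual fix, so the following is only a sketch of the general technique, with hypothetical function names and the EOT token index passed in by the caller.

```rust
/// On the first decoding step, set the end-of-text logit to negative infinity
/// so it can never be selected. Later steps are left untouched.
pub fn suppress_eot_at_start(logits: &mut [f32], step: usize, eot_token: usize) {
    if step == 0 {
        // `get_mut` avoids a panic if the index is outside the vocabulary.
        if let Some(logit) = logits.get_mut(eot_token) {
            *logit = f32::NEG_INFINITY;
        }
    }
}

/// Greedy selection: index of the largest logit, or None for an empty slice.
pub fn argmax(logits: &[f32]) -> Option<usize> {
    logits
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.total_cmp(b.1))
        .map(|(i, _)| i)
}
```

With the mask applied at step 0, greedy decoding falls through to the next most likely token even when EOT has the highest raw score.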
All 5 Crates Published
- `whisperforge-core` v0.3.0: library
- `whisperforge-cli` v0.3.0: binary
- `whisperforge-convert` v0.3.0: model converter
- `whisperforge-align` v0.3.0: VAD + SRT
- `whisperforge-diarize` v0.3.0: speaker diarization
Dependency Updates
Simplified workspace dependency management: inter-crate deps now use workspace version automatically.
Next Phase
Phase D: WASM Target, browser-native speech-to-text with wasm-bindgen.
Full Changelog: v0.2.0...v0.3.0