What's Changed
Fix
- Parakeet TDT CoreML — M5 ANE crash (#314, refs #313). The
aufklarer/Parakeet-TDT-v3-CoreML-INT8 repo was rebuilt with a multi-encoder iOS18 layout (encoder.mlmodelc 30s + encoder_5s.mlmodelc + encoder_15s.mlmodelc); the prior EnumeratedShapes iOS17 build SIGSEGV'd in bnns::GraphCompile on M5 ANE. Single-shape -30s and -iOS-5s repos were unaffected and continue to ship unchanged.
New
- encoderVariant: parameter on ParakeetASRModel.fromPretrained. Pick a shape-specific encoder from the multi-encoder repo for short voice-pipeline chunks:
  let model = try await ParakeetASRModel.fromPretrained(
      modelId: "aufklarer/Parakeet-TDT-v3-CoreML-INT8",
      encoderVariant: "5s")
  nil (default) keeps the previous encoder.mlmodelc path, so single-shape repos work unchanged.
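A voice pipeline that sees chunks of varying length might pick the variant per chunk. The following is only a sketch under stated assumptions: both helper names are hypothetical and not part of the library, and the variant string "15s" is inferred from the encoder_15s.mlmodelc file name; only "5s" and nil are confirmed by the notes above.

```swift
/// Hypothetical helper (not part of the library): maps a chunk duration
/// to a value for the `encoderVariant:` parameter.
/// "5s" comes from the release notes; "15s" is an assumption inferred
/// from the encoder_15s.mlmodelc file name.
func encoderVariant(forChunkSeconds seconds: Double) -> String? {
    if seconds <= 5 { return "5s" }
    if seconds <= 15 { return "15s" }
    return nil  // default: the 30s encoder.mlmodelc
}

/// Hypothetical helper: the encoder bundle name each variant is expected
/// to correspond to, following the layout listed in the Fix section.
func encoderFileName(forVariant variant: String?) -> String {
    guard let variant = variant else { return "encoder.mlmodelc" }
    return "encoder_\(variant).mlmodelc"
}
```

The result of encoderVariant(forChunkSeconds:) could then be passed as the encoderVariant: argument when loading from the multi-encoder repo. Single-shape repos should keep passing nil.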
Performance
- macOS now prefers .cpuAndNeuralEngine with .cpuAndGPU fallback (was pinned to GPU only because of the historical ANE crash class). M5 Pro warm encoder forward: 8 ms at 5s / 24 ms at 15s / 74 ms at 30s, all on ANE. ~540 MB less peak RSS than GPU on the 30s shape.
- WER unchanged on LibriSpeech test-clean n=200: 2.37% / 116× RTF / 916 MB peak, matching the published baseline.
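The prefer-then-fall-back pattern can be sketched roughly as below. This is an illustration, not the library's actual loading code: loadWithFallback and loadEncoder are hypothetical names, and the Core ML part assumes a compiled .mlmodelc URL is already available on disk. Note that a fallback like this only covers load errors that are thrown; a hard crash such as a SIGSEGV cannot be caught this way.

```swift
import Foundation

/// Hypothetical helper: tries each option in order and returns the first
/// one whose load succeeds, together with the loaded value.
/// Rethrows the last error if every option fails.
func loadWithFallback<Option, Model>(
    _ options: [Option],
    load: (Option) throws -> Model
) throws -> (option: Option, model: Model) {
    precondition(!options.isEmpty, "need at least one option")
    var lastError: Error?
    for option in options {
        do {
            return (option, try load(option))
        } catch {
            lastError = error  // remember the failure and try the next option
        }
    }
    throw lastError!  // non-nil: the loop ran at least once and every load threw
}

#if canImport(CoreML)
import CoreML

/// Hypothetical usage: prefer the Neural Engine, fall back to the GPU.
@available(macOS 13.0, iOS 16.0, tvOS 16.0, watchOS 9.0, *)
func loadEncoder(at url: URL) throws -> MLModel {
    let order: [MLComputeUnits] = [.cpuAndNeuralEngine, .cpuAndGPU]
    return try loadWithFallback(order) { (units: MLComputeUnits) -> MLModel in
        let config = MLModelConfiguration()
        config.computeUnits = units
        return try MLModel(contentsOf: url, configuration: config)
    }.model
}
#endif
```

Keeping the ordering in a plain array makes it easy to pin a single compute unit again if a new crash class shows up on some hardware.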
CLI
brew upgrade speech # once homebrew-core formula bumps to v0.0.21
# or build from source
git clone --branch v0.0.21 https://github.com/soniqo/speech-swift.git
cd speech-swift && make build
./.build/release/speech transcribe meeting.wav --engine parakeet
Full Changelog: v0.0.20...v0.0.21