v0.0.21

ivan-digital released this 17 Jun 16:14
· 22 commits to main since this release
7609977

What's Changed

Fix

  • Parakeet TDT CoreML: M5 ANE crash (#314, refs #313). The aufklarer/Parakeet-TDT-v3-CoreML-INT8 repo was rebuilt with a multi-encoder iOS18 layout (encoder.mlmodelc for 30s, plus encoder_5s.mlmodelc and encoder_15s.mlmodelc). The prior EnumeratedShapes iOS17 build crashed with SIGSEGV in bnns::GraphCompile on M5 ANE. The single-shape -30s and -iOS-5s repos were unaffected and continue to ship unchanged.

New

  • encoderVariant: parameter on ParakeetASRModel.fromPretrained. Pick a shape-specific encoder from the multi-encoder repo for short voice-pipeline chunks:
    let model = try await ParakeetASRModel.fromPretrained(
        modelId: "aufklarer/Parakeet-TDT-v3-CoreML-INT8",
        encoderVariant: "5s")
    Passing nil (the default) keeps the previous encoder.mlmodelc path, so single-shape repos work unchanged.
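
    As a rough illustration of choosing a variant per chunk, the sketch below maps a chunk length to an encoderVariant value. The helper name is hypothetical and not part of speech-swift. "5s" comes from the example above, while "15s" is an assumption inferred from the encoder_15s.mlmodelc file name and may not match the accepted string.

```swift
/// Hypothetical helper (not part of speech-swift): maps a chunk length in
/// seconds to an `encoderVariant` string for the multi-encoder repo.
/// "5s" appears in the release notes; "15s" is an assumption based on the
/// `encoder_15s.mlmodelc` file name. Returning nil selects the default
/// `encoder.mlmodelc` (30s) path.
func encoderVariant(forChunkSeconds seconds: Double) -> String? {
    if seconds <= 5 { return "5s" }
    if seconds <= 15 { return "15s" }
    return nil
}
```

    The result can be passed straight through as the encoderVariant: argument, so chunks longer than 15s fall back to the default 30s encoder.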

Performance

  • macOS now prefers .cpuAndNeuralEngine with a .cpuAndGPU fallback (it was previously pinned to GPU only because of the historical ANE crash class). M5 Pro warm encoder forward: 8 ms at 5s / 24 ms at 15s / 74 ms at 30s, all on ANE. Peak RSS is ~540 MB lower than on GPU for the 30s shape.
  • WER unchanged on LibriSpeech test-clean n=200: 2.37% / 116× RTF / 916 MB peak, matching the published baseline.
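
The prefer-then-fall-back order described above could be sketched roughly as follows. This is not the library's implementation: the ComputeUnits enum is a stand-in that mirrors the Core ML case names so the snippet compiles without importing CoreML, loadWithFallback is a hypothetical helper, and treating a thrown load error as the fallback trigger is an assumption.

```swift
/// Stand-in for Core ML's `MLComputeUnits`, so the sketch compiles without
/// importing CoreML. Case names mirror the real enum.
enum ComputeUnits: String {
    case cpuAndNeuralEngine, cpuAndGPU
}

/// Hypothetical helper: tries each compute-unit option in order and returns
/// the first model that loads, together with the option that succeeded.
/// If every attempt throws, the last error is rethrown.
func loadWithFallback<Model>(
    preferring order: [ComputeUnits] = [.cpuAndNeuralEngine, .cpuAndGPU],
    load: (ComputeUnits) throws -> Model
) throws -> (model: Model, units: ComputeUnits) {
    precondition(!order.isEmpty, "need at least one compute-unit option")
    var lastError: Error?
    for units in order {
        do {
            let model = try load(units)
            return (model, units)
        } catch {
            // Remember the failure and move on to the next option.
            lastError = error
        }
    }
    throw lastError!
}
```

With Core ML, the load closure would typically set computeUnits on an MLModelConfiguration and pass it to MLModel(contentsOf:configuration:). Returning the option that succeeded makes it easy to log whether the ANE or the GPU path was actually used.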

CLI

brew upgrade speech   # once homebrew-core formula bumps to v0.0.21

# or build from source
git clone --branch v0.0.21 https://github.com/soniqo/speech-swift.git
cd speech-swift && make build
./.build/release/speech transcribe meeting.wav --engine parakeet

Full Changelog: v0.0.20...v0.0.21