This repo wires a macOS Speech framework bridge (Swift) into a Rust library that exposes a Node.js addon for live microphone transcription. The `native-live.js` script streams mic audio into Rust, which forwards it to the Swift analyzer and prints the transcripts coming back from the native callback.

The flow targets macOS 15 (Sequoia) and newer only, because Apple’s asynchronous SpeechAnalyzer APIs are not available in earlier SDKs.
- macOS 15 (Sequoia) or newer with an Xcode toolchain that provides the macOS 26.0 SDK referenced below
- Rust toolchain (`rustup` or Homebrew)
- Node.js 18+
- Swift compiler (`swiftc`) from the Xcode command-line tools
First install the Node dependencies (only needed once):

```
npm install
```

Next, compile the Swift bridge against the macOS 26.0 SDK. We drop the artifacts into the Rust `target/release` directory so the Node addon can dlopen it without extra steps.
```
mkdir -p target/release swift-module-cache
swiftc -module-cache-path swift-module-cache \
  -emit-library -emit-module -module-name SpeechShim \
  -o target/release/libSpeechShim.dylib swift/SpeechShim.swift \
  -framework Speech -framework AVFoundation -framework CoreMedia \
  -target arm64-apple-macos26.0
```

That command emits:
- `target/release/libSpeechShim.dylib`
- `target/release/SpeechShim.swiftmodule`
The Rust layer loads `libSpeechShim.dylib` at runtime, so keeping it beside the Rust artifacts simplifies things.
Build the addon in release mode (this produces `target/release/whisper_local.node`):

```
cargo build --release
```

Rerun this whenever the Rust sources change, or after rebuilding the Swift shim so the linker can see the updated dylib.
If you want to package the outputs, copy both the Swift dylib and the Node addon together, e.g.:
```
cp target/release/libSpeechShim.dylib target/release/whisper_local.node /path/to/app/resources/
```

When running from the repo you can skip this, because the next step points `DYLD_LIBRARY_PATH` at `target/release`.
Point the dynamic loader at the release artifacts and start the Node script. By default it uses the macOS speech analyzer backend (`WHISPER_MODE=os`).

```
DYLD_LIBRARY_PATH=target/release \
WHISPER_MODE=os \
node native-live.js
```

- `WHISPER_MODE=os` tells Rust to use the Swift Speech bridge.
- `native-live.js` captures mono 16 kHz audio via `mic`, converts it to float samples, and pushes them into Rust.
- The Node console prints mic-buffer diagnostics plus `[os] transcription: …` lines from the Swift callback.
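The float conversion mentioned above can be sketched as follows. This is illustrative, not the literal code in `native-live.js`; it assumes the `mic` stream delivers signed 16-bit little-endian PCM, which is its usual output format.

```javascript
// Convert a Node Buffer of signed 16-bit little-endian PCM samples into
// Float32 samples in [-1, 1), the shape a native transcription layer
// typically expects. (Sketch only — the helper name is made up here.)
function pcm16ToFloat32(buf) {
  const out = new Float32Array(buf.length / 2);
  for (let i = 0; i < out.length; i++) {
    // Divide by 32768 so the full int16 range maps onto [-1, 1).
    out[i] = buf.readInt16LE(i * 2) / 32768;
  }
  return out;
}
```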
- If you see `sa2_push_pcm_f32 return code=-2`, the Swift shim rejected the audio format. The current build expects 16 kHz mono; rebuild the Swift shim and verify the logs (`[SpeechShim] expected audio format …`) match your mic configuration.
- If Node cannot load `libSpeechShim.dylib`, ensure `DYLD_LIBRARY_PATH` includes the directory containing the dylib, or copy it next to `whisper_local.node`.
- `src/lib.rs` – Rust entry point; handles the Node bridge and forwards mic buffers to Swift.
- `src/speech_analyzer.rs` – Rust FFI bindings to the Swift speech shim.
- `swift/SpeechShim.swift` – Async Swift host wrapping Apple’s `SpeechAnalyzer`/`SpeechTranscriber`.
- `native-live.js` – Demo Node script that captures mic audio and prints transcription callbacks.
Feel free to swap `WHISPER_MODE` to `whisper` to experiment with the ONNX model path once macOS transcription is flowing.
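If you do experiment with switching modes, backend selection can be guarded with a tiny helper like this (a sketch assuming `os` and `whisper` are the only valid values; the function name is made up for illustration):

```javascript
// Pick the transcription backend from the environment, defaulting to the
// macOS speech analyzer. Illustrative sketch, not the addon's real logic.
function pickBackend(env) {
  const mode = env.WHISPER_MODE || 'os';
  if (mode !== 'os' && mode !== 'whisper') {
    throw new Error(`unknown WHISPER_MODE: ${mode}`);
  }
  return mode;
}
```

Rejecting unknown values up front avoids silently falling through to the wrong backend when the variable is mistyped.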