Realtime macOS speech-to-text library for performant microphone and system audio transcription. Supports local model transcription and Deepgram. Perfect for meeting apps, note takers, or ambient computing.
Splitstream uses macOS' Audio Tap API with Rust bindings from cidre to capture system output. Lightweight echo cancellation powered by SpeexDSP keeps the mic channel clean even when audio is playing through the speakers, at negligible performance cost.
The result is multithreaded and non-blocking dual transcription that runs seamlessly on low-end devices.
The crucial pieces of this library (core audio pipeline, audio capture, echo cancellation, transcription backend) were designed and written by a human. The docstrings and the refactoring that turned it into a usable library were done with Claude.
- macOS 14.2+: the Audio Tap API used for system audio capture only works on macOS 14.2 and later
- Rust 1.80+
- Xcode Command Line Tools: `xcode-select --install`
- cmake: `brew install cmake` (required to compile audio dependencies)
Parakeet is the default backend: no API key, no cloud, runs entirely on your machine.
Step 1. Add splitstream to your project:
```sh
cargo add splitstream
```

Step 2. Download the Parakeet model with download-parakeet (one-time setup, ~500 MB):

```sh
cargo install splitstream --bin download-parakeet
download-parakeet
```

This downloads the three model files (encoder.onnx, decoder_joint.onnx, tokenizer.json) into ./models/parakeet-eou/ in your current directory.
Step 3. Use it in your Rust project:
```toml
# Cargo.toml
[dependencies]
splitstream = "0.1" # version may vary
tokio = { version = "1", features = ["full"] } # version may vary
```

```rust
use splitstream::{AudioSource, SplitStreamBuilder};

#[tokio::main]
async fn main() {
    let (handle, mut rx) = SplitStreamBuilder::new()
        .with_parakeet("models/parakeet-eou")
        .echo_cancellation(true)
        .start()
        .await
        .expect("failed to start splitstream");

    tokio::spawn(async move {
        tokio::signal::ctrl_c().await.ok();
        handle.shutdown();
    });

    while let Some(t) = rx.recv().await {
        let label = match t.source {
            AudioSource::Mic => "[🎤 Microphone]",
            AudioSource::Sys => "[🖥️ System Audio]",
        };
        println!("{} {}", label, t.text);
    }
}
```

Deepgram streams audio to the cloud and returns transcripts with very low latency. Requires a Deepgram API key.
Note: Deepgram pulls in Opus encoding, which requires cmake:

```sh
brew install cmake
```
Step 1. Add splitstream with the deepgram feature:
```sh
cargo add splitstream --features deepgram
```

Step 2. Set your API key (add it to .env or export it in your shell):

```sh
DEEPGRAM_API_KEY=your_key_here
```
Step 3. Use it in your Rust project:
```toml
# Cargo.toml
[dependencies]
splitstream = "0.1" # version may vary
tokio = { version = "1", features = ["full"] } # version may vary
```

```rust
use splitstream::{AudioSource, SplitStreamBuilder};

#[tokio::main]
async fn main() {
    let api_key = std::env::var("DEEPGRAM_API_KEY").expect("DEEPGRAM_API_KEY not set");

    let (handle, mut rx) = SplitStreamBuilder::new()
        .with_deepgram(api_key)
        .echo_cancellation(true)
        .start()
        .await
        .expect("failed to start splitstream");

    tokio::spawn(async move {
        tokio::signal::ctrl_c().await.ok();
        handle.shutdown();
    });

    while let Some(t) = rx.recv().await {
        let label = match t.source {
            AudioSource::Mic => "[🎤 Microphone]",
            AudioSource::Sys => "[🖥️ System Audio]",
        };
        println!("{} {}", label, t.text);
    }
}
```

| Backend  | Method                      | Feature flag         | Notes                           |
|----------|-----------------------------|----------------------|---------------------------------|
| Parakeet | `.with_parakeet(model_dir)` | `parakeet` (default) | Local ONNX, no API key needed   |
| Deepgram | `.with_deepgram(api_key)`   | `deepgram`           | Cloud, requires API key + cmake |
To use Deepgram only (skips compiling Parakeet/ONNX):
```toml
splitstream = { version = "0.1", default-features = false, features = ["deepgram"] }
```

The handle returned from .start() lets you toggle settings without stopping capture:
```rust
handle.set_mic_muted(true);          // silence the mic channel
handle.set_sys_muted(true);          // silence system audio (compliance mode)
handle.set_echo_cancellation(false); // toggle AEC on the fly
handle.shutdown();                   // stop everything cleanly
```

All of these take effect on the next 20 ms tick and don't block.
Configure it via settings.toml in the working directory:

```toml
transcription_backend = "parakeet" # "parakeet" | "deepgram"
echo_cancellation = true
compliance_mode_on_start = false # start with sys audio muted
parakeet_model_dir = "./models/parakeet-eou"
# Required for Deepgram (or set DEEPGRAM_API_KEY in your environment)
# api_key = "..."
```

| Flag       | Default | What it gates                                          |
|------------|---------|--------------------------------------------------------|
| `parakeet` | ✅ on   | Parakeet ONNX inference (pulls in ORT + ONNX Runtime)  |
| `deepgram` | ❌ off  | Deepgram cloud backend (pulls in Opus, requires cmake) |
| `whisper`  | ❌ off  | WIP                                                    |
Splitstream is dual-licensed:
- Open source / individuals — AGPL-3.0. Free to use, modify, and distribute, provided your application is also released under AGPL-3.0.
- Commercial use — If you want to use Splitstream in a proprietary product without open-sourcing your code, a commercial license is required. Contact retaildiamond@gmail.com.
See LICENSE-COMMERCIAL for details.
