Helping you sound crispy.
Edit video by editing text — entirely on your machine.
Features · How It Works · Quick Start · Contributing
Recording yourself is easy. Editing out every "um", false start, and awkward pause? That's the hard part.
Toaster is a transcript-first desktop editor for spoken audio and video. Instead of scrubbing a timeline, you read your words, select the ones you don't want, and delete them — just like editing a document. Toaster handles the audio splicing, waveform sync, and caption export behind the scenes.
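To make the idea concrete, here is a minimal sketch of how deleting words in a transcript could translate into media time ranges. It is not Toaster's actual implementation: the `Word` shape and the `cutRanges` and `keepRanges` names are assumptions made for illustration. The key point is that a deletion is only a flag on a word, so the source media is left untouched.

```typescript
// Hypothetical shape of one transcribed word (field names are assumptions).
interface Word {
  text: string;
  start: number; // seconds from the start of the source media
  end: number; // seconds, >= start
  deleted: boolean; // toggled by the editor; the source file is not modified
}

interface TimeRange {
  start: number;
  end: number;
}

// Merge runs of consecutive deleted words into single cut ranges.
function cutRanges(words: Word[]): TimeRange[] {
  const cuts: TimeRange[] = [];
  let open: TimeRange | null = null;
  for (const w of words) {
    if (w.deleted) {
      if (open) {
        open.end = w.end; // extend the current run
      } else {
        open = { start: w.start, end: w.end }; // start a new run
      }
    } else if (open) {
      cuts.push(open); // a kept word closes the run
      open = null;
    }
  }
  if (open) cuts.push(open);
  return cuts;
}

// The ranges to play or export are the complement of the cuts in [0, duration].
function keepRanges(words: Word[], duration: number): TimeRange[] {
  const keep: TimeRange[] = [];
  let cursor = 0;
  for (const cut of cutRanges(words)) {
    if (cut.start > cursor) keep.push({ start: cursor, end: cut.start });
    cursor = Math.max(cursor, cut.end);
  }
  if (cursor < duration) keep.push({ start: cursor, end: duration });
  return keep;
}
```

Restoring a word in this model is just flipping `deleted` back to `false` and recomputing the ranges, which is one simple way to keep every edit reversible.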
Everything runs locally. No cloud APIs, no uploads, no subscriptions.
- Edit media by editing text — see your transcript, select words, delete/silence/restore in one click
- Local transcription — generate word-level transcripts with on-device models (Whisper ecosystem)
- Filler & disfluency detection — automatically highlight "um", "uh", "you know", and pauses
- Non-destructive editing — every action is reversible; your original file is never touched
- Synchronized playback — transcript, waveform, and video stay in lockstep as you edit
- Export cleaned media — render your final cut with captions (SRT/VTT) and script text
- Save & resume — project files preserve your edits for iterative sessions
- Privacy-first — no runtime network calls, no telemetry, fully offline
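As an illustration of the filler and pause highlighting listed above, the sketch below flags filler phrases and long gaps in a word-level transcript. It is a naive approach under stated assumptions, not Toaster's detector: the `TimedWord` shape, the phrase list, the function names, and the 0.75 second pause threshold are all hypothetical.

```typescript
// Hypothetical word shape; only text and timing are needed here.
interface TimedWord {
  text: string;
  start: number; // seconds
  end: number; // seconds
}

type Flag =
  | { kind: "filler"; from: number; to: number } // word indices, inclusive
  | { kind: "pause"; afterIndex: number; seconds: number };

// Illustrative phrase list (an assumption, not Toaster's vocabulary).
const FILLER_PHRASES: string[][] = [["um"], ["uh"], ["you", "know"]];

// Lowercase and strip punctuation so "Um," matches "um".
function normalize(text: string): string {
  return text.toLowerCase().replace(/[^a-z']/g, "");
}

// Length of the longest filler phrase starting at index i, or 0 if none.
function matchFiller(words: TimedWord[], i: number): number {
  let best = 0;
  for (const phrase of FILLER_PHRASES) {
    const matches = phrase.every(
      (token, k) => normalize(words[i + k]?.text ?? "") === token
    );
    if (matches && phrase.length > best) best = phrase.length;
  }
  return best;
}

function detectDisfluencies(words: TimedWord[], minPauseSeconds = 0.75): Flag[] {
  const flags: Flag[] = [];
  // Pass 1: filler phrases, skipping past each match.
  let i = 0;
  while (i < words.length) {
    const len = matchFiller(words, i);
    if (len > 0) {
      flags.push({ kind: "filler", from: i, to: i + len - 1 });
      i += len;
    } else {
      i += 1;
    }
  }
  // Pass 2: silent gaps between consecutive words.
  words.forEach((word, j) => {
    const next = words[j + 1];
    if (next && next.start - word.end >= minPauseSeconds) {
      flags.push({ kind: "pause", afterIndex: j, seconds: next.start - word.end });
    }
  });
  return flags;
}
```

A plain phrase match like this also flags legitimate uses of "you know", which is one reason to highlight candidates for review rather than delete them automatically.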
1. Open a video or audio file.
2. Transcribe with a local model: Toaster generates a word-level transcript.
3. Read and edit: select the words you want to remove and hit Delete.
4. Preview: play back your edit in real time with synced waveform and video.
5. Export: render the cleaned media plus captions and script.
The entire workflow stays on your machine. Your media never leaves your computer.
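The export step has to keep captions aligned with the shortened media. The sketch below shows one way that could work for SRT output: each caption time is shifted left by the total duration cut before it. This illustrates the general technique only; the `Cut` and `Cue` shapes and all function names are assumptions, not Toaster's API.

```typescript
// A removed span of the source media, in seconds (hypothetical shape).
interface Cut {
  start: number;
  end: number;
}

// A caption with source-media timestamps, in seconds (hypothetical shape).
interface Cue {
  text: string;
  start: number;
  end: number;
}

// Map a source timestamp onto the edited timeline.
// A time that falls inside a cut is clamped to where that cut begins.
function toOutputTime(t: number, cuts: Cut[]): number {
  let removed = 0;
  for (const c of cuts) {
    if (t >= c.end) {
      removed += c.end - c.start;
    } else if (t > c.start) {
      removed += t - c.start;
    }
  }
  return t - removed;
}

// Left-pad a non-negative integer with zeros.
function pad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) s = "0" + s;
  return s;
}

// SRT timestamps have the form HH:MM:SS,mmm.
function srtTime(seconds: number): string {
  const ms = Math.round(seconds * 1000);
  const h = Math.floor(ms / 3600000);
  const m = Math.floor(ms / 60000) % 60;
  const s = Math.floor(ms / 1000) % 60;
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms % 1000, 3)}`;
}

// Render numbered SRT cues separated by blank lines.
function toSrt(cues: Cue[], cuts: Cut[]): string {
  return cues
    .map((cue, i) => {
      const start = srtTime(toOutputTime(cue.start, cuts));
      const end = srtTime(toOutputTime(cue.end, cuts));
      return `${i + 1}\n${start} --> ${end}\n${cue.text}\n`;
    })
    .join("\n");
}
```

The sketch assumes cues consisting only of deleted words were filtered out beforehand. WebVTT output would follow the same mapping with a `WEBVTT` header line and a period instead of a comma before the milliseconds.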
Download the latest installer from the Releases page.
| Platform | Architectures | Format |
|---|---|---|
| Windows | x64, ARM64 | .msi |
| Linux (Debian) | x64, ARM64 | .deb |
| Linux (RPM) | x64, ARM64 | .rpm |
| Linux (any) | x64, ARM64 | .AppImage |
Note: Windows installers in v0.1.0 are unsigned — SmartScreen will show "Windows protected your PC" the first time. Click More info → Run anyway to install. Code signing is planned for a follow-up release.
macOS builds are not currently published. Build from source if you need a macOS app — see docs/build.md.
See docs/build.md for full platform setup. The short version:
bun install --frozen-lockfile
cargo tauri dev

On Windows, run .\scripts\setup-env.ps1 first to configure the MSVC + LLVM build environment.
"localhost refused to connect" on launch (Windows)
If the installed app shows a localhost refused to connect page instead of the editor, you most likely have a leftover debug build of toaster.exe still running from a previous cargo tauri dev / cargo tauri build --debug session. Tauri's single-instance plugin forwards every Start-Menu launch to that stale window, which expects a Vite dev server at http://localhost:1420 and shows the WebView2 connection error when it doesn't answer.
To fix it, stop the stale process and relaunch the installed app from PowerShell:

Get-Process toaster | Stop-Process -Force
Start-Process "C:\Program Files\Toaster\toaster.exe"

The released MSI from the Releases page bundles all frontend assets directly into toaster.exe and never contacts localhost:1420.
| Layer | Technology |
|---|---|
| Desktop shell | Tauri 2.x |
| Backend | Rust |
| Frontend | React · TypeScript · Tailwind CSS |
| State | Zustand |
| Transcription | Local model inference (Whisper ecosystem) |
| Export | FFmpeg 7 |
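For readers curious how an FFmpeg-based export can drop deleted spans, the sketch below builds command-line arguments from a list of ranges to keep. It uses FFmpeg's `select`/`aselect` filters with `between(t,a,b)` expressions, then `setpts`/`asetpts` to close the gaps. Whether Toaster's export pipeline works this way is not stated here, so treat this as one common recipe; the `Keep` type and function names are hypothetical, and the input is assumed to contain both a video and an audio stream.

```typescript
// A span of the source media to keep, in seconds (hypothetical shape).
interface Keep {
  start: number;
  end: number;
}

// Build an expression that is non-zero only inside the kept ranges.
function selectExpr(keep: Keep[]): string {
  return keep
    .map((r) => `between(t,${r.start.toFixed(3)},${r.end.toFixed(3)})`)
    .join("+");
}

// Arguments for an ffmpeg invocation (pass as argv, not through a shell).
// The single quotes protect the commas inside the expression from
// FFmpeg's own filtergraph parser.
function ffmpegArgs(input: string, output: string, keep: Keep[]): string[] {
  if (keep.length === 0) {
    throw new Error("nothing to export: every range was cut");
  }
  const expr = selectExpr(keep);
  return [
    "-i", input,
    // Keep matching video frames, then regenerate timestamps without gaps.
    "-vf", `select='${expr}',setpts=N/FRAME_RATE/TB`,
    // Same selection for audio samples.
    "-af", `aselect='${expr}',asetpts=N/SR/TB`,
    output,
  ];
}
```

Filtering forces a re-encode of both streams, so an approach like this trades export speed for frame-accurate cuts.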
We welcome contributions! Please read CONTRIBUTING.md before opening a PR.
# Run the checks contributors are expected to pass
cd src-tauri && cargo test && cargo clippy
npm run lint

For translation contributions, see CONTRIBUTING_TRANSLATIONS.md.
Toaster is forked from Handy by CJ Pais. Handy proved that a free, open-source, fully-offline speech tool could be simple, private, and community-driven. Toaster builds on that foundation with a transcript-first editing workflow.
We're grateful to the projects that make Toaster possible:
- Tauri — the Rust-native app framework that keeps the bundle small and the runtime fast
- Whisper by OpenAI — the speech recognition model at the heart of local transcription
- whisper.cpp & ggml — cross-platform inference and hardware acceleration
- FFmpeg — the Swiss Army knife of media processing
MIT — see LICENSE for details.
