rcspam edited this page Apr 26, 2026 · 9 revisions

🌐 Language: English | Français

dictee — Wiki

Linux voice dictation with local speech recognition, optional translation, and full KDE Plasma 6 / GNOME integration. Supports 4 ASR backends (Parakeet-TDT, Canary, faster-whisper, Vosk) across 25+ languages.

This wiki is the technical companion to the README. The README covers what dictee is, how to install it, and typical usage. The wiki goes deeper into configuration, backend internals, the post-processing pipeline, troubleshooting, and contributing.

🚀 Three entry paths

I want to install it: Installation · Setup-Wizard · GPU-Setup

I want to understand how it works: ASR-Backends · Post-Processing-Overview · Translation · Configuration

I want to contribute or build from source: Developer-Guide · Building from source · Testing

📖 Full page index

Getting started

  • Installation — 1-liner, .deb/.rpm/AUR/tarball, aarch64/Jetson, non-packaged distros
  • Setup-Wizard — first-run 8-step guided flow with end-to-end GIF walkthrough
  • Configuration — tab-by-tab reference of the dictee-setup UI (all backends, PP, UI options)
  • Plasmoid-Widget — KDE Plasma 6 widget, 5 animation styles, advanced settings
  • Tray-Icon — system tray menu, light/dark themes, GNOME/Ubuntu (AppIndicator)
  • Keyboard-Shortcuts — KDE/GNOME shortcut capture, tiling WMs (Sway/i3/Hyprland), double shortcut
  • Voice-Commands — every voice command per language + the floating cheatsheet (Ctrl+Alt+F9)
  • GPU-Setup — CUDA prerequisites per distro, cuDNN, GPU detection, CPU fallback

Speech recognition (ASR)

Translation

  • Translation — 5 backends compared (Canary built-in, LibreTranslate, Ollama, Google, Bing)
  • Ollama-Setup — install, recommended models (Gemma 3 4B), structured prompts

Post-processing pipeline

Diarization & CLI

  • Diarization — Sortformer, up to 4 speakers, VRAM limits, batch chunked (v1.4)
  • CLI-Reference — every command: dictee, dictee-switch-backend, Rust binaries

Reference

  • Troubleshooting — common errors, GPU OOM, logs, daemon socket
  • FAQ — why Rust? why not Whisper streaming? multi-user support?
  • Developer-Guide — build, architecture, tests, contribution workflow
  • Changelog — version history (mirrors GitHub release notes)

🔗 External links


Languages: this wiki is available in English and Français — every page has a language switcher at the top. README.fr.md is the canonical French documentation for the project overview.

