flow-st8

Local voice-to-text for Windows. Press Ctrl+Win, talk, press again: your words are transcribed by Whisper and pasted wherever your cursor is. No cloud, no subscription, and no audio or screenshots leaving your machine.

Requirements

Windows 10/11
Python 3.11+
Microphone
NVIDIA GPU recommended (RTX 20xx/30xx/40xx) — also works on CPU, slower

Install

git clone https://github.com/luckmattos/flow-st8.git
cd flow-st8

python install.py

install.py detects your hardware and installs the right PyTorch version automatically:

NVIDIA GPU found → installs CUDA build (~2.7GB, transcription <1s)
No GPU → installs CPU build (lighter, slower transcription)
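
A minimal sketch of how a hardware check like this could work, assuming the installer looks for the `nvidia-smi` tool on PATH and then chooses a PyTorch wheel index. The function names and the specific CUDA index (`cu121`) are assumptions for illustration; the real install.py may detect hardware differently or pin another build.

```python
import shutil
import subprocess
import sys

# Assumed wheel indexes; the real install.py may pin different builds.
CUDA_INDEX = "https://download.pytorch.org/whl/cu121"
CPU_INDEX = "https://download.pytorch.org/whl/cpu"


def has_nvidia_gpu() -> bool:
    """Return True if nvidia-smi exists and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    try:
        # `nvidia-smi -L` prints one "GPU n: ..." line per device.
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


def pip_command(gpu: bool) -> list[str]:
    """Build the pip command for the matching PyTorch build."""
    index = CUDA_INDEX if gpu else CPU_INDEX
    return [sys.executable, "-m", "pip", "install", "torch", "--index-url", index]


if __name__ == "__main__":
    # Only print the command here; a real installer would execute it.
    print("Would run:", " ".join(pip_command(has_nvidia_gpu())))
```

Building the command separately from the detection keeps the decision easy to inspect before anything is downloaded.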

On first run, Whisper downloads the model (~1.5GB, one time only).

Run

Double-click flow-st8.vbs — no terminal, no console window.

Or from terminal:

python main.py

A mic icon appears in the system tray. You're ready.

How to use

Press Ctrl+Win — icon turns red, recording starts
Talk — pause freely, silence is filtered out automatically
Press Ctrl+Win again — icon turns blue, transcribing
Text is pasted wherever your cursor is
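
The steps above amount to a small state machine. The sketch below is a hypothetical illustration, not the code in app.py: the class and method names are invented, and ignoring the hotkey while a transcription is running is an assumption.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"                  # waiting for the hotkey
    RECORDING = "recording"        # icon red
    TRANSCRIBING = "transcribing"  # icon blue


class ToggleController:
    """Hypothetical controller that tracks the Ctrl+Win toggle."""

    def __init__(self) -> None:
        self.state = State.IDLE

    def on_hotkey(self) -> State:
        if self.state is State.IDLE:
            self.state = State.RECORDING
        elif self.state is State.RECORDING:
            self.state = State.TRANSCRIBING
        # Assumption: presses during transcription are ignored.
        return self.state

    def on_transcription_done(self) -> State:
        # Text has been pasted; go back to waiting.
        if self.state is State.TRANSCRIBING:
            self.state = State.IDLE
        return self.state


controller = ToggleController()
print(controller.on_hotkey().value)              # recording
print(controller.on_hotkey().value)              # transcribing
print(controller.on_transcription_done().value)  # idle
```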

Tray menu: right-click the icon to record, toggle autostart, or quit.

Choosing a model

Edit %APPDATA%\flow-st8\config.toml and change name under [model].

| Model | Size | With NVIDIA GPU | CPU only | Quality |
|---|---|---|---|---|
| `tiny` | 39 MB | ~0.1s | 1-2s | Low, hallucinates |
| `base` | 138 MB | ~0.3s | 3-5s | Decent |
| `small` | 460 MB | ~0.7s | 8-12s | Good (best for CPU) |
| `medium` | 1.5 GB | ~1.5s | 25-35s | Very good |
| `large-v3-turbo` | 1.5 GB | ~0.4-1.2s ⭐ | 20-45s | Excellent (default) |

No GPU? Stick with base or small. Running large-v3-turbo on CPU means waiting 20-45 seconds per sentence — not practical.

Benchmarks on RTX 4060 Laptop 8GB with large-v3-turbo:

| Speech duration | GPU latency | CPU latency |
|---|---|---|
| 5s | ~0.4s | ~8s |
| 15s | ~1.2s | ~25s |
| 30s | ~2.5s | ~45s |
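
One way to read these numbers is as a real-time factor: latency divided by speech duration. The short calculation below uses only the benchmark figures listed above; the function name is an illustrative choice.

```python
# (speech seconds, GPU latency, CPU latency) copied from the benchmark figures.
BENCHMARKS = [
    (5, 0.4, 8),
    (15, 1.2, 25),
    (30, 2.5, 45),
]


def real_time_factor(latency_s: float, speech_s: float) -> float:
    """Latency divided by speech duration; below 1.0 is faster than real time."""
    return latency_s / speech_s


for speech, gpu, cpu in BENCHMARKS:
    print(
        f"{speech:>2}s speech: GPU RTF {real_time_factor(gpu, speech):.2f}, "
        f"CPU RTF {real_time_factor(cpu, speech):.2f}"
    )
```

On these figures the GPU finishes in roughly 0.08 times the length of the speech, while the CPU needs about 1.5 to 1.7 times the length of the speech.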

Configuration

Config file: %APPDATA%\flow-st8\config.toml (auto-created on first run)

[model]
name = "large-v3-turbo"   # tiny | base | small | medium | large-v3-turbo
language = "pt"
initial_prompt = "Transcription in Brazilian Portuguese."

[hotkey]
mode = "toggle"           # toggle = press to start, press to stop
key = "ctrl+win"

[audio]
device_index = -1         # -1 = system default mic
max_seconds = 210         # 3m30s max per recording
gain = 1.8

[vad]
enabled = true
speech_threshold = 0.5

[startup]
autostart = true          # start with Windows

Privacy

Many commercial voice dictation apps send your audio, and often screenshots of your screen, to their servers every time you speak.

flow-st8 does not:

Send audio anywhere
Capture screenshots
Require internet after the one-time install and model download
Store anything outside your machine

Everything runs locally. Audio is processed and discarded.

Stack

| Layer | Technology |
|---|---|
| Speech-to-text | OpenAI Whisper + PyTorch |
| Voice detection | Silero VAD v6 |
| Global hotkey | Win32 `WH_KEYBOARD_LL` hook via `ctypes` |
| Audio capture | sounddevice |
| Text injection | pyperclip + Win32 `SendInput` |
| Tray icon | pystray + Pillow |
| Config | TOML via stdlib `tomllib` |
| Autostart | `winreg` → `HKCU\...\Run` |

For architecture details see ARCHITECTURE.md.

Roadmap

Keywords: speech-to-text, voice-dictation, whisper, local, offline, privacy, windows, cuda, python, real-time, hotkey

Author

luckmattos - built with AI coding agents and tools. MIT license.
