Voice2Text

A lightweight Windows system-tray app that transcribes your speech and types it directly into whatever window you have open. No API key, no internet connection, no cost.

Built with faster-whisper (local Whisper model) and PyQt6.


Requirements

  • Windows 10 or 11
  • Python 3.10 or newer — download from python.org

Installation

1. Clone or download the repository

git clone https://github.com/bazzofx/voice2text.git
cd voice2text

Or download and extract the ZIP from GitHub.

2. Install dependencies

Double-click install.bat, or run in a terminal:

pip install -r requirements.txt

This installs PyQt6, faster-whisper, sounddevice, pynput, and pyperclip.
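If the app fails to start after installation, a quick import check can show which package is missing. The sketch below is not part of the repository. It assumes the usual import names for the five packages listed above, and the contents of requirements.txt are not shown here, so adjust the list if yours differs.

```python
from importlib.util import find_spec

# Import names for the packages named in this README. These are the usual
# module names for each PyPI package (an assumption, not read from
# requirements.txt).
REQUIRED_MODULES = ["PyQt6", "faster_whisper", "sounddevice", "pynput", "pyperclip"]


def missing_modules(names):
    """Return the top-level module names that Python cannot locate."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All dependencies found.")
```

Save it as a script (for example check_deps.py, a hypothetical name) and run it with the same Python interpreter you use for main.py.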

3. Run the app

python main.py

A small coloured dot will appear in your system tray (bottom-right of the taskbar).


First run

On first run, the app downloads the Whisper model, so an internet connection is needed this one time. The tray icon will be grey while loading and turn green when ready. This download only happens once.


How to use

  1. Open any app where you want to type — Notepad, Word, a browser text field, etc.
  2. Click inside it so it has focus.
  3. Press the hotkey (Ctrl+Space by default) to start recording. The tray icon turns red.
  4. Speak clearly.
  5. Press the hotkey again to stop. The icon turns blue while transcribing.
  6. The transcribed text is automatically pasted into your active window.
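For readers curious how steps 5 and 6 might work under the hood, here is a minimal sketch using the libraries this project depends on. It is not the app's actual code: the function names are hypothetical, and the CPU device with the int8 compute type is an assumed configuration.

```python
import time


def join_segments(texts):
    """Join transcribed segment texts into one string, dropping empty pieces."""
    return " ".join(t.strip() for t in texts if t.strip())


def transcribe(audio, model_size="small"):
    """Transcribe an audio file path or a 16 kHz mono float32 array locally."""
    # Third-party import kept inside the function so the helper above can be
    # used without faster-whisper installed.
    from faster_whisper import WhisperModel

    # Assumed settings: CPU with int8 quantisation.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, _info = model.transcribe(audio)
    return join_segments(segment.text for segment in segments)


def paste_into_active_window(text):
    """Copy text to the clipboard, then send Ctrl+V to the focused window."""
    import pyperclip
    from pynput.keyboard import Controller, Key

    pyperclip.copy(text)
    time.sleep(0.05)  # short pause so the clipboard is updated before pasting
    keyboard = Controller()
    with keyboard.pressed(Key.ctrl):
        keyboard.press("v")
        keyboard.release("v")
```

Pasting through the clipboard overwrites whatever was on the clipboard before, which is a trade-off of this approach.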

Tray icon colours

Colour  Meaning
Grey    Loading model
Green   Ready
Red     Recording
Blue    Transcribing
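The four colours correspond to a small state machine. The sketch below models it for toggle mode. The event names (model_loaded, hotkey, text_pasted) are hypothetical labels chosen for illustration, not identifiers from the app.

```python
from enum import Enum


class State(Enum):
    """App states, each valued by its tray icon colour."""
    LOADING = "grey"
    READY = "green"
    RECORDING = "red"
    TRANSCRIBING = "blue"


# (current state, event) -> next state. Event names are illustrative only.
TRANSITIONS = {
    (State.LOADING, "model_loaded"): State.READY,
    (State.READY, "hotkey"): State.RECORDING,
    (State.RECORDING, "hotkey"): State.TRANSCRIBING,
    (State.TRANSCRIBING, "text_pasted"): State.READY,
}


def next_state(state, event):
    """Return the new state, or the unchanged state if the event does not apply."""
    return TRANSITIONS.get((state, event), state)
```

In this model, pressing the hotkey while the icon is grey or blue does nothing, because no transition is defined for those pairs.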

Settings

Right-click the tray icon and choose Settings to change:

Setting     Options
Hotkey      Any key combination, e.g. Ctrl+Space, Win+Space
Mode        Toggle (press once to start, again to stop) or Push-to-talk (hold to record)
Model size  tiny (fast) → large-v3 (most accurate)
Language    Auto-detect or pick a specific language
Device      CPU (default) or CUDA GPU
Microphone  Syste

Model     Size
small     ~244 MB
medium    ~769 MB
large-v3  ~1.5 GB
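Hotkey labels such as Ctrl+Space need to be translated into the format pynput expects, where modifiers and special keys are written in angle brackets and the Windows key is called cmd. The helper below is a hypothetical sketch of that translation and may differ from how the app stores its hotkey.

```python
# Map human-readable key names to pynput's hotkey-string spelling.
# Single characters (letters, digits) are passed through unchanged.
KEY_ALIASES = {
    "ctrl": "<ctrl>",
    "alt": "<alt>",
    "shift": "<shift>",
    "win": "<cmd>",  # pynput names the Windows key "cmd"
    "space": "<space>",
}


def to_pynput_hotkey(label):
    """Convert a label like 'Ctrl+Space' to a pynput string like '<ctrl>+<space>'."""
    parts = [p.strip().lower() for p in label.split("+") if p.strip()]
    return "+".join(KEY_ALIASES.get(p, p) for p in parts)
```

The result can be passed to pynput.keyboard.GlobalHotKeys, for example GlobalHotKeys({to_pynput_hotkey("Ctrl+Space"): on_toggle}), where on_toggle is your own callback. That class only reports activation, so push-to-talk would need a keyboard Listener that also handles key release.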

GPU acceleration (optional)

By default the app runs on CPU, which works on any machine. If you have an NVIDIA GPU and want faster transcription:

  1. Install the CUDA Toolkit 12 from NVIDIA.
  2. Open Settings in the tray app and change Device to CUDA GPU.
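As an illustration of what the Device setting could map to, faster-whisper's WhisperModel takes device and compute_type arguments. The sketch below assumes float16 on the GPU and int8 on the CPU, which are common choices but not confirmed by this project. The function names and the caught exception types are assumptions too.

```python
def model_options(device_setting):
    """Map the Settings 'Device' choice to WhisperModel keyword arguments."""
    if device_setting == "CUDA GPU":
        return {"device": "cuda", "compute_type": "float16"}
    return {"device": "cpu", "compute_type": "int8"}


def load_model(model_size, device_setting):
    """Load the model, falling back to CPU if the GPU cannot be used."""
    # Third-party import kept local so model_options works without it.
    from faster_whisper import WhisperModel

    options = model_options(device_setting)
    try:
        return WhisperModel(model_size, **options)
    except (RuntimeError, ValueError):
        # Assumed failure modes when CUDA is missing or unsupported.
        if options["device"] == "cpu":
            raise
        return WhisperModel(model_size, **model_options("CPU"))
```

Some CUDA problems only surface during the first transcription rather than at load time, so a fallback like this will not catch every case.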

Troubleshooting

Tray icon doesn't appear
Make sure system tray icons are visible in Windows taskbar settings.

Text is not pasted into my window
Make sure the target window has focus (click inside it) before the transcription finishes. The app pastes using Ctrl+V into the active window.

Transcription is slow
Try switching to the tiny model in Settings for faster (but less accurate) results.

Audio device not detected
Open Settings and select your microphone manually from the Microphone dropdown.
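If the Microphone dropdown looks empty or wrong, you can check which input devices Python itself sees. This is a diagnostic sketch using sounddevice, not code from the app, and both function names are hypothetical.

```python
def input_devices(devices):
    """Return (index, name) pairs for devices that can record audio."""
    return [
        (index, device["name"])
        for index, device in enumerate(devices)
        if device["max_input_channels"] > 0
    ]


def print_system_inputs():
    """Print every input-capable device reported by sounddevice."""
    import sounddevice as sd  # third-party, imported only when needed

    for index, name in input_devices(sd.query_devices()):
        print(index, name)
```

Call print_system_inputs() from a Python prompt. If your microphone is not listed, the problem is likely at the Windows or driver level rather than in the app.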
