A GUI for audio and video transcription using OpenAI Whisper.
- Transcribe audio files (WAV, MP3, FLAC, etc.)
- Transcribe video files (MP4, MKV, AVI, etc.)
- Capture live audio from system sources
- Record from specific applications
Before installing, make sure you have the following dependencies:
Required:
- Python 3.10+
- GTK 4
Optional but Recommended:
FFmpeg (required for video transcription and audio capture):
# Ubuntu/Debian
sudo apt install ffmpeg
# Fedora
sudo dnf install ffmpeg
# Arch Linux
sudo pacman -S ffmpeg
# macOS
brew install ffmpeg

CUDA Toolkit (for GPU acceleration - significantly faster transcription):
# Ubuntu/Debian (CUDA 12.x)
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt update
sudo apt install cuda-toolkit-12-1
# Fedora
sudo dnf install cuda
# Arch Linux
sudo pacman -S cuda
# Verify installation
nvcc --version
nvidia-smi

Note: CUDA requires an NVIDIA GPU. For AMD/Intel GPUs, the app will automatically fall back to CPU mode.
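
The CPU fallback described in the note can be sketched roughly as follows. This is an illustration only, not the app's actual code: it assumes the PyTorch-based `openai-whisper` package is in use, and `pick_device` is a hypothetical helper name.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable NVIDIA GPU, else "cpu".

    Hypothetical helper: the app's real device selection may differ.
    """
    try:
        import torch  # assumption: the PyTorch-based Whisper package is installed
    except ImportError:
        # Without PyTorch there is nothing to query, so report CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Transcribing on: {device}")
```

On a machine with an AMD or Intel GPU, `torch.cuda.is_available()` returns `False`, so the sketch reports `cpu`.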
python3 -m venv env
source env/bin/activate
pip install -r requirements.txt
python3 main.py

flatpak-builder --user --install build com.conv.transcript.json
flatpak run com.conv.transcript

GUI: Run python3 main.py
CLI:
# Transcribe a file
./transcription/transcript.py audio.mp3 -m base -l en -o output.txt
# Capture from audio source
./transcription/transcript.py --capture --device "source-name"
# Capture from application
./transcription/transcript.py --capture-app

| Feature | Requires |
|---|---|
| Audio file transcription | Python + Whisper (included) |
| Video file transcription | FFmpeg |
| Audio capture/recording | FFmpeg + PulseAudio/PipeWire |
| GPU acceleration | NVIDIA GPU + CUDA Toolkit |
Without FFmpeg: Audio files only (no video or capture)
Without CUDA: CPU mode only (slower but works on any system)
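
The feature table above can be turned into a quick self-check before launching the app. The sketch below is a rough illustration, not part of the project: `available_features` and `probe_system` are hypothetical names, and looking for `pactl`, `pw-cli`, and `nvidia-smi` on `PATH` is only a heuristic for PulseAudio, PipeWire, and an NVIDIA driver (it does not confirm that the CUDA Toolkit itself is installed).

```python
import shutil


def available_features(has_ffmpeg: bool, has_audio_server: bool, has_cuda: bool) -> dict[str, bool]:
    """Map detected dependencies to the features listed in the table."""
    return {
        "audio_files": True,  # needs only Python + Whisper
        "video_files": has_ffmpeg,
        "audio_capture": has_ffmpeg and has_audio_server,
        "gpu_acceleration": has_cuda,
    }


def probe_system() -> dict[str, bool]:
    """Heuristically detect dependencies by looking for their CLI tools on PATH."""
    has_ffmpeg = shutil.which("ffmpeg") is not None
    # pactl ships with PulseAudio, pw-cli with PipeWire (heuristic only).
    has_audio_server = any(shutil.which(tool) for tool in ("pactl", "pw-cli"))
    # nvidia-smi indicates an NVIDIA driver, not necessarily the CUDA Toolkit.
    has_cuda = shutil.which("nvidia-smi") is not None
    return available_features(has_ffmpeg, has_audio_server, has_cuda)


if __name__ == "__main__":
    for feature, enabled in probe_system().items():
        print(f"{feature}: {'yes' if enabled else 'no'}")
```

Keeping the mapping in `available_features` separate from the detection in `probe_system` makes the table logic easy to check without touching the host system.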