A Python application that captures real-time audio and transcribes it using OpenAI Whisper. The transcribed text is displayed on the console with intelligent audio buffering for optimal speech recognition.
```
WhisperLab/
├── .git/
├── .venv/
├── src/
│   ├── __init__.py
│   ├── main.py
│   ├── audio_capture.py
│   ├── transcription.py
│   ├── config.py
│   └── test_microphone.py
├── requirements.txt
├── run.py
├── .gitignore
└── README.md
```
- Clone the repository:

  ```shell
  git clone https://github.com/LiteObject/WhisperLab.git
  cd WhisperLab
  ```

- Create a virtual environment (recommended):

  ```shell
  python -m venv .venv
  # On Windows: .venv\Scripts\activate
  # On macOS/Linux: source .venv/bin/activate
  ```

- Install the required dependencies:

  ```shell
  pip install -r requirements.txt
  ```
Option 1: Using the runner script (Recommended)

```shell
python run.py
```

Option 2: Running from the src directory

```shell
cd src
python main.py
```

The application uses intelligent audio buffering for better transcription:
- Loads the Whisper model (this may take a moment on first run)
- Starts capturing audio from your default microphone
- Buffers audio for 3 seconds to ensure quality transcription
- Transcribes every 2 seconds when sufficient audio is detected
- Displays transcribed text with clear indicators
- Press Ctrl+C to stop the application
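The buffer-then-transcribe flow above can be sketched roughly as follows. This is a minimal illustration of the idea, not WhisperLab's actual code; the class and method names are hypothetical.

```python
import threading
from collections import deque


class AudioBuffer:
    """Accumulates incoming audio chunks until enough (e.g. 3 s) is
    available for a quality transcription pass.  Hypothetical sketch of
    the buffering strategy described above."""

    def __init__(self, sample_rate=16000, min_seconds=3.0):
        self.sample_rate = sample_rate
        self.min_samples = int(sample_rate * min_seconds)
        self._chunks = deque()
        self._total = 0
        self._lock = threading.Lock()  # callback and worker run on different threads

    def add_chunk(self, samples):
        """Called from the audio-capture callback with each new chunk."""
        with self._lock:
            self._chunks.append(samples)
            self._total += len(samples)

    def drain_if_ready(self):
        """Polled periodically (e.g. every 2 s) by the transcription thread.
        Returns all buffered samples once min_seconds is reached, else None."""
        with self._lock:
            if self._total < self.min_samples:
                return None
            merged = [s for chunk in self._chunks for s in chunk]
            self._chunks.clear()
            self._total = 0
            return merged
```

The transcription thread would call `drain_if_ready()` on its 2-second tick and pass any returned samples to the Whisper model.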
When working correctly, you'll see output like:

```
Loading Whisper model: base
Whisper model 'base' loaded successfully
Starting WhisperLab...
Press Ctrl+C to stop
Starting audio recording...
Recording started at 16000 Hz, 1 channel(s)
Transcription thread started...
🎤 Transcribing 3.0s of audio (level: 0.0456)...
✅ Transcribed #1: Hello, this is a test of the whisper transcription system.
🎤 Transcribing 3.0s of audio (level: 0.0523)...
✅ Transcribed #2: How are you doing today?
```

Tips for best results:

- Speak clearly for 3-5 seconds at a time
- Wait 2-3 seconds between phrases for processing
- Ensure good microphone placement and volume
- Minimize background noise for better accuracy
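The `level` figures in the sample output are an audio-level measure used to decide whether there is enough signal worth transcribing. A common choice is the root-mean-square (RMS) of the samples; this sketch assumes RMS and an illustrative threshold, which may differ from WhisperLab's internals.

```python
import math


def audio_level(samples):
    """Root-mean-square level of a chunk of float samples in [-1.0, 1.0]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def has_speech(samples, threshold=0.01):
    """Crude silence gate: only transcribe when the level clears a floor.
    The 0.01 threshold is an assumption for illustration."""
    return audio_level(samples) > threshold
```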
WhisperLab automatically performs a quick microphone test when starting up:
```shell
python run.py
```

Output:

```
🔍 Checking microphone availability...
🚀 Quick Microphone Test (3 seconds)
✅ Microphone working!
✅ Microphone check passed!
Starting WhisperLab...
```
You can control the startup microphone check behavior:
Environment Variables:
- `WHISPERLAB_MIC_CHECK_ENABLED=false` - Disable the startup microphone check
- `WHISPERLAB_MIC_CHECK_EXIT_ON_FAIL=true` - Exit automatically if the microphone check fails
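Boolean toggles like these are typically read with `os.environ`. A minimal sketch, assuming a simple "false"/"0"/"no" convention (WhisperLab's actual parsing may differ):

```python
import os


def env_flag(name, default):
    """Read an environment-variable toggle such as
    WHISPERLAB_MIC_CHECK_ENABLED.  Hypothetical helper for illustration."""
    value = os.environ.get(name)
    if value is None:
        return default
    return value.strip().lower() not in ("false", "0", "no")


# Defaults shown here are assumptions: check on by default, exit-on-fail off.
mic_check_enabled = env_flag("WHISPERLAB_MIC_CHECK_ENABLED", True)
exit_on_fail = env_flag("WHISPERLAB_MIC_CHECK_EXIT_ON_FAIL", False)
```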
Example:

```shell
# Skip microphone check
set WHISPERLAB_MIC_CHECK_ENABLED=false
python run.py

# Exit automatically if microphone doesn't work
set WHISPERLAB_MIC_CHECK_EXIT_ON_FAIL=true
python run.py
```

This project requires the following Python packages:
- `openai-whisper` - OpenAI's Whisper model for speech recognition
- `sounddevice` - Cross-platform audio I/O library
- `soundfile` - Audio file I/O
- `numpy` - Numerical computing library
- `torch` - PyTorch deep learning framework
- `librosa` - Audio analysis library
See requirements.txt for the complete list of dependencies.
- Real-time audio capture with intelligent buffering
- Speech-to-text transcription using OpenAI Whisper
- Smart audio processing with 3-second buffers for better accuracy
- Console output with clear status indicators
- Multithreaded architecture for smooth audio processing
- Built-in microphone testing tool for troubleshooting
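The multithreaded architecture keeps audio capture responsive while transcription runs separately, usually via a producer/consumer queue: the capture callback pushes chunks, a worker thread pops and processes them. A minimal sketch of that pattern (names are illustrative, not WhisperLab's actual code):

```python
import queue
import threading


def run_pipeline(chunks, process):
    """Feed audio chunks through a producer/consumer queue and return the
    processed results.  `chunks` stands in for the live audio callback and
    `process` for the Whisper transcription step."""
    audio_q = queue.Queue()
    results = []

    def consumer():
        while True:
            chunk = audio_q.get()
            if chunk is None:        # sentinel: recording has stopped
                break
            results.append(process(chunk))

    worker = threading.Thread(target=consumer, daemon=True)
    worker.start()
    for chunk in chunks:             # producer side (audio callback)
        audio_q.put(chunk)
    audio_q.put(None)                # signal shutdown
    worker.join()
    return results
```

The queue decouples the two threads, so a slow transcription pass never blocks the audio callback.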
If you're not seeing transcriptions, first test your microphone:
```shell
cd src
python test_microphone.py
```

This will:
- Show available audio devices
- Record 5 seconds of audio
- Display audio levels and detection status
- Help diagnose microphone issues
- Python 3.7 or higher
- Microphone access
- Compatible audio drivers