Fast batch audio transcription powered by Deepgram Nova-3. Upload multiple MP3 files and get accurate transcriptions with concurrent processing.
- ✅ Batch Processing - Upload unlimited files, processed in batches of 10
- ✅ Concurrent Transcription - Up to 10 files transcribed simultaneously, for up to ~10x faster processing than one file at a time
- ✅ Multi-Language Support - English, Polish, Spanish, French, German, Italian, Portuguese, Dutch, Russian, Chinese, Japanese, Korean
- ✅ Customizable Options - Toggle smart formatting and punctuation
- ✅ Download as ZIP - Get all transcriptions in one convenient file
- ✅ Simple Web UI - Clean Streamlit interface, no coding required
- Python 3.8 or higher
- Deepgram API key (Get one free)
If you don't have Python installed on macOS:
# Using Homebrew (install it first if needed):
# /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
brew install python3
# Verify installation
python3 --version

If you don't have Python installed on Windows:
- Download Python from python.org
- Run the installer and check "Add Python to PATH" during installation
- Verify installation in Command Prompt or PowerShell:
python --version

- Clone the repository:
git clone https://github.com/bsisduck/DBAT.git
cd DBAT
- Create virtual environment:
macOS/Linux:
python3 -m venv venv
source venv/bin/activate
Windows:
python -m venv venv
venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
- Start the application:
streamlit run app.py
- Open your browser at http://localhost:8501
- Enter your Deepgram API key in the sidebar
- Select your language and options
- Upload MP3 files and click "Transcribe All"
- Download the ZIP file with all transcriptions
- Smart Format: Improves readability with proper formatting
- Punctuation: Adds punctuation marks to transcriptions
- Language: Select the language of your audio files
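As a rough illustration of how these three options could translate into a Deepgram request, the sketch below builds the query string for the pre-recorded `/v1/listen` endpoint. `model`, `language`, `smart_format`, and `punctuate` are Deepgram query parameters; the helper name `build_listen_url` and the assumption that the app forwards the UI toggles one-to-one are illustrative only and not taken from this project's code.

```python
from urllib.parse import urlencode

DEEPGRAM_LISTEN_URL = "https://api.deepgram.com/v1/listen"


def build_listen_url(language: str, smart_format: bool, punctuate: bool) -> str:
    """Hypothetical helper: map the UI options to Deepgram query parameters."""
    params = {
        "model": "nova-3",
        "language": language,  # language code such as "en" or "pl"
        "smart_format": str(smart_format).lower(),
        "punctuate": str(punctuate).lower(),
    }
    return f"{DEEPGRAM_LISTEN_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_listen_url("pl", smart_format=True, punctuate=False))
```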
You can provide your API key in two ways:
- Enter it in the sidebar UI (recommended)
- Set DEEPGRAM_API_KEY in a .env file
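One way the two sources could be combined is sketched below. The function name `resolve_api_key` and the precedence (sidebar value first, environment second) are assumptions for illustration; the app's actual logic may differ. In the real app, calling `load_dotenv()` from python-dotenv beforehand would copy the `.env` entry into `os.environ`.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    sidebar_value: str = "",
    environ: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Hypothetical helper: prefer the key typed in the sidebar, else the env var."""
    env = os.environ if environ is None else environ
    # Assumed precedence: a non-empty sidebar entry wins over DEEPGRAM_API_KEY.
    key = sidebar_value.strip() or env.get("DEEPGRAM_API_KEY", "").strip()
    return key or None
```

Returning `None` when neither source provides a key lets the UI show a prompt instead of sending a request that is bound to fail.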
- Files are split into batches of 10
- Each batch processes 10 files concurrently using threading
- Deepgram Nova-3 API transcribes the audio
- Transcriptions are saved as .txt files
- All files are packaged into a timestamped ZIP
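The batching and threading steps above can be sketched as follows. This is a minimal illustration using `ThreadPoolExecutor`, not the project's actual `batch_processor.py`; the names `split_into_batches`, `transcribe_all`, and the stand-in `fake_transcribe` are hypothetical, and a real worker would call the Deepgram API instead.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterator, List, Sequence

BATCH_SIZE = 10  # batch size stated in this README


def split_into_batches(items: Sequence[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def transcribe_all(
    filenames: Sequence[str],
    transcribe: Callable[[str], str],
    batch_size: int = BATCH_SIZE,
) -> Dict[str, str]:
    """Process one batch at a time, running the files in a batch concurrently."""
    results: Dict[str, str] = {}
    for batch in split_into_batches(filenames, batch_size):
        with ThreadPoolExecutor(max_workers=batch_size) as pool:
            # pool.map returns results in input order, so zip pairs them correctly.
            for name, text in zip(batch, pool.map(transcribe, batch)):
                results[name] = text
    return results


def fake_transcribe(name: str) -> str:
    """Stand-in for the real API call (assumption for this sketch)."""
    return f"transcript of {name}"
```

Threads suit this workload because each worker spends most of its time waiting on the network rather than using the CPU.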
- Sequential: 28 files × ~1 min each ≈ 28 minutes
- Concurrent (10 at once): 3 batches (10 + 10 + 8 files) × ~1 min each ≈ 3 minutes
Processing time depends on audio length and API response time.
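The estimate above is simple arithmetic and can be reproduced with a few lines. The function name `estimate_minutes` is illustrative, and the model assumes every file takes the same time and that a batch finishes in roughly the time of one file.

```python
import math
from typing import Tuple


def estimate_minutes(
    num_files: int,
    minutes_per_file: float = 1.0,
    concurrency: int = 10,
) -> Tuple[float, float]:
    """Return (sequential, concurrent) time estimates in minutes."""
    sequential = num_files * minutes_per_file
    # Files are handled in batches; the last batch may be partially filled.
    batches = math.ceil(num_files / concurrency)
    concurrent = batches * minutes_per_file
    return sequential, concurrent


if __name__ == "__main__":
    print(estimate_minutes(28))
```

With 28 files the speedup is about 9x rather than a full 10x, because the final batch holds only 8 files.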
streamlit==1.28.1
deepgram-sdk==3.5.0
python-dotenv==1.0.0
DBAT/
├── app.py # Main Streamlit application
├── transcription_service.py # Deepgram API wrapper
├── batch_processor.py # Concurrent batch processing
├── zip_utils.py # ZIP file creation
├── requirements.txt # Python dependencies
├── .env.example # Environment template
├── LICENSE # MIT License
└── README.md # This file
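For orientation, the ZIP step that zip_utils.py is described as handling might look roughly like the sketch below. It is an assumption-based illustration using only the standard library; the function name `build_transcript_zip` and the `transcriptions_<timestamp>.zip` naming pattern are hypothetical and not taken from the project.

```python
import io
import zipfile
from datetime import datetime
from pathlib import Path
from typing import Dict, Optional, Tuple


def build_transcript_zip(
    transcripts: Dict[str, str],
    now: Optional[datetime] = None,
) -> Tuple[str, bytes]:
    """Pack {audio filename: transcript text} into an in-memory ZIP of .txt files."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for audio_name, text in transcripts.items():
            # "interview.mp3" becomes "interview.txt" inside the archive.
            archive.writestr(f"{Path(audio_name).stem}.txt", text)
    return f"transcriptions_{stamp}.zip", buffer.getvalue()
```

Building the archive in memory means the bytes can be handed directly to a download button without writing temporary files to disk.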
English, Polish, Spanish, French, German, Italian, Portuguese, Dutch, Russian, Chinese, Japanese, Korean
More languages available - check Deepgram docs
API key not working?
- Verify key is valid at https://console.deepgram.com
- Check for typos when entering the key
Files not uploading?
- Only MP3 files are supported
- Max file size: 2000 MB per file
- Ensure stable internet connection
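The first two checks above can be expressed as a small pre-upload validation. This is a sketch only: the name `upload_problem` is hypothetical, and treating "2000 MB" as binary megabytes (1024 × 1024 bytes) is an assumption.

```python
from pathlib import Path
from typing import Optional

# 2000 MB limit from this README; binary megabytes are an assumption.
MAX_FILE_BYTES = 2000 * 1024 * 1024


def upload_problem(filename: str, size_bytes: int) -> Optional[str]:
    """Return a message describing why a file would be rejected, or None if it is fine."""
    if Path(filename).suffix.lower() != ".mp3":
        return "Only MP3 files are supported"
    if size_bytes > MAX_FILE_BYTES:
        return "File exceeds the 2000 MB limit"
    return None
```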
Slow processing?
- Processing time scales with audio length, so longer files take proportionally longer
- Check your Deepgram plan limits
MIT License - see LICENSE file for details
- Built with Streamlit
- Powered by Deepgram Nova-3
- Concurrent processing with Python ThreadPoolExecutor
Contributions welcome! Please open an issue or submit a pull request.
- Deepgram API Docs
- Streamlit Docs
- For bugs, open an issue on GitHub