Audio Transcription

A local CLI tool that transcribes audio files to clean, structured text using OpenAI Whisper. Runs entirely offline — no API keys, no cloud, no cost per minute.

Features

Transcribes any common audio format (.mp3, .wav, .m4a, .ogg, .flac, .aac, .wma)
Outputs structured .txt files with timestamps, metadata header, and wrapped lines
Batch mode — process an entire folder in one command
Auto-detects language or accepts a language hint for faster processing
Supports all Whisper model sizes (tiny → large)
Fully tested — 47 unit tests, no heavy dependencies required to run the test suite

Example Output

========================================
 TRANSCRIPTION
========================================
 File     : recording.mp3
 Date     : 2026-03-30
 Duration : 42:02
 Language : en
 Model    : whisper-medium
========================================

[00:00:00] So this system basically drops to the critical risk
           for real time.

[00:00:03] So this system will have three components.

[00:00:06] It has the data ingestion layer.

...

========================================
 END OF TRANSCRIPTION
========================================

Requirements

Python 3.9+
ffmpeg installed on your system

Installation

# 1. Clone the repo
git clone https://github.com/ShoreDataLabs/audioscribe.git
cd audioscribe

# 2. Install Python dependencies
pip install -r requirements.txt

# 3. Install ffmpeg (macOS)
brew install ffmpeg

# Install ffmpeg (Ubuntu/Debian)
# sudo apt install ffmpeg

# Install ffmpeg (Windows)
# https://ffmpeg.org/download.html

Usage

Transcribe a single file

python transcribe.py audio/recording.mp3

Transcribe a whole folder

python transcribe.py audio/ --batch

Options

positional arguments:
  path                  Path to an audio file, or a directory when using --batch

options:
  --batch               Process all audio files found in the given directory
  --model {tiny,base,small,medium,large}
                        Whisper model size (default: medium)
  --output OUTPUT       Output directory for transcript files (default: ./output)
  --language LANGUAGE   Language code e.g. 'en' or 'fr' (default: auto-detect)

Examples

# Use a larger model for better accuracy
python transcribe.py recording.mp3 --model large

# Skip language detection for faster processing
python transcribe.py recording.mp3 --language en

# Batch transcribe with custom output directory
python transcribe.py recordings/ --batch --output ./transcripts

Model Size Guide

Model	Size	Speed	Accuracy	Best For
tiny	75 MB	Fastest	Good	Quick drafts, clear audio
base	145 MB	Fast	Better	General use
small	465 MB	Medium	Good	Balanced
medium	1.5 GB	Slower	Great	Default — best general accuracy
large	2.9 GB	Slowest	Best	Maximum accuracy, long-form

The model is downloaded automatically on first use and cached locally.

Project Structure

audio-transcription/
├── transcribe.py          # CLI entry point
├── src/
│   ├── validator.py       # File format, existence, and size checks
│   ├── input_handler.py   # Single file and batch directory resolution
│   ├── preprocessor.py    # ffmpeg conversion to 16kHz mono WAV
│   ├── engine.py          # Whisper wrapper with lazy model caching
│   ├── formatter.py       # Structured transcript output with timestamps
│   └── writer.py          # Writes output file, creates directories
├── specs/                 # Acceptance criteria per module
├── tests/                 # 47 unit tests (pytest)
└── output/                # Default transcript output directory (gitignored)

Running Tests

No heavy dependencies needed — Whisper and ffmpeg are fully mocked.

pip install pytest
python -m pytest tests/ -v

License

MIT

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Audio Transcription

Features

Example Output

Requirements

Installation

Usage

Transcribe a single file

Transcribe a whole folder

Options

Examples

Model Size Guide

Project Structure

Running Tests

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors 2

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
output		output
specs		specs
src		src
tests		tests
.gitignore		.gitignore
README.md		README.md
requirements.txt		requirements.txt
transcribe.py		transcribe.py

Folders and files

Latest commit

History

Repository files navigation

Audio Transcription

Features

Example Output

Requirements

Installation

Usage

Transcribe a single file

Transcribe a whole folder

Options

Examples

Model Size Guide

Project Structure

Running Tests

License

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 2

Languages

Packages