AudioCrowd

⚠️ WARNING — project unreliable, no longer used

AudioCrowd proved unreliable in practice. I have switched to my own fork of Mozilla Common Voice, which carries my changes to make self-hosting easier: https://github.com/thiswillbeyourgithub/common-voice/tree/enh-self-hosting

(Note: this is my personal fork — thiswillbeyourgithub/common-voice, branch enh-self-hosting — not the official Mozilla repo.)

This repository is left here for reference but is not recommended for new deployments.

A collaborative Gradio web UI where multiple volunteers record themselves speaking sentences to build an ASR (Automatic Speech Recognition) fine-tuning dataset.

Features

Multi-user support: volunteers authenticate via a simple CSV file and record simultaneously without conflicts
Automatic sentence assignment: each user gets 5 sentences from a shared pool; new sentences are drawn automatically as recordings are completed
Auto-save: recordings are saved as soon as the user stops recording -- no manual save button
Audio processing: recordings are converted to 16 kHz mono WAV with silence trimming
NeMo-compatible output: metadata is appended to a JSONL manifest compatible with NVIDIA NeMo
Flagging: users can flag problematic samples for later review
Skip & discard: skip unwanted sentences or discard mispronounced recordings
Bilingual UI: English and French, auto-detected from browser or forced via config
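
The automatic sentence assignment listed above can be pictured roughly as follows. This is a minimal in-memory sketch, not AudioCrowd's actual code: the class name SentencePool and its methods are hypothetical, and the real app also has to coordinate across processes (see fcntl.flock under Tech stack).

```python
import random

BATCH_SIZE = 5  # the feature list states each user gets 5 sentences


class SentencePool:
    """Hypothetical shared pool; all names here are illustrative only."""

    def __init__(self, sentences, seed=None):
        self._sentences = sentences
        # Indices of sentences that no user has been assigned yet.
        self._unassigned = list(range(len(sentences)))
        random.Random(seed).shuffle(self._unassigned)
        self._assigned = {}  # username -> list of sentence indices

    def _refill(self, user):
        # Top the user's queue back up to BATCH_SIZE while the pool lasts.
        queue = self._assigned.setdefault(user, [])
        while len(queue) < BATCH_SIZE and self._unassigned:
            queue.append(self._unassigned.pop())

    def current(self, user):
        """Return the (index, text) pairs currently assigned to `user`."""
        self._refill(user)
        return [(i, self._sentences[i]) for i in self._assigned[user]]

    def complete(self, user, index):
        """Mark a sentence as recorded and draw a replacement if any remain."""
        self._assigned[user].remove(index)
        self._refill(user)
```

Because every index is popped from the shared list exactly once, two users never receive the same sentence in this sketch.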

Keyboard shortcuts

Key	Action
Space	Start/stop recording
R	Reset and restart recording
S	Skip current sentence
D	Discard last recording
F	Flag current sample (toggle)
G	Flag previous sample (toggle)

Quick start

With uv (no Docker)

# Prepare a JSONL file with one {"text": "..."} per line
# Prepare a CSV file with username,password rows (no header)
uv run AudioCrowd.py sentences.jsonl --users-csv users.csv
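
If the sentence list and credentials already live in Python, the two input files might be generated like this. It is only a sketch: the helper name write_inputs and any sentences or credentials passed to it are made up for illustration, while the file layouts follow the comments above (one {"text": "..."} object per line, and header-less username,password rows).

```python
import csv
import json
from pathlib import Path


def write_inputs(sentences, users, jsonl_path, csv_path):
    """Write both input files (hypothetical helper, not part of AudioCrowd)."""
    with Path(jsonl_path).open("w", encoding="utf-8") as f:
        for text in sentences:
            # ensure_ascii=False keeps accented characters human-readable.
            f.write(json.dumps({"text": text}, ensure_ascii=False) + "\n")
    with Path(csv_path).open("w", encoding="utf-8", newline="") as f:
        # No header row: the app expects bare username,password rows.
        csv.writer(f).writerows(users)
```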

Full options:

uv run AudioCrowd.py sentences.jsonl \
  --users-csv users.csv \
  --salt mysalt \
  --output-dir ./recordings/ \
  --output-jsonl ./output.jsonl \
  --port 7860 \
  --share \
  --lang fr

With Docker

cd ./docker
cp env_file.example env_file
# Edit env_file with your settings (JSONL_PATH, USERS_CSV, etc.)
docker compose up --build

The app is exposed on port 7760 by default (mapped to 7860 inside the container). Mount your dataset directory into the container; recordings are persisted to ./recordings/ on the host.

Input format

A JSONL file with at least a text field per line:

{"text": "The patient presents with acute symptoms."}
{"text": "Administer 500mg of amoxicillin twice daily."}

NeMo-format lines with audio_filepath/duration fields are also accepted; only text is used.
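
As an illustration of the rule above, a loader that keeps only the text field could look like this. The function name load_sentences is hypothetical, and skipping blank lines is an assumption of the sketch rather than documented behavior.

```python
import json


def load_sentences(path):
    """Return the `text` of every record; any other fields are ignored."""
    sentences = []
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines (assumption of this sketch)
            record = json.loads(line)
            if "text" not in record:
                raise ValueError(f"line {lineno}: missing 'text' field")
            sentences.append(record["text"])
    return sentences
```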

Output format

WAV files are saved as {userid}_{uuid4[:8]}.wav in the output directory. The JSONL manifest contains:

{"audio_filepath": "recordings/f3a1b2c3d4e5_a1b2c3d4.wav", "text": "The patient presents with...", "duration": 3.42, "timestamp": "2026-03-06T14:23:01+00:00", "userid": "f3a1b2c3d4e5", "sentence_index": 42}
{"audio_filepath": "recordings/f3a1b2c3d4e5_b2c3d4e5.wav", "text": "Flagged example...", "duration": 2.10, "timestamp": "2026-03-06T14:24:00+00:00", "userid": "f3a1b2c3d4e5", "sentence_index": 43, "flagged": true}
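
To review a finished session, the manifest can be read back line by line. The sketch below totals recorded seconds per user and collects flagged files; the function name summarize_manifest is hypothetical, and it assumes, based on the two example lines above, that a missing flagged key means the sample was not flagged.

```python
import json
from collections import defaultdict


def summarize_manifest(path):
    """Return (seconds recorded per userid, list of flagged audio paths)."""
    seconds = defaultdict(float)
    flagged = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue
            entry = json.loads(line)
            seconds[entry["userid"]] += entry["duration"]
            # In the examples, "flagged" only appears on flagged entries.
            if entry.get("flagged", False):
                flagged.append(entry["audio_filepath"])
    return dict(seconds), flagged
```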

Tech stack

Python + Gradio -- single-file app launched via uv run (PEP 723 inline metadata)
click for CLI argument parsing
loguru for logging (stderr + audiodataset.log)
soundfile + numpy for audio processing
fcntl.flock for cross-process file locking (concurrent multi-user safety)
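
The locking approach named in the last item can be sketched as below: each writer takes an exclusive advisory lock before appending a line to the manifest, so concurrent recordings do not interleave. This is an illustration under assumptions, not the app's code; the helper names make_wav_name and append_entry are hypothetical, and fcntl is available on Unix-like systems only.

```python
import fcntl
import json
import uuid


def make_wav_name(userid):
    """Build a {userid}_{uuid4[:8]}.wav name as described under Output format."""
    return f"{userid}_{uuid.uuid4().hex[:8]}.wav"


def append_entry(manifest_path, entry):
    """Append one JSON line while holding an exclusive advisory lock."""
    with open(manifest_path, "a", encoding="utf-8") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # blocks until other writers release
        try:
            f.write(json.dumps(entry, ensure_ascii=False) + "\n")
            f.flush()  # push the line out before the lock is released
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

flock locks are advisory, so they only help if every process that writes the manifest goes through the same locking path.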

Alternatives

If you need a more full-featured, production-ready crowdsourcing platform, consider Mozilla Common Voice — an open-source initiative for collecting speech data in many languages. It can also be self-hosted via its Docker Compose setup.

AudioCrowd is intentionally simpler: a single-file app for quickly spinning up a private recording session with a specific sentence list and a known group of volunteers.

License

AGPLv3

Built with Claude Code.
