Release v0.3.0 — dataset export mode + clean clip boundaries · mudassar531/hearsay

hearsay 0.3.0

The headline: hearsay now turns media into TTS/STT training datasets, through the hearsay dataset <SOURCE> command and a web UI mode, alongside the existing markdown/JSON engine.

New

Dataset export mode — slice audio on word-level timestamps (never mid-word) into LJSpeech (metadata.csv), NeMo (manifest.jsonl), and HF audiofolder layouts, with a dataset_card.md and dropped.jsonl.
Quality filtering (on by default) + opt-in clipping detection.
Optional speaker diarization via the hearsay[diarize] extra, with the flags --diarize, --dominant-speaker (single-voice TTS), and --per-speaker.
--normalize (two-pass EBU R128) and --pad (edge padding) with a de-click fade for clean clip boundaries.
Resumable combined builds for playlists / feeds.
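
To illustrate what "slice on word-level timestamps, never mid-word" can look like, here is a minimal Python sketch. It is not hearsay's implementation: the Word and Clip types, the group_words helper, and the max_len / max_gap thresholds are hypothetical names and values chosen for this example. Only the row layouts follow the public LJSpeech (id|transcription|normalized transcription) and NeMo (audio_filepath, duration, text) conventions.

```python
import json
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the source audio
    end: float


@dataclass
class Clip:
    start: float
    end: float
    text: str


def _close(words):
    """Build a clip that spans exactly the first to the last word."""
    return Clip(words[0].start, words[-1].end, " ".join(w.text for w in words))


def group_words(words, max_len=10.0, max_gap=0.6):
    """Group words into clips (hypothetical thresholds, in seconds).

    A cut is only ever placed between two words, so no clip boundary
    can fall inside a word.
    """
    clips, current = [], []
    for w in words:
        if current:
            too_long = w.end - current[0].start > max_len
            long_pause = w.start - current[-1].end > max_gap
            if too_long or long_pause:
                clips.append(_close(current))
                current = []
        current.append(w)
    if current:
        clips.append(_close(current))
    return clips


def ljspeech_row(clip_id, clip):
    """One metadata.csv line: id|transcription|normalized transcription."""
    return f"{clip_id}|{clip.text}|{clip.text}"


def nemo_row(audio_path, clip):
    """One manifest.jsonl line with the standard NeMo keys."""
    return json.dumps({
        "audio_filepath": audio_path,
        "duration": round(clip.end - clip.start, 3),
        "text": clip.text,
    })


if __name__ == "__main__":
    words = [Word("hello", 0.0, 0.4), Word("world", 0.5, 0.9), Word("again", 2.0, 2.5)]
    for i, clip in enumerate(group_words(words)):
        print(ljspeech_row(f"clip_{i:04d}", clip))
        print(nemo_row(f"wavs/clip_{i:04d}.wav", clip))
```

The clip start and end times produced this way would then be handed to an audio tool such as ffmpeg for the actual cut; padding, fades, and loudness normalization are deliberately left out of the sketch.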

Notes

No new required dependency: audio processing uses the ffmpeg that hearsay already requires, and diarization is the only extra, which is opt-in.
PyPI 0.2.0 (2026-06-14) was a maintenance release of the markdown engine from before the dataset mode existed, which is why this release is 0.3.0.

Full details in CHANGELOG.md.
