podcast-transcriber

A script that transcribes podcast audio files into an article-style transcript. It was designed for the Dojima Futures podcast but can be used for other podcasts.

Assumptions / Requirements:
-Python >= 3.10
-OpenAI's Whisper model
-Audacity >= 3.0 (for piping)
-Separate audio tracks for each speaker

General process:
-look for speaker audio files and load them into Audacity
-if there are multiple files for a given speaker, sort and merge them into one file in Audacity
-label sounds based on Audacity's sound/audio detection
-export the audio into clips based on the labels
-transcribe the clips with OpenAI's Whisper model
-format the output, using audio timestamps from the labels to split the text into paragraphs per speaker (plus general transcription cleanup)
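
The last step above (using label timestamps to split the text into paragraphs per speaker) could look roughly like the sketch below. It is not taken from podcast-transcriber.py: the `Clip` class and the function names `parse_label_track` and `format_transcript` are hypothetical, and the sketch assumes each clip has already been transcribed. The only outside fact it relies on is Audacity's exported label-track layout, which is one `start<TAB>end<TAB>label` line per label.

```python
from dataclasses import dataclass


@dataclass
class Clip:
    """Hypothetical record for one exported, already-transcribed clip."""
    speaker: str
    start: float  # seconds, taken from the Audacity label
    end: float    # seconds, taken from the Audacity label
    text: str     # transcription of this clip


def parse_label_track(label_text: str) -> list[tuple[float, float, str]]:
    """Parse an exported Audacity label track (start<TAB>end<TAB>label)."""
    labels = []
    for line in label_text.splitlines():
        parts = line.split("\t")
        # Skip blank lines and the optional frequency-range lines,
        # which Audacity writes with a leading backslash.
        if len(parts) < 2 or line.startswith("\\"):
            continue
        name = parts[2] if len(parts) > 2 else ""
        labels.append((float(parts[0]), float(parts[1]), name))
    return labels


def format_transcript(clips: list[Clip]) -> str:
    """Order clips by start time and merge consecutive clips per speaker."""
    paragraphs: list[tuple[str, list[str]]] = []
    for clip in sorted(clips, key=lambda c: c.start):
        text = " ".join(clip.text.split())  # collapse stray whitespace
        if not text:
            continue  # nothing was transcribed for this clip
        if paragraphs and paragraphs[-1][0] == clip.speaker:
            paragraphs[-1][1].append(text)  # same speaker keeps talking
        else:
            paragraphs.append((clip.speaker, [text]))  # speaker changed
    return "\n\n".join(
        f"{speaker}: {' '.join(parts)}" for speaker, parts in paragraphs
    )
```

Sorting on the label start time is what lets separately recorded speaker tracks be interleaved into a single conversation; merging consecutive clips from the same speaker keeps short pauses from splitting one turn into many paragraphs.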

