Repo for simplifying podcast transcriptions using OpenAI's Whisper and Audacity

PurposeUnknown/podcast-transcriptions

podcast-transcriber

Script for transcribing audio files into an article format (designed for the Dojima Futures podcast or other podcasts)

Assumptions / Requirements:
- Python >= 3.10
- OpenAI's Whisper model
- Audacity >= 3.0 (for piping)
- Separate audio tracks for each speaker

General process:
- look for speaker audio files and load them into Audacity
- if there are multiple files for a given speaker, sort and merge them into one file in Audacity
- label sounds based on Audacity's sound/audio detection
- export audio into clips based on the labels
- transcribe with OpenAI's Whisper model
- format using audio timestamps from the labels to split into paragraphs per speaker (plus general transcription cleanup)
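
The last two steps of the process can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the `Clip` class and the functions `parse_labels`, `merge_into_paragraphs`, and `format_article` are hypothetical names introduced here. It assumes labels are exported from Audacity as a tab-separated text file (start, end, label text per line) and that each clip has already been transcribed, for example with `whisper.load_model("base").transcribe(path)["text"]` from the `openai-whisper` package.

```python
from dataclasses import dataclass


@dataclass
class Clip:
    """One exported audio clip (hypothetical structure, for illustration only)."""
    speaker: str   # speaker name, e.g. derived from the track's file name
    start: float   # label start time in seconds
    end: float     # label end time in seconds
    text: str      # transcription of the clip (e.g. produced by Whisper)


def parse_labels(label_text: str) -> list[tuple[float, float]]:
    """Parse an exported Audacity label file into (start, end) pairs.

    Assumes the tab-separated export format: start<TAB>end<TAB>label.
    Lines beginning with a backslash (frequency ranges) are skipped.
    """
    spans = []
    for line in label_text.splitlines():
        parts = line.split("\t")
        if len(parts) < 2 or line.startswith("\\"):
            continue
        spans.append((float(parts[0]), float(parts[1])))
    return spans


def merge_into_paragraphs(clips: list[Clip]) -> list[tuple[str, str]]:
    """Order clips by start time and join consecutive clips of one speaker."""
    paragraphs: list[tuple[str, str]] = []
    for clip in sorted(clips, key=lambda c: c.start):
        text = clip.text.strip()
        if not text:
            continue  # drop clips where nothing was transcribed
        if paragraphs and paragraphs[-1][0] == clip.speaker:
            # Same speaker keeps talking: extend the current paragraph.
            paragraphs[-1] = (clip.speaker, paragraphs[-1][1] + " " + text)
        else:
            # Speaker change: start a new paragraph.
            paragraphs.append((clip.speaker, text))
    return paragraphs


def format_article(paragraphs: list[tuple[str, str]]) -> str:
    """Render paragraphs as 'Speaker: text' blocks separated by blank lines."""
    return "\n\n".join(f"{speaker}: {text}" for speaker, text in paragraphs)
```

Because every speaker has a separate track, the label timestamps are the only information needed to interleave the per-speaker transcriptions into a single conversation; the speaker names used with `Clip` (such as "Host" and "Guest") are placeholders.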
