
Python script for converting speech to text. Uses Google Cloud Speech-to-Text API. Suitable for long audio/video files.


delorfin/Speech-to-text

 
 

 
 
 
 


Speech-to-text

Python script for converting speech to text. Uses Google Cloud Speech-to-Text API. Suitable for long audio/video files.

Originally based on Sundar Krishnan’s work.

Improvements

  • The code actually works now
  • Detects language automatically (from up to 6 predefined options)
  • Supports any audio or video format as a source, not just MP3
  • Converts to Opus format, which takes less space and therefore uploads faster
  • Supports source files of any length (previously limited to 100 MB)
  • Removes temporary files from disk after transcribing
  • Prints concise progress output for every stage of the process
  • Works on Windows, too
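
As an illustration of the Opus conversion step listed above, here is a minimal sketch of how any audio or video source could be handed to ffmpeg. The function names `build_opus_command` and `convert_to_opus`, the 32k bitrate, the mono downmix, and the 16 kHz sample rate are all assumptions made for this example; the actual script may use different settings.

```python
import subprocess
from pathlib import Path


def build_opus_command(source, target, bitrate="32k", sample_rate=16000):
    """Build an ffmpeg command that extracts audio and encodes it as Opus.

    Hypothetical helper: bitrate and sample rate are illustrative defaults,
    not values taken from the script.
    """
    return [
        "ffmpeg", "-y",           # overwrite the target if it already exists
        "-i", str(source),        # any container/codec that ffmpeg can read
        "-vn",                    # drop the video stream, keep audio only
        "-ac", "1",               # downmix to mono
        "-ar", str(sample_rate),  # resample to a rate Opus supports
        "-c:a", "libopus",        # encode with the Opus codec
        "-b:a", bitrate,          # target audio bitrate
        str(target),
    ]


def convert_to_opus(source, target):
    """Run ffmpeg and return the path of the converted file."""
    subprocess.run(build_opus_command(source, target), check=True)
    return Path(target)
```

Calling `convert_to_opus("talk.mp4", "talk.ogg")` would then produce a small audio-only file, provided ffmpeg is installed and built with libopus.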

Installation

  1. Set up Google Cloud: complete the 6 setup steps
  2. Create a storage bucket
  3. Install ffmpeg
  4. Install all the dependencies that the .py script imports
  5. Adjust the settings at the top of the .py script to suit your needs
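
Before the first run, it can help to confirm that the installation steps above took effect. The following is a minimal sketch, not part of the script: `missing_prerequisites` is a hypothetical helper, and it assumes that credentials are supplied through the `GOOGLE_APPLICATION_CREDENTIALS` environment variable, which is the convention of the Google Cloud client libraries. The script itself may load its key file differently.

```python
import os
import shutil


def missing_prerequisites(env=None):
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    problems = []

    # ffmpeg must be reachable on PATH for the conversion step.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")

    # Assumption: the service-account key is referenced by this variable.
    key_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not key_path:
        problems.append("GOOGLE_APPLICATION_CREDENTIALS is not set")
    elif not os.path.isfile(key_path):
        problems.append("credentials file does not exist: " + key_path)

    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print("Missing:", problem)
```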

Usage

  1. Put your audio or video files into the input folder specified in the settings
  2. Run the .py script
  3. Do your stuff (the whole process takes about 50–80% of your files' total duration)
  4. Collect your transcripts from the output folder
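
To plan around the waiting time mentioned in the steps above, the 50–80% figure can be turned into a rough estimate. This is only arithmetic on the README's own numbers; the function name `estimate_processing_minutes` is made up for this example, and real timings will vary with upload speed and audio content.

```python
def estimate_processing_minutes(total_duration_minutes):
    """Return (low, high) estimates at 50% and 80% of the source duration."""
    low = total_duration_minutes * 50 / 100
    high = total_duration_minutes * 80 / 100
    return low, high


if __name__ == "__main__":
    # Three hours of recordings in the input folder.
    low, high = estimate_processing_minutes(180)
    print(f"Expect roughly {low:.0f} to {high:.0f} minutes of processing")
```

For three hours of recordings this gives a window of about an hour and a half to just under two and a half hours.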
