delorfin / Speech-to-text Public

forked from Sundar0989/Speech-to-text

Speech-to-text

Python script for converting speech to text. Uses Google Cloud Speech-to-Text API. Suitable for long audio/video files.

Originally based on Sundar Krishnan’s work.

Improvements

The code actually works now
Detects language automatically (from up to 6 predefined options)
Supports any audio or video format as a source, not just MP3
Converts to Opus format, which takes less space → faster upload
Supports source files of any length (previously limited to 100 MB)
Removes temporary files from disk after transcribing
Prints concise progress output for every stage of the process
Works on Windows, too
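
The conversion to Opus mentioned above can be sketched roughly as follows. This is a minimal illustration, not the code in main.py: the function names, the 32k bitrate, and the mono downmix are all assumptions chosen for the example. Only standard ffmpeg options are used.

```python
import subprocess


def build_opus_command(src, dst, bitrate="32k"):
    """Build an ffmpeg command that extracts the audio track as Opus.

    The bitrate and the mono downmix are illustrative assumptions,
    not settings taken from main.py.
    """
    return [
        "ffmpeg",
        "-y",               # overwrite the output file if it already exists
        "-i", str(src),     # any audio or video format ffmpeg can read
        "-vn",              # drop the video stream, keep audio only
        "-ac", "1",         # downmix to a single (mono) channel
        "-c:a", "libopus",  # encode with the Opus codec
        "-b:a", bitrate,    # target audio bitrate
        str(dst),
    ]


def convert_to_opus(src, dst, bitrate="32k"):
    """Run ffmpeg and raise CalledProcessError if the conversion fails."""
    subprocess.run(build_opus_command(src, dst, bitrate),
                   check=True, capture_output=True)
```

Calling `convert_to_opus("talk.mp4", "talk.opus")` would write a compact audio-only file next to the source, provided ffmpeg is installed and was built with libopus support.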

Installation

Set up Google Cloud: complete the 6 setup steps
Create a storage bucket
Install ffmpeg
Install all the dependencies of main.py
Adjust the settings at the top of main.py to your needs
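
To give an idea of what the settings and a quick pre-flight check might look like, here is a hedged sketch. Every name and value in `Settings` is a hypothetical placeholder, not the actual variables in main.py. The check assumes that credentials are supplied through the `GOOGLE_APPLICATION_CREDENTIALS` environment variable, which is one common way to authenticate Google Cloud client libraries, but may not be how your setup works.

```python
import os
import shutil
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Settings:
    # All names and values below are hypothetical placeholders.
    input_dir: Path = Path("input")            # where source files are read from
    output_dir: Path = Path("transcripts")     # where transcripts are written
    bucket_name: str = "my-speech-bucket"      # the storage bucket you created
    language_codes: list = field(
        default_factory=lambda: ["en-US", "ru-RU"]  # candidate languages
    )


def missing_prerequisites(which=shutil.which, env=os.environ):
    """Return a list of human-readable problems; empty means ready to run.

    `which` and `env` are injectable so the check can be tested
    without touching the real system.
    """
    problems = []
    if which("ffmpeg") is None:
        problems.append("ffmpeg was not found on PATH")
    # Assumption: credentials come from a service account key file.
    if not env.get("GOOGLE_APPLICATION_CREDENTIALS"):
        problems.append("GOOGLE_APPLICATION_CREDENTIALS is not set")
    return problems
```

Printing the result of `missing_prerequisites()` before a first run is a cheap way to catch a missing ffmpeg install or missing credentials early.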

Usage

Put your audio or video files in the folder specified in the settings
Run main.py
Do something else in the meantime (the whole process takes about 50–80% of your files' total duration)
Collect your transcripts from a separate output folder
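
The 50–80% figure above translates into a simple estimate of how long a run will take. The helper below is only an arithmetic illustration with a hypothetical function name; actual times depend on your connection and on the service.

```python
from datetime import timedelta


def estimate_processing_time(total_duration):
    """Return (low, high) estimates as 50% and 80% of the audio duration."""
    # Integer percentages keep the timedelta arithmetic exact.
    low = total_duration * 50 / 100
    high = total_duration * 80 / 100
    return low, high


low, high = estimate_processing_time(timedelta(hours=1))
print(f"Expect roughly {low} to {high} of processing")
```

For one hour of recordings this gives a window of 30 to 48 minutes.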
