Using the Google Cloud Speech API to transcribe YouTube videos

galenballew/speech2text

Speech2Text

Using the Google Cloud Speech API to transcribe YouTube videos, then applying NLP techniques, including LDA topic modeling, to the transcriptions

The Challenge: Much of the language we process every day is heard, not written, and natural language processing should take that into account. This project is a proof-of-concept platform for accessing that other half of language: speech.

The Toolkit:

  • scikit-learn
  • Google Cloud project
  • gensim
  • numpy
  • MongoDB
  • Google Cloud Speech API
  • Google Cloud Storage API
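
The toolkit above implies a pipeline of transcription followed by topic modeling. The sketch below illustrates one plausible step in between: turning transcript text into bag-of-words counts, the `(token_id, count)` form that gensim's `Dictionary.doc2bow` also produces. It uses only the Python standard library. The stop-word list, function names, and sample transcripts are hypothetical and are not taken from this repository.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real project would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "is", "in", "it", "that", "we"}


def tokenize(transcript):
    """Lowercase a transcript and return its tokens with stop words removed."""
    words = re.findall(r"[a-z']+", transcript.lower())
    return [w for w in words if w not in STOP_WORDS]


def build_vocabulary(token_lists):
    """Map each distinct token to an integer id, assigned in sorted order."""
    vocab = sorted({tok for tokens in token_lists for tok in tokens})
    return {tok: i for i, tok in enumerate(vocab)}


def to_bag_of_words(tokens, vocabulary):
    """Return sorted (token_id, count) pairs for one document."""
    counts = Counter(vocabulary[t] for t in tokens if t in vocabulary)
    return sorted(counts.items())


# Made-up transcripts standing in for Speech API output.
transcripts = [
    "The speech API turns audio into text",
    "Topic models group text into topics and topics into themes",
]
token_lists = [tokenize(t) for t in transcripts]
vocabulary = build_vocabulary(token_lists)
corpus = [to_bag_of_words(tokens, vocabulary) for tokens in token_lists]
```

A corpus in this shape, together with the id-to-token mapping, is the kind of input an LDA implementation consumes; the preprocessing actually used in this project may differ.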

The Results: The extracted topics were accurate but not insightful. A larger data set or more nuanced data could yield better results. One possible idea is to transcribe stand-up comedy from different periods and group comedians by era or topic.
