voice-control-goof

A simple library for goofing around with voice control, using Google's unofficial speech-to-text and text-to-speech APIs.
It has a somewhat configurable module structure for adding functionality.

Quickstart (OSX)

  • Install Homebrew

      ruby -e "$(curl -fsSL https://raw.github.com/Homebrew/homebrew/go/install)"
    
  • Install sox, portaudio, flac and pocketsphinx

      brew install sox
      brew install portaudio
      brew install flac
      brew install cmu-pocketsphinx
    
  • If this is the first time you have installed a brew package that includes a Python module, be sure to follow the warning that brew printed when installing cmu-pocketsphinx

      If you need Python to find the installed site-packages:
      mkdir -p ~/Library/Python/2.7/lib/python/site-packages
      echo '/usr/local/lib/python2.7/site-packages' > ~/Library/Python/2.7/lib/python/site-packages/homebrew.pth
    
  • Install pip and PyAudio

      sudo easy_install pip
      sudo pip install --allow-external PyAudio --allow-unverified PyAudio PyAudio
    
  • Clone this repository

      git clone https://github.com/fopina/voice-control-goof
    
  • Copy config.py.example to config.py and download the Jasper language model and dictionary

      cd voice-control-goof
      cp config.py.example config.py
      curl -O https://raw.githubusercontent.com/jasperproject/jasper-client/master/client/languagemodel_persona.lm
      curl -O https://raw.githubusercontent.com/jasperproject/jasper-client/master/client/dictionary_persona.dic
    
  • Optionally, get your own Google Speech API key and update config.py

  • Goof!

      skmac:voice-control-goof fopina$ python echo_test.py
      please speak into the microphone
      speech to text...
      hello world (confidence: 0.97335243)
    
      Winner: hello world
    
      text to speech...
      please speak into the microphone
      speech to text...
      please speak into the microphone
      speech to text...
      how are you (confidence: 0.95447284)
      are you
    
      Winner: how are you
    
      text to speech...
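
The transcript above suggests that each recognition returns several candidate transcripts, only some of them with a confidence score, and that one of them is reported as the winner. Below is a minimal Python 3 sketch of that selection step. It assumes the line-delimited JSON response shape described in the unofficial google-speech-v2 documentation; `parse_alternatives`, `pick_winner` and `SAMPLE` are hypothetical names, and this is not necessarily how echo_test.py does it.

```python
import json


def parse_alternatives(body):
    """Collect (transcript, confidence) pairs from a line-delimited JSON body.

    Assumption: each non-empty line is a JSON object with a "result" list,
    and each result carries an "alternative" list of candidate transcripts.
    Confidence is None when the service did not report one.
    """
    alternatives = []
    for line in body.splitlines():
        line = line.strip()
        if not line:
            continue
        for result in json.loads(line).get("result", []):
            for alt in result.get("alternative", []):
                alternatives.append((alt["transcript"], alt.get("confidence")))
    return alternatives


def pick_winner(alternatives):
    """Return the transcript with the highest confidence, or None if empty.

    Candidates without a confidence score rank below any scored candidate.
    """
    if not alternatives:
        return None
    best = max(alternatives, key=lambda a: -1.0 if a[1] is None else a[1])
    return best[0]


# Hand-written sample modelled on the second exchange in the transcript.
SAMPLE = (
    '{"result":[]}\n'
    '{"result":[{"alternative":['
    '{"transcript":"how are you","confidence":0.95447284},'
    '{"transcript":"are you"}],"final":true}],"result_index":0}\n'
)

if __name__ == "__main__":
    for transcript, confidence in parse_alternatives(SAMPLE):
        print(transcript, confidence)
    print("Winner:", pick_winner(parse_alternatives(SAMPLE)))
```

Ranking unscored candidates last is a design choice of this sketch; a real client might instead trust the order in which the service returns them.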
    

Dependencies

  • a command-line MP3 player for TTS (using sox)
  • a command-line WAV-to-FLAC conversion tool (using flac)
  • a command-line tool to downsample WAV input when its sample rate is above 16 kHz (using sox)
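
To make the three roles in the list above concrete, here is a small Python 3 sketch that builds the corresponding command lines and checks whether a recording needs downsampling. The function names and `TARGET_RATE` are hypothetical, the flags are standard sox and flac options, and none of this is taken from the library's own code. MP3 playback through sox's `play` also assumes sox was built with MP3 support.

```python
import wave

TARGET_RATE = 16000  # Hz; the ceiling named in the dependency list


def play_command(mp3_path):
    """Command to play an audio file quietly with sox's `play` front end."""
    return ["play", "-q", mp3_path]


def flac_command(wav_path, flac_path):
    """Command to convert WAV to FLAC (-f overwrite, -s silent, -o output)."""
    return ["flac", "-f", "-s", "-o", flac_path, wav_path]


def downsample_command(src_path, dst_path, rate=TARGET_RATE):
    """Command to resample a WAV file to `rate` Hz with sox."""
    return ["sox", src_path, "-r", str(rate), dst_path]


def needs_downsample(wav_path, limit=TARGET_RATE):
    """Return True when the WAV file's sample rate is above `limit` Hz."""
    with wave.open(wav_path, "rb") as wav:
        return wav.getframerate() > limit


if __name__ == "__main__":
    print(" ".join(downsample_command("input.wav", "input-16k.wav")))
    print(" ".join(flac_command("input-16k.wav", "input.flac")))
    print(" ".join(play_command("reply.mp3")))
```

Each list can be passed to `subprocess.check_call` once the tools are installed. Building argument lists instead of shell strings avoids quoting problems with file names that contain spaces.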

Links

CMU Sphinx - Open Source Toolkit For Speech Recognition
google-speech-v2 - Unofficial Google STT API "documentation"
Chromium Developers - Get your own Google Speech API key
Jasper Project - Control anything with your voice
ZeroKidz - If you don't want to get your own key, maybe you'll find an active one here
