A simple library for goofing around with voice control, using Google's unofficial speech-to-text and text-to-speech APIs.
It has a somewhat configurable module structure for adding functionality!
-
Install Homebrew
ruby -e "$(curl -fsSL https://raw.github.com/Homebrew/homebrew/go/install)"
-
Install sox, portaudio, flac and pocketsphinx
brew install sox
brew install portaudio
brew install flac
brew install cmu-pocketsphinx
-
If this is the first time you have installed a brew package that includes a Python module, be sure to follow the warning that brew shows when installing cmu-pocketsphinx:
If you need Python to find the installed site-packages:
mkdir -p ~/Library/Python/2.7/lib/python/site-packages
echo '/usr/local/lib/python2.7/site-packages' > ~/Library/Python/2.7/lib/python/site-packages/homebrew.pth
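The homebrew.pth step works because Python's site module reads every .pth file in a site-packages directory and appends each listed directory that exists to sys.path. A minimal sketch of that mechanism follows. It uses a throwaway temporary directory instead of the real Homebrew paths, and the directory names are illustrative only.

```python
import os
import site
import sys
import tempfile


def demo_pth():
    """Create a .pth file in a temp site dir and let `site` process it."""
    base = tempfile.mkdtemp()
    # Stands in for ~/Library/Python/2.7/lib/python/site-packages
    site_dir = os.path.join(base, "site-packages")
    # Stands in for /usr/local/lib/python2.7/site-packages
    extra_dir = os.path.join(base, "homebrew-packages")
    os.makedirs(site_dir)
    os.makedirs(extra_dir)

    # One directory per line, exactly like the echo command above writes
    with open(os.path.join(site_dir, "homebrew.pth"), "w") as fh:
        fh.write(extra_dir + "\n")

    # Process the directory the way Python does for site-packages at startup
    site.addsitedir(site_dir)
    return extra_dir


extra = demo_pth()
print(extra in sys.path)
```

If the directory named in the .pth file does not exist, it is silently skipped, which is worth checking first when an import still fails after this step.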
-
Install pip and PyAudio
sudo easy_install pip
sudo pip install --allow-external PyAudio --allow-unverified PyAudio PyAudio
-
Clone this repository
git clone https://github.com/fopina/voice-control-goof
-
Copy config.py.example to config.py and download the Jasper language model and dictionary
cd voice-control-goof
cp config.py.example config.py
curl -O https://raw.githubusercontent.com/jasperproject/jasper-client/master/client/languagemodel_persona.lm
curl -O https://raw.githubusercontent.com/jasperproject/jasper-client/master/client/dictionary_persona.dic
-
Optionally, get your own Google Speech API key and update config.py
-
Goof!
skmac:voice-control-goof fopina$ p echo_test.py
please speak into the microphone
speech to text...
hello world (confidence: 0.97335243)
Winner: hello world
text to speech...
please speak into the microphone
speech to text...
please speak into the microphone
speech to text...
how are you (confidence: 0.95447284)
are you
Winner: how are you
text to speech...
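The "Winner" line in the session above suggests that one hypothesis is picked from several candidates. Here is a rough sketch of how such a pick could work, not the actual logic of echo_test.py. It assumes the response layout described in the unofficial google-speech-v2 notes: one JSON object per line, each with a "result" list whose entries hold "alternative" hypotheses, only some of which carry a "confidence". The function name pick_winner and the sample payload are hypothetical.

```python
import json


def pick_winner(raw_response):
    """Return (transcript, confidence) of the best hypothesis, or None.

    Assumes one JSON object per line; alternatives without a
    confidence value are treated as confidence 0.0.
    """
    best = None
    for line in raw_response.splitlines():
        line = line.strip()
        if not line:
            continue
        data = json.loads(line)
        for result in data.get("result", []):
            for alt in result.get("alternative", []):
                text = alt.get("transcript")
                if text is None:
                    continue
                conf = alt.get("confidence", 0.0)
                if best is None or conf > best[1]:
                    best = (text, conf)
    return best


# Hypothetical payload shaped like the second exchange in the session above
sample = "\n".join([
    '{"result":[]}',
    '{"result":[{"alternative":['
    '{"transcript":"how are you","confidence":0.95447284},'
    '{"transcript":"are you"}],"final":true}],"result_index":0}',
])

winner = pick_winner(sample)
print("Winner: %s" % winner[0])
```

An empty "result" list yields None, which would correspond to the prompt repeating without a winner, as happens once in the session above.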
- command line MP3 player for TTS (using sox)
- command line wave2flac conversion tool (using flac)
- command line tool to downsample WAVE when input rate is above 16k (using sox)
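The last two tools can be chained: check the WAVE sample rate, downsample with sox only when it is above 16k, then convert to FLAC. The sketch below shows that flow under some assumptions. The helper names are hypothetical, the 16000 Hz target is read from the "above 16k" wording, and the library's own invocation of sox and flac may differ. It only builds the command lists, so it can be inspected without either tool installed.

```python
import wave

TARGET_RATE = 16000  # assumed ceiling, taken from "above 16k"


def sample_rate(path):
    """Read the sample rate from a WAVE file header."""
    w = wave.open(path, "rb")
    try:
        return w.getframerate()
    finally:
        w.close()


def downsample_cmd(src, dst, rate=TARGET_RATE):
    """sox command that resamples src to `rate` Hz and writes dst."""
    return ["sox", src, "-r", str(rate), dst]


def flac_cmd(src, dst):
    """flac command: -f overwrites dst, -s is silent, -o names the output."""
    return ["flac", "-f", "-s", "-o", dst, src]


def prepare_cmds(src, resampled, flac_out):
    """Return the commands needed to turn src into a FLAC file."""
    cmds = []
    if sample_rate(src) > TARGET_RATE:
        cmds.append(downsample_cmd(src, resampled))
        src = resampled  # convert the resampled file, not the original
    cmds.append(flac_cmd(src, flac_out))
    return cmds
```

Each returned list can be passed to subprocess.check_call to actually run the tool.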
CMU Sphinx - Open Source Toolkit For Speech Recognition
google-speech-v2 - Unofficial Google STT API "documentation"
Chromium Developers - Get your own Google Speech API key
Jasper Project - Control anything with your voice
ZeroKidz - If you don't want to get your own key, maybe you'll find an active one here