## Python Packages for Speech Recognition


 - apiai
 - assemblyai
 - google-cloud-speech
 - pocketsphinx
 - SpeechRecognition
 - watson-developer-cloud
 - wit
    
Some of these packages, such as wit and apiai, offer built-in features, like natural language processing for identifying a speaker’s intent, which go beyond basic speech recognition. Others, like google-cloud-speech, focus solely on speech-to-text conversion.

In [1]:
! pip install SpeechRecognition

Collecting SpeechRecognition
  Downloading https://files.pythonhosted.org/packages/26/e1/7f5678cd94ec1234269d23756dbdaa4c8cfaed973412f88ae8adf7893a50/SpeechRecognition-3.8.1-py2.py3-none-any.whl (32.8MB)
Installing collected packages: SpeechRecognition
Successfully installed SpeechRecognition-3.8.1


You are using pip version 10.0.1, however version 18.0 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.


In [6]:
import speech_recognition as sr
sr.__version__

'3.8.1'

In [7]:
r = sr.Recognizer()

In [8]:
r

<speech_recognition.Recognizer at 0x4e7b5c0>

- recognize_bing(): Microsoft Bing Speech
- recognize_google(): Google Web Speech API
- recognize_google_cloud(): Google Cloud Speech - requires installation of the google-cloud-speech package
- recognize_houndify(): Houndify by SoundHound
- recognize_ibm(): IBM Speech to Text
- recognize_sphinx(): CMU Sphinx - requires installing PocketSphinx
- recognize_wit(): Wit.ai

Of the seven, only recognize_sphinx() works offline, using the CMU Sphinx engine. The other six require an internet connection.

- The default key provided by SpeechRecognition is for testing purposes only, and Google may revoke it at any time. 
- It is not a good idea to use the Google Web Speech API in production. 
- Even with a valid API key, you’ll be limited to only 50 requests per day, and there is no way to raise this quota. 
- Fortunately, SpeechRecognition’s interface is nearly identical for each API, so code written against one service carries over to the others with little change.
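Since every `recognize_*()` method takes the audio as its first argument, one way to exploit the shared interface is to look the method up by name. The sketch below is only an illustration and not part of SpeechRecognition: `transcribe_with` and `FakeRecognizer` are hypothetical names, and the stub stands in for `sr.Recognizer()` so the example runs without network access or API keys.

```python
def transcribe_with(recognizer, engine, audio, *credentials):
    """Call recognizer.recognize_<engine>(audio, *credentials).

    `engine` is the suffix of a recognize_*() method, e.g. "google" or "wit".
    """
    method = getattr(recognizer, "recognize_" + engine, None)
    if method is None:
        raise ValueError("unknown engine: " + engine)
    return method(audio, *credentials)


class FakeRecognizer:
    """Hypothetical stand-in for sr.Recognizer(), used only for this demo."""

    def recognize_google(self, audio_data):
        return "google heard: " + audio_data

    def recognize_wit(self, audio_data, key):
        return "wit heard: " + audio_data


fake = FakeRecognizer()
print(transcribe_with(fake, "google", "hello"))
print(transcribe_with(fake, "wit", "hello", "MY_KEY"))
```

With the real library the call would presumably look like `transcribe_with(r, "wit", audio, key)`, where `r` is an `sr.Recognizer()` instance and `audio` is an `AudioData` object.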

In [9]:
r.recognize_google()

TypeError: recognize_google() missing 1 required positional argument: 'audio_data'

In [10]:
import os

In [11]:
os.getcwd()

'C:\\Users\\Dell\\Documents'

In [13]:
os.chdir('C:\\Users\\Dell\\Documents\\python-speech-recognition\\audio_files')

In [14]:
os.listdir()

['harvard.wav', 'jackhammer.wav']
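Changing the working directory affects every later relative path in the notebook. A possible alternative, sketched here with the standard `pathlib` module, is to build the paths explicitly. The helper name `list_wav_files` is hypothetical, and the demo uses a temporary folder rather than the `audio_files` directory shown above.

```python
import tempfile
from pathlib import Path


def list_wav_files(folder):
    """Return the sorted paths of .wav files directly inside `folder`."""
    return sorted(p for p in Path(folder).iterdir() if p.suffix.lower() == ".wav")


# Demo on a throwaway folder so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("harvard.wav", "jackhammer.wav", "notes.txt"):
        (Path(tmp) / name).touch()
    names = [p.name for p in list_wav_files(tmp)]

print(names)
```

Each returned path can be handed to the library as a string, for example `sr.AudioFile(str(path))`.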

In [15]:
harvard = sr.AudioFile('harvard.wav')
harvard

<speech_recognition.AudioFile at 0x4e840b8>
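Before transcribing a WAV file it can help to check its sample rate and duration. The sketch below uses only the standard `wave` module. `wav_info` is a hypothetical helper, and the demo writes one second of silence to a temporary file instead of reading `harvard.wav`.

```python
import os
import tempfile
import wave


def wav_info(path):
    """Return (channels, sample_rate_hz, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return wav.getnchannels(), rate, wav.getnframes() / rate


# Demo: write one second of 16-bit mono silence, then inspect it.
demo_path = os.path.join(tempfile.mkdtemp(), "silence.wav")
with wave.open(demo_path, "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)          # 2 bytes per sample = 16-bit audio
    wav.setframerate(16000)
    wav.writeframes(b"\x00\x00" * 16000)

info = wav_info(demo_path)
print(info)
```

Calling `wav_info('harvard.wav')` from the audio folder would report the same three values for the file used in this notebook.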

In [16]:
with harvard as source:
    audio = r.record(source)

In [17]:
audio

<speech_recognition.AudioData at 0x4e84390>

In [18]:
r.recognize_google(audio)

"best small about beer drinkers it takes heat to bring out the order hold up resource help in West it's ok take always fine with him because of astora my favourite is just for food is Bihar cross bun"

#### actual audio

- the stale smell of old beer lingers 
- it takes heat to bring out the odor 
- a cold dip restores health and zest 
- a salt pickle taste fine with ham
- tacos al Pastore are my favorite
- a zestful food is the hot cross bun
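To compare a transcript against the actual audio with a number rather than by eye, one common measure is the word error rate (WER): the word-level edit distance divided by the number of reference words. The sketch below is a plain-Python version. The function names are hypothetical, and the normalization (lowercasing, stripping punctuation) is an assumption about what should count as a match.

```python
import string


def words(text):
    """Lowercase, strip punctuation, and split into words."""
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).split()


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = words(reference), words(hypothesis)
    if not ref:
        raise ValueError("reference transcript is empty")
    # prev is the previous row of the edit-distance table.
    prev = list(range(len(hyp) + 1))
    for i, r_word in enumerate(ref, start=1):
        curr = [i]
        for j, h_word in enumerate(hyp, start=1):
            cost = 0 if r_word == h_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


# One wrong word out of eight reference words.
print(word_error_rate("it takes heat to bring out the odor",
                      "it takes heat to bring out the older"))  # 0.125
```

Passing the full reference text and each engine's output to `word_error_rate` gives one score per engine, where lower is better.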

Let's try more APIs to get a better result.

https://azure.microsoft.com/en-in/try/cognitive-services/?api=speech-api

In [20]:
r.recognize_bing(audio,'58d6877152964c718ce36b38db85cce5')

'The stale smell of old beer lingers it takes heat to bring out the older it cold dip restore his health and zest a salt pickle taste fine with Ham tacos al pastore are my favorite a zestful food is the hot cross bun.'

In [21]:
# Bing gave better results

In [26]:
! pip install google-cloud



Enable API at https://console.cloud.google.com/apis/api/speech.googleapis.com/overview?project=subtle-creek-142210
    
Create credentials at https://console.cloud.google.com/apis/credentials/wizard?project=subtle-creek-142210

In [36]:
! pip install --upgrade google-api-python-client --ignore-installed pyasn1 --user --no-warn-script-location


Collecting google-api-python-client
  Using cached https://files.pythonhosted.org/packages/56/04/5259a17a16a779426f6e2ac62796135b0d4a59cf8033a21037fd4ba5bf81/google_api_python_client-1.7.4-py3-none-any.whl
Collecting pyasn1
  Using cached https://files.pythonhosted.org/packages/d1/a1/7790cc85db38daa874f6a2e6308131b9953feb1367f2ae2d1123bb93a9f5/pyasn1-0.4.4-py2.py3-none-any.whl
Collecting six<2dev,>=1.6.1 (from google-api-python-client)
  Using cached https://files.pythonhosted.org/packages/67/4b/141a581104b1f6397bfa78ac9d43d8ad29a7ca43ea90a2d863fe3056e86a/six-1.11.0-py2.py3-none-any.whl
Collecting google-auth-httplib2>=0.0.3 (from google-api-python-client)
  Using cached https://files.pythonhosted.org/packages/33/49/c814d6d438b823441552198f096fcd0377fd6c88714dbed34f1d3c8c4389/google_auth_httplib2-0.0.3-py2.py3-none-any.whl
Collecting uritemplate<4dev,>=3.0.0 (from google-api-python-client)
  Using cached https://files.pythonhosted.org/packages/e5/7d/9d5a640c4f8bf2c8b1afc015e9a9d8de32e1

In [39]:
r.recognize_google_cloud

<bound method Recognizer.recognize_google_cloud of <speech_recognition.Recognizer object at 0x0000000004E7B5C0>>

In [40]:
!pip install oauth2client

Collecting oauth2client
  Downloading https://files.pythonhosted.org/packages/95/a9/4f25a14d23f0786b64875b91784607c2277eff25d48f915e39ff0cff505a/oauth2client-4.1.3-py2.py3-none-any.whl (98kB)
Installing collected packages: oauth2client
Successfully installed oauth2client-4.1.3


In [48]:
r.recognize_houndify(audio)

TypeError: recognize_houndify() missing 2 required positional arguments: 'client_id' and 'client_key'

You can send streaming audio to the sample program sample_wave.py. There are two .wav files you can try. You will get back a JSON Response based on the contents of the audio.

https://www.houndify.com/applications/register?newClient=true
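The two missing arguments are account credentials. Rather than pasting them into a notebook cell, one option is to read them from environment variables. This is a minimal sketch: `load_credentials` and the variable names `HOUNDIFY_CLIENT_ID` and `HOUNDIFY_CLIENT_KEY` are hypothetical choices, not names defined by Houndify or SpeechRecognition.

```python
import os


def load_credentials(*names):
    """Return the values of the named environment variables, in order.

    Raises KeyError naming the first variable that is missing or empty.
    """
    values = []
    for name in names:
        value = os.environ.get(name)
        if not value:
            raise KeyError(name)
        values.append(value)
    return tuple(values)


# Demo only: real values would be set in the shell, not in the notebook.
os.environ["HOUNDIFY_CLIENT_ID"] = "demo-id"
os.environ["HOUNDIFY_CLIENT_KEY"] = "demo-key"

client_id, client_key = load_credentials("HOUNDIFY_CLIENT_ID", "HOUNDIFY_CLIENT_KEY")
print(client_id, client_key)
```

The pair could then be passed as `r.recognize_houndify(audio, client_id, client_key)`, matching the positional arguments named in the error message. The same helper would work for the keys that other services require.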

In [49]:
r.recognize_houndify(audio,'qK1XpbBJXUFRAfVPhAtMkA==','tvHe-CIPyhtVWnZu7mewDMkgFVpu8uLcwaZWMpApAxIhTUJQEKuugXTATkNJyxGwtn98LEA2MFZ8OKmGz-vYAQ==')

'the stale smell of old beer lingers it takes heat to bring out the odor it cold dip restores health and zest a salt pickled taste fine with him tacos al pastor are my favorite eight zestful food is the hot cross bun'

In [None]:
#slightly different

In [50]:
r.recognize_ibm(audio)

TypeError: recognize_ibm() missing 2 required positional arguments: 'username' and 'password'

Get started for free at https://www.ibm.com/watson/services/speech-to-text/
The Lite plan gets you started with 100 minutes per month at no cost. When you upgrade to a paid plan, you will get access to Customization capabilities.
https://console.bluemix.net/services/speech-to-text/7a8a4c1c-a0fa-4df5-8e29-2d02c587cf6d/?paneId=manage&new=true&env_id=ibm:yp:eu-gb&org=3b0fe479-0791-4eff-b88b-a882a2e4556e&space=ff974e0e-c168-411f-9d0a-f83fcdd61fd2

In [53]:
! pip install --upgrade "watson-developer-cloud>=2.0.1"



Collecting watson-developer-cloud>=2.0.1
  Downloading https://files.pythonhosted.org/packages/d5/e8/bbd1e9fad890008e888536b8cf74beb08d9df12126d32258ce74fb2c0123/watson-developer-cloud-2.0.1.tar.gz (233kB)
Collecting websocket-client==0.47.0 (from watson-developer-cloud>=2.0.1)
  Downloading https://files.pythonhosted.org/packages/9d/fb/f51a03e232e00d6c504dfe815aed090c894ba3f8d3f7fd9612f3e227bf24/websocket_client-0.47.0-py2.py3-none-any.whl (200kB)
Building wheels for collected packages: watson-developer-cloud
  Running setup.py bdist_wheel for watson-developer-cloud: started
  Running setup.py bdist_wheel for watson-developer-cloud: finished with status 'done'
  Stored in directory: C:\Users\Dell\AppData\Local\pip\Cache\wheels\72\da\bd\5803d89075df5870fff6a08276027982888c3d2c766ca83f98
Successfully built watson-developer-cloud
Installing collected packages: websocket-client, watson-developer-cloud
Successfully installed watson-developer-cloud-2.0.1 websocket-client-0.47.0


In [30]:
! pip install pocketsphinx

Collecting pocketsphinx
  Downloading https://files.pythonhosted.org/packages/b2/b7/33ea7440fe7aa0d423210bd418e11d6c29f125fd34e8809bf07cb4aa640d/pocketsphinx-0.1.15-cp35-cp35m-win_amd64.whl (29.1MB)
Installing collected packages: pocketsphinx
Successfully installed pocketsphinx-0.1.15


In [59]:
r.recognize_sphinx(audio)

"this they'll smell of old we're lingers it takes heat to bring out the odor called it restores health and zest case all the colt is fine with him couples all pastore my favorite is as full food is the hot cross mon"

In [60]:
r.recognize_wit(audio)

TypeError: recognize_wit() missing 1 required positional argument: 'key'

Key from https://wit.ai/ajayohri/MyFirstApp/settings

In [61]:
r.recognize_wit(audio,'L43BGEH2LKOIT4SPCPALQVRANO2HTF5U')

'the stale smell of old beer lingers it takes heat to bring out the odor a cold dip restores health and zest a salt pickle taste fine with ham tacos al Pastore are my favorite a zestful food is be hot cross bun'

In [None]:
#best performance so far