The steps below follow the Real Python tutorial at https://realpython.com/python-speech-recognition/

In [1]:
! pip install SpeechRecognition



In [2]:
import speech_recognition as sr

# The Recognizer Class
All of the magic in SpeechRecognition happens with the Recognizer class.

To create a Recognizer instance, just type:

In [3]:
r = sr.Recognizer()

In [4]:
r.recognize_google()

TypeError: recognize_google() missing 1 required positional argument: 'audio_data'

The Recognizer class has seven `recognize_*()` methods. In each case, ***audio_data must be an instance of SpeechRecognition’s AudioData class.***

**There are two ways to create an AudioData instance: from an audio file or audio recorded by a microphone** (perfect for us!).

# Audio File Example
Download `harvard.wav` from https://github.com/realpython/python-speech-recognition/tree/master/audio_files, or use the link I provided in Slack.

In [5]:
harvard = sr.AudioFile('harvard.wav')

In [6]:
with harvard as source:
    audio = r.record(source)
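Before handing a file to `sr.AudioFile`, it can help to check its basic properties. The sketch below uses only the standard-library `wave` module, so it applies to PCM WAV files only. The helper name `wav_info` and the file `silence_demo.wav` are made up for illustration; the demo writes one second of silence so that it runs without `harvard.wav`.

```python
import os
import tempfile
import wave


def wav_info(path):
    """Return basic properties of a PCM WAV file (hypothetical helper)."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
        }


# Write one second of 16-bit mono silence at 16 kHz as a stand-in file.
demo_path = os.path.join(tempfile.gettempdir(), "silence_demo.wav")
with wave.open(demo_path, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)       # 2 bytes per sample = 16-bit audio
    out.setframerate(16000)
    out.writeframes(b"\x00\x00" * 16000)

info = wav_info(demo_path)
print(info)
```

Calling `wav_info('harvard.wav')` in the notebook would report the same fields for the tutorial file.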

You can now invoke recognize_google() to attempt to recognize any speech in the audio. Depending on your internet connection speed, you may have to wait several seconds before seeing the result.

Note: Of the seven `recognize_*()` methods to choose from, `recognize_google()` is the easiest for us to start with, because SpeechRecognition ships with a default API key for the Google Web Speech API. The other six APIs all require authentication with either an API key or a username/password combination. There may be a limit on the number of requests with the default Google key (reportedly 50 per day per user); if this becomes a problem, we'll switch and look into the other APIs.
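If the daily request limit turns out to matter, one option is to avoid re-sending a file that has already been transcribed. The following is a minimal sketch, not part of SpeechRecognition: `cached_transcribe`, `file_digest`, and `fake_transcribe` are hypothetical names, and the fake transcriber stands in for a function that would call `r.recognize_google()`.

```python
import hashlib
import os
import tempfile


def file_digest(path):
    """Hash the file contents so identical audio maps to the same cache key."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def cached_transcribe(path, transcribe, cache):
    """Call transcribe(path) only if this file's contents were not seen before."""
    key = file_digest(path)
    if key not in cache:
        cache[key] = transcribe(path)
    return cache[key]


# Stand-in for a real transcriber; it counts how often it is called.
calls = []

def fake_transcribe(path):
    calls.append(path)
    return "placeholder transcript"


# Create a small dummy file so the sketch runs without harvard.wav.
demo_path = os.path.join(tempfile.gettempdir(), "cache_demo.bin")
with open(demo_path, "wb") as f:
    f.write(b"not real audio")

cache = {}
first = cached_transcribe(demo_path, fake_transcribe, cache)
second = cached_transcribe(demo_path, fake_transcribe, cache)
print(first, len(calls))
```

In the notebook, `transcribe` could be a small function that opens the path with `sr.AudioFile`, records it with `r.record(source)`, and returns `r.recognize_google(audio)`.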

In [7]:
r.recognize_google(audio)

'the stale smell of old beer lingers it takes heat to bring out the odor a cold dip restores health and zest a salt pickle taste fine with ham tacos al Pastore are my favorite a zestful food is be hot cross bun'

This is a good transcription. One thing to note is that pauses are not treated as sentence boundaries, so the output has no periods. This was also a very clean audio file if you listen to it; the only word it got wrong was the final 'be', which should have been 'the'.
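To check a transcript against a known reference without reading it word by word, a word-level diff is enough. This is a rough sketch using the standard-library `difflib`; `word_diffs` is a hypothetical helper, and the reference string is built only from the correction noted above ('be' should be 'the'), not from an official transcript.

```python
import difflib


def word_diffs(hypothesis, reference):
    """Return (heard, expected) pairs for every word span that differs."""
    hyp = hypothesis.lower().split()
    ref = reference.lower().split()
    matcher = difflib.SequenceMatcher(a=hyp, b=ref, autojunk=False)
    diffs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            diffs.append((" ".join(hyp[i1:i2]), " ".join(ref[j1:j2])))
    return diffs


# Last sentence of the transcript, and the same sentence with the noted fix.
hypothesis = "a zestful food is be hot cross bun"
reference = "a zestful food is the hot cross bun"

diffs = word_diffs(hypothesis, reference)
print(diffs)  # [('be', 'the')]
```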

Note: The tutorial also covers ways to deal with ambient noise; however, I think our demos can assume perfectly clear audio. There is also a way to return a dictionary of alternative transcriptions if we really need those. For now this is good enough.
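For reference, the alternative transcriptions come from calling `r.recognize_google(audio, show_all=True)`, which returns the raw API response instead of a single string. The sketch below shows one way to pull the candidates out of such a response. The response shape (an `'alternative'` list of dictionaries with a `'transcript'` and an optional `'confidence'`) is an assumption based on what the tutorial shows, and the `sample` dictionary is made up for illustration, not real API output.

```python
def list_alternatives(response):
    """Return (transcript, confidence) pairs from a show_all-style response.

    Hypothetical helper: anything that is not a dict (for example an empty
    list when nothing was recognized) yields no alternatives.
    """
    if not isinstance(response, dict):
        return []
    pairs = []
    for alt in response.get("alternative", []):
        pairs.append((alt.get("transcript", ""), alt.get("confidence")))
    return pairs


# Made-up sample in the assumed shape; in the notebook this would be
# response = r.recognize_google(audio, show_all=True)
sample = {
    "alternative": [
        {"transcript": "a zestful food is the hot cross bun", "confidence": 0.9},
        {"transcript": "a zestful food is be hot cross bun"},
    ],
    "final": True,
}

alternatives = list_alternatives(sample)
for transcript, confidence in alternatives:
    print(confidence, transcript)
```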

# Audio from a Microphone (probably not needed immediately, but why not)

In [8]:
!brew install portaudio
!pip install pyaudio

Updating Homebrew...
==> Auto-updated Homebrew!
Updated 3 taps (homebrew/cask-versions, homebrew/core and homebrew/cask).
==> Updated Formulae
pandoc ✔            binaryen            libnotify           wireguard-tools
wget ✔              istioctl            links               youtube-dl
algernon            krakend             lmod
apr                 libcroco            ruby@2.4
==> Deleted Formulae
ruby@2.3

To reinstall 19.6.0, run `brew reinstall portaudio`


In [9]:
import speech_recognition as sr

In [10]:
# create instance of recognizer class
r = sr.Recognizer()

Now, instead of using an audio file as the source, you will use the default system microphone. You can access it by creating an instance of the Microphone class.

In [11]:
# create instance of microphone class
mic = sr.Microphone()

You can get a list of microphone names available by calling the list_microphone_names() static method of the Microphone class.

In [12]:
sr.Microphone.list_microphone_names()

['Built-in Microphone', 'Built-in Output', 'Multi-Output Device']

In [13]:
# This is just an example of how to switch to different microphones, no need to run
# mic = sr.Microphone(device_index=3)
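Rather than hard-coding a `device_index`, the index can be looked up by name, since a name's position in the list returned by `list_microphone_names()` is its device index. A minimal sketch, assuming a case-insensitive substring match is good enough; `find_mic_index` is a hypothetical helper, and the names are the ones printed above.

```python
def find_mic_index(names, keyword):
    """Return the index of the first device whose name contains keyword, else None."""
    keyword = keyword.lower()
    for index, name in enumerate(names):
        if keyword in name.lower():
            return index
    return None


# Device names as listed by sr.Microphone.list_microphone_names() above.
names = ['Built-in Microphone', 'Built-in Output', 'Multi-Output Device']

idx = find_mic_index(names, "microphone")
print(idx)  # 0

# In the notebook, the result could then be passed on:
# mic = sr.Microphone(device_index=idx)
```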

### Start speaking as soon as you run the code below; recording stops on its own once it detects silence

In [14]:
with mic as source:
    audio = r.listen(source)

### Transcription of the recorded audio below

In [15]:
r.recognize_google(audio)

'testing microphone one two three testing this microphone'