# Speech Understanding 
# Lecture 9: The SpeechRecognition Module


### Mark Hasegawa-Johnson, KCGI, December 17, 2022

In today's lecture, we will learn how to use the <a href="https://pypi.org/project/SpeechRecognition/">SpeechRecognition</a> module to access high-performance commercial and open-source speech recognizers.

Here are the contents:
1. <a href="#section_1">Installing SpeechRecognition</a>
1. <a href="#section_2">Using speech_recognition from the microphone</a>
1. <a href="#section_3">Using speech_recognition to perform a web search</a>
1. <a href="#section_4">Using speech_recognition from an audio file</a>
1. <a href="#homework">Homework</a>


<a id='section_1'></a>

## 1. Installing SpeechRecognition

The SpeechRecognition module can be installed using pip.  The first command below lists the packages that are already installed; the next two install SpeechRecognition and PyAudio, which SpeechRecognition needs for microphone input:

In [8]:
!pip freeze

anyio==3.6.2
appdirs==1.4.4
appnope==0.1.3
argon2-cffi==21.3.0
argon2-cffi-bindings==21.2.0
asttokens==2.0.8
attrs==22.1.0
audioread==3.0.0
backcall==0.2.0
beautifulsoup4==4.11.1
bleach==5.0.1
certifi==2022.9.24
cffi==1.15.1
charset-normalizer==2.1.1
contourpy==1.0.5
cycler==0.11.0
debugpy==1.6.3
decorator==5.1.1
defusedxml==0.7.1
entrypoints==0.4
executing==1.1.1
fastjsonschema==2.16.2
fonttools==4.38.0
idna==3.4
importlib-metadata==5.0.0
ipykernel==6.16.2
ipython==8.5.0
ipython-genutils==0.2.0
ipywebrtc==0.6.0
ipywidgets==8.0.2
jedi==0.18.1
Jinja2==3.1.2
joblib==1.2.0
jsonschema==4.16.0
jupyter==1.0.0
jupyter-console==6.4.4
jupyter-server==1.21.0
jupyter_client==7.4.4
jupyter_core==4.11.2
jupyterlab-pygments==0.2.2
jupyterlab-widgets==3.0.3
kiwisolver==1.4.4
librosa==0.9.2
llvmlite==0.39.1
MarkupSafe==2.1.1
matplotlib==3.6.1
matplotlib-inline==0.1.6
mistune==2.0.4
nbclassic==0.4.5
nbclient==0.7.0
nbconvert==7.2.3
nbformat==5.7.0
nes

In [1]:
!pip install SpeechRecognition

Collecting SpeechRecognition
  Downloading SpeechRecognition-3.9.0-py2.py3-none-any.whl (32.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 32.8/32.8 MB 10.2 MB/s eta 0:00:00
Installing collected packages: SpeechRecognition
Successfully installed SpeechRecognition-3.9.0

[notice] A new release of pip available: 22.3 -> 22.3.1
[notice] To update, run: pip install --upgrade pip


In [1]:
!pip install pyaudio

Collecting pyaudio
  Using cached PyAudio-0.2.12.tar.gz (42 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metadata ... done
Building wheels for collected packages: pyaudio
  Building wheel for pyaudio (PEP 517) ... done
  Created wheel for pyaudio: filename=PyAudio-0.2.12-cp39-cp39-macosx_10_9_universal2.whl size=38250 sha256=c5c0f2cfb3ac0a658e0f4641a84b9b2641ad5ae85b52e8025529b8d55a296db6
  Stored in directory: /Users/yuuiri/Library/Caches/pip/wheels/66/6c/95/76afa159d5be5215b6f6499625a0ddf1134e60f85beeba6e7a
Successfully built pyaudio
Installing collected packages: pyaudio
Successfully installed pyaudio-0.2.12
You should consider upgrading via the '/Users/yuuiri/PycharmProjects/college/venv/bin/python -m pip install --upgrade pip' command.


The SpeechRecognition package is a Python interface that connects, in the back end, to many different speech recognizers.  For the full list, see https://pypi.org/project/SpeechRecognition/

To start with, let's use the Google speech recognizer.  This one only works if you're connected to the internet.

In [2]:
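import speech_recognition as sr
speech = sr.Recognizer()
print('Python is listening...')
with sr.Microphone() as source:
    # Calibrate the energy threshold to the background noise level
    speech.adjust_for_ambient_noise(source)
    # Record from the microphone until the speaker pauses
    audio = speech.listen(source)
    # Send the recording to the Google recognizer and get back text
    inp = speech.recognize_google(audio)
    print('You just said',inp,'.')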

ModuleNotFoundError: No module named 'speech_recognition'
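
A `ModuleNotFoundError` like the one above can appear even after a successful `pip install`. One common cause is that pip installed the package into a different Python environment than the one the notebook kernel is running. The following is a minimal sketch, using only the standard library, that reports which interpreter is active and whether it can find the two modules; the helper name `is_installed` is made up for this example.

```python
import importlib.util
import sys

def is_installed(module_name):
    """Return True if the active interpreter can find the module (hypothetical helper)."""
    return importlib.util.find_spec(module_name) is not None

# The path printed here is the Python that the notebook kernel is actually using
print("Notebook kernel is running:", sys.executable)

# speech_recognition and pyaudio are the import names of the two packages installed above
for name in ["speech_recognition", "pyaudio"]:
    status = "found" if is_installed(name) else "NOT found"
    print(name, status)
```

If a module is reported as not found, installing it from inside the notebook with `%pip install SpeechRecognition` (instead of `!pip install`) is one way to target the kernel's own environment.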

<a id='section_2'></a>

## 2. Using speech_recognition from the microphone

We can use python's <a href="https://docs.python.org/3/tutorial/errors.html">exception handling</a> in case the speech recognizer has trouble recognizing what we say:

In [None]:
import speech_recognition as sr
speech = sr.Recognizer()

while True:
    print('Python is listening...')
    with sr.Microphone() as source:
        speech.adjust_for_ambient_noise(source)
        try:
            audio = speech.listen(source)
            inp = speech.recognize_google(audio)
            print('You just said',inp,'.')
        except sr.UnknownValueError:
            continue
        except sr.RequestError:
            continue
        except sr.WaitTimeoutError:
            continue
        if inp=="stop listening":
            print('Goodbye!')
            break
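
The loop above stops only when the recognized text is exactly `"stop listening"`. A recognizer may return different capitalization or spacing, so a more forgiving comparison can help. The sketch below shows one way to do this; the helper `is_stop_command` is a hypothetical name, not part of the speech_recognition module.

```python
def is_stop_command(text, stop_phrase="stop listening"):
    """Compare recognized text to the stop phrase, ignoring case and extra spaces."""
    # Lowercase the text and collapse any run of whitespace into a single space
    normalized = " ".join(text.lower().split())
    return normalized == stop_phrase

print(is_stop_command("Stop  Listening"))        # matches despite case and spacing
print(is_stop_command("stop listening please"))  # extra words do not match
```

In the loop, `if inp=="stop listening":` could then be written as `if is_stop_command(inp):`.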

<a id='section_3'></a>

## 3. Using speech_recognition to perform a web search

The speech recognizer can now be used to give text input for any application.  For example, let's try using it to search the web.  

To start with, here's how we open a web page in python:


In [None]:
import webbrowser
webbrowser.open("http://wsj.com")

Now let's use the speech recognizer to input the web page:

In [None]:
import speech_recognition as sr
import webbrowser
speech = sr.Recognizer()

while True:
    print('Python is listening...')
    with sr.Microphone() as source:
        speech.adjust_for_ambient_noise(source)
        try:
            audio = speech.listen(source)
            inp = speech.recognize_google(audio)
            print('You just said',inp,'.')
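            # str.replace returns a new string, so the result must be assigned
            inp = inp.replace('browser ', '')
            # Don't try to open a web page for the stop command
            if inp != "stop listening":
                webbrowser.open("http://" + inp)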
        except sr.UnknownValueError:
            continue
        except sr.RequestError:
            continue
        except sr.WaitTimeoutError:
            continue
        if inp=="stop listening":
            print('Goodbye!')
            break

Finally, let's use speech recognition to perform a web search.  To do that, all we need is to replace this line:

```webbrowser.open("http://" + inp)```

...with this one:

```webbrowser.open("http://google.com/search?q=" + inp)```

In [None]:
import speech_recognition as sr
import webbrowser
speech = sr.Recognizer()

while True:
    print('Python is listening...')
    with sr.Microphone() as source:
        speech.adjust_for_ambient_noise(source)
        try:
            audio = speech.listen(source)
            inp = speech.recognize_google(audio)
            print('You just said',inp,'.')
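            # str.replace returns a new string, so the result must be assigned
            inp = inp.replace('browser ', '')
            # Don't run a web search for the stop command
            if inp != "stop listening":
                webbrowser.open("http://google.com/search?q=" + inp)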
        except sr.UnknownValueError:
            continue
        except sr.RequestError:
            continue
        except sr.WaitTimeoutError:
            continue
        if inp=="stop listening":
            print('Goodbye!')
            break
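
Recognized speech usually contains spaces, and it may contain characters such as `&` that have a special meaning inside a URL. A minimal sketch of a safer way to build the search address, using `urllib.parse.quote_plus` from the standard library, is shown below. The helper name `search_url` is invented for this example.

```python
from urllib.parse import quote_plus

def search_url(query):
    """Build a Google search URL with the query text safely encoded."""
    # quote_plus turns spaces into '+' and escapes characters such as '&'
    return "http://google.com/search?q=" + quote_plus(query)

print(search_url("weather in Kyoto & Osaka"))
# prints http://google.com/search?q=weather+in+Kyoto+%26+Osaka
```

With this helper, the call in the loop would become `webbrowser.open(search_url(inp))`.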

<a id="section_4"></a>

## 4. Using speech_recognition from an audio file

If you have an audio file, you can use the speech_recognition module to transcribe it.  For example, let's download the audio file we used in lecture 4:

In [None]:
import urllib.request, soundfile, IPython.display

example_url = "https://catalog.ldc.upenn.edu/desc/addenda/LDC93S1.wav"
webdata = urllib.request.urlopen(example_url).read()
with open("webdata.wav", "wb") as f:
    f.write(webdata)
    
speech_wave, speech_rate = soundfile.read("webdata.wav")
IPython.display.Audio(data=speech_wave, rate=speech_rate)
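
Before transcribing a file, it can be useful to check its basic properties, such as the sampling rate and duration. The sketch below uses Python's standard `wave` module, which reads uncompressed PCM WAV files. The helper `wav_info` and the file `demo_silence.wav` are made up for this example; the file is generated on the spot so that the code runs without a download.

```python
import os
import tempfile
import wave

def wav_info(path):
    """Return basic properties of a PCM WAV file (hypothetical helper)."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
        }

# Write one second of 16-bit mono silence at 16 kHz to a temporary file
demo_path = os.path.join(tempfile.gettempdir(), "demo_silence.wav")
with wave.open(demo_path, "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)
    wav.setframerate(16000)
    wav.writeframes(b"\x00\x00" * 16000)

print(wav_info(demo_path))
```

The same function could be applied to `webdata.wav`, assuming that file is stored as uncompressed PCM.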

Now let's use speech_recognition to transcribe it:

In [None]:
import speech_recognition as sr
speech = sr.Recognizer()
with sr.AudioFile("webdata.wav") as source:
    audio = speech.record(source)
    inp = speech.recognize_google(audio)
    print('The person in this audio file said:',inp,'.')
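
The Google recognizer can also report several candidate transcripts instead of a single string, for example by passing `show_all=True` to `recognize_google`. The sketch below assumes the raw result has the same shape as the dictionary printed in the homework output later in this lecture: an `alternative` list whose entries have a `transcript` and, sometimes, a `confidence`. The helper `best_transcript` is a hypothetical name, and the sample dictionary is shortened from that output.

```python
def best_transcript(result):
    """Pick the highest-confidence transcript from a raw result dictionary."""
    # An empty or non-dictionary result is treated as "nothing recognized"
    alternatives = result.get("alternative", []) if isinstance(result, dict) else []
    if not alternatives:
        return None
    # Entries without a confidence score are treated as confidence 0.0
    best = max(alternatives, key=lambda alt: alt.get("confidence", 0.0))
    return best["transcript"]

# Shortened copy of the result shown in the homework section
sample = {
    "alternative": [
        {"confidence": 0.83203697,
         "transcript": "she has a duck suit and Gracie washer all year"},
        {"transcript": "she has a duck suit and greasy washer all year"},
    ],
    "final": True,
}
print(best_transcript(sample))
```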

<a id='homework'></a>

## Homework for Week 9

Create a text file called `week9.py`.

This file should `def` a function called `transcribe_wavefile`, with the following parameters:
* Input: str = name of the input file
* Return: str = recognized text  

Here is a template that you can cut and paste, to get started:

In [None]:
import speech_recognition as sr

def transcribe_wavefile(filename):
    '''
    Use sr.AudioFile(filename) as the source,
    recognize from that source,
    and return the recognized text.
    '''
    raise NotImplementedError("You need to write this part!")
    return inp

Test whether your code works by running the following block:

In [3]:
import week9, importlib
importlib.reload(week9)

inp = week9.transcribe_wavefile("webdata.wav")
print(inp)

result2:
{   'alternative': [   {   'confidence': 0.83203697,
                           'transcript': 'she has a duck suit and Gracie '
                                         'washer all year'},
                       {   'transcript': 'she has a duck suit and greasy '
                                         'washer all year'},
                       {   'transcript': 'she has a duck suit and greasy water '
                                         'all year'},
                       {   'transcript': 'she has a duck suit and greasy wash '
                                         'water all year'},
                       {   'transcript': 'she has a duck suit and Gracie wash '
                                         'water all year'}],
    'final': True}
The person in this audio file said: she has a duck suit and Gracie washer all year .
she has a duck suit and Gracie washer all year


When the block above is working, try uploading your text file `week9.py` to <a href="https://www.gradescope.com/">Gradescope</a>.  The autograder checks the following things:

1. Did you submit a text file called `week9.py`?
1. Does your text file contain a function called `transcribe_wavefile`?
1. Does `week9.transcribe_wavefile` return a string?
1. Does `week9.transcribe_wavefile("webdata.wav")` return the string `she has a duck suit and Gracie washer all year`?
1. Does `week9.transcribe_wavefile` also work if applied to a secret audio file that is different from `webdata.wav`?