### Speech Recognition

- Speech recognition converts spoken language into text.


- Speech recognition systems have many real-life applications. For example, Apple's Siri recognizes speech, converts it into text, and then acts on that text.


#### 1. How Does it Work

    1.1 Speech input, i.e. an audio file or audio captured via a microphone
    1.2 Convert physical sound into electrical signals
    1.3 Convert the electrical signals to digital data using an analog-to-digital converter
    1.4 Build a model on the digitized data and use it to transcribe the audio into text
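
The digitization step above can be illustrated with a minimal sketch that uses only the Python standard library. It is not how any particular recognizer is implemented: a pure sine tone stands in for the microphone signal, and the 16 kHz sample rate, the 16-bit sample width, and the function names `sample_and_quantize` and `to_wav_bytes` are assumptions chosen for illustration.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 16_000  # samples per second (assumed value)
DURATION_S = 0.01     # 10 ms of audio
FREQ_HZ = 440.0       # a pure tone standing in for a real sound


def sample_and_quantize(freq_hz, duration_s, sample_rate):
    """Sample a sine wave and quantize each sample to a signed 16-bit integer."""
    n_samples = int(round(duration_s * sample_rate))
    samples = []
    for n in range(n_samples):
        t = n / sample_rate                                 # time of the n-th sample
        analog_value = math.sin(2 * math.pi * freq_hz * t)  # value in [-1.0, 1.0]
        samples.append(int(round(analog_value * 32767)))    # scale to the 16-bit range
    return samples


def to_wav_bytes(samples, sample_rate):
    """Pack 16-bit mono samples into an in-memory WAV file."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return buffer.getvalue()


pcm_samples = sample_and_quantize(FREQ_HZ, DURATION_S, SAMPLE_RATE)
wav_bytes = to_wav_bytes(pcm_samples, SAMPLE_RATE)
print(len(pcm_samples), "samples,", len(wav_bytes), "bytes of WAV data")
```

The list of integers is the kind of digitized data that a speech model would consume in step 1.4.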


- Models such as Hidden Markov Models (HMMs) and deep neural networks are used to convert the audio into text.


- A full description of the process is beyond the scope of this blog.


- This is a simple demonstration of how to convert speech to text using Python.


- I'll use the `SpeechRecognition` library and the `PyAudio` library.


- The [SpeechRecognition library](https://pypi.org/project/SpeechRecognition/) supports several speech recognition APIs; here I'll use the Google speech recognition API.


#### 2. Python Libraries


In [1]:
# Install Speech Recognition API
!pip install SpeechRecognition

Collecting SpeechRecognition
  Downloading SpeechRecognition-3.8.1-py2.py3-none-any.whl (32.8 MB)
Installing collected packages: SpeechRecognition
Successfully installed SpeechRecognition-3.8.1




#### 3. Convert an Audio File into Text

- Import the speech recognition library


- Initialize the recognizer class, which is used to recognize the speech


- Audio formats supported by the speech recognition library: WAV, AIFF, AIFF-C, FLAC


- I'll use a WAV audio clip from a Winston Churchill speech about the Nazis.


- By default, the Google recognizer expects English. It supports other languages as well; for details, see the [documentation](https://cloud.google.com/speech-to-text/docs/languages).


- A single speech recognition 'session' has a maximum time limit, which seems to be around 60 seconds.
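
The format and time-limit notes above suggest two checks before sending a file for recognition. The sketch below uses only the standard library. The extension list is an assumption that mirrors the formats named above, the 60-second figure is the approximate limit mentioned above rather than a documented constant, and the helper names (`has_supported_extension`, `wav_duration_seconds`, `chunk_windows`) are hypothetical.

```python
import os
import wave

# Assumed file extensions for WAV, AIFF, AIFF-C and FLAC
SUPPORTED_EXTENSIONS = {".wav", ".aiff", ".aif", ".aifc", ".flac"}

# Approximate per-request limit noted above; an assumption, not a documented value
MAX_CHUNK_SECONDS = 60


def has_supported_extension(filename):
    """Return True if the file extension matches one of the listed formats."""
    return os.path.splitext(filename)[1].lower() in SUPPORTED_EXTENSIONS


def wav_duration_seconds(wav_file):
    """Return the duration of a WAV file given as a path or binary file object."""
    with wave.open(wav_file, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def chunk_windows(total_seconds, chunk_seconds=MAX_CHUNK_SECONDS):
    """Split a recording into (offset, duration) windows of at most chunk_seconds."""
    windows = []
    offset = 0.0
    while offset < total_seconds:
        duration = min(chunk_seconds, total_seconds - offset)
        windows.append((offset, duration))
        offset += chunk_seconds
    return windows


print(has_supported_extension("nazi.wav"))
print(chunk_windows(150))
```

Each window could then be read separately with `Recognizer.record(source, offset=..., duration=...)` on a freshly opened `sr.AudioFile`, so that no single request exceeds the limit.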

In [3]:
# Import speech recognition library
import speech_recognition as sr

# Initialize recognizer class
r = sr.Recognizer()

def print_audio_text(filename, language_code='en-US'):

    # Read the audio file as the source
    with sr.AudioFile(filename) as source:

        # Capture audio from the file. listen() stops at the first long pause;
        # r.record(source) would read the whole file instead
        audio_data = r.listen(source)

        try:

            # Send the audio to the Google speech recognition API
            text = r.recognize_google(audio_data, language=language_code)

            print('Converting audio into text ...')
            print(text)

        except sr.UnknownValueError:

            # Raised when the speech could not be understood
            print('Sorry, could not understand the audio')

        except sr.RequestError:

            # Raised when the API is unreachable or the request fails
            print('Sorry.. run again...')

print_audio_text('nazi.wav')

Converting audio into text ...
many people think that the best way to escape War get to dwell at 4


#### 4. Converting to a Different Language

To transcribe a French-language audio file, pass a language code through the `language` parameter of the `recognize_google` function. Refer to the [documentation](https://cloud.google.com/speech-to-text/docs/languages) for the supported codes.

In [4]:
# Adding French language code
print_audio_text('french.wav', language_code='fr-FR')

Converting audio into text ...
bonjour monsieur bonjour Maman et Lyndon la secrétaire vous appeler il y a 2 jours pour faire une réservation ah oui monsieur bean donne une chambre seule avec baignoire vous avez réservé pour 5 nuits c'est exact voici notre chambre 409 c'est au 4e étage que c'est la dernière vous passez une bonne soirée merci
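
To avoid typing language codes by hand, a small lookup table can map readable names to codes. This is only a sketch: the `LANGUAGE_CODES` table and the `resolve_language_code` helper are hypothetical, and the table holds only the codes used in this post plus the library's default of `en-US`.

```python
# Hypothetical lookup table; extend it with codes from the documentation as needed
LANGUAGE_CODES = {
    "english": "en-US",
    "french": "fr-FR",
    "mandarin": "cmn-Hans-CN",
}


def resolve_language_code(name, default="en-US"):
    """Return the language code for a readable name, or the default if unknown."""
    return LANGUAGE_CODES.get(name.strip().lower(), default)


print(resolve_language_code("French"))
```

The result can be passed straight to the transcription function, e.g. `print_audio_text('french.wav', language_code=resolve_language_code('French'))`.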


#### 5. Microphone Speech into Text

Install the PyAudio library, which enables audio input and output through the microphone and speaker. 

In [1]:
!pip install PyAudio-0.2.11-cp37-cp37m-win_amd64.whl



In [1]:
import speech_recognition as sr

r = sr.Recognizer()

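def print_microphone_text(language_code='en-US'):

    # Reading microphone as source
    with sr.Microphone() as source:

        print('Please speak into microphone')
        audio_text = r.listen(source)
        print('Time\'s up, thanks')

        try:
            # Use Google speech recognition. The language must be passed by keyword,
            # because the second positional parameter is the API key
            print('Text: ' + r.recognize_google(audio_text, language=language_code))

        except sr.UnknownValueError:
            # Raised when the speech could not be understood
            print('Sorry, I did not get that')

        except sr.RequestError:
            # Raised when the API is unreachable or the request fails
            print('Sorry, the speech service could not be reached')

print_microphone_text()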

Please speak into microphone
Time's up, thanks
Sorry, I did not get that
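
Microphone input tends to be more sensitive to background noise and silence than a clean audio file. The sketch below shows one possible way to make capture more robust, using the library's `adjust_for_ambient_noise` method and the `timeout` and `phrase_time_limit` arguments of `listen`. The function name `capture_and_transcribe` and the timing values are assumptions for illustration, and whether this helps depends on the microphone and the room.

```python
def capture_and_transcribe(recognizer, source, language_code="en-US",
                           wait_timeout=5, phrase_time_limit=15):
    """Calibrate for background noise, record one phrase, and transcribe it.

    `recognizer` is expected to be an sr.Recognizer and `source` an opened
    sr.Microphone. Both are passed in so the function can also be tried
    with stand-in objects when no microphone is available.
    """
    # Sample about one second of background noise to set the energy threshold
    recognizer.adjust_for_ambient_noise(source, duration=1)

    # Give up if no speech starts within wait_timeout seconds, and stop
    # recording after phrase_time_limit seconds of speech
    audio = recognizer.listen(source, timeout=wait_timeout,
                              phrase_time_limit=phrase_time_limit)

    # Pass the language by keyword; the second positional parameter is the API key
    return recognizer.recognize_google(audio, language=language_code)


def main():
    # Requires the SpeechRecognition and PyAudio packages and a working microphone
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    try:
        with sr.Microphone() as source:
            print('Please speak into microphone')
            print('Text: ' + capture_and_transcribe(recognizer, source))
    except sr.WaitTimeoutError:
        print('No speech detected before the timeout')
    except sr.UnknownValueError:
        print('Sorry, I did not get that')
    except sr.RequestError:
        print('Sorry, the speech service could not be reached')

# Call main() to try this with a real microphone.
```

Handling the three exceptions separately makes it easier to tell silence, unintelligible speech, and network problems apart.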


#### 6. Microphone Speech in a Different Language

As before, passing a language code through to the `language` parameter of the `recognize_google` function enables recognition in that language.

In [7]:
print_microphone_text('cmn-Hans-CN')

Please speak into microphone
Time's up, thanks
Sorry, I did not get that
