## <b><font color='darkblue'>Preface</font></b>
([article source](https://towardsdatascience.com/easy-speech-to-text-with-python-3df0d973b426)) <b><font size='3ptx'>Speech is the most common means of communication, and most of the world's population relies on it to communicate with one another. A speech recognition system translates spoken language into text.</font> There are many real-life examples of speech recognition systems. For example, Apple Siri recognizes speech and transcribes it into text.</b>

### <b><font color='darkgreen'>How does Speech recognition work?</font></b>
<b><font size='3ptx'>Models such as Hidden Markov Models (HMMs) and deep neural networks are used to convert audio into text.</font></b>

![SST process](images/sst_process.PNG)

A detailed walkthrough of that process is beyond the scope of this blog. Here I demonstrate how to convert speech to text using Python, with the help of the “[**Speech Recognition**](https://pypi.org/project/SpeechRecognition/)” library and the “[**PyAudio**](https://pypi.org/project/PyAudio/)” library.

<b>The Speech Recognition library supports several APIs; in this blog I use the Google speech recognition API</b>. For more details, please check [the project page](https://pypi.org/project/SpeechRecognition/). It converts speech into text.

### <b><font color='darkgreen'>Python Libraries</font></b>

In [1]:
#!pip install SpeechRecognition

## <b><font color='darkblue'>Convert an audio file into text</font></b>
Below are the steps to convert an audio file into text:
1. Import the speech recognition library.
2. Initialize the recognizer class in order to recognize the speech. We are using Google speech recognition.
3. Audio file formats supported by speech recognition: WAV, AIFF, AIFF-C, FLAC. I used a ‘wav’ file in this example.
4. The original article used an audio clip from the movie ‘Taken’, which says “I don’t know who you are. I don’t know what you want. If you’re looking for ransom, I can tell you I don’t have money”. The example below uses a sample from the Open Speech Repository instead.
5. By default, the Google recognizer reads English. It supports different languages; for more details, please check [the documentation](https://cloud.google.com/speech-to-text/docs/languages).
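
As a small illustration of step 3, the sketch below checks a file's extension before handing it to the recognizer. The helper name `is_supported_audio` and the exact set of extensions are my own assumptions for this example, not part of the library, and an extension check says nothing about the encoding inside the file.

```python
from pathlib import Path

# Extensions assumed to correspond to the formats listed above
# (WAV, AIFF, AIFF-C, FLAC); the exact set is an illustrative assumption.
SUPPORTED_EXTENSIONS = {".wav", ".aiff", ".aif", ".aifc", ".flac"}


def is_supported_audio(path):
    """Return True if the file extension looks like a supported format."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


print(is_supported_audio("datas/OSR_us_000_0010_8k.wav"))  # True
print(is_supported_audio("clip.mp3"))                      # False
```

A file that fails this check (an MP3, for instance) would need to be converted to one of the listed formats first.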

### <b><font color='darkgreen'>Simple Code example</font></b>
Here we use the wave file below as an example to test the speech-to-text API:
- [datas/OSR_us_000_0010_8k.wav](https://www.voiptroubleshooter.com/open_speech/american/OSR_us_000_0010_8k.wav)

In [2]:
#import library
import speech_recognition as sr

TEST_WAVE_FILE_PATH = 'datas/OSR_us_000_0010_8k.wav'

In [3]:
# Initialize recognizer class (for recognizing the speech)
r = sr.Recognizer()

In [4]:
# Read the audio file as the source
# Listen to the audio file and store the result in the audio_text variable

with sr.AudioFile(TEST_WAVE_FILE_PATH) as source:
    audio_text = r.listen(source)
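
Before sending a WAV file to the recognizer, it can help to confirm what the file actually contains. The sketch below uses only Python's standard `wave` module. The helper name `wav_info` is hypothetical, and the demo runs on a generated one-second silent file rather than on the sample download, so it works without any extra data.

```python
import os
import tempfile
import wave


def wav_info(path):
    """Return channel count, sample rate (Hz) and duration (s) of a WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "duration_s": frames / rate,
        }


# Demo file: one second of 16-bit mono silence at 8 kHz (generated, not real speech)
demo_path = os.path.join(tempfile.mkdtemp(), "silence.wav")
with wave.open(demo_path, "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)
    wav.setframerate(8000)
    wav.writeframes(b"\x00\x00" * 8000)

print(wav_info(demo_path))
```

Calling `wav_info(TEST_WAVE_FILE_PATH)` on the sample file would report its properties in the same way.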

In [6]:
# recognize_google() raises sr.RequestError if the API is unreachable and
# sr.UnknownValueError if the speech is unintelligible, hence the exception handling
try:
    # using google speech recognition
    text = r.recognize_google(audio_text)
    print('Converting audio transcripts into text ...')
    print(text)
except (sr.UnknownValueError, sr.RequestError):
    print('Sorry.. run again...')

Converting audio transcripts into text ...
do birds canoe slid on the smooth planks glue the sheet to the dark blue background it is easy to tell the depth of a well these days a chicken leg is a rare dish rice is often served in round Bowls the juice of lemons makes fine punch the box was thrown beside the park truck the dogs are fed chopped corn and garbage
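
The file-to-text steps can also be wrapped in one function. This is a sketch under a few assumptions: `transcribe_file` is a name I made up, it swaps `listen()` for the library's `record()` method (which reads from the file directly and accepts `offset` and `duration` in seconds), and it treats the two failure modes differently instead of printing a single message.

```python
def transcribe_file(path, language="en-US", offset=None, duration=None):
    """Transcribe an audio file; return the text, or None if nothing was understood."""
    # Imported here so the function can be defined before the package is installed
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        # record() reads from the file; offset and duration are in seconds
        audio = recognizer.record(source, offset=offset, duration=duration)

    try:
        return recognizer.recognize_google(audio, language=language)
    except sr.UnknownValueError:
        # The service responded but could not make out any speech
        return None
    except sr.RequestError as err:
        # Network problem, or the API rejected the request
        raise RuntimeError(f"Speech API request failed: {err}") from err
```

For example, `transcribe_file('datas/OSR_us_000_0010_8k.wav', duration=10)` would send only the first ten seconds of the sample to the API.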


### <b><font color='darkgreen'>How about converting different audio language?</font></b>
For example, if we want to read a French-language audio file, then we <b>need to add the language option to recognize_google()</b>. The remaining code stays the same. Please refer to [the documentation](https://cloud.google.com/speech-to-text/docs/languages) for more:
```python
# Adding the French language option
text = r.recognize_google(audio_text, language="fr-FR")
```
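
If you switch languages often, a small lookup table keeps the tags in one place. The sketch below is only an illustration: `LANGUAGE_TAGS` and `language_tag` are hypothetical names, and the table holds just the tags used in this post plus `en-US`, which I am assuming as the English default.

```python
# Tags used in this post, plus the assumed English default; extend as needed
LANGUAGE_TAGS = {
    "english": "en-US",
    "french": "fr-FR",
    "tamil": "ta-IN",
}


def language_tag(name):
    """Look up the language tag for a human-readable language name."""
    try:
        return LANGUAGE_TAGS[name.strip().lower()]
    except KeyError:
        raise ValueError(f"No language tag configured for {name!r}") from None


print(language_tag("French"))  # fr-FR
```

The result can be passed straight to the recognizer, for example `r.recognize_google(audio_text, language=language_tag("French"))`.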

### <b><font color='darkgreen'>Microphone speech into text</font></b>
In order to receive audio from the microphone, we follow the steps below:
1. We need to install the PyAudio library, which is used to receive audio input and output through the microphone and speaker. Basically, it helps to get our voice through the microphone.

In [8]:
#!pip install PyAudio

2. Instead of an audio file source, we have to use the Microphone class. The remaining steps are the same.
  
```python
# import library
import speech_recognition as sr

# Initialize recognizer class (for recognizing the speech)
r = sr.Recognizer()

# Read the microphone as the source
# Listen to the speech and store it in the audio_text variable
with sr.Microphone() as source:
    print("Talk")
    audio_text = r.listen(source)
    print("Time over, thanks")

# recognize_google() raises sr.RequestError if the API is unreachable and
# sr.UnknownValueError if the speech is unintelligible, hence the exception handling
try:
    # using google speech recognition
    print("Text: " + r.recognize_google(audio_text))
except (sr.UnknownValueError, sr.RequestError):
    print("Sorry, I did not get that")
```
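
In a noisy room, or when nobody speaks, the plain `listen()` call can wait a long time. The sketch below shows one way to bound it. It relies on the library's `adjust_for_ambient_noise()` method and on the `timeout` and `phrase_time_limit` arguments of `listen()`; the function name `listen_once` and the chosen defaults (5 and 10 seconds) are my own assumptions.

```python
def listen_once(language="en-US", timeout=5, phrase_time_limit=10):
    """Capture one phrase from the default microphone and return its transcript.

    Returns None if nobody spoke within `timeout` seconds or the speech
    could not be understood.
    """
    # Imported here so the function can be defined before the packages are installed;
    # microphone access additionally requires PyAudio
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        # Sample background noise for one second to calibrate the energy threshold
        recognizer.adjust_for_ambient_noise(source, duration=1)
        try:
            audio = recognizer.listen(
                source, timeout=timeout, phrase_time_limit=phrase_time_limit
            )
        except sr.WaitTimeoutError:
            # No speech started within `timeout` seconds
            return None

    try:
        return recognizer.recognize_google(audio, language=language)
    except sr.UnknownValueError:
        # The service responded but could not make out any speech
        return None
```

A `sr.RequestError` (for example, no internet connection) is deliberately left to propagate so the caller can tell "nothing heard" apart from "API unreachable".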

### <b><font color='darkgreen'>How about talking in a different language?</font></b>
Again, we need to add the required language option in <font color='blue'>recognize_google()</font>. I am talking in Tamil, an Indian language, and adding “ta-IN” as the language option:
```python
# Adding the Tamil language option
print("Text: " + r.recognize_google(audio_text, language="ta-IN"))
```

## <b><font color='darkblue'>Summary</font></b>
<b><font size='3ptx'>The Google speech recognition API is an easy way to convert speech into text, but it requires an internet connection to operate.</font></b>

In this blog, we have seen how to convert speech into text using the Google speech recognition API. This is very helpful for NLP projects, especially those handling audio transcript data. You can also read [this article on KDnuggets](https://www.kdnuggets.com/2020/06/easy-speech-text-python.html).

## <b><font color='darkblue'>Supplement</font></b>
* [OPEN - The Open Speech Repository](https://www.voiptroubleshooter.com/open_speech/american.html)