In [1]:
# pip install SpeechRecognition
import speech_recognition as sr
sr.__version__

'3.8.1'

In [2]:
# Creating a Recognizer instance 
r = sr.Recognizer()

### Each Recognizer instance has seven methods for recognizing speech from an audio source using various APIs

- recognize_bing(): Microsoft Bing Speech

- recognize_google(): Google Web Speech API

- recognize_google_cloud(): Google Cloud Speech - requires installation of the google-cloud-speech package

- recognize_houndify(): Houndify by SoundHound

- recognize_ibm(): IBM Speech to Text

- recognize_sphinx(): CMU Sphinx - requires installing PocketSphinx

- recognize_wit(): Wit.ai

In [3]:
my_audio_file = sr.AudioFile('bassel_audio.wav')
with my_audio_file as source:
    audio = r.record(source) 

The context manager opens the file and reads its contents, storing the data in an AudioFile instance 
called source. Then the record() method records the data from the entire file into an AudioData instance.

In [4]:
type(audio) 

speech_recognition.AudioData

### Supported File Types

Currently, SpeechRecognition supports the following file formats:

- WAV: must be in PCM/LPCM format
- AIFF
- AIFF-C
- FLAC: must be native FLAC format; OGG-FLAC is not supported
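
A quick way to inspect a WAV file before handing it to `sr.AudioFile` is sketched below. It relies on Python's standard-library `wave` module, which parses uncompressed PCM WAV files and raises `wave.Error` on data it cannot read. The helper name `describe_wav` is hypothetical, and the check is only a rough proxy for what SpeechRecognition accepts, not part of the library.

```python
import wave


def describe_wav(path):
    """Return basic PCM parameters, or None if the wave module cannot parse the file."""
    try:
        with wave.open(path, "rb") as wav:
            frame_rate = wav.getframerate()
            return {
                "channels": wav.getnchannels(),
                "sample_width_bytes": wav.getsampwidth(),
                "sample_rate_hz": frame_rate,
                "duration_s": wav.getnframes() / frame_rate,
            }
    except (wave.Error, EOFError):
        # Not a RIFF/WAVE file, a compressed WAV variant, or a truncated file.
        return None
```

Calling `describe_wav('bassel_audio.wav')` should return a dictionary for a PCM file and `None` for a file the `wave` module rejects; a missing file still raises `FileNotFoundError`.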

In [5]:
r.recognize_google(audio)

'this is a test file recording live from Great Falls Virginia'

In [6]:

my_audio_file = sr.AudioFile('bassel_audio.wav')
with my_audio_file as source:
    audio = r.record(source, offset=2)  # Skip the first 2 seconds, then record the rest of the file

In [7]:
r.recognize_google(audio)

'recording live from Great Falls Virginia'

In [8]:
my_audio_file = sr.AudioFile('bassel_audio.wav')
with my_audio_file as source:
    # The offset and duration keyword arguments are useful for segmenting an audio file
    # if you have prior knowledge of the structure of the speech in the file.
    # Here: skip the first 2 seconds, then record the next 2 seconds.
    audio = r.record(source, offset=2, duration=2)

In [9]:
r.recognize_google(audio)

'recording live from'
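
The `offset` and `duration` arguments used in the cells above can also tile a whole file into fixed-length pieces. The following is a minimal sketch of the arithmetic only; `chunk_windows` is a hypothetical helper, and the total length of 6 seconds in the commented usage is an assumed value, not a measured property of `bassel_audio.wav`.

```python
def chunk_windows(total_seconds, chunk_seconds):
    """Yield (offset, duration) pairs that cover total_seconds without overlap."""
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")
    offset = 0.0
    while offset < total_seconds:
        # The last window is shortened so it does not run past the end.
        duration = min(chunk_seconds, total_seconds - offset)
        yield offset, duration
        offset += chunk_seconds


# Hypothetical usage with the notebook's recognizer `r` (assumes a 6-second file):
# for offset, duration in chunk_windows(6, 2):
#     with sr.AudioFile('bassel_audio.wav') as source:
#         audio = r.record(source, offset=offset, duration=duration)
#     print(r.recognize_google(audio))
```

Fixed windows can cut a word in half at a boundary, as the 'recording live from' result suggests, so the chunk length is worth tuning against the speech in the file.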

In [10]:
my_audio_file = sr.AudioFile('OSR_us_000_0030_8k.wav')
with my_audio_file as source:
    audio = r.record(source)

In [11]:
r.recognize_google(audio)

"painted the sockets in the wall dull green the child crawled into the dense grass bribes fail we're honest men work trample the spark else the Flames will spread the hilt of the sword was carved with Fine Designs a round hole was drilled through the thin board Footprints showed the path he took up the beach she was waiting at my front lawn event near the edge brought in fresh air fraud the old mule with a Crooked Stick"

## Working With Microphones


In [None]:
# pip install pyaudio

In [12]:
# Create an instance of the Microphone class (uses the system default input device)
mic = sr.Microphone()

In [13]:
sr.Microphone.list_microphone_names()

['DisplayPort', 'MacBook Pro Microphone', 'MacBook Pro Speakers']
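
When the default device is not the one you want, the position of a name in this list can be passed to `sr.Microphone` as `device_index`. The sketch below picks an index by a case-insensitive substring match; `find_device_index` is a hypothetical helper, and matching on the word "microphone" is an assumption that happens to fit the list shown above.

```python
def find_device_index(names, keyword):
    """Return the index of the first device whose name contains keyword, else None."""
    keyword = keyword.lower()
    for index, name in enumerate(names):
        if keyword in name.lower():
            return index
    return None


# Hypothetical usage with the notebook's import `sr`:
# names = sr.Microphone.list_microphone_names()
# mic = sr.Microphone(device_index=find_device_index(names, "microphone"))
```

If no name matches, the helper returns `None`, and `sr.Microphone(device_index=None)` falls back to the default input device.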

In [14]:
with mic as source:
    audio = r.listen(source)

In [15]:
r.recognize_google(audio)

'solai from data science in Flatiron show me what you can do'

In [16]:
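# Handling ambient noise: adjust_for_ambient_noise() samples the source
# (1 second by default) to calibrate the recognizer's energy threshold before listening
# Make some noise while I record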
with mic as source:
    r.adjust_for_ambient_noise(source)
    audio = r.listen(source)

In [17]:
r.recognize_google(audio)

"is it me you're looking for"

In [20]:
# Recording speech in French; the language is specified later, at recognition time

with mic as source:
    r.adjust_for_ambient_noise(source)
    audio = r.listen(source)

In [21]:
r.recognize_google(audio, language='fr-FR')

'maintenant on va faire en français'