**TEXT CHARACTERIZATION USING SPEECH RECOGNITION**

In this project, we'll explore how speech recognition libraries can be used to analyze audio files containing speech.

We'll do this by either importing pre-recorded audio files or recording speech directly. Along the way, we'll learn techniques for processing the text produced by speech-to-text conversion, and we'll look at operations that can be run on the transcript to gain insight into the spoken content.

Project Timeline:
1. Importing necessary libraries
2. Utilizing speech recognition tools
3. Examining the audio data
4. Analyzing the transcribed text

**Import Statements**

In [2]:
!pip install SpeechRecognition

Collecting SpeechRecognition
  Downloading SpeechRecognition-3.10.4-py2.py3-none-any.whl (32.8 MB)
Installing collected packages: SpeechRecognition
Successfully installed SpeechRecognition-3.10.4


In [3]:
import pandas as pd
import numpy as np
import speech_recognition as sr

In [4]:
import IPython.display as ipd

In [10]:
recognizer = sr.Recognizer()
audio_file = sr.AudioFile("voice-data.wav")
type(audio_file)

**Using Google Web Speech API**

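The Google Web Speech API is a free web service from Google. The `speech_recognition` library exposes it through `recognizer.recognize_google()`, which sends the audio to the service over the network and returns the transcribed text, which we can then process further.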

In [14]:
# recognizer = sr.Recognizer()
# audio_file = sr.AudioFile("voice-data.wav")
# type(audio_file)


# with audio_file as source:
#     audio_file = recognizer.record(source)
#     result=recognizer.recognize_google(audio_data=audio_file)


# with audio_file_ as source:
#     audio_file = recognizer.record(source, duration = 5.0)
#     result1 = recognizer.recognize_google(audio_data=audio_file)


In [11]:
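with audio_file as source:
    # Store the recording under a new name so that audio_file (the AudioFile
    # source) is not overwritten and can be reused in later cells
    audio_data = recognizer.record(source)
    result = recognizer.recognize_google(audio_data=audio_data)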

`recognizer.record()` accepts two optional parameters that control which part of the audio is read:

1) `duration`: the number of seconds to read. For example, `duration=5` reads only the first 5 seconds of the audio file.

2) `offset`: the number of seconds to skip at the start. For example, `offset=2` skips the first 2 seconds and reads from there.
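
To make the two parameters concrete, here is a minimal sketch that uses only Python's standard `wave` module. It is not how `speech_recognition` is implemented; it only illustrates which part of a file a given `offset`/`duration` pair selects. The helper names `write_silence` and `slice_wav` and the generated silent file are assumptions made for this illustration.

```python
import os
import tempfile
import wave


def write_silence(path, seconds, rate=16000):
    """Write a mono, 16-bit WAV file containing `seconds` of silence."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wf.setframerate(rate)
        wf.writeframes(b"\x00\x00" * int(seconds * rate))


def slice_wav(path, offset=0.0, duration=None):
    """Return (frames_read, seconds_read) for the selected part of the file.

    offset   -- seconds to skip from the start
    duration -- seconds to keep after the offset (None = up to the end)
    """
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
        total = wf.getnframes()
        start = min(int(offset * rate), total)
        wf.setpos(start)  # jump past the skipped part
        wanted = total - start if duration is None else int(duration * rate)
        data = wf.readframes(wanted)  # returns fewer frames if the file ends
        frames = len(data) // (wf.getsampwidth() * wf.getnchannels())
        return frames, frames / rate


# Hypothetical 10-second test file standing in for a real recording
demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
write_silence(demo_path, seconds=10)

print(slice_wav(demo_path, duration=5.0))              # first 5 seconds
print(slice_wav(demo_path, offset=2.0))                # everything after 2 s
print(slice_wav(demo_path, offset=2.0, duration=5.0))  # seconds 2 to 7
```

Combining both parameters selects a window that starts at `offset` and lasts `duration` seconds, which matches the combined call used later in this notebook.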

In [12]:
# with audio_file_ as source:
#     audio_file = recognizer.record(source, duration = 5.0)
#     result1 = recognizer.recognize_google(audio_data=audio_file)

In [15]:
import speech_recognition as sr

recognizer = sr.Recognizer()
audio_file = sr.AudioFile("voice-data.wav")

# with audio_file as source:
#     audio_data = recognizer.record(source)
#     result = recognizer.recognize_google(audio_data=audio_data)

# print(result)

# You can also use another with block to record audio for a specific duration
with audio_file as source:
    audio_data = recognizer.record(source, duration=5.0)
    result1 = recognizer.recognize_google(audio_data=audio_data)

print(result1)


Eden awesome okay would you be then


In [16]:
with audio_file as source:
    # Keep only the first 5 seconds of the recording
    audio_file_ = recognizer.record(source, duration=5.0)
    result1 = recognizer.recognize_google(audio_data=audio_file_)

In [17]:
with audio_file as source:
    # Skip the first 2 seconds and read the rest of the recording
    audio_file_ = recognizer.record(source, offset=2.0)
    result2 = recognizer.recognize_google(audio_data=audio_file_)

Combining duration and offset:

In [19]:
with audio_file as source:
    # Skip the first 2 seconds, then read the next 5 seconds
    audio_file_ = recognizer.record(source, duration=5.0, offset=2.0)
    result3 = recognizer.recognize_google(audio_data=audio_file_)

Comparing the results:

In [21]:
print(result)
print("---------------------------------------------------------------------------------")
print(result1)
print("---------------------------------------------------------------------------------")

print(result2)
print("---------------------------------------------------------------------------------")

print(result3)

Eden would you be then you know locking this meeting in so that we don't have no participants because once the division of team is done and stuff like that what does it or is it okay if people if more people join in the game you can play I think we have about 18 people like that most people would know about it so we playing code names guys it's a fun board game just just enjoy yourself have people join in case they want to do and then once we started okay so let's get started my screen guys like this is the time for rules if you don't understand now so the game that you're going to play is called code names and in the game you're all going to be like spy agents so much and the other side of team will look at it and before I explained for the how many of you know the game coordinates how many of you play the game for kids before Akash kindly call me if I'm going wrong somewhere else play coordinates it's good for us because we're going to be done came today so basically put up into two 

**Effect Of Noise:**

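Background noise can introduce errors into the transcription, so it helps to account for it before recognition. Note that `adjust_for_ambient_noise` does not filter the audio; it calibrates the recognizer's energy threshold from the first `duration` seconds of the source. Those seconds are consumed during calibration, which is likely why the transcript below no longer starts with "Eden".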


In [23]:
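with audio_file as source:
    # Calibrate the energy threshold using the first 0.5 s of the recording
    recognizer.adjust_for_ambient_noise(source, duration=0.5)
    # Record whatever remains after the calibration window
    audio = recognizer.record(source)

result4 = recognizer.recognize_google(audio)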

In [24]:
print(result4)

would you be then you know locking this meeting in so that we don't have no participants because once the division of team is done and stuff like that okay or does it or is it okay if people if more people join in the game you can I think we have about 18 people and stuff like that most people would know about it so we playing code names guys it's a fun board game just just enjoy yourself happy okay so let's get started my screen guys like this time for


In [25]:
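# Split the transcript into a list of words; split() with no argument
# also collapses repeated whitespace
result_str = result.split()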

In [26]:
result_str

['Eden',
 'would',
 'you',
 'be',
 'then',
 'you',
 'know',
 'locking',
 'this',
 'meeting',
 'in',
 'so',
 'that',
 'we',
 "don't",
 'have',
 'no',
 'participants',
 'because',
 'once',
 'the',
 'division',
 'of',
 'team',
 'is',
 'done',
 'and',
 'stuff',
 'like',
 'that',
 'what',
 'does',
 'it',
 'or',
 'is',
 'it',
 'okay',
 'if',
 'people',
 'if',
 'more',
 'people',
 'join',
 'in',
 'the',
 'game',
 'you',
 'can',
 'play',
 'I',
 'think',
 'we',
 'have',
 'about',
 '18',
 'people',
 'like',
 'that',
 'most',
 'people',
 'would',
 'know',
 'about',
 'it',
 'so',
 'we',
 'playing',
 'code',
 'names',
 'guys',
 "it's",
 'a',
 'fun',
 'board',
 'game',
 'just',
 'just',
 'enjoy',
 'yourself',
 'have',
 'people',
 'join',
 'in',
 'case',
 'they',
 'want',
 'to',
 'do',
 'and',
 'then',
 'once',
 'we',
 'started',
 'okay',
 'so',
 "let's",
 'get',
 'started',
 'my',
 'screen',
 'guys',
 'like',
 'this',
 'is',
 'the',
 'time',
 'for',
 'rules',
 'if',
 'you',
 "don't",
 'understand',


Number of distinct words used:

In [27]:
unique_words = set(result_str)
print(unique_words)

{'if', 'to', 'be', "you're", 'understand', "it's", 'team', 'before', 'coordinates', 'secret', 'meeting', 'know', 'a', 'words', 'knows', 'stuff', "we're", 'one', 'people', 'called', 'fun', 'is', 'into', 'you', 'rules', 'look', 'code', 'side', 'other', 'many', 'will', "let's", 'what', 'came', 'get', 'Master', 'among', 'playing', 'where', 'some', 'blue', 'of', 'board', 'play', 'wrong', 'agents', 'okay', 'two', 'participants', 'kids', 'I', 'today', "I'm", 'operative', 'only', 'Team', 'have', 'can', 'Akash', 'case', 'Red', 'somewhere', 'nine', 'teams', 'no', 'Blue', 'join', 'else', 'like', 'going', 'screen', 'protected', 'here', 'just', 'call', 'each', 'my', 'names', 'enjoy', 'want', 'location', 'both', 'at', 'started', 'spy', 'aim', 'time', 'hidden', 'in', 'but', 'yourself', 'money', 'how', 'they', 'red', 'and', 'has', "they're", 'basically', 'Eden', 'for', 'see', 'joining', 'do', 'done', 'much', 'different', 'their', 'game', 'does', 'me', 'make', 'up', 'word', 'from', 'respective', 'or', 

In [28]:
print("The number of different words used: ",len(unique_words))

The number of different words used:  153
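
The set above is case-sensitive, so 'Team' and 'team' (or 'Red' and 'red') are counted as different words. A minimal sketch of a case-insensitive count follows, assuming that lower-casing every word is acceptable for the analysis. The helper name `distinct_words` and the sample sentence are made up for illustration and are not taken from the recording.

```python
def distinct_words(text, case_sensitive=True):
    """Return the set of distinct whitespace-separated words in `text`."""
    words = text.split()
    if not case_sensitive:
        words = [w.lower() for w in words]
    return set(words)


# Hypothetical sample text, not taken from the recording
sample = "Red team and Blue team so the red Team and the blue Team"

print(len(distinct_words(sample)))                        # case-sensitive count
print(len(distinct_words(sample, case_sensitive=False)))  # case-insensitive count
```

Whether to normalize case depends on the goal: keeping it preserves proper nouns such as names, while dropping it gives a count closer to the speaker's actual vocabulary.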


**Repetition of words**

In [29]:
# Start by giving every distinct word in the transcript a count of 0
word_dict = {}  # maps each word to the number of times it was spoken
for word in result_str:
    word_dict[word] = 0
print(word_dict)

{'Eden': 0, 'would': 0, 'you': 0, 'be': 0, 'then': 0, 'know': 0, 'locking': 0, 'this': 0, 'meeting': 0, 'in': 0, 'so': 0, 'that': 0, 'we': 0, "don't": 0, 'have': 0, 'no': 0, 'participants': 0, 'because': 0, 'once': 0, 'the': 0, 'division': 0, 'of': 0, 'team': 0, 'is': 0, 'done': 0, 'and': 0, 'stuff': 0, 'like': 0, 'what': 0, 'does': 0, 'it': 0, 'or': 0, 'okay': 0, 'if': 0, 'people': 0, 'more': 0, 'join': 0, 'game': 0, 'can': 0, 'play': 0, 'I': 0, 'think': 0, 'about': 0, '18': 0, 'most': 0, 'playing': 0, 'code': 0, 'names': 0, 'guys': 0, "it's": 0, 'a': 0, 'fun': 0, 'board': 0, 'just': 0, 'enjoy': 0, 'yourself': 0, 'case': 0, 'they': 0, 'want': 0, 'to': 0, 'do': 0, 'started': 0, "let's": 0, 'get': 0, 'my': 0, 'screen': 0, 'time': 0, 'for': 0, 'rules': 0, 'understand': 0, 'now': 0, "you're": 0, 'going': 0, 'called': 0, 'all': 0, 'spy': 0, 'agents': 0, 'much': 0, 'other': 0, 'side': 0, 'will': 0, 'look': 0, 'at': 0, 'before': 0, 'explained': 0, 'how': 0, 'many': 0, 'coordinates': 0, 'kids

In [30]:
for word in result_str:
    word_dict[word] = word_dict[word] + 1
print("The count for each word spoken number of times are: ",word_dict)

The count for each word spoken number of times are:  {'Eden': 1, 'would': 2, 'you': 10, 'be': 6, 'then': 3, 'know': 3, 'locking': 1, 'this': 4, 'meeting': 1, 'in': 6, 'so': 11, 'that': 4, 'we': 4, "don't": 2, 'have': 4, 'no': 1, 'participants': 1, 'because': 2, 'once': 2, 'the': 17, 'division': 1, 'of': 8, 'team': 9, 'is': 6, 'done': 2, 'and': 13, 'stuff': 1, 'like': 4, 'what': 1, 'does': 1, 'it': 4, 'or': 1, 'okay': 2, 'if': 4, 'people': 5, 'more': 1, 'join': 3, 'game': 6, 'can': 2, 'play': 4, 'I': 3, 'think': 1, 'about': 2, '18': 1, 'most': 2, 'playing': 1, 'code': 2, 'names': 2, 'guys': 2, "it's": 2, 'a': 2, 'fun': 1, 'board': 1, 'just': 2, 'enjoy': 1, 'yourself': 1, 'case': 2, 'they': 1, 'want': 1, 'to': 5, 'do': 1, 'started': 2, "let's": 1, 'get': 1, 'my': 2, 'screen': 1, 'time': 1, 'for': 7, 'rules': 1, 'understand': 1, 'now': 1, "you're": 2, 'going': 4, 'called': 1, 'all': 1, 'spy': 2, 'agents': 3, 'much': 1, 'other': 1, 'side': 1, 'will': 6, 'look': 1, 'at': 1, 'before': 2, 'ex
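
The two loops above (initialise to 0, then increment) can be collapsed into a single step with `collections.Counter` from the standard library. The sketch below uses a short made-up word list; in the notebook, `result_str` would take its place.

```python
from collections import Counter

# Hypothetical sample; in the notebook this would be result_str
sample_words = "so the team and the other team will play the game".split()

# Counter tallies every item of the list in one pass
word_counts = Counter(sample_words)

print(word_counts["the"])          # how often "the" occurs
print(word_counts.most_common(2))  # the two most frequent words with counts
```

Because `Counter` is a subclass of `dict`, it should be usable in place of `word_dict` when building the DataFrame in the next step.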

In [31]:
cols = ['Repetition']
count_df = pd.DataFrame.from_dict(word_dict, orient='index', columns=cols)

In [32]:
count_df

Unnamed: 0,Repetition
Eden,1
would,2
you,10
be,6
then,3
...,...
make,1
some,1
money,1
from,1


In [33]:
count_df= count_df.reset_index()

In [34]:
count_df

Unnamed: 0,index,Repetition
0,Eden,1
1,would,2
2,you,10
3,be,6
4,then,3
...,...,...
148,make,1
149,some,1
150,money,1
151,from,1


In [35]:
count_df= count_df.rename(columns = {'index':'Word'})
count_df

Unnamed: 0,Word,Repetition
0,Eden,1
1,would,2
2,you,10
3,be,6
4,then,3
...,...,...
148,make,1
149,some,1
150,money,1
151,from,1


**Number of words spoken per minute:**

In [36]:
print("Total number of words: ",len(result_str))

Total number of words:  342


In [37]:
print("Total length of audio: 3.08 minutes ")

Total length of audio: 3.08 minutes 


In [38]:
print("Total number of words spoken per minute : ",(len(result_str)/3.08))

Total number of words spoken per minute :  111.03896103896103
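
The audio length of 3.08 minutes above is typed in by hand. As a sketch of an alternative, the length of a WAV file can be read from its header with the standard `wave` module and then used to compute the speaking rate. The helper names `wav_duration_minutes` and `words_per_minute` are assumptions for this illustration, and a generated silent file stands in for the real recording.

```python
import os
import tempfile
import wave


def wav_duration_minutes(path):
    """Return the length of a WAV file in minutes, read from its header."""
    with wave.open(path, "rb") as wf:
        return wf.getnframes() / wf.getframerate() / 60


def words_per_minute(word_count, minutes):
    """Average speaking rate; raises ValueError for empty audio."""
    if minutes <= 0:
        raise ValueError("audio length must be positive")
    return word_count / minutes


# Build a 30-second silent WAV as a stand-in for the real recording
demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
with wave.open(demo_path, "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)  # 16-bit samples
    wf.setframerate(8000)
    wf.writeframes(b"\x00\x00" * (8000 * 30))

minutes = wav_duration_minutes(demo_path)
print(minutes)                        # 0.5
print(words_per_minute(60, minutes))  # 120.0
```

In the notebook, `wav_duration_minutes("voice-data.wav")` and `len(result_str)` would replace the stand-in values, assuming the transcript covers the whole recording.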
