# Speech Analysis in Conversation in Computer Science Classes

### Initial Plan of Attack :
    Step 1 : Audio to Text File Conversion
    Step 2 : Cleaning the data
    Step 3 : Emotion Detection
    Step 4 : Topic Extraction

#### First, Let's Install the Required Libraries

Text2Emotion is a Python package designed to identify emotions in text data. It works by recognizing the emotions embedded in the words people use when they state something with confidence. For example, a dissatisfied customer may say, "I am very angry by your product services and gonna file a complaint." Text2Emotion extracts the emotions from such text and scores them across five categories (Happy, Angry, Sad, Surprise, and Fear), returning the result as a dictionary.

#### https://pypi.org/project/text2emotion/

In [1]:
pip install text2emotion

Defaulting to user installation because normal site-packages is not writeable
Note: you may need to restart the kernel to use updated packages.


NeatText is a simple NLP package for cleaning textual data and preprocessing text. Its aim is to simplify text cleaning for NLP and ML.

#### https://pypi.org/project/neattext/

In [2]:
pip install neattext

Defaulting to user installation because normal site-packages is not writeable
Note: you may need to restart the kernel to use updated packages.



#### The TV show conversations I used as audio files in this project are linked below

#### https://www.youtube.com/watch?v=mNIXRXikYDc&ab_channel=TheEllenShow
#### https://www.youtube.com/watch?v=hVd_rdhKTVk&ab_channel=AppleTV
#### https://www.youtube.com/watch?v=skj-ALA1HFE&ab_channel=VideoAdvice
#### https://www.youtube.com/watch?v=PNTCM7cbrsc&t=184s&ab_channel=CampusMovieFest
#### https://www.youtube.com/watch?v=uQDDGriA1lk&ab_channel=SteveTVShow
#### https://www.youtube.com/watch?v=f5NJQiY9AuY&t=12s&ab_channel=TheEllenShow
#### https://www.youtube.com/watch?v=sd7dSHU4BKs&t=12s&ab_channel=Simulation
#### https://www.youtube.com/watch?v=yRQ5ntxnFaI&t=2s&ab_channel=TheLateShowwithStephenColbert
#### https://www.youtube.com/watch?v=pKtNyN53B_s&t=2s&ab_channel=TheKellyClarksonShow



#### Step 1 : Audio to Text File Conversion

This code is a Python script that uses the AssemblyAI API to upload an audio file, transcribe it, and return the text transcript.

The script uses the requests library to make HTTP requests to the API, and defines three functions:

    uploadMyFile to upload the audio file to AssemblyAI and return an upload URL.
    startTranscription to initiate the transcription process using the upload URL.
    getTranscription to check the status of the transcription and retrieve the text transcript once it is completed.
    
The script requires an authentication key to use the AssemblyAI API, which must be obtained from the AssemblyAI website. The key is stored in the authKey variable.

#### https://www.assemblyai.com/

In [3]:
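# To make requests to the API
import os
import requests
import time

# Step 1
# Register and get an auth key from https://www.assemblyai.com/
# The key is read from an environment variable rather than hard-coded in the notebook
# (the variable name ASSEMBLYAI_API_KEY is only a convention chosen here)
authKey = os.environ['ASSEMBLYAI_API_KEY']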

# Parameters for HTTP request
headers = {
    'authorization' : authKey,
    'content-type'  : 'application/json'
}

# URLs for upload and transcripts, assigned by AssemblyAI
uploadUrl      = 'https://api.assemblyai.com/v2/upload'
transcriptUrl  = 'https://api.assemblyai.com/v2/transcript'

# Step 2 : Upload audio file
def uploadMyFile(fileName):

    def _readMyFile(fn):
        # Read the file in 5 MB chunks so that large audio files are streamed
        chunkSize = 5242880
        
        # Creation of a file stream to read the file in binary mode
        with open(fn, 'rb') as fileStream:

            while True:
                data = fileStream.read(chunkSize)
                ## Since this is a while loop, we end it when no data is left
                if not data:
                    break
                ##  We use yield instead of return so that the whole file is sent,
                ##  not just the first chunk, because we are reading the file in chunks
                yield data
    ## POST method
    response = requests.post(
        uploadUrl,
        headers= headers,
        data= _readMyFile(fileName)
    )
    
    ## Parse the JSON body of the response
    json = response.json()

    return json['upload_url']
# END def uploadMyFile

# Step 3 : Start Transcription
## We provide the uploaded audio URL and expect a transcription ID from AssemblyAI
def startTranscription(aurl):
    ## POST method
    response = requests.post(
        transcriptUrl,
        headers= headers,
        json= { 'audio_url' : aurl }
    )
    ## Parse the JSON body of the response
    json = response.json()

    return json['id']
# END def startTranscription



# Step 4 : Get Transcription
## We provide the transcription ID and expect the text from AssemblyAI
def getTranscription(tid):

    maxAttempts = 1000
    timedout    = False

    while True:
        ## GET method
        response = requests.get(
            f'{transcriptUrl}/{tid}',
            headers= headers
        )
        
        ## Parse the JSON body of the response
        json = response.json()
        
        ## Condition to break out of the while loop
        if json['status'] == 'completed':
            break

        ## Stop early if AssemblyAI reports a failure, instead of polling until the timeout
        if json['status'] == 'error':
            return f"Error: {json.get('error')}"

        maxAttempts -= 1
        timedout = maxAttempts <= 0

        if timedout:
            break

        # Wait for 3 seconds before making the next try!
        # Why? Because we don't want to set AssemblyAI on fire!!!
        time.sleep(3)

    return 'Timeout...' if timedout else json['text']
# END def getTranscription
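
The polling loop in `getTranscription` follows a general pattern: ask for the status, stop on success, and give up after a fixed number of attempts. A minimal sketch of that pattern, decoupled from the network, is shown below. `poll_until_done` and `fetch_status` are hypothetical names used for illustration and are not part of the AssemblyAI API.

```python
import time

def poll_until_done(fetch_status, max_attempts=1000, delay_seconds=3.0):
    """Call fetch_status() until it reports 'completed' or attempts run out.

    fetch_status is any zero-argument callable returning a dict with a
    'status' key (and a 'text' key once completed). Hypothetical helper.
    """
    for attempt in range(max_attempts):
        result = fetch_status()
        if result['status'] == 'completed':
            return result['text']
        # Sleep between attempts, but not after the last one
        if attempt < max_attempts - 1:
            time.sleep(delay_seconds)
    return 'Timeout...'

# Stand-in for the HTTP call: reports 'processing' twice, then 'completed'
_responses = iter([
    {'status': 'processing'},
    {'status': 'processing'},
    {'status': 'completed', 'text': 'hello world'},
])
transcript = poll_until_done(lambda: next(_responses), max_attempts=5, delay_seconds=0)
print(transcript)
```

Passing the status check in as a callable keeps the retry logic testable without an API key or a network connection.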


In [4]:
audio_filenames = []
audio_transcript = []

In [5]:
#Import the modules
import text2emotion as te


[nltk_data] Downloading package stopwords to C:\Users\Mourya
[nltk_data]     Kunuku\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package punkt to C:\Users\Mourya
[nltk_data]     Kunuku\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package wordnet to C:\Users\Mourya
[nltk_data]     Kunuku\AppData\Roaming\nltk_data...
[nltk_data]   Package wordnet is already up-to-date!


In [6]:
import os

## Function to get all the file names whose extension ends with .mp3
def mp3gen():
    for root, dirs, files in os.walk('.'):
        for filename in files:
            if os.path.splitext(filename)[1] == ".mp3":
                yield os.path.join(root, filename)

for mp3file in mp3gen():
    print(f"processing : {mp3file}")
    # step 1) Upload the audio file
    audioUrl = uploadMyFile(mp3file)

    # step 2) Start Transcription
    transcriptionID = startTranscription(audioUrl)

    # step 3) Get Transcription Text
    text = getTranscription(transcriptionID)
    
    ## Adding Filenames and Audio Transcript to lists
    audio_filenames.append(mp3file)
    audio_transcript.append(text)
    
    print(f"processing Completed: {mp3file}")

    
    

processing : .\audio\Bill_Gates_Chats_with_Ellen_for_the_First_Time.mp3
processing Completed: .\audio\Bill_Gates_Chats_with_Ellen_for_the_First_Time.mp3
processing : .\audio\Ellen_Taught_This_Fan_How_to_Speak_English.mp3
processing Completed: .\audio\Ellen_Taught_This_Fan_How_to_Speak_English.mp3
processing : .\audio\Joe_Alwyn_Dishes_On_'Weird__Funny__Strange'_Intimate_Scenes_In_'Conversations_With_Friends'.mp3
processing Completed: .\audio\Joe_Alwyn_Dishes_On_'Weird__Funny__Strange'_Intimate_Scenes_In_'Conversations_With_Friends'.mp3
processing : .\audio\Jordan_Peterson___How_to_Have_Better_Conversations.mp3
processing Completed: .\audio\Jordan_Peterson___How_to_Have_Better_Conversations.mp3
processing : .\audio\Penn_Badgley_Can_Go_From_Charming_To_Creepy_Without_Changing_His_Expression.mp3
processing Completed: .\audio\Penn_Badgley_Can_Go_From_Charming_To_Creepy_Without_Changing_His_Expression.mp3
processing : .\audio\Small_Talk.mp3
processing Completed: .\audio\Small_Talk.mp3
proces
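
The file discovery in `mp3gen` can also be written with `pathlib`. The sketch below is an alternative rather than the method used in this notebook. `find_mp3_files` is a hypothetical name, and unlike the `os.path.splitext` comparison in cell 6 it also matches upper-case extensions such as `.MP3`.

```python
import tempfile
from pathlib import Path

def find_mp3_files(root):
    """Return all .mp3 files under root (any letter case), sorted by path."""
    return sorted(
        p for p in Path(root).rglob('*')
        if p.is_file() and p.suffix.lower() == '.mp3'
    )

# Demonstration on a throwaway directory tree
with tempfile.TemporaryDirectory() as tmp:
    audio_dir = Path(tmp) / 'audio'
    audio_dir.mkdir()
    (audio_dir / 'Small_Talk.mp3').touch()
    (audio_dir / 'LOUD.MP3').touch()
    (audio_dir / 'notes.txt').touch()
    found = [p.name for p in find_mp3_files(tmp)]

print(found)
```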

#### Step 2 : Cleaning the data and Step 3 : Emotion Detection

In [7]:
import neattext.functions as nfx
## Emotions list to store the emotion results for each audio file
emotions = []
## To store the cleaned text for each audio file
cleaned_audiotext = []

## Looping through each audio transcript generated by AssemblyAI
for audio in audio_transcript:
    
    ## Removing stop words from the audio transcript using the neattext library functions
    cleantext = nfx.remove_stopwords(audio)
    ## Removing punctuation from the audio transcript using the neattext library functions
    cleantext = nfx.remove_punctuations(cleantext)
    ## Removing user handles from the audio transcript using the neattext library functions
    cleantext = nfx.remove_userhandles(cleantext)
    ## Storing the clean text in the cleaned_audiotext list to add it to the dataframe
    cleaned_audiotext.append(cleantext)
    
    ## Call get_emotion from the text2emotion library, which returns the emotions
    ## for the text as a dictionary, e.g. {'Happy': 0.2, 'Angry': 0.07, 'Surprise': 0.2, 'Sad': 0.2, 'Fear': 0.33}
    emotionresult=te.get_emotion(cleantext)
    
    ## Storing the emotion results for every audio file in the emotions list to add them to the final dataframe
    emotions.append(emotionresult)
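
For readers without NeatText installed, the three cleaning steps can be approximated with the standard library. This is a rough stand-in, not NeatText's implementation: the stop-word list is a tiny hypothetical sample, the regular expression for user handles is an assumption, and the sketch removes handles first so that the `@` is still present when it looks for them.

```python
import re
import string

# Tiny illustrative stop-word list (assumption; NeatText ships a much larger one)
STOPWORDS = {'i', 'am', 'to', 'the', 'a', 'is', 'this', 'so', 'you', 'have'}

def clean_text(text):
    """Approximate the cleaning steps: drop user handles, stop words, then punctuation."""
    # Remove @handles first, while the '@' is still present
    text = re.sub(r'@\w+', '', text)
    # Drop stop words (case-insensitive, ignoring surrounding punctuation)
    words = [
        w for w in text.split()
        if w.strip(string.punctuation).lower() not in STOPWORDS
    ]
    # Strip the remaining punctuation characters
    no_punct = ' '.join(words).translate(str.maketrans('', '', string.punctuation))
    # Collapse any leftover runs of whitespace
    return ' '.join(no_punct.split())

cleaned = clean_text("@host I am so happy to have you here. This is great!")
print(cleaned)  # happy here great
```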
    

In [8]:
## I have decided to use DataFrames, so I am importing pandas
import pandas as pd

## Creation of the dataframe mydf, which includes 4 columns
    ## audio_filenames : the mp3 file names (column 'File Name')
    ## audio_transcript : the audio transcripts of the mp3 files processed through AssemblyAI (column 'Audio Transcript')
    ## cleaned_audiotext : the transcript text after cleaning with the neattext library (column 'Clean Transcript')
    ## emotions : the dictionary result from text2emotion for each audio transcript (column 'Emotion')
mydf = pd.DataFrame(list(zip(audio_filenames, audio_transcript ,cleaned_audiotext , emotions)), columns = ['File Name', 'Audio Transcript', 'Clean Transcript', 'Emotion'])

In [9]:
mydf

Unnamed: 0,File Name,Audio Transcript,Clean Transcript,Emotion
0,.\audio\Bill_Gates_Chats_with_Ellen_for_the_Fi...,I'm so happy to have you here. This is the fir...,Im happy here time on thanks know nervous entr...,"{'Happy': 0.31, 'Angry': 0.03, 'Surprise': 0.2..."
1,.\audio\Ellen_Taught_This_Fan_How_to_Speak_Eng...,Our next guest is sitting in the audience righ...,guest sitting audience right now you Everybody...,"{'Happy': 0.24, 'Angry': 0.2, 'Surprise': 0.23..."
2,.\audio\Joe_Alwyn_Dishes_On_'Weird__Funny__Str...,"Everywhere you go. I'm like, crazy. Well, I me...",go Im like crazy Well meant Id calm down Theyr...,"{'Happy': 0.34, 'Angry': 0.03, 'Surprise': 0.1..."
3,.\audio\Jordan_Peterson___How_to_Have_Better_C...,Exploration. I really thought this was interes...,Exploration thought interesting intellectual d...,"{'Happy': 0.2, 'Angry': 0.07, 'Surprise': 0.2,..."
4,.\audio\Penn_Badgley_Can_Go_From_Charming_To_C...,"Listen, everybody, you know my next guest from...",Listen everybody know guest Gossip Girl easy J...,"{'Happy': 0.33, 'Angry': 0.04, 'Surprise': 0.2..."
5,.\audio\Small_Talk.mp3,Excuse me. Hi. I'm trying to relax. Would you ...,Excuse me Hi Im trying relax mind Oh sorry Mr ...,"{'Happy': 0.26, 'Angry': 0.02, 'Surprise': 0.1..."
6,.\audio\The_Oprah_Conversation_—_Will_Smith_On...,"To this day, if we start talking it's 4 hours....",day start talking 4 hours 4 hours exchange sen...,"{'Happy': 0.38, 'Angry': 0.04, 'Surprise': 0.1..."
7,.\audio\The_Viral_Voice_that_Sounds_Like_Siri_...,"My next guest blows my mind, and I'm sure she'...",guest blows mind Im sure going you Matter fact...,"{'Happy': 0.14, 'Angry': 0.06, 'Surprise': 0.2..."
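
Each `Emotion` entry is a dictionary of scores, so a single label per transcript can be derived by taking the highest-scoring key. A minimal sketch, using the example dictionary from the comment in cell 7, follows. `dominant_emotion` is a hypothetical helper and is not part of Text2Emotion.

```python
def dominant_emotion(scores):
    """Return the emotion label with the highest score (first one wins on ties)."""
    return max(scores, key=scores.get)

example_scores = {'Happy': 0.2, 'Angry': 0.07, 'Surprise': 0.2, 'Sad': 0.2, 'Fear': 0.33}
label = dominant_emotion(example_scores)
print(label)  # Fear
```

In this notebook it could be applied as `mydf['Emotion'].apply(dominant_emotion)`, assuming the column still holds dictionaries rather than their string form.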


#### Step 4 : Topic Extraction

First, I had to find NLP techniques that work on unlabeled text data. After thorough research, I came across a technique in NLP called topic modeling.

Topic modeling is a technique used in natural language processing (NLP) to discover latent topics or themes in a collection of documents. It is a way to automatically identify the topics present in a large corpus of text data, and to infer the underlying structure that connects these topics to individual documents.

There are several popular algorithms for topic modeling, such as Latent Dirichlet Allocation (LDA) and Non-negative Matrix Factorization (NMF). These algorithms work by analyzing the distribution of words across the corpus and identifying patterns that suggest the presence of distinct topics. Once the topics are identified, they can be used for a variety of tasks such as document classification, summarization, and recommendation systems.

I have tested both methods in the following sections.


#### Latent Dirichlet Allocation (LDA)

Two important notes:
    --> The user must decide on the number of topics present in the documents.
    --> The user must interpret what the topics are.


In [10]:
## A little bit of preprocessing
from sklearn.feature_extraction.text import CountVectorizer

## CountVectorizer
## max_df = when building up the vocabulary, ignore the terms that have a really high document frequency
## 0.9 discards the words that show up in more than 90 % of the documents

## min_df = the minimum number of documents a word must appear in; with 1, a word is counted if it is present in at least one document
cv = CountVectorizer(max_df = 0.9, min_df= 1,stop_words ='english')

## Since this is unsupervised, we fit and transform the entire dataset
dtm = cv.fit_transform(mydf['Audio Transcript'])

dtm

<8x987 sparse matrix of type '<class 'numpy.int64'>'
	with 1408 stored elements in Compressed Sparse Row format>
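
To make the `max_df` and `min_df` filters concrete, the sketch below computes document frequencies by hand on a toy corpus and applies the same two thresholds. It is a simplified illustration in plain Python, not scikit-learn's implementation. The corpus and the `filter_vocabulary` helper are made up for the example.

```python
from collections import Counter

def filter_vocabulary(documents, max_df=0.9, min_df=1):
    """Keep terms whose document frequency lies within [min_df, max_df * n_docs]."""
    n_docs = len(documents)
    # Count in how many documents each term appears (not how often overall)
    doc_freq = Counter(term for doc in documents for term in set(doc.lower().split()))
    max_count = max_df * n_docs
    return sorted(t for t, df in doc_freq.items() if min_df <= df <= max_count)

toy_corpus = [
    'people love conversation',
    'people love money',
    'people watch videos',
]
vocab = filter_vocabulary(toy_corpus, max_df=0.9, min_df=1)
print(vocab)  # ['conversation', 'love', 'money', 'videos', 'watch']
```

Here `people` appears in all three documents, which is above the 90 % threshold, so it is dropped from the vocabulary.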

In [11]:
from sklearn.decomposition import LatentDirichletAllocation

## LDA has many properties; here we only set
## n_components = the number of topics to extract (increase it if you want more topics)
LDA = LatentDirichletAllocation(n_components = 10,random_state = 42)

## Fit it to the document-term matrix
LDA.fit(dtm)

## Grabbing the top 15 words from each topic
# argsort returns the index positions that would sort the array from least to most,
# so the last 15 positions are the 15 highest-weighted words
# get_feature_names_out() replaces get_feature_names(), which newer scikit-learn versions no longer provide
feature_names = cv.get_feature_names_out()
for i, topic in enumerate(LDA.components_):
    print(f"The top 15 words for Topic #{i}")
    print([feature_names[index] for index in topic.argsort()[-15:]])
    print('\n')
    
## Compute the topic probabilities for each audio transcript
topic_results= LDA.transform(dtm)

## Assign each audio transcript its highest-probability topic
mydf['Topic'] = topic_results.argmax(axis=1)


The top 15 words for Topic #0
['nice', 'alexa', 'creepy', 'got', 'said', 'mean', 'light', 'okay', 'kind', 'say', 'people', 'right', 'love', 'did', 'yeah']


The top 15 words for Topic #1
['trying', 'matter', 've', 'good', 'way', 'life', 'best', 'conversation', 'goodbye', 'pieces', 'sort', 'think', 'let', 'things', 'thought']


The top 15 words for Topic #2
['thought', 'know', 'interesting', 'maybe', 'better', 'kind', 'kids', 'thing', 'don', 'things', 'yeah', 'conversation', 'think', 'right', 'people']


The top 15 words for Topic #3
['dictionary', 'learned', 'god', 'know', 'english', 'okay', 'noun', 'affectionate', 'standing', 'stay', 'love', 'ellen', 'watch', 'word', 'right']


The top 15 words for Topic #4
['trying', 'matter', 've', 'good', 'way', 'life', 'best', 'conversation', 'goodbye', 'pieces', 'sort', 'think', 'let', 'things', 'thought']


The top 15 words for Topic #5
['trying', 'matter', 've', 'good', 'way', 'life', 'best', 'conversation', 'goodbye', 'pieces', 'sort', 'think'
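
The expression `topic.argsort()[-15:]` may be easier to follow on a small case. The pure-Python sketch below mimics it for a made-up topic with five words and picks the top three. The weights and vocabulary are invented for illustration.

```python
def argsort(values):
    """Return the indices that would sort values from smallest to largest."""
    return sorted(range(len(values)), key=lambda i: values[i])

vocabulary = ['alexa', 'kids', 'love', 'money', 'word']   # hypothetical feature names
topic_weights = [0.1, 2.5, 0.7, 1.9, 0.05]                # hypothetical topic-word weights

top_indices = argsort(topic_weights)[-3:]   # last three = three largest weights
top_words = [vocabulary[i] for i in top_indices]
print(top_words)  # ['love', 'money', 'kids']
```

Because the indices are in ascending order of weight, the most important word of each topic is printed last.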



In [12]:
mydf

Unnamed: 0,File Name,Audio Transcript,Clean Transcript,Emotion,Topic
0,.\audio\Bill_Gates_Chats_with_Ellen_for_the_Fi...,I'm so happy to have you here. This is the fir...,Im happy here time on thanks know nervous entr...,"{'Happy': 0.31, 'Angry': 0.03, 'Surprise': 0.2...",2
1,.\audio\Ellen_Taught_This_Fan_How_to_Speak_Eng...,Our next guest is sitting in the audience righ...,guest sitting audience right now you Everybody...,"{'Happy': 0.24, 'Angry': 0.2, 'Surprise': 0.23...",3
2,.\audio\Joe_Alwyn_Dishes_On_'Weird__Funny__Str...,"Everywhere you go. I'm like, crazy. Well, I me...",go Im like crazy Well meant Id calm down Theyr...,"{'Happy': 0.34, 'Angry': 0.03, 'Surprise': 0.1...",0
3,.\audio\Jordan_Peterson___How_to_Have_Better_C...,Exploration. I really thought this was interes...,Exploration thought interesting intellectual d...,"{'Happy': 0.2, 'Angry': 0.07, 'Surprise': 0.2,...",2
4,.\audio\Penn_Badgley_Can_Go_From_Charming_To_C...,"Listen, everybody, you know my next guest from...",Listen everybody know guest Gossip Girl easy J...,"{'Happy': 0.33, 'Angry': 0.04, 'Surprise': 0.2...",0
5,.\audio\Small_Talk.mp3,Excuse me. Hi. I'm trying to relax. Would you ...,Excuse me Hi Im trying relax mind Oh sorry Mr ...,"{'Happy': 0.26, 'Angry': 0.02, 'Surprise': 0.1...",7
6,.\audio\The_Oprah_Conversation_—_Will_Smith_On...,"To this day, if we start talking it's 4 hours....",day start talking 4 hours 4 hours exchange sen...,"{'Happy': 0.38, 'Angry': 0.04, 'Surprise': 0.1...",8
7,.\audio\The_Viral_Voice_that_Sounds_Like_Siri_...,"My next guest blows my mind, and I'm sure she'...",guest blows mind Im sure going you Matter fact...,"{'Happy': 0.14, 'Angry': 0.06, 'Surprise': 0.2...",0



#### Non-negative Matrix Factorization (NMF)

#### https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.NMF.html

In [13]:
from sklearn.feature_extraction.text import TfidfVectorizer

In [14]:
tfidf = TfidfVectorizer(max_df = 0.9, min_df= 1,stop_words ='english')

In [15]:
dtm = tfidf.fit_transform(mydf['Clean Transcript'])

In [16]:
dtm

<8x991 sparse matrix of type '<class 'numpy.float64'>'
	with 1393 stored elements in Compressed Sparse Row format>
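
`TfidfVectorizer` weights each term by how often it occurs in a document and how rare it is across documents. The sketch below reproduces, in plain Python, the weighting that scikit-learn documents for its default settings: a smoothed IDF, `idf = ln((1 + n) / (1 + df)) + 1`, followed by L2 normalisation of each row. It is an illustration on a made-up corpus with a hypothetical `tfidf_rows` helper, and it skips tokenisation details and stop-word removal.

```python
import math
from collections import Counter

def tfidf_rows(documents):
    """Return one {term: weight} dict per document using smoothed IDF and L2 normalisation."""
    tokenised = [doc.lower().split() for doc in documents]
    n_docs = len(tokenised)
    doc_freq = Counter(term for tokens in tokenised for term in set(tokens))
    rows = []
    for tokens in tokenised:
        counts = Counter(tokens)
        # Raw term count times smoothed inverse document frequency
        weights = {
            t: c * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1)
            for t, c in counts.items()
        }
        # Scale the row to unit Euclidean length
        norm = math.sqrt(sum(w * w for w in weights.values()))
        rows.append({t: w / norm for t, w in weights.items()})
    return rows

toy_corpus = ['people love money', 'people watch videos']
rows = tfidf_rows(toy_corpus)
print(rows[0])
```

In the first row, `money` receives a higher weight than `people`, because `people` occurs in both documents and is therefore less distinctive.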

In [17]:
from sklearn.decomposition import NMF

In [18]:
nmf_model = NMF (n_components = 10, random_state = 42)

In [19]:
nmf_model.fit(dtm)



NMF(n_components=10, random_state=42)

In [20]:
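# get_feature_names_out() replaces get_feature_names(), which newer scikit-learn versions no longer provide
feature_names = tfidf.get_feature_names_out()
for i, topic in enumerate(nmf_model.components_):
    print(f"The top 15 words for Topic #{i}")
    print([feature_names[index] for index in topic.argsort()[-15:]])
    print('\n')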

The top 15 words for Topic #0
['love', 'videos', 'hug', 'stuck', 'dictionary', 'dominican', 'english', 'stay', 'watch', 'affectionate', 'standing', 'noun', 'ellen', 'right', 'word']


The top 15 words for Topic #1
['40', 'school', 'teachers', 'billion', 'porsche', 'terms', 'help', 'people', 'huge', 'thing', 'billionaire', 'money', 'right', 'yeah', 'kids']


The top 15 words for Topic #2
['line', 'hood', 'recorded', 'game', 'cutting', 'mad', 'professional', 'hey', 'going', 'said', 'video', 'siri', 'cut', 'alexa', 'light']


The top 15 words for Topic #3
['girl', 'looks', 'character', 'fun', 'remember', 'weird', 'called', 'whos', 'listen', 'love', 'okay', 'mean', 'right', 'people', 'yeah']


The top 15 words for Topic #4
['left', 'cried', 'brought', 'laughing', 'wheres', 'flower', 'boring', 'kid', 'old', 'katie', 'grumpy', 'girlfriends', 'gone', 'girlfriend', 'mr']


The top 15 words for Topic #5
['sort', 'know', 'lot', 'means', 'youtube', 'engaged', 'people', 'theres', 'conversations', 



In [21]:
topic_results = nmf_model.transform(dtm)

In [22]:
topic_results.argmax(axis =1)

array([1, 0, 8, 5, 9, 4, 6, 2], dtype=int64)
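
`argmax(axis=1)` picks, for every transcript (row), the column index of its largest topic weight. A plain-Python equivalent on a made-up 3x4 weight matrix is sketched below. `argmax_per_row` is a hypothetical helper.

```python
def argmax_per_row(matrix):
    """Return, for each row, the index of its largest value (first one on ties)."""
    return [max(range(len(row)), key=row.__getitem__) for row in matrix]

# Hypothetical document-topic weights: 3 documents, 4 topics
doc_topic = [
    [0.00, 0.62, 0.10, 0.05],
    [0.48, 0.00, 0.00, 0.31],
    [0.02, 0.03, 0.00, 0.77],
]
assignments = argmax_per_row(doc_topic)
print(assignments)  # [1, 0, 3]
```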

In [23]:
mydf['Topic'] = topic_results.argmax(axis=1)

In [24]:
mydf

Unnamed: 0,File Name,Audio Transcript,Clean Transcript,Emotion,Topic
0,.\audio\Bill_Gates_Chats_with_Ellen_for_the_Fi...,I'm so happy to have you here. This is the fir...,Im happy here time on thanks know nervous entr...,"{'Happy': 0.31, 'Angry': 0.03, 'Surprise': 0.2...",1
1,.\audio\Ellen_Taught_This_Fan_How_to_Speak_Eng...,Our next guest is sitting in the audience righ...,guest sitting audience right now you Everybody...,"{'Happy': 0.24, 'Angry': 0.2, 'Surprise': 0.23...",0
2,.\audio\Joe_Alwyn_Dishes_On_'Weird__Funny__Str...,"Everywhere you go. I'm like, crazy. Well, I me...",go Im like crazy Well meant Id calm down Theyr...,"{'Happy': 0.34, 'Angry': 0.03, 'Surprise': 0.1...",8
3,.\audio\Jordan_Peterson___How_to_Have_Better_C...,Exploration. I really thought this was interes...,Exploration thought interesting intellectual d...,"{'Happy': 0.2, 'Angry': 0.07, 'Surprise': 0.2,...",5
4,.\audio\Penn_Badgley_Can_Go_From_Charming_To_C...,"Listen, everybody, you know my next guest from...",Listen everybody know guest Gossip Girl easy J...,"{'Happy': 0.33, 'Angry': 0.04, 'Surprise': 0.2...",9
5,.\audio\Small_Talk.mp3,Excuse me. Hi. I'm trying to relax. Would you ...,Excuse me Hi Im trying relax mind Oh sorry Mr ...,"{'Happy': 0.26, 'Angry': 0.02, 'Surprise': 0.1...",4
6,.\audio\The_Oprah_Conversation_—_Will_Smith_On...,"To this day, if we start talking it's 4 hours....",day start talking 4 hours 4 hours exchange sen...,"{'Happy': 0.38, 'Angry': 0.04, 'Surprise': 0.1...",6
7,.\audio\The_Viral_Voice_that_Sounds_Like_Siri_...,"My next guest blows my mind, and I'm sure she'...",guest blows mind Im sure going you Matter fact...,"{'Happy': 0.14, 'Angry': 0.06, 'Surprise': 0.2...",2
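
One rough way to compare the two runs is to count how many distinct topics each model actually assigned. The sketch below uses the `Topic` columns printed above (LDA from cell 12, NMF from cell 24). It is a quick sanity check and not a formal evaluation metric. `summarise` is a hypothetical helper.

```python
from collections import Counter

# Topic assigned to each of the 8 transcripts, copied from the tables above
lda_topics = [2, 3, 0, 2, 0, 7, 8, 0]
nmf_topics = [1, 0, 8, 5, 9, 4, 6, 2]

def summarise(assignments):
    """Return (number of distinct topics, size of the largest topic group)."""
    counts = Counter(assignments)
    return len(counts), max(counts.values())

lda_summary = summarise(lda_topics)
nmf_summary = summarise(nmf_topics)
print('LDA:', lda_summary)  # LDA: (5, 3)
print('NMF:', nmf_summary)  # NMF: (8, 1)
```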


In [25]:
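## From the above observations, NMF works better for the data I am working on.
## LDA repeated the same word list for several topics (e.g. Topics #1 and #4) and
## grouped three transcripts under Topic 0, while NMF gave each of the 8 transcripts its own topic.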