<a href="https://colab.research.google.com/github/katyalelas/NLP/blob/main/Practical_Course_05_Speech_Emotion_Classification_DONE_Klelas.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Speech Emotion Classification

## SAVEE Database
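The database includes 4 participants, each acting 15 utterances for each of the emotional states 'anger', 'disgust', 'fear', 'happiness', 'sadness' and 'surprise', and 30 for 'neutral', as the label counts printed later in this notebook show (60 samples per emotion, 120 for 'neutral').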

The database also includes all frames of the video recorded while they were acting the emotional states. The participants' faces were marked at the points of interest for facial recognition/analysis.

An example of the facial markers:
<img src="img/facial_markers.jpeg">

The facial markers have already been extracted, so we can easily use them as features.

In [1]:
## Importing the needed packages
import os 
import pandas as pd
import numpy as np
import librosa
import tqdm

## MFCCs

So let's have a quick recap of what MFCCs (mel-frequency cepstral coefficients) are. They are a 2D representation of sound that uses the frequency characteristics of human hearing to represent the frequency content of any signal. We calculate them by:
- calculating the spectrum of each short frame of the signal
- filtering the spectrum with filters that correspond to the mel frequency scale
- taking the logarithm of the filter outputs
- taking the DCT of the resulting log filter outputs
<img src="img/mfcc.png">

Thankfully, we don't have to implement that, because librosa has our backs.
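For intuition only, the steps listed above can be sketched in plain NumPy. This is a simplified illustration, not librosa's implementation: the helper names (`mel_filterbank`, `dct_ii`, `mfcc_sketch`), the frame length, the hop size and the number of filters are all assumptions chosen for the example, and the values will not match `librosa.feature.mfcc`, which is the function to reach for in practice.

```python
import numpy as np

def hz_to_mel(f):
    # A common (HTK-style) mel formula; other variants exist
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    # Triangular filters whose centres are equally spaced on the mel scale
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    bank = np.zeros((n_filters, len(fft_freqs)))
    for i in range(n_filters):
        left, centre, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - left) / (centre - left)
        falling = (right - fft_freqs) / (right - centre)
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank

def dct_ii(n_out, n_in):
    # Unnormalised DCT-II basis, one row per output coefficient
    k = np.arange(n_out)[:, None]
    n = np.arange(n_in)[None, :]
    return np.cos(np.pi * k * (2 * n + 1) / (2 * n_in))

def mfcc_sketch(signal, sr, n_fft=512, hop=256, n_filters=26, n_mfcc=13):
    # 1) spectrum of each windowed frame
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # 2) mel filtering, 3) logarithm, 4) DCT
    mel_energy = power @ mel_filterbank(n_filters, n_fft, sr).T
    log_mel = np.log(mel_energy + 1e-10)
    return log_mel @ dct_ii(n_mfcc, n_filters).T

# One second of a 440 Hz tone as a stand-in for a recorded utterance
sr = 16000
t = np.arange(sr) / sr
coeffs = mfcc_sketch(np.sin(2 * np.pi * 440.0 * t), sr)
print(coeffs.shape)  # (number of frames, number of coefficients)
```

Each row of `coeffs` describes one frame; a fixed-length feature vector per utterance is typically obtained by summarising the rows, for example with their mean and standard deviation.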

## Classifier building

Now let's build our first classifier. We'll just split the input data, scale it and send it off to an SVM.

In [3]:
from sklearn.model_selection import train_test_split

X = pd.read_csv("voice_data.csv", index_col=None, header=[0])
y = pd.read_csv("voice_labels.csv", index_col=0, header=[0])

X = X.values
y = y["labels"].values


X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, shuffle=False)

In [4]:

y = pd.read_csv("voice_labels.csv", index_col=0, header=[0])
y.labels.unique()

array(['neutral', 'happiness', 'sadness', 'disgust', 'fear', 'surprise',
       'anger'], dtype=object)

In [5]:
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

clf = SVC()

clf.fit(X_train, y_train)
predicted = clf.predict(X_test)
acc = accuracy_score(y_test, predicted)
conf_mat = confusion_matrix(y_test, predicted)

print(acc)
print(conf_mat)

0.25
[[ 0  0  0 15  0  0  0]
 [ 7  1  5  1  0  0  1]
 [ 0  0 13  1  0  0  1]
 [ 0  0  1 12  0  0  2]
 [ 5 11  7  7  0  0  0]
 [ 4  5  1  4  1  0  0]
 [ 0  0 10  1  0  0  4]]
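To read the matrix above: scikit-learn's `confusion_matrix` puts the true labels on the rows and the predicted labels on the columns. The sketch below copies the printed matrix and assumes the default alphabetical label order to recompute the accuracy and the per-class recall.

```python
import numpy as np

# Confusion matrix copied from the output above (rows: true, columns: predicted)
conf_mat = np.array([
    [0, 0, 0, 15, 0, 0, 0],
    [7, 1, 5, 1, 0, 0, 1],
    [0, 0, 13, 1, 0, 0, 1],
    [0, 0, 1, 12, 0, 0, 2],
    [5, 11, 7, 7, 0, 0, 0],
    [4, 5, 1, 4, 1, 0, 0],
    [0, 0, 10, 1, 0, 0, 4],
])
# Assumed label order: scikit-learn sorts the labels when none are passed explicitly
classes = ["anger", "disgust", "fear", "happiness", "neutral", "sadness", "surprise"]

# Accuracy is the share of samples on the diagonal
accuracy = np.trace(conf_mat) / conf_mat.sum()
# Recall per class: correct predictions divided by the number of true samples
recall = np.diag(conf_mat) / conf_mat.sum(axis=1)

print("accuracy:", accuracy)
for name, value in zip(classes, recall):
    print("{:<10s} recall = {:.2f}".format(name, value))
```

The diagonal sums to 30 out of 120 samples, which reproduces the printed accuracy of 0.25, and the per-class recall shows which emotions are never recognised at all.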


In [None]:
'anger', 'disgust', 'fear ', 'happiness', 'neutral', 'sadness' and 'surprise'

('anger', 'disgust', 'fear ', 'happiness', 'neutral', 'surprise')

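Now try to see what happens if we split the data without shuffling it, by setting the shuffle flag in the train_test_split function to False. The rows appear to be stored speaker by speaker (the label table starts with speaker DC and ends with speaker KL), so an unshuffled 75/25 split holds out the last of the 4 speakers entirely and the classifier is evaluated on a speaker it has never seen.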
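The effect of the shuffle flag can be checked on a small stand-in for the label table. The frame below is hypothetical toy data (4 speakers, 8 rows each, stored speaker by speaker), not the real `voice_labels.csv`.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy labels: rows are grouped by speaker, as in the label files
labels = pd.DataFrame({"subject": np.repeat(["DC", "JE", "JK", "KL"], 8)})
X = np.arange(len(labels)).reshape(-1, 1)

# Without shuffling, the last 25% of the rows become the test set
_, _, _, y_test_ordered = train_test_split(
    X, labels, test_size=0.25, shuffle=False)

# With shuffling, test rows are drawn from anywhere in the table
_, _, _, y_test_shuffled = train_test_split(
    X, labels, test_size=0.25, shuffle=True, random_state=0)

print("unshuffled test speakers:", y_test_ordered["subject"].unique())
print("shuffled test speakers:  ", y_test_shuffled["subject"].unique())
```

With shuffling, utterances of the same speaker usually land in both the training and the test set, which tends to make the reported accuracy look better than it would be for an unseen speaker.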

## Speaker based scaling/normalisation

In [6]:
class SpeakerBasedScaler:
    """
    A class that implements speaker based normalisation of a provided dataset. 

    Check out possible scaling techniques at http://benalexkeen.com/feature-scaling-with-scikit-learn/

    """

    def __init__(
        self,
        data: pd.DataFrame,
        labels: pd.DataFrame,
        speaker_col: str = 'subject',
        norm_type: str = 'Standard'
        ) -> None:

        self.data = data
        self.norm_data = data.copy()
        self.labels = labels
        self.speaker_col = speaker_col
        self.norm_type = norm_type
        self.speakers = labels[speaker_col].unique()
        self.problematic_features = []
    
    def scale(self) -> pd.DataFrame:

        for speaker in self.speakers:
            for feature in self.data.columns.values:
                speaker_idx = self.labels[self.speaker_col] == speaker
                
                vals = self.data.loc[speaker_idx, feature].values

                if np.count_nonzero(np.isnan(vals)):
                    print("nan value found in speaker: {} feature: {} ".format(speaker, feature))
                    print(vals)

                if self.norm_type == 'Standard':
                    
                    mu = np.nanmean(vals)
                    sd = np.nanstd(vals)

                    if sd == 0:
                        self.problematic_features.append(feature)

                    self.norm_data.loc[speaker_idx, feature] = (vals - mu) / sd
                
                elif self.norm_type == 'Robust':
                    # Median/IQR scaling, mirroring the defaults of sklearn's RobustScaler
                    median = np.nanmedian(vals)
                    q1, q3 = np.nanpercentile(vals, [25, 75])
                    iqr = q3 - q1

                    if iqr == 0:
                        self.problematic_features.append(feature)

                    self.norm_data.loc[speaker_idx, feature] = (vals - median) / iqr

                elif self.norm_type == 'MinMax':
                    # Named lo/hi so that the built-in min() and max() are not shadowed
                    lo = np.nanmin(vals)
                    hi = np.nanmax(vals)
                    value_range = hi - lo

                    if value_range == 0:
                        self.problematic_features.append(feature)

                    self.norm_data.loc[speaker_idx, feature] = (vals - lo) / value_range

                elif self.norm_type == "Mean":
                    mu = np.nanmean(vals)
                    value_range = np.nanmax(vals) - np.nanmin(vals)

                    if value_range == 0:
                        self.problematic_features.append(feature)

                    self.norm_data.loc[speaker_idx, feature] = (vals - mu) / value_range

                else:
                    raise ValueError('Unknown or not yet implemented scaling of type: {}.'.format(self.norm_type))
        
        print(set(self.problematic_features))
        self.norm_data.drop(inplace=True, axis=1, labels=set(self.problematic_features))

        return self.norm_data
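The 'Standard' branch of the class above can also be written compactly with a pandas `groupby`. The sketch below is an alternative under stated assumptions, not a replacement: `speaker_standardise` is a hypothetical helper, the toy values are invented, and zero-variance features are not detected or dropped as they are in the class.

```python
import numpy as np
import pandas as pd

def speaker_standardise(data, speakers):
    """Z-score every feature column separately within each speaker."""
    grouped = data.groupby(speakers.to_numpy())
    means = grouped.transform("mean")
    # ddof=0 gives the population standard deviation, like np.nanstd in the class
    stds = grouped.transform(lambda col: col.std(ddof=0))
    return (data - means) / stds

# Hypothetical toy features for two speakers with very different value ranges
data = pd.DataFrame({
    "f0": [100.0, 110.0, 120.0, 200.0, 220.0, 240.0],
    "energy": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
})
speakers = pd.Series(["DC"] * 3 + ["KL"] * 3)

scaled = speaker_standardise(data, speakers)
print(scaled)
```

After scaling, both speakers have zero mean and unit variance in every feature, so the classifier sees how a value deviates from that speaker's own baseline rather than the speaker's absolute range.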



In [7]:
X = pd.read_csv("voice_data.csv", index_col=None, header=[0])
y = pd.read_csv("voice_labels.csv", index_col=0, header=[0])

scaler = SpeakerBasedScaler(X, y, speaker_col="subject", norm_type="Standard")
X = scaler.scale()
y = y["labels"].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, shuffle=False)

clf = SVC()

clf.fit(X_train, y_train)
predicted = clf.predict(X_test)
acc = accuracy_score(y_test, predicted)
conf_mat = confusion_matrix(y_test, predicted)

print(acc)
print(conf_mat)

set()
0.4166666666666667
[[ 3  0  0 11  0  0  1]
 [ 1  3  2  0  3  4  2]
 [ 0  0  6  0  0  0  9]
 [ 0  0  0 11  0  0  4]
 [ 0  8  0  0 19  3  0]
 [ 1  3  0  0 10  1  0]
 [ 0  1  6  1  0  0  7]]


Let's check out a different modality (faces).

In [8]:
X = pd.read_csv("face_data.csv", index_col=None, header=[0])
y = pd.read_csv("face_labels.csv", index_col=0, header=[0])

X = X.values
y = y["labels"].values


X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, shuffle=False)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

clf = SVC()

clf.fit(X_train, y_train)
predicted = clf.predict(X_test)
acc = accuracy_score(y_test, predicted)
conf_mat = confusion_matrix(y_test, predicted)

print(acc)
print(conf_mat)

0.125
[[ 0 15  0  0  0  0  0]
 [ 0 15  0  0  0  0  0]
 [ 0 15  0  0  0  0  0]
 [ 0 15  0  0  0  0  0]
 [ 0 30  0  0  0  0  0]
 [ 0 15  0  0  0  0  0]
 [ 0 15  0  0  0  0  0]]


In [None]:
X = pd.read_csv("face_data.csv", index_col=None, header=[0])
y = pd.read_csv("face_labels.csv", index_col=0, header=[0])

scaler = SpeakerBasedScaler(X, y, speaker_col="subject", norm_type="Standard")
X = scaler.scale()
y = y["labels"].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, shuffle=False)

clf = SVC()

clf.fit(X_train, y_train)
predicted = clf.predict(X_test)
acc = accuracy_score(y_test, predicted)
conf_mat = confusion_matrix(y_test, predicted)

print(acc)
print(conf_mat)



{'90', '30', '119', '95'}
0.6416666666666667
[[ 7  0  1  0  7  0  0]
 [ 2  0 10  0  3  0  0]
 [ 0  0 13  0  0  2  0]
 [ 0  0  4 10  1  0  0]
 [ 0  0  4  1 25  0  0]
 [ 1  0  6  0  0  8  0]
 [ 0  1  0  0  0  0 14]]


In [None]:
X1 = pd.read_csv("face_data.csv", index_col=None, header=[0])
y1 = pd.read_csv("face_labels.csv", index_col=0, header=[0])

X2 = pd.read_csv("voice_data.csv", index_col=None, header=[0])
y2 = pd.read_csv("voice_labels.csv", index_col=0, header=[0])

X = pd.concat([X1, X2], axis=1)
y = y2
print(X.shape)
scaler = SpeakerBasedScaler(X, y, speaker_col="subject", norm_type="Standard")
X = scaler.scale()
y = y["labels"].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, shuffle=False)

clf = SVC()

clf.fit(X_train, y_train)
predicted = clf.predict(X_test)
acc = accuracy_score(y_test, predicted)
conf_mat = confusion_matrix(y_test, predicted)

print(acc)
print(conf_mat)

(480, 160)




{'90', '119', '95'}
0.4583333333333333
[[ 4  0  1  3  0  0  7]
 [ 0  2  4  1  2  4  2]
 [ 0  0  9  0  0  0  6]
 [ 0  0  1 11  0  0  3]
 [ 2  8  0  0 20  0  0]
 [ 0  4  0  0 11  0  0]
 [ 0  1  5  0  0  0  9]]


In [9]:
from typing import Tuple

class SpeakerBasedCrossValidationIterator:
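    """
    A class that implements speaker based cross validation. It iterates over the dataset passed in through the df DataFrame,
    taking into account the speaker information passed in through the label_df DataFrame and the speaker_col string.

    Each iteration returns the tuple (X_train, X_test, y_train, y_test), where the test part
    holds the speakers of the current fold and the train part holds all remaining speakers.
    """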
    def __init__(
        self,
        df: pd.DataFrame,
        label_df: pd.DataFrame,
        n_folds: int,
        speaker_col: str = 'subject'
        ) -> None:

        self.data = df
        self.labels = label_df
        self.n_folds = n_folds
        self.speaker_col = speaker_col
        self.speakers = self.labels[self.speaker_col].unique()
        
    
    def __iter__(self):
        self.fold_size = len(self.speakers) // self.n_folds
        self.last_fold_size = len(self.speakers) % self.n_folds + self.fold_size
        self.fold_num = 0
        return self
    
    def __next__(self) -> Tuple[pd.DataFrame, ...]:
        if self.n_folds == self.fold_num:
            raise StopIteration
        else:
            if (self.fold_num + 1) < self.n_folds:
                test_speakers = self.speakers[self.fold_num * self.fold_size : (self.fold_num + 1) * self.fold_size]
            else:
                test_speakers = self.speakers[-self.last_fold_size:]
        
        # Boolean mask over the speakers that are NOT held out in this fold
        train_mask = ~np.isin(self.speakers, test_speakers)
        train_speakers = self.speakers[train_mask]
        self.fold_num += 1
        
        train_idx = self.labels[self.speaker_col].isin(train_speakers).values
        test_idx = self.labels[self.speaker_col].isin(test_speakers).values
        
        return self.data.loc[train_idx], self.data.loc[test_idx], self.labels.loc[train_idx], self.labels.loc[test_idx]
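As a cross-check for the iterator above, scikit-learn ships group-aware splitters. The sketch below uses `LeaveOneGroupOut` on hypothetical toy data; it should correspond to the iterator when `n_folds` equals the number of speakers, while `GroupKFold` would be the closer match for fewer folds.

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut

# Hypothetical toy data: 4 speakers with 3 utterances each
speakers = np.repeat(["DC", "JE", "JK", "KL"], 3)
X = np.random.default_rng(0).normal(size=(len(speakers), 5))
y = np.tile(["anger", "fear", "neutral"], 4)

splitter = LeaveOneGroupOut()
folds = list(splitter.split(X, y, groups=speakers))

for train_idx, test_idx in folds:
    # Every fold holds out all utterances of exactly one speaker
    print("test speaker:", np.unique(speakers[test_idx]),
          "train speakers:", np.unique(speakers[train_idx]))
```

The index arrays can be used directly with `X[train_idx]` and `y[train_idx]`, or passed to helpers such as `cross_val_score` through its `groups` and `cv` arguments.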


In [None]:
X1 = pd.read_csv("face_data.csv", index_col=None, header=[0])
y1 = pd.read_csv("face_labels.csv", index_col=0, header=[0])

X2 = pd.read_csv("voice_data.csv", index_col=None, header=[0])
y2 = pd.read_csv("voice_labels.csv", index_col=0, header=[0])

# Early fusion: concatenate the face and voice feature vectors column-wise
X = pd.concat([X1, X2], axis=1)
y = y2

scaler = SpeakerBasedScaler(X, y, speaker_col="subject", norm_type="Standard")
X = scaler.scale()

cv = SpeakerBasedCrossValidationIterator(X, y, 4, speaker_col="subject")

In [None]:
test_accuracies = []

for X_train, X_test, y_train, y_test in cv:
    clf = SVC()
    
    y_train = y_train["labels"].values
    y_test = y_test["labels"].values
    clf.fit(X_train, y_train)
    predicted = clf.predict(X_test)
    acc = accuracy_score(y_test, predicted)
    
    test_accuracies.append(acc)
    conf_mat = confusion_matrix(y_test, predicted)

    print(acc)
    print(conf_mat)
    
print(np.mean(test_accuracies))

In [None]:
np.unique(predicted)

array(['anger', 'disgust', 'fear', 'happiness', 'neutral', 'sadness',
       'surprise'], dtype=object)

In [12]:
from sklearn.preprocessing import OneHotEncoder

def oh_emotion(emotion):
    print(emotion["labels"])

    
resolve_dict = {
    'anger': np.array([1, 0, 0, 0, 0, 0, 0]),
    'disgust': np.array([0, 1, 0, 0, 0, 0, 0]),
    'fear': np.array([0, 0, 1, 0, 0, 0, 0]),
    'happiness': np.array([0, 0, 0, 1, 0, 0, 0]),
    'neutral': np.array([0, 0, 0, 0, 1, 0, 0]),
    'sadness': np.array([0, 0, 0, 0, 0, 1, 0]),
    'surprise': np.array([0, 0, 0, 0, 0, 0, 1]),
}
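The hand-written `resolve_dict` above can be reproduced with scikit-learn's `LabelBinarizer`, which sorts the class names and therefore yields the same column order. This is a sketch of an alternative, not the encoding the rest of the notebook relies on.

```python
import numpy as np
from sklearn.preprocessing import LabelBinarizer

emotions = ["anger", "disgust", "fear", "happiness",
            "neutral", "sadness", "surprise"]

# Fit on the full label set so that every emotion gets its own column
binarizer = LabelBinarizer().fit(emotions)

encoded = binarizer.transform(["fear", "neutral"])
decoded = binarizer.inverse_transform(encoded)

print(binarizer.classes_)  # column order of the one-hot vectors
print(encoded)
print(decoded)
```

Unlike a plain dictionary, the fitted binarizer also converts one-hot rows back to label strings with `inverse_transform`.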


In [21]:
from sklearn.ensemble import VotingClassifier
from sklearn.metrics import accuracy_score 

X1 = pd.read_csv("face_data.csv", index_col=None, header=[0])
y1 = pd.read_csv("face_labels.csv", index_col=0, header=[0])

X2 = pd.read_csv("voice_data.csv", index_col=None, header=[0])
y2 = pd.read_csv("voice_labels.csv", index_col=0, header=[0])


In [None]:
y1.head()

  subject labels
0      DC  anger
1      DC  anger
2      DC  anger
3      DC  anger
4      DC  anger


In [None]:
y1.labels.value_counts()

neutral      120
fear          60
surprise      60
disgust       60
happiness     60
sadness       60
anger         60
Name: labels, dtype: int64

In [43]:
from sklearn.ensemble import VotingClassifier
from sklearn.metrics import accuracy_score, log_loss

X1 = pd.read_csv("face_data.csv", index_col=None, header=[0])
y1 = pd.read_csv("face_labels.csv", index_col=0, header=[0])

X2 = pd.read_csv("voice_data.csv", index_col=None, header=[0])
y2 = pd.read_csv("voice_labels.csv", index_col=0, header=[0])

print(y1)

#y1["encoded"] = y1.apply(lambda row: resolve_dict[row["labels"]], axis=1)
#y2["encoded"] = y2.apply(lambda row: resolve_dict[row["labels"]], axis=1)

print(y1.head())

scaler = SpeakerBasedScaler(X1, y1, speaker_col="subject", norm_type="Standard")
X1 = scaler.scale()

scaler = SpeakerBasedScaler(X2, y2, speaker_col="subject", norm_type="Standard")
X2 = scaler.scale()

cv1 = SpeakerBasedCrossValidationIterator(X1, y1, 4, speaker_col="subject")
cv2 = SpeakerBasedCrossValidationIterator(X2, y2, 4, speaker_col="subject")

test_accuracies = []

# Each iterator yields (X_train, X_test, y_train, y_test) for one fold.
# The loop variables are named fold1/fold2 so that cv1 and cv2 are not overwritten.
for fold1, fold2 in zip(cv1, cv2):
    X1_train, X1_test, y1_train, y1_test = fold1
    X2_train, X2_test, y2_train, y2_test = fold2

#     y1_train = y1_train["encoded"].values
#     y2_train = y2_train["encoded"].values
#     y1_test = y1_test["encoded"].values
#     y2_test = y2_test["encoded"].values
    y1_train = y1_train["labels"].values
    y2_train = y2_train["labels"].values
    y1_test = y1_test["labels"].values
    y2_test = y2_test["labels"].values
    
    clf1 = SVC(probability=True)
    clf2 = SVC(probability=True)
#     print(y1_train)
    clf1.fit(X1_train, y1_train)
    clf2.fit(X2_train, y2_train)
    
    # Use predict_proba to get the raw output probabilities
    
    
    predictions1=clf1.predict(X1_test)
    predictions2=clf2.predict(X2_test)
    probabilities1=clf1.predict_proba(X1_test) #gives the probabilities of each target class
    probabilities2=clf2.predict_proba(X2_test)
    
    
    score1 = accuracy_score(y1_test, predictions1)
    score2 = accuracy_score(y2_test, predictions2)

    print("Accuracy (face data) = " , score1)
    print("Accuracy (voice data) = " , score2)

    
    cross_entropy_loss1=log_loss(y1_test, probabilities1)
    cross_entropy_loss2=log_loss(y2_test, probabilities2)
    print()
    print( "cross-entropy loss (face data) =", cross_entropy_loss1, "cross-entropy loss (voice data) = ", cross_entropy_loss2)
  
    
    # TODO: implement the different decision logics that use the output probabilities
    # and calculate accuracies
    
    
    


    subject    labels
0        DC     anger
1        DC     anger
2        DC     anger
3        DC     anger
4        DC     anger
..      ...       ...
475      KL  surprise
476      KL  surprise
477      KL  surprise
478      KL  surprise
479      KL  surprise

[480 rows x 2 columns]
  subject labels
0      DC  anger
1      DC  anger
2      DC  anger
3      DC  anger
4      DC  anger




{'30', '95', '119', '90'}
set()
Accuracy (face data) =  0.75
Accuracy (voice data) =  0.2916666666666667

cross-entropy loss (face data) = 0.8892664570401568 cross-entropy loss (voice data) =  2.087995214343913
Accuracy (face data) =  0.5083333333333333
Accuracy (voice data) =  0.6666666666666666

cross-entropy loss (face data) = 1.0794375399260683 cross-entropy loss (voice data) =  0.8404434522044473
Accuracy (face data) =  0.7166666666666667
Accuracy (voice data) =  0.65

cross-entropy loss (face data) = 0.8679663865420495 cross-entropy loss (voice data) =  0.9472009312800286
Accuracy (face data) =  0.6416666666666667
Accuracy (voice data) =  0.4166666666666667

cross-entropy loss (face data) = 1.2048192625406897 cross-entropy loss (voice data) =  1.4060099671623216
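The decision logic left as a TODO in the loop above could combine the two probability matrices in several ways. The sketch below shows three common rules on invented numbers; `late_fusion` is a hypothetical helper, and in the notebook `probabilities1`, `probabilities2` and `clf1.classes_` would play the roles of `p_face`, `p_voice` and `classes`, assuming both classifiers list their classes in the same order.

```python
import numpy as np

def late_fusion(proba_a, proba_b, classes, rule="mean"):
    """Fuse two (n_samples, n_classes) probability arrays into labels."""
    if rule == "mean":
        # Sum rule: average the two posteriors
        fused = (proba_a + proba_b) / 2.0
    elif rule == "product":
        # Product rule: multiply, then renormalise each row
        fused = proba_a * proba_b
        fused = fused / fused.sum(axis=1, keepdims=True)
    elif rule == "max":
        # Max rule: keep the more confident score per class
        fused = np.maximum(proba_a, proba_b)
    else:
        raise ValueError("unknown rule: {}".format(rule))
    return np.asarray(classes)[fused.argmax(axis=1)], fused

# Invented probabilities for two test samples and three classes
classes = np.array(["anger", "fear", "neutral"])
p_face = np.array([[0.6, 0.3, 0.1],
                   [0.2, 0.5, 0.3]])
p_voice = np.array([[0.1, 0.2, 0.7],
                    [0.3, 0.4, 0.3]])

for rule in ("mean", "product", "max"):
    labels, _ = late_fusion(p_face, p_voice, classes, rule)
    print(rule, labels)
```

The fused labels can then be scored with `accuracy_score` against the test labels of the fold, exactly like the single-modality predictions.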


In [41]:
y1_test

array(['anger', 'anger', 'anger', 'anger', 'anger', 'anger', 'anger',
       'anger', 'anger', 'anger', 'anger', 'anger', 'anger', 'anger',
       'anger', 'disgust', 'disgust', 'disgust', 'disgust', 'disgust',
       'disgust', 'disgust', 'disgust', 'disgust', 'disgust', 'disgust',
       'disgust', 'disgust', 'disgust', 'disgust', 'fear', 'fear', 'fear',
       'fear', 'fear', 'fear', 'fear', 'fear', 'fear', 'fear', 'fear',
       'fear', 'fear', 'fear', 'fear', 'happiness', 'happiness',
       'happiness', 'happiness', 'happiness', 'happiness', 'happiness',
       'happiness', 'happiness', 'happiness', 'happiness', 'happiness',
       'happiness', 'happiness', 'happiness', 'neutral', 'neutral',
       'neutral', 'neutral', 'neutral', 'neutral', 'neutral', 'neutral',
       'neutral', 'neutral', 'neutral', 'neutral', 'neutral', 'neutral',
       'neutral', 'neutral', 'neutral', 'neutral', 'neutral', 'neutral',
       'neutral', 'neutral', 'neutral', 'neutral', 'neutral', 'neutral',
 

In [40]:
probabilities1[:, 0] # probabilities for the 1st class

array([0.37015047, 0.13757426, 0.498654  , 0.30846648, 0.24971194,
       0.22134753, 0.25462919, 0.13839954, 0.34281671, 0.40540266,
       0.3078085 , 0.508736  , 0.24666688, 0.24435996, 0.13614444,
       0.23774591, 0.1313922 , 0.1822013 , 0.16490312, 0.11152319,
       0.08982953, 0.18982303, 0.1455407 , 0.14521133, 0.11864425,
       0.16456105, 0.12391371, 0.11171126, 0.10162574, 0.10060283,
       0.07227388, 0.16323279, 0.1720238 , 0.15805841, 0.23917082,
       0.13297315, 0.12892882, 0.03359793, 0.04385446, 0.02836369,
       0.05493279, 0.03679803, 0.02670541, 0.04099123, 0.09884307,
       0.08642242, 0.05087534, 0.03773334, 0.03679619, 0.02701398,
       0.03509982, 0.02416528, 0.06671025, 0.06320882, 0.07618733,
       0.0775299 , 0.04378611, 0.07261062, 0.05614021, 0.05754928,
       0.02735267, 0.0240948 , 0.02193058, 0.02213939, 0.02067531,
       0.01908135, 0.12560722, 0.0598714 , 0.05180989, 0.06644071,
       0.05816235, 0.03500158, 0.06037566, 0.01913449, 0.02323

In [25]:
clf1.classes_

array(['anger', 'disgust', 'fear', 'happiness', 'neutral', 'sadness',
       'surprise'], dtype=object)

In [39]:
probabilities1[0, :] # row 0: the largest probability is for 'anger'

array([0.37015047, 0.15048662, 0.21609726, 0.03129923, 0.14069492,
       0.06655781, 0.02471369])

In [42]:
probabilities1[25, :] # row 25: the largest probability is for 'fear', while y1_test[25] printed above is 'disgust'

array([0.16456105, 0.12285436, 0.29366317, 0.02790483, 0.17719637,
       0.06117327, 0.15264695])
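The columns of `predict_proba` follow the order of `clf1.classes_`, so a probability row can be turned into a label with `argmax`. The sketch below copies the class list and the two rows printed above.

```python
import numpy as np

# Class order and probability rows copied from the outputs above
classes = np.array(['anger', 'disgust', 'fear', 'happiness',
                    'neutral', 'sadness', 'surprise'])
rows = np.array([
    [0.37015047, 0.15048662, 0.21609726, 0.03129923,
     0.14069492, 0.06655781, 0.02471369],   # probabilities1[0, :]
    [0.16456105, 0.12285436, 0.29366317, 0.02790483,
     0.17719637, 0.06117327, 0.15264695],   # probabilities1[25, :]
])

# The most probable class per row
predicted = classes[rows.argmax(axis=1)]
print(predicted)

# Each row is a probability distribution over the 7 classes
print(rows.sum(axis=1))
```

For an `SVC` fitted with `probability=True`, the scikit-learn documentation notes that the `argmax` of `predict_proba` may occasionally disagree with `predict`, so the two should not be assumed identical.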