## Module 9 - Case study 1

Business challenge/requirement:

Motion Studios is the largest radio production house in Europe, with total revenue of more than $1B. The company has launched a new reality show, "The Star RJ", to find a new radio jockey who will be the star presenter on upcoming shows.
In the first round, participants upload a voice clip online, and experts evaluate the clip for selection into the next round. Separate teams evaluate male and female voices in this round.
The response to the show has been unprecedented, and the company is flooded with voice clips.
As an ML expert, you have to classify each voice as male or female so that this first level of filtering is quicker.

Key issues:

Voice samples span multiple accents

Considerations:

The output from the pre-processed WAV files was saved into a CSV file

Data volume:

- Approximately 3,000 records in the file voice-classification.csv

Fields in Data:

• meanfreq: mean frequency (in kHz)
• sd: standard deviation of frequency
• median: median frequency (in kHz)
• Q25: first quartile (in kHz)
• Q75: third quartile (in kHz)
• IQR: interquartile range (in kHz)
• skew: skewness (see note in specprop description)
• kurt: kurtosis (see note in specprop description)
• sp.ent: spectral entropy
• sfm: spectral flatness
• mode: mode frequency
• centroid: frequency centroid (see specprop)
• peakf: peak frequency (frequency with highest energy)
• meanfun: average of fundamental frequency measured across acoustic signal
• minfun: minimum fundamental frequency measured across acoustic signal
• maxfun: maximum fundamental frequency measured across acoustic signal
• meandom: average of dominant frequency measured across acoustic signal
• mindom: minimum of dominant frequency measured across acoustic signal
• maxdom: maximum of dominant frequency measured across acoustic signal
• dfrange: range of dominant frequency measured across acoustic signal
• modindx: modulation index. Calculated as the accumulated absolute difference between adjacent measurements of fundamental frequencies divided by the frequency range
• label: male or female
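The `modindx` definition above can be illustrated with a short sketch. It assumes that "the frequency range" means the maximum minus the minimum of the same fundamental-frequency measurements, and that a zero range yields an index of 0. Both are assumptions, and the function name and sample values are hypothetical, not taken from the tool that produced the CSV.

```python
import numpy as np

def modulation_index(fundamental_freqs):
    """Accumulated absolute difference between adjacent measurements,
    divided by the frequency range (assumed here to be max - min)."""
    f = np.asarray(fundamental_freqs, dtype=float)
    freq_range = f.max() - f.min()
    if freq_range == 0:
        # Assumption: a flat contour has no modulation, so return 0.
        return 0.0
    return float(np.abs(np.diff(f)).sum() / freq_range)

# Illustrative fundamental-frequency contour in kHz (made-up values).
example = [0.10, 0.12, 0.11, 0.15]
# Adjacent differences: 0.02 + 0.01 + 0.04 = 0.07; range = 0.05.
print(modulation_index(example))  # roughly 1.4
```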

Business benefits:

Since "The Star RJ" is a reality show, the time available to select candidates is very short. The success of the show, and hence its profits, depends on quick and smooth execution.

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns



In [2]:
df = pd.read_csv(r'D:\E\Courses\Edureka\Assignments\Dataset\module9\voice-classification.csv')

In [3]:
print(df.shape)
df.head(2)

(3168, 21)


| | meanfreq | sd | median | Q25 | Q75 | IQR | skew | kurt | sp.ent | sfm | ... | centroid | meanfun | minfun | maxfun | meandom | mindom | maxdom | dfrange | modindx | label |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.059781 | 0.064241 | 0.032027 | 0.015071 | 0.090193 | 0.075122 | 12.863462 | 274.402906 | 0.893369 | 0.491918 | ... | 0.059781 | 0.084279 | 0.015702 | 0.275862 | 0.007812 | 0.007812 | 0.007812 | 0.0 | 0.0 | male |
| 1 | 0.066009 | 0.06731 | 0.040229 | 0.019414 | 0.092666 | 0.073252 | 22.423285 | 634.613855 | 0.892193 | 0.513724 | ... | 0.066009 | 0.107937 | 0.015826 | 0.25 | 0.009014 | 0.007812 | 0.054688 | 0.046875 | 0.052632 | male |


In [5]:
df.isna().sum()

meanfreq    0
sd          0
median      0
Q25         0
Q75         0
IQR         0
skew        0
kurt        0
sp.ent      0
sfm         0
mode        0
centroid    0
meanfun     0
minfun      0
maxfun      0
meandom     0
mindom      0
maxdom      0
dfrange     0
modindx     0
label       0
dtype: int64

In [7]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3168 entries, 0 to 3167
Data columns (total 21 columns):
 #   Column    Non-Null Count  Dtype  
---  ------    --------------  -----  
 0   meanfreq  3168 non-null   float64
 1   sd        3168 non-null   float64
 2   median    3168 non-null   float64
 3   Q25       3168 non-null   float64
 4   Q75       3168 non-null   float64
 5   IQR       3168 non-null   float64
 6   skew      3168 non-null   float64
 7   kurt      3168 non-null   float64
 8   sp.ent    3168 non-null   float64
 9   sfm       3168 non-null   float64
 10  mode      3168 non-null   float64
 11  centroid  3168 non-null   float64
 12  meanfun   3168 non-null   float64
 13  minfun    3168 non-null   float64
 14  maxfun    3168 non-null   float64
 15  meandom   3168 non-null   float64
 16  mindom    3168 non-null   float64
 17  maxdom    3168 non-null   float64
 18  dfrange   3168 non-null   float64
 19  modindx   3168 non-null   float64
 20  label     3168 non-null   object

In [14]:
# Features: the first 20 columns; target: the 'label' column as a 1-D array
x = df.iloc[:, 0:20].values
y = df.iloc[:, 20].values

In [17]:
from sklearn.preprocessing import LabelEncoder

labelencoder = LabelEncoder()
y = labelencoder.fit_transform(y)
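To read the confusion matrices and classification reports that follow, it helps to know which integer stands for which class. `LabelEncoder` sorts the class names, so with the two labels in this dataset `female` becomes 0 and `male` becomes 1. A small sketch on illustrative labels (not the real column):

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Illustrative labels only; the real notebook encodes df['label'].
sample_labels = np.array(["male", "female", "male", "female"])

encoder = LabelEncoder()
encoded = encoder.fit_transform(sample_labels)

print(encoder.classes_)                    # class names in sorted order
print(encoded)                             # female -> 0, male -> 1
print(encoder.inverse_transform([0, 1]))   # map integers back to names
```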

In [18]:
from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=0)

In [21]:
len(x_train), len(x_test), len(y_train), len(y_test)

(2534, 634, 2534, 634)

##### applying logistic regression to predict voice

In [23]:
from sklearn.linear_model import LogisticRegression

log_reg_model = LogisticRegression()
log_reg_model.fit(x_train, y_train)



LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
                   intercept_scaling=1, l1_ratio=None, max_iter=100,
                   multi_class='warn', n_jobs=None, penalty='l2',
                   random_state=None, solver='warn', tol=0.0001, verbose=0,
                   warm_start=False)

In [25]:
y_pred1 = log_reg_model.predict(x_test)

In [34]:
from sklearn.metrics import confusion_matrix, accuracy_score, classification_report

In [27]:
confusion_matrix(y_test, y_pred1)

array([[257,  44],
       [ 10, 323]], dtype=int64)

In [28]:
accuracy_score(y_test, y_pred1)

0.9148264984227129

In [35]:
print(classification_report(y_test, y_pred1))

              precision    recall  f1-score   support

           0       0.96      0.85      0.90       301
           1       0.88      0.97      0.92       333

    accuracy                           0.91       634
   macro avg       0.92      0.91      0.91       634
weighted avg       0.92      0.91      0.91       634
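The figures in the classification report can be recomputed by hand from the logistic regression confusion matrix shown above. In scikit-learn's convention, rows are true classes and columns are predicted classes. A minimal sketch of the arithmetic:

```python
import numpy as np

# Confusion matrix reported for logistic regression (rows: true, columns: predicted).
cm = np.array([[257, 44],
               [10, 323]])

accuracy = np.trace(cm) / cm.sum()             # (257 + 323) / 634
recall = np.diag(cm) / cm.sum(axis=1)          # correct / all samples of each true class
precision = np.diag(cm) / cm.sum(axis=0)       # correct / all predictions of each class
f1 = 2 * precision * recall / (precision + recall)

print("accuracy :", round(accuracy, 4))
print("precision:", np.round(precision, 2))
print("recall   :", np.round(recall, 2))
print("f1-score :", np.round(f1, 2))
```

Class 0 has high precision but lower recall because 44 of its 301 samples were predicted as class 1.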



##### applying decision tree classifier to predict voice

In [29]:
from sklearn.tree import DecisionTreeClassifier

dec_tree_class_model = DecisionTreeClassifier()
dec_tree_class_model.fit(x_train, y_train)

DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=None,
                       max_features=None, max_leaf_nodes=None,
                       min_impurity_decrease=0.0, min_impurity_split=None,
                       min_samples_leaf=1, min_samples_split=2,
                       min_weight_fraction_leaf=0.0, presort=False,
                       random_state=None, splitter='best')

In [31]:
y_pred2 = dec_tree_class_model.predict(x_test)

In [32]:
confusion_matrix(y_test, y_pred2)

array([[285,  16],
       [ 12, 321]], dtype=int64)

In [33]:
accuracy_score(y_test, y_pred2)

0.9558359621451105

In [36]:
print(classification_report(y_test, y_pred2))

              precision    recall  f1-score   support

           0       0.96      0.95      0.95       301
           1       0.95      0.96      0.96       333

    accuracy                           0.96       634
   macro avg       0.96      0.96      0.96       634
weighted avg       0.96      0.96      0.96       634



##### applying random forest to predict voice

In [37]:
from sklearn.ensemble import RandomForestClassifier

rand_for_class_model = RandomForestClassifier(n_estimators=20)
rand_for_class_model.fit(x_train, y_train)

RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
                       max_depth=None, max_features='auto', max_leaf_nodes=None,
                       min_impurity_decrease=0.0, min_impurity_split=None,
                       min_samples_leaf=1, min_samples_split=2,
                       min_weight_fraction_leaf=0.0, n_estimators=20,
                       n_jobs=None, oob_score=False, random_state=None,
                       verbose=0, warm_start=False)

In [39]:
y_pred3 = rand_for_class_model.predict(x_test)

In [40]:
confusion_matrix(y_test, y_pred3)

array([[296,   5],
       [  7, 326]], dtype=int64)

In [60]:
accuracy_score(y_test, y_pred3)

0.9810725552050473
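The decision tree and random forest above are created with `random_state=None`, so re-running the cells can give slightly different confusion matrices. A minimal sketch of fixing the seed follows. It uses synthetic data from `make_classification` as a stand-in for the voice features, and the seed value 42 is arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the 20 voice features (not the real dataset).
X, y = make_classification(n_samples=400, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

def fit_and_predict(seed):
    """Train a 20-tree forest with a fixed seed and predict the test split."""
    model = RandomForestClassifier(n_estimators=20, random_state=seed)
    model.fit(X_train, y_train)
    return model.predict(X_test)

first_run = fit_and_predict(seed=42)
second_run = fit_and_predict(seed=42)

# With the same seed, both runs produce identical predictions.
print((first_run == second_run).all())
```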

##### applying naive bayes to predict voice

In [61]:
from sklearn.naive_bayes import GaussianNB

gnb_model = GaussianNB()
gnb_model.fit(x_train, y_train)

GaussianNB(priors=None, var_smoothing=1e-09)

In [63]:
y_pred4 = gnb_model.predict(x_test)

In [64]:
confusion_matrix(y_test, y_pred4)

array([[261,  40],
       [ 38, 295]], dtype=int64)

In [65]:
accuracy_score(y_test, y_pred4)

0.8769716088328076

##### applying k nearest neighbour to predict voice

In [66]:
# apply StandardScaler so that all independent variables are on the same scale

from sklearn.preprocessing import StandardScaler
sc_x = StandardScaler()
x = sc_x.fit_transform(x)

In [67]:
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=0)

In [68]:
from sklearn.neighbors import KNeighborsClassifier
knn_class_model = KNeighborsClassifier()
knn_class_model.fit(x_train, y_train)

KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
                     metric_params=None, n_jobs=None, n_neighbors=5, p=2,
                     weights='uniform')

In [70]:
y_pred5 = knn_class_model.predict(x_test)

In [71]:
confusion_matrix(y_test, y_pred5)

array([[293,   8],
       [  6, 327]], dtype=int64)

In [72]:
accuracy_score(y_test, y_pred5)

0.9779179810725552
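The scaler above is fitted on the full feature matrix before the train/test split, so the test rows contribute to the mean and variance used for scaling. One common alternative is to put the scaler and the classifier in a pipeline that is fitted on the training split only. The sketch below uses synthetic data as a stand-in for the voice features, so its accuracy is not comparable to the figure above.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the 20 voice features (not the real dataset).
X, y = make_classification(n_samples=400, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# The scaler learns its mean and variance from the training rows only;
# the same transformation is then applied to the test rows at predict time.
knn_pipeline = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
knn_pipeline.fit(X_train, y_train)

test_accuracy = knn_pipeline.score(X_test, y_test)
print(f"test accuracy: {test_accuracy:.3f}")
```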

##### applying svm to predict voice

In [73]:
from sklearn.svm import SVC

svc_model = SVC(kernel='linear')
svc_model.fit(x_train, y_train)

SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
    decision_function_shape='ovr', degree=3, gamma='auto_deprecated',
    kernel='linear', max_iter=-1, probability=False, random_state=None,
    shrinking=True, tol=0.001, verbose=False)

In [75]:
y_pred6 = svc_model.predict(x_test)

In [76]:
confusion_matrix(y_test, y_pred6)

array([[289,  12],
       [  5, 328]], dtype=int64)

In [77]:
accuracy_score(y_test, y_pred6)

0.973186119873817

##### Applying grid search algorithm to check SVC with different types of kernel and C values

In [78]:
from sklearn.model_selection import GridSearchCV

parameters = [{'C':[1,2,3,4,5,6,7,8,9,10], 'kernel':['sigmoid', 'rbf', 'linear', 'poly']}]

In [79]:
grid_search = GridSearchCV(estimator=svc_model, param_grid=parameters,
                          scoring='accuracy', n_jobs=-1)

# estimator: the model to tune; param_grid: the parameter combinations to evaluate
# scoring: metric used to rank the combinations; n_jobs=-1 uses all available CPU cores

grid_search = grid_search.fit(x_train, y_train)
best_accuracy = grid_search.best_score_
best_parameters = grid_search.best_params_



In [80]:
best_accuracy

0.9810576164167324

In [81]:
best_parameters

{'C': 9, 'kernel': 'rbf'}
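`best_score_` is the mean cross-validated score of the best parameter combination, computed on folds of the training data. To get a figure comparable to the test accuracies of the other models, the refitted best estimator can also be scored on the held-out test split. A minimal sketch follows. It uses synthetic data and a smaller grid than the one above, and the pipeline step names `scale` and `svc` are chosen here for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the 20 voice features (not the real dataset).
X, y = make_classification(n_samples=300, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Scaling happens inside each cross-validation fold.
pipeline = Pipeline([("scale", StandardScaler()), ("svc", SVC())])
param_grid = {"svc__C": [1, 5, 10], "svc__kernel": ["linear", "rbf"]}

search = GridSearchCV(pipeline, param_grid=param_grid, scoring="accuracy", cv=5)
search.fit(X_train, y_train)

cv_accuracy = search.best_score_             # mean accuracy over the CV folds
test_accuracy = search.score(X_test, y_test) # accuracy of the refitted best model

print("best params:", search.best_params_)
print("mean CV accuracy:", round(cv_accuracy, 3))
print("held-out test accuracy:", round(test_accuracy, 3))
```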

## Best classifier model for voice type prediction is the SVM classifier with RBF kernel and C=9

Accuracy = 98.1% (mean cross-validated accuracy on the training split, as reported by `best_score_`)
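For reference, the accuracies reported in this notebook can be collected side by side. The sketch below copies the numbers from the outputs above. The grid search figure is kept separate because it is a cross-validation score on the training split, while the others were measured on the 634-sample test split. The top values are close, so the ranking should be read with caution.

```python
# Test-split accuracies copied from the outputs reported in this notebook.
test_accuracy = {
    "Logistic regression": 0.9148264984227129,
    "Decision tree": 0.9558359621451105,
    "Random forest (20 trees)": 0.9810725552050473,
    "Gaussian naive Bayes": 0.8769716088328076,
    "k-nearest neighbours": 0.9779179810725552,
    "SVM (linear kernel)": 0.973186119873817,
}

# Mean cross-validated accuracy from the grid search (SVC, C=9, rbf kernel).
grid_search_cv_accuracy = 0.9810576164167324

# Sort the test-split results from highest to lowest accuracy.
ranked = sorted(test_accuracy.items(), key=lambda item: item[1], reverse=True)

for name, acc in ranked:
    print(f"{name:<28} {acc:.4f}")
print(f"{'SVM (rbf, C=9), CV score':<28} {grid_search_cv_accuracy:.4f}")
```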