## Evaluating classification techniques for speaker characterization
### Laura Fernández Gallardo

Using the models trained with clean speech, analyze the effects of testing with distorted speech on the classification performance.

* WAAT classification
* multilabel classification

As done when tuning models, the evaluation metric considered is the **average per-class accuracy**.
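
To make the metric concrete, here is a minimal sketch of the average per-class accuracy in plain Python. The helper name `average_per_class_accuracy` and the toy labels are illustrative assumptions, not part of the original notebook. The quantity is the mean of the per-class recalls, which is what the `recall_macro` scoring used later in this notebook reports.

```python
from collections import defaultdict

def average_per_class_accuracy(y_true, y_pred):
    """Mean over classes of the fraction of correctly labelled instances."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)

# toy labels (hypothetical): 8 'low' instances and 2 'high' instances
y_true = ['low'] * 8 + ['high'] * 2
y_pred = ['low'] * 10  # a classifier that always predicts 'low'

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
balanced = average_per_class_accuracy(y_true, y_pred)

print(plain_accuracy)  # 0.8
print(balanced)        # 0.5
```

On imbalanced data the plain accuracy rewards the majority-class guess, whereas the per-class average does not.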

In [1]:
import io
import requests
import pickle # save / load models

import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
import seaborn as sns

%matplotlib inline

In [2]:
# fix random seed for reproducibility
seed = 2302
np.random.seed(seed)

## Speech degradations

The speech degradations employed in this study are outlined in the [README.md file](https://github.com/laufergall/ML_Speaker_Characteristics) of this repository.


In [None]:
# load features from clean speech


In [5]:
# load features from degraded speech

# (gitignored file)
file_feats = r'..\data\extracted_features\eGeMAPSv01a_semispontaneous_splitted_distorted.csv'
feats_dist = pd.read_csv(file_feats) # shape feats_dist

# distortion code
feats_dist['distnum'] = feats_dist['name'].str.slice(5,7)

# add column with packet loss info
feats_dist['packetlossrate'] = feats_dist['name'].str.extract('(?<=P)(.*?)(?=-)', expand=False)

# add column with jitter info
feats_dist['jitterms'] = feats_dist['name'].str.extract('(?<=J)(.*?)(?=%)', expand=False)

# load mapping of degradation names
path = "https://raw.githubusercontent.com/laufergall/ML_Speaker_Characteristics/master/data/distortions/"
url = path + "dist_mapping.csv"
s = requests.get(url).content
dist_mapping =pd.read_csv(io.StringIO(s.decode('utf-8')), sep = ';')

# padding distortion codes with zeroes
dist_mapping['distnum'] = dist_mapping['distnum'].astype(str).str.zfill(2)

# merge feats with distmapping to know name of distortions
feats_dist = feats_dist.merge(dist_mapping)



In [10]:
# how many degradations
print(feats_dist['distcode'].value_counts()) # around 4490 instances for each degradation
print(len(feats_dist['distcode'].value_counts())) # 53 degradations

SWB_Opus_24       4493
SWB_EVS_7_2       4482
WB_Speex_23_8     4478
SWB_G7221C_48     4477
WB_AMRWB+_24      4477
NB_GSMEFR_12_2    4476
NB_Speex_2_15     4471
SWB_EVS_128       4469
WB_AMRWB+_13_6    4466
WB_Speex_42_2     4465
WB_AMRWB_12_65    4465
SWB_Opus_32       4464
SWB_Opus_64       4464
NB_AMRNB_6_7      4460
WB_AMRWB_23_85    4460
WB_Speex_3_95     4459
SWB_EVS_24_4      4459
WB_AMRWB_23_05    4458
WB_AMRWB+_20_8    4456
NB_AMRNB_7_4      4456
NB_G7231_6_3      4455
SWB_EVS_48        4454
NB_G711_A_64      4454
NB_AMRNB_5_9      4454
NB_G7231_5_3      4453
WB_G722_64        4452
NB_AMRNB_4_75     4451
NB_AMRNB_10_2     4451
SWB_EVS_96        4451
WB_AMRWB_8_85     4448
WB_AMRWB_19_85    4446
WB_AMRWB+_15_2    4445
NB_AMRNB_5_15     4445
WB_AMRWB_14_25    4444
SWB_G7221C_24     4443
WB_AMRWB+_16_8    4441
SWB_Opus_160      4440
NB_Speex_11       4439
SWB_EVS_64        4439
WB_AMRWB+_10_4    4437
WB_AMRWB+_12      4437
NB_G711_u_64      4432
NB_AMRNB_7_95     4429
SWB_EVS_32 

In [11]:
dist_mapping

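|   | distnum | distcode |
|---|---------|----------|
| 0 | 0 | Clean |
| 1 | 1 | NB_G711_A_64 |
| 2 | 2 | NB_G711_u_64 |
| 3 | 3 | NB_G7231_5_3 |
| 4 | 4 | NB_G7231_6_3 |
| 5 | 5 | NB_GSMEFR_12_2 |
| 6 | 6 | NB_AMRNB_4_75 |
| 7 | 7 | NB_AMRNB_5_15 |
| 8 | 8 | NB_AMRNB_5_9 |
| 9 | 9 | NB_AMRNB_6_7 |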


## WAAT classification

The WAAT (warmth-attractiveness) score distribution was already explored in Part I.

In [None]:
# load WAAT scores (averaged across listeners)

path = "https://raw.githubusercontent.com/laufergall/Subjective_Speaker_Characteristics/master/data/generated_data/"

url = path + "factorscores_malespk.csv"
s = requests.get(url).content
scores_m =pd.read_csv(io.StringIO(s.decode('utf-8')))

url = path + "factorscores_femalespk.csv"
s = requests.get(url).content
scores_f =pd.read_csv(io.StringIO(s.decode('utf-8')))

# rename dimensions
scores_m.columns = ['sample_heard', 'warmth', 'attractiveness', 'confidence', 'compliance', 'maturity']
scores_f.columns = ['sample_heard', 'warmth', 'attractiveness', 'compliance', 'confidence', 'maturity']

# join male and female scores
# (pd.concat replaces DataFrame.append, which was removed in pandas 2.0)
scores = pd.concat([scores_m, scores_f])
scores['gender'] = scores['sample_heard'].str.slice(0,1)
scores['spkID'] = scores['sample_heard'].str.slice(1,4).astype('int')

scores.head()

In [None]:
# scatter plot

sns.lmplot(x='warmth', y='attractiveness', data = scores, hue="gender")

In [None]:
# histogram, kernel density estimation
sns.jointplot(x='warmth', y='attractiveness', data = scores, kind="kde").set_axis_labels("warmth", "attractiveness")

Get 3 clusters of speakers based on the WAAT distribution.
Each cluster with approx. the same number of instances.

In [None]:
# applying k-means
from sklearn.cluster import KMeans

n_clusters=3

kmeans = KMeans(n_clusters=n_clusters, random_state=2302).fit(scores[['warmth','attractiveness']])

scores['class'] = pd.Categorical(kmeans.labels_).rename_categories(['low','high','mid'])

sns.lmplot(x='warmth', y='attractiveness', data = scores, hue="class",fit_reg=False)
 
print(scores['class'].value_counts())    
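
The cluster indices returned by k-means are arbitrary, so a hard-coded mapping from indices to names holds only for one particular fit. K-means also does not enforce equal cluster sizes. As a hedged sketch on synthetic points (not the WAAT scores), one way to name clusters independently of the index order is to rank them by centroid. Ranking by the first coordinate is an assumption made only for this illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.RandomState(2302)

# synthetic 2-D points around three well-separated centres (illustrative only)
centres = np.array([[-2.0, -2.0], [0.0, 0.0], [2.0, 2.0]])
points = np.vstack([c + 0.2 * rng.randn(30, 2) for c in centres])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=2302).fit(points)

# rank clusters by the first coordinate of their centroid, then name them
order = np.argsort(kmeans.cluster_centers_[:, 0])
names = {int(cluster): name for cluster, name in zip(order, ['low', 'mid', 'high'])}
classes = np.array([names[int(label)] for label in kmeans.labels_])

# cluster sizes per name
print({name: int((classes == name).sum()) for name in ['low', 'mid', 'high']})
```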

## Select trait for binary classification and perform data partition

**Removing** speakers in the mid class to address binary classification.

In [None]:
# remove speakers in the mid class

scores = scores.loc[ scores['class'] != 'mid', ['spkID','gender','class']]

scores.head()


In [None]:
scores['class'] = pd.Categorical(scores['class'], categories=['low','high'])

print(scores.groupby(['gender','class']).count())

Split speakers into train (75%) and test (25%) speakers with class and gender balance (stratified) by creating the dummy "genderclass" class.
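
The idea of stratifying on a combined gender and class key can be sketched on a toy table. The speaker counts below are made up for illustration; only the `train_test_split` call mirrors the one used in this notebook.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# hypothetical speaker table: 40 speakers per gender/class combination
toy = pd.DataFrame({
    'gender': ['w'] * 80 + ['m'] * 80,
    'class': (['low'] * 40 + ['high'] * 40) * 2,
})

# combined key, e.g. 'wlow' or 'mhigh', used only for stratification
toy['genderclass'] = toy['gender'] + toy['class']

train, test = train_test_split(toy,
                               test_size=0.25,
                               stratify=toy['genderclass'],
                               random_state=2302)

print(train['genderclass'].value_counts().sort_index())
print(test['genderclass'].value_counts().sort_index())
```

Because every combination has the same size here, each one contributes the same number of speakers to the test partition.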

In [None]:
# get stratified random partition for train and test
from sklearn.model_selection import train_test_split

scores['genderclass']=scores[['gender', 'class']].apply(lambda x: ''.join(x), axis=1)

indexes = np.arange(0,len(scores))
classes = scores['class']
train_i, test_i, train_y, test_y = train_test_split(indexes, 
                                                    classes, 
                                                    test_size=0.25, 
                                                    stratify = scores['genderclass'], 
                                                    random_state=2302)

scores_train = scores.iloc[train_i,:] 
scores_test = scores.iloc[test_i,:] 

print('Number of speakers in Train:',len(scores_train))
print('Number of speakers in Test:',len(scores_test))

print('Number of w-high speakers in Train:', len(scores_train.loc[scores_train['genderclass']=='whigh']) )
print('Number of m-high speakers in Train:', len(scores_train.loc[scores_train['genderclass']=='mhigh']) )
print('Number of w-low speakers in Train:', len(scores_train.loc[scores_train['genderclass']=='wlow']) )
print('Number of m-low speakers in Train:', len(scores_train.loc[scores_train['genderclass']=='mlow']) )

print('Number of w-high speakers in Test:', len(scores_test.loc[scores_test['genderclass']=='whigh']) )
print('Number of m-high speakers in Test:', len(scores_test.loc[scores_test['genderclass']=='mhigh']) )
print('Number of w-low speakers in Test:', len(scores_test.loc[scores_test['genderclass']=='wlow']) )
print('Number of m-low speakers in Test:', len(scores_test.loc[scores_test['genderclass']=='mlow']) )


# # save these data for other evaluations
scores_train.iloc[:,0:3].to_csv(r'..\data\generated_data\speakerIDs_cls_WAAT_train.csv', index=False)
scores_test.iloc[:,0:3].to_csv(r'..\data\generated_data\speakerIDs_clss_WAAT_test.csv', index=False)

## Speech features

Speech features have been extracted from the semi-spontaneous dialogs uttered by the 300 speakers of the [NSC corpus](http://www.qu.tu-berlin.de/?id=nsc-corpus). 

Each semi-spontaneous dialog was split into 3 segments of approx. 20s, and the 88 [eGeMAPS](http://ieeexplore.ieee.org/document/7160715/) speech features were extracted from each segment (see ..\feature_extraction).

299 speakers recorded 4 semi-spontaneous dialogs, and 1 female speaker recorded 1 semi-spontaneous dialog. Total = 1197 dialogs * 3 segments = 3591 speech files.
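
The file count can be checked with a few lines of arithmetic (the variable names are illustrative):

```python
# 299 speakers with 4 dialogs each, plus 1 speaker with a single dialog
n_dialogs = 299 * 4 + 1 * 1

# each dialog is split into 3 segments
n_files = n_dialogs * 3

print(n_dialogs, n_files)  # 1197 3591
```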

Unfortunately, no subjective ratings have been collected for the spontaneous dialogs d5, d7, or d8. However, we use the speech features in order to have more instances with which to train and test the models.

**I assume** that the speakers' trait classes remain constant across recordings; that is, if a speaker is perceived as 'high' in the _intelligent_ trait for dialog 6 (d6, pizza dialog), then this perception would be the same for the other dialogs uttered by the same speaker.

In [None]:
# load speech features

path = "https://raw.githubusercontent.com/laufergall/ML_Speaker_Characteristics/master/data/extracted_features/"

url = path + "eGeMAPSv01a_semispontaneous_splitted.csv"
s = requests.get(url).content
feats =pd.read_csv(io.StringIO(s.decode('utf-8')), sep = ';') # shape: 3591, 89

feats.describe()

Pre-processing features with the transformation **learnt with training data**:

* center and scale speech features
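
As a small sketch of what "learnt with training data" means, the example below fits a `StandardScaler` on synthetic training data and applies the same transformation to synthetic test data. The arrays are random numbers chosen for illustration, not speech features.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.RandomState(2302)
train = rng.normal(loc=5.0, scale=2.0, size=(200, 3))
test = rng.normal(loc=6.0, scale=2.0, size=(50, 3))

# mean and standard deviation are estimated on the training data only
scaler = StandardScaler().fit(train)

train_s = scaler.transform(train)
test_s = scaler.transform(test)  # reuses the training statistics

print(train_s.mean(axis=0).round(3), train_s.std(axis=0).round(3))
print(test_s.mean(axis=0).round(3))
```

The scaled training features have zero mean and unit variance by construction. The scaled test features generally do not, because no test statistics enter the transformation.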

In [None]:
# Separate instances according to the train and test partition
# instances corresponding to speakers in the mid class will be left out

# extract speaker ID from speech file name
feats['spkID'] = feats['name'].str.slice(2, 5).astype('int')

# appending class label
feats_class_train = pd.merge(feats, scores_train[['spkID','genderclass','class']], how='inner')
feats_class_test = pd.merge(feats, scores_test[['spkID','genderclass','class']], how='inner')

print('Number of high instances in Train:', len(feats_class_train.loc[feats_class_train['class']=='high']) )
print('Number of low instances in Train:', len(feats_class_train.loc[feats_class_train['class']=='low']) )
print('Number of high instances in Test:', len(feats_class_test.loc[feats_class_test['class']=='high']) )
print('Number of low instances in Test:', len(feats_class_test.loc[feats_class_test['class']=='low']) )

feats_class_train.head()

In [None]:
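# Standardize speech features  
from sklearn.preprocessing import StandardScaler

# save feature names
feats_names = list(feats_class_train.drop(['name','spkID','genderclass','class'],axis=1))

# write one (repr-quoted) feature name per line; the file is closed on exit
with open(r'.\data_while_tuning\feats_names.csv', 'w') as myfile:
    for item in feats_names:
        myfile.write("%r\n" % item)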

# learn transformation on training data
scaler = StandardScaler()
scaler.fit(feats_class_train.drop(['name','spkID','genderclass','class'],axis=1))

# numpy n_instances x n_feats
feats_s_train = scaler.transform(feats_class_train.drop(['name','spkID','genderclass','class'],axis=1))
feats_s_test = scaler.transform(feats_class_test.drop(['name','spkID','genderclass','class'],axis=1)) 

## Model tuning with feature selection

Use the train data to find the classifier and its hyperparameters leading to the best performance. 

Perform feature selection with "SelectKBest": selecting best k features based on ANOVA F-value computed between class label and feature. k ranging from 2 to total number of features in intervals of 5.
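
The following sketch shows the grid of k values, assuming 88 feature columns, together with a `SelectKBest` run on synthetic data in which only two columns carry class information. The data and the choice of informative columns are assumptions made for this example.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif

# candidate k values for 88 features, as produced by np.arange(2, n_features, 5)
k_grid = np.arange(2, 88, 5)
print(k_grid)

# synthetic data: 10 features, only columns 3 and 7 depend on the class
rng = np.random.RandomState(2302)
y = np.repeat([0, 1], 50)
X = rng.randn(100, 10)
X[:, 3] += 3.0 * y
X[:, 7] -= 3.0 * y

# keep the 2 features with the highest ANOVA F-value
selector = SelectKBest(f_classif, k=2).fit(X, y)
selected_idx = np.flatnonzero(selector.get_support())
print(selected_idx)
```

With 88 features this grid ends at k = 87, so the complete feature set itself is not among the candidates.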

In [None]:
from sklearn.pipeline import Pipeline
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import recall_score

"""
Summarize results of cross-validation on set A for hyperparameter tuning

Inputs:
- cname: classifier name
- grid_result: gridsearch results for this classifier, output of grid.fit(AX, Ay)  
- filename: filename with path to write the summary to
"""
def summary_tuning(cname, grid_result, filename):
    
    means = grid_result.cv_results_['mean_test_score']
    stds = grid_result.cv_results_['std_test_score']
    params = grid_result.cv_results_['params']

    # print best result and append to our lists
    print("%r -> Best cross-val score on A set: %f using %s" % (cname, grid_result.best_score_, grid_result.best_params_))
    
    # dataframe with summary    
    d = {
        'model': cname, 
        'mean_acc_A': means, 
        'stdev_acc_A': stds, 
        'params': params, 
    }
    df = pd.DataFrame(data = d) 
    df.to_csv(filename, index=False)
      
                           
        

"""
Perform nested hyperparameter tuning.
Given training data split into A, B sets and for each classifier type:
Stratified cross-validation for feature selection and hyperparameter tuning using set A
Generates csv file with summary of hp tuning (set A)
Evaluate the performance on set B and return accs

Input:
- AX and BX: features of the train set, split
- Ay and By: labels of the train set, split
- get_cls_functions: list of functions to get classifier and dict of hp to tune

Output: 
- cls_acc_hps: pandas dataframe with:
    - classifiers names
    - classifiers hyperparameters
    - selected features
    - accuracies on B set
    of each tuned classifier corresponding to get_cls_functions
- trained_cls_list: list of classifiers tuned and trained on (AX+BX)
"""    

def hp_tuner(AX, BX, Ay, By, get_cls_functions):

    # init lists (there will be one element per classifier in get_cls_functions)
    
    classifiers_names = []
    classifiers = []
    hparam_grids = []
    best_accs = [] # on the B set
    best_hps = [] # determined with CV on A
    sel_feats_i = [] # indexes of selected features
    sel_feats = [] # names of selected features
    trained_cls_list = [] # tuned classifier trained on X,y
    
    # iterate over list of functions 
    # to get classifiers and parameters and append to our lists

    for fn in get_cls_functions:     
        clsname, cls, hp = fn()
        classifiers_names.append(clsname)
        classifiers.append(cls)
        hparam_grids.append(hp)
        
    # tune hyperparameters with GridSearchCV for each classifier
    
    for i in np.arange(len(classifiers)):

        # create pipeline
        pipe = Pipeline([
            ('selecter', SelectKBest(f_classif, k=4)),
            ('classifier', classifiers[i])
        ])
        
        # feature selection params
        fsel_params = dict(
            selecter__k = np.arange(2, AX.shape[1], 5)
        )

        # feature selection params and classifier's params for param_grid: 
        all_params = {**fsel_params, **hparam_grids[i]}
        
        # perform grid search
        grid = GridSearchCV(estimator=pipe,
                            param_grid=all_params,
                            scoring='recall_macro', # average per-class accuracy 
                            n_jobs=1,
                            cv=10)
        
        # This might take a while:
        grid_result = grid.fit(AX, Ay) 
        
        # summary of hp tuning on set A
        # generate one csv file per classifier
        summary_tuning(classifiers_names[i], 
                       grid_result, 
                       r'.\data_while_tuning\%s_tuning.csv' % classifiers_names[i])

        # get selected features on set A
        sel_i = grid_result.best_estimator_.named_steps['selecter'].get_support()
        selected = [name for name, keep in zip(feats_names, sel_i) if keep]
        print("%r -> Selected features: %r" % (classifiers_names[i], selected))
        sel_feats_i.append(sel_i)
        sel_feats.append(selected)
        
        # evaluate classifier on set B
        By_pred = grid_result.best_estimator_.predict(BX)
        score_on_B = recall_score(By, By_pred, average='macro')
        print("%r -> Average per-class accuracy on B set: %f\n" % (classifiers_names[i], score_on_B))
        
        # add score on B and hyperparams for output
        best_accs.append(score_on_B)
        best_hps.append(grid_result.best_params_)
        
        # train classifier using all training data with this classifier
        X = np.concatenate((AX, BX), axis=0)
        y = np.concatenate((Ay, By), axis=0)
        trained_cls = grid_result.best_estimator_.fit(X,y)
        trained_cls_list.append(trained_cls)
    
    # create the output dataframe    
    d = {
        'classifiers_names': classifiers_names, 
        'best_accs': best_accs, 
        'best_hps': best_hps, 
        'sel_feats': sel_feats, 
        'sel_feats_i': sel_feats_i
    }
    cls_acc_hps = pd.DataFrame(data = d) 
    
    return cls_acc_hps, trained_cls_list

Main code snippet to evaluate classification accuracy:
    
* Choose data (feature and labels) for train X and y and test Xt and yt
* Split train data into A and B sets
* Hyperparameter tuner using A and B sets data by calling hp_tuner()
    * For each classifier type:
        * Stratified cross-validation for hyperparameter tuning using set A
        * Evaluate the performance on set B
* Select classifier based on the best performance on set B and train it using all training data   
* Get performance on test set

(Nested hyperparameter tuning inspired by [A. Zheng](http://www.oreilly.com/data/free/evaluating-machine-learning-models.csp))
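
The steps above can be condensed into a compact sketch on synthetic data. It uses a single classifier and a small grid. The dataset from `make_classification`, the grid values, and the 5-fold setting are assumptions chosen to keep the example fast, not the settings used in this notebook.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline

# synthetic data standing in for the speech features and class labels
X, y = make_classification(n_samples=300, n_features=12, n_informative=4,
                           random_state=2302)

# outer split: train / test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=2302)

# inner split of the train data: A (tuning) / B (meta-evaluation)
AX, BX, Ay, By = train_test_split(
    X_train, y_train, test_size=0.20, stratify=y_train, random_state=2302)

pipe = Pipeline([
    ('selecter', SelectKBest(f_classif)),
    ('classifier', LogisticRegression(solver='liblinear')),
])
param_grid = {'selecter__k': [2, 7, 12], 'classifier__C': [0.1, 1.0, 10.0]}

# cross-validated grid search on set A, scored by average per-class accuracy
grid = GridSearchCV(pipe, param_grid, scoring='recall_macro', cv=5).fit(AX, Ay)

# performance of the tuned pipeline on the held-out set B
score_on_B = recall_score(By, grid.best_estimator_.predict(BX), average='macro')

# refit the tuned pipeline on all training data, then evaluate on the test set
final_model = grid.best_estimator_.fit(X_train, y_train)
score_on_test = recall_score(y_test, final_model.predict(X_test), average='macro')

print(grid.best_params_, score_on_B, score_on_test)
```

With several classifier types, the score on B would be compared across the tuned pipelines before touching the test set.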


In [None]:
# training data. Features and labels
X = feats_s_train
y = feats_class_train['class'].cat.codes

# test data. Features and labels
Xt = feats_s_test
yt = feats_class_test['class'].cat.codes

# split train data into 80% and 20% subsets - with balance in trait and gender
# give subset A to the inner hyperparameter tuner
# and hold out subset B for meta-evaluation
AX, BX, Ay, By = train_test_split(X, y, test_size=0.20, stratify = feats_class_train['genderclass'], random_state=2302)

print('Number of instances in A (hyperparameter tuning):',AX.shape[0])
print('Number of instances in B (meta-evaluation):',BX.shape[0])
    

In [None]:
# dataframe with results from hp tuner to be appended
tuning_all = pd.DataFrame()

# list with tuned classifiers trained on training data, to be appended
trained_all = []

In [None]:
# save splits

import csv

# original features and class
feats_class_train.to_csv(r'.\data_while_tuning\feats_class_train.csv', index=False)
feats_class_test.to_csv(r'.\data_while_tuning\feats_class_test.csv', index=False)

# train/test partitions, features and labels
np.save(r'.\data_while_tuning\X.npy', X)
np.save(r'.\data_while_tuning\y.npy', y)
np.save(r'.\data_while_tuning\Xt.npy', Xt)
np.save(r'.\data_while_tuning\yt.npy', yt)

# # A/B splits, features and labels
np.save(r'.\data_while_tuning\AX.npy', AX)
np.save(r'.\data_while_tuning\BX.npy', BX)
np.save(r'.\data_while_tuning\Ay.npy', Ay)
np.save(r'.\data_while_tuning\By.npy', By)



In [None]:
"""
Saving outputs of hp tuning to disk
Called after tuning each classifier

Input:
- tuning_all: pandas df with tuning results
- trained_all: list of all classifiers trained on training data
""" 
def save_tuning(tuning_all, trained_all):
    
    # save tuning_all
    tuning_all.to_csv(r'.\data_while_tuning\tuning_all.csv', index=False)
    
    # save trained_all
    for i in np.arange(len(trained_all)):
        filename = r'.\data_while_tuning\trained_' + tuning_all.loc[i, 'classifiers_names'] + '.sav'
        with open(filename, 'wb') as f:
            pickle.dump(trained_all[i], f)
        
"""
Loading outputs of hp tuning from disk
Called to recover what was tuned and trained in previous sessions

Output:
- tuning_all: pandas df with tuning results
- trained_all: list of all classifiers trained on training data
""" 
def load_tuning():
    
    # load tuning_all
    tuning_all = pd.read_csv(r'.\data_while_tuning\tuning_all.csv')
    
    # load trained_all
    trained_all=[]
    for i in np.arange(len(tuning_all)):
        filename = r'.\data_while_tuning\trained_' + tuning_all.loc[i, 'classifiers_names'] + '.sav'
        with open(filename, 'rb') as f:
            loaded_model = pickle.load(f)
        trained_all.append(loaded_model)
        
    return tuning_all, trained_all

### Calling hp_tuner() for each classifier individually

**Recover** when a new ipynb session is started
(workaround for running hyperparameter tuning over several days)

In [None]:

# original features and class
feats_class_train = pd.read_csv(r'.\data_while_tuning\feats_class_train.csv')
feats_class_test = pd.read_csv(r'.\data_while_tuning\feats_class_test.csv')
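# the file has no header row and the names were written repr-quoted ('...')
feats_names = pd.read_csv(r'.\data_while_tuning\feats_names.csv', header=None, quotechar="'")[0].tolist()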

# train/test partitions, features and labels
X = np.load(r'.\data_while_tuning\X.npy')
y = np.load(r'.\data_while_tuning\y.npy')
Xt = np.load(r'.\data_while_tuning\Xt.npy')
yt = np.load(r'.\data_while_tuning\yt.npy')

# A/B splits, features and labels
AX = np.load(r'.\data_while_tuning\AX.npy')
BX = np.load(r'.\data_while_tuning\BX.npy')
Ay = np.load(r'.\data_while_tuning\Ay.npy')
By = np.load(r'.\data_while_tuning\By.npy')


In [None]:
# Loading outputs of hp tuning from disk
tuning_all, trained_all = load_tuning()

Call this after each experiment: 

In [None]:
# save tuning_all (.csv) and trained_all (nameclassifier.sav)
save_tuning(tuning_all, trained_all)

#### GaussianNB

*class sklearn.naive_bayes.GaussianNB(priors=None)*

No parameters to tune for this classifier. Priors not specified, so they will be adjusted given the data.
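
As a brief illustration of priors being estimated from the data, the sketch below fits `GaussianNB` on synthetic features with a 30/10 class split and inspects the fitted `class_prior_` attribute. The data are random numbers used only for this example.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.RandomState(2302)
X = rng.randn(40, 2)              # synthetic features
y = np.array([0] * 30 + [1] * 10)  # imbalanced labels: 30 vs. 10

model = GaussianNB().fit(X, y)

# with priors=None, the priors are the relative class frequencies
print(model.class_prior_)  # [0.75 0.25]
```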

In [None]:
from sklearn.naive_bayes import GaussianNB

"""
Naive Bayes Classifier
"""
def get_GaussianNB2tune():

    model = GaussianNB()
    hp = dict()
    return 'GaussianNB', model, hp

# Hyperparameter tuning with this model
tuning, trained = hp_tuner(AX, BX, Ay, By, [get_GaussianNB2tune])

# update lists of tuning info and trained classifiers
tuning_all = pd.concat([tuning_all, tuning], ignore_index=True)  # DataFrame.append was removed in pandas 2.0
trained_all.append(trained)

In [None]:
from ast import literal_eval  # parses the stringified params dicts

# open generated file with results of fitting GridSearchCV
 
sgrid = pd.read_csv(r'.\data_while_tuning\GaussianNB_tuning.csv')
print(sgrid['params'].head())

# params to dataframe
params_dict = sgrid['params'].apply(lambda x: literal_eval(x) ).to_dict()
params_df = pd.DataFrame(data = params_dict).transpose()

# plot acc vs. k
sns.pointplot(x='selecter__k', y='mean_acc_A', data=sgrid.join(params_df)) 

Naive Bayes does not perform well, and no trend indicates which number of selected features works best.

#### LogisticRegression

*class sklearn.linear_model.LogisticRegression(penalty='l2', dual=False, tol=0.0001, C=1.0, fit_intercept=True, intercept_scaling=1, class_weight=None, random_state=None, solver='liblinear', max_iter=100, multi_class='ovr', verbose=0, warm_start=False, n_jobs=1)*

Tuning C (inverse of regularization strength). The 'liblinear' solver (a good choice for small datasets) handles L1 penalty.
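
To illustrate the role of C, the sketch below fits an L1-penalized logistic regression with the 'liblinear' solver on synthetic data for three values of C and counts the non-zero coefficients. The data, the values of C, and the use of the L1 penalty are assumptions made for this example; the tuning cell in this notebook keeps the default penalty.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(2302)
y = np.repeat([0, 1], 100)
X = rng.randn(200, 20)
X[:, 0] += 2.0 * y  # only the first feature carries class information

nonzero_by_C = {}
for C in [0.01, 1.0, 100.0]:
    model = LogisticRegression(penalty='l1', solver='liblinear', C=C,
                               random_state=2302).fit(X, y)
    # number of features with a non-zero weight
    nonzero_by_C[C] = int(np.count_nonzero(model.coef_))

print(nonzero_by_C)
```

A smaller C means stronger regularization, which with the L1 penalty tends to drive more coefficients to exactly zero.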


In [None]:
from sklearn.linear_model import LogisticRegression

"""
Logistic Regression
"""
def get_LogisticRegression2tune():

    model = LogisticRegression()
    hp = dict(
        #classifier__penalty = ['l1','l2'],
        classifier__C = np.logspace(-3,3,num=7)
    )
    return 'LogisticRegression', model, hp

# Hyperparameter tuning with this model
tuning, trained = hp_tuner(AX, BX, Ay, By, [get_LogisticRegression2tune])

# update lists of tuning info and trained classifiers
tuning_all = pd.concat([tuning_all, tuning], ignore_index=True)  # DataFrame.append was removed in pandas 2.0
trained_all.append(trained)

In [None]:
# open generated file with results of fitting GridSearchCV
 
sgrid = pd.read_csv(r'.\data_while_tuning\LogisticRegression_tuning.csv')

# params to dataframe
params_dict = sgrid['params'].apply(lambda x: literal_eval(x) ).to_dict()
params_df = pd.DataFrame(data = params_dict).transpose()

# plot acc vs. params
sns.pointplot(x='selecter__k', y='mean_acc_A',hue='classifier__C', data=sgrid.join(params_df)) 

Including more features benefits the performance of logistic regression. The behavior is similar for all C >= 0.1.

#### K Nearest Neighbors

In [None]:
from sklearn.neighbors import KNeighborsClassifier

"""
K Nearest Neighbors
"""
def get_KNeighborsClassifier2tune():

    model = KNeighborsClassifier()
    hp = dict(
        classifier__n_neighbors = list(range(1,40))
    )
    return 'KNeighborsClassifier', model, hp

# Hyperparameter tuning with this model
tuning, trained = hp_tuner(AX, BX, Ay, By, [get_KNeighborsClassifier2tune])

# update lists of tuning info and trained classifiers
tuning_all = pd.concat([tuning_all, tuning], ignore_index=True)  # DataFrame.append was removed in pandas 2.0
trained_all.append(trained)

In [None]:
# open generated file with results of fitting GridSearchCV
 
sgrid = pd.read_csv(r'.\data_while_tuning\KNeighborsClassifier_tuning.csv')

# params to dataframe
params_dict = sgrid['params'].apply(lambda x: literal_eval(x) ).to_dict()
params_df = pd.DataFrame(data = params_dict).transpose()

# plot acc vs. params
params_df = params_df.loc[params_df['classifier__n_neighbors']<10,:] # selecting only lower k
sns.pointplot(x='selecter__k', y='mean_acc_A',hue='classifier__n_neighbors', data=sgrid.join(params_df)) 

The performance of KNN classification tends to improve with more features when using 6 neighbors or fewer.

#### Support Vector Machines

In [None]:
from sklearn.svm import SVC

"""
Support Vector Machines
"""
def get_SVC2tune():
    
    model = SVC()
    hp = dict(
        classifier__C = np.logspace(-5,3,num=9),
        classifier__kernel = ['poly'], #['linear', 'poly', 'rbf', 'sigmoid'],
        classifier__degree = [2] #, # only 'poly' kernel
        #classifier__gamma = np.logspace(-5,3,num=9)
    )
    return 'SVC', model, hp

# Hyperparameter tuning with this model
tuning, trained = hp_tuner(AX, BX, Ay, By, [get_SVC2tune])

# # update lists of tuning info and trained classifiers
# tuning_all = tuning_all.append(tuning, ignore_index=True)
# trained_all.append(trained)

#### DecisionTreeClassifier

In [None]:
from sklearn.tree import DecisionTreeClassifier

"""
Decision Trees
"""
def get_DecisionTreeClassifier2tune():
    
    model = DecisionTreeClassifier()
    hp = dict(
        classifier__max_depth = np.arange(2,4)#np.arange(2,11)
    )
    return 'DecisionTreeClassifier', model, hp

# Hyperparameter tuning with this model
tuning, trained = hp_tuner(AX, BX, Ay, By, [get_DecisionTreeClassifier2tune])

# # update lists of tuning info and trained classifiers
# tuning_all = tuning_all.append(tuning, ignore_index=True)
# trained_all.append(trained)

#### RandomForestClassifier

In [None]:
from sklearn.ensemble import RandomForestClassifier

"""
Random Forest
"""
def get_RandomForestClassifier2tune():
    
    model = RandomForestClassifier()
    hp = dict(
        classifier__n_estimators = np.arange(2,4)#np.arange(2,51)
    )
    return 'RandomForestClassifier', model, hp

# Hyperparameter tuning with this model
tuning, trained = hp_tuner(AX, BX, Ay, By, [get_RandomForestClassifier2tune])

# # update lists of tuning info and trained classifiers
# tuning_all = tuning_all.append(tuning, ignore_index=True)
# trained_all.append(trained)

In [None]:
# select the classifier that gave the maximum acc on B set
best_accs = tuning_all['best_accs']
i_best = best_accs.idxmax()

print('Selected classifier based on the best performance on B: %r (accB = %0.2f)' % (tuning_all.loc[i_best,'classifiers_names'], round(best_accs[i_best],2)))

In [None]:
from sklearn.metrics import recall_score, confusion_matrix, classification_report

# predictions on the test set 
# feat sel on Xt is performed by the classifier
yt_pred = trained_all[i_best][0].predict(Xt)

score_on_test = recall_score(yt, yt_pred, average='macro')

print("Average per-class accuracy on test: %f" % score_on_test) 

cm = confusion_matrix(yt, yt_pred)
print(classification_report(yt, yt_pred, digits = 2))
print(cm)
 