## Evaluating regression techniques for speaker characterization
### Laura Fernández Gallardo

I am addressing the prediction of each of the 5 traits of interpersonal speaker [characteristics](https://github.com/laufergall/Subjective_Speaker_Characteristics): 'warmth', 'attractiveness', 'confidence', 'compliance', and 'maturity'. These were obtained after factor analysis on the 34-dimensional ratings of speaker characteristics.

In this notebook, I concentrate only on the 'warmth' trait, while other notebooks deal with the detection of other characteristics and traits, or with multi-output regression.

I will use the common RMSE (root mean squared error) as the evaluation metric.
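As a reminder of what the metric measures, here is a minimal sketch of RMSE computed with NumPy. The helper name `rmse` and the toy values are illustrative assumptions; they are not part of `reg_tuning` or of the notebook's data.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between two 1-d arrays (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # square the residuals, average them, then take the square root
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# toy values, for illustration only
example_rmse = rmse([0.5, -1.0, 1.5], [0.0, -1.0, 1.0])
```

Because the trait scores are scaled to unit standard deviation, an RMSE close to 1 would be roughly what a constant prediction of the mean achieves.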

In [4]:
import io
import requests
import time # for timestamps

import numpy as np
import pandas as pd
from ast import literal_eval # parsing hp after tuner

from reg_tuning import * # my helper functions

from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

import matplotlib.pyplot as plt
import seaborn as sns

%matplotlib inline

In [5]:
# fix random seed for reproducibility
seed = 2302
np.random.seed(seed)

In [6]:
# features and ratings from the regression task with 1d output

feats_ratings_train = pd.read_csv(r'.\data_while_tuning\feats_ratings_train.csv')
feats_ratings_test = pd.read_csv(r'.\data_while_tuning\feats_ratings_test.csv')

sc_names = ['non_likable', 'secure', 'attractive', 'unsympathetic', 'indecisive', 'unobtrusive', 'distant', 'bored', 'emotional', 'not_irritated', 'active', 'pleasant', 'characterless', 'sociable', 'relaxed', 'affectionate', 'dominant', 'unaffected', 'hearty', 'old', 'personal', 'calm', 'incompetent', 'ugly', 'friendly', 'masculine', 'submissive', 'indifferent', 'interesting', 'cynical', 'artificial', 'intelligent', 'childish', 'modest']

dropcolumns = ['name','spkID','speaker_gender'] + sc_names
feats_names = list(feats_ratings_train.drop(dropcolumns, axis=1))

## Speakers' WAAT

WAAT (warmth-attractiveness) can be seen as the first two dimensions of the perceived speaker characteristics. Scaled scores (with mean = 0 and std = 1) of speakers on these dimensions were already extracted in the [subjective analysis](https://github.com/laufergall/Subjective_Speaker_Characteristics), for males and for females separately.

In [7]:
# speaker scores

path = "https://raw.githubusercontent.com/laufergall/Subjective_Speaker_Characteristics/master/data/generated_data/"

url = path + "factorscores_malespk.csv"
s = requests.get(url).content
scores_m = pd.read_csv(io.StringIO(s.decode('utf-8')))

url = path + "factorscores_femalespk.csv"
s = requests.get(url).content
scores_f = pd.read_csv(io.StringIO(s.decode('utf-8')))

# rename dimensions
scores_m.columns = ['sample_heard', 'warmth', 'attractiveness', 'confidence', 'compliance', 'maturity']
scores_f.columns = ['sample_heard', 'warmth', 'attractiveness', 'compliance', 'confidence', 'maturity']

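# join male and female scores
# (DataFrame.append was removed in pandas 2.0; sort=True keeps the alphabetical column order)
scores = pd.concat([scores_m, scores_f], sort=True)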
scores['gender'] = scores['sample_heard'].str.slice(0,1)
scores['spkID'] = scores['sample_heard'].str.slice(1,4).astype('int')

scores.head()

Unnamed: 0,attractiveness,compliance,confidence,maturity,sample_heard,warmth,gender,spkID
0,-0.579301,-0.921918,0.608503,0.27658,m004_linden_stimulus.wav,-0.284638,m,4
1,0.442865,-0.950212,0.588889,0.630295,m005_nicosia_stimulus.wav,-0.494019,m,5
2,-0.507534,0.139302,-0.151077,-0.669449,m006_rabat_stimulus.wav,1.533478,m,6
3,1.180748,-0.108982,0.962166,1.026359,m007_klaksvik_stimulus.wav,0.478983,m,7
4,1.070247,-0.284278,-0.875589,-1.291311,m016_beirut_stimulus.wav,1.861551,m,16
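The `gender` and `spkID` columns are derived from fixed character positions in `sample_heard`. The sketch below repeats that slicing on two of the file names shown above; the `toy` dataframe is only an illustration and assumes every file name follows the same one-letter-plus-three-digits prefix.

```python
import pandas as pd

# two file names taken from the output above
toy = pd.DataFrame({'sample_heard': ['m004_linden_stimulus.wav',
                                     'm016_beirut_stimulus.wav']})

# first character -> gender label, next three characters -> numeric speaker ID
toy['gender'] = toy['sample_heard'].str.slice(0, 1)
toy['spkID'] = toy['sample_heard'].str.slice(1, 4).astype('int')
```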


In [8]:
# save names of speaker traits

traits_names = list(scores.drop(['spkID','gender','sample_heard'], axis=1))

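# write the trait names (not the 34 item names in sc_names) and close the file afterwards
with open(r'.\data_while_tuning\traits_names.csv', 'w') as myfile:
    for item in traits_names:
        myfile.write("%r\n" % item)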

In [9]:
# merge scores and features

feats_ratings_scores_train = feats_ratings_train.merge(scores) # (2700, 132)
feats_ratings_scores_test = feats_ratings_test.merge(scores) # (891, 132)

# drop unnecessary columns
feats_ratings_scores_train = feats_ratings_scores_train.drop(['speaker_gender','sample_heard'] + sc_names, axis = 1)
feats_ratings_scores_test = feats_ratings_scores_test.drop(['speaker_gender','sample_heard'] + sc_names, axis = 1)

# 'name' + 88 features + 5 traits + 'gender' + 'spkID'
# shape train: (2700, 96), shape test: (891, 96) 
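The merge attaches one row of speaker-level scores to every utterance of that speaker, which is why the trait values repeat across utterances. A minimal sketch on toy data follows; the frames `feats` and `spk` and the explicit `on`, `how` and `validate` arguments are assumptions for illustration, whereas the notebook relies on the default join over shared column names.

```python
import pandas as pd

# toy utterance-level features: two utterances of speaker 4, one of speaker 5
feats = pd.DataFrame({'name': ['a_01.wav', 'a_02.wav', 'b_01.wav'],
                      'spkID': [4, 4, 5],
                      'feat': [0.1, 0.2, 0.3]})

# toy speaker-level scores: one row per speaker
spk = pd.DataFrame({'spkID': [4, 5],
                    'warmth': [-0.28, -0.49]})

# many utterances map to one speaker; validate raises if spkID is duplicated in spk
merged = feats.merge(spk, on='spkID', how='left', validate='many_to_one')
```

Passing `validate='many_to_one'` is a cheap guard against accidentally duplicating rows when the key is not unique on the speaker side.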

In [10]:
# Standardize speech features  

dropcolumns = ['name','gender','spkID'] + list(scores_m.columns)[1:]

# learn transformation on training data
scaler = StandardScaler()
scaler.fit(feats_ratings_scores_train.drop(dropcolumns, axis=1))

# numpy n_instances x n_feats
feats_s_train = scaler.transform(feats_ratings_scores_train.drop(dropcolumns, axis=1))
feats_s_test = scaler.transform(feats_ratings_scores_test.drop(dropcolumns, axis=1)) 
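The key point of the cell above is that the mean and standard deviation are learned from the training partition only and then reused for the test partition. The NumPy-only sketch below mirrors that idea on random toy data; it illustrates the principle and is not a replacement for `StandardScaler`.

```python
import numpy as np

rng = np.random.RandomState(2302)
toy_train = rng.normal(loc=5.0, scale=2.0, size=(100, 3))
toy_test = rng.normal(loc=5.0, scale=2.0, size=(20, 3))

# statistics come from the training data only (population std, as StandardScaler uses)
mu = toy_train.mean(axis=0)
sigma = toy_train.std(axis=0)

toy_train_s = (toy_train - mu) / sigma
# the test data are transformed with the training statistics,
# so their mean and std are only approximately 0 and 1
toy_test_s = (toy_test - mu) / sigma
```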

## Model tuning with feature selection

Use the training data to find the regressor and the hyperparameters that lead to the best performance.

In [11]:
target_trait = 'warmth'

In [12]:
# training data. Features and labels
X = feats_s_train # (2700, 88)
y = feats_ratings_scores_train[target_trait].values # (2700,)

# test data. Features and labels
Xt = feats_s_test # (891, 88)
yt = feats_ratings_scores_test[target_trait].values # (891,)

# split train data into 80% and 20% subsets, stratified by speaker gender
# give subset A to the inner hyperparameter tuner
# and hold out subset B for meta-evaluation
AX, BX, Ay, By = train_test_split(X, y, test_size=0.20, stratify=feats_ratings_scores_train['gender'], random_state=seed)

print('Number of instances in A (hyperparameter tuning):',AX.shape[0])
print('Number of instances in B (meta-evaluation):',BX.shape[0])

Number of instances in A (hyperparameter tuning): 2160
Number of instances in B (meta-evaluation): 540
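To illustrate what stratifying by gender does, the sketch below splits toy data and keeps the gender proportion identical in both subsets. All `toy_*` names and the labels `'m'` / `'f'` are assumptions for illustration; the labels actually used in the data may differ.

```python
import numpy as np
from sklearn.model_selection import train_test_split

toy_X = np.arange(200).reshape(100, 2)         # 100 instances, 2 features
toy_y = np.linspace(-1, 1, 100)                # continuous target
toy_gender = np.array(['m'] * 40 + ['f'] * 60) # 40% / 60% split

# passing the gender array as a third input returns its split as well
tr_X, ho_X, tr_y, ho_y, tr_g, ho_g = train_test_split(
    toy_X, toy_y, toy_gender,
    test_size=0.20, stratify=toy_gender, random_state=2302)

# the 40/60 proportion is preserved in both subsets
n_m_train = int((tr_g == 'm').sum())
n_m_holdout = int((ho_g == 'm').sum())
```

Note that stratification acts on the categorical gender label only; a continuous target such as a trait score would need to be binned first if balance in the target were also required.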


In [13]:
# save splits

# train/test partitions, features and labels
np.save(r'.\data_while_tuning\X_' + target_trait + '.npy', X)
np.save(r'.\data_while_tuning\y_' + target_trait + '.npy', y)
np.save(r'.\data_while_tuning\Xt_' + target_trait + '.npy', Xt)
np.save(r'.\data_while_tuning\yt_' + target_trait + '.npy', yt)

# A/B splits, features and labels
np.save(r'.\data_while_tuning\AX_' + target_trait + '.npy', AX)
np.save(r'.\data_while_tuning\BX_' + target_trait + '.npy', BX)
np.save(r'.\data_while_tuning\Ay_' + target_trait + '.npy', Ay)
np.save(r'.\data_while_tuning\By_' + target_trait + '.npy', By)

In [14]:
# dataframe with results from hp tuner to be appended
tuning_all = pd.DataFrame()

# list with tuned regressors trained on training data, to be appended
trained_all = []

### Calling hp_tuner() for each target and each regressor

**Recover** the data with the cells below when a new ipynb session is started.

(Workaround for running hyperparameter tuning over several days.)

In [None]:
# original features and ratings

feats_ratings_train = pd.read_csv(r'.\data_while_tuning\feats_ratings_train.csv')

feats_ratings_test = pd.read_csv(r'.\data_while_tuning\feats_ratings_test.csv')

feats_names = pd.read_csv(r'.\data_while_tuning\feats_names.csv', header = None)
feats_names = feats_names.values.tolist()

traits_names = pd.read_csv(r'.\data_while_tuning\traits_names.csv', header = None)
traits_names = traits_names.values.tolist()


In [None]:
# select a trait
# perform this on a loop later
target_trait = traits_names[0][0].strip('"\'')

# train/test partitions, features and labels
X = np.load(r'.\data_while_tuning\X_' + target_trait + '.npy')
y = np.load(r'.\data_while_tuning\y_' + target_trait + '.npy')
Xt = np.load(r'.\data_while_tuning\Xt_' + target_trait + '.npy')
yt = np.load(r'.\data_while_tuning\yt_' + target_trait + '.npy')

# A/B splits, features and labels
AX = np.load(r'.\data_while_tuning\AX_' + target_trait + '.npy')
BX = np.load(r'.\data_while_tuning\BX_' + target_trait + '.npy')
Ay = np.load(r'.\data_while_tuning\Ay_' + target_trait + '.npy')
By = np.load(r'.\data_while_tuning\By_' + target_trait + '.npy')

In [None]:
# Loading outputs of hp tuning from disk
tuning_all, trained_all = load_tuning(target_trait)

Call this after each experiment **to recover later**: 

In [None]:
# save tuning_all (.csv) and trained_all (nameregressor.sav)
save_tuning(tuning_all, trained_all, target_trait)
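The bodies of `save_tuning` and `load_tuning` live in `reg_tuning` and are not shown here. As a rough sketch of the save-and-recover idea only, the code below writes a results table to CSV and pickles a list of fitted objects; the function names, file names, and toy contents are all hypothetical and may differ from the actual helpers.

```python
import os
import pickle
import tempfile

import pandas as pd

def save_results_sketch(tuning_df, trained_list, trait, folder):
    """Hypothetical stand-in: write the results table and the trained objects."""
    tuning_df.to_csv(os.path.join(folder, 'tuning_' + trait + '.csv'), index=False)
    with open(os.path.join(folder, 'trained_' + trait + '.sav'), 'wb') as f:
        pickle.dump(trained_list, f)

def load_results_sketch(trait, folder):
    """Hypothetical stand-in: read back what save_results_sketch wrote."""
    tuning_df = pd.read_csv(os.path.join(folder, 'tuning_' + trait + '.csv'))
    with open(os.path.join(folder, 'trained_' + trait + '.sav'), 'rb') as f:
        trained_list = pickle.load(f)
    return tuning_df, trained_list

# placeholder contents, for illustration only
toy_tuning = pd.DataFrame({'regressor': ['LinearRegression'], 'n_feats': [3]})
toy_trained = [{'name': 'LinearRegression'}]

with tempfile.TemporaryDirectory() as tmp:
    save_results_sketch(toy_tuning, toy_trained, 'warmth', tmp)
    loaded_tuning, loaded_trained = load_results_sketch('warmth', tmp)
```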

In [None]:
from sklearn.linear_model import LinearRegression

def get_LinearRegression2tune():
    """
    Linear Regression
    """
    model = LinearRegression()
    hp = dict()
    return 'LinearRegression', model, hp

# Hyperparameter tuning with this model
tuning, trained = hp_tuner(AX, BX, Ay, By, 
                           [get_LinearRegression2tune], 
                           target_trait,
                           feats_names,
                           np.arange(1, AX.shape[1]), 
                           'grid')

# update lists of tuning info and trained regressors
tuning_all = pd.concat([tuning_all, tuning], ignore_index=True) # DataFrame.append was removed in pandas 2.0
trained_all.append(trained)
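Since `hp_tuner` is a helper from `reg_tuning` whose internals are not shown here, the following is only a sketch of what grid-searching the number of selected features for a linear regressor could look like with plain scikit-learn. The use of `SelectKBest` with `f_regression`, 5-fold cross-validation, and the synthetic `toy_*` data are all assumptions; the actual helper may work differently.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline

# synthetic stand-in for the speech features and the trait scores
toy_X, toy_y = make_regression(n_samples=300, n_features=10, n_informative=4,
                               noise=5.0, random_state=2302)
toy_AX, toy_BX, toy_Ay, toy_By = train_test_split(
    toy_X, toy_y, test_size=0.20, random_state=2302)

# feature selection and regression in one pipeline, so that selection
# is re-fitted inside every cross-validation fold
pipe = Pipeline([('select', SelectKBest(score_func=f_regression)),
                 ('reg', LinearRegression())])

# candidate numbers of features: 1 up to all features
param_grid = {'select__k': np.arange(1, toy_AX.shape[1] + 1)}

search = GridSearchCV(pipe, param_grid, scoring='neg_mean_squared_error', cv=5)
search.fit(toy_AX, toy_Ay)

# meta-evaluation on the held-out subset B
best_k = int(search.best_params_['select__k'])
rmse_B = float(np.sqrt(np.mean((toy_By - search.predict(toy_BX)) ** 2)))
```

In this sketch the grid runs up to and including all features, whereas `np.arange(1, AX.shape[1])` in the call above stops one short of the full feature count.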

In [2]:
import pandas as pd

feats_ratings_scores_train = pd.read_csv(r'.\data_while_tuning\feats_ratings_scores_train.csv')
feats_ratings_scores_test = pd.read_csv(r'.\data_while_tuning\feats_ratings_scores_test.csv')

In [3]:
feats_ratings_scores_test.head()

Unnamed: 0,name,F0semitoneFrom27.5Hz_sma3nz_amean,F0semitoneFrom27.5Hz_sma3nz_stddevNorm,F0semitoneFrom27.5Hz_sma3nz_percentile20.0,F0semitoneFrom27.5Hz_sma3nz_percentile50.0,F0semitoneFrom27.5Hz_sma3nz_percentile80.0,F0semitoneFrom27.5Hz_sma3nz_pctlrange0-2,F0semitoneFrom27.5Hz_sma3nz_meanRisingSlope,F0semitoneFrom27.5Hz_sma3nz_stddevRisingSlope,F0semitoneFrom27.5Hz_sma3nz_meanFallingSlope,...,MeanUnvoicedSegmentLength,StddevUnvoicedSegmentLength,equivalentSoundLevel_dBp,spkID,attractiveness,compliance,confidence,maturity,warmth,gender
0,'m018_rotterdam_d5_01.wav',24.66507,0.154307,21.50342,24.24933,28.13913,6.635715,174.9657,224.6961,56.81863,...,0.192258,0.318299,-33.36011,18,-0.108876,-1.550649,1.035135,-0.723008,-0.754499,m
1,'m018_rotterdam_d5_02.wav',26.94494,0.252551,23.42789,25.06512,30.00163,6.573738,347.4387,952.5995,225.4744,...,0.206333,0.288438,-33.65467,18,-0.108876,-1.550649,1.035135,-0.723008,-0.754499,m
2,'m018_rotterdam_d5_03.wav',25.49493,0.200455,23.1493,25.7672,28.17774,5.02844,129.9314,96.66322,106.0074,...,0.490588,0.57281,-35.34972,18,-0.108876,-1.550649,1.035135,-0.723008,-0.754499,m
3,'m018_rotterdam_d6_01.wav',26.37638,0.269255,23.2099,25.70293,28.59468,5.384777,197.6909,145.6062,73.44101,...,0.291429,0.481637,-33.43142,18,-0.108876,-1.550649,1.035135,-0.723008,-0.754499,m
4,'m018_rotterdam_d6_02.wav',26.74776,0.22475,23.83732,25.42774,29.04288,5.205564,107.6499,136.2754,451.4274,...,0.350667,0.495358,-32.69361,18,-0.108876,-1.550649,1.035135,-0.723008,-0.754499,m
