# SVM Models

## Objective

The objective of this notebook is to train and test different SVM models, varying their hyperparameters, in order to obtain the best SVM model.
<br><br>
As discussed in "basic_models.ipynb", the transformations used are Minimum and Geometric, each in its 3D and 2D variant.

## Loading libraries and data

In [1]:
# importing important libraries

# transformations library
from transformations import minimum, geometric, minimum2D, geometric2D

# models
from sklearn.svm import SVC

# loading data
import pickle

# other modules
from sklearn.model_selection import cross_val_score
from sklearn.metrics import recall_score, make_scorer
from sklearn.model_selection import GridSearchCV
import numpy as np

In [2]:
# load the base training dataset
data_path = "TrainTestData/train_data.pickle"
with open(data_path, "rb") as data_file:
    data = pickle.load(data_file)

In [3]:
# function that calculates weighted_accuracy
# weights are based on the frequency of the letters in the Portuguese alphabet
# source: https://pt.wikipedia.org/wiki/Alfabeto_portugu%C3%AAs#Frequ%C3%AAncia_da_ocorr%C3%AAncia_de_letras
# H, K, J, X and Z are not present
LETTERS_FREQUENCY = [
    14.63,
    1.04,
    3.88,
    5.01,
    12.57,
    1.02,
    1.30,
    6.18,
    2.78,
    4.74,
    5.05,
    10.73,
    2.52,
    1.20,
    6.53,
    7.81,
    4.34,
    4.63,
    1.67,
    0.01,
    0.01,
]
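def weighted_accuracy(y_true, y_pred):
    # per-class recall, ordered by sorted class label
    recall_array = recall_score(y_true, y_pred, average=None)
    weights_total = 0
    result = 0
    # weighted mean of the recalls, using the letter frequencies as weights
    for recall, weight in zip(recall_array, LETTERS_FREQUENCY):
        weights_total += weight
        result += recall * weight
    return result / weights_total

# scorer object that can be passed to GridSearchCV
weighted_accuracy_score = make_scorer(weighted_accuracy)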
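
To make the metric concrete, the sketch below recomputes it by hand with only the standard library: per-class recall first, then a weighted mean of those recalls. The toy labels, the three weights, and the helper names `per_class_recall` and `weighted_mean` are hypothetical and chosen for illustration. The real function relies on `recall_score(..., average=None)` and on all 21 letters being present, since the weights are matched to classes by position.

```python
from collections import defaultdict


def per_class_recall(y_true, y_pred):
    """Recall for each class found in y_true, ordered by sorted label."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for true_label, predicted_label in zip(y_true, y_pred):
        totals[true_label] += 1
        if true_label == predicted_label:
            hits[true_label] += 1
    return [hits[label] / totals[label] for label in sorted(totals)]


def weighted_mean(values, weights):
    """Weighted mean of values, with weights matched by position."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# hypothetical toy data: three classes and made-up frequency weights
y_true = ["A", "A", "B", "B", "C", "C"]
y_pred = ["A", "A", "B", "C", "C", "A"]
toy_weights = [14.63, 1.04, 3.88]

recalls = per_class_recall(y_true, y_pred)  # one recall per class: A, B, C
score = weighted_mean(recalls, toy_weights)
print(recalls, score)
```

Because class A carries most of the weight here, its perfect recall dominates the score, which is the intended effect of weighting by letter frequency.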

## Choosing hyperparameters and transformations

In [4]:
# Minimum transformation
minimum_X = [minimum(observation) for observation in data["features"]]

# Geometric transformation
geometric_X = [geometric(observation) for observation in data["features"]]

# Minimum 2D transformation
minimum2D_X = [minimum2D(observation) for observation in data["features"]]

# Geometric 2D transformation
geometric2D_X = [geometric2D(observation) for observation in data["features"]]

In [5]:
# hyperparameters for first Grid Search
param_grid = {
    "C": [1, 10, 20],
    "kernel": ["poly", "rbf"],
    "gamma": ["scale", 0.1, 5]
}
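
As a rough sense of the search cost, the grid above can be expanded by hand. This is a minimal sketch using only `itertools`. It assumes 5-fold cross-validation, as in the cells that follow, and the names `combos` and `n_folds` are illustrative.

```python
from itertools import product

param_grid = {
    "C": [1, 10, 20],
    "kernel": ["poly", "rbf"],
    "gamma": ["scale", 0.1, 5],
}

# every combination of one value per hyperparameter
keys = sorted(param_grid)
combos = [
    dict(zip(keys, values))
    for values in product(*(param_grid[key] for key in keys))
]

n_folds = 5  # assumed to match cv=5 in the grid search
print(len(combos), "candidates,", len(combos) * n_folds, "cross-validation fits")
```

Each transformation therefore costs 18 candidates times 5 folds, plus one final refit of the best candidate when `refit` is left at its default.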

## First Grid Search

In [6]:
# Minimum transformation
svm = SVC()
grid_search_minimum = GridSearchCV(svm, param_grid, cv=5, scoring=weighted_accuracy_score, return_train_score=True, n_jobs=-1)

grid_search_minimum.fit(minimum_X, data["labels"])

In [7]:
cvres = grid_search_minimum.cv_results_ 
results = dict(zip(cvres["mean_test_score"], cvres["params"]))
scores = sorted(cvres["mean_test_score"], reverse=True)
for mean_score in scores:
    print(mean_score, results[mean_score])

0.971515288978226 {'C': 20, 'gamma': 5, 'kernel': 'rbf'}
0.9704526974525187 {'C': 1, 'gamma': 5, 'kernel': 'poly'}
0.9703484393052498 {'C': 20, 'gamma': 5, 'kernel': 'poly'}
0.9688467238090357 {'C': 10, 'gamma': 5, 'kernel': 'poly'}
0.9682809486471115 {'C': 10, 'gamma': 5, 'kernel': 'rbf'}
0.9674574971991469 {'C': 20, 'gamma': 'scale', 'kernel': 'rbf'}
0.9592333312490492 {'C': 20, 'gamma': 'scale', 'kernel': 'poly'}
0.9590039639024027 {'C': 10, 'gamma': 'scale', 'kernel': 'rbf'}
0.9518784535342245 {'C': 10, 'gamma': 'scale', 'kernel': 'poly'}
0.9356183254911506 {'C': 1, 'gamma': 5, 'kernel': 'rbf'}
0.9001516993348521 {'C': 20, 'gamma': 0.1, 'kernel': 'rbf'}
0.8770635394367041 {'C': 1, 'gamma': 'scale', 'kernel': 'poly'}
0.86689549199641 {'C': 10, 'gamma': 0.1, 'kernel': 'rbf'}
0.8667936147095936 {'C': 1, 'gamma': 'scale', 'kernel': 'rbf'}
0.6688023091459663 {'C': 1, 'gamma': 0.1, 'kernel': 'rbf'}
0.4834250869458242 {'C': 20, 'gamma': 0.1, 'kernel': 'poly'}
0.3916160561132518 {'C': 10, 

In [8]:
# Geometric transformation
svm = SVC()
grid_search_geometric = GridSearchCV(svm, param_grid, cv=5, scoring=weighted_accuracy_score, return_train_score=True, n_jobs=-1)

grid_search_geometric.fit(geometric_X, data["labels"])

In [9]:
cvres = grid_search_geometric.cv_results_ 
results = dict(zip(cvres["mean_test_score"], cvres["params"]))
scores = sorted(cvres["mean_test_score"], reverse=True)
for mean_score in scores:
    print(mean_score, results[mean_score])

0.9693318827198724 {'C': 20, 'gamma': 5, 'kernel': 'rbf'}
0.9684256265743064 {'C': 10, 'gamma': 5, 'kernel': 'rbf'}
0.9670249937252402 {'C': 1, 'gamma': 5, 'kernel': 'poly'}
0.9644681233181929 {'C': 20, 'gamma': 'scale', 'kernel': 'rbf'}
0.9639165679896575 {'C': 10, 'gamma': 5, 'kernel': 'poly'}
0.9610468633100003 {'C': 20, 'gamma': 5, 'kernel': 'poly'}
0.9526475205003407 {'C': 10, 'gamma': 'scale', 'kernel': 'rbf'}
0.9451158302184629 {'C': 20, 'gamma': 'scale', 'kernel': 'poly'}
0.9364214871487402 {'C': 1, 'gamma': 5, 'kernel': 'rbf'}
0.9285470662312729 {'C': 10, 'gamma': 'scale', 'kernel': 'poly'}
0.8689717922506117 {'C': 20, 'gamma': 0.1, 'kernel': 'rbf'}
0.8279951974869274 {'C': 1, 'gamma': 'scale', 'kernel': 'rbf'}
0.8251180088707327 {'C': 10, 'gamma': 0.1, 'kernel': 'rbf'}
0.772358864573561 {'C': 1, 'gamma': 'scale', 'kernel': 'poly'}
0.5935211167596406 {'C': 1, 'gamma': 0.1, 'kernel': 'rbf'}
0.4072459352727276 {'C': 20, 'gamma': 0.1, 'kernel': 'poly'}
0.3098374581620392 {'C': 10

In [16]:
# Minimum2D transformation
svm = SVC()
grid_search_minimum2D = GridSearchCV(svm, param_grid, cv=5, scoring=weighted_accuracy_score, return_train_score=True, n_jobs=-1)

grid_search_minimum2D.fit(minimum2D_X, data["labels"])

In [17]:
cvres = grid_search_minimum2D.cv_results_ 
results = dict(zip(cvres["mean_test_score"], cvres["params"]))
scores = sorted(cvres["mean_test_score"], reverse=True)
for mean_score in scores:
    print(mean_score, results[mean_score])

0.9715409641251099 {'C': 20, 'gamma': 5, 'kernel': 'rbf'}
0.9691366463893164 {'C': 10, 'gamma': 5, 'kernel': 'rbf'}
0.968384414629047 {'C': 1, 'gamma': 5, 'kernel': 'poly'}
0.9678931542134483 {'C': 20, 'gamma': 'scale', 'kernel': 'rbf'}
0.9664029875743518 {'C': 20, 'gamma': 5, 'kernel': 'poly'}
0.9660823314122352 {'C': 10, 'gamma': 5, 'kernel': 'poly'}
0.9615250503686182 {'C': 20, 'gamma': 'scale', 'kernel': 'poly'}
0.9559210805931058 {'C': 10, 'gamma': 'scale', 'kernel': 'rbf'}
0.9518970962764053 {'C': 10, 'gamma': 'scale', 'kernel': 'poly'}
0.9304651427063458 {'C': 1, 'gamma': 5, 'kernel': 'rbf'}
0.8925732065532014 {'C': 20, 'gamma': 0.1, 'kernel': 'rbf'}
0.8836840980670508 {'C': 1, 'gamma': 'scale', 'kernel': 'poly'}
0.8702460597970904 {'C': 1, 'gamma': 'scale', 'kernel': 'rbf'}
0.8514182778155199 {'C': 10, 'gamma': 0.1, 'kernel': 'rbf'}
0.6290361828414912 {'C': 1, 'gamma': 0.1, 'kernel': 'rbf'}
0.46929625146615095 {'C': 20, 'gamma': 0.1, 'kernel': 'poly'}
0.3811243932745022 {'C': 1

In [11]:
# Geometric2D transformation
svm = SVC()
grid_search_geometric2D = GridSearchCV(svm, param_grid, cv=5, scoring=weighted_accuracy_score, return_train_score=True, n_jobs=-1)

grid_search_geometric2D.fit(geometric2D_X, data["labels"])

In [12]:
cvres = grid_search_geometric2D.cv_results_ 
results = dict(zip(cvres["mean_test_score"], cvres["params"]))
scores = sorted(cvres["mean_test_score"], reverse=True)
for mean_score in scores:
    print(mean_score, results[mean_score])

0.9700429965939914 {'C': 20, 'gamma': 5, 'kernel': 'rbf'}
0.9683711625700694 {'C': 10, 'gamma': 5, 'kernel': 'rbf'}
0.9653497603425798 {'C': 1, 'gamma': 5, 'kernel': 'poly'}
0.963743866736972 {'C': 20, 'gamma': 'scale', 'kernel': 'rbf'}
0.9634501562929383 {'C': 10, 'gamma': 5, 'kernel': 'poly'}
0.9626830907891886 {'C': 20, 'gamma': 5, 'kernel': 'poly'}
0.9487641249280229 {'C': 10, 'gamma': 'scale', 'kernel': 'rbf'}
0.9395911038503113 {'C': 20, 'gamma': 'scale', 'kernel': 'poly'}
0.9365061731659807 {'C': 1, 'gamma': 5, 'kernel': 'rbf'}
0.9219483363591532 {'C': 10, 'gamma': 'scale', 'kernel': 'poly'}
0.8540261277440571 {'C': 20, 'gamma': 0.1, 'kernel': 'rbf'}
0.8209098545164704 {'C': 1, 'gamma': 'scale', 'kernel': 'rbf'}
0.7964820251651245 {'C': 10, 'gamma': 0.1, 'kernel': 'rbf'}
0.7507057401676055 {'C': 1, 'gamma': 'scale', 'kernel': 'poly'}
0.5347261931610161 {'C': 1, 'gamma': 0.1, 'kernel': 'rbf'}
0.3829733655155009 {'C': 20, 'gamma': 0.1, 'kernel': 'poly'}
0.28819588952072067 {'C': 1

All four transformations gave similar results, so we will keep all of them.
<br><br>
In every case the best model used the rbf kernel with the highest values of C and gamma in the grid, so it is worth trying higher values.

## Second Grid Search

In [22]:
# hyperparameters for second Grid Search
param_grid = {
    "C": [20, 40, 60],
    "kernel": ["rbf"],
    "gamma": [5, 10, 20]
}

In [23]:
# Minimum transformation
svm = SVC()
grid_search_minimum = GridSearchCV(svm, param_grid, cv=5, scoring=weighted_accuracy_score, return_train_score=True, n_jobs=-1)

grid_search_minimum.fit(minimum_X, data["labels"])

In [24]:
cvres = grid_search_minimum.cv_results_ 
results = dict(zip(cvres["mean_test_score"], cvres["params"]))
scores = sorted(cvres["mean_test_score"], reverse=True)
for mean_score in scores:
    print(mean_score, results[mean_score])

0.973303028842485 {'C': 40, 'gamma': 5, 'kernel': 'rbf'}
0.9725262706798803 {'C': 60, 'gamma': 5, 'kernel': 'rbf'}
0.971515288978226 {'C': 20, 'gamma': 5, 'kernel': 'rbf'}
0.9688361267806126 {'C': 60, 'gamma': 10, 'kernel': 'rbf'}
0.9688361267806126 {'C': 60, 'gamma': 10, 'kernel': 'rbf'}
0.9682412067663378 {'C': 20, 'gamma': 10, 'kernel': 'rbf'}
0.9539643732825599 {'C': 60, 'gamma': 20, 'kernel': 'rbf'}
0.9539643732825599 {'C': 60, 'gamma': 20, 'kernel': 'rbf'}
0.9539643732825599 {'C': 60, 'gamma': 20, 'kernel': 'rbf'}
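
The repeated lines in the output above (the same parameters printed two or three times) most likely come from the printing code rather than from the search itself: the scores are used as dictionary keys, so candidates with exactly the same mean score collapse into one entry and the last one wins. The sketch below reproduces the effect and shows one possible alternative. The `cvres` dictionary is a hypothetical stand-in for `cv_results_` with made-up scores.

```python
# hypothetical stand-in for GridSearchCV.cv_results_ (scores are made up)
cvres = {
    "mean_test_score": [0.95, 0.97, 0.95],
    "params": [{"C": 20}, {"C": 40}, {"C": 60}],
}

# approach used in this notebook: tied scores share one dictionary key
by_score = dict(zip(cvres["mean_test_score"], cvres["params"]))
collapsed = [
    (score, by_score[score])
    for score in sorted(cvres["mean_test_score"], reverse=True)
]

# alternative: sort the (score, params) pairs directly, so ties keep their own params
ranked = sorted(
    zip(cvres["mean_test_score"], cvres["params"]),
    key=lambda pair: pair[0],
    reverse=True,
)

for score, params in ranked:
    print(score, params)
```

With the pair-based version, tied candidates are listed separately, so the table shows which values of C actually share a score.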


In [25]:
# Geometric transformation
svm = SVC()
grid_search_geometric = GridSearchCV(svm, param_grid, cv=5, scoring=weighted_accuracy_score, return_train_score=True, n_jobs=-1)

grid_search_geometric.fit(geometric_X, data["labels"])

In [26]:
cvres = grid_search_geometric.cv_results_ 
results = dict(zip(cvres["mean_test_score"], cvres["params"]))
scores = sorted(cvres["mean_test_score"], reverse=True)
for mean_score in scores:
    print(mean_score, results[mean_score])

0.9696044956061387 {'C': 60, 'gamma': 5, 'kernel': 'rbf'}
0.9693418796876088 {'C': 40, 'gamma': 5, 'kernel': 'rbf'}
0.9693318827198724 {'C': 20, 'gamma': 5, 'kernel': 'rbf'}
0.9634781865603141 {'C': 60, 'gamma': 10, 'kernel': 'rbf'}
0.9634781865603141 {'C': 60, 'gamma': 10, 'kernel': 'rbf'}
0.9627834051208465 {'C': 20, 'gamma': 10, 'kernel': 'rbf'}
0.9503716804278373 {'C': 60, 'gamma': 20, 'kernel': 'rbf'}
0.9503716804278373 {'C': 60, 'gamma': 20, 'kernel': 'rbf'}
0.9503716804278373 {'C': 60, 'gamma': 20, 'kernel': 'rbf'}


In [27]:
# Minimum2D transformation
svm = SVC()
grid_search_minimum2D = GridSearchCV(svm, param_grid, cv=5, scoring=weighted_accuracy_score, return_train_score=True, n_jobs=-1)

grid_search_minimum2D.fit(minimum2D_X, data["labels"])

In [28]:
cvres = grid_search_minimum2D.cv_results_ 
results = dict(zip(cvres["mean_test_score"], cvres["params"]))
scores = sorted(cvres["mean_test_score"], reverse=True)
for mean_score in scores:
    print(mean_score, results[mean_score])

0.9720178115717 {'C': 40, 'gamma': 5, 'kernel': 'rbf'}
0.9716078180199208 {'C': 60, 'gamma': 5, 'kernel': 'rbf'}
0.9715409641251099 {'C': 20, 'gamma': 5, 'kernel': 'rbf'}
0.9687927481767369 {'C': 60, 'gamma': 10, 'kernel': 'rbf'}
0.9684369009176639 {'C': 40, 'gamma': 10, 'kernel': 'rbf'}
0.9681563586801323 {'C': 20, 'gamma': 10, 'kernel': 'rbf'}
0.959536655619962 {'C': 20, 'gamma': 20, 'kernel': 'rbf'}
0.959448213596036 {'C': 60, 'gamma': 20, 'kernel': 'rbf'}
0.959448213596036 {'C': 60, 'gamma': 20, 'kernel': 'rbf'}


In [29]:
# Geometric2D transformation
svm = SVC()
grid_search_geometric2D = GridSearchCV(svm, param_grid, cv=5, scoring=weighted_accuracy_score, return_train_score=True, n_jobs=-1)

grid_search_geometric2D.fit(geometric2D_X, data["labels"])

In [30]:
cvres = grid_search_geometric2D.cv_results_ 
results = dict(zip(cvres["mean_test_score"], cvres["params"]))
scores = sorted(cvres["mean_test_score"], reverse=True)
for mean_score in scores:
    print(mean_score, results[mean_score])

0.9700429965939914 {'C': 20, 'gamma': 5, 'kernel': 'rbf'}
0.970024842223749 {'C': 40, 'gamma': 5, 'kernel': 'rbf'}
0.969252608761343 {'C': 60, 'gamma': 5, 'kernel': 'rbf'}
0.9686671437322671 {'C': 20, 'gamma': 10, 'kernel': 'rbf'}
0.9672389194323248 {'C': 40, 'gamma': 10, 'kernel': 'rbf'}
0.9662086267461181 {'C': 60, 'gamma': 10, 'kernel': 'rbf'}
0.9600218944999426 {'C': 60, 'gamma': 20, 'kernel': 'rbf'}
0.9600218944999426 {'C': 60, 'gamma': 20, 'kernel': 'rbf'}
0.9598063752521652 {'C': 20, 'gamma': 20, 'kernel': 'rbf'}


The best models used the Minimum transformation, in both its 3D and 2D variants. We will carry both into the next analysis.

## Analysing Time Performance

In [32]:
# Minimum 3D
# average time per prediction
from time import time
best_svm = grid_search_minimum.best_estimator_
best_svm.fit(minimum_X, data["labels"])

start = time()
best_svm.predict(minimum_X)
end = time()
print((end - start) / len(minimum_X))

0.0003291403425150904


In [33]:
# Minimum 2D
# average time per prediction
from time import time
best_svm = grid_search_minimum2D.best_estimator_
best_svm.fit(minimum2D_X, data["labels"])

start = time()
best_svm.predict(minimum2D_X)
end = time()
print((end - start) / len(minimum2D_X))

0.0004945628601929237
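
A single timed call can be noisy, so one option is to repeat the measurement and keep the best run. The sketch below is only an illustration of that idea: `average_seconds_per_item` and `dummy_predict` are hypothetical names, and `dummy_predict` stands in for `best_svm.predict` so the block runs without the trained model.

```python
from time import perf_counter


def average_seconds_per_item(predict, X, repeats=5):
    """Best wall-clock time of predict(X) over several runs, divided by len(X)."""
    best = float("inf")
    for _ in range(repeats):
        start = perf_counter()
        predict(X)
        best = min(best, perf_counter() - start)
    return best / len(X)


# hypothetical stand-in for best_svm.predict
def dummy_predict(X):
    return [sum(row) > 0 for row in X]


X = [[0.1, -0.2, 0.3]] * 1000
seconds_per_item = average_seconds_per_item(dummy_predict, X)
print(seconds_per_item)
```

Note that this, like the cells above, measures batch prediction divided by the number of samples. The latency of predicting one sample at a time may differ.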


## Conclusion

The best SVM model uses the Minimum transformation (3D or 2D), with kernel = "rbf", C = 40 and gamma = 5, reaching a cross-validated weighted accuracy of 97.33% and 97.20%, respectively.
<br><br>
The average time per prediction is 0.00033 seconds for 3D and 0.00049 seconds for 2D.