# Description

This notebook is part of an assignment for the Computational Intelligence course, taken in the 7th semester at the Federal University of Pará.

Professor: Aldebaro Klautau

Authors:

- Bruno Martins
- Claudio Matheus

## Packages

In [None]:
import os
import matplotlib.pyplot as plt
import matplotlib as mpl
import pandas as pd
import numpy as np

## Ensure the libraries have the same versions as used throughout the code

In [None]:
assert mpl.__version__ == '3.5.3'
assert pd.__version__ == '1.3.5'
assert np.__version__ == '1.19.5'

## Read dataset

In [None]:
DATASET_ROOT_PATH = './datasets/svm_homework/'

VALIDATION_PATH = os.path.join(DATASET_ROOT_PATH, 'dataset_validation.txt')
TRAIN_PATH = os.path.join(DATASET_ROOT_PATH, 'dataset_train.txt')
TEST_PATH = os.path.join(DATASET_ROOT_PATH, 'dataset_test.txt')

In [None]:
validation = pd.read_csv(VALIDATION_PATH, header=None)
train = pd.read_csv(TRAIN_PATH, header=None)
test = pd.read_csv(TEST_PATH, header=None)

## Separate features from labels

In [None]:
X_train, X_test, X_val = train.iloc[:, :-1], test.iloc[:, :-1], validation.iloc[:, :-1]
y_train, y_test, y_val = train.iloc[:, -1], test.iloc[:, -1], validation.iloc[:, -1]

## First Question

In [None]:
from sklearn.svm import SVC, LinearSVC

In [None]:
def train_svm_classifiers(X: pd.DataFrame, y: pd.Series) -> list:
    """
        Trains 4 SVM classifiers (one LinearSVC and three SVC with default
        parameters) and returns them in a list
    """
    models = []
    model_1 = LinearSVC()
    model_1.fit(X,y)
    models.append(model_1)
    model_2 = SVC()
    model_2.fit(X,y)
    models.append(model_2)
    model_3 = SVC()
    model_3.fit(X,y)
    models.append(model_3)
    model_4 = SVC()
    model_4.fit(X,y)
    models.append(model_4)
    return models

In [None]:
models = train_svm_classifiers(X_train, y_train)



In [None]:
fig, ax = plt.subplots(nrows=2, ncols=2)
ax[0,0].plot()
ax[0,0].set_xlabel()
ax[0,1].plot()
ax[0,1].set_xlabel()
ax[1,0].plot()
ax[1,0].set_xlabel()
ax[1,1].plot()
ax[1,1].set_xlabel()

## Second Question

In scikit-learn, the hyperparameter C indirectly controls the number of support vectors of a model. C sets the penalty applied to margin violations: if C is high, the number of support vectors tends to be smaller; if C is low, the number of support vectors tends to be larger.

![SVM_C_values](https://raw.githubusercontent.com/Euronym/computational_intelligence_2022/main/images/C_values_SVM.png)

In other words, the width of the margin between the convex hull of each class and the model's hyperplane is narrower for a high value of C and wider for a low value of C. This leads, respectively, to fewer and to more support vectors.
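The effect described above can be checked empirically. The sketch below is only an illustration under stated assumptions: it uses synthetic, overlapping two-class data (not the homework dataset), a linear kernel, and three arbitrarily chosen values of C. The names `sv_counts`, `X` and `y` are local to this example.

```python
import numpy as np
from sklearn.svm import SVC

# Synthetic, overlapping two-class data (an assumption for illustration only;
# this is not the homework dataset).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=[1.0, 1.0], scale=1.0, size=(100, 2)),
    rng.normal(loc=[-1.0, -1.0], scale=1.0, size=(100, 2)),
])
y = np.array([1] * 100 + [0] * 100)

# Fit a linear SVM for several values of C and count its support vectors.
sv_counts = {}
for C in (0.001, 0.1, 10.0):
    model = SVC(kernel="linear", C=C).fit(X, y)
    # n_support_ holds the number of support vectors per class.
    sv_counts[C] = int(model.n_support_.sum())

for C, count in sv_counts.items():
    print(f"C={C}: {count} support vectors")
```

On data like this, the smallest C should keep most training points as support vectors, while the largest C should keep far fewer.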

## Third Question

## Fourth Question

svm.n_support_= [1 2]

svm.support_vectors_= [[ 1. 4.] [-2.  3.] [-2. -5.]]

svm.dual_coef_= [[-0.5 -0.3 0.8]]

svm.intercept_= [-2]

### Item (A):
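Given the attributes above, each dual coefficient multiplies the dot product of $z$ with its corresponding support vector, so the decision function is:

$f(z) = -0.5 \times \langle z, [1, 4] \rangle - 0.3 \times \langle z, [-2, 3] \rangle + 0.8 \times \langle z, [-2, -5] \rangle - 2$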
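As a minimal sketch, the decision function can be evaluated with NumPy using the attribute values listed for this question. A linear kernel is assumed, and the function name `decision_function` and the sample point are illustrative choices, not part of the assignment.

```python
import numpy as np

# Attribute values copied from the listing at the start of this question.
support_vectors = np.array([[1.0, 4.0], [-2.0, 3.0], [-2.0, -5.0]])
dual_coef = np.array([-0.5, -0.3, 0.8])
intercept = -2.0


def decision_function(z):
    """Evaluate f(z) = sum_n dual_coef[n] * <z, x_n> + b (linear kernel assumed)."""
    z = np.asarray(z, dtype=float)
    kernel_values = support_vectors @ z  # one dot product <z, x_n> per support vector
    return float(dual_coef @ kernel_values + intercept)


# Hypothetical sample point, chosen only for illustration.
print(decision_function([1.0, 1.0]))
```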


### Item (B):
Writing an SVM as a perceptron is quite similar to the previous model, but there is an important difference, as shown next.

First, the weight $w$ in the perceptron definition ($f(z) = \langle z, w \rangle + b$) is given by:

$w = \displaystyle\sum_{n = 0}^{N-1}\lambda_{n} x_n$

where $\lambda_n$ is exposed in the scikit-learn API through the attribute *dual_coef_*.

Therefore, starting from the general definition of an SVM and using the linearity of the dot product, the SVM can be rewritten as a perceptron:

$f(z) = \displaystyle\sum_{n = 0}^{N-1}\lambda_{n}K(z, x_n) + b$

- Assuming a linear kernel:

$f(z) = \displaystyle\sum_{n = 0}^{N-1}\lambda_{n}\langle z, x_n \rangle + b$

- And finally, using the linearity of the dot product:

$f(z) = \displaystyle\sum_{n = 0}^{N-1}\langle z, \lambda_{n}x_n \rangle + b$

$f(z) = \left\langle z, \displaystyle\sum_{n = 0}^{N-1}\lambda_{n}x_n \right\rangle + b$

$f(z) = \langle z, w \rangle + b$

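Now, applying this definition to the SVM in this problem, where each support vector is scaled by its dual coefficient, gives:

$f(z) = \langle z, [-0.5, -2] \rangle + \langle z, [0.6, -0.9] \rangle + \langle z, [-1.6, -4] \rangle - 2$

Adding the three scaled vectors gives the single weight vector of the perceptron, which is the final result:

$f(z) = \langle z, [-1.5, -6.9] \rangle - 2$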
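The collapse into a single weight vector can be verified numerically. The sketch below assumes a linear kernel and uses the attribute values listed for this question; the helper names `dual_form` and `perceptron_form` are hypothetical.

```python
import numpy as np

# Attribute values copied from the listing at the start of this question.
support_vectors = np.array([[1.0, 4.0], [-2.0, 3.0], [-2.0, -5.0]])
dual_coef = np.array([-0.5, -0.3, 0.8])
b = -2.0

# w = sum_n lambda_n * x_n, computed as a vector-matrix product.
w = dual_coef @ support_vectors


def dual_form(z):
    """f(z) as a weighted sum of dot products with each support vector."""
    return float(dual_coef @ (support_vectors @ np.asarray(z, dtype=float)) + b)


def perceptron_form(z):
    """f(z) as a single dot product with the collapsed weight vector w."""
    return float(np.dot(np.asarray(z, dtype=float), w) + b)


print("w =", w)
# Both forms should agree (up to floating-point rounding) for any z.
z = [0.5, -1.0]
print(dual_form(z), perceptron_form(z))
```

The perceptron form needs only one dot product per prediction, regardless of how many support vectors the model has.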

### Item (C):
First, consider the indicator function $I(f(z))$, defined as:

$I(f(z))=\begin{cases}
    1, & f(z) > 0\\
    0, & \text{otherwise}.
  \end{cases}$

Evaluating $f(z)$ at $z = [0, 0]$ gives:

$f([0, 0]) = -0.5 \times \langle [0, 0], [1, 4] \rangle - 0.3 \times \langle [0, 0], [-2, 3] \rangle + 0.8 \times \langle [0, 0], [-2, -5] \rangle - 2$

The dot product of the zero vector with any other vector is 0, so all three dot products vanish and only the intercept remains:

$f([0, 0]) = 0 - 2$

$f([0, 0]) = -2$

Applying the indicator function to this result gives:

$I(f([0, 0])) = 0$
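The same evaluation can be reproduced in a few lines. This is a sketch assuming a linear kernel and the attribute values listed for this question; `indicator` and `classify` are illustrative names, and the second test point is a hypothetical example added only to show the other branch of the indicator.

```python
import numpy as np

# Attribute values copied from the listing at the start of this question.
support_vectors = np.array([[1.0, 4.0], [-2.0, 3.0], [-2.0, -5.0]])
dual_coef = np.array([-0.5, -0.3, 0.8])
b = -2.0


def indicator(value):
    """Return 1 when the decision value is strictly positive, 0 otherwise."""
    return 1 if value > 0 else 0


def classify(z):
    """Apply the indicator to f(z), computed with a linear kernel."""
    f_z = float(dual_coef @ (support_vectors @ np.asarray(z, dtype=float)) + b)
    return indicator(f_z)


print(classify([0.0, 0.0]))   # f = -2, so the indicator returns 0
print(classify([0.0, -1.0]))  # hypothetical point on the positive side of the hyperplane
```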

## Fifth Question

## Sixth Question

## Seventh Question