___
<h1> Machine Learning </h1>
<h2> M. Sc. in Electrical and Computer Engineering </h2>
<h3> Instituto Superior de Engenharia / Universidade do Algarve </h3>

[MEEC](https://ise.ualg.pt/en/curso/1477) / [ISE](https://ise.ualg.pt) / [UAlg](https://www.ualg.pt)

Pedro J. S. Cardoso (pcardoso@ualg.pt)
___

# SVM on the normalized/extended Boston dataset

`sklearn.preprocessing.MinMaxScaler` - Transform features by scaling each feature to a given range.
This estimator scales and translates each feature individually such that it is in the given range on the training set, e.g. between zero and one.
https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html
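
As a minimal sketch of what the scaler does, the cell below applies `MinMaxScaler` to a small toy array. The array `X_toy` and its values are made up for illustration and are not part of the Boston data.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Toy data (hypothetical values): two features on very different scales.
X_toy = np.array([[1.0, 100.0],
                  [2.0, 300.0],
                  [3.0, 500.0]])

scaler = MinMaxScaler()              # default feature_range is (0, 1)
X_scaled = scaler.fit_transform(X_toy)

# Each column is mapped with (x - min) / (max - min),
# where min and max are learned from the data passed to fit.
print(X_scaled)
print(scaler.data_min_, scaler.data_max_)
```

After scaling, both columns span the same range, so neither dominates the other just because of its units.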


`sklearn.preprocessing.PolynomialFeatures` - Generate polynomial and interaction features. Generate a new feature matrix consisting of all polynomial combinations of the features with degree less than or equal to the specified degree. For example, if an input sample is two dimensional and of the form [a, b], the degree-2 polynomial features are [1, a, b, a^2, ab, b^2]. https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.PolynomialFeatures.html


In [None]:
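# Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,
# so this import requires scikit-learn < 1.2.
from sklearn.datasets import load_boston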
from sklearn.preprocessing import MinMaxScaler, PolynomialFeatures
from sklearn.svm import LinearSVR
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

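def load_extended_boston():
    """Return the original features, the scaled and expanded features, and the target."""
    boston = load_boston()

    # Scale every feature to [0, 1]. The scaler is fitted on the full data set,
    # i.e. before any train/test split.
    X_extended = MinMaxScaler().fit_transform(boston.data)
    # Add all polynomial and interaction terms up to degree 2 (plus the bias column).
    X_extended = PolynomialFeatures(degree=2, include_bias=True).fit_transform(X_extended)

    return boston.data, X_extended, boston.target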

X, X_extended, y = load_extended_boston()

In [None]:
print(f'Number of features of the extended data: {X_extended.shape[1]}')

In [None]:
print(f'Number of features of the original data: {X.shape[1]}')

Using the original Boston data

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    shuffle=True,
                                                    random_state=1)

svm = LinearSVR(C=100,
                max_iter=10000,
                random_state=1
               ).fit(X_train, y_train)

# For regressors, score returns the coefficient of determination (R^2) on the given data.
score = svm.score(X_test, y_test)
score

Using the normalized and extended Boston data set

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X_extended, y,
                                                    shuffle=True,
                                                    random_state=1)

svm = LinearSVR(C=100,
                max_iter=10000,
                random_state=1
               ).fit(X_train, y_train)

# For regressors, score returns the coefficient of determination (R^2) on the given data.
score = svm.score(X_test, y_test)
score
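
In the cells above the scaler and the polynomial expansion are applied to the whole data set before it is split. One possible alternative, sketched below, chains the three steps in a pipeline so that the scaler only sees the training rows. Because `load_boston` is not available in recent scikit-learn releases, the sketch uses synthetic data from `make_regression`; the data and the names `X_syn`, `y_syn`, `model` and `r2` are assumptions for illustration, so the resulting score says nothing about the Boston results.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, PolynomialFeatures
from sklearn.svm import LinearSVR

# Synthetic stand-in data (assumption: not the Boston data set).
X_syn, y_syn = make_regression(n_samples=200, n_features=5,
                               noise=10.0, random_state=1)

X_tr, X_te, y_tr, y_te = train_test_split(X_syn, y_syn,
                                          shuffle=True,
                                          random_state=1)

# Scaling and expansion are fitted on the training split only,
# then applied unchanged to the test split when scoring.
model = make_pipeline(
    MinMaxScaler(),
    PolynomialFeatures(degree=2, include_bias=True),
    LinearSVR(C=100, max_iter=10000, random_state=1),
)
model.fit(X_tr, y_tr)

r2 = model.score(X_te, y_te)  # R^2 on the held-out split
print(f'R^2 on the held-out split: {r2:.3f}')
```

Keeping the preprocessing inside the pipeline also means the same object can be passed to cross-validation utilities without refitting the scaler by hand.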

In [None]:
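import numpy as np

y_pred = svm.predict(X_test)

plt.figure(figsize=(15,10))

# Test targets, predictions, and the absolute error per test sample.
plt.plot(y_test, c='b')
plt.plot(y_pred, c='g')
plt.plot(np.abs(y_pred-y_test), c='r')

# Raw string so the LaTeX backslashes are not treated as escape sequences.
plt.legend(["test", "pred", r"$\Delta = |y_i-\hat{y_i}|$"])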
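
The red curve shows the absolute error for each test sample. A single number can complement the plot; the sketch below computes the mean and the maximum absolute error on small made-up arrays (`y_true_toy` and `y_pred_toy` are hypothetical values, not results from the model above). The same calls would apply to `y_test` and `y_pred`.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error

# Hypothetical targets and predictions, for illustration only.
y_true_toy = np.array([20.0, 25.0, 30.0])
y_pred_toy = np.array([22.0, 24.0, 27.0])

abs_err = np.abs(y_pred_toy - y_true_toy)           # per-sample |y_i - y_hat_i|
mae = mean_absolute_error(y_true_toy, y_pred_toy)   # mean of abs_err

print(f'absolute errors: {abs_err}')
print(f'mean absolute error: {mae:.2f}, max absolute error: {abs_err.max():.2f}')
```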