# Linear models: Support Vector Machines (SVM)

In this notebook we are going to explore linear models and Support Vector Machines (SVMs for short).

Let's first import the required packages.

In [None]:
# put your student ID number ("numero di matricola") here
numero_di_matricola = 2074282

from sklearn import datasets, preprocessing, linear_model, svm
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np

## SVM for linearly separable data

Let's start by creating a simple linearly separable dataset for binary classification, where the instance space is $\mathcal{X} =\mathbb{R}^2$ (so that we can visualize it). Just to make things easier, we are going to rescale it too.

In [None]:
X, y = datasets.make_blobs(n_samples = 500, centers = 2, n_features = 2, random_state=numero_di_matricola)

scaler = preprocessing.StandardScaler()
scaler.fit(X)
X = scaler.transform(X)
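
As a quick sanity check on the rescaling step, the sketch below confirms that $\texttt{StandardScaler}$ leaves each feature with zero mean and unit standard deviation. The seed $\texttt{random\_state=0}$ and the variable names are placeholders chosen for this illustration, not values taken from the notebook.

```python
import numpy as np
from sklearn import datasets, preprocessing

# Hypothetical seed, used only for this illustration
X_demo, y_demo = datasets.make_blobs(n_samples=500, centers=2, n_features=2, random_state=0)

# fit_transform is equivalent to calling fit(...) and then transform(...) on the same data
scaler_demo = preprocessing.StandardScaler()
X_scaled = scaler_demo.fit_transform(X_demo)

feature_means = X_scaled.mean(axis=0)
feature_stds = X_scaled.std(axis=0)
print("Means:", feature_means)
print("Standard deviations:", feature_stds)
```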

The following code plots the dataset; it will be useful in later parts too.

In [None]:
plt.title("Plot of dataset")
plt.scatter(X[:, 0], X[:, 1], c=y)

Now let's run the perceptron, using $\texttt{linear\_model.Perceptron(...)}$ from sklearn. We fix the maximum number of iterations to 100 so that it runs quickly, and set $\texttt{random\_state=10}$.

What do we expect in terms of training error? 

In [None]:
#Create a perceptron classifier
# TO DO: COMPLETE
model_perceptron_1 = linear_model.Perceptron(max_iter=100, random_state=10)

#Training the model
# TO DO: COMPLETE
model_perceptron_1.fit(X, y)

#Get the training error as 1 - score()
# TO DO: COMPLETE
training_error = 1-model_perceptron_1.score(X, y)

#Print the training error
# TO DO: COMPLETE
print("Training error: ", training_error)
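
To make the expectation concrete: on linearly separable data, and given enough iterations, the perceptron stops making mistakes on the training set, so the training error should be 0. A minimal sketch on six hand-picked points (a toy dataset assumed for this illustration, not the notebook's data):

```python
import numpy as np
from sklearn import linear_model

# Toy separable data (hypothetical): class 0 in the lower-left, class 1 in the upper-right
X_toy = np.array([[-2.0, -2.0], [-3.0, -1.0], [-2.0, -3.0],
                  [2.0, 2.0], [3.0, 1.0], [2.0, 3.0]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

toy_perceptron = linear_model.Perceptron(max_iter=100, random_state=10)
toy_perceptron.fit(X_toy, y_toy)

# Training error as 1 - accuracy on the training set
toy_training_error = 1 - toy_perceptron.score(X_toy, y_toy)
print("Training error:", toy_training_error)
```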

The following code plots the *decision boundary* of a model and the training set. It is useful for later parts too.

In [None]:
plt.scatter(X[:, 0], X[:, 1], c=y, s=30)
ax = plt.gca()
plt.title("Plot of perceptron decision boundary")
xlim = ax.get_xlim()
ylim = ax.get_ylim()
# create grid to evaluate model
xx = np.linspace(xlim[0], xlim[1], 30)
yy = np.linspace(ylim[0], ylim[1], 30)
YY, XX = np.meshgrid(yy, xx)
xy = np.vstack([XX.ravel(), YY.ravel()]).T
# use the decision_function call to obtain the boundary to be plotted.
# TO DO

Z = model_perceptron_1.decision_function(xy).reshape(XX.shape)

# plot decision boundary and margins
ax.contour(XX, YY, Z, colors='k', levels=[0], alpha=1,
linestyles=['-'])
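
For a linear model, $\texttt{decision\_function}$ returns the signed score $\langle w, x \rangle + b$, and the contour at level 0 drawn above is the set of points where this score vanishes. The sketch below, on assumed toy data, checks this against the fitted $\texttt{coef\_}$ and $\texttt{intercept\_}$ attributes.

```python
import numpy as np
from sklearn import linear_model

# Toy data (hypothetical), only to have a fitted linear model to inspect
X_toy = np.array([[-2.0, -2.0], [-3.0, -1.0], [-2.0, -3.0],
                  [2.0, 2.0], [3.0, 1.0], [2.0, 3.0]])
y_toy = np.array([0, 0, 0, 1, 1, 1])
toy_model = linear_model.Perceptron(max_iter=100, random_state=10).fit(X_toy, y_toy)

w = toy_model.coef_.ravel()   # weight vector, shape (2,)
b = toy_model.intercept_[0]   # bias term

manual_scores = X_toy @ w + b
library_scores = toy_model.decision_function(X_toy)
print("Manual scores: ", manual_scores)
print("Library scores:", library_scores)

# Points with a positive score are assigned to the second class (label 1 here)
manual_labels = (manual_scores > 0).astype(int)
print("Labels from the sign of the score:", manual_labels)
```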

If we change the value of $\texttt{random\_state}$ in the perceptron, the training points are shuffled in a different order, so the updates happen in a different sequence and the final model may differ.

Let's run the perceptron with $\texttt{random\_state}=12$. How will the solution compare to the above?

In [None]:
#Create a perceptron classifier
# TO DO: COMPLETE
model_perceptron_2 = linear_model.Perceptron(max_iter=100, random_state=12)

#Training the model
# TO DO: COMPLETE
model_perceptron_2.fit(X, y)

#Get the training error as 1 - score()
# TO DO: COMPLETE
training_error = 1-model_perceptron_2.score(X, y)

#Print the training error
# TO DO: COMPLETE
print("Training error: ", training_error)

What about the decision boundary? Let's plot it.

In [None]:
# TO DO: WRITE THE CODE TO PLOT THE DECISION BOUNDARY

plt.scatter(X[:, 0], X[:, 1], c=y, s=30)
ax = plt.gca()
plt.title("Plot of perceptron decision boundary")
xlim = ax.get_xlim()
ylim = ax.get_ylim()
# create grid to evaluate model
xx = np.linspace(xlim[0], xlim[1], 30)
yy = np.linspace(ylim[0], ylim[1], 30)
YY, XX = np.meshgrid(yy, xx)
xy = np.vstack([XX.ravel(), YY.ravel()]).T
# use the decision_function call to obtain the boundary to be plotted.
# TO DO

Z = model_perceptron_2.decision_function(xy).reshape(XX.shape)

# plot decision boundary and margins
ax.contour(XX, YY, Z, colors='k', levels=[0], alpha=1,
linestyles=['-'])
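
The plotting cell above repeats the earlier one almost line by line. One option is to wrap it in a small helper. The sketch below uses a hypothetical function name, $\texttt{plot\_decision\_boundary}$, and hypothetical toy data; it is not part of the original notebook.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn import linear_model

def plot_decision_boundary(model, X, y, title, levels=(0,), linestyles=("-",)):
    """Scatter the data and draw the requested level sets of the decision function."""
    fig, ax = plt.subplots()
    ax.scatter(X[:, 0], X[:, 1], c=y, s=30)
    ax.set_title(title)
    xlim, ylim = ax.get_xlim(), ax.get_ylim()
    # create a 30x30 grid covering the plotted area
    xx = np.linspace(xlim[0], xlim[1], 30)
    yy = np.linspace(ylim[0], ylim[1], 30)
    YY, XX = np.meshgrid(yy, xx)
    xy = np.vstack([XX.ravel(), YY.ravel()]).T
    # evaluate the model on the grid and draw the level sets
    Z = model.decision_function(xy).reshape(XX.shape)
    ax.contour(XX, YY, Z, colors="k", levels=list(levels), linestyles=list(linestyles))
    return ax, Z

# Toy data (hypothetical), only to demonstrate the call
X_toy = np.array([[-2.0, -2.0], [-3.0, -1.0], [-2.0, -3.0],
                  [2.0, 2.0], [3.0, 1.0], [2.0, 3.0]])
y_toy = np.array([0, 0, 0, 1, 1, 1])
toy_model = linear_model.Perceptron(max_iter=100, random_state=12).fit(X_toy, y_toy)

ax, Z = plot_decision_boundary(toy_model, X_toy, y_toy, "Toy perceptron decision boundary")
```

Passing $\texttt{levels=(-1, 0, 1)}$ and $\texttt{linestyles=("--", "-", "--")}$ would additionally draw the level sets at $\pm 1$, which for an SVM are the margins.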

Which model is better? 

Is either of these the *best* choice?

Now, let's run the hard-SVM on the same data. To obtain an (almost) hard-SVM in sklearn, we can use $\texttt{svm.SVC(...)}$ with a linear kernel and a very high value of the parameter $C$.

In [None]:
#Creating a SVM model
# TO DO: COMPLETE
model_svm = svm.SVC(kernel="linear" , C=100000000)

#Training the model
model_svm.fit(X, y)


#Get the training error as 1 - score()
training_error = 1 - model_svm.score(X , y)


#Print the training error
# TO DO: COMPLETE
print("Training error: ", training_error)
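
One way to compare separating hyperplanes is by their geometric margin, that is, the distance from the hyperplane to the closest training point, $\min_i |\langle w, x_i \rangle + b| / \|w\|$. Hard-SVM picks the separating hyperplane that maximizes this quantity. A minimal sketch on assumed toy data ($\texttt{geometric\_margin}$ is a hypothetical helper name):

```python
import numpy as np
from sklearn import linear_model, svm

def geometric_margin(model, X):
    """Distance from the model's hyperplane to the closest point in X."""
    w = model.coef_.ravel()
    return np.min(np.abs(model.decision_function(X))) / np.linalg.norm(w)

# Toy separable data (hypothetical)
X_toy = np.array([[-2.0, -2.0], [-3.0, -1.0], [-2.0, -3.0],
                  [2.0, 2.0], [3.0, 1.0], [2.0, 3.0]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

toy_perceptron = linear_model.Perceptron(max_iter=100, random_state=10).fit(X_toy, y_toy)
# a very large C approximates hard-SVM
toy_svm = svm.SVC(kernel="linear", C=1e8).fit(X_toy, y_toy)

margin_perceptron = geometric_margin(toy_perceptron, X_toy)
margin_svm = geometric_margin(toy_svm, X_toy)
print("Perceptron margin:", margin_perceptron)
print("SVM margin:       ", margin_svm)
```

Both models reach zero training error on this toy set, but only the SVM is guaranteed to place the boundary as far as possible from the closest points.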


Plot the SVM decision boundary.

In [None]:
# TO DO: WRITE THE CODE TO PLOT THE DECISION BOUNDARY

plt.scatter(X[:, 0], X[:, 1], c=y, s=30)
ax = plt.gca()
plt.title("Plot of SVM decision boundary")
xlim = ax.get_xlim()
ylim = ax.get_ylim()
# create grid to evaluate model
xx = np.linspace(xlim[0], xlim[1], 30)
yy = np.linspace(ylim[0], ylim[1], 30)
YY, XX = np.meshgrid(yy, xx)
xy = np.vstack([XX.ravel(), YY.ravel()]).T
# use the decision_function call to obtain the boundary to be plotted.
# TO DO

Z = model_svm.decision_function(xy).reshape(XX.shape)

# plot decision boundary and margins
ax.contour(XX, YY, Z, colors='k', levels=[0], alpha=1,
linestyles=['-'])

Let's see what the support vectors are. They are stored in the attribute $\texttt{support\_vectors\_}$.

In [None]:
#print the support vectors (attribute support_vectors_)
#TO DO: COMPLETE
print(model_svm.support_vectors_)

#print the dual coefficients of the support vectors (alpha_i * y_i)
print(model_svm.dual_coef_)
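
With a linear kernel, the weight vector can be recovered from the support vectors alone: $w = \sum_i (\alpha_i y_i)\, x_i$, where the sum runs over the support vectors and the products $\alpha_i y_i$ are what sklearn stores in $\texttt{dual\_coef\_}$. A sketch on assumed toy data:

```python
import numpy as np
from sklearn import svm

# Toy separable data (hypothetical)
X_toy = np.array([[-2.0, -2.0], [-3.0, -1.0], [-2.0, -3.0],
                  [2.0, 2.0], [3.0, 1.0], [2.0, 3.0]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

toy_svm = svm.SVC(kernel="linear", C=1e8).fit(X_toy, y_toy)

# dual_coef_ has shape (1, n_support_vectors);
# support_vectors_ has shape (n_support_vectors, n_features)
w_from_dual = toy_svm.dual_coef_ @ toy_svm.support_vectors_
print("w from dual coefficients:", w_from_dual)
print("coef_:                   ", toy_svm.coef_)
```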

Let's see what happens when we move one support vector. We first obtain the indices of the support vectors.

In [None]:
#print the indices of support vectors (attribute support_)
#TO DO: COMPLETE
print(model_svm.support_)
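
The indices in $\texttt{support\_}$ refer to rows of the training matrix, so indexing the data with them gives back $\texttt{support\_vectors\_}$. This also makes it possible to pick a support vector programmatically instead of copying an index by hand. A sketch on assumed toy data:

```python
import numpy as np
from sklearn import svm

# Toy separable data (hypothetical)
X_toy = np.array([[-2.0, -2.0], [-3.0, -1.0], [-2.0, -3.0],
                  [2.0, 2.0], [3.0, 1.0], [2.0, 3.0]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

toy_svm = svm.SVC(kernel="linear", C=1e8).fit(X_toy, y_toy)

sv_indices = toy_svm.support_
print("Support vector indices:", sv_indices)

# Rows of the training data selected by those indices
rows_from_indices = X_toy[sv_indices]

# Index and label of one support vector, here simply the first one reported
first_sv_index = sv_indices[0]
print("First support vector:", X_toy[first_sv_index], "label:", y_toy[first_sv_index])
```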

Now let's move one support vector closer to the points in the same class.

In [None]:
#let's copy the data and move one support vector close to the points in the same class
X1 = X.copy()

#321 is assumed to be one of the indices printed above; adapt it to your own output
X1[321 , 0] = -1

#let's plot the new dataset
#TO DO: COMPLETE
plt.title("Plot of dataset")
plt.scatter(X1[:, 0], X1[:, 1], c=y)

Let's run the SVM on the new data.

In [None]:
#Creating a SVM model
# TO DO: COMPLETE
model_svm_1 = svm.SVC(kernel="linear" , C=100000000)

#Training the model
model_svm_1.fit(X1, y)


#Get the training error as 1 - score()
training_error = 1 - model_svm_1.score(X1 , y)


#Print the training error
# TO DO: COMPLETE
print("Training error: ", training_error)


Plot the SVM decision boundary and the previous decision boundary.

In [None]:
# TO DO: WRITE THE CODE TO PLOT THE DECISION BOUNDARY

plt.scatter(X1[:, 0], X1[:, 1], c=y, s=30)
ax = plt.gca()
plt.title("Plot of SVM decision boundaries (black: new data, blue: original data)")
xlim = ax.get_xlim()
ylim = ax.get_ylim()
# create grid to evaluate model
xx = np.linspace(xlim[0], xlim[1], 30)
yy = np.linspace(ylim[0], ylim[1], 30)
YY, XX = np.meshgrid(yy, xx)
xy = np.vstack([XX.ravel(), YY.ravel()]).T
# use the decision_function call to obtain the boundary to be plotted.
# TO DO

Z1 = model_svm_1.decision_function(xy).reshape(XX.shape)

# plot decision boundary and margins
ax.contour(XX, YY, Z1, colors='k', levels=[0], alpha=1,
linestyles=['-'])
ax.contour(XX, YY, Z, colors='b', levels=[0], alpha=1,
linestyles=['-'])
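
The hard-SVM solution depends only on the support vectors: moving a point that is not a support vector (without letting it reach the margin) leaves the hyperplane unchanged, while moving a support vector generally changes it. A sketch on assumed toy data, in which the point $(2, 3)$ is not a support vector and the point $(2, 2)$ is ($\texttt{fit\_hard\_svm}$ is a hypothetical helper name):

```python
import numpy as np
from sklearn import svm

# Toy separable data (hypothetical)
X_toy = np.array([[-2.0, -2.0], [-3.0, -1.0], [-2.0, -3.0],
                  [2.0, 2.0], [3.0, 1.0], [2.0, 3.0]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

def fit_hard_svm(X, y):
    # a very large C is used as an approximation of hard-SVM
    return svm.SVC(kernel="linear", C=1e8).fit(X, y)

w_original = fit_hard_svm(X_toy, y_toy).coef_.ravel()

# Move the non-support point (row 5) further away from the boundary
X_far = X_toy.copy()
X_far[5] = [4.0, 5.0]
w_far = fit_hard_svm(X_far, y_toy).coef_.ravel()

# Move the support vector (row 3) deeper into its own class
X_sv = X_toy.copy()
X_sv[3] = [3.0, 3.0]
w_sv = fit_hard_svm(X_sv, y_toy).coef_.ravel()

print("w on the original data:        ", w_original)
print("w after moving a non-support:  ", w_far)
print("w after moving a support vector:", w_sv)
```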



Now let's move one support vector closer to the points in the other class.

In [None]:
#let's copy the original data and move one support vector close to the points in the other class
#TO DO: COMPLETE


#let's plot the new dataset
#TO DO: COMPLETE


Let's run the SVM on the new data.

In [None]:
#Creating a SVM model
# TO DO: COMPLETE


#Training the model
# TO DO: COMPLETE


#Get the training error as 1 - score()
# TO DO: COMPLETE


#Print the training error
# TO DO: COMPLETE


Let's plot the new decision boundary, and the old ones too.

In [None]:
# TO DO: COMPLETE
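
As a rough guide to what to expect, the sketch below repeats the experiment on assumed toy data rather than on the notebook's dataset: the support vector $(2, 2)$ is moved toward the other class, to $(1, 1)$. The data stay linearly separable, but the margin $1/\|w\|$ shrinks ($\texttt{hard\_svm\_margin}$ is a hypothetical helper name).

```python
import numpy as np
from sklearn import svm

# Toy separable data (hypothetical)
X_toy = np.array([[-2.0, -2.0], [-3.0, -1.0], [-2.0, -3.0],
                  [2.0, 2.0], [3.0, 1.0], [2.0, 3.0]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

def hard_svm_margin(X, y):
    """Margin 1/||w|| of an (almost) hard-SVM fitted on X, y."""
    model = svm.SVC(kernel="linear", C=1e8).fit(X, y)
    return 1.0 / np.linalg.norm(model.coef_)

margin_before = hard_svm_margin(X_toy, y_toy)

# Move the support vector in row 3 toward the other class
X_moved = X_toy.copy()
X_moved[3] = [1.0, 1.0]
margin_after = hard_svm_margin(X_moved, y_toy)

print("Margin before:", margin_before)
print("Margin after: ", margin_after)
```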



## SVM for non-linearly separable data

Let's make a dataset that is not linearly separable, and let's plot it.

In [None]:
# TO DO: COMPLETE

X_nls, y_nls = datasets.make_blobs(n_samples = 500, centers = 2, n_features = 2, random_state=numero_di_matricola)

scaler = preprocessing.StandardScaler()
scaler.fit(X_nls)
X_nls = scaler.transform(X_nls)

a = np.array([[0.3, 2]])
b = np.array([0])
X_nls = np.concatenate((X_nls, a))
y_nls = np.concatenate((y_nls, b))

a = np.array([[0.1, -1.5]])
b = np.array([0])
X_nls = np.concatenate((X_nls, a))
y_nls = np.concatenate((y_nls, b))

a = np.array([[0, 0.1]])
b = np.array([1])
X_nls = np.concatenate((X_nls, a))
y_nls = np.concatenate((y_nls, b))

plt.title("Plot of dataset")
plt.scatter(X_nls[:,0],X_nls[:,1], c = y_nls)


Let's try to learn a hard-SVM. This means setting the parameter $C$, which is approximately equal to $1/\lambda$ with $\lambda$ as in our slides, to a very large value.
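
The relation between $C$ and $\lambda$ can be made explicit under one assumption: that the slides write soft-SVM as the minimization of $\lambda\|w\|^2$ plus the average hinge loss over the $m$ training points (a common textbook formulation; the slides are not reproduced here, so this is a guess at their notation). sklearn's $\texttt{SVC}$ instead minimizes $\frac{1}{2}\|w\|^2$ plus $C$ times the sum of the slack variables. Multiplying the first objective by $1/(2\lambda)$ does not change its minimizer and gives:

```latex
\min_{w,b}\ \lambda\|w\|^2 + \frac{1}{m}\sum_{i=1}^{m}\xi_i
\quad\Longleftrightarrow\quad
\min_{w,b}\ \frac{1}{2}\|w\|^2 + \frac{1}{2\lambda m}\sum_{i=1}^{m}\xi_i,
\qquad
\xi_i = \max\{0,\ 1 - y_i(\langle w, x_i\rangle + b)\}
```

Under this assumption $C = 1/(2\lambda m)$, so $C$ is proportional to $1/\lambda$: a very large $C$ corresponds to $\lambda \to 0$, which penalizes margin violations so heavily that the solution approaches hard-SVM.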

In [None]:
#Creating a SVM model
# TO DO: COMPLETE
model_hard_svm = svm.SVC(kernel="linear" , C=100000000)

#Training the model
model_hard_svm.fit(X_nls, y_nls)


#Get the training error as 1 - score()
training_error = 1 - model_hard_svm.score(X_nls , y_nls)


#Print the training error
# TO DO: COMPLETE
print("Training error: ", training_error)


The following code plots the decision boundary, as well as the margin.

In [None]:
plt.scatter(X_nls[:, 0], X_nls[:, 1], c=y_nls, s=30)
ax = plt.gca()
plt.title("Plot of hard SVM decision boundary")
xlim = ax.get_xlim()
ylim = ax.get_ylim()
# create grid to evaluate model
xx = np.linspace(xlim[0], xlim[1], 30)
yy = np.linspace(ylim[0], ylim[1], 30)
YY, XX = np.meshgrid(yy, xx)
xy = np.vstack([XX.ravel(), YY.ravel()]).T

#TO DO
Z = model_hard_svm.decision_function(xy).reshape(XX.shape)

ax.contour(
    XX, YY, Z, colors="k", levels=[-1, 0, 1], alpha=0.5, linestyles=["--", "-", "--"]
)

Let's try a smaller value of $C$ ($10^4$), which corresponds to a larger value of $\lambda$.

What do you expect?

In [None]:
#Creating a SVM model
# TO DO: COMPLETE
model_hard_svm = svm.SVC(kernel="linear" , C=10000)

#Training the model
model_hard_svm.fit(X_nls, y_nls)


#Get the training error as 1 - score()
training_error = 1 - model_hard_svm.score(X_nls , y_nls)


#Print the training error
# TO DO: COMPLETE
print("Training error: ", training_error)


What about the decision boundary and the margin?

In [None]:
plt.scatter(X_nls[:, 0], X_nls[:, 1], c=y_nls, s=30)
ax = plt.gca()
plt.title("Plot of SVM decision boundary (C=10000)")
xlim = ax.get_xlim()
ylim = ax.get_ylim()
# create grid to evaluate model
xx = np.linspace(xlim[0], xlim[1], 30)
yy = np.linspace(ylim[0], ylim[1], 30)
YY, XX = np.meshgrid(yy, xx)
xy = np.vstack([XX.ravel(), YY.ravel()]).T

#TO DO
Z = model_hard_svm.decision_function(xy).reshape(XX.shape)

ax.contour(
    XX, YY, Z, colors="k", levels=[-1, 0, 1], alpha=0.5, linestyles=["--", "-", "--"]
)

Let's repeat everything for $C=100$.

In [None]:
#Creating a soft SVM model (C=100)
# TO DO: COMPLETE
model_hard_svm_3 = svm.SVC(kernel="linear",C=100)

#Training the model
# TO DO: COMPLETE
model_hard_svm_3.fit(X_nls, y_nls)

#Get the training error as 1 - score()
# TO DO: COMPLETE
training_error = 1- model_hard_svm_3.score(X_nls,y_nls)

#Print the training error
# TO DO: COMPLETE
print("Training error: ", training_error)

In [None]:
# TO DO: COMPLETE

plt.scatter(X_nls[:, 0], X_nls[:, 1], c=y_nls, s=30)
ax = plt.gca()
plt.title("Plot of SVM decision boundary (C=100)")
xlim = ax.get_xlim()
ylim = ax.get_ylim()
# create grid to evaluate model
xx = np.linspace(xlim[0], xlim[1], 30)
yy = np.linspace(ylim[0], ylim[1], 30)
YY, XX = np.meshgrid(yy, xx)
xy = np.vstack([XX.ravel(), YY.ravel()]).T
Z = model_hard_svm_3.decision_function(xy).reshape(XX.shape)
# plot decision boundary and margins
ax.contour(
    XX, YY, Z, colors="k", levels=[-1, 0, 1], alpha=0.5, linestyles=["--", "-", "--"]
)

And for C=1?

In [None]:
#Creating a soft SVM model (C=1)
# TO DO: COMPLETE
model_hard_svm_4 = svm.SVC(kernel="linear",C=1)

#Training the model
# TO DO: COMPLETE
model_hard_svm_4.fit(X_nls, y_nls)

#Get the training error as 1 - score()
# TO DO: COMPLETE
training_error = 1- model_hard_svm_4.score(X_nls,y_nls)

#Print the training error
# TO DO: COMPLETE
print("Training error: ", training_error)

In [None]:
# TO DO: COMPLETE

plt.scatter(X_nls[:, 0], X_nls[:, 1], c=y_nls, s=30)
ax = plt.gca()
plt.title("Plot of SVM decision boundary (C=1)")
xlim = ax.get_xlim()
ylim = ax.get_ylim()
# create grid to evaluate model
xx = np.linspace(xlim[0], xlim[1], 30)
yy = np.linspace(ylim[0], ylim[1], 30)
YY, XX = np.meshgrid(yy, xx)
xy = np.vstack([XX.ravel(), YY.ravel()]).T
Z = model_hard_svm_4.decision_function(xy).reshape(XX.shape)
# plot decision boundary and margins
ax.contour(
    XX, YY, Z, colors="k", levels=[-1, 0, 1], alpha=0.5, linestyles=["--", "-", "--"]
)
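
The trend across the values of $C$ can also be summarized numerically: as $C$ decreases, the margin $1/\|w\|$ widens, and typically more points end up on or inside it. The sketch below uses assumed toy data that are not linearly separable (each cluster contains one point of the other class); the values of $C$ are likewise chosen only for illustration.

```python
import numpy as np
from sklearn import svm

# Toy non-separable data (hypothetical): the last point of each class
# sits inside the triangle formed by three points of the other class
X_toy_nls = np.array([[-1.0, 0.0], [-2.0, 0.0], [-1.5, 1.0], [1.5, -0.5],
                      [1.0, 0.0], [2.0, 0.0], [1.5, -1.0], [-1.5, 0.5]])
y_toy_nls = np.array([0, 0, 0, 0, 1, 1, 1, 1])

margins = {}
for C in (0.01, 1.0, 100.0):
    model = svm.SVC(kernel="linear", C=C).fit(X_toy_nls, y_toy_nls)
    margins[C] = 1.0 / np.linalg.norm(model.coef_)
    training_error = 1 - model.score(X_toy_nls, y_toy_nls)
    print(f"C={C}: margin={margins[C]:.3f}, "
          f"support vectors={model.n_support_.sum()}, "
          f"training error={training_error:.3f}")
```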

Let's see what the support vectors are.

In [None]:
#TO DO COMPLETE:
print(model_hard_svm_4.support_vectors_)
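
In the soft-margin case, the support vectors are no longer only the points lying exactly on the margin: they also include points inside the margin and misclassified points. With labels mapped to $\{-1, +1\}$, support vectors satisfy $y_i(\langle w, x_i \rangle + b) \le 1$ and all other points satisfy $y_i(\langle w, x_i \rangle + b) \ge 1$. A sketch on assumed toy data that checks this, up to the solver's numerical tolerance:

```python
import numpy as np
from sklearn import svm

# Toy non-separable data (hypothetical)
X_toy_nls = np.array([[-1.0, 0.0], [-2.0, 0.0], [-1.5, 1.0], [1.5, -0.5],
                      [1.0, 0.0], [2.0, 0.0], [1.5, -1.0], [-1.5, 0.5]])
y_toy_nls = np.array([0, 0, 0, 0, 1, 1, 1, 1])

toy_soft_svm = svm.SVC(kernel="linear", C=1.0).fit(X_toy_nls, y_toy_nls)

# Map labels {0, 1} to {-1, +1} and compute y_i * (<w, x_i> + b)
y_signed = 2 * y_toy_nls - 1
functional_margins = y_signed * toy_soft_svm.decision_function(X_toy_nls)

# Boolean mask marking which training points are support vectors
is_support = np.zeros(len(X_toy_nls), dtype=bool)
is_support[toy_soft_svm.support_] = True

for point, value, flag in zip(X_toy_nls, functional_margins, is_support):
    print(point, f"y*f(x)={value:.3f}", "support vector" if flag else "")
```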


Just for comparison, let's run the perceptron on the same dataset with various values of $\texttt{random\_state}$.

In [None]:
#Create a perceptron classifier
# TO DO: COMPLETE
model_perceptron_nls = linear_model.Perceptron(max_iter=100, random_state = 0)

#Training the model
# TO DO: COMPLETE
model_perceptron_nls.fit(X_nls, y_nls)

#Get the training error as 1 - score()
# TO DO: COMPLETE
training_error = 1- model_perceptron_nls.score(X_nls,y_nls)

#Print the training error
# TO DO: COMPLETE
print("Training error: ", training_error)

Let's plot the decision boundary.

In [None]:
# TO DO: COMPLETE

plt.scatter(X_nls[:, 0], X_nls[:, 1], c=y_nls, s=30)
ax = plt.gca()
plt.title("Plot of perceptron decision boundary")
xlim = ax.get_xlim()
ylim = ax.get_ylim()
# create grid to evaluate model
xx = np.linspace(xlim[0], xlim[1], 30)
yy = np.linspace(ylim[0], ylim[1], 30)
YY, XX = np.meshgrid(yy, xx)
xy = np.vstack([XX.ravel(), YY.ravel()]).T
Z = model_perceptron_nls.decision_function(xy).reshape(XX.shape)
# plot decision boundary and margins
ax.contour(XX, YY, Z, colors='k', levels=[0], alpha=1,
linestyles=['-'])

In [None]:
#Create a perceptron classifier
# TO DO: COMPLETE
model_perceptron_nls = linear_model.Perceptron(max_iter=100, random_state = 10)

#Training the model
# TO DO: COMPLETE
model_perceptron_nls.fit(X_nls, y_nls)

#Get the training error as 1 - score()
# TO DO: COMPLETE
training_error = 1- model_perceptron_nls.score(X_nls,y_nls)

#Print the training error
# TO DO: COMPLETE
print("Training error: ", training_error)

plt.scatter(X_nls[:, 0], X_nls[:, 1], c=y_nls, s=30)
ax = plt.gca()
plt.title("Plot of perceptron decision boundary")
xlim = ax.get_xlim()
ylim = ax.get_ylim()
# create grid to evaluate model
xx = np.linspace(xlim[0], xlim[1], 30)
yy = np.linspace(ylim[0], ylim[1], 30)
YY, XX = np.meshgrid(yy, xx)
xy = np.vstack([XX.ravel(), YY.ravel()]).T
Z = model_perceptron_nls.decision_function(xy).reshape(XX.shape)
# plot decision boundary and margins
ax.contour(XX, YY, Z, colors='k', levels=[0], alpha=1,
linestyles=['-'])

In [None]:
#Create a perceptron classifier
# TO DO: COMPLETE
model_perceptron_nls = linear_model.Perceptron(max_iter=100, random_state = 24)

#Training the model
# TO DO: COMPLETE
model_perceptron_nls.fit(X_nls, y_nls)

#Get the training error as 1 - score()
# TO DO: COMPLETE
training_error = 1- model_perceptron_nls.score(X_nls,y_nls)

#Print the training error
# TO DO: COMPLETE
print("Training error: ", training_error)

plt.scatter(X_nls[:, 0], X_nls[:, 1], c=y_nls, s=30)
ax = plt.gca()
plt.title("Plot of perceptron decision boundary")
xlim = ax.get_xlim()
ylim = ax.get_ylim()
# create grid to evaluate model
xx = np.linspace(xlim[0], xlim[1], 30)
yy = np.linspace(ylim[0], ylim[1], 30)
YY, XX = np.meshgrid(yy, xx)
xy = np.vstack([XX.ravel(), YY.ravel()]).T
Z = model_perceptron_nls.decision_function(xy).reshape(XX.shape)
# plot decision boundary and margins
ax.contour(XX, YY, Z, colors='k', levels=[0], alpha=1,
linestyles=['-'])
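
The three perceptron runs above differ only in the value of $\texttt{random\_state}$, so the comparison can also be written as a loop. On data that are not linearly separable, no hyperplane classifies every point correctly, so the training error stays above 0 whatever the seed. A sketch on assumed toy data, in which every linear classifier misclassifies at least 2 of the 8 points:

```python
import numpy as np
from sklearn import linear_model

# Toy non-separable data (hypothetical): the last point of each class
# sits inside the triangle formed by three points of the other class
X_toy_nls = np.array([[-1.0, 0.0], [-2.0, 0.0], [-1.5, 1.0], [1.5, -0.5],
                      [1.0, 0.0], [2.0, 0.0], [1.5, -1.0], [-1.5, 0.5]])
y_toy_nls = np.array([0, 0, 0, 0, 1, 1, 1, 1])

errors = {}
for seed in (0, 10, 24):
    model = linear_model.Perceptron(max_iter=100, random_state=seed)
    model.fit(X_toy_nls, y_toy_nls)
    errors[seed] = 1 - model.score(X_toy_nls, y_toy_nls)
    print(f"random_state={seed}: training error = {errors[seed]:.3f}")
```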