## ERM for separating hyperplanes (realizable case)

In [None]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt 
from sklearn import datasets

First, we load the famous Iris dataset.

In [None]:
data = datasets.load_iris()

Next, we print the class names and the feature names.

In [None]:
print(data.target_names)

In [None]:
for i, name in enumerate(data.feature_names):
    print(i, name)

Let's plot petal length vs. petal width

In [None]:
# Columns 2 and 3 hold petal length and petal width.
for i, t in enumerate(data.target_names):
    mask = data.target == i
    plt.scatter(data.data[mask, 2], data.data[mask, 3], label=t)
plt.xlabel(data.feature_names[2])
plt.ylabel(data.feature_names[3])
plt.legend()
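
The plot suggests that setosa sits well apart from the other two species. A quick numeric check of that impression, as a minimal sketch: the names `petal` and `is_setosa` are ours and not part of the notebook, and comparing per-feature ranges is only a sufficient condition for linear separability, not a general test.

```python
import numpy as np
from sklearn import datasets

data = datasets.load_iris()
petal = data.data[:, 2:4]          # petal length, petal width
is_setosa = data.target == 0       # class index 0 is setosa

# If setosa's largest value of a feature lies below the smallest value of all
# other flowers, a single threshold on that feature separates the two groups.
setosa_max = petal[is_setosa].max(axis=0)
rest_min = petal[~is_setosa].min(axis=0)
print("setosa max:", setosa_max)
print("rest min:  ", rest_min)

separable_by_threshold = bool(np.all(setosa_max < rest_min))
print("separable by a single threshold:", separable_by_threshold)
```

A threshold on one feature is a special case of a separating hyperplane, so passing this check is enough to justify the realizability assumption for setosa vs. the rest.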

Ok, so **setosa** is linearly separable from the rest (hence, the realizability assumption holds for the task setosa vs. the rest). Let's try to find a separating hyperplane. First, we standardize the two petal features to zero mean and unit variance (usually a good idea)!

In [None]:
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
# Keep only petal length and petal width (columns 2 and 3) and standardize them.
X = sc.fit_transform(data.data[:, 2:4])
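
To see what the scaling step does, the sketch below recomputes it and checks the result. It is only an illustration (the names `petal` and `X_manual` are ours); it relies on `StandardScaler` standardizing each column with the population standard deviation, which matches NumPy's default `std`.

```python
import numpy as np
from sklearn import datasets
from sklearn.preprocessing import StandardScaler

data = datasets.load_iris()
petal = data.data[:, 2:4]
X = StandardScaler().fit_transform(petal)

# Each column now has (numerically) zero mean and unit standard deviation.
print(X.mean(axis=0), X.std(axis=0))

# The same transform written out by hand.
X_manual = (petal - petal.mean(axis=0)) / petal.std(axis=0)
print(np.allclose(X, X_manual))
```

Since the transform shifts and rescales each feature by a positive constant, it does not change whether the classes are linearly separable.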

In [None]:
# Same plot as before, now on the standardized features.
for i, t in enumerate(data.target_names):
    mask = data.target == i
    plt.scatter(X[mask, 0], X[mask, 1], label=t)
plt.xlabel("petal length (standardized)")
plt.ylabel("petal width (standardized)")
plt.legend()

Ok, now we can formulate our **linear program** and solve it.

We start by creating a label vector of {+1,-1} for setosa (+1) vs. the rest (-1).

In [None]:
# +1 for setosa (class index 0), -1 for versicolor and virginica
labels = np.where(data.target == 0, 1, -1)

Now, we set up the linear program

```
max <u,w> s.t.
Aw >= v (*)
```
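Each row of A starts from the augmented sample [x_i 1], which we then multiply by the label y_i. With w = (w_1, w_2, b), row i of Aw equals y_i (<(w_1, w_2), x_i> + b), which is positive exactly when sample i is classified correctly. This gives us a matrix ```A_ub``` that, when multiplied by ```w```, gives the left-hand side of the inequality constraint (*).

To do this, we first multiply each data row by its label (**Step 1**), and then simply append the labels as a separate (last) column (**Step 2**).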



In [None]:
# Step 1: multiply every sample x_i by its label y_i
A = labels.reshape(-1, 1) * X

# Step 2: append the labels as the last column (y_i * 1, the bias term),
# so that A_ub @ w gives the left-hand side of the inequality constraint.
A_ub = np.append(A, labels.reshape(-1, 1), axis=1)
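
To make the role of `A_ub` concrete, here is a minimal sketch on a made-up four-point dataset. The points, the names `X_toy`, `y_toy`, `A_toy`, and the choice of `scipy.optimize.linprog` as solver are our assumptions, not taken from the notebook. `linprog` expects constraints of the form `A_ub @ w <= b_ub`, so the common normalization `A w >= 1` is passed as `-A w <= -1`. The objective is a zero vector because, in the realizable case, any feasible `w` has zero training error.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy data: two points per class, separable along the first axis.
X_toy = np.array([[2.0, 0.0], [3.0, 1.0], [-2.0, 0.0], [-3.0, -1.0]])
y_toy = np.array([1, 1, -1, -1])

# Rows y_i * [x_i, 1], built the same way as A_ub in the cell above.
A_toy = np.append(y_toy.reshape(-1, 1) * X_toy, y_toy.reshape(-1, 1), axis=1)

# Feasibility problem: zero objective, constraints -A w <= -1, and free
# variables (linprog's default bounds would force w >= 0).
res = linprog(c=np.zeros(3), A_ub=-A_toy, b_ub=-np.ones(4),
              bounds=[(None, None)] * 3)

w = res.x
print(res.success, w)
# Every sample satisfies y_i * (<w[:2], x_i> + w[2]) >= 1 up to tolerance.
print(A_toy @ w)
```

The first two entries of `w` are the normal vector of the hyperplane and the last entry is the bias, matching the column order of the constraint matrix.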

Now, we can set up ```v``` as a vector of all minus ones.