# Experiments with kernel machines

In this notebook we will use simple two-dimensional data sets to illustrate the behavior of the support vector machine and the Perceptron when they are used with quadratic and RBF kernels.

## 1. Basic training procedure

In [1]:
%matplotlib inline
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
from sklearn.svm import SVC
matplotlib.rc('xtick', labelsize=14) 
matplotlib.rc('ytick', labelsize=14)

The directory containing this notebook should also contain two-dimensional data files, `data1.txt` through `data5.txt`. These files contain one data point per line, along with a label (either -1 or 1), like:
* `3 8 -1` (meaning that point `x=(3,8)` has label `y=-1`)
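
To make the file format concrete, here is a minimal sketch of how such a file can be parsed. The three sample rows are made up for illustration (they are not taken from the actual data files), and an in-memory buffer stands in for a file on disk.

```python
import io
import numpy as np

# Hypothetical sample in the same format as data1.txt: "x1 x2 label" per line
sample = io.StringIO("3 8 -1\n5 2 1\n7 7 1\n")

data = np.loadtxt(sample)   # shape (n_points, 3)
x = data[:, 0:2]            # the two coordinates of each point
y = data[:, 2]              # the label of each point, -1 or 1

print(x.shape, y)
```

The same slicing is what the routines in this notebook use after calling `np.loadtxt(datafile)`.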

The next procedure, **learn_and_display_SVM**, loads one of these data sets, invokes `sklearn.svm.SVC` to learn a classifier, and then displays the data along with the decision boundary. It is invoked as follows:
* `learn_and_display_SVM(datafile, kernel_type, C_value, s_value)`
 
**NOTE:** `C_value` applies to the soft-margin SVM; `s_value` applies only to the RBF kernel.

where
* `datafile` is one of `'data1.txt'` through `'data5.txt'` (or another file in the same format)
* `kernel_type` is either `'quadratic'` or `'rbf'`
* `C_value` is the setting of the soft-margin parameter `C` (default: 1.0)
* `s_value` (for the RBF kernel) is the scaling parameter `s` (default: 1.0)

In [None]:
def learn_and_display_SVM(datafile, kernel_type='rbf', C_value=1.0, s_value=1.0):
    data = np.loadtxt(datafile)
    n,d = data.shape
    # Create training set x and labels y
    x = data[:,0:2]
    y = data[:,2]
    # Now train a support vector machine and identify the support vectors
    if kernel_type == 'rbf':
        clf = SVC(kernel='rbf', C=C_value, gamma=1.0/(s_value*s_value))
    elif kernel_type == 'quadratic':
        clf = SVC(kernel='poly', degree=2, C=C_value, coef0=1.0)
    else:
        raise ValueError("kernel_type must be 'rbf' or 'quadratic'")
    clf.fit(x,y)  # clf = classifier
    sv = np.zeros(n,dtype=bool)  # boolean mask, initially all False
    sv[clf.support_] = True  # mark the support vectors
    save_this = clf.support_  # `clf.support_` holds the indices of the support vectors
    notsv = np.logical_not(sv)
    # False in `sv` becomes True in `notsv`, and vice versa
    # Determine the x1- and x2- limits of the plot
    x1min = min(x[:,0]) - 1
    x1max = max(x[:,0]) + 1
    x2min = min(x[:,1]) - 1
    x2max = max(x[:,1]) + 1
    plt.xlim(x1min,x1max)
    plt.ylim(x2min,x2max)
    # Plot the data points, enlarging those that are support vectors
    plt.plot(x[(y==1)*notsv,0], x[(y==1)*notsv,1], 'ro')  #y==1 & notsv ==> red + smaller
    plt.plot(x[(y==1)*sv,0], x[(y==1)*sv,1], 'ro', markersize=10) #y==1 & sv ==> red + bigger
    plt.plot(x[(y==-1)*notsv,0], x[(y==-1)*notsv,1], 'k^')
    plt.plot(x[(y==-1)*sv,0], x[(y==-1)*sv,1], 'k^', markersize=10)
    # Construct a grid of points and evaluate classifier at each grid points
    grid_spacing = 0.05
    xx1, xx2 = np.meshgrid(np.arange(x1min, x1max, grid_spacing), np.arange(x2min, x2max, grid_spacing))
    grid = np.c_[xx1.ravel(), xx2.ravel()]
    Z = clf.decision_function(grid)
    # Quantize the values to -1, -0.5, 0, 0.5, 1 for display purposes
    for i in range(len(Z)):
        Z[i] = min(Z[i],1.0)
        Z[i] = max(Z[i],-1.0)
        if (Z[i] > 0.0) and (Z[i] < 1.0):
            Z[i] = 0.5
        if (Z[i] < 0.0) and (Z[i] > -1.0):
            Z[i] = -0.5
    # Show boundary and margin using a color plot
    Z = Z.reshape(xx1.shape)
    plt.pcolormesh(xx1, xx2, Z, cmap=plt.cm.PRGn, vmin=-2, vmax=2)
    plt.show()
    
    #return save_this, x,y,notsv
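
The quantization loop in the routine above can also be written without an explicit Python loop. The sketch below is one possible vectorized equivalent; the helper name `quantize` is hypothetical and is not part of the original routine.

```python
import numpy as np

def quantize(Z):
    """Map decision values to {-1, -0.5, 0, 0.5, 1} for display purposes."""
    Z = np.clip(Z, -1.0, 1.0)                        # cap values at the margins
    Z = np.where((Z > 0.0) & (Z < 1.0), 0.5, Z)      # strictly inside the positive margin
    Z = np.where((Z < 0.0) & (Z > -1.0), -0.5, Z)    # strictly inside the negative margin
    return Z

print(quantize(np.array([-3.0, -1.0, -0.2, 0.0, 0.7, 1.0, 2.5])))
```

On a fine grid this avoids one Python-level iteration per grid point, which may matter when `grid_spacing` is small.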

## 2. Experiments with the quadratic kernel

Let's try out SVM on some examples, starting with the quadratic kernel.

In [None]:
learn_and_display_SVM('data1.txt', 'quadratic', 1.0)

In [None]:
learn_and_display_SVM('data1.txt', 'quadratic', 2.0)

In [None]:
learn_and_display_SVM('data1.txt', 'quadratic', 30.0)

Also try `data2.txt` through `data5.txt`, and vary `C` (the third parameter) to see how it affects the boundary and margin.

In [None]:
learn_and_display_SVM('data2.txt', 'quadratic', 1.0)

In [None]:
learn_and_display_SVM('data3.txt', 'quadratic', 30.0)

In [None]:
learn_and_display_SVM('data4.txt', 'quadratic', 1.0)

In [None]:
learn_and_display_SVM('data4.txt', 'quadratic', 10000000000000.0) 
# With C this large, even a little slack is very costly in the objective, so this behaves like a hard-margin SVM.

In [None]:
learn_and_display_SVM('data1.txt', 'quadratic', 1000000.0) 
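
The role of `C` can be read off the soft-margin objective, 0.5*||w||^2 + C * (sum of slacks), where the slack of point i is max(0, 1 - y_i (w·x_i + b)). The sketch below evaluates this objective for a fixed classifier. The data, `w`, and `b` are made-up values, and a linear decision function is assumed only to keep the arithmetic simple; it is not the quadratic-kernel classifier trained above.

```python
import numpy as np

def soft_margin_objective(w, b, x, y, C):
    """Return 0.5*||w||^2 + C * (total slack) for a linear classifier (w, b)."""
    margins = y * (x @ w + b)
    slack = np.maximum(0.0, 1.0 - margins)   # zero for points on or outside the margin
    return 0.5 * np.dot(w, w) + C * slack.sum()

# Made-up toy data: the last point sits inside the margin, with slack 0.5
x = np.array([[2.0, 0.0], [-2.0, 0.0], [0.5, 0.0]])
y = np.array([1.0, -1.0, 1.0])
w, b = np.array([1.0, 0.0]), 0.0

for C in (1.0, 30.0, 1e6):
    print(C, soft_margin_objective(w, b, x, y, C))
```

As `C` grows, the slack term dominates the objective, so the optimizer is pushed toward solutions with no margin violations whenever such solutions exist.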

## 3. Experiments with the RBF kernel

Now experiment with the RBF kernel, on the same five data sets. This time there are two parameters to play with: `C` and the scaling parameter `s` (the third and fourth arguments).

In [None]:
learn_and_display_SVM('data1.txt', 'rbf', 10.0, 10.0)
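
To get a feel for the scaling parameter, the sketch below evaluates the RBF kernel between two fixed points for several values of `s`, along with the corresponding `gamma = 1/s^2` that the SVM routine passes to `SVC`. The two points and the helper name `rbf` are chosen for illustration only.

```python
import numpy as np

def rbf(x, z, s):
    """RBF kernel exp(-||x - z||^2 / s^2)."""
    return np.exp(-np.sum((x - z) ** 2) / s ** 2)

x = np.array([0.0, 0.0])
z = np.array([1.0, 1.0])   # squared distance from x is 2

for s in (0.1, 1.0, 10.0):
    print(f"s={s:5.1f}  gamma={1.0 / s**2:8.2f}  k(x,z)={rbf(x, z, s):.4f}")
```

With a small `s` the kernel value drops to nearly zero even for nearby points, so each training point influences only its immediate neighborhood; with a large `s` distant points still look similar, which tends to give smoother boundaries.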

## 4. The kernel Perceptron

<font color="magenta">**For you to do:**</font> Implement the kernel Perceptron algorithm as specified in lecture. Your algorithm should allow both the quadratic and RBF kernel, and should follow roughly the same signature as the SVM routine above:
* `learn_and_display_Perceptron(datafile, kernel_type, s_value)`

Recall that the Perceptron algorithm does not always converge; you will need to explicitly check for this.
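
One way to structure the convergence check is to cap the number of passes over the data and report whether a full pass finished with no updates. The sketch below does this in dual form with a precomputed kernel matrix. The function names, the `max_passes` and `seed` parameters, and the XOR-style toy data are all assumptions made for illustration; this is not necessarily the formulation given in lecture.

```python
import numpy as np

def gram_matrix(x, kernel_type, s=1.0):
    """Kernel values K[i, j] = k(x[i], x[j]) for all pairs of training points."""
    if kernel_type == 'quadratic':
        return (1.0 + x @ x.T) ** 2
    if kernel_type == 'rbf':
        sq_dists = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=2)
        return np.exp(-sq_dists / s ** 2)
    raise ValueError("kernel_type must be 'quadratic' or 'rbf'")

def kernel_perceptron(x, y, kernel_type, s=1.0, max_passes=100, seed=0):
    """Dual-form Perceptron; returns (alpha, b, converged)."""
    rng = np.random.default_rng(seed)
    K = gram_matrix(x, kernel_type, s)
    alpha, b = np.zeros(len(x)), 0.0
    for _ in range(max_passes):
        mistakes = 0
        for i in rng.permutation(len(x)):
            # Update on points that are misclassified or lie on the boundary
            if y[i] * (np.dot(alpha * y, K[:, i]) + b) <= 0:
                alpha[i] += 1
                b += y[i]
                mistakes += 1
        if mistakes == 0:
            return alpha, b, True      # a full pass with no updates: converged
    return alpha, b, False             # pass budget exhausted: report non-convergence

# Hypothetical XOR-style toy set: not linearly separable in the original space
x = np.array([[1.0, 1.0], [-1.0, -1.0], [1.0, -1.0], [-1.0, 1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
alpha, b, converged = kernel_perceptron(x, y, 'quadratic')
print(alpha, b, converged)
```

Precomputing the kernel matrix trades memory (n by n entries) for speed, which is a reasonable choice for data sets as small as the ones used here.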

### Kernels

**Radial Basis Function (RBF) kernel:** $k(x,z) = \exp\left(-\|x - z\|^2 / s^2\right)$

**Quadratic kernel:** $k(x,z) = (1 + x \cdot z)^2$
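
The quadratic kernel is an ordinary inner product in a larger feature space. For two-dimensional inputs, one feature map that works is phi(x) = (1, sqrt(2) x1, sqrt(2) x2, x1^2, x2^2, sqrt(2) x1 x2). The sketch below checks this numerically on two arbitrarily chosen points; the helper name `phi` is introduced here for illustration.

```python
import numpy as np

def phi(x):
    """Explicit feature map whose inner product gives the quadratic kernel (2-D inputs)."""
    r2 = np.sqrt(2.0)
    return np.array([1.0, r2 * x[0], r2 * x[1], x[0] ** 2, x[1] ** 2, r2 * x[0] * x[1]])

x = np.array([3.0, 8.0])
z = np.array([-1.0, 2.0])

kernel_value = (1.0 + np.dot(x, z)) ** 2      # computed in the original 2-D space
feature_value = np.dot(phi(x), phi(z))        # computed in the 6-D feature space
print(kernel_value, feature_value)
```

Both computations give the same number, which is why the kernel Perceptron never needs to build the six-dimensional vectors explicitly.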
    
    

In [None]:
# i=30
# x_picked=x[i]
# y_picked=y[i]
# n=len(x)
# alpha = np.zeros(n)
# b = 0
# s = 10
# kernel_type = "rbf"

In [2]:
def evaluate_classifier(x,y,x_picked,y_picked,alpha,b,s,kernel_type):
    # Returns 1 if (x_picked, y_picked) is classified correctly by the current
    # dual parameters (alpha, b), and -1 otherwise (including boundary points).
    if kernel_type == 'quadratic':
        # k(x_j, x_picked) = (1 + x_j . x_picked)^2 for every training point x_j
        k = (np.dot(x, x_picked) + 1)**2
    elif kernel_type == 'rbf':
        # k(x_j, x_picked) = exp(-||x_j - x_picked||^2 / s^2)
        k = np.exp(-np.sum((x - x_picked)**2, axis=1) / s**2)
    else:
        raise ValueError("kernel_type must be 'quadratic' or 'rbf'")
    # Decision value: sum_j alpha_j * y_j * k(x_j, x_picked) + b
    sum_final = np.dot(alpha * y, k)
    if y_picked*(sum_final+b) > 0:
        results = 1
    else:
        results = -1
    return results

In [3]:
def train_perceptron(x,y,s,kernel_type,n_iters=100):
    # Kernel Perceptron in dual form: alpha[i] counts the updates made on point i
    n=len(x)
    alpha = np.zeros(n)
    b = 0 
    done = False
    converged = True
    iters = 0
    np.random.seed(None)
    while not(done):
        done = True
        I = np.random.permutation(n)  # visit the points in random order
        for k in range(n):
            i = I[k]
            x_picked=x[i]
            y_picked=y[i]
            results=evaluate_classifier(x,y,x_picked,y_picked,alpha,b,s,kernel_type)
            if results == -1:  # point i is misclassified, so update
                alpha[i]=alpha[i] + 1
                b = b + y[i]
                done = False
        iters = iters + 1
        
        # Give up if mistakes are still being made after n_iters passes
        if not done and iters >= n_iters:
            done = True
            converged = False
            
    if converged:
        print("Perceptron algorithm: iterations until convergence: ", iters)
    else:
        print("Perceptron algorithm: did not converge within the specified number of iterations")
    return alpha, b, converged

In [4]:
def learn_and_display_Perceptron(datafile, kernel_type, s_value=1.0):
    
    data = np.loadtxt(datafile)
    n,d = data.shape
    x = data[:,0:2]
    y = data[:,2]
    s = s_value
    # Train the Perceptron; kernel_type is passed through explicitly
    alpha,b,converged=train_perceptron(x,y,s,kernel_type,n_iters=100)
    
    # Determine the x1- and x2- limits of the plot
    x1min = min(x[:,0]) - 1
    x1max = max(x[:,0]) + 1
    x2min = min(x[:,1]) - 1
    x2max = max(x[:,1]) + 1
    
    plt.xlim(x1min,x1max)
    plt.ylim(x2min,x2max)
    
    # Plot the data points
    plt.plot(x[(y==1),0], x[(y==1),1], 'ro', markersize=10)
    plt.plot(x[(y==-1),0], x[(y==-1),1], 'k^', markersize=10)
    
    # Construct a grid of test points covering the plot area
    grid_spacing = 0.05
    xx1, xx2 = np.meshgrid(np.arange(x1min, x1max, grid_spacing), np.arange(x2min, x2max, grid_spacing))
    grid = np.c_[xx1.ravel(), xx2.ravel()]
    
    plt.show()
    
    return alpha,b,converged,grid

In [6]:
alpha,b,converged,grid=learn_and_display_Perceptron('data5.txt','quadratic', s_value=10)

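Notes on this routine:

1. `train_perceptron` must be given `kernel_type` as an argument. If it instead refers to a global variable that was never defined, the call above fails with `NameError: name 'kernel_type' is not defined`.

2. To show the decision boundary, plot the predictions on the `grid` test points, using the kernel expressions:
>    `(np.dot(x[num_x],x_picked)+1)**2` : quadratic kernel
>    `np.exp(-1*((x[num_x,0]-x_picked[0])**2+(x[num_x,1]-x_picked[1])**2)/s**2)` : RBF kernel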
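
A minimal sketch of the grid prediction step follows. It assumes trained dual parameters `alpha` and `b` in the form returned by the training routine; the function name `predict_on_grid` and the tiny data set used to exercise it are hypothetical.

```python
import numpy as np

def predict_on_grid(x, y, alpha, b, grid, kernel_type, s=1.0):
    """Return the predicted label (+1 or -1) for every row of `grid`."""
    if kernel_type == 'quadratic':
        K = (1.0 + grid @ x.T) ** 2                      # shape (n_grid, n_train)
    elif kernel_type == 'rbf':
        sq_dists = np.sum((grid[:, None, :] - x[None, :, :]) ** 2, axis=2)
        K = np.exp(-sq_dists / s ** 2)
    else:
        raise ValueError("kernel_type must be 'quadratic' or 'rbf'")
    scores = K @ (alpha * y) + b                         # decision value at each grid point
    return np.where(scores > 0, 1.0, -1.0)

# Made-up parameters standing in for the output of training
x = np.array([[1.0, 1.0], [-1.0, -1.0]])
y = np.array([1.0, -1.0])
alpha, b = np.array([1.0, 1.0]), 0.0
grid = np.array([[0.9, 1.1], [-1.2, -0.8]])

print(predict_on_grid(x, y, alpha, b, grid, 'rbf', s=1.0))
```

Inside `learn_and_display_Perceptron`, the returned array could be reshaped to `xx1.shape` and passed to `plt.pcolormesh(xx1, xx2, Z, cmap=plt.cm.PRGn, vmin=-2, vmax=2)` before `plt.show()`, mirroring the SVM routine.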



<font color="magenta">Experiment with your routine, on the same five data sets.</font>