# Exercises

In this section we have two exercises:
1. Implement the polynomial kernel.
2. Implement the multiclass C-SVM.

## Polynomial kernel

You need to extend the ``build_kernel`` function so that it computes the polynomial kernel when ``kernel_type`` is set to ``'poly'``. The equation to implement is:
\begin{equation}
K=(X^{T}Y)^{d},
\end{equation}
where $d$ is the degree of the polynomial.
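
As a quick sanity check of the formula, here is a minimal sketch. It assumes that the rows of `X` and `Y` are the samples, so the matrix of inner products is computed as `np.dot(X, Y.T)`. The helper name `polynomial_kernel` and the toy data are made up for illustration and are not part of the notebook.

```python
import numpy as np

def polynomial_kernel(X, Y, d=2):
    """Polynomial kernel (x . y)^d for every pair of rows of X and Y."""
    # Inner products between all rows, then an elementwise power
    return np.dot(X, Y.T) ** d

# Two toy samples with two features each (hypothetical data)
X = np.array([[1.0, 2.0],
              [3.0, 4.0]])
K = polynomial_kernel(X, X, d=2)
print(K)
```

For these two samples the inner products are 5, 11 and 25, so the kernel matrix contains their squares: 25, 121 and 625.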

In [335]:
import numpy as np
import cvxopt

In [336]:
def build_kernel(data_set, d, kernel_type='linear'):  # d is the degree of the polynomial kernel
    kernel = np.dot(data_set, data_set.T)
    if kernel_type == 'rbf':
        sigma = 1.0
        objects_count = len(data_set)
        b = np.ones((len(data_set), 1))
        kernel -= 0.5 * (np.dot((np.diag(kernel)*np.ones((1, objects_count))).T, b.T)
                         + np.dot(b, (np.diag(kernel) * np.ones((1, objects_count))).T.T))
        kernel = np.exp(kernel / (2. * sigma ** 2))
    elif kernel_type == 'poly':
        kernel = np.dot(data_set, data_set.T) ** d  # elementwise power of the Gram matrix
    return kernel

## Implement a multiclass C-SVM

Use the classification method from notebook 7.3 and the Iris dataset to build a multiclass C-SVM classifier. Most of the implementation lies in a function that returns the data sets needed for training and prediction. You need to implement:
- ``choose_set_for_label``
- ``get_labels_count``
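
For orientation, a one-vs-one scheme trains one binary classifier per pair of classes. The following is a minimal sketch of how such pairs could be enumerated with `itertools.combinations`. The helper `one_vs_one_subsets` and the toy arrays are hypothetical; this is one possible approach, not the notebook's ``choose_set_for_label``.

```python
from itertools import combinations

import numpy as np

def one_vs_one_subsets(data, labels):
    """Yield (class_a, class_b, subset, targets) for every pair of classes."""
    for a, b in combinations(np.unique(labels), 2):
        # Keep only the samples that belong to one of the two classes
        mask = (labels == a) | (labels == b)
        # Map class a to -1 and class b to +1, as a binary SVM expects
        targets = np.where(labels[mask] == a, -1.0, 1.0)
        yield a, b, data[mask], targets

# Hypothetical toy data: six samples, three classes
toy_data = np.arange(12, dtype=float).reshape(6, 2)
toy_labels = np.array([0, 0, 1, 1, 2, 2])

pairs = list(one_vs_one_subsets(toy_data, toy_labels))
for a, b, subset, targets in pairs:
    print(a, b, subset.shape, targets)
```

With three classes this yields three pairs, and each subset keeps only the samples of the two classes involved.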

In [337]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from itertools import combinations


In [338]:
iris = load_iris()
data_set = iris.data
labels = iris.target


The function below returns the train set split into one subset per label, together with a single test set shared by all classifiers.

In [339]:



def choose_set_for_label(data_set, label):
    # Separate train and test
    train_data_set, test_data_set, train_labels, test_labels = train_test_split(
        data_set, label, test_size=0.2)
    # Split the train set into three lists, one per label
    final_train_data_set = []
    final_train_labels = []
    for i in range(3):
        train_data_set_aux = train_data_set[train_labels == i]
        label_aux = train_labels[train_labels == i]
        final_train_data_set.append(train_data_set_aux)
        final_train_labels.append(label_aux)

    return final_train_data_set, test_data_set, final_train_labels, test_labels

In [340]:
def get_labels_count(data_set):
    # Number of objects (rows) in the data set
    labels_count = len(data_set)
    return labels_count

Use the code that we have implemented earlier:

In [341]:
def train(train_data_set, train_labels, kernel_type='linear', C=10, threshold=1e-5):
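    # Number of training objects, defined locally so the function does not rely on a global
    objects_count = len(train_data_set)
    kernel = build_kernel(train_data_set, kernel_type=kernel_type)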

    P = train_labels * train_labels.transpose() * kernel
    q = -np.ones((objects_count, 1))
    G = np.concatenate((np.eye(objects_count), -np.eye(objects_count)))
    
    h = np.concatenate((C * np.ones((objects_count, 1)), np.zeros((objects_count, 1))))

    A = train_labels.reshape(1, objects_count)
    A = A.astype(float)
    b = 0.0

    sol = cvxopt.solvers.qp(cvxopt.matrix(P), cvxopt.matrix(q), cvxopt.matrix(G), cvxopt.matrix(h), cvxopt.matrix(A), cvxopt.matrix(b))

    lambdas = np.array(sol['x'])

    support_vectors_id = np.where(lambdas > threshold)[0]
    vector_number = len(support_vectors_id)
    support_vectors = train_data_set[support_vectors_id, :]

    lambdas = lambdas[support_vectors_id]
    targets = train_labels[support_vectors_id]

    b = np.sum(targets)
    for n in range(vector_number):
        b -= np.sum(lambdas * targets * np.reshape(kernel[support_vectors_id[n], support_vectors_id], (vector_number, 1)))
    b /= len(lambdas)

    return lambdas, support_vectors, support_vectors_id, b, targets, vector_number

def build_kernel(data_set, kernel_type='linear'):
    kernel = np.dot(data_set, data_set.T)
    if kernel_type == 'rbf':
        sigma = 1.0
        objects_count = len(data_set)
        b = np.ones((len(data_set), 1))
        kernel -= 0.5 * (np.dot((np.diag(kernel)*np.ones((1, objects_count))).T, b.T)
                         + np.dot(b, (np.diag(kernel) * np.ones((1, objects_count))).T.T))
        kernel = np.exp(kernel / (2. * sigma ** 2))
    return kernel

def classify_rbf(test_data_set, train_data_set, lambdas, targets, b, vector_number, support_vectors, support_vectors_id):
    kernel = np.dot(test_data_set, support_vectors.T)
    sigma = 1.0
    K = np.dot(test_data_set, support_vectors.T)
    #kernel = build_kernel(train_data_set, kernel_type='rbf')
    c = (1. / sigma * np.sum(test_data_set ** 2, axis=1) * np.ones((1, np.shape(test_data_set)[0]))).T
    c = np.dot(c, np.ones((1, np.shape(kernel)[1])))
    #aa = np.dot((np.diag(K)*np.ones((1,len(test_data_set)))).T[support_vectors_id], np.ones((1, np.shape(K)[0]))).T
    sv = (np.diag(np.dot(train_data_set, train_data_set.T))*np.ones((1,len(train_data_set)))).T[support_vectors_id]
    aa = np.dot(sv,np.ones((1,np.shape(kernel)[0]))).T
    kernel = kernel - 0.5 * c - 0.5 * aa
    kernel = np.exp(kernel / (2. * sigma ** 2))

    y = np.zeros((np.shape(test_data_set)[0], 1))
    for j in range(np.shape(test_data_set)[0]):
        for i in range(vector_number):
            y[j] += lambdas[i] * targets[i] * kernel[j, i]
        y[j] += b
    return np.sign(y)

Now we prepare one train set for each pair of classes:

In [342]:
iris = load_iris()
data_set = iris.data
labels = iris.target


train_data_set, test_data_set, train_labels, test_labels= choose_set_for_label(data_set, labels) 
train_data_set_final = []
train_labels_final = []
for i in range(3):  # Pairs of classes: (2, 0), (0, 1), (1, 2)
    train_data_set_aux = train_data_set[i-1].tolist() + train_data_set[i].tolist() 
    train_labels_aux = train_labels[i-1].tolist() + train_labels[i].tolist()
    train_data_set_final.append(train_data_set_aux)
    train_labels_final.append(train_labels_aux)   

Then we train and predict for each combination. Instead of ``train_data_set``, we have to pass ``train_data_set_final[i]``.

In [343]:
from sklearn.metrics import accuracy_score 


predictions = []  # here we will append the predictions for each classifier
for i in range(3):
    pred_aux = []
    # Change the train labels to +-1
    min_label = np.min(train_labels_final[i])
    max_label = np.max(train_labels_final[i])
    for j in range(len(train_labels_final[i])):
        if train_labels_final[i][j] == min_label:
            train_labels_final[i][j] = -1
        else:
            train_labels_final[i][j] = 1
    objects_count = get_labels_count(train_data_set_final[i])
    # Convert the lists to arrays
    train_data_set_final[i] = np.array(train_data_set_final[i])
    train_labels_final[i] = np.array(train_labels_final[i])

    # Training of the SVM and prediction
    lambdas, support_vectors, support_vectors_id, b, targets, vector_number = train(
        train_data_set_final[i], train_labels_final[i], kernel_type='rbf')
    predicted = classify_rbf(test_data_set, train_data_set_final[i], lambdas, targets, b,
                             vector_number, support_vectors, support_vectors_id)
    predicted = list(predicted.astype(int))  # predictions for each classifier
    # If we have -1, we append one label, and if we have +1, we append the other one
    for k in range(len(predicted)):
        if predicted[k] == -1:
            pred_aux.append(min_label)
        elif predicted[k] == 1:
            pred_aux.append(max_label)
    predictions.append(pred_aux)

print("Predictions:", predictions)


     pcost       dcost       gap    pres   dres
 0:  1.0810e+02 -1.2796e+03  2e+03  2e-01  3e-15
 1:  6.3003e+01 -1.3109e+02  2e+02  6e-03  2e-15
 2:  7.3860e+00 -1.7990e+01  3e+01  9e-16  2e-15
 3: -7.1387e-01 -4.1824e+00  3e+00  1e-15  1e-15
 4: -1.3892e+00 -2.2550e+00  9e-01  4e-16  4e-16
 5: -1.6316e+00 -1.9670e+00  3e-01  1e-15  2e-16
 6: -1.7300e+00 -1.8474e+00  1e-01  3e-16  2e-16
 7: -1.7595e+00 -1.7822e+00  2e-02  2e-16  2e-16
 8: -1.7668e+00 -1.7699e+00  3e-03  2e-16  2e-16
 9: -1.7681e+00 -1.7682e+00  9e-05  3e-16  2e-16
10: -1.7681e+00 -1.7681e+00  2e-06  3e-16  2e-16
11: -1.7681e+00 -1.7681e+00  6e-08  5e-16  2e-16
Optimal solution found.
     pcost       dcost       gap    pres   dres
 0:  9.7073e+01 -1.2517e+03  2e+03  2e-01  3e-15
 1:  6.0083e+01 -1.2130e+02  2e+02  5e-03  2e-15
 2:  7.1881e+00 -1.6701e+01  2e+01  1e-15  3e-15
 3: -5.1864e-01 -3.8056e+00  3e+00  2e-15  1e-15
 4: -1.1776e+00 -1.8470e+00  7e-01  8e-16  4e-16
 5: -1.3930e+00 -1.6677e+00  3e-01  3e-16  2e-1

We now have three lists (nested in one), one per classifier. Next, we collect the three predictions for each test sample in a list and take a majority vote over it:

In [344]:


prediction_final = []
for j in range(len(predictions[0])):
    # Collect the vote of each classifier for test sample j
    prediction_split = []
    for k in range(len(predictions)):
        prediction_split.append(predictions[k][j])
    # Majority vote: the most frequent label wins
    counts = np.bincount(prediction_split)
    prediction_final.append(np.argmax(counts))

print("Final accuracy:", accuracy_score(test_labels, prediction_final))
#https://cvxopt.org/examples/tutorial/lp.html

Final accuracy: 0.6
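
The voting step can be illustrated on a tiny hand-made example. This is only a sketch: the `votes` array and the helper `majority_vote` are hypothetical and do not come from the run above. It also shows what happens on a tie, where `np.argmax` returns the smallest of the tied labels.

```python
import numpy as np

# Hypothetical votes: one row per pairwise classifier, one column per test sample
votes = np.array([
    [0, 2, 0],   # classifier for classes 0 vs 2
    [0, 1, 1],   # classifier for classes 0 vs 1
    [1, 2, 2],   # classifier for classes 1 vs 2
])

def majority_vote(votes, n_classes):
    """Return the most frequent label per column; ties go to the smallest label."""
    counts = np.array([np.bincount(col, minlength=n_classes) for col in votes.T])
    return counts.argmax(axis=1)

winners = majority_vote(votes, n_classes=3)
print(winners)
```

The first sample gets two votes for class 0 and the second gets two votes for class 2. The third sample receives one vote per class, so the tie is resolved in favour of class 0.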


The accuracy is low, which is to be expected given that none of the classifiers ever predicts class 2. There is probably a bug somewhere in the code, but I have not located it.

Update: I checked notebook 062_SVM_C again and saw that the accuracy reported there is 0.55, and that the predictions are always all 1's or all -1's. The problem may therefore lie in the function ``classify_rbf``, but in principle we are not supposed to modify it.

In [345]:
print(test_labels)
print(predictions)

[2 0 0 1 2 2 2 0 0 1 0 1 2 0 1 2 2 0 2 1 1 2 0 0 0 2 1 2 1 0]
[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0], [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]]
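
One possible contributor to the constant predictions, offered as a guess rather than a confirmed diagnosis: in this notebook the train labels are built with `np.array(...)` from a flat list, so they are 1-D, while ``train`` computes `train_labels * train_labels.transpose() * kernel`. For a 1-D array `transpose()` does nothing, so that product is elementwise and the labels drop out of the matrix. The sketch below uses made-up values (`y_flat`, `y_col`, `K` are illustrative names) to show the difference between 1-D labels and a column vector.

```python
import numpy as np

y_flat = np.array([1.0, -1.0, 1.0])      # 1-D labels, shape (3,)
y_col = y_flat.reshape(-1, 1)            # column vector, shape (3, 1)
K = np.arange(1.0, 10.0).reshape(3, 3)   # stand-in for a kernel matrix

# 1-D case: transpose() is a no-op, so y * y.T is elementwise y**2, which is all ones
P_flat = y_flat * y_flat.transpose() * K

# Column case: y * y.T broadcasts to the outer product y_i * y_j
P_col = y_col * y_col.transpose() * K

print(np.array_equal(P_flat, K))                             # labels have no effect
print(np.array_equal(P_col, np.outer(y_flat, y_flat) * K))   # signed kernel matrix
```

If this is indeed the issue, reshaping the labels to a column before calling ``train`` would be one thing to try. Whether that alone fixes the accuracy has not been verified here.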
