<a href="https://colab.research.google.com/github/abdhmohammadi/DataScience/blob/main/SVM_Algorithm.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<a name="top"></a>
<font size='20'>SVM algorithm</font>

Abdullah Mohammadi, Ferdowsi University of Mashad

abdhmohammady@gmail.com

In this notebook, we examine how to select the best parameters for the <b>SVM</b> algorithm.
   

*   [Import libraries](#Import_libraries)
*   [About dataset](#about_dataset)
*   [Reading the dataset](#reading_dataset)
*  [Preprocessing]()
    *  [Mapping categorical columns](#map_columns)
    *  [Split dataset](#split_dataset)
*  [Implementation for linear kernel](#implementation_for_linear)
*  [Implementation for non-linear kernels](#implementation_for_none_linear)
* [Test for Linear kernel](#test_for_linear)
* [Test for RBF kernel](#test_gamma_for_rbf)
* [Test for Polynomial kernel](#test_for_poly)
* [Test for GridSearchCV](#gridsearchCV)
* [<font color='red'><b>Implementation for Least Squares SVM (LS-SVM)</b></font>](#lssvm)
  * [Helper functions](#get_kernel)
  * [Least Squares class definition](#lssvm_class_defination)
  * [LS-SVM Usage](#use_lssvm)

<a name="Import_libraries">Import libraries</a>

In [1]:
import pandas as pd                  # To use Dataframe tools
import os                                    # To change working directory
import time                                 # To calculate execution time
import matplotlib.pyplot as plt  # To plot 'Score' graph
#In order to create support vector machine classifiers in sklearn, we can use the SVC class as part of the svm module.
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score             # To calculate the accuracy of svm 
from sklearn.model_selection import GridSearchCV # To build GridSearchCV and choose best parameters

import numpy as np
from numpy import dot, exp
from scipy.spatial.distance import cdist
#from sklearn.datasets import load_digits
from sklearn.preprocessing import MinMaxScaler

<a name="about_dataset">About dataset</a>

This article uses the dataset dementia_dataset.csv from https://www.kaggle.com, which was collected to study a syndrome called dementia. In short, dementia is caused by a variety of diseases and injuries that mainly affect the brain, such as Alzheimer's disease or stroke. The dataset covers 150 subjects aged 60 to 96 years. Each subject was scanned on two or more visits separated by at least one year, for a total of 373 imaging sessions. For each subject, 3 or 4 T1-weighted MRI scans acquired in single-scan sessions were included. The subjects are all right-handed and include both men and women. 72 subjects were identified as not having dementia throughout the study. 64 subjects were diagnosed as having dementia at their initial visit and remained so for follow-up scans, including 51 with mild to moderate Alzheimer's disease. Another 14 were dementia-free at their initial visit and were diagnosed as having dementia at a follow-up visit.[5] The target column in this dataset is named Group, which contains two values: Nondemented, meaning no dementia, and Demented, meaning the subject has dementia.



We are going to investigate how the value of the C parameter affects the accuracy of the SVM algorithm for different kernels.
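To make the role of C concrete before turning to the dementia data, here is a minimal sketch on synthetic data. The dataset from `make_classification`, the three C values, and all variable names are illustrative assumptions and are not part of this notebook's experiment. A smaller C means stronger regularization, a larger C means the model tries harder to classify every training point correctly.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic binary problem, used only for illustration (not the dementia data)
X_demo, y_demo = make_classification(n_samples=200, n_features=5, random_state=0)
Xd_train, Xd_test, yd_train, yd_test = train_test_split(X_demo, y_demo, random_state=0)

# Train one linear SVM per C value and record its test accuracy
demo_scores = {}
for C in (0.01, 1.0, 100.0):
    model = SVC(C=C, kernel='linear').fit(Xd_train, yd_train)
    demo_scores[C] = model.score(Xd_test, yd_test)

for C, acc in demo_scores.items():
    print(f"C={C}: test accuracy = {acc:.3f}")
```

The rest of the notebook repeats this idea on the real data with a finer grid of C values.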

<a name="reading_dataset">Reading dataset</a>

In this notebook, we read the data directly from the Kaggle server. You can download this dataset from <a href="https://storage.googleapis.com/kagglesdsdata/datasets/1519608/2509179/dementia_dataset.csv?X-Goog-Algorithm=GOOG4-RSA-SHA256&X-Goog-Credential=gcp-kaggle-com%40kaggle-161607.iam.gserviceaccount.com%2F20230304%2Fauto%2Fstorage%2Fgoog4_request&X-Goog-Date=20230304T105356Z&X-Goog-Expires=259200&X-Goog-SignedHeaders=host&X-Goog-Signature=6d8916f91dd821b016f3c9dfd04c79b8fd51829e621879188bd3b9ccc915776cc5d12731aef67dfad298758a3a8a53a58d92c0724c9d14acc1679ef7ee2467d5bfd10745a64687ac60dcd46bf82f325a2b558a8347d8fe3ed9718fbc46f0dbe472ff73090f164a808f50547a37671d52de1924d6406af8276debd87d9f55cfe8b0da0d6f01b470867df4aeb25d7b7d30963cf888bc0241eb8f5510458d965634e26b9d90d363afb81355286f81bc7ccb4b508df8ab8e3eabc7424c6d68cd9a5049916051fe4c22021f8abba5a2be5bffae6ec9e5afa2655ba315fbee84c02d80d6026015d7f870508d688eddeebe5c711e7dcb094db2d898b0c68ece6c767038">here</a>.

In [2]:

url ="https://storage.googleapis.com/kagglesdsdata/datasets/1519608/2509179/dementia_dataset.csv?X-Goog-Algorithm=GOOG4-RSA-SHA256&X-Goog-Credential=gcp-kaggle-com%40kaggle-161607.iam.gserviceaccount.com%2F20230304%2Fauto%2Fstorage%2Fgoog4_request&X-Goog-Date=20230304T105356Z&X-Goog-Expires=259200&X-Goog-SignedHeaders=host&X-Goog-Signature=6d8916f91dd821b016f3c9dfd04c79b8fd51829e621879188bd3b9ccc915776cc5d12731aef67dfad298758a3a8a53a58d92c0724c9d14acc1679ef7ee2467d5bfd10745a64687ac60dcd46bf82f325a2b558a8347d8fe3ed9718fbc46f0dbe472ff73090f164a808f50547a37671d52de1924d6406af8276debd87d9f55cfe8b0da0d6f01b470867df4aeb25d7b7d30963cf888bc0241eb8f5510458d965634e26b9d90d363afb81355286f81bc7ccb4b508df8ab8e3eabc7424c6d68cd9a5049916051fe4c22021f8abba5a2be5bffae6ec9e5afa2655ba315fbee84c02d80d6026015d7f870508d688eddeebe5c711e7dcb094db2d898b0c68ece6c767038"

try:
      #reading the dementia dataset
      dementias = pd.read_csv(url)
except Exception as e:
      print(e)
      print("Now we read archived data from google colab")
      dementias = pd.read_csv("/content/drive/MyDrive/DATA/dementia_dataset.csv")

print("Size of the dataset before dropping the records with missed values:",dementias.shape[0],"rows &",dementias.shape[1],"columns" )

# Since machine learning algorithms cannot work with missing data, we have to drop these records.
# Dropping the records with missing value
dementias = dementias.dropna()
print("Size of the dataset after  dropping the records with missed values:",dementias.shape[0],"rows &",dementias.shape[1],"columns" )


HTTP Error 400: Bad Request
Now we read archived data from google colab
Size of the dataset before dropping the records with missed values: 373 rows & 15 columns
Size of the dataset after  dropping the records with missed values: 354 rows & 15 columns


<a name="map_columns">Mapping categorical columns</a>

We have two columns of discrete data that must be converted to numerical data.

In [3]:
# The dataset has two categorical columns
# Mapping categorical columns to 0 and 1
dementias['M/F'] = dementias['M/F'].map({'M': 0, 'F': 1})
dementias['Hand'] = dementias['Hand'].map({'R': 0, 'L': 1})

dementias.head()

Unnamed: 0,Subject ID,MRI ID,Group,Visit,MR Delay,M/F,Hand,Age,EDUC,SES,MMSE,CDR,eTIV,nWBV,ASF
0,OAS2_0001,OAS2_0001_MR1,Nondemented,1,0,0,0,87,14,2.0,27.0,0.0,1987,0.696,0.883
1,OAS2_0001,OAS2_0001_MR2,Nondemented,2,457,0,0,88,14,2.0,30.0,0.0,2004,0.681,0.876
5,OAS2_0004,OAS2_0004_MR1,Nondemented,1,0,1,0,88,18,3.0,28.0,0.0,1215,0.71,1.444
6,OAS2_0004,OAS2_0004_MR2,Nondemented,2,538,1,0,90,18,3.0,27.0,0.0,1200,0.718,1.462
7,OAS2_0005,OAS2_0005_MR1,Nondemented,1,0,0,0,80,12,4.0,28.0,0.0,1689,0.712,1.039
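One detail of `Series.map` with a dictionary is worth keeping in mind: any value that is missing from the dictionary becomes NaN. The sketch below uses a toy column (the value 'X' is a deliberately unmapped, hypothetical example) to show how this can be checked after mapping.

```python
import pandas as pd

# Toy column; 'X' is a hypothetical value that is not in the mapping
toy = pd.Series(['M', 'F', 'X', 'F'])
mapped = toy.map({'M': 0, 'F': 1})

# Values missing from the dictionary are turned into NaN
n_unmapped = int(mapped.isna().sum())
print(mapped.tolist())
print("Unmapped values:", n_unmapped)
```

Running the same `isna().sum()` check on the mapped 'M/F' and 'Hand' columns would confirm that no unexpected category slipped through.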


<a name="split_dataset">Split dataset</a>

Split the data into training and test sets.

The 'Subject ID' and 'MRI ID' columns must be removed before prediction.

In [4]:
# Splitting our data
# By default, sklearn reserves 25% of the dataset for testing (and 75% for training).
# We do not need the 'Subject ID' and 'MRI ID' columns
X = dementias[['Visit','MR Delay','M/F','Hand','Age','EDUC','SES','MMSE','CDR','eTIV','nWBV','ASF']]

# Target
y = dementias['Group']

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=100)
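The default split proportions can be checked on a toy array. The sketch below is only an illustration with made-up data. It also shows the optional `stratify` argument, which this notebook does not use, as one way to keep the class proportions equal in both parts.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data: 200 samples, 2 features, two perfectly balanced classes
X_toy = np.arange(400).reshape(200, 2)
y_toy = np.array([0, 1] * 100)

# Default behaviour: 75% for training, 25% for testing
Xa, Xb, ya, yb = train_test_split(X_toy, y_toy, random_state=100)
print(len(Xa), len(Xb))

# Optional: stratify=y keeps the class proportions in both parts
Xs_tr, Xs_te, ys_tr, ys_te = train_test_split(
    X_toy, y_toy, test_size=0.25, stratify=y_toy, random_state=100)
print("Class 1 samples in the stratified test set:", int(ys_te.sum()))
```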

<a name="implementation_for_linear"><b>Implementation for linear kernel</b></a>

In [None]:
# kernel{‘linear’, ‘poly’, ‘rbf’, ‘sigmoid’, ‘precomputed’} or callable, default=’rbf’
#   Specifies the kernel type to be used in the algorithm. If none is given, ‘rbf’ will be used.
#   If a callable is given it is used to pre-compute the kernel matrix from data matrices; 
#   that matrix should be an array of shape (n_samples, n_samples).

# C:Regularization parameter.
#   The strength of the regularization is inversely proportional to C.
#   Must be strictly positive. The penalty is a squared l2 penalty.

# degree: int, default=3
#   Degree of the polynomial kernel function (‘poly’). Must be non-negative. Ignored by all other kernels.

# gamma{‘scale’, ‘auto’} or float, default=’scale’
#   Kernel coefficient for ‘rbf’, ‘poly’ and ‘sigmoid’.
#   if gamma='scale' (default) is passed then it uses 1 / (n_features * X.var()) as value of gamma,
#   if ‘auto’, uses 1 / n_features
#   if float, must be non-negative.
# *** Changed in version 0.22: The default value of gamma changed from ‘auto’ to ‘scale’.

def print_svm_results(plot=False ,start_ = 0.5, end_ =1.5, step =0.1, kernel_='linear',degree_ =2, gamma_="scale"):
    
    results = pd.DataFrame(columns=['C','Accuracy Score','Time(second)'])
    
    if start_>=end_ : end_ = start_ + 0.5*step
    
    i = 0
    
    total_time = 0
    # Computing SVC results for each value of C in sequence and collecting the accuracy score of each iteration
    while start_ < end_ :

        exec_start = time.time()
        # Building and training our model
        classifier = SVC(C=start_, kernel=kernel_, degree= degree_, gamma= gamma_)
        classifier.fit(X_train, y_train)  
        predictions = classifier.predict(X_test)
        
        score = accuracy_score(y_test, predictions) 

        exec_end = time.time()

        exec_time = exec_end - exec_start
        
        results.at[i,'Time(second)'] = exec_time
        results.at[i,'C']= start_
        results.at[i,'Accuracy Score'] = score
        total_time = total_time + exec_time

        i = i + 1

        start_ = start_ + step
    # ================================================================    
    # ===== End of while  ============================================
    # ================================================================
    if plot:
        plt.xlabel("\nC: Regularization parameter \n\nSVM with "+kernel_+" kernel")
        plt.ylabel("Accuracy Score")
        # plt.title is a function; assigning to it would overwrite it instead of setting the title
        plt.title("SVM with " + kernel_ + " kernel")

        plt.plot(results['C'], results['Accuracy Score'],color='green', linewidth = 1,marker='o', markerfacecolor='blue', markersize=2)

        i=0
        while i<len(results) :
            plt.annotate(str(round(results['Accuracy Score'].loc[results.index[i]],2))
                     ,xy=(results['C'].loc[results.index[i]]-0.025,results['Accuracy Score'].loc[results.index[i]]-0.001))
            i = i +1

        plt.show()

    return total_time,results
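The comments in the cell above describe how `gamma='scale'` and `gamma='auto'` are computed. As a small sketch on random data (the arrays and names are illustrative assumptions), the two formulas can be evaluated by hand. The example also shows that `'scale'` depends on the overall variance of X, so a single feature with a large range changes it considerably.

```python
import numpy as np

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 4))      # toy data: 50 samples, 4 features
n_features = X_demo.shape[1]

# gamma='scale' -> 1 / (n_features * X.var()); gamma='auto' -> 1 / n_features
gamma_scale = 1.0 / (n_features * X_demo.var())
gamma_auto = 1.0 / n_features

# Multiply one feature by 1000 to mimic a column measured in much larger units
X_big = X_demo.copy()
X_big[:, 0] *= 1000.0
gamma_scale_big = 1.0 / (n_features * X_big.var())

print("scale:", gamma_scale, "auto:", gamma_auto, "scale with one large feature:", gamma_scale_big)
```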


<a name="implementation_for_none_linear">Implementation for non-linear kernels</a>

In [None]:
# gamma{‘scale’, ‘auto’} or float, default=’scale’
#   Kernel coefficient for ‘rbf’, ‘poly’ and ‘sigmoid’.
#   if gamma='scale' (default) is passed then it uses 1 / (n_features * X.var()) as value of gamma,
#   if ‘auto’, uses 1 / n_features
#   if float, must be non-negative.

g= [0.00001,0.0001,0.001,0.01,0.1,1,10,100,1000,10000]

def print_none_linear_results(plot=False ,start = 0.1,end =1,step_ =0.1, kernel='rbf',degree =3,gamma= g ):

        gamma_results = pd.DataFrame(columns=['Best C','Gamma','Accuracy Score','Time(second)'])

        gamma_executed_time = 0
 
        i = 0
        while i<len(gamma):

              executed_time, results = print_svm_results(plot = False, start_ = start, end_ = end, step =step_,degree_=degree, kernel_=kernel, gamma_=gamma[i])
    
              # Store gamma value    
              gamma_results.at[i,'Gamma']= gamma[i]
              # Choose the maximum 'Accuracy Score' for the gamma given by gamma[i]
              gamma_results.at[i,'Accuracy Score'] = results.max()['Accuracy Score']
              # Choose index of maximum 'Accuracy Score' to find best value of the 'C'
              best_idx = results["Accuracy Score"].astype(float).idxmax()
              # Set best C value
              gamma_results.at[i,'Best C']= results["C"][best_idx]
              # Execution time of the 'print_svm_results' function
              gamma_results.at[i,'Time(second)'] = executed_time
              # store total execution time for collecting gamma results
              gamma_executed_time = gamma_executed_time + executed_time
              # next step
              i = i + 1

        if plot:
              plt.xlabel("\nGamma \n\nSVM with "+ kernel+" kernel")
              plt.ylabel("Accuracy Score")
              # plt.title is a function; assigning to it would overwrite it instead of setting the title
              plt.title("SVM with "+ kernel +" kernel")

              plt.plot(gamma_results['Gamma'], gamma_results['Accuracy Score'],color='green', linewidth = 1,marker='o', markerfacecolor='blue', markersize=5)

              i=0
              while i<len(gamma_results) :
                    plt.annotate(str(round(gamma_results['Accuracy Score'].loc[gamma_results.index[i]],2))
                          ,xy=(gamma_results['Gamma'].loc[gamma_results.index[i]]-0.025,gamma_results['Accuracy Score'].loc[gamma_results.index[i]]-0.001))
                    i = i +1

              plt.show()

        return gamma_executed_time, gamma_results
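Two steps of the function above can be illustrated in isolation: building a logarithmic grid of gamma values and picking the row with the highest accuracy. In the sketch below, `np.logspace` is an alternative way to produce the same values as the hand-written list `g`, and the small results table contains hypothetical numbers used only to demonstrate `idxmax`.

```python
import numpy as np
import pandas as pd

# Ten powers of ten from 1e-5 to 1e4, the same values as the list g above
g_log = np.logspace(-5, 4, num=10)
print(g_log)

# Hypothetical results table, only to show how the best row is selected
toy_results = pd.DataFrame({'C': [0.1, 0.2, 0.3],
                            'Accuracy Score': [0.55, 0.61, 0.58]})
best_row = toy_results.loc[toy_results['Accuracy Score'].idxmax()]
print("Best C:", best_row['C'])
```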


<a name="test_for_linear"></a>
<html>
    <body>
        <h1>Test the parameter C values for linear kernel</h1>
    </body>
</html>

[Go to Implementation](#implementation_for_linear)

In [None]:
# Collecting the SVM results may take some time.
executed_time,results = print_svm_results(start_ = 0.1, end_ =2.1, step =0.1, kernel_='linear',gamma_=0.001)
print("Total time: ",executed_time, "seconds")
results.head(len(results))

<a name="test_gamma_for_rbf"></a>
<html>
    <body>
        <h1>Test for RBF kernel</h1>
    </body>
</html>

[Go to Implementation](#implementation_for_none_linear)

In [None]:

gamma_executed_time , gamma_results = print_none_linear_results (plot=False ,start = 0.1,end =2.1,step_ =0.1, kernel='rbf',gamma=g)

print("Total time: ",gamma_executed_time, "seconds")

gamma_results.head(len(gamma_results))


Total time:  2.9977381229400635 seconds


Unnamed: 0,Best C,Gamma,Accuracy Score,Time(second)
0,0.8,1e-05,0.561798,0.272326
1,0.3,0.0001,0.573034,0.288679
2,1.0,0.001,0.606742,0.323706
3,0.9,0.01,0.58427,0.317725
4,0.1,0.1,0.550562,0.29809
5,0.1,1.0,0.550562,0.309421
6,0.1,10.0,0.550562,0.297776
7,0.1,100.0,0.550562,0.291605
8,0.1,1000.0,0.550562,0.301867
9,0.1,10000.0,0.550562,0.296542


The best result among the parameters above is in the third row (index 2): gamma = 0.001 with C = 1.0 gives an accuracy of about 0.607, which is much worse than the result of the linear kernel.
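One possible explanation, stated here as an assumption that this notebook does not test, is that the features are on very different scales (eTIV is in the thousands while nWBV is below 1), and the RBF kernel depends on Euclidean distances. The sketch below uses synthetic data with one artificially enlarged feature to show how a scaling step can be combined with an RBF SVM in a pipeline. It makes no claim about the dementia data itself.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic data; the first feature is enlarged to mimic a large-unit column
X_rbf, y_rbf = make_classification(n_samples=300, n_features=6, random_state=1)
X_rbf[:, 0] *= 1000.0

Xr_train, Xr_test, yr_train, yr_test = train_test_split(X_rbf, y_rbf, random_state=1)

# RBF SVM on the raw features
raw_rbf = SVC(kernel='rbf', gamma=0.001).fit(Xr_train, yr_train)
raw_acc = raw_rbf.score(Xr_test, yr_test)

# Same kernel, but features are standardized first (scaler is fitted on training data only)
scaled_rbf = make_pipeline(StandardScaler(), SVC(kernel='rbf')).fit(Xr_train, yr_train)
scaled_acc = scaled_rbf.score(Xr_test, yr_test)

print("raw:", raw_acc, "scaled:", scaled_acc)
```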


<a name="test_for_poly"></a>
<html>
    <body>
        <h1>Test for Polynomial kernel</h1>
        <h4>In this case the SVM algorithm is very slow, so we tested the parameters manually instead of looping over them. For this reason, we set start, end, and step to the same value so that the loop is executed only once.</h4>
    </body>
</html>

In [None]:
gamma_executed_time , gamma_results = print_none_linear_results (plot=False ,start = 0.1,end =0.1, step_ =0.1,degree = 3, kernel='poly',gamma=[0.001])

print("Total time: ",gamma_executed_time, "seconds")

gamma_results.head(len(gamma_results))


Total time:  230.8882541656494 seconds


Unnamed: 0,Best C,Gamma,Accuracy Score,Time(second)
0,0.1,0.001,0.876404,230.888254


<a name="gridsearchCV"/>
<html>
    <body>
        <h1>Test for GridSearchCV</h1>
        <h3>GridSearchCV is a method to find optimal parameter values from a given set of parameters in a grid. It is basically a cross-validation technique that performs hyperparameter tuning to determine optimal values for a given model.</h3>
    </body>
</html>

In [None]:

# Building and training our model
classifier = SVC()

parameters = {'kernel':['linear', 'rbf','poly'], 'C':[0.1,0.6, 1],'gamma':[0.001],'degree':[2]}

clf = GridSearchCV(classifier, parameters, refit = True, verbose = 3)
clf.fit(X_train, y_train)


Fitting 5 folds for each of 9 candidates, totalling 45 fits
[CV 1/5] END C=0.1, degree=2, gamma=0.001, kernel=linear;, score=0.811 total time=   6.7s
[CV 2/5] END C=0.1, degree=2, gamma=0.001, kernel=linear;, score=0.774 total time=   4.7s
[CV 3/5] END C=0.1, degree=2, gamma=0.001, kernel=linear;, score=0.811 total time=   3.0s
[CV 4/5] END C=0.1, degree=2, gamma=0.001, kernel=linear;, score=0.811 total time=   8.2s
[CV 5/5] END C=0.1, degree=2, gamma=0.001, kernel=linear;, score=0.887 total time=   5.8s
[CV 1/5] END C=0.1, degree=2, gamma=0.001, kernel=rbf;, score=0.547 total time=   0.0s
[CV 2/5] END C=0.1, degree=2, gamma=0.001, kernel=rbf;, score=0.528 total time=   0.0s
[CV 3/5] END C=0.1, degree=2, gamma=0.001, kernel=rbf;, score=0.528 total time=   0.0s
[CV 4/5] END C=0.1, degree=2, gamma=0.001, kernel=rbf;, score=0.528 total time=   0.0s
[CV 5/5] END C=0.1, degree=2, gamma=0.001, kernel=rbf;, score=0.528 total time=   0.0s
[CV 1/5] END C=0.1, degree=2, gamma=0.001, kernel=poly;

In [None]:
print("Best params",clf.best_params_)
print("Best score",clf.best_score_)
print("Best estimator:",clf.best_estimator_)

Best params {'C': 0.1, 'degree': 2, 'gamma': 0.001, 'kernel': 'poly'}
Best score 0.909433962264151
Best estimator: SVC(C=0.1, degree=2, gamma=0.001, kernel='poly')


The best parameters found by GridSearchCV can be compared with the ones we obtained with our own function, keeping in mind that the two scores are measured differently: our function reports accuracy on the held-out test set, while `best_score_` is the mean cross-validated accuracy over the training folds. Our function reached an accuracy of 0.921348 with the linear kernel at C = 0.6. In the run shown above, GridSearchCV selected the polynomial kernel of degree 2 with C = 0.1 and gamma = 0.001, with a best score of about 0.909. If we also take the classification execution time into account, the results are in line with our own.
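To compare a grid search with a manual search on equal terms, it helps to look at both the cross-validated score and the accuracy on the held-out test set. The following sketch does this on synthetic data, reusing the parameter grid from the cell above. The data and variable names are illustrative assumptions.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Synthetic data, used only for illustration
X_gs, y_gs = make_classification(n_samples=200, n_features=5, random_state=0)
Xg_train, Xg_test, yg_train, yg_test = train_test_split(X_gs, y_gs, random_state=0)

param_grid = {'kernel': ['linear', 'rbf', 'poly'], 'C': [0.1, 0.6, 1],
              'gamma': [0.001], 'degree': [2]}
search = GridSearchCV(SVC(), param_grid, cv=5).fit(Xg_train, yg_train)

# One row per candidate: mean cross-validated accuracy and its rank
cv_table = pd.DataFrame(search.cv_results_)[
    ['param_kernel', 'param_C', 'mean_test_score', 'rank_test_score']]
print(cv_table.sort_values('rank_test_score'))

# best_score_ is a cross-validation mean; score() evaluates the refitted model on new data
test_acc = search.score(Xg_test, yg_test)
print("CV mean:", search.best_score_, "held-out accuracy:", test_acc)
```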



[Go to top](#top)

<h1><a name="lssvm"><b>Least Squares SVM</b></a></h1>

<a name="get_kernel"><b>Helper functions</b></a>

In [5]:
def get_kernel(name='linear', **params):
    """The method that returns the kernel function, given the 'kernel'
    parameter. """
    
    def linear(x_i, x_j):  return dot(x_i, x_j.T)

    def poly(x_i, x_j, d=params.get('d', 3)): return (dot(x_i, x_j.T) + 1) ** d

    def rbf(x_i, x_j, sigma=params.get('sigma', 1)): return exp(-cdist(x_i, x_j) ** 2 / sigma ** 2)

    kernels = {'linear': linear, 'poly': poly, 'rbf': rbf}

    if kernels.get(name) is None:
        raise KeyError( f"Kernel '{name}' is not defined, try one in the list: "  f"{list(kernels.keys())}." )
    else:
        return kernels[name]
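As a quick check of what the `rbf` helper returns, the sketch below re-implements the same formula under a separate, hypothetical name (`rbf_demo`) and evaluates it on three toy points. The resulting kernel matrix is symmetric and has ones on its diagonal, because every point has zero distance to itself.

```python
import numpy as np
from scipy.spatial.distance import cdist

def rbf_demo(x_i, x_j, sigma=1.0):
    # Same formula as the rbf helper above: exp(-||x_i - x_j||^2 / sigma^2)
    return np.exp(-cdist(x_i, x_j) ** 2 / sigma ** 2)

# Three toy points in two dimensions
pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
K_demo = rbf_demo(pts, pts)

print(K_demo.shape)
print(K_demo)
```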

In [6]:
import codecs
import json

def dump_model(model_dict, file_encoder, filepath='model'):
    with open(f"{filepath.replace('.json', '')}.json", 'w') as fp:
        json.dump(model_dict, fp, default=file_encoder)

def load_model(filepath='model'):
    helper_filepath = filepath if filepath.endswith('.json') else f"{filepath}.json"
    file_text = codecs.open(helper_filepath, 'r', encoding='utf-8').read()
    model_json = json.loads(file_text)

    return model_json

In [7]:
def dummie2multilabel(X):
    """Convert dummies to multilabel"""
    N = len(X)
    X_multi = np.zeros((N,1),dtype='int')
    for i in range(N):
        temp = np.where(X[i]==1)[0] # find 1 in the array
        if temp.size == 0: # an empty array: there is no '1' in the X[i] array
            X_multi[i] = 0 # so we denote this class '0'
        else:
            X_multi[i] = temp[0] + 1
    return X_multi.T[0]

def numpy_json_encoder(obj):
      if type(obj).__module__ == np.__name__:
          if isinstance(obj, np.ndarray):
             return obj.tolist()
          else:
             return obj.item()
      raise TypeError(f"Unable to 'jsonify' object of type: {type(obj)}")
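The encoder above is passed to `json.dump` through its `default` argument, which is called only for objects that the standard encoder cannot handle. The sketch below shows the round trip with a simplified, hypothetical encoder (`np_encoder_demo`) and a small payload.

```python
import json
import numpy as np

def np_encoder_demo(obj):
    # Simplified variant of the encoder above, for illustration only
    if isinstance(obj, np.ndarray):
        return obj.tolist()          # arrays become nested lists
    if isinstance(obj, np.generic):
        return obj.item()            # NumPy scalars become Python scalars
    raise TypeError(f"Unable to 'jsonify' object of type: {type(obj)}")

payload = {'alpha': np.array([0.5, 1.5]), 'n': np.int64(3)}
text = json.dumps(payload, default=np_encoder_demo)
restored = json.loads(text)

print(text)
print(restored)
```

After loading, arrays come back as plain lists, which is why the `load` method of the class wraps each parameter in `np.array`.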


<a name="lssvm_class_defination"><b>Least Squares SVM (LS-SVM) class implementation  </b>[To End](#use_lssvm)</a>

In [29]:
class LSSVM():
    """A class that implements the Least Squares Support Vector Machine 
    for classification tasks.

    It uses Numpy pseudo-inverse function to solve the dual optimization 
    problem with ordinary least squares. In multiclass classification 
    problems the approach used is one-vs-all, so, a model is fit for each 
    class while considering the others a single set of the same class.
    
    # Parameters:
    - gamma: float, default = 1.0
        Constant that control the regularization of the model, it may vary 
        in the set (0, +infinity). The closer gamma is to zero, the more 
        regularized the model will be.
    - kernel: {'linear', 'poly', 'rbf'}, default = 'rbf'
        The kernel used for the model, if set to 'linear' the model 
        will not take advantage of the kernel trick, and the LSSVC may only
        be useful for linearly separable problems.
    - kernel_params: **kwargs, default = depends on 'kernel' choice
        If kernel = 'linear', these parameters are ignored. If kernel = 'poly',
        'd' is accepted to set the degree of the polynomial, with default = 3. 
        If kernel = 'rbf', 'sigma' is accepted to set the radius of the 
        gaussian function, with default = 1. 
     
    # Attributes:
    - All hyperparameters of section "Parameters".
    - alpha: ndarray of shape (1, n_support_vectors) if in binary 
             classification and (n_classes, n_support_vectors) for 
             multiclass problems
        Each column is the optimum value of the dual variable for each model
        (using the one-vs-all approach we have n_classes == n_classifiers), 
        it can be seen as the weight given to the support vectors 
        (sv_x, sv_y). As usually there is no alpha == 0, we have 
        n_support_vectors == n_train_samples.
    - b: ndarray of shape (1,) if in binary classification and (n_classes,) 
         for multiclass problems 
        The optimum value of the bias of the model.
    - sv_x: ndarray of shape (n_support_vectors, n_features)
        The set of the supporting vectors attributes, it has the shape 
        of the training data.
    - sv_y: ndarray of shape (n_support_vectors, n)
        The set of the supporting vectors labels. If the label is represented 
        by an array of n elements, the sv_y attribute will have n columns.
    - y_labels: ndarray of shape (n_classes, n)
        The set of unique labels. If the label is represented by an array 
        of n elements, the y_label attribute will have n columns.
    - K: function, default = rbf()
        Kernel function.
    """
    
    def __init__(self, gamma=1, kernel='rbf', **kernel_params): 
        # Hyperparameters
        self.gamma = gamma
        self.kernel_ = kernel
        self.kernel_params = kernel_params
        
        # Model parameters
        self.alpha = None
        self.b = None
        self.sv_x = None
        self.sv_y = None
        self.y_labels = None
        
        self.K = get_kernel(kernel, **kernel_params)
    
    def optimize(self, X, y_values):
        """Help function that optimizes the dual variables through the 
           use of the kernel matrix pseudo-inverse.
        """
        sigma = np.multiply(y_values*y_values.T, self.K(X,X))
        
        A = np.block([ [0, y_values.T], [y_values, sigma + self.gamma**-1 * np.eye(len(y_values))]])

        B = np.array([0]+[1]*len(y_values))
        
        A_cross = np.linalg.pinv(A)

        solution = dot(A_cross, B)
        b = solution[0]
        alpha = solution[1:]
        
        return (b, alpha)
    
    def fit(self, X, y):
        """Fits the model given the set of X attribute vectors and y labels.
        - X: ndarray of shape (n_samples, n_attributes)
        - y: ndarray of shape (n_samples,) or (n_samples, n)
            If the label is represented by an array of n elements, the y 
            parameter must have n columns.
        """
        y_reshaped = y.to_numpy().reshape(-1,1) if y.ndim==1 else y

        scaler = MinMaxScaler()
        scaler.fit(X)
        X_normalized = scaler.transform(X)

        self.sv_x = X_normalized
        self.sv_y = y_reshaped
        self.y_labels = np.unique(y_reshaped)#, axis=0)

        if len(self.y_labels) == 2: # binary classification
            # converting to -1/+1
            y_values = np.where((y_reshaped == self.y_labels[0]).all(axis=1) ,-1,+1)[:,np.newaxis] # making it a column vector
            
            self.b, self.alpha = self.optimize(X_normalized, y_values)
        
        # Multi-class support needs to be revised
        else: # multiclass classification, one-vs-all approach
            n_classes = len(self.y_labels)
            self.b = np.zeros(n_classes)
            self.alpha = np.zeros((n_classes, len(y_reshaped)))
            
            for i in range(n_classes):
                # converting to +1 for the desired class and -1 for all other classes
                y_values = np.where((y_reshaped == self.y_labels[i]).all(axis=1),+1,-1)[:,np.newaxis]
       
                self.b[i], self.alpha[i] = self.optimize(X_normalized, y_values)
         
    def predict(self, X):
        """Predicts the labels of data X given a trained model.
        - X: ndarray of shape (n_samples, n_attributes)
        """
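        if self.alpha is None: raise Exception("The model doesn't seem to be fitted, try running .fit() method first" )
        
        # Note: a new scaler is fitted on X here, so X is scaled by its own
        # min/max rather than by the range of the training data seen in fit().
        scaler = MinMaxScaler()
        scaler.fit(X)
        X_normalized = scaler.transform(X)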

        X_reshaped = X_normalized.reshape(1,-1) if X.ndim==1 else X_normalized
        KxX = self.K(self.sv_x, X_reshaped)
        
        if len(self.y_labels)==2: # binary classification
            y_values = np.where(
                (self.sv_y == self.y_labels[0]).all(axis=1),
                -1,+1)[:,np.newaxis]

            y = np.sign(dot(np.multiply(self.alpha, y_values.flatten()), KxX) + self.b)
            
            y_pred_labels = np.where(y==-1, self.y_labels[0], self.y_labels[1])
        
        else: # multiclass classification, one-vs-all approach
            y = np.zeros((len(self.y_labels), len(X_normalized)))
            for i in range(len(self.y_labels)):
                y_values = np.where(
                    (self.sv_y == self.y_labels[i]).all(axis=1),
                    +1, -1)[:,np.newaxis]
                y[i] = dot(np.multiply(self.alpha[i], y_values.flatten()), KxX) + self.b[i]
            
            predictions = np.argmax(y, axis=0)
            y_pred_labels = np.array([self.y_labels[i] for i in predictions])
            
        return y_pred_labels

    def dump(self, filepath='model', only_hyperparams=False):
        """This method saves the model in a JSON format.
        - filepath: string, default = 'model'
            File path to save the model's json.
        - only_hyperparams: boolean, default = False
            To either save only the model's hyperparameters or not, it 
            only affects trained/fitted models.
        """
        model_json = {
            'type': 'LSSVM',
            'hyperparameters': {
                'gamma': self.gamma,
                'kernel': self.kernel_,
                'kernel_params': self.kernel_params
            }           
        }

        if (self.alpha is not None) and (not only_hyperparams):
            model_json['parameters'] = {
                'alpha': self.alpha,
                'b': self.b,
                'sv_x': self.sv_x,
                'sv_y': self.sv_y,
                'y_labels': self.y_labels
            }
        
        dump_model(model_dict=model_json, file_encoder=numpy_json_encoder, filepath=filepath)
        
    @classmethod
    def load(cls, filepath, only_hyperparams=False):
        """This class method loads a model from a .json file.
        - filepath: string
            The model's .json file path.
        - only_hyperparams: boolean, default = False
            To either load only the model's hyperparameters or not, it 
            only has an effect when the model was dumped together with
            its parameters.
        """
        model_json = load_model(filepath=filepath)

        if model_json['type'] != 'LSSVM':
            raise Exception(f"Model type '{model_json['type']}' doesn't match 'LSSVM'" )

        lssvc = LSSVM(gamma = model_json['hyperparameters']['gamma'], kernel = model_json['hyperparameters']['kernel'], **model_json['hyperparameters']['kernel_params'] )

        if (model_json.get('parameters') is not None) and (not only_hyperparams):
            lssvc.alpha = np.array(model_json['parameters']['alpha'])
            lssvc.b = np.array(model_json['parameters']['b'])
            lssvc.sv_x = np.array(model_json['parameters']['sv_x'])
            lssvc.sv_y = np.array(model_json['parameters']['sv_y'])
            lssvc.y_labels = np.array(model_json['parameters']['y_labels'])

        return lssvc
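The core of the class is the linear system built in `optimize`: the first row and column hold the labels, and the remaining block is the label-weighted kernel matrix plus the identity divided by gamma. The sketch below rebuilds that system for six toy points with a linear kernel. It is a simplified illustration with assumed names and data, and it uses `np.linalg.solve` where the class uses the pseudo-inverse.

```python
import numpy as np

# Toy data: two well-separated groups, labels already encoded as -1/+1
X_toy = np.array([[-2., -2.], [-2., -1.], [-1., -2.],
                  [2., 2.], [2., 1.], [1., 2.]])
y_toy = np.array([-1., -1., -1., 1., 1., 1.])
gamma_toy = 1.0
n = len(y_toy)

K_lin = X_toy @ X_toy.T                    # linear kernel matrix
omega = np.outer(y_toy, y_toy) * K_lin     # omega_ij = y_i * y_j * K(x_i, x_j)

# Block system [[0, y^T], [y, omega + I/gamma]] [b, alpha]^T = [0, 1, ..., 1]^T
A = np.zeros((n + 1, n + 1))
A[0, 1:] = y_toy
A[1:, 0] = y_toy
A[1:, 1:] = omega + np.eye(n) / gamma_toy
rhs = np.concatenate(([0.0], np.ones(n)))

solution = np.linalg.solve(A, rhs)
b_toy, alpha_toy = solution[0], solution[1:]

# Decision rule, as in predict(): sign(sum_i alpha_i * y_i * K(x_i, x) + b)
pred_toy = np.sign((alpha_toy * y_toy) @ K_lin + b_toy)
print(pred_toy)
```

Because every training sample gets its own alpha, the solution is generally not sparse, which matches the remark in the class docstring that the number of support vectors equals the number of training samples.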


<a name="use_lssvm"><b>Use LS-SVM </b> </a>[Go to top](#top)

In [31]:

kernels=['linear','rbf','poly']
lssvm_results = pd.DataFrame(columns=['Kernel','Accuracy Score','Time(second)'])
  
for i in range(len(kernels)):
    start_ = time.time()

    lssvc = LSSVM(gamma=1, kernel=kernels[i], sigma=.5) # Class instantiation
    lssvc.fit(X_train, y_train) # Fitting the model
    y_pred = lssvc.predict(X_test) # Making predictions with the trained model
    acc = accuracy_score(y_test,y_pred)
    
    end_ = time.time()
    executed_time = end_ - start_
    lssvm_results.at[i,'Kernel'] =kernels[i]
    lssvm_results.at[i,'Accuracy Score'] = acc
    lssvm_results.at[i,'Time(second)'] = executed_time

print("Least Square SVM results")
lssvm_results.head()

Least Square SVM results


Unnamed: 0,Kernel,Accuracy Score,Time(second)
0,linear,0.876404,0.124478
1,rbf,0.898876,0.135238
2,poly,0.88764,0.100998
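The `predict` method of the class above fits a new `MinMaxScaler` on the data it receives. A common alternative, shown here only as a sketch with toy numbers and not as the method used in this notebook, is to fit the scaler once on the training data and reuse it for new data. The two approaches give different scaled values.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Toy one-feature data: training range is [0, 10]
train_demo = np.array([[0.0], [10.0]])
test_demo = np.array([[5.0], [20.0]])

# Option A: reuse the scaler fitted on the training data
scaler_demo = MinMaxScaler().fit(train_demo)
reused = scaler_demo.transform(test_demo)

# Option B: fit a new scaler on the test data itself (what predict() does above)
refit = MinMaxScaler().fit_transform(test_demo)

print("reused training scaler:", reused.ravel())
print("refitted on test data: ", refit.ravel())
```

With option A the value 20 maps outside the [0, 1] range because it lies beyond the training maximum, while option B always stretches the test data to [0, 1] regardless of how it relates to the training data.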
