# m-arcsinh: An Efficient and Reliable Function for SVM and MLP in scikit-learn
 

### Source link: https://arxiv.org/pdf/2009.07530

#### Author: 
Luca Parisi
 - Coventry, United Kingdom
 - PhD in Machine Learning for Clinical Decision Support Systems
 - MBA Candidate with Artificial Intelligence Specialism

arXiv:2009.07530v1 [cs.LG], 16 Sep 2020

### Abstract
This paper describes the ‘m-arcsinh’, a modified (‘m-’) version of the inverse hyperbolic sine function (‘arcsinh’). Kernel and activation functions enable Machine Learning (ML)-based algorithms, such as Support Vector Machine (SVM) and Multi-Layer Perceptron (MLP), to learn from data in a supervised manner. m-arcsinh, implemented in the open-source Python library ‘scikit-learn’, is hereby presented as an efficient and reliable kernel and activation function for SVM and MLP respectively. Improvements in reliability and speed to convergence in classification tasks on fifteen (N = 15) datasets available from scikit-learn and the University of California, Irvine (UCI) Machine Learning repository are discussed. Experimental results demonstrate the overall competitive classification performance of both SVM and MLP, achieved via the proposed function. This function is compared to gold-standard kernel and activation functions, demonstrating its overall competitive reliability regardless of the complexity of the classification tasks involved.

In [23]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split

from sklearn.metrics import accuracy_score
from sklearn.neural_network import MLPClassifier
from sklearn import svm

from sklearn.neural_network._base import ACTIVATIONS, DERIVATIVES
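
The private `ACTIVATIONS` and `DERIVATIVES` dictionaries imported above map activation names to functions inside scikit-learn's MLP code, which is presumably where a custom activation such as m-arcsinh would be plugged in. As a minimal sketch, the block below writes the function and its analytical derivative in NumPy. It assumes the definition from the linked paper is m-arcsinh(x) = (1/12) · arcsinh(x) · sqrt(|x|); that formula is not restated in this notebook, so check it against the paper. The names `m_arcsinh` and `m_arcsinh_derivative` are illustrative, and registering them in the private dictionaries is version dependent and not shown.

```python
import numpy as np


def m_arcsinh(x):
    """m-arcsinh as assumed here: (1/12) * arcsinh(x) * sqrt(|x|)."""
    x = np.asarray(x, dtype=float)
    return np.arcsinh(x) * np.sqrt(np.abs(x)) / 12.0


def m_arcsinh_derivative(x):
    """Derivative of the assumed form; the limit at x = 0 is 0."""
    x = np.asarray(x, dtype=float)
    root = np.sqrt(np.abs(x))
    # Product rule, first part: d/dx arcsinh(x) = 1 / sqrt(1 + x^2)
    term_arcsinh = root / np.sqrt(1.0 + x ** 2)
    # Second part: arcsinh(x) * sign(x) / (2 sqrt(|x|)) = |arcsinh(x)| / (2 sqrt(|x|))
    safe_root = np.where(root > 0.0, root, 1.0)  # avoids 0/0 at x = 0
    term_sqrt = np.where(
        root > 0.0, np.abs(np.arcsinh(x)) / (2.0 * safe_root), 0.0
    )
    return (term_arcsinh + term_sqrt) / 12.0


points = np.array([-2.0, -0.5, 0.0, 0.5, 3.0])
print(m_arcsinh(points))
print(m_arcsinh_derivative(points))
```

Under this assumed form the function is odd and passes through the origin, and the derivative stays finite everywhere once the `x = 0` case is handled explicitly.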

### 70% of the data was selected for training and the remaining 30% for testing for the ‘Handwritten Digits’ dataset

In [24]:
digits = pd.read_csv('../m_arcsinh/data/digits.csv')
digits

Unnamed: 0,pixel_0_0,pixel_0_1,pixel_0_2,pixel_0_3,pixel_0_4,pixel_0_5,pixel_0_6,pixel_0_7,pixel_1_0,pixel_1_1,...,pixel_6_7,pixel_7_0,pixel_7_1,pixel_7_2,pixel_7_3,pixel_7_4,pixel_7_5,pixel_7_6,pixel_7_7,target
0,0.0,0.0,5.0,13.0,9.0,1.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,6.0,13.0,10.0,0.0,0.0,0.0,0
1,0.0,0.0,0.0,12.0,13.0,5.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,11.0,16.0,10.0,0.0,0.0,1
2,0.0,0.0,0.0,4.0,15.0,12.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,3.0,11.0,16.0,9.0,0.0,2
3,0.0,0.0,7.0,15.0,13.0,1.0,0.0,0.0,0.0,8.0,...,0.0,0.0,0.0,7.0,13.0,13.0,9.0,0.0,0.0,3
4,0.0,0.0,0.0,1.0,11.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,2.0,16.0,4.0,0.0,0.0,4
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1792,0.0,0.0,4.0,10.0,13.0,6.0,0.0,0.0,0.0,1.0,...,0.0,0.0,0.0,2.0,14.0,15.0,9.0,0.0,0.0,9
1793,0.0,0.0,6.0,16.0,13.0,11.0,1.0,0.0,0.0,0.0,...,0.0,0.0,0.0,6.0,16.0,14.0,6.0,0.0,0.0,0
1794,0.0,0.0,1.0,11.0,15.0,1.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,2.0,9.0,13.0,6.0,0.0,0.0,8
1795,0.0,0.0,2.0,10.0,7.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,5.0,12.0,16.0,12.0,0.0,0.0,9


In [25]:
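# Separate features from labels. Note: the 'Unnamed: 0' column (the row index saved
# in the CSV) stays in X_digits as a feature; read with index_col=0 to exclude it.
X_digits = digits.drop(columns='target')
y_digits = digits['target']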

In [26]:
# Unshuffled split: the leading 70% of rows are used for training, the trailing 30% for testing
X_train_digits, X_test_digits, y_train_digits, y_test_digits = train_test_split(X_digits, y_digits, shuffle=False, test_size=0.3)
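
To make the effect of `shuffle=False` concrete, the sketch below applies the same call to a hypothetical 10-sample toy array (`X_toy` and `y_toy` are made up for illustration and are not part of the notebook's data). With shuffling disabled, `train_test_split` keeps the original row order, so the test set is simply the tail of the data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 10 ordered samples standing in for the digits rows
X_toy = np.arange(10).reshape(-1, 1)
y_toy = np.arange(10)

X_train, X_test, y_train, y_test = train_test_split(
    X_toy, y_toy, shuffle=False, test_size=0.3
)

print(y_train)  # the first 7 labels, in their original order
print(y_test)   # the last 3 labels, in their original order
```

Because no shuffling takes place, the class balance of the two parts depends entirely on how the rows are ordered in the file.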

In [27]:
# Baseline: train one MLP per built-in activation and report its test accuracy
for activation in ('identity', 'logistic', 'tanh', 'relu'):
    classifier_1 = MLPClassifier(
        activation=activation,
        random_state=1,
        max_iter=300
    )
    classifier_1.fit(X_train_digits, y_train_digits)

    y_pred = classifier_1.predict(X_test_digits)
    acc = accuracy_score(y_test_digits, y_pred)

    print(f"MLP ({activation}) -> Accuracy: {acc:.4f}")

MLP (identity) -> Accuracy: 0.9111
MLP (logistic) -> Accuracy: 0.9352
MLP (tanh) -> Accuracy: 0.9296
MLP (relu) -> Accuracy: 0.9167


In [28]:
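# Baseline: train one SVC per built-in kernel and report its test accuracy.
# gamma is ignored by the 'linear' kernel; random_state only matters when probability=True.
for kernel in ('linear', 'poly', 'rbf', 'sigmoid'):
    classifier_2 = svm.SVC(
        kernel=kernel,
        gamma=0.001,
        random_state=13,
        class_weight='balanced'
    )
    classifier_2.fit(X_train_digits, y_train_digits)

    y_pred = classifier_2.predict(X_test_digits)
    acc = accuracy_score(y_test_digits, y_pred)

    print(f"SVM ({kernel}) -> Accuracy: {acc:.4f}")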

SVM (linear) -> Accuracy: 0.9333
SVM (poly) -> Accuracy: 0.9481
SVM (rbf) -> Accuracy: 0.9685
SVM (sigmoid) -> Accuracy: 0.6759
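
The built-in kernels above serve as baselines. `svm.SVC` also accepts a callable as `kernel`, which is one way an m-arcsinh kernel could be tried alongside them. The sketch below is an assumption-laden illustration, not the paper's reference implementation: it assumes the kernel is the inner product of element-wise m-arcsinh-transformed inputs, with m-arcsinh(x) = (1/12) · arcsinh(x) · sqrt(|x|). The helper names are hypothetical, and `load_digits` is used as a stand-in for the notebook's CSV file so that the block is self-contained.

```python
import numpy as np
from sklearn import svm
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def m_arcsinh_feature(x):
    """Element-wise m-arcsinh, assumed form: (1/12) * arcsinh(x) * sqrt(|x|)."""
    return np.arcsinh(x) * np.sqrt(np.abs(x)) / 12.0


def m_arcsinh_kernel(X, Y):
    """Gram matrix of shape (n_samples_X, n_samples_Y) for the assumed kernel."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    return m_arcsinh_feature(X) @ m_arcsinh_feature(Y).T


# Stand-in data: scikit-learn's bundled digits instead of the notebook's CSV
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, shuffle=False, test_size=0.3
)

# A callable kernel receives two data matrices and must return their Gram matrix
classifier = svm.SVC(kernel=m_arcsinh_kernel, class_weight='balanced')
classifier.fit(X_train, y_train)

y_pred = classifier.predict(X_test)
acc = accuracy_score(y_test, y_pred)
print(f"SVM (m-arcsinh sketch) -> Accuracy: {acc:.4f}")
```

No accuracy is quoted here because the result depends on the assumed kernel form and on the stand-in data, so it is not directly comparable to the figures printed above.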


### 80% for training and 20% for testing for the ‘Statlog’, ‘Olivetti faces’, ‘Parkinson’s’, ‘Wi-Fi localization’, ‘Breast Cancer Coimbra’, ‘Haberman’ and ‘Heart Failure’ datasets

In [29]:
german_credit = pd.read_csv('../m_arcsinh/data/german_credit.csv')
german_credit

Unnamed: 0,feature_1,feature_2,feature_3,feature_4,feature_5,feature_6,feature_7,feature_8,feature_9,feature_10,...,feature_12,feature_13,feature_14,feature_15,feature_16,feature_17,feature_18,feature_19,feature_20,target
0,A11,6,A34,A43,1169,A65,A75,4,A93,A101,...,A121,67,A143,A152,2,A173,1,A192,A201,1
1,A12,48,A32,A43,5951,A61,A73,2,A92,A101,...,A121,22,A143,A152,1,A173,1,A191,A201,2
2,A14,12,A34,A46,2096,A61,A74,2,A93,A101,...,A121,49,A143,A152,1,A172,2,A191,A201,1
3,A11,42,A32,A42,7882,A61,A74,2,A93,A103,...,A122,45,A143,A153,1,A173,2,A191,A201,1
4,A11,24,A33,A40,4870,A61,A73,3,A93,A101,...,A124,53,A143,A153,2,A173,2,A191,A201,2
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
995,A14,12,A32,A42,1736,A61,A74,3,A92,A101,...,A121,31,A143,A152,1,A172,1,A191,A201,1
996,A11,30,A32,A41,3857,A61,A73,4,A91,A101,...,A122,40,A143,A152,1,A174,1,A192,A201,1
997,A14,12,A32,A43,804,A61,A75,4,A93,A101,...,A123,38,A143,A152,1,A173,1,A191,A201,1
998,A11,45,A32,A43,1845,A61,A73,4,A93,A101,...,A124,23,A143,A153,1,A173,1,A192,A201,2
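
Several columns in this table, such as `feature_1`, hold category codes stored as strings (`A11`, `A12`, ...), which `svm.SVC` and `MLPClassifier` cannot consume directly. One common option is one-hot encoding; the sketch below shows it on a hypothetical three-column excerpt copied from the first rows displayed above. This illustrates the general technique and is not necessarily the preprocessing used in the rest of the notebook.

```python
import pandas as pd

# Hypothetical excerpt: three columns copied from the first five rows shown above
sample = pd.DataFrame({
    'feature_1': ['A11', 'A12', 'A14', 'A11', 'A11'],
    'feature_2': [6, 48, 12, 42, 24],
    'target': [1, 2, 1, 1, 2],
})

X_sample = sample.drop(columns='target')
y_sample = sample['target']

# One-hot encode the categorical code column; numeric columns pass through unchanged
X_encoded = pd.get_dummies(X_sample, columns=['feature_1'])
print(X_encoded.columns.tolist())
```

Each distinct code becomes its own indicator column, so the encoded frame contains only numeric or boolean values that the estimators can accept.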
