## A Machine Learning approach to classifying $b$-matrices by cluster algebra: Support Vector Machines

The preceding notebook uses artificial neural networks in Keras to classify $b$-matrices by cluster algebra. Since we are classifying vectors in real space, here we try support vector machines (in scikit-learn) instead. First we experiment with a binary classification of $b$-matrices of type $A_5$ and $D_5$, and see that a support vector machine with a quartic polynomial kernel achieves perfect accuracy on our test data. We then apply the same model to the $A_6$ and $D_6$ case, where the accuracy drops to about $0.95$ but is still very good. Finally we introduce $b$-matrices of type $E_6$ to attempt a ternary classification of $A_6$, $D_6$ and $E_6$. The same model reaches an accuracy of about $0.81$, and a degree $6$ kernel improves this slightly to about $0.83$.

We again acknowledge the papers [1, 2], whose authors first applied machine learning to various problems in cluster algebras. A more detailed introduction is given in our Keras notebook.

<cite data-cite="bao">[1] Bao, Jiakang, et al. "Quiver mutations, Seiberg duality, and machine learning." Physical Review D 102.8 (2020): 086013.</cite> https://arxiv.org/abs/2006.10783
    
<cite data-cite="dechant">[2] Dechant, Pierre-Philippe, et al. "Cluster Algebras: Network Science and Machine Learning." arXiv preprint arXiv:2203.13847 (2022).</cite> https://arxiv.org/abs/2203.13847

## Contents:
* [Importing the $A_5$ and $D_5$ data](#1)
* [Support vector machines for $A_5$ and $D_5$](#2)
* [Support vector machines for $A_6$ and $D_6$](#3)
* [Ternary classification with $A_6$, $D_6$ and $E_6$](#4)

## Importing the $A_5$ and $D_5$ data <a class="anchor" id="1"></a>

We import all the data we will experiment with, with little explanation. More details about this process are given in the preceding Keras notebook.

In [1]:
import csv

import numpy as np
from sklearn import metrics, svm
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

In [24]:
# A5 data

with open('cluster_data_A5_depth_100.csv') as fp:
    reader = csv.reader(fp, delimiter=",", quotechar='"')
    data = [row for row in reader]

data=data[0] # All stored in first row, so just take that

cluster_type = data[0] # Cluster type stored as first entry

data = [np.array(np.matrix(data[i])).ravel() for i in range(1, len(data))] # The vectors have been converted to strings,
                                                                           # so need to undo this. Also discard cluster type
data = [np.append(i, np.array([0])) for i in data] # A5 encoded as 0
A5_data = data

A5_array = A5_data[0]
for i in range(1, len(A5_data)):
    A5_array = np.vstack([A5_array, A5_data[i]])

# D5 data
    
with open('cluster_data_D5_depth_100.csv') as fp:
    reader = csv.reader(fp, delimiter=",", quotechar='"')
    data = [row for row in reader]    
data=data[0]
cluster_type = data[0]
data = [np.array(np.matrix(data[i])).ravel() for i in range(1, len(data))]
data = [np.append(i, np.array([1])) for i in data] # D5 encoded as 1
D5_data = data

D5_array = D5_data[0]
for i in range(1, len(D5_data)):
    D5_array = np.vstack([D5_array, D5_data[i]])
    
# The features
X = np.vstack([A5_array[:,:-1], D5_array[:,:-1]])

y = np.vstack([A5_array[:,-1:], D5_array[:,-1:]])

y = y.ravel()
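
The row-by-row `np.vstack` loops above can be collapsed into a single call. The following is a minimal sketch under stated assumptions: the helper name `stack_with_label` and the toy vectors are hypothetical and are not part of the original data pipeline; only the stacking and labelling pattern is illustrated.

```python
import numpy as np


def stack_with_label(vectors, label):
    """Stack flattened b-matrices into a 2-D array and build a matching label vector."""
    features = np.vstack(vectors)                 # one call instead of one vstack per row
    labels = np.full(features.shape[0], label)    # the same integer class label for every row
    return features, labels


# Toy stand-ins for flattened 2x2 b-matrices (illustrative only, not real cluster data).
toy_A = [np.array([0, 1, -1, 0]), np.array([0, -1, 1, 0])]
toy_D = [np.array([0, 2, -1, 0])]

X_A, y_A = stack_with_label(toy_A, 0)   # first class encoded as 0
X_D, y_D = stack_with_label(toy_D, 1)   # second class encoded as 1

X_toy = np.vstack([X_A, X_D])
y_toy = np.concatenate([y_A, y_D])
print(X_toy.shape, y_toy)
```

Keeping features and labels in separate arrays from the start also avoids appending the label as an extra column and slicing it off again.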

## Support vector machines for $A_5$ and $D_5$ <a class="anchor" id="2"></a>

Here we try applying scikit-learn's support vector machines to the $A_5$ and $D_5$ data. We see that in this case C-support vector classification with a suitable kernel reaches an accuracy of $1$ on the test data, which is very encouraging.

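First we apply a scaler to the data. Note that `fit_transform` returns the scaled array and its result is not assigned back to `X` here, so the models below are trained on the unscaled data.

In [26]:
scaler = StandardScaler()
scaler.fit_transform(X)  # returns a scaled copy; X itself is left unchanged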

# Train / test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=11)
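
`StandardScaler.fit_transform` returns the scaled array instead of modifying its input, so the result has to be kept for scaling to take effect. A minimal sketch of one common pattern follows, fitting the scaler on the training split only and reusing it on the test split. The random integer data and all names ending in `_demo` or `_scaled` are assumptions for illustration; the accuracies reported in this notebook were not produced this way.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for flattened 5x5 integer matrices (not real cluster data).
rng = np.random.default_rng(0)
X_demo = rng.integers(-2, 3, size=(200, 25)).astype(float)
y_demo = rng.integers(0, 2, size=200)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.33, random_state=11
)

scaler_demo = StandardScaler()
X_tr_scaled = scaler_demo.fit_transform(X_tr)  # learn mean/std on the training split
X_te_scaled = scaler_demo.transform(X_te)      # reuse the same statistics on the test split

print(X_tr_scaled.mean(axis=0).round(6)[:3], X_tr_scaled.std(axis=0).round(6)[:3])
```

Fitting on the training split only keeps test-set statistics out of the preprocessing step.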

First of all we try the linear SVC. With the default settings the accuracy is poor.

In [17]:
machine = svm.LinearSVC()
machine.fit(X_train, y_train)
y_predict = machine.predict(X_test)
print(f'Default accuracy is {metrics.accuracy_score(y_test, y_predict)}')

Default accuracy is 0.3260445983979216


We can set the parameter `C`, which is inversely proportional to the regularization strength. We also need to raise the `max_iter` parameter for this to work, but the accuracy does not improve.

In [18]:
machine = svm.LinearSVC(C=10, max_iter=10000)
machine.fit(X_train, y_train)
y_predict = machine.predict(X_test)
print(f'Accuracy is {metrics.accuracy_score(y_test, y_predict)}')

Accuracy is 0.3260445983979216


Next we try the C-support vector classification. Again, by default the accuracy is poor.

In [6]:
machine = svm.SVC()
machine.fit(X_train, y_train)
y_predict = machine.predict(X_test)
print(f'Default accuracy is {metrics.accuracy_score(y_test, y_predict)}')

Default accuracy is 0.5730909090909091


We have control over a parameter $C$, which is inversely proportional to the regularization strength. Raising it to $30$ improves the accuracy from about $0.57$ to about $0.80$.
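
For reference, the role of $C$ can be read off the standard soft-margin formulation of C-SVC, as given for example in the scikit-learn user guide. The symbols $w$, $b$, $\xi_i$ and the feature map $\phi$ are not notation from this notebook, and the labels are taken as $y_i \in \{-1, +1\}$:

$$\min_{w,\,b,\,\xi}\ \frac{1}{2}\,\|w\|^2 + C \sum_{i=1}^{n} \xi_i \quad \text{subject to} \quad y_i\,\bigl(w^\top \phi(x_i) + b\bigr) \ \ge\ 1 - \xi_i, \qquad \xi_i \ge 0 .$$

A large $C$ penalizes the slack variables $\xi_i$ heavily, so the margin term $\frac{1}{2}\|w\|^2$ (the regularizer) carries relatively less weight; a small $C$ does the opposite.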

In [7]:
machine = svm.SVC(C=30)
machine.fit(X_train, y_train)
y_predict = machine.predict(X_test)
print(f'With C as 30 accuracy is {metrics.accuracy_score(y_test, y_predict)}')

With C as 30 accuracy is 0.7992727272727272


We also have a kernel parameter, which is 'rbf' by default. We can try the other options. In particular we have a polynomial kernel, for which we need to select a degree. Degrees $4$ and $6$ do very well, degree $4$ in particular, whereas degree $2$ and the odd degrees do poorly.

In [8]:
# Linear kernel, not very good.

machine = svm.SVC(C=30, kernel='linear')
machine.fit(X_train, y_train)
y_predict = machine.predict(X_test)
print(f'With linear kernel accuracy is {metrics.accuracy_score(y_test, y_predict)}')

# Polynomial kernel, for which we also need to set a degree. Degrees 4 and 6 are very good, odd degrees are very bad.
# We see the best results for the quartic kernel.

for i in range(8):
    machine = svm.SVC(C=30, kernel='poly', degree=i)
    machine.fit(X_train, y_train)
    y_predict = machine.predict(X_test)
    print(f'With polynomial kernel and degree {i} accuracy is {metrics.accuracy_score(y_test, y_predict)}')
    
# Sigmoid kernel, accuracy is not good.    

machine = svm.SVC(C=30, kernel='sigmoid')
machine.fit(X_train, y_train)
y_predict = machine.predict(X_test)
print(f'With sigmoid kernel accuracy is {metrics.accuracy_score(y_test, y_predict)}')

With linear kernel accuracy is 0.5301818181818182
With polynomial kernel and degree 0 accuracy is 0.5301818181818182
With polynomial kernel and degree 1 accuracy is 0.5301818181818182
With polynomial kernel and degree 2 accuracy is 0.5621818181818182
With polynomial kernel and degree 3 accuracy is 0.37454545454545457
With polynomial kernel and degree 4 accuracy is 1.0
With polynomial kernel and degree 5 accuracy is 0.1781818181818182
With polynomial kernel and degree 6 accuracy is 0.9127272727272727
With polynomial kernel and degree 7 accuracy is 0.4109090909090909
With sigmoid kernel accuracy is 0.5381818181818182
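
A hedged observation on the polynomial kernel used above: scikit-learn's polynomial kernel is $(\gamma \langle x, x' \rangle + r)^d$, and `SVC` uses `coef0` $r = 0$ by default, so the kernel is homogeneous of degree $d$ and satisfies $K(-x, x') = (-1)^d K(x, x')$. The sketch below only checks this identity numerically on random vectors (synthetic, not cluster data), using `sklearn.metrics.pairwise.polynomial_kernel` with `coef0=0.0` passed explicitly. Whether this accounts for the even/odd pattern in the accuracies depends on whether the data contain both $B$ and $-B$ within a class, which is not checked here.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel

# Two random integer vectors standing in for flattened 5x5 matrices (synthetic).
rng = np.random.default_rng(0)
x = rng.integers(-2, 3, size=(1, 25)).astype(float)
z = rng.integers(-2, 3, size=(1, 25)).astype(float)

sign_identity_holds = {}
for degree in range(1, 7):
    # coef0=0.0 matches the SVC default; gamma=1.0 is an arbitrary positive value.
    k_pos = polynomial_kernel(x, z, degree=degree, gamma=1.0, coef0=0.0)[0, 0]
    k_neg = polynomial_kernel(-x, z, degree=degree, gamma=1.0, coef0=0.0)[0, 0]
    sign_identity_holds[degree] = bool(np.isclose(k_neg, (-1) ** degree * k_pos))

print(sign_identity_holds)
```

With an even degree the kernel cannot distinguish $x$ from $-x$, while with an odd degree negating the input flips the sign of every kernel value.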


Finally we have $\nu$-support vector classification, which is a reparametrization of C-SVC in which a parameter $\nu$ takes the place of $C$. The default accuracy here is quite good.

In [9]:
machine = svm.NuSVC()
machine.fit(X_train, y_train)
y_predict = machine.predict(X_test)
print(f'Default accuracy is {metrics.accuracy_score(y_test, y_predict)}')

Default accuracy is 0.8072727272727273


The name comes from the parameter $\nu \in (0,1]$. For $\nu$ close to $0.1$ we obtain accuracies of about $0.88$.

In [10]:
for nu in np.linspace(0.06, 0.15, 5):
    machine = svm.NuSVC(nu=nu)
    machine.fit(X_train, y_train)
    y_predict = machine.predict(X_test)
    print(f'With nu as {nu}, accuracy is {metrics.accuracy_score(y_test, y_predict)}')

With nu as 0.06, accuracy is 0.8807272727272727
With nu as 0.08249999999999999, accuracy is 0.8770909090909091
With nu as 0.105, accuracy is 0.8792727272727273
With nu as 0.1275, accuracy is 0.8785454545454545
With nu as 0.15, accuracy is 0.88
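
In the $\nu$-SVC formulation, $\nu$ is an upper bound on the fraction of margin errors and a lower bound on the fraction of support vectors. A minimal sketch of the second property follows, on synthetic data from `make_classification`; the names `X_demo`, `y_demo` and `fractions` are assumptions for illustration and nothing here uses the cluster data.

```python
from sklearn.datasets import make_classification
from sklearn.svm import NuSVC

# Synthetic, roughly balanced binary problem (not cluster data).
X_demo, y_demo = make_classification(n_samples=200, n_features=10, random_state=0)

fractions = {}
for nu in (0.1, 0.3, 0.5):
    model = NuSVC(nu=nu).fit(X_demo, y_demo)
    # support_ holds the indices of the support vectors in the training set.
    fractions[nu] = len(model.support_) / len(X_demo)
    print(f"nu={nu}: fraction of support vectors = {fractions[nu]:.2f}")
```

Small values of $\nu$ therefore allow sparser models with few margin errors, which is comparable in spirit to choosing a large $C$ in C-SVC.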


As before, we have a kernel parameter, which is 'rbf' by default. Again we see the best results with a quartic polynomial kernel.

In [27]:
# Linear

machine = svm.NuSVC(nu=0.1, kernel='linear')
machine.fit(X_train, y_train)
y_predict = machine.predict(X_test)
print(f'With linear kernel accuracy is {metrics.accuracy_score(y_test, y_predict)}')

# Polynomial

for i in range(2, 7):
    machine = svm.NuSVC(nu=0.1, kernel='poly', degree=i)
    machine.fit(X_train, y_train)
    y_predict = machine.predict(X_test)
    print(f'With polynomial kernel and degree {i} accuracy is {metrics.accuracy_score(y_test, y_predict)}')    
    
# Sigmoid

machine = svm.NuSVC(nu=0.1, kernel='sigmoid')
machine.fit(X_train, y_train)
y_predict = machine.predict(X_test)
print(f'With sigmoid kernel accuracy is {metrics.accuracy_score(y_test, y_predict)}')

With linear kernel accuracy is 0.488
With polynomial kernel and degree 2 accuracy is 0.45381818181818184
With polynomial kernel and degree 3 accuracy is 0.5243636363636364
With polynomial kernel and degree 4 accuracy is 1.0
With polynomial kernel and degree 5 accuracy is 0.632
With polynomial kernel and degree 6 accuracy is 0.9141818181818182
With sigmoid kernel accuracy is 0.4749090909090909
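
The values of `C` and `degree` above were compared directly on the test split. An alternative, sketched here under stated assumptions, is to choose them by cross-validation on the training split with `GridSearchCV` and to score the test split only once at the end. The data come from `make_classification` and the grid values are arbitrary; this is not how the results in this notebook were obtained.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Synthetic binary problem standing in for the b-matrix data.
X_demo, y_demo = make_classification(n_samples=300, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.33, random_state=11
)

# Small illustrative grid over the two parameters varied in this section.
param_grid = {"C": [1, 30], "degree": [2, 4, 6]}
search = GridSearchCV(SVC(kernel="poly"), param_grid, cv=3)
search.fit(X_tr, y_tr)  # cross-validation uses the training split only

test_accuracy = search.score(X_te, y_te)  # refit best model, scored once on held-out data
print(search.best_params_, test_accuracy)
```

This keeps the reported test accuracy from being biased by the parameter search itself.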


## Support vector machines for $A_6$ and $D_6$ <a class="anchor" id="3"></a>

Now we look at $b$-matrices for $A_6$ and $D_6$, to see how our machine performs.

In [28]:
# A6

with open('cluster_data_A6_depth_6.csv') as fp:
    reader = csv.reader(fp, delimiter=",", quotechar='"')
    data = [row for row in reader]
data=data[0]
cluster_type = data[0]
data = [np.array(np.matrix(data[i])).ravel() for i in range(1, len(data))]
data = [np.append(i, np.array([0])) for i in data]
A6_data = data
A6_array = A6_data[0]
for i in range(1, len(A6_data)):
    A6_array = np.vstack([A6_array, A6_data[i]])
    
# D6

with open('cluster_data_D6_depth_6.csv') as fp:
    reader = csv.reader(fp, delimiter=",", quotechar='"')
    data = [row for row in reader]
data=data[0]
cluster_type = data[0]
data = [np.array(np.matrix(data[i])).ravel() for i in range(1, len(data))]
data = [np.append(i, np.array([1])) for i in data]
D6_data = data
D6_array = D6_data[0]
for i in range(1, len(D6_data)):
    D6_array = np.vstack([D6_array, D6_data[i]])
    
# Features, targets

X = np.vstack([A6_array[:,:-1], D6_array[:,:-1]])
y = np.vstack([A6_array[:,-1:], D6_array[:,-1:]])
y = y.ravel()
scaler = StandardScaler()
scaler.fit_transform(X)  # returns a scaled copy; X itself is left unchanged
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=11)

Our best machine from the previous case was SVC with a quartic kernel, which here achieves an accuracy of about $0.95$.

In [13]:
machine = svm.SVC(C=30, kernel='poly', degree=4)
machine.fit(X_train, y_train)
y_predict = machine.predict(X_test)
print(f'With polynomial kernel of degree 4 accuracy is {metrics.accuracy_score(y_test, y_predict)}')

With polynomial kernel of degree 4 accuracy is 0.9525897061211236


Since we have more data, we check how a higher even-degree polynomial kernel performs. Degree $6$ reaches about $0.94$, which is good but still worse than degree $4$.

In [29]:
machine = svm.SVC(C=30, kernel='poly', degree=6)
machine.fit(X_train, y_train)
y_predict = machine.predict(X_test)
print(f'With polynomial kernel of degree 6 accuracy is {metrics.accuracy_score(y_test, y_predict)}')

With polynomial kernel of degree 6 accuracy is 0.9363533041078097


## Ternary classification with $A_6$, $D_6$ and $E_6$ <a class="anchor" id="4"></a>

Now we introduce $b$-matrices of $E_6$ type to see how our model can handle a ternary classification.

In [30]:
# A6

with open('cluster_data_A6_depth_6.csv') as fp:
    reader = csv.reader(fp, delimiter=",", quotechar='"')
    data = [row for row in reader]
data=data[0]
cluster_type = data[0]
data = [np.array(np.matrix(data[i])).ravel() for i in range(1, len(data))]
data = [np.append(i, np.array([0])) for i in data] # A6 encoded as 0; scikit-learn takes integer class labels directly
A6_data = data
A6_array = A6_data[0]
for i in range(1, len(A6_data)):
    A6_array = np.vstack([A6_array, A6_data[i]])
    
# D6
    
with open('cluster_data_D6_depth_6.csv') as fp:
    reader = csv.reader(fp, delimiter=",", quotechar='"')
    data = [row for row in reader]
data=data[0]
cluster_type = data[0]
data = [np.array(np.matrix(data[i])).ravel() for i in range(1, len(data))]
data = [np.append(i, np.array([1])) for i in data]
D6_data = data
D6_array = D6_data[0]
for i in range(1, len(D6_data)):
    D6_array = np.vstack([D6_array, D6_data[i]])

# E6     
    
with open('cluster_data_E6_depth_6.csv') as fp:
    reader = csv.reader(fp, delimiter=",", quotechar='"')
    data = [row for row in reader]
data=data[0]
cluster_type = data[0]
data = [np.array(np.matrix(data[i])).ravel() for i in range(1, len(data))]
data = [np.append(i, np.array([2])) for i in data]
E6_data = data
E6_array = E6_data[0]
for i in range(1, len(E6_data)):
    E6_array = np.vstack([E6_array, E6_data[i]])
    
# Features, targets

X = np.vstack([A6_array[:,:-1], D6_array[:,:-1]])
X = np.vstack([X, E6_array[:,:-1]])
y = np.vstack([A6_array[:,-1:], D6_array[:,-1:]])
y = np.vstack([y, E6_array[:,-1:]])
y = y.ravel()
scaler = StandardScaler()
scaler.fit_transform(X)  # returns a scaled copy; X itself is left unchanged
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=11)

We first try our best machine so far. Its accuracy is about $0.81$, which is good for a ternary classification.

In [15]:
machine = svm.SVC(C=30, kernel='poly', degree=4)
machine.fit(X_train, y_train)
y_predict = machine.predict(X_test)
print(f'With polynomial kernel of degree 4 accuracy is {metrics.accuracy_score(y_test, y_predict)}')

With polynomial kernel of degree 4 accuracy is 0.8063433643645811


We again check to see how degree $6$ performs, and see that in this case it is slightly better. 

In [128]:
machine = svm.SVC(C=30, kernel='poly', degree=6)
machine.fit(X_train, y_train)
y_predict = machine.predict(X_test)
print(f'With polynomial kernel of degree 6 accuracy is {metrics.accuracy_score(y_test, y_predict)}')

With polynomial kernel of degree 6 accuracy is 0.8256116042433427


Finally we check whether degree $8$ does any better, but at about $0.81$ it is worse than degree $6$.

In [132]:
machine = svm.SVC(C=30, kernel='poly', degree=8)
machine.fit(X_train, y_train)
y_predict = machine.predict(X_test)
print(f'With polynomial kernel of degree 8 accuracy is {metrics.accuracy_score(y_test, y_predict)}')

With polynomial kernel of degree 8 accuracy is 0.8060186187486469
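
A single accuracy figure does not show which of the three classes are confused with one another. A minimal sketch of how a confusion matrix could be inspected follows, on synthetic three-class data from `make_classification`; the labels 0, 1, 2 mirror the encoding used above, but the data and the resulting numbers are purely illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic three-class problem standing in for the A6 / D6 / E6 data.
X_demo, y_demo = make_classification(
    n_samples=300, n_features=10, n_informative=5, n_classes=3, random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.33, random_state=11
)

model = SVC(C=30, kernel="poly", degree=4).fit(X_tr, y_tr)
y_hat = model.predict(X_te)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_te, y_hat, labels=[0, 1, 2])
acc = accuracy_score(y_te, y_hat)
print(cm)
print(acc)
```

The diagonal counts the correctly classified samples per class, so the overall accuracy equals the trace of the matrix divided by the number of test samples.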
