# Support Vector Classification with SKLearn

Authored by [_Mark (Zixuan) Song_](https://marksong.tech)
- - -

We use the `SKLearn` library to implement `SVC` in the following tutorial.

# Overview

The aim of this tutorial is to embed a quantum machine learning (QML) transformer into an SVC pipeline. It also serves as a general introduction to connecting `tensorcircuit` with `scikit-learn`.

## Setup

Install `scikit-learn` and download the dataset used in this tutorial, [`GCN`](https://archive.ics.uci.edu/dataset/144/statlog+german+credit+data), as `german.data-numeric`.

```bash
pip install scikit-learn
```

In [1]:
import tensorcircuit as tc
import tensorflow as tf
from sklearn.svm import SVC
from sklearn import metrics
from time import time

K = tc.set_backend("tensorflow")

## Data Preprocessing

Each sample has 24 integer-valued variables. Before the model can use the data, each sample is treated as a matrix of either 4x6 or 5x5 (this tutorial uses 5x5, padding the 24 values with one zero), and each variable is normalized to lie between 0 and 1 by dividing it by its maximum over the dataset.

In [2]:
def load_GCN_data():
    # Each row holds 24 integer features followed by a class label (1 or 2).
    rows = []
    with open("german.data-numeric") as f:
        for line in f:
            fields = line.split()  # collapses repeated spaces, drops the newline
            if fields:  # skip blank lines
                rows.append([int(v) for v in fields])
    X_temp = K.convert_to_tensor(rows)
    # Column-wise maxima, used to scale every feature into [0, 1].
    X_temp_transpose = K.transpose(X_temp)
    X_temp_max = []
    for i in range(len(X_temp_transpose)):
        X_temp_max.append(max(X_temp_transpose[i]))
    X_temp_max = K.convert_to_tensor(X_temp_max)
    # Pad the 24 features with one zero so each sample has 25 values (a flattened 5x5 matrix).
    final_digit = K.cast([0], 'int32')
    X = []
    Y = []
    for i in X_temp:
        Y.append(i[-1] - 1)  # map labels {1, 2} to {0, 1}
        X.append(K.divide(K.concat([i[:24], final_digit], 0), X_temp_max))
    Y = K.cast(K.convert_to_tensor(Y), 'float32')
    X = K.cast(K.convert_to_tensor(X), 'float32')
    # First 800 samples for training, the remaining samples for testing.
    return (X[:800], Y[:800]), (X[800:], Y[800:])

(x_train, y_train), (x_test, y_test) = load_GCN_data()
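
The padding and scaling steps can be illustrated on a toy input without TensorFlow. The sketch below is a simplified pure-Python stand-in, not the loader above: `parse_rows` and `pad_and_scale` are hypothetical helper names, and the three-feature rows are made up for illustration.

```python
def parse_rows(text):
    """Parse whitespace-separated integer rows; the last value of each row is the label."""
    return [[int(v) for v in line.split()] for line in text.splitlines() if line.strip()]


def pad_and_scale(rows, size):
    """Scale each feature by its column maximum and zero-pad every sample to `size` values."""
    features = [row[:-1] for row in rows]
    labels = [row[-1] - 1 for row in rows]  # map labels {1, 2} to {0, 1}
    col_max = [max(col) for col in zip(*features)]
    scaled = []
    for row in features:
        values = [v / m if m else 0.0 for v, m in zip(row, col_max)]
        values += [0.0] * (size - len(values))  # zero-pad up to the target length
        scaled.append(values)
    return scaled, labels


# Toy data (assumed): three features plus a label per row, padded to four values (a flattened 2x2 matrix).
toy = "  1  4  2  1\n  2  8  1  2\n"
X, Y = pad_and_scale(parse_rows(toy), size=4)
print(X)  # [[0.5, 0.5, 1.0, 0.0], [1.0, 1.0, 0.5, 0.0]]
print(Y)  # [0, 1]
```

In the tutorial's setting the same idea applies with 24 features padded to 25 values.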

## Quantum Model

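This quantum model takes a 5x5 matrix (flattened to 25 values) as input and outputs the state of 5 qubits. Each row of the matrix sets the rotation angles of one layer: even rows apply `rx` gates followed by a chain of `cnot` gates, and odd rows apply `rz` gates. The model is shown below: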

In [3]:
def quantumTran(inputs):
    c = tc.Circuit(5)
    for i in range(5):
        if i%2 == 0:
            for j in range(5):
                c.rx(j, theta=(0 if i*5+j >= 25 else inputs[i*5+j]))
            for j in range(4):
                c.cnot(j, j+1)
        else:
            for j in range(5):
                c.rz(j, theta=(0 if i*5+j >= 25 else inputs[i*5+j]))
    return c.state()

func_qt =  tc.interfaces.tensorflow_interface(quantumTran, ydtype=tf.complex64, jit=True)
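
To make the circuit's action concrete, the following is a minimal NumPy sketch of a state-vector simulation with the same gate layout. It is not how `tensorcircuit` computes the state. The gate conventions RX(θ) = exp(-iθX/2) and RZ(θ) = exp(-iθZ/2) and the choice of qubit 0 as the most significant index are assumptions made for this sketch, and `quantum_tran_numpy`, `apply_1q`, and `apply_cnot` are hypothetical names.

```python
import numpy as np

N_QUBITS = 5


def rx(theta):
    """RX(theta) = exp(-i * theta / 2 * X) (assumed convention)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -1j * s], [-1j * s, c]], dtype=complex)


def rz(theta):
    """RZ(theta) = exp(-i * theta / 2 * Z) (assumed convention)."""
    return np.array([[np.exp(-1j * theta / 2), 0], [0, np.exp(1j * theta / 2)]], dtype=complex)


def apply_1q(state, gate, q):
    """Apply a 2x2 gate to qubit q of a state stored with shape (2,) * N_QUBITS."""
    state = np.tensordot(gate, state, axes=([1], [q]))  # contracted axis moves to the front
    return np.moveaxis(state, 0, q)


def apply_cnot(state, control, target):
    """Flip the target qubit on the part of the state where the control qubit is 1."""
    state = state.copy()
    idx = [slice(None)] * N_QUBITS
    idx[control] = 1
    # Fixing the control axis removes it, so a later target axis shifts down by one.
    t_axis = target - 1 if target > control else target
    state[tuple(idx)] = np.flip(state[tuple(idx)], axis=t_axis).copy()
    return state


def quantum_tran_numpy(inputs):
    """Return the 32 amplitudes produced by the layered RX/CNOT and RZ circuit."""
    state = np.zeros((2,) * N_QUBITS, dtype=complex)
    state[(0,) * N_QUBITS] = 1.0  # start in |00000>
    for i in range(5):
        gate = rx if i % 2 == 0 else rz  # even rows: RX, odd rows: RZ
        for j in range(N_QUBITS):
            state = apply_1q(state, gate(inputs[i * 5 + j]), j)
        if i % 2 == 0:
            for j in range(N_QUBITS - 1):
                state = apply_cnot(state, j, j + 1)
    return state.reshape(-1)


psi = quantum_tran_numpy(np.linspace(0.0, 1.0, 25))
print(psi.shape)  # (32,)
print(np.isclose(np.linalg.norm(psi), 1.0))  # True
```

The output has 2^5 = 32 complex amplitudes and unit norm, which is the shape of feature vector the kernel in the next section works with.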

## Wrapping Quantum Model into a SVC

Wrap the quantum model in a custom kernel function and pass it to `SVC`, so that the classifier can be trained on the quantum model's output states.

In [4]:
def quantum_kernel(quantumTran, data_x, data_y):
    def kernel(x,y):
        x = K.convert_to_tensor(x)
        y = K.convert_to_tensor(y)
        x_qt = None
        for i, x1 in enumerate(x):
            if i == 0:
                x_qt = K.convert_to_tensor([quantumTran(x1)])
            else:
                x_qt = K.concat([x_qt,[quantumTran(x1)]],0)
        y_qt = None
        for i, x1 in enumerate(y):
            if i == 0:
                y_qt = K.convert_to_tensor([quantumTran(x1)])
            else:
                y_qt = K.concat([y_qt,[quantumTran(x1)]],0)
        data_ret = K.cast(K.power(K.abs(x_qt @ K.transpose(y_qt)), 2), "float32")
        return data_ret
    clf = SVC(kernel=kernel)
    clf.fit(data_x, data_y)
    return clf
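
For comparison, the fidelity kernel commonly used with quantum feature maps is k(x, y) = |⟨ψ(x)|ψ(y)⟩|², which conjugates one of the two states, whereas the cell above multiplies by the plain transpose. The NumPy sketch below shows the conjugated form on random normalized vectors. `random_states` and `fidelity_kernel` are hypothetical names, and the random vectors only stand in for circuit outputs; they are not produced by `quantumTran`.

```python
import numpy as np


def random_states(n_samples, dim, seed=0):
    """Random normalized complex vectors, standing in for circuit output states."""
    rng = np.random.default_rng(seed)
    raw = rng.normal(size=(n_samples, dim)) + 1j * rng.normal(size=(n_samples, dim))
    return raw / np.linalg.norm(raw, axis=1, keepdims=True)


def fidelity_kernel(states_x, states_y):
    """Gram matrix with entries |<psi_x|psi_y>|^2, shape (len(states_x), len(states_y))."""
    overlaps = states_x.conj() @ states_y.T  # conjugate one side to form the inner product
    return np.abs(overlaps) ** 2


states = random_states(n_samples=4, dim=32)
gram = fidelity_kernel(states, states)
print(gram.shape)                       # (4, 4)
print(np.allclose(np.diag(gram), 1.0))  # True: each state has unit overlap with itself
print(np.allclose(gram, gram.T))        # True: the kernel is symmetric
```

A real-valued matrix of this shape is what `SVC` expects from a callable kernel, or as input when `kernel="precomputed"` is used.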

## Create Traditional SVC

In [5]:
def standard_kernel(data_x, data_y, method):
    methods = ['linear', 'poly', 'rbf', 'sigmoid']
    if method not in methods:
        raise ValueError("method must be one of %r." % methods)
    clf = SVC(kernel=method)
    clf.fit(data_x, data_y)
    return clf

## Test

Test the accuracy of the quantum model SVC with the test data and compare it with traditional SVC.

In [6]:
methods = ['linear', 'poly', 'rbf', 'sigmoid']

for method in methods:
    
    print()
    t = time()

    k = standard_kernel(data_x=x_train, data_y=y_train, method=method)
    y_pred = k.predict(x_test)
    print("Accuracy:(%s as kernel)" % method,metrics.accuracy_score(y_test, y_pred))

    print("time:",time()-t,'seconds')

print()
t = time()

k = quantum_kernel(quantumTran=func_qt, data_x=x_train, data_y=y_train)
y_pred = k.predict(x_test)
print("Accuracy:(qml as kernel)",metrics.accuracy_score(y_test, y_pred))

print("time:",time()-t,'seconds')


Accuracy:(linear as kernel) 0.79
time: 0.009650945663452148 seconds

Accuracy:(poly as kernel) 0.77
time: 0.010898828506469727 seconds

Accuracy:(rbf as kernel) 0.775
time: 0.012045145034790039 seconds

Accuracy:(sigmoid as kernel) 0.565
time: 0.01863694190979004 seconds

Accuracy:(qml as kernel) 0.635
time: 6.367200136184692 seconds


## Issue with `SKLearn`

Due to a limitation of `SKLearn`, its `SVC` is not fully compatible with quantum machine learning (QML) models.

This is because a QML model outputs complex numbers (the amplitudes of the quantum state), whereas `SKLearn` only accepts floats. The QML output must therefore be converted into floats before it can be used in `SVC`, which can lead to a loss of accuracy.
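
The effect of the conversion can be illustrated with plain Python complex numbers. This is only a sketch with made-up overlap values, not output from the model: taking the squared magnitude yields a real number but discards the phase, so different complex overlaps can map to the same kernel value.

```python
import cmath
import math

# Two made-up complex overlaps with the same magnitude (0.6) but different phases.
overlap_a = cmath.rect(0.6, 0.0)           # phase 0
overlap_b = cmath.rect(0.6, cmath.pi / 3)  # phase pi/3

# The squared magnitude is real-valued, which is what a float-only kernel can carry.
kernel_a = abs(overlap_a) ** 2
kernel_b = abs(overlap_b) ** 2

print(overlap_a == overlap_b)            # False: the complex values differ
print(math.isclose(kernel_a, kernel_b))  # True: the phase no longer matters
```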

## Conclusion

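Under the present limitation of `SKLearn`, the quantum SVC in this tutorial is slower than every traditional SVC (about 6.4 seconds versus roughly 0.01 to 0.02 seconds) and less accurate than the linear, poly, and rbf kernels (0.635 versus 0.77 to 0.79), although it is more accurate than the sigmoid kernel (0.565). However, if the limitation is removed, quantum SVC might be able to outperform traditional SVC in accuracy.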