# Qutrit Quantum Kernel with Classical Support Vector Machine (SVM) for Wine Dataset

In this notebook, I implement a qutrit-based quantum kernel and combine it with a classical SVM to classify the Wine dataset, a multi-class classification problem with samples from three different types of wine.

The Wine dataset comprises 178 samples, each with 13 features that represent various chemical properties of the wines. The task is to classify each wine into one of three classes based on these features. This notebook uses the first 12 features, which split evenly into three groups of four, one group per qutrit. The implementation involves data preprocessing, quantum feature mapping, classical SVM training, and evaluation.

In [None]:
import numpy as np
import torch
from scipy.linalg import expm
import pandas as pd
from sklearn.datasets import load_iris
import matplotlib.pyplot as plt
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
import random
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.decomposition import PCA
from sklearn.datasets import make_moons
from matplotlib.colors import ListedColormap
plt.style.use('seaborn-v0_8')

#### Qutrits

A qutrit has three basis states, denoted |0⟩, |1⟩, and |2⟩. We define them as column vectors.

In [None]:
# Define the qutrit states as column vectors
q0 = np.array([[1], [0], [0]])
q1 = np.array([[0], [1], [0]])
q2 = np.array([[0], [0], [1]])
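
As a quick sanity check, the sketch below verifies that the three basis vectors are orthonormal and complete. The names `basis`, `gram`, and `completeness` are introduced here for illustration only.

```python
import numpy as np

# Qutrit basis states as column vectors, as defined above
q0 = np.array([[1], [0], [0]])
q1 = np.array([[0], [1], [0]])
q2 = np.array([[0], [0], [1]])
basis = [q0, q1, q2]

# Matrix of inner products <i|j>; the identity matrix means the states are orthonormal
gram = np.array([[(a.conj().T @ b)[0, 0] for b in basis] for a in basis])
print(gram)

# Completeness relation: the projectors |i><i| sum to the 3x3 identity
completeness = sum(q @ q.conj().T for q in basis)
print(completeness)
```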

#### Gell-Mann Matrices

The generators of a qutrit are the operators that generate transformations on the qutrit states under some symmetry group. A common choice is the set of Gell-Mann matrices, 8 traceless Hermitian operators that, together with the identity, span the space of 3x3 complex matrices:

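$$
\begin{aligned}
gm_1 &= |0\rangle\langle 1| + |1\rangle\langle 0| \\
gm_2 &= -i\,(|0\rangle\langle 1| - |1\rangle\langle 0|) \\
gm_3 &= |0\rangle\langle 0| - |1\rangle\langle 1| \\
gm_4 &= |0\rangle\langle 2| + |2\rangle\langle 0| \\
gm_5 &= -i\,(|0\rangle\langle 2| - |2\rangle\langle 0|) \\
gm_6 &= |1\rangle\langle 2| + |2\rangle\langle 1| \\
gm_7 &= -i\,(|1\rangle\langle 2| - |2\rangle\langle 1|) \\
gm_8 &= \frac{1}{\sqrt{3}}\,(|0\rangle\langle 0| + |1\rangle\langle 1| - 2\,|2\rangle\langle 2|)
\end{aligned}
$$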

These generators satisfy the commutation relations of the su(3) Lie algebra, the Lie algebra of SU(3), which is the group of single-qutrit unitaries with unit determinant. Exponentiating real linear combinations of the Gell-Mann matrices yields any unitary transformation on the qutrit up to a global phase, which makes them a useful tool for building and analyzing qutrit operations.

In [None]:
# Define the Gell-Mann matrices
gm1 = np.kron(q0, q1.T) + np.kron(q1, q0.T)
gm2 = -1j * (np.kron(q0, q1.T) - np.kron(q1, q0.T))
gm3 = np.kron(q0, q0.T) - np.kron(q1, q1.T)
gm4 = np.kron(q0, q2.T) + np.kron(q2, q0.T)
gm5 = -1j * (np.kron(q0, q2.T) - np.kron(q2, q0.T))
gm6 = np.kron(q1, q2.T) + np.kron(q2, q1.T)
gm7 = -1j * (np.kron(q1, q2.T) - np.kron(q2, q1.T))
gm8 = 1/np.sqrt(3) * (np.kron(q0, q0.T) + np.kron(q1, q1.T) - 2*np.kron(q2, q2.T))

# Collect the Gell-Mann matrices in a list
generators = [gm1, gm2, gm3, gm4, gm5, gm6, gm7, gm8]

# Print the eighth Gell-Mann matrix
print(gm8)

[[ 0.57735027  0.          0.        ]
 [ 0.          0.57735027  0.        ]
 [ 0.          0.         -1.15470054]]
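
The algebraic properties mentioned above can be checked numerically. The sketch below rebuilds the eight matrices with a small helper, `ketbra`, which is introduced here for illustration and is not part of the notebook. It then tests Hermiticity, tracelessness, the normalization Tr(gm_a gm_b) = 2δ_ab, and one sample commutation relation.

```python
import numpy as np

def ketbra(i, j):
    """Return |i><j| as a 3x3 complex matrix (illustrative helper)."""
    m = np.zeros((3, 3), dtype=complex)
    m[i, j] = 1
    return m

# The eight Gell-Mann matrices, in the same order as gm1..gm8
gms = [
    ketbra(0, 1) + ketbra(1, 0),
    -1j * (ketbra(0, 1) - ketbra(1, 0)),
    ketbra(0, 0) - ketbra(1, 1),
    ketbra(0, 2) + ketbra(2, 0),
    -1j * (ketbra(0, 2) - ketbra(2, 0)),
    ketbra(1, 2) + ketbra(2, 1),
    -1j * (ketbra(1, 2) - ketbra(2, 1)),
    (ketbra(0, 0) + ketbra(1, 1) - 2 * ketbra(2, 2)) / np.sqrt(3),
]

# Each generator should be Hermitian and traceless
hermitian = all(np.allclose(g, g.conj().T) for g in gms)
traceless = all(np.isclose(np.trace(g), 0) for g in gms)

# Orthogonality and normalization: Tr(gm_a gm_b) = 2 * delta_ab
overlaps = np.array([[np.trace(a @ b) for b in gms] for a in gms])

# One su(3) commutation relation: [gm1, gm2] = 2i * gm3
commutator = gms[0] @ gms[1] - gms[1] @ gms[0]

print(hermitian, traceless)
print(np.allclose(overlaps, 2 * np.eye(8)))
print(np.allclose(commutator, 2j * gms[2]))
```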


#### Generalised Hadamard for Qutrits

Next, we define the generalised Hadamard operator for qutrits, which is the 3x3 discrete Fourier transform matrix.

In [None]:
# Define the Hadamard operator for qutrits
H = (1/np.sqrt(3)) * np.array([[1, 1, 1], [1, np.exp(2j*np.pi/3), np.exp(-2j*np.pi/3)], [1, np.exp(-2j*np.pi/3), np.exp(2j*np.pi/3)]])

# Print the Hadamard operator
print("Hadamard operator for qutrits:")
print(H)

Hadamard operator for qutrits:
[[ 0.57735027+0.j   0.57735027+0.j   0.57735027+0.j ]
 [ 0.57735027+0.j  -0.28867513+0.5j -0.28867513-0.5j]
 [ 0.57735027+0.j  -0.28867513-0.5j -0.28867513+0.5j]]
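
A short check, written as a sketch, confirms that this operator is unitary and that it maps |0⟩ to an equal superposition of the three basis states. The variable names `omega`, `is_unitary`, and `probabilities` are illustrative.

```python
import numpy as np

# Rebuild the qutrit Hadamard from the cube root of unity omega = exp(2*pi*i/3)
omega = np.exp(2j * np.pi / 3)
H = np.array([[1, 1, 1],
              [1, omega, omega**2],
              [1, omega**2, omega]]) / np.sqrt(3)

# A unitary matrix satisfies H^dagger H = I
is_unitary = np.allclose(H.conj().T @ H, np.eye(3))

# Applying H to |0> should give equal measurement probabilities of 1/3
q0 = np.array([[1], [0], [0]])
probabilities = np.abs((H @ q0).ravel()) ** 2

print(is_unitary)
print(probabilities)
```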


### Quantum Kernel

To construct the quantum kernel, I'll use custom functions that map the input data into the 27-dimensional Hilbert space of three qutrits. Each group of four features is encoded on one qutrit by exponentiating a linear combination of Gell-Mann matrices, and the kernel measures the overlap between the encoded states. Such a kernel can capture non-linear relationships between data points, which can help on datasets with intricate decision boundaries.

In [None]:
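# Encode four features on a single qutrit
def encoding(vector):
    """Apply exp(i * sum_k vector[k] * gm_{k+5}) to the state |0>."""
    # Accumulate the exponent from gm5..gm8 (generators[4:8]);
    # the accumulator is not called `sum`, which would shadow the builtin
    exponent = np.zeros((3, 3), dtype=complex)
    for i in range(4):
        exponent = exponent + (1j * vector[i] * generators[i+4])

    # The exponent is anti-Hermitian, so expm(exponent) is unitary
    return np.dot(expm(exponent), q0)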
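
Two properties of this encoding are worth checking: the encoded state stays normalized, and the map is periodic in the features, which is why the inputs are rescaled later. The sketch below is a NumPy-only illustration. It replaces `scipy.linalg.expm` with an eigendecomposition of the Hermitian exponent, and the helpers `ketbra` and `encode` are names introduced here, not the notebook's own functions.

```python
import numpy as np

def ketbra(i, j):
    """Return |i><j| as a 3x3 complex matrix (illustrative helper)."""
    m = np.zeros((3, 3), dtype=complex)
    m[i, j] = 1
    return m

# gm5..gm8, the four generators used by the encoding
GENS = [
    -1j * (ketbra(0, 2) - ketbra(2, 0)),
    ketbra(1, 2) + ketbra(2, 1),
    -1j * (ketbra(1, 2) - ketbra(2, 1)),
    (ketbra(0, 0) + ketbra(1, 1) - 2 * ketbra(2, 2)) / np.sqrt(3),
]

def encode(features):
    """Return exp(i * sum_k features[k] * GENS[k]) |0> as a length-3 vector."""
    hermitian = sum(f * g for f, g in zip(features, GENS))
    # For Hermitian A = V diag(w) V^dagger, exp(iA) = V diag(exp(i w)) V^dagger
    evals, evecs = np.linalg.eigh(hermitian)
    unitary = evecs @ np.diag(np.exp(1j * evals)) @ evecs.conj().T
    return unitary[:, 0]  # first column is the unitary applied to |0>

# Normalization: a unitary preserves the norm of |0>
norm = np.linalg.norm(encode([0.3, -1.2, 0.7, 2.0]))

# Periodicity: gm5 has eigenvalues -1, 0, 1, so its angle has period 2*pi
theta = 0.4
same_state = np.allclose(encode([theta, 0, 0, 0]),
                         encode([theta + 2 * np.pi, 0, 0, 0]))

print(norm, same_state)
```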

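Next, we define the kernel function, which quantifies the similarity between the quantum states that encode x1 and x2. Each sample is split into three groups of four features, each group is encoded on its own qutrit, and the three single-qutrit states are combined with a tensor product. The kernel value is the real part of the squared overlap between the two resulting three-qutrit states. It serves as the similarity measure for kernel methods such as the Support Vector Machine (SVM), where it lets a linear classifier in the encoded space capture non-linear relationships in the original features.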
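
Written out, and assuming the three-qutrit product encoding just described, the kernel reads as follows. The symbols $|\psi(x)\rangle$ and $k$ are notation introduced here, not names from the code.

$$
|\psi(x)\rangle = \bigotimes_{j=0}^{2} \exp\!\Big(i \sum_{m=0}^{3} x_{4j+m}\, gm_{m+5}\Big)\,|0\rangle,
\qquad
k(x, x') = \mathrm{Re}\big[\langle\psi(x)|\psi(x')\rangle^{2}\big]
$$

Since every encoded state is normalized, $k(x, x) = 1$. When the overlap is real, this value coincides with the fidelity $|\langle\psi(x)|\psi(x')\rangle|^{2}$.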

In [None]:
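def kernel(x1, x2):
    """The quantum kernel: real part of the squared overlap of the encoded states."""
    # Encode the 12 features of x1 on three qutrits, four features per qutrit
    enc1 = encoding(x1[:4])
    enc2 = encoding(x1[4:8])
    enc3 = encoding(x1[8:12])
    kron1 = np.kron(np.kron(enc1, enc2), enc3)
    # Encode x2 the same way; all three blocks must come from x2
    enc1 = encoding(x2[:4])
    enc2 = encoding(x2[4:8])
    enc3 = encoding(x2[8:12])
    kron2 = np.kron(np.kron(enc1, enc2), enc3)
    # <psi(x1)|psi(x2)> is a 1x1 array; square it and keep the real part
    return np.real(np.dot(kron1.conj().T, kron2)**2)[0][0]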

In [None]:
def kernel_matrix(A, B):
    """Compute the matrix whose entries are the kernel
       evaluated on pairwise data from sets A and B."""
    return np.array([[kernel(a, b) for b in B] for a in A])
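
An SVM expects the kernel matrix on the training set to be symmetric and positive semidefinite. The sketch below checks this on random 12-feature samples. It is a self-contained NumPy-only illustration: `ketbra`, `encode`, `state`, `demo_kernel`, and `demo_kernel_matrix` are stand-ins for the notebook's functions, and the matrix exponential is computed by eigendecomposition rather than `expm`.

```python
import numpy as np

def ketbra(i, j):
    """Return |i><j| as a 3x3 complex matrix (illustrative helper)."""
    m = np.zeros((3, 3), dtype=complex)
    m[i, j] = 1
    return m

# gm5..gm8, the four generators used by the encoding
GENS = [
    -1j * (ketbra(0, 2) - ketbra(2, 0)),
    ketbra(1, 2) + ketbra(2, 1),
    -1j * (ketbra(1, 2) - ketbra(2, 1)),
    (ketbra(0, 0) + ketbra(1, 1) - 2 * ketbra(2, 2)) / np.sqrt(3),
]

def encode(block):
    """exp(i * sum_k block[k] * GENS[k]) applied to |0>, via eigendecomposition."""
    evals, evecs = np.linalg.eigh(sum(f * g for f, g in zip(block, GENS)))
    unitary = evecs @ np.diag(np.exp(1j * evals)) @ evecs.conj().T
    return unitary[:, 0]

def state(x):
    """Three-qutrit product state (length 27) for a 12-feature sample."""
    return np.kron(np.kron(encode(x[:4]), encode(x[4:8])), encode(x[8:12]))

def demo_kernel(x1, x2):
    # np.vdot conjugates its first argument, giving <psi(x1)|psi(x2)>
    return np.real(np.vdot(state(x1), state(x2)) ** 2)

def demo_kernel_matrix(A, B):
    return np.array([[demo_kernel(a, b) for b in B] for a in A])

# Random stand-in data: 6 samples with 12 features each
rng = np.random.default_rng(0)
samples = rng.normal(size=(6, 12))
K = demo_kernel_matrix(samples, samples)

symmetric = np.allclose(K, K.T)
unit_diagonal = np.allclose(np.diag(K), 1.0)
min_eigenvalue = np.linalg.eigvalsh(K).min()  # should be >= 0 up to rounding

print(symmetric, unit_diagonal, min_eigenvalue >= -1e-9)
```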

### Wine dataset

In the dataset ingestion step, we load the Wine dataset with pandas. The first column of the file holds the class label (1, 2, or 3), and the remaining columns hold the features.

In [None]:
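data = pd.read_csv('../Datasets/wine.data', delimiter=',', header=None)
# Columns 1..12 are the first 12 features; the last feature (column 13) is dropped,
# leaving 12 = 3 x 4 features for three qutrits
X = data.iloc[:, 1:13].to_numpy()
# Column 0 holds the class labels 1, 2, 3; shift them to -1, 0, 1
labels = data.iloc[:, 0].to_numpy()
y = labels-2
print(X[0], y[162])

# Print the shape of X to verify it matches the expected dimensions
print('Shape of X:', X.shape)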

[ 14.23   1.71   2.43  15.6  127.     2.8    3.06   0.28   2.29   5.64
   1.04   3.92] 1
Shape of X: (178, 12)


In [None]:
# Scaling the inputs is important since the embedding we use is periodic
scaler = StandardScaler().fit(X)
X_scaled = scaler.transform(X)
y_scaled = y

print(X_scaled[0])
print('Shape of X:', X_scaled.shape)
print('Shape of y:', y_scaled.shape)
print('x[0] feature example: ', X_scaled[0])
print('y[162]: ', y_scaled[162])

[ 1.51861254 -0.5622498   0.23205254 -1.16959318  1.91390522  0.80899739
  1.03481896 -0.65956311  1.22488398  0.25171685  0.36217728  1.84791957]
Shape of X: (178, 12)
Shape of y: (178,)
x[0] feature example:  [ 1.51861254 -0.5622498   0.23205254 -1.16959318  1.91390522  0.80899739
  1.03481896 -0.65956311  1.22488398  0.25171685  0.36217728  1.84791957]
y[162]:  1


To split the dataset into training and test sets with an 80-20 ratio and the same class proportions in both sets, I'll use the train_test_split function from the sklearn.model_selection module with the stratify parameter set to the target variable y.

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y_scaled, test_size=0.2, stratify=y_scaled, random_state=42)

# Print the number of samples in each split
print("Samples in training set:", len(y_train))
print("Samples in test set:", len(y_test))


Samples in training set: 142
Samples in test set: 36
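
To see the effect of stratification per class rather than in total, `np.unique` with `return_counts=True` can be used. The sketch below runs on synthetic stand-in labels that only mimic the Wine class sizes (59, 71, 48), so `X_demo` and `y_demo` are placeholders and not the notebook's data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in labels with the Wine class sizes, shifted to -1, 0, 1
y_demo = np.repeat([-1, 0, 1], [59, 71, 48])
X_demo = np.arange(len(y_demo)).reshape(-1, 1)  # placeholder features

_, _, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, stratify=y_demo, random_state=42
)

# np.unique with return_counts=True gives the number of samples per class
classes_tr, counts_tr = np.unique(y_tr, return_counts=True)
classes_te, counts_te = np.unique(y_te, return_counts=True)

print("Training set:", dict(zip(classes_tr.tolist(), counts_tr.tolist())))
print("Test set:", dict(zip(classes_te.tolist(), counts_te.tolist())))
```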


As a sanity check, the kernel of a data point with itself should equal 1, because the encoded states are normalized.

In [None]:
kernel(X_train[0],X_train[0])

1.0

### Classical SVM

For the classical SVM component, I'll rely on scikit-learn. The SVM looks for the hyperplane that separates data points of different classes while maximizing the margin between them. SVC accepts a callable kernel that returns the kernel matrix between two sets of samples, so kernel_matrix can be passed directly, and it handles the three-class problem by training one-vs-one binary classifiers.

In [None]:
svm = SVC(kernel=kernel_matrix, decision_function_shape='ovo').fit(X_train, y_train)

In [None]:
predictions = svm.predict(X_test)
accuracy_score(y_test, predictions)

0.8888888888888888
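
An alternative to passing a callable is to compute the kernel matrices once and hand them to `SVC(kernel='precomputed')`, which is convenient when the kernel is expensive and the same matrices are reused, for instance while tuning `C`. The sketch below illustrates the pattern with a cheap stand-in kernel and synthetic data. `toy_kernel_matrix` is a degree-2 polynomial kernel used only for illustration, not the quantum kernel.

```python
import numpy as np
from sklearn.svm import SVC

def toy_kernel_matrix(A, B):
    """Stand-in kernel matrix (degree-2 polynomial kernel), not the quantum kernel."""
    return (A @ B.T + 1.0) ** 2

# Synthetic three-class data: class means shifted along every feature
rng = np.random.default_rng(0)
y_tr = np.repeat([0, 1, 2], 20)
X_tr = rng.normal(size=(60, 4)) + y_tr[:, None]
y_te = np.repeat([0, 1, 2], 5)
X_te = rng.normal(size=(15, 4)) + y_te[:, None]

# Option 1: callable kernel, evaluated internally by SVC
svm_callable = SVC(kernel=toy_kernel_matrix).fit(X_tr, y_tr)

# Option 2: precomputed kernel matrices
gram_train = toy_kernel_matrix(X_tr, X_tr)  # shape (n_train, n_train)
gram_test = toy_kernel_matrix(X_te, X_tr)   # rows: test samples, columns: training samples
svm_precomputed = SVC(kernel='precomputed').fit(gram_train, y_tr)

pred_callable = svm_callable.predict(X_te)
pred_precomputed = svm_precomputed.predict(gram_test)
print(np.array_equal(pred_callable, pred_precomputed))
```

Note that the test matrix has one row per test sample and one column per training sample, in that order.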

### Evaluation Results

In [None]:
# Compute accuracy and macro-averaged F1, precision, and recall as percentages
y_pred = predictions
y_true = y_test
accuracy = accuracy_score(y_true, y_pred)* 100
f1 = f1_score(y_true, y_pred, average='macro')* 100
precision = precision_score(y_true, y_pred, average='macro')* 100
recall = recall_score(y_true, y_pred, average='macro')* 100

# Print the results
print("Evaluation Results")
print("_____________________________________________")
print(
            f"\nRecall: {recall:.2f}%"
            f"\nPrecision: {precision:.2f}%"
            f"\nAccuracy: {accuracy:.2f}%"
            f"\nMacro Averaged F1-score: {f1:.2f}%"
            )
print("_____________________________________________")

Evaluation Results
_____________________________________________

Recall: 88.57%
Precision: 88.97%
Accuracy: 88.89%
Macro Averaged F1-score: 88.64%
_____________________________________________
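
Macro-averaged scores hide which classes are confused with each other. A confusion matrix makes this visible, as the following sketch shows. The label arrays `y_true_demo` and `y_pred_demo` are made-up toy values chosen for illustration, not the predictions obtained above.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, recall_score

# Toy labels in the notebook's -1, 0, 1 convention (illustrative values only)
y_true_demo = np.array([-1, -1, -1, 0, 0, 0, 1, 1, 1])
y_pred_demo = np.array([-1, -1, 0, 0, 0, 1, 1, 1, 1])

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true_demo, y_pred_demo, labels=[-1, 0, 1])
print(cm)

# Per-class recall is the diagonal divided by the row sums;
# its mean is the macro-averaged recall
per_class_recall = np.diag(cm) / cm.sum(axis=1)
macro_recall = per_class_recall.mean()
print(per_class_recall, macro_recall)
```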
