# Qutrit Quantum Kernel with Classical Support Vector Machine (SVM) for Seed Dataset

In this notebook, I implement a qutrit-based quantum kernel and combine it with a classical SVM to classify the Seed dataset, a multi-class problem whose samples come from three different varieties of wheat seeds. The workflow covers data preprocessing, quantum feature mapping, classical SVM training, and evaluation.

In [35]:
import numpy as np
import torch
from scipy.linalg import expm
import pandas as pd
from sklearn.datasets import load_iris
import matplotlib.pyplot as plt
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
import random
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.decomposition import PCA
from sklearn.datasets import make_moons
from matplotlib.colors import ListedColormap
plt.style.use('seaborn-v0_8')

#### Qutrits

A qutrit is a three-level quantum system with basis states |0⟩, |1⟩, and |2⟩, defined here as column vectors.

In [36]:
# Define the qutrit states as column vectors
q0 = np.array([[1], [0], [0]])
q1 = np.array([[0], [1], [0]])
q2 = np.array([[0], [0], [1]])
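
As a quick sanity check, the three basis vectors should be orthonormal and their projectors should sum to the identity. The sketch below redefines the states so that it runs on its own; `basis`, `gram`, and `completeness` are illustrative names that are not used elsewhere in the notebook.

```python
import numpy as np

# Qutrit basis states as column vectors (same definitions as in the cell above)
q0 = np.array([[1], [0], [0]])
q1 = np.array([[0], [1], [0]])
q2 = np.array([[0], [0], [1]])

basis = [q0, q1, q2]

# Gram matrix of inner products <i|j>; orthonormality means it equals the identity
gram = np.array([[(a.conj().T @ b).item() for b in basis] for a in basis])

# Completeness: the projectors |i><i| sum to the 3x3 identity
completeness = sum(q @ q.conj().T for q in basis)

print(np.array_equal(gram, np.eye(3)))
print(np.array_equal(completeness, np.eye(3)))
```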

#### Gell-Mann Matrices

The generators of a qutrit are the operators that generate transformations on the qutrit states under a symmetry group. A common choice is the set of Gell-Mann matrices: 8 traceless Hermitian operators that, together with the identity, span the space of 3x3 complex matrices:

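$$
\begin{aligned}
\mathrm{gm}_1 &= |0\rangle\langle 1| + |1\rangle\langle 0| \\
\mathrm{gm}_2 &= -i\left(|0\rangle\langle 1| - |1\rangle\langle 0|\right) \\
\mathrm{gm}_3 &= |0\rangle\langle 0| - |1\rangle\langle 1| \\
\mathrm{gm}_4 &= |0\rangle\langle 2| + |2\rangle\langle 0| \\
\mathrm{gm}_5 &= -i\left(|0\rangle\langle 2| - |2\rangle\langle 0|\right) \\
\mathrm{gm}_6 &= |1\rangle\langle 2| + |2\rangle\langle 1| \\
\mathrm{gm}_7 &= -i\left(|1\rangle\langle 2| - |2\rangle\langle 1|\right) \\
\mathrm{gm}_8 &= \tfrac{1}{\sqrt{3}}\left(|0\rangle\langle 0| + |1\rangle\langle 1| - 2\,|2\rangle\langle 2|\right)
\end{aligned}
$$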

These generators satisfy the commutation relations of the su(3) Lie algebra, the algebra of the group SU(3) of special unitary transformations on a qutrit. Exponentiating real linear combinations of the Gell-Mann matrices yields any SU(3) transformation, that is, any qutrit unitary up to a global phase, which makes them a natural basis for parameterizing qutrit gates.

In [37]:
# Define the Gell-Mann matrices
gm1 = np.kron(q0, q1.T) + np.kron(q1, q0.T)
gm2 = -1j * (np.kron(q0, q1.T) - np.kron(q1, q0.T))
gm3 = np.kron(q0, q0.T) - np.kron(q1, q1.T)
gm4 = np.kron(q0, q2.T) + np.kron(q2, q0.T)
gm5 = -1j * (np.kron(q0, q2.T) - np.kron(q2, q0.T))
gm6 = np.kron(q1, q2.T) + np.kron(q2, q1.T)
gm7 = -1j * (np.kron(q1, q2.T) - np.kron(q2, q1.T))
gm8 = 1/np.sqrt(3) * (np.kron(q0, q0.T) + np.kron(q1, q1.T) - 2*np.kron(q2, q2.T))

# Collect the Gell-Mann matrices in a list
generators = [gm1, gm2, gm3, gm4, gm5, gm6, gm7, gm8]

# Print Gell-Mann matrix 8
print(gm8)

[[ 0.57735027  0.          0.        ]
 [ 0.          0.57735027  0.        ]
 [ 0.          0.         -1.15470054]]
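
The algebraic properties of the Gell-Mann matrices can be checked numerically. The following sketch rebuilds the eight matrices with a small hypothetical helper, `ketbra`, and tests that each one is Hermitian and traceless and that they satisfy the standard normalization Tr(gm_a gm_b) = 2δ_ab. It is a self-contained check, not part of the kernel pipeline.

```python
import numpy as np

def ketbra(i, j, dim=3):
    """Return |i><j| as a dim x dim matrix (illustrative helper)."""
    m = np.zeros((dim, dim), dtype=complex)
    m[i, j] = 1
    return m

# The eight Gell-Mann matrices, in the same order as the list `generators`
gms = [
    ketbra(0, 1) + ketbra(1, 0),
    -1j * (ketbra(0, 1) - ketbra(1, 0)),
    ketbra(0, 0) - ketbra(1, 1),
    ketbra(0, 2) + ketbra(2, 0),
    -1j * (ketbra(0, 2) - ketbra(2, 0)),
    ketbra(1, 2) + ketbra(2, 1),
    -1j * (ketbra(1, 2) - ketbra(2, 1)),
    (ketbra(0, 0) + ketbra(1, 1) - 2 * ketbra(2, 2)) / np.sqrt(3),
]

# Each matrix equals its own conjugate transpose and has zero trace
hermitian = all(np.allclose(g, g.conj().T) for g in gms)
traceless = all(np.isclose(np.trace(g), 0) for g in gms)

# Pairwise traces Tr(gm_a gm_b) should form 2 * identity
overlaps = np.array([[np.trace(a @ b) for b in gms] for a in gms])
orthogonal = np.allclose(overlaps, 2 * np.eye(8))

print(hermitian, traceless, orthogonal)
```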


#### Generalised Hadamard for Qutrits

The generalised Hadamard operator for qutrits is the 3x3 discrete Fourier transform matrix, built from the cube roots of unity.

In [38]:
# Define the Hadamard operator for qutrits
H = (1/np.sqrt(3)) * np.array([
    [1, 1, 1],
    [1, np.exp(2j*np.pi/3), np.exp(-2j*np.pi/3)],
    [1, np.exp(-2j*np.pi/3), np.exp(2j*np.pi/3)],
])

# Print the Hadamard operator
print("Hadamard operator for qutrits:")
print(H)

Hadamard operator for qutrits:
[[ 0.57735027+0.j   0.57735027+0.j   0.57735027+0.j ]
 [ 0.57735027+0.j  -0.28867513+0.5j -0.28867513-0.5j]
 [ 0.57735027+0.j  -0.28867513-0.5j -0.28867513+0.5j]]
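
A short check that this operator behaves like a Hadamard gate: it should be unitary, and it should map |0⟩ to an equal superposition of the three basis states. The sketch below writes the matrix in terms of ω = exp(2πi/3); `omega`, `is_unitary`, and `probs` are names introduced only for this example.

```python
import numpy as np

omega = np.exp(2j * np.pi / 3)  # primitive cube root of unity

# Same matrix as above, using omega**2 == exp(-2j*pi/3)
H = (1 / np.sqrt(3)) * np.array([
    [1, 1, 1],
    [1, omega, omega**2],
    [1, omega**2, omega],
])

# Unitarity: H^dagger H = identity
is_unitary = np.allclose(H.conj().T @ H, np.eye(3))

# Acting on |0> gives equal measurement probabilities for |0>, |1>, |2>
q0 = np.array([[1], [0], [0]])
probs = np.abs(H @ q0).ravel() ** 2

print(is_unitary)
print(probs)
```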


### Quantum Kernel

To construct the quantum kernel, I'll use custom functions that map each input vector to a quantum state of two qutrits, which lives in a 9-dimensional Hilbert space. Because the map is non-linear in the input features, the resulting kernel can capture non-linear relationships between data points, which may help on datasets with intricate decision boundaries.

In [39]:
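# Encoding four features on a qutrit
def encoding(vector):
    """Apply exp(i * sum_k vector[k] * gm_(k+1)), k = 0..3, to the state |0>."""
    # Accumulate the exponent without shadowing the built-in `sum`
    exponent = 0
    for i in range(4):
        exponent = exponent + (1j * vector[i] * generators[i])

    return np.dot(expm(exponent), q0)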
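
Because the exponent is i times a Hermitian matrix, the encoding is a unitary rotation of |0⟩, so the encoded state should always have unit norm. The sketch below checks this for an arbitrary feature vector. It writes out the first four Gell-Mann matrices explicitly so that it runs on its own; the function name `encode` and the sample values are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import expm

q0 = np.array([[1], [0], [0]], dtype=complex)

# First four Gell-Mann matrices, written out explicitly
gm1 = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 0]], dtype=complex)
gm2 = np.array([[0, -1j, 0], [1j, 0, 0], [0, 0, 0]])
gm3 = np.array([[1, 0, 0], [0, -1, 0], [0, 0, 0]], dtype=complex)
gm4 = np.array([[0, 0, 1], [0, 0, 0], [1, 0, 0]], dtype=complex)

def encode(vector):
    """Rotate |0> by exp(i * sum_k vector[k] * gm_k) for the four generators."""
    exponent = sum(1j * x * g for x, g in zip(vector, [gm1, gm2, gm3, gm4]))
    return expm(exponent) @ q0

# Arbitrary example features (not taken from the dataset)
state = encode([0.3, -1.2, 0.7, 2.0])
norm = np.linalg.norm(state)

print(state.shape)
print(np.isclose(norm, 1.0))
```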

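Next, I define the diagonal single-qutrit operator LZ2 = gm3 + √3·gm8 and its two-qutrit tensor product LZZ2, which generates the entangling gate.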

In [41]:
LZ2 = gm3 + np.sqrt(3)*gm8
LZZ2 = np.kron(LZ2, LZ2)
print(LZ2)
print(LZZ2)

[[ 2.  0.  0.]
 [ 0.  0.  0.]
 [ 0.  0. -2.]]
[[ 4.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0. -4.  0.  0. -0.  0.  0. -0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0. -0.  0.  0. -0.  0.  0. -0.]
 [ 0.  0.  0.  0.  0.  0. -4. -0. -0.]
 [ 0.  0.  0.  0.  0.  0. -0. -0. -0.]
 [ 0.  0. -0.  0.  0. -0. -0. -0.  4.]]
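
Since LZZ2 is diagonal, the gate exp(i·LZZ2) is simply a diagonal matrix of phases, one per two-qutrit basis state. The sketch below confirms this numerically; it rebuilds LZ2 directly from its printed diagonal, and `gate` and `phases` are illustrative names.

```python
import numpy as np
from scipy.linalg import expm

# LZ2 = gm3 + sqrt(3) * gm8, entered from its diagonal (2, 0, -2)
LZ2 = np.diag([2.0, 0.0, -2.0])
LZZ2 = np.kron(LZ2, LZ2)

# Matrix exponential of the diagonal generator
gate = expm(1j * LZZ2)

# For a diagonal generator, the exponential acts entry-wise on the diagonal
phases = np.exp(1j * np.diag(LZZ2))

print(np.allclose(gate, np.diag(phases)))
print(np.allclose(gate.conj().T @ gate, np.eye(9)))
```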


The quantum kernel function implements a quantum feature map and evaluates it on two input data points, x1 and x2. For each point, the first four and last four elements are encoded into two separate qutrit states using the encoding function, and the entangling gate exp(i·LZZ2) is applied to the resulting two-qutrit state. The function then computes the inner product of the two feature states and returns the real part of its square as the kernel value; this coincides with the squared magnitude whenever the inner product is real, as it is for a point paired with itself.

In [42]:
def kernel(x1, x2):
    """The quantum kernel: real part of the squared overlap of two feature states."""
    # Entangling gate exp(i * LZZ2), shared by both data points
    entangle_gate = expm(1j*LZZ2)

    # Feature state for x1: encode each half on a qutrit, then entangle the two qutrits
    qutrit_1 = encoding(x1[:4])
    qutrit_2 = encoding(x1[4:])
    qutrit_1x2 = np.kron(qutrit_1, qutrit_2)
    kron1 = np.dot(entangle_gate, qutrit_1x2)

    # Feature state for x2, built the same way
    qutrit_1 = encoding(x2[:4])
    qutrit_2 = encoding(x2[4:])
    qutrit_1x2 = np.kron(qutrit_1, qutrit_2)
    kron2 = np.dot(entangle_gate, qutrit_1x2)

    # Square the inner product <kron1|kron2> and keep only its real part
    return np.real(np.dot(kron1.conj().T, kron2)**2)[0][0]

In [45]:
def kernel_matrix(A, B):
    """Compute the matrix whose entries are the kernel
       evaluated on pairwise data from sets A and B."""
    return np.array([[kernel(a, b) for b in B] for a in A])
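
An SVM expects a symmetric, positive semi-definite Gram matrix, so it is worth checking both properties on a few points. The sketch below is a compact, self-contained re-implementation of the feature map under the same definitions (four generators, the LZZ2 gate, the real part of the squared overlap); `feature_state`, `ENTANGLER`, and the random `samples` are assumptions made for this example and are not data from the notebook.

```python
import numpy as np
from scipy.linalg import expm

q0 = np.array([[1], [0], [0]], dtype=complex)
gens = [
    np.array([[0, 1, 0], [1, 0, 0], [0, 0, 0]], dtype=complex),   # gm1
    np.array([[0, -1j, 0], [1j, 0, 0], [0, 0, 0]]),               # gm2
    np.array([[1, 0, 0], [0, -1, 0], [0, 0, 0]], dtype=complex),  # gm3
    np.array([[0, 0, 1], [0, 0, 0], [1, 0, 0]], dtype=complex),   # gm4
]
LZ2 = np.diag([2.0, 0.0, -2.0])
ENTANGLER = expm(1j * np.kron(LZ2, LZ2))  # computed once and reused

def encoding(v):
    """Encode four features on one qutrit."""
    return expm(sum(1j * x * g for x, g in zip(v, gens))) @ q0

def feature_state(x):
    """Two-qutrit feature state for an 8-dimensional input."""
    return ENTANGLER @ np.kron(encoding(x[:4]), encoding(x[4:]))

def kernel(x1, x2):
    """Real part of the squared overlap, as in the notebook's kernel."""
    overlap = (feature_state(x1).conj().T @ feature_state(x2)).item()
    return np.real(overlap ** 2)

def kernel_matrix(A, B):
    return np.array([[kernel(a, b) for b in B] for a in A])

# Random stand-in inputs with 8 features each
rng = np.random.default_rng(0)
samples = rng.normal(size=(6, 8))

K = kernel_matrix(samples, samples)
eigs = np.linalg.eigvalsh(K)

print(np.allclose(K, K.T))          # symmetric
print(np.allclose(np.diag(K), 1))   # unit diagonal
print(eigs.min() > -1e-9)           # no significantly negative eigenvalues
```

Caching `feature_state` for each sample before forming the pairwise overlaps would avoid recomputing the matrix exponentials for every pair, which matters once the training set grows.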

### Seed Dataset 

The Seed dataset contains 210 samples in its original form (the copy loaded below yields 208 rows), each characterized by seven features that describe geometric properties of the seeds. The goal is to classify the wheat seeds into one of three classes based on these features.

In [46]:
data = pd.read_csv('../Datasets/seeds.txt', delimiter='\t', header=None)
data = data.fillna(1)

X = data.iloc[:, :7].to_numpy()
labels = data.iloc[:, 7].to_numpy(dtype=int)
y = labels - 2

# Use PCA to extract a single principal component (appended below as an eighth feature)
pca = PCA(n_components=1)
reduced_X = pca.fit_transform(X)

# Print the shapes of X and reduced_X to verify they match the expected dimensions
print('Shape of X:', X.shape)
print('Reduced  X:', reduced_X.shape)
print('x[0] reduced example: ', reduced_X[0])
print('x[0] original example: ', X[0])

Shape of X: (208, 7)
Reduced  X: (208, 1)
x[0] reduced example:  [0.64911212]
x[0] original example:  [15.26  14.84   0.871  5.763  3.312  2.221  5.22 ]


In [47]:
XR = np.concatenate((X, reduced_X), axis=1)

# Verify the shape of the concatenated array
print(XR.shape)  # Expected: (208, 8)

(208, 8)


In [48]:
# Scaling the inputs is important since the embedding we use is periodic
scaler = StandardScaler().fit(XR)
X_scaled = scaler.transform(XR)
y_scaled = y

print('Shape of X:', X_scaled.shape)
print('Shape of y:', y_scaled.shape)
print('x[0] feature example: ', X_scaled[0])
print('y[142]: ', y_scaled[142])

Shape of X: (208, 8)
Shape of y: (208,)
x[0] feature example:  [ 1.35444590e-01  2.08430865e-01 -1.62386721e-04  2.96821470e-01
  1.34902457e-01 -9.98181036e-01 -3.93186748e-01  1.97746327e-01]
y[142]:  1


To split the dataset into a training set and a test set with an 80-20 ratio and proportional representation of the three classes in both sets, I'll use the train_test_split function from the sklearn.model_selection module with the stratify parameter set to the target variable y.

In [49]:
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y_scaled, test_size=0.2, stratify=y_scaled, random_state=42)

# Count the number of samples in each split
print("Number of samples in training set:", len(y_train))
print("Number of samples in test set:", len(y_test))


Number of samples in training set: 166
Number of samples in test set: 42
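
To see what stratification does per class, `np.unique` with `return_counts=True` gives the number of samples of each label in a split. The sketch below uses a hypothetical balanced label vector (`y_demo`, 70 samples for each of the labels -1, 0, 1) and random features as a stand-in, so the counts it prints are not those of the notebook's data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in data: 210 samples, 8 features, three balanced classes
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(210, 8))
y_demo = np.repeat([-1, 0, 1], 70)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, stratify=y_demo, random_state=42
)

# Per-class counts in each split
train_labels, train_counts = np.unique(y_tr, return_counts=True)
test_labels, test_counts = np.unique(y_te, return_counts=True)

print(dict(zip(train_labels.tolist(), train_counts.tolist())))
print(dict(zip(test_labels.tolist(), test_counts.tolist())))
```

Running the same two `np.unique` calls on `y_train` and `y_test` would report the per-class counts for the actual split.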


Checking that the kernel of a data point with itself equals 1.

In [50]:
print(kernel(X_train[0], X_train[0]))

1.0000000000000004


### Classical SVM

For the classical SVM component, I'll rely on scikit-learn. The SVM looks for the hyperplane in feature space that separates data points of different classes while maximizing the margin between them. scikit-learn's SVC accepts a callable that returns the kernel matrix, and it handles more than two classes with a one-vs-one scheme, so the quantum kernel can be used directly for this three-class task.

In [51]:
svm = SVC(kernel=kernel_matrix).fit(X_train, y_train)

Predicting on the test set.

In [52]:
predictions = svm.predict(X_test)
accuracy_score(y_test, predictions)

0.8809523809523809
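
An alternative to passing a callable is to compute the Gram matrices once and train with `kernel="precomputed"`, which makes it easy to cache or inspect the kernel values. The sketch below illustrates the equivalence with a stand-in Gaussian kernel, `toy_kernel_matrix`, in place of the quantum `kernel_matrix`; the data and all names are assumptions made for this example.

```python
import numpy as np
from sklearn.svm import SVC

def toy_kernel_matrix(A, B):
    """Stand-in for the quantum kernel_matrix: a Gaussian kernel (illustrative only)."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dists)

# Small synthetic binary problem with balanced labels
rng = np.random.default_rng(0)
X_tr = rng.normal(size=(30, 4))
y_tr = (X_tr[:, 0] > np.median(X_tr[:, 0])).astype(int)
X_te = rng.normal(size=(10, 4))

# Option 1: pass the kernel as a callable, as in the notebook
clf_callable = SVC(kernel=toy_kernel_matrix).fit(X_tr, y_tr)

# Option 2: precompute the Gram matrices
K_train = toy_kernel_matrix(X_tr, X_tr)   # shape (n_train, n_train)
K_test = toy_kernel_matrix(X_te, X_tr)    # rows: test points, columns: training points
clf_pre = SVC(kernel="precomputed").fit(K_train, y_tr)

same = np.array_equal(clf_callable.predict(X_te), clf_pre.predict(K_test))
print(same)
```

Note that the test-time matrix must be evaluated between the test points and the training points, in that order.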

### Evaluation Results

In [53]:
# Compute accuracy and macro-averaged F1, precision, and recall as percentages
y_pred = predictions
y_true = y_test
accuracy = accuracy_score(y_true, y_pred)* 100
f1 = f1_score(y_true, y_pred, average='macro')* 100
precision = precision_score(y_true, y_pred, average='macro')* 100
recall = recall_score(y_true, y_pred, average='macro')* 100

# Print the results
print("Evaluation Results")
print("_____________________________________________")
print(
            f"\nRecall: {recall:.2f}%"
            f"\nPrecision: {precision:.2f}%"
            f"\nAccuracy: {accuracy:.2f}%"
            f"\nMacro Averaged F1-score: {f1:.2f}%"
            )
print("_____________________________________________")

Evaluation Results
_____________________________________________

Recall: 88.10%
Precision: 89.56%
Accuracy: 88.10%
Macro Averaged F1-score: 87.93%
_____________________________________________
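
Macro averaging computes each metric per class and then takes the unweighted mean. A confusion matrix makes this explicit: per-class recall is the diagonal entry divided by the row sum. The sketch below uses small hypothetical label arrays (`y_true_demo`, `y_pred_demo`), not the predictions from this notebook.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, recall_score

# Hypothetical labels for illustration only
y_true_demo = np.array([-1, -1, -1, 0, 0, 0, 1, 1, 1])
y_pred_demo = np.array([-1, -1, 0, 0, 0, 0, 1, -1, 1])

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true_demo, y_pred_demo, labels=[-1, 0, 1])

# Recall per class = correct predictions / number of true samples of that class
per_class_recall = np.diag(cm) / cm.sum(axis=1)
macro_recall = per_class_recall.mean()

print(cm)
print(per_class_recall)
print(np.isclose(macro_recall, recall_score(y_true_demo, y_pred_demo, average="macro")))
```

Applying `confusion_matrix(y_true, y_pred)` to the notebook's own arrays would show which seed varieties are confused with each other.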
