This file contains my solution (Zain Mughal) to Screening Task #3 of Cohort 7 of the QOSF Mentorship Program.

The problem statement is given below, followed by the background and references I used to solve it. I implement the task in Python with Qiskit.

# Task 3: QSVM

Generate a Quantum Support Vector Machine (QSVM) using the iris dataset and try to propose a kernel from a parametric quantum circuit to classify the three classes(setosa, versicolor, virginica) using the one-vs-all format, the kernel only works as binary classification. Identify the proposal with the lowest number of qubits and depth to obtain higher accuracy. You can use the UU† format or using the Swap-Test.

<br>

# Background and References
The task asks for the one-vs-all format, a strategy that turns a multi-class classification problem into a set of binary classification problems. For each class, a binary classifier is trained to distinguish samples that belong to that class from samples that do not. This yields one binary classifier per class, and a new sample is assigned to the class whose classifier is most confident.
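As a minimal sketch of the one-vs-all bookkeeping, the snippet below builds the binary labels for one class and combines per-class scores into a final prediction. The helper names and the score matrix are hypothetical and only illustrate the idea; they are not part of the notebook's pipeline.

```python
import numpy as np

def one_vs_rest_labels(y, positive_class):
    """Return 1 for samples of positive_class and 0 for every other sample."""
    return (np.asarray(y) == positive_class).astype(int)

def combine_scores(scores):
    """Pick, for each sample, the class whose binary classifier scored highest.

    scores has shape (n_samples, n_classes); column j holds the confidence
    of the binary classifier trained for class j.
    """
    return np.argmax(scores, axis=1)

# Toy labels for five samples drawn from three classes.
y = np.array([0, 1, 2, 1, 0])
binary_for_class_1 = one_vs_rest_labels(y, positive_class=1)

# Hypothetical confidence scores from three binary classifiers (made up for illustration).
scores = np.array([[ 0.9, -0.3, -0.8],
                   [-0.5,  0.4, -0.1],
                   [-0.7, -0.2,  0.6]])
predicted = combine_scores(scores)
```

Combining continuous scores, such as the output of scikit-learn's `decision_function`, avoids ties between classifiers; with hard 0/1 predictions, `np.argmax` falls back to the lowest class index whenever several classifiers agree.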

<br>

The task allows either the UU† format or the Swap-Test to evaluate the kernel:
  1. The UU† format estimates the kernel entry for two samples x and x' on a single register. The feature-map circuit U(x) is applied to the all-zeros state, followed by the inverse circuit U†(x'), and the register is measured. The probability of reading the all-zeros bit string equals the squared overlap |<φ(x')|φ(x)>|^2, which is the kernel value. This needs no ancilla and only as many qubits as the feature map itself.
  2. The Swap-Test estimates the same overlap by preparing the two states in separate registers and adding one ancilla qubit. A Hadamard is applied to the ancilla, then a controlled-SWAP between the two registers, then a second Hadamard on the ancilla. The ancilla is measured as 0 with probability 1/2 + 1/2 |<ψ|φ>|^2, so the overlap, and with it the kernel value, can be read off from the measurement statistics.
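To make the two options concrete, here is a small NumPy statevector sketch that computes the squared overlap of two single-qubit states both ways: as the probability of measuring |0> after U(x1) followed by U†(x2), and from the ancilla statistics of a swap test. The RY-based feature map is a hypothetical placeholder chosen for simplicity, not the feature map used later in this notebook.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
I2 = np.eye(2)
ZERO = np.array([1, 0], dtype=complex)

def feature_unitary(x):
    """Hypothetical one-qubit feature map: the rotation RY(x) as a 2x2 unitary."""
    c, s = np.cos(x / 2), np.sin(x / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def overlap_uu_dagger(x1, x2):
    """Probability of measuring |0> after applying U(x1), then U(x2)^dagger, to |0>."""
    final = feature_unitary(x2).conj().T @ feature_unitary(x1) @ ZERO
    return abs(final[0]) ** 2

def overlap_swap_test(x1, x2):
    """Simulate the swap test on |0>|psi>|phi> and return 2 * P(ancilla = 0) - 1."""
    psi = feature_unitary(x1) @ ZERO
    phi = feature_unitary(x2) @ ZERO
    # Qubit order: ancilla (most significant), then psi, then phi.
    state = np.kron(ZERO, np.kron(psi, phi))
    h_on_ancilla = np.kron(H, np.kron(I2, I2))
    # CSWAP as a permutation matrix: exchanges basis states |101> and |110>.
    cswap = np.eye(8)
    cswap[[5, 6]] = cswap[[6, 5]]
    state = h_on_ancilla @ cswap @ h_on_ancilla @ state
    # Ancilla = 0 corresponds to the first four amplitudes.
    p_ancilla_zero = np.sum(np.abs(state[:4]) ** 2)
    return 2 * p_ancilla_zero - 1

k_uu = overlap_uu_dagger(0.3, 1.1)
k_swap = overlap_swap_test(0.3, 1.1)
```

Both functions return the same number for the same pair of inputs. The difference is the cost: for an n-qubit feature map the UU† circuit uses n qubits, while the swap test uses 2n + 1.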

<br>

I have decided to use the Swap-Test implementation.

We are also tasked with finding the proposal with the lowest number of qubits and the lowest depth that still reaches a high accuracy. My plan is to first implement the QSVM in Qiskit, using the QSVC class and the one-vs-all format, with the standard swap test. Afterwards, I use GridSearchCV to tune the hyperparameters for a higher accuracy.
Then, after doing some research, I found that the standard swap-test kernel is inefficient, since it requires 3 qubits and a depth of 7, and that it can be modified to use CX gates in place of the controlled swap, which reduces the number of qubits.


References: 
  1. https://en.wikipedia.org/wiki/Swap_test (Given from Problem Statement)
  2. https://learn.qiskit.org/summer-school/2021/lab3-introduction-quantum-kernels-support-vector-machines (Qiskit Lab)
  3. https://arxiv.org/pdf/1804.11326.pdf  (Supervised learning with quantum enhanced feature spaces, Pages 18-19)

In [None]:
!pip install qiskit
!pip install qiskit[machine-learning]

In [95]:
#Import the Libraries
# Note: these imports target the legacy (pre-1.0) Qiskit API used throughout this notebook
import numpy as np
from sklearn.datasets import load_iris
from qiskit import QuantumCircuit, QuantumRegister, ClassicalRegister, execute, Aer
from qiskit_machine_learning.algorithms.classifiers import QSVC
from qiskit_machine_learning.kernels import QuantumKernel
from qiskit.circuit.library import HGate, CXGate, CSwapGate
from qiskit.circuit.library import ZFeatureMap
from sklearn.model_selection import train_test_split

"""
For this assignment I experimented with several feature maps.
I started with the ZZFeatureMap, the traditional choice for a QSVC, but settled on the ZFeatureMap.
The ZZFeatureMap is better at dealing with non-linear data, but it took about 2x as long to run
and reached accuracies of 70%-85% after fine-tuning.
Similarly, the PauliFeatureMap took about 1.5x as long and gave lower accuracies of 40%-73% after fine-tuning.
I therefore use the ZFeatureMap.
"""
#from qiskit.circuit.library import PauliFeatureMap
#from qiskit.circuit.library import ZZFeatureMap

In [90]:
"""
We process the data by loading the Iris dataset and splitting it into training and testing sets.
Here, I use the standard split of 80% of the dataset for training and 20% for testing.
"""

# Load the iris dataset
data = load_iris()
X = data.data
y = data.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [91]:
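"""Define the swap-test circuit used as the kernel proposal.
The standard swap test on two single-qubit states needs three qubits
(one ancilla plus the two states); this implementation has a circuit depth of 7.
"""
num_qubits = 3
def swap_test_kernel(qc, q, theta):
    # theta is accepted for a parametric circuit but is not used by any gate below
    qc.h(q[0])                   # put the control qubit in superposition
    qc.h(q[1])
    qc.h(q[2])
    qc.cswap(q[0], q[1], q[2])   # swap q[1] and q[2] when q[0] is |1>
    qc.h(q[0])
    qc.h(q[1])
    qc.measure_all()             # adds a classical register and measures every qubit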

In [None]:
theta = np.random.uniform(0, 2*np.pi, num_qubits) 

"""
Here I tested different feature maps and also varied their batch sizes.
Depending on the feature map, the batch size, and other settings, I got accuracies
ranging from 3.6% to 85%. The alternatives are left in as comments to show how these
feature maps can be set up, in case one wants to test them.
"""
# feature_map = PauliFeatureMap(feature_dimension=2, entanglement='linear')
# quantum_instance = Aer.get_backend('qasm_simulator')
# kernel = QuantumKernel(feature_map=feature_map, enforce_psd=True, batch_size=100, 
#                        quantum_instance=quantum_instance, training_parameters=None, 
#                        evaluate_duplicates='off_diagonal')

# feature_map = ZZFeatureMap(feature_dimension=2, reps=2)
# quantum_instance = Aer.get_backend('qasm_simulator')
# kernel = QuantumKernel(feature_map=feature_map, enforce_psd=True, batch_size=40, 
#                        quantum_instance=quantum_instance, training_parameters=None, 
#                        evaluate_duplicates='off_diagonal')


feature_map = ZFeatureMap(feature_dimension=2, reps=2)
quantum_instance = Aer.get_backend('qasm_simulator')
kernel = QuantumKernel(feature_map=feature_map, enforce_psd=True, batch_size=40, 
                       quantum_instance=quantum_instance, training_parameters=None, 
                       evaluate_duplicates='off_diagonal')



In [None]:
"""
The original iris dataset has three classes of flowers: setosa, versicolor, and virginica. We want to train a QSVM to classify each flower into one of these three classes.
Following the task statement, the kernel classifier is treated as binary, so that each sample is classified into one of two classes.
To solve the multi-class problem we therefore convert it into several binary classification problems.

The one-vs-all format is one way to make this conversion.
In this format, we train one QSVM per class, where the samples of the target class are labeled 1 and the samples of all other classes are labeled 0.
This way, each QSVM learns to distinguish the samples of one class from those of all other classes.
"""

# Train the QSVM using the one-vs-all approach
qsvms = []
for i in range(len(np.unique(y_train))):

    # Preparing binary labels for the current class
    y_train_binary = np.zeros(len(y_train))
    y_train_binary[y_train == i] = 1
    if np.sum(y_train_binary) == 0:
        continue  # Skip if there are no samples for the current class

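    # Train the QSVM for the current class
    # qc is built here for reference; it is not passed to the kernel or to QSVC,
    # which evaluate the kernel from feature_map
    qc = QuantumCircuit(num_qubits)
    swap_test_kernel(qc, [0,1,2], theta)
    qsvc = QSVC(quantum_kernel=kernel)
    qsvc.fit(X_train, y_train_binary)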
    
    qsvms.append(qsvc)
   # print(f"QSVC for class {i}: {qsvc}")  # Print QSVC classifier for each class, to make sure the implementation is correct 



In [None]:
# Test the QSVM
n_classes = len(np.unique(y_train))  # 3: one binary classifier per iris class
y_pred = np.zeros((len(X_test), n_classes))
for i in range(len(X_test)):
    for j in range(n_classes):
        qc = QuantumCircuit(num_qubits)
        swap_test_kernel(qc, [0,1,2], theta)
        # This assignment does not retrain the classifiers stored in qsvms
        kernel.quantum_circuit = qc
        y_pred[i, j] = qsvms[j].predict(X_test[i:i+1,:])[0]

# Pick the class whose classifier predicted 1; on ties argmax returns the lowest class index
y_pred = np.argmax(y_pred, axis=1)

# Evaluate the accuracy of the QSVM
accuracy = sum([y_test[i]==y_pred[i] for i in range(len(y_test))])/len(y_test)
print('Accuracy:', accuracy)

<br>
<br>

Before optimizing the number of qubits and the depth, I want to tune the hyperparameters to improve the accuracy. At this point the model reached an accuracy of 36%, which is very low, so I decided to use GridSearchCV from the sklearn.model_selection module. GridSearchCV searches a grid of candidate hyperparameter values: it fits the model for every combination in the grid, scores each combination with cross-validation, and selects the combination with the best score.
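The snippet below is a minimal classical sketch of this kind of search, assuming scikit-learn's `SVC` with a custom kernel function in place of the quantum kernel. The function `cosine_fidelity_kernel` is a hypothetical stand-in that is cheap to evaluate, and only `C` is searched.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

def cosine_fidelity_kernel(A, B):
    """Hypothetical stand-in kernel: product over features of cos^2(a - b)."""
    diff = A[:, None, :] - B[None, :, :]
    return np.prod(np.cos(diff) ** 2, axis=2)

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Only C is searched: gamma and degree belong to SVC's built-in
# rbf/poly/sigmoid kernels and have no effect on a custom kernel.
param_grid = {"C": [0.1, 0.5, 1, 10]}

search = GridSearchCV(SVC(kernel=cosine_fidelity_kernel), param_grid, cv=5)
search.fit(X_train, y_train)

print("Best parameters found:", search.best_params_)
print("Best cross-validation score:", search.best_score_)
```

Note that `best_score_` is the mean cross-validated accuracy on the training folds, so it is not directly comparable to an accuracy measured on the held-out test set.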

In [78]:
# Load the iris dataset
data = load_iris()
X = data.data
y = data.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define the quantum kernel
def swap_test_kernel(qc, q, theta):
    qc.h(q[0])
    qc.h(q[1])
    qc.h(q[2])
    qc.cswap(q[0], q[1], q[2])
    qc.h(q[0])
    qc.h(q[1])
    qc.measure_all()

num_qubits = 3
theta = np.random.uniform(0, 2*np.pi, num_qubits)

feature_map = ZFeatureMap(feature_dimension=2, reps=2)
quantum_instance = Aer.get_backend('qasm_simulator')
kernel = QuantumKernel(feature_map=feature_map, enforce_psd=True, batch_size=100, 
                       quantum_instance=quantum_instance, training_parameters=None, 
                       evaluate_duplicates='off_diagonal')

# Train the QSVM 
from sklearn.model_selection import GridSearchCV
from qiskit_machine_learning.algorithms.classifiers import QSVC

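# Candidate values for the grid search. gamma and degree are only used by
# SVC's built-in rbf/poly/sigmoid kernels, so with a quantum kernel only C
# is expected to change the fit.
param_grid = {'C': [0.1, 0.5, 1, 10],
              'gamma': [0.01, 0.1, 1],
              'degree': [2, 3, 4]}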

qsvc = QSVC(quantum_kernel=kernel)
clf = GridSearchCV(qsvc, param_grid, cv=5)
clf.fit(X_train, y_train)

print("Best parameters found: ", clf.best_params_)
print("Best score: ", clf.best_score_)

Best parameters found:  {'C': 1, 'degree': 3, 'gamma': 0.01}
Best score:  0.9333333333333333


<br>
<br>


After finding the best hyperparameter values for my QSVC implementation, I wanted to reduce the number of qubits and the depth and see how that affects the accuracy. While doing some research, I realized that we had turned the multi-class classification problem into binary classification problems, and that because of this it should be possible to reduce the number of qubits from 3 to 2. That is what I do below.

<br>

I first changed the implementation of the swap test, following Ref. 3, to use a CNOT gate instead of a CSWAP gate. The idea is as follows:

In the swap test, the CSWAP gate acts on three qubits: one control qubit and two qubits that hold the states to be compared. It swaps the two state qubits if and only if the control qubit is in the state |1>. In the circuit above, q[0] is the control, and q[1] and q[2] are the qubits being swapped.

We can perform a similar test with only two qubits by replacing the CSWAP gate with a CNOT gate. In this case, one qubit holds the input state and the other acts as the control. The CNOT gate flips the target qubit if and only if the control qubit is in the state |1>; in the code below, q[0] is the control and q[1] is the target. The idea is to let the CNOT gate play the role of the CSWAP gate in the swap test.
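For reference, one known way to estimate an overlap with a CNOT instead of a CSWAP is a Bell-basis measurement of the two state qubits: apply a CNOT, then a Hadamard on the control, and read the squared overlap from the probability of the outcome 11. The NumPy sketch below simulates that variant for two single-qubit states. It is an illustration under that assumption and is not necessarily the same circuit as the one defined in the next cell.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
I2 = np.eye(2)
# CNOT with the first qubit as control and the second qubit as target.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

def overlap_from_bell_measurement(psi, phi):
    """Return |<psi|phi>|^2 computed as 1 - 2 * P(outcome 11).

    The circuit is CNOT (control: psi qubit, target: phi qubit) followed by
    a Hadamard on the control, which maps the singlet state onto |11>.
    """
    state = np.kron(psi, phi)              # joint state |psi>|phi>
    state = np.kron(H, I2) @ CNOT @ state  # CNOT first, then H on the control
    p11 = abs(state[3]) ** 2
    return 1 - 2 * p11

# Two arbitrary normalized single-qubit states (values chosen for illustration).
psi = np.array([np.cos(0.2), np.sin(0.2)], dtype=complex)
phi = np.array([np.cos(0.7), np.exp(0.5j) * np.sin(0.7)], dtype=complex)

estimated = overlap_from_bell_measurement(psi, phi)
exact = abs(np.vdot(psi, phi)) ** 2
```

This variant needs no ancilla, at the price of measuring, and therefore destroying, both input states.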


In [98]:
# Load the iris dataset
data = load_iris()
X = data.data
y = data.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define the quantum kernel
def swap_test_kernel(qc, q, theta):
    # theta is not used by any gate below
    qc.h(q[0])
    qc.cx(q[0], q[1])   # CNOT with q[0] as control and q[1] as target
    qc.h(q[0])
    qc.measure_all()

num_qubits = 2
theta = np.random.uniform(0, 2*np.pi, num_qubits)

from qiskit.circuit.library import ZFeatureMap
from qiskit import Aer

feature_map = ZFeatureMap(feature_dimension=2, reps=2)
quantum_instance = Aer.get_backend('qasm_simulator')
kernel = QuantumKernel(feature_map=feature_map, enforce_psd=True, batch_size=40, 
                       quantum_instance=quantum_instance, training_parameters=None, 
                       evaluate_duplicates='off_diagonal')

# Train the QSVM using the one-vs-all approach
qsvms = []
for i in range(len(np.unique(y_train))):
    # Prepare binary labels for the current class
    y_train_binary = np.zeros(len(y_train))
    y_train_binary[y_train == i] = 1
    if np.sum(y_train_binary) == 0:
        continue  # Skip if there are no samples for the current class
        
    # Train the QSVM for the current class
    qc = QuantumCircuit(num_qubits)
    swap_test_kernel(qc, [0,1], theta)  # the circuit now has only two qubits
    qsvc = QSVC(quantum_kernel=kernel, C=1, gamma=0.01, degree=3)
    qsvc.fit(X_train, y_train_binary)
    
    qsvms.append(qsvc)

# Test the QSVM
y_pred = np.zeros((len(X_test), len(np.unique(y_train))))
for i in range(len(X_test)):
    for j in range(len(np.unique(y_train))):
        qc = QuantumCircuit(num_qubits)
        swap_test_kernel(qc, [0,1], theta)  # the circuit now has only two qubits
        kernel.quantum_circuit = qc
        y_pred[i, j] = qsvms[j].predict(X_test[i:i+1,:])[0]

y_pred = np.argmax(y_pred, axis=1)

# Evaluate the accuracy of the QSVM
accuracy = sum([y_test[i]==y_pred[i] for i in range(len(y_test))])/len(y_test)
print('Accuracy:', accuracy)

Accuracy: 0.9333333333333333


<br>

As you can see, I was able to reduce the number of qubits from n=3 to n=2 and to reduce the circuit depth from 7 in the original swap-test kernel implementation to 3 in the modified one, while reaching an accuracy of 93%.