# Quantum Kernel Methods for IRIS dataset classification with [TorchQuantum](https://github.com/mit-han-lab/torchquantum).
<p align="left">
<img src="https://github.com/mit-han-lab/torchquantum/blob/master/torchquantum_logo.jpg?raw=true" alt="torchquantum Logo" width="250">
</p>

Tutorial Author: Zirui Li, Hanrui Wang


### Outline
1. Introduction to Quantum Kernel Methods.
2. Build and train an SVM using Quantum Kernel Methods.

In this tutorial, we use `tq.op_name_dict`, `tq.functional.func_name_dict` and `tq.QuantumDevice` from TorchQuantum.

You can learn how to build a Quantum kernel function and train an SVM with the quantum kernel from this tutorial.


## Introduction to Quantum Kernel Methods.


### Kernel Methods
Kernel methods are a family of algorithms for pattern analysis that rely on a kernel function, which measures the similarity between two data points. They let a linear classifier solve a non-linear problem. Kernel methods are employed in SVMs (Support Vector Machines), which are often used in classification and regression problems. The SVM uses what is called the “kernel trick”: the data is implicitly mapped to a higher-dimensional feature space, and an optimal linear boundary is found there using only inner products between the mapped points.
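
To make the kernel trick concrete, here is a minimal sketch in plain Python. It is not part of the original tutorial: the degree-2 polynomial kernel and the helper names `feature_map`, `dot`, and `poly_kernel` are chosen only for illustration. It checks that evaluating the kernel on the raw 2-D inputs gives the same number as an inner product in a 3-D feature space, without the kernel ever building that space.

```python
import math

def feature_map(v):
    """Explicit feature map for the degree-2 polynomial kernel on 2-D inputs."""
    x1, x2 = v
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def dot(a, b):
    """Ordinary inner product of two equally long sequences."""
    return sum(p * q for p, q in zip(a, b))

def poly_kernel(u, v):
    """k(u, v) = (u . v)^2, computed without the feature map."""
    return dot(u, v) ** 2

u, v = (1.0, 2.0), (3.0, -1.0)
explicit = dot(feature_map(u), feature_map(v))  # inner product in feature space
implicit = poly_kernel(u, v)                    # same value from the raw inputs
print(explicit, implicit)
```

The two printed values agree up to floating-point rounding, which is why an SVM only ever needs the kernel values and never the mapped points.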


#### Quantum Kernel
A quantum circuit can map the data into a high-dimensional Hilbert space that is hard to simulate on a classical computer. Kernel methods built on this Hilbert space may capture structure that classical kernels miss.

### How to evaluate the distance in Hilbert space?
Assume $S(x)$ is the unitary that maps data $x$ to the state $S(x)|00\cdots0\rangle$ in Hilbert space. To evaluate the inner product between the states prepared by $S(x)$ and $S(y)$, we apply the conjugate transpose of $S(y)$ after $S(x)$ and measure the probability that the resulting state is $|00\cdots0\rangle$. This probability equals $|\langle 00\cdots0|S(y)^\dagger S(x)|00\cdots0\rangle|^2$.


<div align="center">
<img src="https://github.com/mit-han-lab/torchquantum/blob/master/figs/kernel.png?raw=true" alt="conv-full-layer" width="600">
</div>
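
As a small numerical illustration of this measurement (a sketch, not the tutorial's TorchQuantum code), take a single qubit and assume $S(x)$ is an RY rotation by angle $x$. The helper names `ry` and `apply` are hypothetical. Applying $S(x)$ and then the conjugate transpose of $S(y)$ to $|0\rangle$ leaves amplitude $\cos((x-y)/2)$ on $|0\rangle$, so the measured probability is its square.

```python
import math

def ry(theta):
    """2x2 matrix of the single-qubit RY(theta) rotation (real-valued)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]

def apply(matrix, state):
    """Multiply a 2x2 matrix with a length-2 state vector."""
    return [sum(matrix[i][j] * state[j] for j in range(2)) for i in range(2)]

x, y = 0.9, 0.4
state = [1.0, 0.0]            # |0>
state = apply(ry(x), state)   # S(x)|0>
state = apply(ry(-y), state)  # conjugate transpose of RY(y) is RY(-y)
prob_zero = abs(state[0]) ** 2
print(prob_zero, math.cos((x - y) / 2) ** 2)
```

Identical inputs give probability 1, and the value drops as `x` and `y` move apart, which is the behaviour expected of a similarity measure.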

## Build and train an SVM using Quantum Kernel Methods.

### Installation

In [1]:
!pip install qiskit==0.32.1


Download and cd to the repo.

In [2]:
!git clone https://github.com/mit-han-lab/pytorch-quantum.git

Cloning into 'pytorch-quantum'...
remote: Enumerating objects: 15582, done.
remote: Counting objects: 100% (2290/2290), done.
remote: Compressing objects: 100% (694/694), done.
remote: Total 15582 (delta 1812), reused 1840 (delta 1588), pack-reused 13292
Receiving objects: 100% (15582/15582), 100.22 MiB | 15.23 MiB/s, done.
Resolving deltas: 100% (8887/8887), done.
Updating files: 100% (346/346), done.


In [3]:
%cd pytorch-quantum

/content/pytorch-quantum


Install torch-quantum.

In [4]:
!pip install --editable .


Change PYTHONPATH and install other packages.

In [5]:
%env PYTHONPATH=.

env: PYTHONPATH=.


Run the following code to store a Qiskit token for IBMQ access. Use the token from your own IBMQ account.



In [7]:
%pip install qiskit-ibmq-provider


In [14]:
from qiskit import IBMQ
# Replace the placeholder with the API token from your own IBMQ account; never publish a real token.
IBMQ.save_account('YOUR_IBMQ_API_TOKEN', overwrite=True)

### Import the module
`SVR` is scikit-learn's support vector regression estimator; its classification counterpart is `SVC`. We use this module to call the support vector machine algorithm.

`load_iris` is to load the famous iris dataset.

`StandardScaler` is to help scale the data by removing the mean and scaling to unit variance.

`train_test_split` is a tool to split the dataset.

`accuracy_score` can check how many samples are correctly predicted and give us the accuracy.

`func_name_dict` is a very important dict under `torchquantum.functional`. If we look up the name of a gate we want, such as 'rx', 'ry', or 'rzz', the dict gives us a function. This function plays a central role in our quantum model: it applies the specified unitary operation to a specified quantum state on the specified wires. The quantum state, the wires, and the gate parameters are the three arguments we pass to it, as the ansatz code later in this tutorial shows.


In [20]:
import numpy as np
import torch

from sklearn.svm import SVR
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from torchquantum.functional import func_name_dict
import torchquantum as tq

### Prepare dataset
We use the first 100 samples of the Iris dataset, which cover two of its three classes.

Since the rotation angle of a quantum gate is 2π-periodic, it is necessary to scale the data to the range from -π to π.

We also change the labels from 0 and 1 to -1 and 1.

Finally, we split the dataset into training and test sets at a 3-to-1 ratio.
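
The preparation steps listed above could look roughly like the following plain-Python sketch. The small `samples` and `labels` lists are made-up stand-ins for the Iris data, min-max scaling is only one possible way to reach the $[-\pi, \pi]$ range, and all helper and variable names are hypothetical.

```python
import math
import random

def scale_to_pi(column):
    """Min-max scale one feature column to the range [-pi, pi]."""
    lo, hi = min(column), max(column)
    return [-math.pi + 2 * math.pi * (v - lo) / (hi - lo) for v in column]

# Made-up stand-in data: 8 samples, 2 features, labels in {0, 1}
samples = [[5.1, 3.5], [4.9, 3.0], [4.7, 3.2], [5.0, 3.6],
           [7.0, 3.2], [6.4, 3.2], [6.9, 3.1], [5.5, 2.3]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

# Scale every feature column, then transpose back to one row per sample
columns = [scale_to_pi(list(col)) for col in zip(*samples)]
scaled = [list(row) for row in zip(*columns)]

# Map labels 0/1 to -1/+1
signed = [2 * label - 1 for label in labels]

# Shuffle the sample indices, then split 3-to-1 into train and test
order = list(range(len(scaled)))
random.Random(0).shuffle(order)
cut = len(order) * 3 // 4
x_tr = [scaled[i] for i in order[:cut]]
y_tr = [signed[i] for i in order[:cut]]
x_te = [scaled[i] for i in order[cut:]]
y_te = [signed[i] for i in order[cut:]]
print(len(x_tr), len(x_te))
```

Scaling each feature separately keeps every rotation angle inside one period, so two different feature values are never mapped to the same gate.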


In [26]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
from sklearn import svm

In [69]:
df = pd.read_csv("/content/city_temperature.csv")
# Drop rows whose AvgTemperature is the sentinel value -99 (done before the unit conversion)
df = df[df['AvgTemperature'] != -99]
# Remove location columns that are not used as features
df = df.drop(columns=['State', 'Region', 'City'])
# Convert Fahrenheit to Celsius
df['AvgTemperature'] = (df['AvgTemperature'] - 32) * (5 / 9)
# Keep two countries and renumber the rows from 0
df = df[df['Country'].isin(['Egypt', 'Algeria'])].reset_index(drop=True)
df.shape


(18458, 5)

In [70]:
df = df.rename(
    columns= {'Country': 'country', 'Month': 'month', 'Day': 'day', 'Year': 'year', 'AvgTemperature': 'avg_temperature'}
)
df=pd.get_dummies(df,columns=['country','month','day','year'])
X=df.drop(columns=['avg_temperature'])
y=df['avg_temperature']
X=X.astype(int)

In [71]:
X

Unnamed: 0,country_Algeria,country_Egypt,month_1,month_2,month_3,month_4,month_5,month_6,month_7,month_8,...,year_2011,year_2012,year_2013,year_2014,year_2015,year_2016,year_2017,year_2018,year_2019,year_2020
0,1,0,1,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,1,0,1,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,1,0,1,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,1,0,1,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,1,0,1,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
18453,0,1,0,0,0,0,1,0,0,0,...,0,0,0,0,0,0,0,0,0,1
18454,0,1,0,0,0,0,1,0,0,0,...,0,0,0,0,0,0,0,0,0,1
18455,0,1,0,0,0,0,1,0,0,0,...,0,0,0,0,0,0,0,0,0,1
18456,0,1,0,0,0,0,1,0,0,0,...,0,0,0,0,0,0,0,0,0,1


In [72]:
y

0        17.888889
1         9.666667
2         9.333333
3         8.000000
4         8.833333
           ...    
18453    22.000000
18454    22.277778
18455    23.444444
18456    25.722222
18457    20.555556
Name: avg_temperature, Length: 18458, dtype: float64

In [73]:
from sklearn.model_selection import train_test_split, cross_val_score

x_train, x_test, y_train, y_test = train_test_split(X, y,
                                                    train_size=0.8, test_size=0.2)
x_train,x_val,y_train,y_val=train_test_split(x_train,y_train,train_size=0.8,test_size=0.2)

# Print information about the splits
print(f"Total dataset length: {len(X)}")
print(f"Training set length: {len(x_train)}")
print(f"Validation set length: {len(x_val)}")
print(f"Test set length: {len(x_test)}")
print(x_train)
print(y_train)

Total dataset length: 18458
Training set length: 11812
Validation set length: 2954
Test set length: 3692
       country_Algeria  country_Egypt  month_1  month_2  month_3  month_4  \
14929                0              1        0        0        0        0   
16293                0              1        0        0        0        0   
3420                 1              0        0        0        0        0   
6858                 1              0        0        0        0        0   
7609                 1              0        0        0        0        0   
...                ...            ...      ...      ...      ...      ...   
3787                 1              0        0        0        0        0   
3608                 1              0        0        0        0        0   
17393                0              1        0        0        0        0   
7973                 1              0        0        0        0        0   
12728                0              1        0  

### Build the ansatz, which consists of a unitary and its conjugate transpose.
When initializing the `KernelAnsatz`, we only need to pass a `func_list`, which the `KernelAnsatz` records. Each entry in the list is a dict containing 'input_idx', 'func', and 'wires'.

When executing the `KernelAnsatz`, three arguments are passed in from outside: `q_device`, `x`, and `y`. `q_device` stores the state vector, which we first reset to $|00\cdots0\rangle$. Then, as in the figure above, we apply $S(x)$ followed by the conjugate transpose of $S(y)$ to the `q_device`.

Here the gates in the `func_list` with data `x` form the unitary $S(x)$. Because $S(y)$ is unitary, its conjugate transpose is also its inverse, so we can build it by inverting the function list with data `y`. How do we invert a list of gates executed from head to tail? We undo the gates one by one from tail to head. So here, we simply reverse the order of the function list and negate each rotation angle.
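
The reverse-and-negate rule can be checked numerically on a single qubit. The following is a minimal sketch with hand-written 2x2 matrices rather than TorchQuantum gates; `GATES`, `matmul`, and `circuit_matrix` are hypothetical helpers, and the gate list only loosely mimics the `func_list` format.

```python
import cmath
import math

def ry(t):
    """Matrix of the single-qubit RY(t) rotation."""
    c, s = math.cos(t / 2), math.sin(t / 2)
    return [[c, -s], [s, c]]

def rz(t):
    """Matrix of the single-qubit RZ(t) rotation."""
    return [[cmath.exp(-1j * t / 2), 0], [0, cmath.exp(1j * t / 2)]]

GATES = {'ry': ry, 'rz': rz}  # gate name -> matrix builder

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def circuit_matrix(gate_list):
    """Matrix of a circuit whose gates are applied from head to tail."""
    total = [[1, 0], [0, 1]]
    for name, angle in gate_list:
        total = matmul(GATES[name](angle), total)  # later gates multiply from the left
    return total

forward = [('ry', 0.3), ('rz', 1.1), ('ry', -0.7)]
# Inverse circuit: reversed order, every angle negated
inverse = [(name, -angle) for name, angle in reversed(forward)]

# Running the forward gates and then the inverse gates should give the identity
product = circuit_matrix(forward + inverse)
print(product)
```

Negating the angles without reversing the order would not work in general, because rotations about different axes do not commute.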

Each iteration applies one unitary gate to the quantum state. We look up the function name in `func_name_dict`; here the function name is the gate name, such as 'ry' or 'rz'. The dict returns a function, and we pass it three arguments: the function applies the gate to the state vector (`self.q_device`), on the given `wires`, with the rotation angle given by `params`.


In [74]:
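class KernelAnsatz(tq.QuantumModule):
    def __init__(self, func_list):
        super().__init__()
        # Each entry: {'input_idx': [...], 'func': gate name, 'wires': [...]}
        self.func_list = func_list

    @tq.static_support
    def forward(self, q_device: tq.QuantumDevice, x, y):
        self.q_device = q_device
        # Reset the device to |00...0> for a batch of x.shape[0] samples
        self.q_device.reset_states(x.shape[0])
        # Apply S(x): run the gates from head to tail with angles taken from x
        for info in self.func_list:
            if tq.op_name_dict[info['func']].num_params > 0:
                params = x[:, info['input_idx']]
            else:
                params = None
            func_name_dict[info['func']](
                self.q_device,
                wires=info['wires'],
                params=params,
            )
        # Apply the inverse of S(y): reversed gate order with negated angles.
        # Parameter-free gates are re-applied as-is, which undoes them only
        # if they are self-inverse (for example X, H, or CNOT).
        for info in reversed(self.func_list):
            if tq.op_name_dict[info['func']].num_params > 0:
                params = -y[:, info['input_idx']]
            else:
                params = None
            func_name_dict[info['func']](
                self.q_device,
                wires=info['wires'],
                params=params,
            )


# Keep the original misspelled name available for code that still uses it
KernalAnsatz = KernelAnsatz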

### Build the whole quantum circuit

The model is initialized with a 4-wire quantum device (the `tq.QuantumDevice` module stores the state vector) and the `KernelAnsatz` we just described.

When executing the model, we set the batch size to 1, since torchquantum's models operate on batches. After executing the `KernelAnsatz`, we read out the overlap with $|00\cdots0\rangle$ as the result: we take the state vector, flatten it, pick the first amplitude, which is the amplitude of the $|00\cdots0\rangle$ state, and compute its absolute value. The probability that the quantum state is measured in $|00\cdots0\rangle$ is the square of this absolute value; the code below returns the absolute value itself.
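
For the four-wire RY ansatz used here, the readout can be reproduced without a simulator, because the state stays a product of single-qubit states. The sketch below is plain Python, not TorchQuantum; `qubit_after_overlap` and `kron` are hypothetical helpers, and the input values are arbitrary. It builds the 16-amplitude state vector, reads the first amplitude, and prints both its magnitude and the corresponding probability.

```python
import math

def qubit_after_overlap(xi, yi):
    """State of one wire after RY(xi) followed by RY(-yi) acting on |0>."""
    d = (xi - yi) / 2
    return [math.cos(d), math.sin(d)]

def kron(a, b):
    """Tensor product of two state vectors given as flat lists."""
    return [p * q for p in a for q in b]

x = [0.1, -1.2, 2.0, 0.7]
y = [0.4, -0.9, 1.5, -0.3]

# Build the full 4-wire state as a tensor product of the single-wire states
state = [1.0]
for xi, yi in zip(x, y):
    state = kron(state, qubit_after_overlap(xi, yi))

amplitude = state[0]          # amplitude of |0000>
magnitude = abs(amplitude)    # the quantity obtained by taking the absolute value
probability = magnitude ** 2  # probability of measuring |0000>
print(len(state), magnitude, probability)
```

For this particular ansatz the first amplitude is simply the product of $\cos((x_i - y_i)/2)$ over the four wires; circuits with entangling gates do not factor this way and need a full simulation.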


In [75]:
class Kernel(tq.QuantumModule):
    def __init__(self):
        super().__init__()
        self.n_wires = 4
        self.q_device = tq.QuantumDevice(n_wires=self.n_wires)
        # One RY rotation per wire; feature i sets the angle on wire i
        self.ansatz = KernalAnsatz([
            {'input_idx': [0], 'func': 'ry', 'wires': [0]},
            {'input_idx': [1], 'func': 'ry', 'wires': [1]},
            {'input_idx': [2], 'func': 'ry', 'wires': [2]},
            {'input_idx': [3], 'func': 'ry', 'wires': [3]},
        ])

    def forward(self, x, y, use_qiskit=False):
        # Treat each input as a batch of size 1
        x = x.reshape(1, -1)
        y = y.reshape(1, -1)
        self.ansatz(self.q_device, x, y)
        # Magnitude of the |00...0> amplitude (square it to get the probability)
        result = torch.abs(self.q_device.states.view(-1)[0])
        return result

### Train the SVM model from sklearn based on our quantum kernel.

Define a kernel matrix function.

Pass the kernel matrix function to the SVM estimator, call `.fit(x_train, y_train)`, and the estimator starts training.

Predict on the test set and check the score. The result looks quite good.
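
A kernel matrix function evaluates the kernel for every pair of samples drawn from two sets. The sketch below shows the shape of such a function in plain Python. `toy_kernel` is a hypothetical stand-in (the closed-form overlap magnitude of the RY ansatz) for calling the quantum `Kernel` module, and `kernel_matrix` is an illustrative name.

```python
import math

def toy_kernel(x, y):
    """Stand-in for the quantum kernel: overlap magnitude of the RY ansatz."""
    return abs(math.prod(math.cos((a - b) / 2) for a, b in zip(x, y)))

def kernel_matrix(xs, ys, kernel=toy_kernel):
    """Return the matrix K with K[i][j] = kernel(xs[i], ys[j])."""
    return [[kernel(x, y) for y in ys] for x in xs]

points = [[0.0, 0.5, 1.0, -1.0],
          [0.3, -0.2, 2.0, 0.1],
          [-1.5, 1.5, 0.0, 0.8]]
gram = kernel_matrix(points, points)
for row in gram:
    print([round(v, 3) for v in row])
```

scikit-learn's `SVC` and `SVR` accept a callable as the `kernel` argument; it is called with two sample arrays and is expected to return the kernel matrix, so a function like this would typically return its result wrapped in `numpy.array`.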


In [80]:
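kernel_function = Kernel()
# Note: kernel_function is not passed to the estimator here; this SVR uses scikit-learn's built-in RBF kernel
svr = SVR(kernel='rbf').fit(x_train, y_train)
predictions = svr.predict(x_test)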

In [81]:
# For a regressor, score() returns the coefficient of determination (R^2), not a classification accuracy
r2 = svr.score(x_test, y_test)
print(f"R^2 score: {r2}")

R^2 score: 0.8793364997801647
