## Topic 2: QSVM4EO - Land classification using quantum-enhanced support vector machines

This notebook will present the an example of using quantum-enhanced support vector machines (QSVM) for classification tasks on multi-spectral Earth Observation (EO) data. The main topics covered are introduction to classical SVM, where quantum computation could be used in the SVM calculations, the intricacies of preparing and encoding the classical data into useful quantum states and training QSVMs on gate-based quantum software simulators. Finally, some results on comparing classical and quantum-enhanced SVM models will be presented.

QSVM code is adapted from Qiskit QSVM example: https://qiskit-community.github.io/qiskit-machine-learning/tutorials/03_quantum_kernel.html

## Table of content

- Introduction
    - Use case
    - Theory
- QSVM Tutorial
- Exercise
- Conclusions

## Introduction

### Use case

What we're doing and why





### Theory

#### Support Vector Machines (SVM)

The SVM algorithm, in it’s original linear definition, is very similar to the linear classifier. But instead of trying to find any hyperplane separating the two data classes, it’s goal is to find the maximum margin between them with the assumiont that they are linearly separable.

#### Support Vector Machines (SVM)

The SVM algorithm, in it’s original linear definition, is very similar to the linear classifier. But instead of trying to find any hyperplane separating the two data classes, it’s goal is to find the maximum margin between them with the assumiont that they are linearly separable.

<div style="background-color: #ffffff; padding: 10px; text-align: center; width: fit-content;">
  <img src="images/SVM_margin.png" width="300"><br>
  <span style="font-style: italic; font-size: 14px;">(Source: https://commons.wikimedia.org/wiki/File:SVM_margin.png)</span>
</div>


Primal formulation of SVM:

$$
\mathbf{w}^{\top} \mathbf{x}-b=0
$$

This could be reformulated as an optimization problem:

$$
\underset{\mathbf{w}, b}{\operatorname{argmin}} \frac{1}{2}\|\mathbf{w}\|^{2}
$$

with the following constraints:

$$
\begin{aligned}
\mathbf{w}^{\top} \mathbf{x}_{i}-b \geq 1 & \text { for } y_{i}=1 \\
\mathbf{w}^{\top} \mathbf{x}_{i}-b \leq-1 & \text { for } y_{i}=-1
\end{aligned}
$$

In practice the Dual Lagrangian formulation is used:

$$
\mathcal{L}(\boldsymbol{\alpha})=\sum_{n=1}^{N} \alpha_{n}-\frac{1}{2} \sum_{n=1}^{N} \sum_{m=1}^{N} y_{n} y_{m} \alpha_{n} \alpha_{m} \mathbf{x}_{n}^{\top} \mathbf{x}_{m}
$$

Maximize w.r.t. to α subject to:

$$
\begin{array}{r}
\alpha_n \geq 0 \text { for } n=1, \cdots, N \\
\qquad \sum_{n=1}^N \alpha_n y_n=0
\end{array}
$$

In this formulation the values of the Lagrangian multipliers α could be found with quadratic programming.

##### Kernel trick

The above formulations work for linearly separable data, but the real-world datasets are rarely linearly separable. In cases where the data points are intermixed, one could use a transformation function that is applied to the original datapoints $x_i$ that maps them to a new feature space. The hope is that in this new feature space the mapped datapoints $z_i$ are linearly separable. 

<div style="background-color: #ffffff; padding: 10px; text-align: center; width: fit-content;">
  <img src="images/SVM_kernel.png" width="600"><br>
  <span style="font-style: italic; font-size: 14px;">(Source: https://www.hackerearth.com/blog/developers/simple-tutorial-svm-parameter-tuning-python-r/
  )</span>
</div>

$$
\mathcal{L}(\boldsymbol{\alpha})=\sum_{n=1}^{N} \alpha_{n}-\frac{1}{2} \sum_{n=1}^{N} \sum_{m=1}^{N} y_{n} y_{m} \alpha_{n} \alpha_{m} \mathbf{z}_{n}^{\top} \mathbf{z}_{m}
$$

subject to:

$$
\begin{array}{r}
\alpha_n \geq 0 \text { for } n=1, \cdots, N \\
\qquad \sum_{n=1}^N \alpha_n y_n=0
\end{array}
$$

Computing those new datapoints $z_i$ could be quite computationally expensive. To avoid it we could apply the kernel trick: instead of computing each new datapoint in the new feature space, what we need is their dot product, as seen in the equation above. So if we could compute it directly, by using the data in the original input space, we could speed up the computation and get access to kernels in infinite dimensions (for example the RBF kernel).

Some classical kernel functions:

$$
\begin{array}{l|l|l}
\hline \text { Name } & \text { Kernel } & \text { Hyperparameters } \\
\hline \text { Linear } & \mathbf{x}^{T} \mathbf{x}^{\prime} & - \\
\hline \text { Polynomial } & \left(\mathbf{x}^{T} \mathbf{x}^{\prime}+c\right)^{p} & p \in \mathbb{N}, c \in \mathbb{R} \\
\hline \text { Gaussian } & \mathrm{e}^{-\gamma\left\|\mathbf{x}-\mathbf{x}^{\prime}\right\|^{2}} & \gamma \in \mathbb{R}^{+} \\
\hline \text { Exponential } & \mathrm{e}^{-\gamma\left\|\mathbf{x}-\mathbf{x}^{\prime}\right\|} & \gamma \in \mathbb{R}^{+} \\
\hline \text { Sigmoid } & \tanh \left(\mathbf{x}^{T} \mathbf{x}^{\prime}+c\right) & c \in \mathbb{R} \\
\hline
\end{array}
$$

#### Quantum-enhanced Support Vector Machines (QSVM)

WIP

<div style="background-color: #ffffff; padding: 10px; text-align: center; width: fit-content;">
  <img src="images/SVM_QSVM_workflows.png" width="600"><br>
  <span style="font-style: italic; font-size: 14px;"></span>
</div>

<div style="background-color: #ffffff; padding: 10px; text-align: center; width: fit-content;">
  <img src="images/kernel_circuit.png" width="600"><br>
  <span style="font-style: italic; font-size: 14px;">(Source: https://pennylane.ai/qml/demos/tutorial_kernel_based_training)</span>
</div>

## QSVM Tutorial

### Imports and setup

In [2]:
import qiskit

from qiskit_machine_learning.utils import algorithm_globals

algorithm_globals.random_seed = 12345

### Dataset

In [3]:
import pandas as pd 
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import numpy as np

def read_csv(file_dir, selected_columns):
    df = pd.read_csv(file_dir)
    df_selected = df[selected_columns]
    X = df_selected.values
    Y = df['Label'].values
    return X,Y

In [4]:
train_data = 'data/train_32.csv'
test_data = 'data/test_32.csv'

# four_features_data = ['B02', 'B03', 'B04', 'B08']
# adhoc_dimension = 4
eight_features_data = ['B02', 'B03', 'B04', 'B08', 'NDVI', 'EVI', 'SAVI', 'NDWI']
adhoc_dimension = 8


In [5]:
train_features, train_labels = read_csv(train_data, eight_features_data)

test_features, test_labels = read_csv(test_data, eight_features_data)

# scaler = StandardScaler().fit(train_features)
# X_train_scaled = scaler.transform(train_features)
# X_test_scaled = scaler.transform(test_features)

In [6]:
from sklearn.svm import SVC

svc_classifier = SVC(kernel='linear')
svc_classifier.fit(train_features, train_labels)

train_score = svc_classifier.score(train_features, train_labels)
test_score = svc_classifier.score(test_features, test_labels)


print(f"Precomputed classical kernel classification test score: {test_score}")



Precomputed classical kernel classification test score: 0.5419921875


In [7]:
scaler = StandardScaler().fit(train_features)
train_features_scaled = scaler.transform(train_features)
test_features_scaled = scaler.transform(test_features)

In [8]:
from qiskit.circuit.library import ZZFeatureMap, ZFeatureMap
from qiskit.primitives import StatevectorSampler as Sampler
from qiskit_machine_learning.state_fidelities import ComputeUncompute
from qiskit_machine_learning.kernels import FidelityQuantumKernel, FidelityStatevectorKernel

adhoc_feature_map = ZZFeatureMap(feature_dimension=adhoc_dimension, reps=1, entanglement="full")
#adhoc_feature_map = ZFeatureMap(feature_dimension=adhoc_dimension, reps=2)

#sampler = Sampler()

#fidelity = ComputeUncompute(sampler=sampler)

# adhoc_kernel = FidelityQuantumKernel(fidelity=fidelity, feature_map=adhoc_feature_map)
adhoc_kernel = FidelityStatevectorKernel(feature_map=adhoc_feature_map)

adhoc_kernel.feature_map.decompose().draw()


In [9]:
adhoc_matrix_train = adhoc_kernel.evaluate(x_vec=train_features)
adhoc_matrix_test = adhoc_kernel.evaluate(x_vec=test_features, y_vec=train_features)

In [10]:
adhoc_svc = SVC(kernel="precomputed")

adhoc_svc.fit(adhoc_matrix_train, train_labels)

adhoc_score_precomputed_kernel = adhoc_svc.score(adhoc_matrix_test, test_labels)

print(f"Precomputed quantum kernel classification test score: {adhoc_score_precomputed_kernel}")

Precomputed quantum kernel classification test score: 0.330078125


In [11]:
matrix_train = adhoc_kernel.evaluate(x_vec=train_features_scaled)
adhoc_matrix_test = adhoc_kernel.evaluate(x_vec=test_features_scaled, y_vec=train_features_scaled)

In [12]:
adhoc_svc = SVC(kernel="precomputed")

adhoc_svc.fit(adhoc_matrix_train, train_labels)

adhoc_score_precomputed_kernel = adhoc_svc.score(adhoc_matrix_test, test_labels)

print(f"Precomputed quantum kernel classification test score: {adhoc_score_precomputed_kernel}")

Precomputed quantum kernel classification test score: 0.4990234375


In [13]:
print(f'min: {train_features.min()}, max: {train_features.max()}')
print(f'min: {train_features_scaled.min()}, max: {train_features_scaled.max()}')

min: -29.769021739130437, max: 5064.0
min: -18.1594339837482, max: 16.512914583221786


In [14]:
svc_classifier = SVC(kernel='linear')
svc_classifier.fit(train_features_scaled, train_labels)

train_score = svc_classifier.score(train_features_scaled, train_labels)
test_score = svc_classifier.score(test_features_scaled, test_labels)


print(f"Precomputed classical kernel classification test score: {test_score}")

Precomputed classical kernel classification test score: 0.5390625


## Exercise

Use the dataset with eight features and repeat the experiments above.

## Conclusions

Discuss real-world results and experience.

## References

- https://qiskit-community.github.io/qiskit-machine-learning/tutorials/03_quantum_kernel.html
- https://pennylane.ai/qml/demos/tutorial_kernel_based_training

## Acknowledgements

We extend our gratitude to the Irish Centre for High-End Computing (ICHEC) and University of Galway for providing computing and for all-encompassing invaluable support. This project was funded by the EuroHPC JU under grant agreement No 951732 and Ireland.

<div>
  <img src="images/ICHEC.png" width="250" style="display: inline-block; margin-right: 30px;">
  <img src="images/UoG_.png" width="235" style="display: inline-block; margin-right: 30px;">
  <img src="images/EuroCC-Ireland.png" width="80" style="display: inline-block; margin-right: 30px;">
  <img src="images/EU-flag-Horizon-Europe.jpg" width="145" style="display: inline-block;">
</div>