# Quantum Kernels


<br>
<br>
<span style="color: red; font-weight: bold;">
Please replace the "?"-signs by real code
</span>
<br>
In this tutorial we apply a quantum kernel method to calculate a single kernel matrix entry for data with many features, using a real quantum computer. We do not estimate an entire kernel matrix for a large dataset, in order to conserve time on IBM quantum computers.



In [None]:
# If you have not already, install scikit learn
#!pip install scikit-learn

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [None]:
from qiskit.circuit.library import unitary_overlap

## Scaling to more features and qubits

In this section, we will repeat the calculation of a single matrix element, but for a much larger number of features, sketching the path to scale toward utility. The restriction to a single matrix element is done so that the process can be shown without using up too much of your allotted time on quantum computers.

### Step 1: Map classical inputs to a quantum problem

We will assume a starting point of a dataset in which each data point has 42 features. As in the first example, we will calculate a single kernel matrix element, requiring two data points. 


<span style="color: blue; font-weight: bold;">
The two points below have 42 features and a single category variable ($\pm 1$).
</span>



In [None]:
# Two mock data points, including category labels, as in training

large_data = [
    [
        -0.028,
        -1.49,
        -1.698,
        0.107,
        -1.536,
        -1.538,
        -1.356,
        -1.514,
        -0.109,
        -1.8,
        -0.122,
        -1.651,
        -1.955,
        -0.123,
        -1.732,
        0.091,
        -0.048,
        -0.128,
        -0.026,
        0.082,
        -1.263,
        0.065,
        0.004,
        -0.055,
        -0.08,
        -0.173,
        -1.734,
        -0.39,
        -1.451,
        0.078,
        -1.578,
        -0.025,
        -0.184,
        -0.119,
        -1.336,
        0.055,
        -0.204,
        -1.578,
        0.132,
        -0.121,
        -1.599,
        -0.187,
        -1,
    ],
    [
        -1.414,
        -1.439,
        -1.606,
        0.246,
        -1.673,
        0.002,
        -1.317,
        -1.262,
        -0.178,
        -1.814,
        0.013,
        -1.619,
        -1.86,
        -0.25,
        -0.212,
        -0.214,
        -0.033,
        0.071,
        -0.11,
        -1.607,
        0.441,
        -0.143,
        -0.009,
        -1.655,
        -1.579,
        0.381,
        -1.86,
        -0.079,
        -0.088,
        -0.058,
        -1.481,
        -0.064,
        -0.065,
        -1.507,
        0.177,
        -0.131,
        -0.153,
        0.07,
        -1.627,
        0.593,
        -1.547,
        -0.16,
        -1,
    ],
]
train_data = [large_data[0][:-1], large_data[1][:-1]]

Recall that the `zz_feature_map` produced rather deep circuits in the case of relatively few features (14 features). As we increase the number of features, we need to closely monitor circuit depth. To illustrate this, we will first try using the `zz_feature_map` and check the depth of the resulting circuit.



In [None]:
from qiskit.circuit.library import zz_feature_map

fm = ?(
    feature_dimension=np.shape(train_data)[1], entanglement="linear", reps=1
)

unitary1 = ?.assign_parameters(train_data[0])
unitary2 = ?.assign_parameters(train_data[1])

In [None]:
from qiskit.circuit.library import unitary_overlap


overlap_circ = unitary_overlap(?, ?)
overlap_circ.measure_all()

print("circuit depth = ", overlap_circ.decompose(reps=2).depth())
print(
    "two-qubit depth",
    overlap_circ.decompose().depth(lambda instr: len(instr.qubits) > 1),
)
# overlap_circ.draw("mpl", scale=0.6, style="iqp")

As described before, determining exactly how deep is too deep is nuanced. But a two-qubit depth of more than 100, even before transpilation, is a non-starter.

<span style="color: blue; font-weight: bold;">
This is why custom feature maps have been emphasized throughout this lesson. If you know something about the structure of your entire dataset, you should design an entanglement map with that structure in mind. 
</span>
    
Here, since we are only calculating the inner product between two such data points, we have prioritized low circuit depth over any detailed consideration of data structure.
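
To make "low circuit depth" concrete, the following minimal sketch estimates how many layers a list of two-qubit gates needs when gates acting on disjoint qubits are counted in the same layer. The function name `two_qubit_depth` and both pair lists are hypothetical, chosen only for illustration; this is a plain-Python estimate, not a Qiskit call.

```python
def two_qubit_depth(pairs):
    """Greedy layering: a gate lands one layer after the latest gate on either qubit."""
    last_layer = {}  # qubit index -> layer of the most recent gate on that qubit
    for a, b in pairs:
        layer = max(last_layer.get(a, 0), last_layer.get(b, 0)) + 1
        last_layer[a] = layer
        last_layer[b] = layer
    return max(last_layer.values(), default=0)


# Hypothetical entanglement patterns on six qubits, five gates each
linear_chain = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
brickwork = [(0, 1), (2, 3), (4, 5), (1, 2), (3, 4)]

chain_depth = two_qubit_depth(linear_chain)
brick_depth = two_qubit_depth(brickwork)
print("linear chain depth:", chain_depth)
print("brickwork depth:", brick_depth)
```

Both patterns use the same number of gates, but the second packs them into fewer layers because gates that share no qubit can run side by side.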



In [None]:
from qiskit.circuit import Parameter, ParameterVector, QuantumCircuit

# Prepare feature map for computing overlap

entangler_map = [
    [3, 4],
    [2, 5],
    [1, 4],
    [2, 3],
    [4, 6],
    [7, 9],
    [10, 11],
    [9, 12],
    [8, 11],
    [9, 10],
    [11, 13],
    [14, 16],
    [17, 18],
    [16, 19],
    [15, 18],
    [16, 17],
    [18, 20],
]

In [None]:
# Use the entangler map above to build a feature map

num_features = np.shape(train_data)[1]
num_qubits = int(num_features / 2)

fm = QuantumCircuit(num_qubits)
training_param = Parameter("θ")
feature_params = ParameterVector("x", num_qubits * 2)
fm.ry(training_param, fm.qubits)
for cz in ?:
    fm.cz(cz[0], cz[1])
for i in range(?):
    fm.rz(-2 * feature_params[2 * i + 1], i)
    fm.rx(-2 * feature_params[2 * i], i)

In [None]:
fm.draw('mpl')

In [None]:
from qiskit.circuit.library import unitary_overlap

# Assign features of each data point to a unitary, an instance of the general feature map.

unitary1 = fm.assign_parameters(list(train_data[0]) + [np.pi / 2])
unitary2 = fm.assign_parameters(list(train_data[1]) + [np.pi / 2])

# Create the overlap circuit

overlap_circ = unitary_overlap(?, ?)
?.measure_all()

We won't bother checking the depths yet, since what really matters is the transpiled two-qubit depth.



### Step 2: Optimize problem for quantum execution

We start by selecting the least busy backend, then optimize our circuit for running on that backend.



In [None]:
# Import needed packages
from qiskit.transpiler.preset_passmanagers import generate_preset_pass_manager
from qiskit_ibm_runtime import QiskitRuntimeService

# Get the least busy backend
service = ?()
backend = ?.least_busy(
    operational=True, simulator=False, min_num_qubits=fm.num_qubits
)
print(?)

On small-scale jobs, a preset pass manager will often return the same circuit with the same depth, reliably. But in very large, complex circuits the pass manager can return different transpiled circuits each time it runs. This is because it is using heuristics, and because very large circuits will have a complicated landscape of possible optimizations. It is often useful to transpile a few times and take the shallowest circuit. This only introduces classical overhead and may substantially improve the results from the quantum computer.

Here, we transpile the unitary overlap circuit 20 times, and look at the depths of the circuits obtained.
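
Once several transpilations are available, picking the shallowest one is a short selection step. The sketch below uses made-up depth values in place of real transpiler output, and the names `candidate_depths` and `best_index` are hypothetical. Ranking by two-qubit depth first and total depth second is one reasonable choice, not the only one.

```python
# Hypothetical (total depth, two-qubit depth) pairs, one per transpilation attempt
candidate_depths = [(61, 14), (58, 12), (64, 12), (59, 13)]

# Rank by two-qubit depth first, then by total depth to break ties
best_index = min(
    range(len(candidate_depths)),
    key=lambda i: (candidate_depths[i][1], candidate_depths[i][0]),
)
print("shallowest attempt:", best_index, candidate_depths[best_index])
```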



In [None]:
# Apply level 3 optimization to our overlap circuit
transpiled_qcs = []
transpiled_depths = []
transpiled_twoqubit_depths = []
for i in range(1, ?):
    pm = generate_preset_pass_manager(optimization_level=3, backend=backend)
    overlap_ibm = pm.run(overlap_circ)
    transpiled_qcs.append(?)
    transpiled_depths.append(?.decompose().depth())
    transpiled_twoqubit_depths.append(
        overlap_ibm.decompose().depth(lambda instr: len(instr.qubits) > 1)
    )

print("circuit depth = ", overlap_ibm.decompose().depth())

In [None]:
print(transpiled_depths)
print(transpiled_twoqubit_depths)

Is there variation in the total gate depth across the transpilation runs?
<br>
We will use `transpiled_qcs[1]`.
<br>
What is its depth?

In [None]:
overlap_ibm = transpiled_qcs[1]

### Step 3: Execute using Qiskit Runtime Primitives

As we scale closer to utility, simulators will not be useful. Only the syntax for real quantum computers is shown here.



<span style="color: blue; font-weight: bold;">
We apply Dynamical Decoupling and Twirling.
</span>


In [None]:
# Import runtime primitive
from qiskit_ibm_runtime import SamplerV2 as Sampler

# Pass the backend directly (job mode) instead of opening a Session.
# `backend` is the device selected in Step 2.
num_shots = 10000

# Create the sampler directly with backend
sampler = Sampler(mode=backend)

# Access options (if supported by your backend/plan)
options = sampler.options
?.dynamical_decoupling.enable = True
?.twirling.enable_gates = True

# Run and get counts
result = sampler.run([?], shots=?).result()
counts = ?[0].data.meas.get_int_counts()

print(counts)


### Step 4: Post-process, return result in classical format

As described in the introduction, the most useful measurement here is the probability of measuring the all-zero state $|00\ldots0\rangle$ on all 21 qubits.
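
A measured probability is only an estimate, so it helps to attach a shot-noise uncertainty to it. The sketch below is a minimal illustration with hypothetical integer-keyed counts (`mock_counts` is not real hardware output); it uses the binomial standard error and ignores gate and readout errors.

```python
import math

# Hypothetical integer-keyed counts, standing in for sampler output
mock_counts = {0: 7200, 1: 900, 4: 650, 2048: 1250}

total_shots = sum(mock_counts.values())
p_zero = mock_counts.get(0, 0) / total_shots

# Binomial standard error of the estimated probability
std_err = math.sqrt(p_zero * (1 - p_zero) / total_shots)
print(f"P(all zeros) = {p_zero:.3f} +/- {std_err:.3f}")
```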



In [None]:
counts.get(0, 0.0) / ?

This process for the single kernel matrix element could be repeated between other data pairs in your set to obtain the full kernel matrix. The dimension of the kernel matrix is dictated by the number of points in your training data, not the number of features. So the computing cost of manipulating the kernel matrix into a predictive model does not scale with the number of features or qubits. Even for relatively small datasets with large numbers of features, the data would still need to be matched to a feature map that yields effective classification.
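
The bookkeeping for a full kernel matrix can be sketched as follows. This is an illustration under stated assumptions: `mock_kernel_entry` is a hypothetical classical stand-in for one quantum kernel estimate, the three data points are made up, and the diagonal is set to 1 as it would be for an ideal, noiseless overlap of a point with itself.

```python
import math


def mock_kernel_entry(x, y):
    """Hypothetical classical stand-in for one quantum kernel estimate."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist)


def kernel_matrix(points, entry_fn):
    """Fill a symmetric matrix, evaluating each unordered pair once."""
    n = len(points)
    matrix = [[1.0] * n for _ in range(n)]  # ideal self-overlap on the diagonal
    evaluations = 0
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = entry_fn(points[i], points[j])
            evaluations += 1
    return matrix, evaluations


# Three hypothetical points with four features each
points = [[0.1, 0.2, 0.3, 0.4], [0.0, 0.2, 0.1, 0.4], [0.5, 0.1, 0.3, 0.2]]
K, n_evals = kernel_matrix(points, mock_kernel_entry)
print(len(K), "x", len(K[0]), "matrix from", n_evals, "pair evaluations")
```

The matrix size depends only on the number of points, and the number of pair evaluations grows as $n(n-1)/2$ for $n$ points, regardless of how many features each point has.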

### Scaling and future work

The kernel method requires that we measure the probability of the all-zero state as accurately as possible. But gate errors and readout errors mean that there is some non-zero probability $p$ that any given qubit will be erroneously measured in the $|1\rangle$ state. Even with the oversimplification that the probability of the all-zero state should be $100\%$, for many features encoded on, say, $N$ qubits, the probability of correctly measuring all qubits in $|0\rangle$ is reduced to $(1-p)^N$. As $N$ becomes large, this method becomes less and less reliable. Overcoming this difficulty and scaling kernel estimation to more and more features is an area of current research. To learn more about this issue, see this work by [Thanasilp, Wang, Cerezo, and Holmes](https://www.nature.com/articles/s41467-024-49287-w). We recommend you explore what can be done with current quantum computers, and also look forward to what will be possible in the era of error correction.
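
The $(1-p)^N$ scaling is easy to tabulate. In the sketch below, the per-qubit error probability `p = 0.01` and the qubit counts are hypothetical values chosen only to show the trend, and the helper name `survival_probability` is an assumption for illustration.

```python
def survival_probability(p, n_qubits):
    """Probability that none of n_qubits independent qubits is misread as 1."""
    return (1 - p) ** n_qubits


# Hypothetical per-qubit error probability, chosen only for illustration
p = 0.01

for n_qubits in (5, 21, 42, 100):
    print(f"N = {n_qubits:3d}: (1 - p)^N = {survival_probability(p, n_qubits):.3f}")
```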



### Review

Calculating a quantum kernel involves

*   calculating kernel matrix entries, using pairs of training data points
*   encoding the data and mapping it via a feature mapping
*   optimizing your circuit for running on real quantum computers / backends

The quantum kernel can then be used in classical machine learning algorithms, as in this notebook.

<span style="color: blue; font-weight: bold;">    
Some key things to keep in mind when using quantum kernels include:
</span>

*   Is the dataset likely to benefit from quantum kernel methods?
*   Try different feature maps and entanglement schemes.
*   Is the circuit depth acceptable?
*   Try running a pass manager multiple times and use the smallest-depth circuit you can get.

Quantum kernel methods are potentially powerful tools given a proper match between datasets with quantum-amenable features, and a suitable quantum feature map. To better understand where quantum kernels are likely to be useful, we recommend reading [Liu, Arunachalam & Temme (2021)](https://www.nature.com/articles/s41567-021-01287-z).



# END OF NOTEBOOK