# Quantum Variational Circuits & Quantum Neural Networks


<span style="color: red; font-weight: bold;">
Please replace the "?" placeholders with real code!
</span>
<br>
<br>

<span style="color: blue; font-weight: bold;">
In this notebook, we implement several variational quantum circuits for a data classification task, so-called variational quantum classifiers (VQCs). 
</span>

At one point, it was common to refer to a subset of VQCs as quantum neural networks (QNNs) in analogy with classical neural networks. Indeed, there are cases where structures borrowed from classical neural networks, such as convolution layers, play an important role in VQCs. In such cases where the analogy is strong, QNNs may be a useful description. But parameterized quantum circuits need not follow the general structure of a neural network; for example, not all data need to be loaded in the first (input) layer; we can load some data in the first layer, apply some gates and then load additional data (a process called data "reuploading"). We should therefore think of QNNs as a subset of parameterized quantum circuits, and we should not be limited in our exploration of useful quantum circuits by the analogy to classical neural networks.

<span style="color: blue; font-weight: bold;">The dataset addressed in this notebook consists of images containing horizontal and vertical stripes, and our goal is to classify unseen images into one of the two categories depending on the orientation of their line. We will accomplish this with a VQC. As we go, we will address ways in which the calculation can be improved and scaled. 
</span>

The dataset here is exceptionally easy to classify classically. It has been chosen for its simplicity so we can focus on the quantum part of this problem, and look at how a dataset attribute might translate to a part of a quantum circuit. It is not reasonable to expect a quantum speed-up for such simple cases where classical algorithms are so efficient.

By the end of this notebook you should be able to:

*   Load data from an image into a quantum circuit
*   Construct an ansatz for a VQC (or QNN), and adjust it to fit your problem
*   Train your VQC/QNN and use it to make accurate predictions on test data
*   Scale the problem, and recognize limits of current quantum computers.



Research Paper: 
[https://arxiv.org/abs/2405.00781](https://arxiv.org/abs/2405.00781)

## High Level Phases

<span style="color: blue; font-weight: bold;">

- Image Classification, 16 pixels --> low accuracy score of 60%

- Enhancement: more CNOT gates for horizontal pixels --> high accuracy score of 100%

- Extension to 36 pixels with fewer iterations
</span>

## Preparations

In [None]:
!pip install qiskit[visualization]

In [None]:
!pip install qiskit-machine-learning

In [None]:
%pip install scikit-learn

In [None]:
%pip install qiskit_aer

## Data generation

We will start by constructing the data. Data sets are often not explicitly generated as part of the Qiskit patterns framework, but data type and preparation are critical to successfully applying quantum computing to machine learning. The code below defines a data set of images with set pixel dimensions. <span style="color: blue; font-weight: bold;">A line of `line_size` adjacent pixels within one row or column of the image is assigned the value $\pi/2$, and the remaining pixels are assigned random values on the interval $(0,\pi/4)$. The random values are noise in our data. </span>Glance through the code to make sure you understand how the images are generated. Later on we will scale up the images.
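To make the pixel values concrete, here is a minimal sketch that builds a single flattened image with a horizontal line segment, using only the standard library. The helper `make_horizontal_image` and its arguments are hypothetical names chosen for illustration; the notebook's own `generate_dataset` function is organized differently.

```python
import math
import random


def make_horizontal_image(num_rows, num_cols, row, start_col, line_size, rng):
    """Return a flat, row-major image with a horizontal segment at pi/2 and noise elsewhere."""
    image = []
    for r in range(num_rows):
        for c in range(num_cols):
            on_line = r == row and start_col <= c < start_col + line_size
            if on_line:
                image.append(math.pi / 2)
            else:
                # Noise: a random value in [0, pi/4)
                image.append(rng.random() * math.pi / 4)
    return image


# A 2 x 4 image with a two-pixel line in the top row, starting at column 1
image = make_horizontal_image(2, 4, row=0, start_col=1, line_size=2, rng=random.Random(0))
print([round(value, 2) for value in image])
```

Pixels 1 and 2 carry the line value, and every other pixel stays below $\pi/4$, so line and noise are clearly separated.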



In [None]:
# This code defines the images to be classified:

import numpy as np

# Total number of "pixels"/qubits
size = 8
# One dimension of the image (called vertical, but it doesn't matter). Must be a divisor of `size`
vert_size = 2
# The length of the line to be detected (yellow). Must be less than or equal to the smallest dimension of the image (<= min(vert_size, size / vert_size))
line_size = 2


def generate_dataset(num_images):
    images = []
    labels = []
    hor_array = np.zeros((size - (line_size - 1) * vert_size, size))
    ver_array = np.zeros((round(size / vert_size) * (vert_size - line_size + 1), size))

    j = 0
    for i in range(0, size - 1):
        if i % (size / vert_size) <= (size / vert_size) - line_size:
            for p in range(0, line_size):
                hor_array[j][i + p] = ? / 2
            j += 1

    # Make two adjacent entries pi/2, then move down to the next row. Careful to avoid the "pixels" at size/vert_size - linesize, because we want to fold this list into a grid.

    j = 0
    for i in range(0, round(size / vert_size) * (vert_size - line_size + 1)):
        for p in range(0, line_size):
            ver_array[j][i + p * round(size / vert_size)] = ? / 2
        j += 1

    # Make entries pi/2, spaced by the length/rows, so that when folded, the entries appear on top of each other.

    for n in range(num_images):
        rng = np.random.randint(0, 2)
        if rng == 0:
            labels.append(-1)
            random_image = np.random.randint(0, len(hor_array))
            images.append(np.array(hor_array[random_image]))

        elif rng == 1:
            labels.append(1)
            random_image = np.random.randint(0, len(ver_array))
            images.append(np.array(ver_array[random_image]))
            # Randomly select 0 or 1 for a horizontal or vertical array, assign the corresponding label.

        # Create noise
        for i in range(size):
            if images[-1][i] == 0:
                images[-1][i] = np.random.rand() * ? / 4
    return images, labels


hor_size = round(size / vert_size)

<span style="color: blue; font-weight: bold;">
Note that the code above has also generated labels indicating whether the images contain a vertical (+1) or horizontal (-1) line.</span> We will now use sklearn to split a data set of 200 images into a training and testing set (along with their corresponding labels). Here, we use $70\%$ of the data set for training, with the remaining $30\%$ withheld for testing.
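As a sanity check on the split sizes, here is a minimal sketch of a 70/30 shuffle split written with only the standard library. The helper `split_indices` is a hypothetical stand-in for illustration and is not scikit-learn's implementation; it will not reproduce the exact same shuffle.

```python
import random


def split_indices(num_items, test_fraction, seed):
    """Shuffle item indices and split them into train and test index lists."""
    indices = list(range(num_items))
    random.Random(seed).shuffle(indices)
    num_test = round(num_items * test_fraction)
    # The first `num_test` shuffled indices are held out for testing
    return indices[num_test:], indices[:num_test]


train_idx, test_idx = split_indices(200, 0.3, seed=246)
print(len(train_idx), len(test_idx))
```

With the 200 images generated in the next cell, this gives 140 training samples and 60 test samples.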



In [None]:
from sklearn.model_selection import train_test_split

np.random.seed(42)
images, labels = generate_dataset(200)

train_images, test_images, train_labels, test_labels = ?(
    ?, ?, test_size=???, random_state=246
)

<span style="color: blue; font-weight: bold;">
Let's plot a few elements of our data set to see what these lines look like:
</span>


In [None]:
import matplotlib.pyplot as plt

# Make subplot titles so we can identify categories
titles = []
for i in range(8):
    title = "category: " + str(train_labels[i])
    titles.append(title)

# Generate a figure with nested images using subplots.
fig, ax = plt.subplots(4, 2, figsize=(10, 6), subplot_kw={"xticks": [], "yticks": []})

for i in range(8):
    ax[i // 2, i % 2].imshow(
        train_images[i].reshape(vert_size, hor_size),
        aspect="equal",
    )
    ax[i // 2, i % 2].set_title(titles[i])
plt.subplots_adjust(wspace=0.1, hspace=0.3)

Each of these images is still paired with its label in `train_labels` in a simple list form:



In [None]:
print(train_labels[:8])

## Variational quantum classifier: a first attempt

### Qiskit patterns step 1: Map the problem to a quantum circuit

The goal is to find a function $f$ with parameters $\theta$ that maps a data vector / image $\vec{x}$ to the correct category: $f_\theta(\vec{x}) \rightarrow \pm1$. This will be accomplished using a VQC with a few layers that can be identified by their distinct purposes:

$$
f_\theta(\vec{x}) = \langle 0|U^{\dagger}(\vec{x})W^\dagger(\theta)OW(\theta)U(\vec{x})|0\rangle
$$

<span style="color: blue; font-weight: bold;">
Here, $U(\vec{x})$ is the encoding circuit, for which we have many options as seen in the course. $W(\theta)$ is a variational, or trainable circuit block, and $\theta$ is the set of parameters to be trained. Those parameters will be varied by classical optimization algorithms to find the set of parameters that yields the best classification of images by the quantum circuit. This variational circuit is sometimes called the "ansatz". Finally, $O$ is some observable that will be estimated using the Estimator primitive. </span>There is no constraint that forces the layers to come in this order, or even to be fully separate. One could have multiple variational and/or encoding layers in any order that is technically motivated.

<span style="color: blue; font-weight: bold;">
We start by choosing a feature map to encode our data. We will use the `z_feature_map`, as it keeps circuit depths low compared to some other feature mappings.
</span>
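To build some intuition for what this encoding does to a single pixel, here is a small single-qubit simulation using only the standard library. It assumes that each qubit receives a Hadamard gate followed by a phase gate $P(2x)$, and that this pair is repeated twice; we believe this matches the default Z feature map in Qiskit, but treat it as an assumption and check the decomposed circuit drawing. The function name is hypothetical.

```python
import cmath
import math


def z_expectation_after_encoding(x, reps=2):
    """Simulate (H, then P(2x)) repeated `reps` times on one qubit starting in |0>."""
    a0, a1 = 1 + 0j, 0 + 0j
    for _ in range(reps):
        # Hadamard gate
        a0, a1 = (a0 + a1) / math.sqrt(2), (a0 - a1) / math.sqrt(2)
        # Phase gate P(2x): multiplies the |1> amplitude by exp(i * 2x)
        a1 *= cmath.exp(2j * x)
    # <Z> = P(0) - P(1)
    return abs(a0) ** 2 - abs(a1) ** 2


line_pixel = z_expectation_after_encoding(math.pi / 2)
noise_pixel = z_expectation_after_encoding(math.pi / 8)
print(line_pixel, noise_pixel)
```

Under these assumptions the encoded qubit has $\langle Z \rangle = \cos(2x)$, so a line pixel ($x = \pi/2$) sits at $-1$ while noise pixels in $(0, \pi/4)$ stay positive.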


In [None]:
from qiskit.circuit.library import z_feature_map

# One qubit per data feature
num_qubits = len(train_images[0])

# Data encoding
# Note that qiskit orders parameters alphabetically. We assign the parameter prefix "a" to ensure our data encoding goes to the first part of the circuit, the feature mapping.
feature_map = ?(?, parameter_prefix="a")

In [None]:
?.draw('mpl')

In [None]:
?.decompose().draw("mpl", style="clifford", fold=-1)

In [None]:
print(num_qubits)

In [None]:
print(size)

We must now decide on an ansatz to be trained. There are many considerations when selecting an ansatz. A complete description is beyond the scope of this introduction; here we simply point out a few categories of considerations.

1.  **Hardware:** All modern quantum computers are more prone to errors and more susceptible to noise than their classical counterparts. Using an ansatz that is excessively deep (especially in transpiled, two-qubit depth) will not produce good results. A related issue is that quantum computers have some qubit layout, meaning that some physical qubits are adjacent on the quantum computer, and others may be very far from each other. Entangling adjacent qubits does not increase the depth by too much, but entangling very distant qubits can increase depth substantially, as we must insert swap gates to move information onto qubits that are adjacent in order for them to be entangled.
2.  **The problem:** Whenever you have some information about your problem that could guide your ansatz, make use of it. For example, the data in this lesson is made up of images of horizontal and vertical lines. One could consider what correlation between adjacent colors/values identifies an image of a horizontal or vertical line. What attributes of an ansatz would correspond to this correlation between adjacent pixels? We will revisit this point more technically later in this lesson. But for now, let us simply say that including entanglement and CNOT gates between qubits corresponding to adjacent pixels seems like a good idea. In the bigger picture, consider whether the problem is actually best solved using a quantum circuit, or whether classical algorithms might exist that can do as good a job.
3.  **Number of parameters:** Each independently parameterized quantum gate in the circuit increases the space to be classically optimized, and this results in slower convergence. But as problems scale up, one may encounter *barren plateaus*. This term refers to a phenomenon where the optimization landscape of a variational quantum algorithm becomes exponentially flat and featureless as the problem size increases. This causes vanishing gradients, making it difficult to effectively train the algorithm[\[1\]](#references). Barren plateaus are relevant to variational quantum algorithms like VQCs/QNNs. It should be noted that the increasing number of parameters is not the only consideration in avoiding barren plateaus; other considerations include global cost functions and random parameter initialization.


In this notebook we will see a few simple examples of good practices in ansatz construction. 

<span style="color: blue; font-weight: bold;">
Let us first try the ansatz below. We will return to revise it, later.
</span>

In [None]:
# Import the necessary packages
from qiskit import QuantumCircuit
from qiskit.circuit import ParameterVector

# Initialize the circuit using the same number of qubits as the image has pixels
qnn_circuit = QuantumCircuit(size)

# We choose to have two variational parameters for each qubit.
params = ParameterVector("θ", length=2 * size)

# A first variational layer for all qubits:
for i in range(?):
    qnn_circuit.ry(params[i], i)

# Here is a list of qubit pairs between which we want CNOT gates. The choice of these is not yet obvious.
qnn_cnot_list = [[0, 1], [1, 2], [2, 3]]

for i in range(len(?)):
    qnn_circuit.?(qnn_cnot_list[i][0], qnn_cnot_list[i][1])

# The second variational layer for all qubits:
for i in range(?):
    qnn_circuit.rx(params[size + i], i)

# Check the circuit depth, and the two-qubit gate depth
print(qnn_circuit.decompose().depth())
print(
    f"2+ qubit depth: {qnn_circuit.decompose().depth(lambda instr: len(instr.qubits) > 1)}"
)

# Draw the circuit
?.draw("mpl")

With the data encoding and variational circuit prepared, we can combine them to form our full ansatz. <span style="color: blue; font-weight: bold;">
In this case, the components of our quantum circuit are quite analogous to those in neural networks, with $U(\vec{x})$ being most similar to the layer that loads input values from the image, and $W(\theta)$ being like the layer of variable "weights". </span>Since this analogy holds in this case, we are adopting "qnn" in some of our naming conventions; but this analogy should not be limiting in your exploration of VQCs.





In [None]:
# QNN ansatz
ansatz = qnn_circuit

# Combine the feature map with the ansatz
full_circuit = QuantumCircuit(num_qubits)
full_circuit.compose(?, range(num_qubits), inplace=True)
full_circuit.compose(?, range(num_qubits), inplace=True)

# Display the circuit
?.decompose().?("mpl", style="clifford", fold=-1)

<span style="color: blue; font-weight: bold;">
We must now define an observable, so we can use it in our cost function. We will obtain an expectation value for this observable using Estimator. If we have selected a good, problem-motivated ansatz, then each qubit will contain information relevant to classification.</span> One can add layers to combine information onto fewer qubits (called a *convolutional layer*), such that measurements are only needed on a subset of the qubits in the circuit (as in convolutional neural networks). Or one can measure some attribute from each qubit. <span style="color: blue; font-weight: bold;">Here we will opt for the latter, so we include a `Z` operator for each qubit.</span> There is nothing unique about choosing $Z$, but it is well motivated:

*   This is a binary classification task, and a measurement of $Z$ can yield two possible outcomes.
*   The eigenvalues of $Z$ ($\pm 1$) are reasonably well separated, and result in an estimator outcome in the interval \[-1, +1], where 0 can simply be used as a cutoff value.
*   It is straightforward to measure in the Pauli-Z basis with no extra gate overhead.

So, Z is a very natural choice.
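As an illustration of what this expectation value means, the sketch below computes the expectation of a `Z` operator on every qubit from a distribution over measured bitstrings: each bitstring contributes $+1$ if it contains an even number of ones and $-1$ otherwise. The function name and the example distribution are made up for illustration.

```python
def all_z_expectation(probabilities):
    """Expectation of Z applied to every qubit, given {bitstring: probability}."""
    total = 0.0
    for bitstring, prob in probabilities.items():
        # Each qubit measured as 1 contributes a factor of -1
        sign = 1 if bitstring.count("1") % 2 == 0 else -1
        total += sign * prob
    return total


# Hypothetical measurement distribution on four qubits
probs = {"0000": 0.1, "0001": 0.7, "0111": 0.2}
value = all_z_expectation(probs)
print(value)
```

The result always lies between $-1$ and $+1$, which is why 0 works as a cutoff between the two labels.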



In [None]:
from qiskit.quantum_info import SparsePauliOp

observable = SparsePauliOp.from_list([("?" * (num_qubits), 1)])

SparsePauliOp in Qiskit is a compact representation of an operator written as a sum of Pauli strings, optimized for speed and memory efficiency.

We have our quantum circuit and the observable we want to estimate. Now we need a few things in order to run and optimize this circuit. 
<span style="color: blue; font-weight: bold;">First, we need a function to run a forward pass. Note that the function below takes in the `input_params` and `weight_params` separately. The former is the set of static parameters describing the data in an image, and the latter is the set of variable parameters to be optimized.</span>



In [None]:
from qiskit.primitives import BaseEstimatorV2
from qiskit.quantum_info.operators.base_operator import BaseOperator


def forward(
    circuit: QuantumCircuit,
    input_params: np.ndarray,
    weight_params: np.ndarray,
    estimator: BaseEstimatorV2,
    observable: BaseOperator,
) -> np.ndarray:
    """
    Forward pass of the neural network.

    Args:
        circuit: circuit consisting of data loader gates and the neural network ansatz.
        input_params: data encoding parameters.
        weight_params: neural network ansatz parameters.
        estimator: EstimatorV2 primitive.
        observable: a single observable to compute the expectation over.

    Returns:
        expectation_values: an array (for one observable) or a matrix (for a sequence of observables) of expectation values.
        Rows correspond to observables and columns to data samples.
    """
    num_samples = input_params.shape[0]
    weights = np.broadcast_to(weight_params, (num_samples, len(weight_params)))
    params = np.concatenate((input_params, weights), axis=1)
    pub = (circuit, observable, params)
    job = estimator.run([pub])
    result = job.result()[0]
    expectation_values = result.data.evs

    return expectation_values

In Qiskit, a PUB (Primitive Unified Bloc) is the fundamental unit of work used to define vectorized workloads for quantum circuits. A PUB aggregates a circuit with arrays of observables and parameter values, and PUBs are what Qiskit's Sampler and Estimator primitives execute. Each PUB is represented as a tuple, where the first element is the circuit being executed, and subsequent elements provide the observables (for the Estimator) and the parameter values needed for the computation.
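The parameter array passed in the PUB has one row per image: the image's pixel values first, followed by the shared weights. The sketch below mirrors the `np.broadcast_to` and `np.concatenate` steps of `forward` in plain Python; `build_parameter_rows` and the sample numbers are hypothetical.

```python
def build_parameter_rows(image_batch, weights):
    """Pair every image with the same weights: one parameter row per image."""
    return [list(image) + list(weights) for image in image_batch]


# Two tiny "images" with four pixels each, and three shared weights
images = [[1.57, 0.1, 0.2, 1.57], [0.3, 1.57, 1.57, 0.2]]
weights = [0.5, 0.6, 0.7]
rows = build_parameter_rows(images, weights)
for row in rows:
    print(row)
```

This ordering relies on the data parameters sorting before the weight parameters, which is why the feature map uses the prefix "a".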

### Loss function

<span style="color: blue; font-weight: bold;">Next, we need a loss function to calculate the difference between the predicted and true values of the labels. The function will take in the labels predicted by the algorithm and the correct labels and return the mean squared difference. There are many different loss functions; MSE is the one we chose here.</span>



In [None]:
def mse_loss(predict: np.ndarray, target: np.ndarray) -> np.ndarray:
    """
    Mean squared error (MSE).

    predict: predictions from the forward pass of neural network.
    target: true labels.

    output: MSE loss.
    """
    if len(predict.shape) <= 1:
        return ((predict - target) ** 2).mean()
    else:
        raise AssertionError("input should be 1d-array")
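As a quick worked example of the arithmetic, here is the same mean squared error in plain Python applied to two made-up predictions. The function name `mse` and the sample values are for illustration only.

```python
def mse(predictions, targets):
    """Mean of the squared differences between predictions and targets."""
    squared = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return sum(squared) / len(squared)


# Errors of 0.2 and 0.4 give (0.04 + 0.16) / 2 = 0.1
loss = mse([0.8, -0.6], [1, -1])
print(loss)
```

Because the labels are $\pm 1$ and the expectation values lie in $[-1, +1]$, each squared difference is at most 4.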

Let us also define a slightly different loss function that is a <span style="color: blue; font-weight: bold;">function of the variable parameters (weights), for use by the classical optimizer. This function only takes the ansatz parameters as input;</span> other variables for the forward pass and the loss are set as global parameters. <span style="color: blue; font-weight: bold;">
The optimizer will train the model by sampling different weights and attempting to lower the output of the cost/loss function.</span>



In [None]:
def mse_loss_weights(weight_params: np.ndarray) -> np.ndarray:
    """
    Cost function for the optimizer to update the ansatz parameters.

    weight_params: ansatz parameters to be updated by the optimizer.

    output: MSE loss.
    """
    predictions = forward(
        circuit=circuit,
        input_params=input_params,
        weight_params=weight_params,
        estimator=estimator,
        observable=observable,
    )

    cost = ?(?=predictions, ?=target)
    objective_func_vals.append(cost)

    global iter
    if iter % 50 == 0:
        print(f"Iter: {iter}, loss: {cost}")
    iter += 1

    return cost

Above we referred to using a classical optimizer. When we get to searching through weights to <span style="color: blue; font-weight: bold;">
minimize the cost function</span>, we will use the optimizer COBYLA:



In [None]:
from scipy.optimize import minimize

We will set some initial global variables for the cost function.



In [None]:
# Globals
circuit = full_circuit
observables = observable
# input_params = train_images_batch
# target = train_labels_batch
objective_func_vals = []
iter = 0

## Qiskit Patterns Step 2: Optimize problem for quantum execution

We start by selecting a backend for execution. On real hardware, we would typically pick the least-busy backend.
**In this notebook, we use a simulator instead.**

In [None]:
from qiskit.primitives import StatevectorEstimator

# The forward pass estimates expectation values, so the simulator we need is an Estimator
estimator = StatevectorEstimator()

## Qiskit Patterns Step 3: Execute using Qiskit Primitives

### Loop over the dataset in batches and epochs

We first implement the full algorithm using a simulator for cursory debugging and for estimates of error. We can now go over the entire dataset in batches for the desired number of epochs to train our quantum neural network.



An epoch in machine learning is one complete pass of the training dataset through the model.

If your training set has 10,000 samples, then 1 epoch means the model has seen all 10,000 once.
- Epoch = Full pass over the entire dataset
- Training usually needs multiple epochs because one pass is not enough for convergence.
- During each epoch: the model computes predictions, the loss is calculated, gradients are computed, parameters are updated.

Batch: the subset of samples processed before one update. 

Iteration: one update step.
- If batch size = 100 and dataset = 10,000 samples → 100 iterations per epoch.
- Iteration: one batch processed → one parameter update
- Iterations per epoch = dataset size ÷ batch size
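The iteration count can be checked with a one-line helper. It uses the same ceiling-division expression as the training loop in this notebook; the function name `batches_per_epoch` is hypothetical.

```python
def batches_per_epoch(num_samples, batch_size):
    """Number of batches needed to cover the dataset once (ceiling division)."""
    return (num_samples - 1) // batch_size + 1


print(batches_per_epoch(10_000, 100))  # the example above
print(batches_per_epoch(140, 140))     # one batch covers the whole training set
print(batches_per_epoch(141, 140))     # one leftover sample needs a second batch
```

Note that in this notebook each batch triggers a full COBYLA run, so one "iteration" in the sense above contains many optimizer steps.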

<span style="color: blue; font-weight: bold;">
Here we take 1 epoch, a batch size of 140, and a training set size of 140, so the epoch consists of a single batch.
</span>


In [None]:
from qiskit.primitives import StatevectorEstimator as Estimator

batch_size = 140
num_epochs = ?
num_samples = len(train_images)

# Globals
circuit = full_circuit
estimator = Estimator()  # simulator for debugging
observables = observable
objective_func_vals = []
iter = 0

# Random initial weights for the ansatz
np.random.seed(42)
weight_params = np.random.rand(len(ansatz.parameters)) * 2 * np.pi

for epoch in range(num_epochs):
    for i in range((num_samples - 1) // batch_size + 1):
        print(f"Epoch: {epoch}, batch: {i}")
        start_i = i * batch_size
        end_i = start_i + batch_size
        train_images_batch = np.array(train_images[start_i:end_i])
        train_labels_batch = np.array(train_labels[start_i:end_i])
        input_params = train_images_batch
        target = train_labels_batch
        iter = 0
        res = minimize(
            mse_loss_weights, weight_params, method="COBYLA", options={"maxiter": 100}
        )
        weight_params = res["x"]

## Qiskit Patterns Step 4: Post-process, return result in classical format

### Testing and accuracy

We now interpret the results from training. 

<span style="color: blue; font-weight: bold;">
We first test the training accuracy over the training set.
</span>
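The post-processing amounts to two steps: threshold each expectation value at 0 to get a label, then count how many labels match. Here is a minimal plain-Python sketch of both steps; `to_labels`, `accuracy`, and the sample values are hypothetical and stand in for the NumPy masking and scikit-learn call used in the cells below.

```python
def to_labels(expectations):
    """Map expectation values to labels: +1 if the value is >= 0, otherwise -1."""
    return [1 if value >= 0 else -1 for value in expectations]


def accuracy(predicted, true):
    """Fraction of predicted labels that match the true labels."""
    matches = sum(1 for p, t in zip(predicted, true) if p == t)
    return matches / len(true)


predicted = to_labels([0.4, -0.2, 0.0, -0.9, 0.3])
score = accuracy(predicted, [1, -1, -1, -1, -1])
print(predicted, score)
```

A value of exactly 0 is assigned to the $+1$ class here, matching the `>= 0` convention used in the notebook.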



In [None]:
import copy
from sklearn.metrics import accuracy_score
from qiskit.primitives import StatevectorEstimator as Estimator  # simulator
# from qiskit_ibm_runtime import EstimatorV2 as Estimator  # real quantum computer

estimator = Estimator()
# estimator = Estimator(backend=backend)

pred_train = forward(circuit, np.array(train_images), res["x"], estimator, observable)
# pred_train = forward(circuit_ibm, np.array(train_images), res['x'], estimator, observable_ibm)

print(pred_train)

pred_train_labels = copy.deepcopy(pred_train)
pred_train_labels[pred_train_labels >= 0] = 1
pred_train_labels[pred_train_labels < 0] = -1
print(pred_train_labels)
print(train_labels)

accuracy = accuracy_score(?, ?)
print(f"Train accuracy: {? * 100}%")

In [None]:
print(len(pred_train_labels))
print(len(train_labels))

<span style="color: blue; font-weight: bold;">
The training accuracy is only $60\%$, which is definitely not good. It is hard to imagine that the model's performance on the test set could be any better. Let's verify.
</span>



In [None]:
pred_test = forward(circuit, np.array(test_images), res["x"], estimator, observable)
# pred_test = forward(circuit_ibm, np.array(test_images), res['x'], estimator, observable_ibm)

print(pred_test)

pred_test_labels = copy.deepcopy(pred_test)
pred_test_labels[pred_test_labels >= 0] = 1
pred_test_labels[pred_test_labels < 0] = -1
print(pred_test_labels)
print(test_labels)

accuracy = accuracy_score(?, ?)
print(f"Test accuracy: {? * 100}%")

In [None]:
print(len(pred_test_labels))
print(len(test_labels))

The model is not classifying these data well. We should ask why this is, and in particular, we should check:

<span style="color: blue; font-weight: bold;">
           
- Did we stop the training too soon? Were more optimization steps needed?

- Did we construct a bad Ansatz? This could mean a lot of things. When we work on real quantum computers, circuit depth will be a major consideration. The number of parameters is also potentially important, as is the entangling between qubits.

- Combining the two above, did we construct an ansatz with too many parameters to be trainable?


We can start by checking for convergence in the optimization:

</span>

In [None]:
obj_func_vals_first = objective_func_vals
# import matplotlib.pyplot as plt

plt.figure(figsize=(12, 6))
plt.plot(obj_func_vals_first, label="first ansatz")
plt.xlabel("iteration")
plt.ylabel("loss")
plt.legend()
plt.show()

We might try extending the optimization steps to make sure the optimizer didn't just get stuck in a local minimum in parameter space. <span style="color: blue; font-weight: bold;">But it looks fairly converged.</span> 

Let's take a closer look at the images that were *not* classified correctly, and see if we can understand what is happening.



In [None]:
missed = []
for i in range(len(test_labels)):
    if pred_test_labels[i] != ?[i]:
        missed.append(test_images[i])
print(len(missed))

In [None]:
fig, ax = plt.subplots(12, 2, figsize=(6, 6), subplot_kw={"xticks": [], "yticks": []})
for i in range(len(missed)):
    ax[i // 2, i % 2].imshow(
        missed[i].reshape(vert_size, hor_size),
        aspect="equal",
    )
plt.subplots_adjust(wspace=0.02, hspace=0.025)

<span style="color: blue; font-weight: bold;">We see 24 images.</span>


Here we can see that <span style="color: blue; font-weight: bold;">the vast majority of the wrongly classified images have a vertical line. Something about our model is failing to capture information about those.</span> You may have seen this coming, based on the first variational circuit. Let's look at it more closely.

## Improving the model

### Step 1 revisited

In mapping our problem to a quantum circuit, we should have explicitly thought about how the information in adjacent pixels determines the class. In order to identify horizontal lines, we want to know "if pixel $i$ is yellow, is pixel $i+1$ also yellow?" for all the pixels across each row. We also want to know about vertical lines. But since the classification is binary, one could imagine simply saying that if such a horizontal line is *not* detected, then it is a vertical line. Our previous variational circuit contained CNOT gates between qubits (and therefore pixels) 0 & 1, 1 & 2, and 2 & 3. 

<span style="color: blue; font-weight: bold;">That covers any horizontal lines across the top of the image, but it does not directly detect vertical lines, nor does it completely detect horizontal lines, as it ignores the lower row.</span>


<span style="color: blue; font-weight: bold;">To fully detect all horizontal lines, we would want to have a similar set of CNOT gates between qubits (pixels) 4 & 5, 5 & 6, and 6 & 7. We could keep in mind that adding CNOT gates between qubits corresponding to vertical lines (like 0 & 4, or 2 & 6) may also be useful. But we will first check whether it is sufficient to detect that there *is* or *is not* a horizontal line.</span>
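The qubit pairs named above can be generated from the image shape rather than typed by hand. The sketch below assumes the pixels are numbered row by row (row-major), which is consistent with how the images are reshaped for plotting; the helper names are hypothetical.

```python
def horizontal_pairs(num_rows, num_cols):
    """Pairs of pixel indices that are neighbors within the same row."""
    return [
        [r * num_cols + c, r * num_cols + c + 1]
        for r in range(num_rows)
        for c in range(num_cols - 1)
    ]


def vertical_pairs(num_rows, num_cols):
    """Pairs of pixel indices that are neighbors within the same column."""
    return [
        [r * num_cols + c, (r + 1) * num_cols + c]
        for r in range(num_rows - 1)
        for c in range(num_cols)
    ]


print(horizontal_pairs(2, 4))
print(vertical_pairs(2, 4))
```

For the 2 x 4 images used here, the horizontal pairs are exactly the six pairs listed in the text, and the vertical pairs include 0 & 4 and 2 & 6. Generating the list this way may make it easier to scale to larger images.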



In [None]:
# Initialize the circuit using the same number of qubits as the image has pixels
qnn_circuit = QuantumCircuit(size)

# We choose to have two variational parameters for each qubit.
params = ParameterVector("θ", length=2 * size)

# A first variational layer:
for i in range(size):
    qnn_circuit.ry(params[i], i)

# Here is an extended list of qubit pairs between which we want CNOT gates. This now covers all pixels connected by horizontal lines.
qnn_cnot_list = [[0, 1], [1, 2], [2, 3], [4, 5], [5, 6], [6, 7]]

for i in range(len(qnn_cnot_list)):
    qnn_circuit.cx(qnn_cnot_list[i][0], qnn_cnot_list[i][1])

# The second variational layer:
for i in range(size):
    qnn_circuit.rx(params[size + i], i)

# Check the circuit depth, and the two-qubit gate depth
print(qnn_circuit.decompose().depth())
print(
    f"2+ qubit depth: {qnn_circuit.decompose().depth(lambda instr: len(instr.qubits) > 1)}"
)

# QNN ansatz
ansatz = qnn_circuit

# Combine the feature map with the ansatz
full_circuit = QuantumCircuit(num_qubits)
full_circuit.compose(?, range(num_qubits), inplace=True)
full_circuit.compose(?, range(num_qubits), inplace=True)

# Display the circuit
?.decompose().?("mpl", style="clifford", fold=-1)

We have not increased the depth of the circuit. Let's see if we have increased its ability to model our images.
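To see why the extra CNOT gates come for free in terms of depth, the sketch below counts the number of time steps a list of two-qubit gates needs when gates on disjoint qubits may run in parallel. It is a simplified model that ignores the single-qubit layers and any hardware connectivity, and `cnot_layer_depth` is a hypothetical helper rather than a Qiskit function.

```python
def cnot_layer_depth(pairs):
    """Depth of a sequence of two-qubit gates, scheduling each as early as possible."""
    busy_until = {}
    for control, target in pairs:
        # A gate can start only after both of its qubits are free
        step = max(busy_until.get(control, 0), busy_until.get(target, 0)) + 1
        busy_until[control] = step
        busy_until[target] = step
    return max(busy_until.values(), default=0)


first_list = [[0, 1], [1, 2], [2, 3]]
extended_list = first_list + [[4, 5], [5, 6], [6, 7]]
print(cnot_layer_depth(first_list), cnot_layer_depth(extended_list))
```

Both lists give a two-qubit depth of 3, because the chain on qubits 4 to 7 runs alongside the chain on qubits 0 to 3.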

### Step 2 revisited

On a real device, we would need to transpile this new circuit before running it on a quantum backend.
Because we are using a simulator, we skip this step for now <span style="color: blue; font-weight: bold;">to see if our revision of the variational circuit has had the desired effect on simulators.</span> 

### Step 3 revisited

We now apply the updated model to our training data.



In [None]:
from qiskit.primitives import StatevectorEstimator as Estimator

batch_size = 140
num_epochs = 1
num_samples = len(train_images)

# Globals
circuit = full_circuit
estimator = Estimator()  # simulator for debugging
observables = observable
objective_func_vals = []
iter = 0

# Random initial weights for the ansatz
np.random.seed(42)
weight_params = np.random.rand(len(ansatz.parameters)) * 2 * np.pi

for epoch in range(num_epochs):
    for i in range((num_samples - 1) // batch_size + 1):
        print(f"Epoch: {epoch}, batch: {i}")
        start_i = i * batch_size
        end_i = start_i + batch_size
        train_images_batch = np.array(train_images[start_i:end_i])
        train_labels_batch = np.array(train_labels[start_i:end_i])
        input_params = train_images_batch
        target = train_labels_batch
        iter = 0
        res = minimize(
            mse_loss_weights, weight_params, method="COBYLA", options={"maxiter": 100}
        )
        weight_params = res["x"]

### Step 4 revisited

Let's start by checking whether our optimizer fully converged.



In [None]:
obj_func_vals_revised = objective_func_vals
# import matplotlib.pyplot as plt

plt.figure(figsize=(12, 6))
plt.plot(obj_func_vals_revised, label="revised ansatz")
plt.xlabel("iteration")
plt.ylabel("loss")
plt.legend()
plt.show()

<span style="color: blue; font-weight: bold;">This does not appear fully converged</span>, as the loss function has not yet leveled off over a sustained number of steps. 

<span style="color: blue; font-weight: bold;">
But the loss function is already ~60% lower than when using the previous variational circuit. 
    

If this were a research project, we would want to ensure full convergence. But for the purposes of exploration, this is sufficient. 
    

Let's check the accuracy on our training and testing data.</span>



In [None]:
from sklearn.metrics import accuracy_score
from qiskit.primitives import StatevectorEstimator as Estimator  # simulator
# from qiskit_ibm_runtime import EstimatorV2 as Estimator  # real quantum computer

estimator = Estimator()
# estimator = Estimator(backend=backend)

pred_train = forward(circuit, np.array(train_images), res["x"], estimator, observable)
# pred_train = forward(circuit_ibm, np.array(train_images), res['x'], estimator, observable_ibm)

print(pred_train)

pred_train_labels = copy.deepcopy(pred_train)
pred_train_labels[pred_train_labels >= 0] = ?
pred_train_labels[pred_train_labels < 0] = ?
print(pred_train_labels)
print(train_labels)

accuracy = accuracy_score(?, ?)
print(f"Train accuracy: {? * 100}%")

In [None]:
print(len(pred_train_labels))
print(len(train_labels))

In [None]:
pred_test = forward(circuit, np.array(test_images), res["x"], estimator, observable)
# pred_test = forward(circuit_ibm, np.array(test_images), res['x'], estimator, observable_ibm)

print(pred_test)

pred_test_labels = copy.deepcopy(pred_test)
pred_test_labels[pred_test_labels >= 0] = ?
pred_test_labels[pred_test_labels < 0] = ?
print(pred_test_labels)
print(test_labels)

accuracy = accuracy_score(?, ?)
print(f"Test accuracy: {? * 100}%")

In [None]:
print(len(pred_test_labels))
print(len(test_labels))

<span style="color: blue; font-weight: bold;">

$100\%$ Accuracy on both sets! Our suspicion about accurate detection of horizontal lines being sufficient was correct!

Further, our mapping from required information about the pixels to the CNOT gates in the quantum circuit was sufficiently effective.

</span>

## References

\[1] [https://arxiv.org/abs/2405.00781](https://arxiv.org/abs/2405.00781)



# END OF NOTEBOOK