# Fix VQC when training from a warm start

This notebook demonstrates pull request [#312](https://github.com/Qiskit/qiskit-machine-learning/pull/312) for Qiskit Machine Learning, which provides a (trivial) fix for issue [#296](https://github.com/qiskit/qiskit-machine-learning/issues/296). Thanks to Patrick Odagiu for raising it!

## Introduction

### Variational quantum algorithms

![](https://learn.qiskit.org/content/quantum-machine-learning/images/vqc/va.svg)

Variational quantum algorithms (VQAs) are near-term, hybrid classical-quantum algorithms built around a parameterized quantum circuit $U(\boldsymbol{\theta})$, often called the ansatz.

We start from an initial guess of the parameters, $\boldsymbol{\theta}_0$.
We then prepare the state $|\psi(\boldsymbol{\theta}_0) \rangle = U(\boldsymbol{\theta}_0) |0 \rangle$ and measure the expectation value of an observable.
A cost function $C(\boldsymbol{\theta})$ quantifies how far these expectation values are from the ideal solution.
A classical optimization algorithm evaluates the cost function and updates the parameters: $\boldsymbol{\theta}_0 \rightarrow \boldsymbol{\theta}_1$.
We then prepare the state $|\psi(\boldsymbol{\theta}_1) \rangle = U(\boldsymbol{\theta}_1) |0 \rangle$ and evaluate the expectation value again.

This cycle of preparation and evaluation of circuit parameters continues until we have sufficiently minimized the cost function.
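
The loop described above can be sketched classically for a single qubit. This is a toy illustration, not how Qiskit implements it: we assume the circuit is a single $R_Y(\theta)$ rotation, so the expectation value of $Z$ is simply $\cos\theta$, and we pick plain gradient descent with the parameter-shift rule as the classical optimizer. The function names, learning rate, and step count are all assumptions made for this example.

```python
import math


def expectation_z(theta: float) -> float:
    """<psi(theta)|Z|psi(theta)> for |psi(theta)> = RY(theta)|0>, which equals cos(theta)."""
    return math.cos(theta)


def cost(theta: float) -> float:
    """Toy cost: distance of the expectation value from the target value -1."""
    return expectation_z(theta) + 1.0


def parameter_shift_gradient(theta: float) -> float:
    """Gradient of the cost from two shifted evaluations (parameter-shift rule)."""
    return 0.5 * (cost(theta + math.pi / 2) - cost(theta - math.pi / 2))


def run_vqa(theta0: float, learning_rate: float = 0.4, steps: int = 100) -> float:
    """Repeat the prepare -> evaluate -> update cycle for a fixed number of steps."""
    theta = theta0
    for _ in range(steps):
        theta -= learning_rate * parameter_shift_gradient(theta)
    return theta


theta_final = run_vqa(theta0=0.3)
print(f"final theta = {theta_final:.4f}, final cost = {cost(theta_final):.2e}")
```

On real hardware, `expectation_z` would be estimated from repeated measurements rather than computed in closed form; everything else in the loop stays classical.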

### The variational quantum classifier

![](https://learn.qiskit.org/content/quantum-machine-learning/images/vqc/vqc.svg)

The variational quantum classifier (VQC) is a supervised VQA, where the measured expectation value is interpreted as the output of a classifier. 

Consider a classification problem with binary target labels $y_i \in \{0,1\}$ and feature vectors $\boldsymbol{x}_i$. For each feature vector, we build a parameterized quantum circuit that prepares a state of $n$ qubits:

$$|\psi(\boldsymbol{x}_i;\boldsymbol{\theta}) \rangle =  U_{W(\boldsymbol{\theta})}U_{\phi(\boldsymbol{x}_i)}|0 \rangle,$$

where $U_{\phi(\boldsymbol{x}_i)}$ corresponds to the data-encoding circuit and $U_{W(\boldsymbol{\theta})}$ corresponds to the variational circuit.

Measuring this state provides an $n$-length bitstring. This is mapped to the binary classification label using a boolean function $f: \{0, 1\}^{n} \rightarrow \{0, 1\}$, usually the parity function.
We then compute the difference between predicted labels $\hat{y}_i$ and the target labels $y_i$ using the cost function. 
The classical optimization algorithm selects a new point $\boldsymbol{\theta}'$, which is then used to create a new circuit.
This training process is repeated until the cost function stabilizes around a minimum point for which $\boldsymbol{\theta}$ provides the best predictions.
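
As a small illustration of the post-processing step, the sketch below applies the parity function to measured bitstrings and aggregates the counts into label probabilities. The measurement counts are made-up numbers, and the helper names are hypothetical; this is not the interpretation code used inside `VQC`.

```python
def parity(bitstring: str) -> int:
    """Map an n-bit measurement outcome to a label: 1 if the number of 1s is odd, else 0."""
    return bitstring.count("1") % 2


def label_probabilities(counts: dict) -> dict:
    """Aggregate measurement counts into estimated probabilities for labels 0 and 1."""
    shots = sum(counts.values())
    label_counts = {0: 0, 1: 0}
    for bitstring, count in counts.items():
        label_counts[parity(bitstring)] += count
    return {label: n / shots for label, n in label_counts.items()}


# Hypothetical counts for a 2-qubit circuit measured 100 times (made-up numbers).
example_counts = {"00": 10, "01": 40, "10": 30, "11": 20}

probs = label_probabilities(example_counts)
predicted_label = max(probs, key=probs.get)
print(f"label probabilities: {probs}, predicted label: {predicted_label}")
```

The estimated label probabilities are what a loss such as cross-entropy compares against the one-hot encoded target.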

See the [tutorial on VQCs in the Qiskit Textbook Beta](https://learn.qiskit.org/course/machine-learning/variational-classification) for more information.

### Warm-start training

Training a VQC usually begins with initializing the model parameters with randomized values. 

Alternatively, we can initialize the parameters with values saved from a previously trained model. This warm-starting approach enables us to start training from a better initial point on the cost surface. Thus, warm-starting leverages prior computation to pick up where the last training run left off and reduce the time required to train the full model. This is useful for training in batches.
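
To make the benefit concrete, here is a toy sketch that is independent of Qiskit. It assumes a simple quadratic cost and plain gradient descent with a fixed step budget per "fit"; the cost function, target point, and hyperparameters are all invented for illustration. The second run either reuses the first run's final point (warm start) or starts over from the original initial point (cold start).

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=20):
    """Run a fixed number of gradient-descent steps and return the final point."""
    point = list(start)
    for _ in range(steps):
        gradient = grad(point)
        point = [p - learning_rate * g for p, g in zip(point, gradient)]
    return point


# Toy quadratic cost with its minimum at TARGET (an assumption for this example).
TARGET = (1.0, -2.0)


def cost(point):
    return sum((p - t) ** 2 for p, t in zip(point, TARGET))


def grad(point):
    return [2 * (p - t) for p, t in zip(point, TARGET)]


initial_point = [0.0, 0.0]
first_fit = gradient_descent(grad, initial_point)

# Warm start: the second run begins where the first one stopped.
second_fit_warm = gradient_descent(grad, first_fit)
# Cold start: the second run discards the first result and begins again.
second_fit_cold = gradient_descent(grad, initial_point)

print(f"cost after cold second fit: {cost(second_fit_cold):.2e}")
print(f"cost after warm second fit: {cost(second_fit_warm):.2e}")
```

With the same step budget, the warm-started run ends closer to the minimum because it does not repeat the progress already made. In batch training the cost changes between fits, so the gain depends on how similar the batches are.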

## Minimal (not) working example

Run the next cell to check out the pre-fix commit and install it. **Note that this notebook expects to live in the root directory of the qiskit-machine-learning repository.** Restart the kernel for the changes to take effect.

In [None]:
import sys 
!git checkout 44bfc66a8aef1baf6794ab43ea9c73e5be34ce8e && {sys.executable} -m pip install -e . > /dev/null

In [None]:
import numpy as np
from qiskit import Aer
from qiskit.utils import QuantumInstance
from qiskit_machine_learning.algorithms.classifiers import VQC

Construct toy data set.

In [None]:
features = np.random.rand(20, 2)
target = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
target = np.tile(target, 2)
print(f"Features:\n{features.T}")
print("Target:", target)

One-hot encode the target.

In [None]:
onehot_target = np.zeros((target.size, int(target.max() + 1)))
onehot_target[np.arange(target.size), target.astype(int)] = 1
print(onehot_target.T)

Initialize the quantum instance.

In [None]:
backend = Aer.get_backend("aer_simulator_statevector")
quantum_instance = QuantumInstance(backend)

Initialize the VQC.

In [None]:
vqc = VQC(
    num_qubits=2,
    loss="cross_entropy",
    warm_start=True,
    quantum_instance=quantum_instance,
)

Fit the VQC to the first half of the constructed toy data.

In [None]:
vqc.fit(features[:10, :], onehot_target[:10])
print("First fit complete.")

Fit the VQC to the second half of the constructed toy data.

In [None]:
vqc.fit(features[10:, :], onehot_target[10:])
print("Second fit complete.")

## Why does it fail?

In [Qiskit Terra](https://github.com/Qiskit/qiskit-terra) the optimizers were refactored in pull request [#6866](https://github.com/Qiskit/qiskit-terra/pull/6866/).

The old `optimize` method, which returned a subscriptable `Tuple`, was deprecated, and a new `minimize` method was added that returns an [`OptimizerResult`](https://qiskit.org/documentation/stubs/qiskit.algorithms.optimizers.OptimizerResult.html) object.

The code was updated to use the new `minimize` method, but the result handling was not: the warm-start path still subscripts the result as if it were a tuple, which causes the second `fit` call above to fail.
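
The mismatch can be reproduced without Qiskit. In the sketch below, `StandInOptimizerResult` is a hypothetical stand-in for the real `OptimizerResult` class, and the two functions only mimic the shape of the old and new return values; the numbers are made up.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class StandInOptimizerResult:
    """Hypothetical stand-in for Qiskit's OptimizerResult (not the real class)."""

    x: List[float]  # final point
    fun: float  # final cost value
    nfev: int  # number of cost function evaluations


def old_style_optimize() -> Tuple[List[float], float, int]:
    """Mimics the deprecated interface: a plain tuple (point, value, nfev)."""
    return [0.1, 0.2], 0.05, 30


def new_style_minimize() -> StandInOptimizerResult:
    """Mimics the new interface: a result object with named attributes."""
    return StandInOptimizerResult(x=[0.1, 0.2], fun=0.05, nfev=30)


# Tuple indexing works on the old return value.
old_result = old_style_optimize()
print("old interface, result[0]:", old_result[0])

# The same indexing on the result object raises a TypeError.
new_result = new_style_minimize()
subscript_failed = False
try:
    new_result[0]
except TypeError as error:
    subscript_failed = True
    print("new interface, result[0] raised TypeError:", error)
```

Because the failing line only runs when a previous fit result exists and `warm_start=True`, the first call to `fit` succeeds and the error surfaces only on the second.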

## What's the fix?

Examining the API documentation for [`OptimizerResult`](https://qiskit.org/documentation/stubs/qiskit.algorithms.optimizers.OptimizerResult.html), we see that the final point of the minimization is stored in the property `x`.

So, the very simple fix is:
```diff
--- a/qiskit_machine_learning/algorithms/trainable_model.py
+++ b/qiskit_machine_learning/algorithms/trainable_model.py
@@ -201,7 +201,7 @@ class TrainableModel:
             An array as an initial point
         """
         if self._warm_start and self._fit_result is not None:
-            self._initial_point = self._fit_result[0]
+            self._initial_point = self._fit_result.x
         elif self._initial_point is None:
             self._initial_point = algorithm_globals.random.random(self._neural_network.num_weights)
         return self._initial_point
```

Let's check out the version of the code in which the pull request introducing this fix has been merged.

In [None]:
import sys
!git checkout 5a02c7639db6a86beb9a944b14721935a48932e6 && {sys.executable} -m pip install -e . > /dev/null

**Now restart the kernel and clear the outputs of all cells. Then run the implementation cells again.**
All cells should now run without raising any errors.

## Conclusion

We now correctly read the final point of the minimization from the saved fit result for all trainable models, so it can be used as a warm start in subsequent fits.

We also added unit tests to check that this functionality is not broken by future commits.