## Variational algorithms

In this course you will learn about the specifics of _variational algorithms_: we will explain each step and the reasoning behind it, as well as how [Qiskit's primitive framework](https://qiskit.org/documentation/apidoc/primitives.html) fits into these algorithms. After that, we will give an [overview](instances.ipynb) of some of the most well-known algorithms of this type, like VQE, VQD, SSVQE, or QSR.

The first question one may ask here is: what is a variational algorithm, and why should I care about it?

In general, a variational algorithm is any algorithm based on a _variational principle_: a framework through which the solution to a problem can be represented as a set of optimal values that minimize or maximize some quantities which we will refer to as _cost functions_.

In the context of quantum computing, we understand variational algorithms as near-term hybrid quantum-classical algorithms, usually based on the *variational theorem of quantum mechanics* or some variant of it, and aiming at optimizing certain given cost function(s). The hybrid nature of this family of algorithms comes from the fact that said cost functions are evaluated using quantum resources, and optimized through classical ones.

One of the main advantages of this kind of algorithm is that it can be implemented on current quantum computers, since it requires a relatively low amount of quantum resources. This makes variational algorithms ideal candidates to achieve so-called [_Quantum Advantage_](https://www.ibm.com/blogs/research/2019/10/on-quantum-supremacy/), a regime where quantum computers surpass classical ones at solving certain problems.

## Simplified hybrid workflow

Variational algorithms start by initializing the quantum computer in a _default state_ $|0\rangle$, then transforming it to some desired (non-parametrized) state $|\rho\rangle$, which we will call the _reference state_. This transformation is represented by the application of a (unitary) _reference operator_ $U_R$ on the default state, such that $U_R|0\rangle = |\rho\rangle$.

From that state, a so-called _variational form_ $U_V(\vec\theta)$ is applied, resulting in the parametrized states $|\psi(\vec\theta)\rangle = U_V(\vec\theta)|\rho\rangle = U_V(\vec\theta)U_R|0\rangle$. This is the collection of states that our variational algorithm will explore when searching for an answer to the problem at hand.

We will refer to any particular combination of reference state and variational form as an _ansatz_, such that $U_A(\vec\theta) := U_V(\vec\theta) U_R$. Ansatze will ultimately take the form of parametrized quantum circuits, called _ansatz circuits_, capable of taking the default state $|0\rangle$ to the target state $|\psi(\vec\theta)\rangle$. All in all, we have:

$$
|0\rangle \xrightarrow{U_R} 
U_R|0\rangle = |\rho\rangle \xrightarrow{U_V(\vec\theta)} 
U_A(\vec\theta)|0\rangle = 
U_V(\vec\theta)U_R|0\rangle = 
U_V(\vec\theta)|\rho\rangle = 
|\psi(\vec\theta)\rangle
$$

For each particular choice of the variational parameters $\vec\theta$ a different quantum state will be produced, which in turn means a different evaluation of a problem-specific _cost function_ $C(\vec\theta)$; for instance, the measured expectation value of some observable(s) (e.g. energy). Evaluations are taken to a classical computer, where a classical optimizer analyzes them and chooses the next set of values for the variational parameters.

If relevant for the algorithm, we will denote the initial choice of the variational parameters $\vec\theta_0$ by the name _initial point_; and the corresponding variational state $|\psi(\vec\theta_0)\rangle$ by _initial state_. These will sometimes be provided by the user as a means for bootstrapping the computation.

Finally, the entire process is repeated until the classical optimizer's finalization criteria are met and an optimal set of parameter values $\vec\theta^*$ is returned. The proposed solution state for our problem will then be $|\psi(\vec\theta^*)\rangle = U_A(\vec\theta^*)|0\rangle$.
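The workflow above can be sketched on a single qubit using plain Python. This is only an illustrative toy under stated assumptions, not how a real quantum backend or Qiskit's primitives would evaluate the cost function: the reference operator is taken to be a Pauli $X$, the variational form a real $R_Y(\theta)$ rotation, the observable a Pauli $Z$, and the classical optimizer a finite-difference gradient descent. All function names are hypothetical and chosen for this example.

```python
import math

# Hypothetical single-qubit illustration; every name here is an assumption
# made for this sketch, and the "quantum" part is simulated classically.
DEFAULT_STATE = [1.0, 0.0]                # |0>
U_R = [[0.0, 1.0], [1.0, 0.0]]            # reference operator (Pauli X): |rho> = |1>
OBSERVABLE = [[1.0, 0.0], [0.0, -1.0]]    # Pauli Z, eigenvalues +1 and -1


def apply(matrix, state):
    """Matrix-vector product for small real matrices."""
    return [sum(m * s for m, s in zip(row, state)) for row in matrix]


def variational_form(theta):
    """U_V(theta): a real RY rotation by angle theta."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]


def ansatz_state(theta):
    """|psi(theta)> = U_V(theta) U_R |0>."""
    return apply(variational_form(theta), apply(U_R, DEFAULT_STATE))


def cost(theta):
    """C(theta) = <psi(theta)| Z |psi(theta)> (real amplitudes only)."""
    psi = ansatz_state(theta)
    return sum(a * b for a, b in zip(psi, apply(OBSERVABLE, psi)))


def optimize(theta0, learning_rate=0.4, eps=1e-6, tol=1e-10, max_iters=1000):
    """Classical loop: finite-difference gradient descent on the cost."""
    theta = theta0
    for _ in range(max_iters):
        grad = (cost(theta + eps) - cost(theta - eps)) / (2 * eps)
        new_theta = theta - learning_rate * grad
        if abs(new_theta - theta) < tol:  # finalization criterion
            theta = new_theta
            break
        theta = new_theta
    return theta


theta_0 = 2.0                      # initial point
theta_star = optimize(theta_0)     # optimal parameter found by the loop
print(cost(theta_0), cost(theta_star))
```

In this toy setting $C(\theta) = -\cos\theta$, so the loop drives $\theta$ towards $0$ and the cost towards $-1$, the lowest eigenvalue of $Z$.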

## Variational theorem

A common goal of variational algorithms, as expressed through each problem's cost function, is to approximate the state with the lowest (or highest) eigenvalue of a certain observable for a quantum system. A key insight for understanding how to reach such states in a variational fashion is the _Variational Theorem of Quantum Mechanics_. Before going into its full statement, let us explore some of the mathematical intuition behind it.


### Mathematical intuition for energy and ground states

In quantum mechanics, energy comes in the form of a quantum observable usually referred to as the _Hamiltonian_, which we'll denote by $\hat{\mathcal{H}}$. Let us consider its spectral decomposition $$\hat{\mathcal{H}}=\sum_{k=0}^{N-1} \lambda_k |\phi_k\rangle \langle \phi_k|,$$ where $N$ is the dimensionality of the space of states, $\lambda_k$ is the $k$-th eigenvalue (counting from zero) or, physically, the $k$-th energy level, and $|\phi_k\rangle$ is the corresponding eigenstate. That is, $\hat{\mathcal{H}}|\phi_k\rangle = \lambda_k|\phi_k\rangle$.

Then, the expected energy of a system in the (normalized) state $|\psi\rangle$ will be:

$$
\langle \psi | \hat{\mathcal{H}} | \psi \rangle = 
\langle \psi |\bigg(\sum_{k=0}^{N-1} \lambda_k |\phi_k\rangle \langle \phi_k|\bigg) | \psi \rangle = 
\sum_{k=0}^{N-1} \lambda_k \langle \psi |\phi_k\rangle \langle \phi_k| \psi \rangle = 
\sum_{k=0}^{N-1} \lambda_k |\langle \psi |\phi_k\rangle|^2.
$$

Taking into account that $\lambda_0\leq \lambda_k$ for all $k$, we have:

$$
\langle \psi | \hat{\mathcal{H}} | \psi \rangle = 
\sum_{k=0}^{N-1} \lambda_k |\langle \psi |\phi_k\rangle|^2 \geq 
\sum_{k=0}^{N-1} \lambda_0 |\langle \psi |\phi_k\rangle|^2 = 
\lambda_0 \sum_{k=0}^{N-1} |\langle \psi |\phi_k\rangle|^2 = 
\lambda_0.
$$

In the last step we used that, since $\{|\phi_k\rangle\}_{k=0}^{N-1}$ is an orthonormal basis, the probability of measuring $|\phi_k\rangle$ is $p_k = |\langle \psi |\phi_k \rangle |^2$, and the probabilities add up to one: $\sum_{k=0}^{N-1} |\langle \psi |\phi_k\rangle|^2 = \sum_{k=0}^{N-1}p_k = 1$. In short, the expected energy of any state is greater than or equal to the lowest energy, i.e. the ground-state energy. That is,

$$
\langle \psi | \hat{\mathcal{H}} | \psi \rangle \geq \lambda_0.
$$
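As a quick numerical sanity check of this bound, the following sketch samples random normalized single-qubit states and compares their expected energy against $\lambda_0$. The Hamiltonian $\hat{\mathcal{H}} = Z + 2X$ is an arbitrary choice made for this example (its eigenvalues are $\pm\sqrt{5}$), and the helper names are hypothetical.

```python
import math
import random

# Assumed example Hamiltonian: H = Z + 2X, with eigenvalues -sqrt(5) and +sqrt(5).
H = [[1.0, 2.0], [2.0, -1.0]]
LAMBDA_0 = -math.sqrt(5.0)


def random_state(rng):
    """Draw a random normalized single-qubit state with complex amplitudes."""
    amps = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)]
    norm = math.sqrt(sum(abs(a) ** 2 for a in amps))
    return [a / norm for a in amps]


def expectation(psi):
    """<psi|H|psi>; the result is real because H is Hermitian."""
    h_psi = [sum(H[i][j] * psi[j] for j in range(2)) for i in range(2)]
    return sum(psi[i].conjugate() * h_psi[i] for i in range(2)).real


rng = random.Random(0)  # fixed seed so the run is reproducible
energies = [expectation(random_state(rng)) for _ in range(1000)]

# Every sampled energy should sit at or above the ground-state energy.
print(min(energies) >= LAMBDA_0 - 1e-12)
```

Random sampling only illustrates the inequality; the derivation above is what guarantees it for every state.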

The above argument applies to any valid (normalized) quantum state $|\psi\rangle$, so it is perfectly possible to consider parametrized states $|\psi(\vec\theta)\rangle$ depending on a parameter vector $\vec\theta$; this is where the "variational" part comes into play. If we consider a cost function given by $C(\vec\theta) := \langle \psi(\vec\theta)|\hat{\mathcal{H}}|\psi(\vec\theta)\rangle$ and want to minimize it, the minimum will always satisfy:

$$
\min_{\vec\theta} C(\vec\theta) = 
\min_{\vec\theta} \langle \psi(\vec\theta)|\hat{\mathcal{H}}|\psi(\vec\theta)\rangle \geq \lambda_0.
$$

So that minimum will be the closest one can get to $\lambda_0$ with the parametrized states $|\psi(\vec\theta)\rangle$. Equality is reached if and only if there exists a parameter vector $\vec{\theta^*}$ such that $|\psi(\vec{\theta^*})\rangle$ is a ground state; when the lowest energy level is non-degenerate, this means $|\psi(\vec{\theta^*})\rangle = |\phi_0\rangle$ up to a global phase.
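To illustrate how the choice of parametrized states affects how close the minimum gets to $\lambda_0$, the sketch below scans two hypothetical single-qubit ansatze over a grid of $\theta$ values for the example Hamiltonian $\hat{\mathcal{H}} = Z + 2X$ (an assumption of this example, with $\lambda_0 = -\sqrt{5}$). The first ansatz, $\cos(\theta/2)|0\rangle + \sin(\theta/2)|1\rangle$, can reach the ground state; the second, $(|0\rangle + e^{i\theta}|1\rangle)/\sqrt{2}$, cannot.

```python
import cmath
import math

# Assumed example Hamiltonian: H = Z + 2X, lowest eigenvalue -sqrt(5).
H = [[1.0, 2.0], [2.0, -1.0]]
LAMBDA_0 = -math.sqrt(5.0)


def expectation(psi):
    """C = <psi|H|psi> for a normalized two-component state."""
    h_psi = [sum(H[i][j] * psi[j] for j in range(2)) for i in range(2)]
    return sum(psi[i].conjugate() * h_psi[i] for i in range(2)).real


def full_ansatz(theta):
    """cos(theta/2)|0> + sin(theta/2)|1>: covers every real state."""
    return [complex(math.cos(theta / 2)), complex(math.sin(theta / 2))]


def restricted_ansatz(theta):
    """(|0> + e^{i theta}|1>)/sqrt(2): only the relative phase varies."""
    return [complex(1 / math.sqrt(2)), cmath.exp(1j * theta) / math.sqrt(2)]


# A simple grid scan stands in for the classical optimizer.
thetas = [2 * math.pi * i / 2000 for i in range(2001)]
min_full = min(expectation(full_ansatz(t)) for t in thetas)
min_restricted = min(expectation(restricted_ansatz(t)) for t in thetas)

print(min_full, min_restricted, LAMBDA_0)
```

For the first ansatz the cost is $\cos\theta + 2\sin\theta$, whose minimum is $-\sqrt{5} = \lambda_0$; for the second it is $2\cos\theta$, whose minimum is $-2 > \lambda_0$. Both respect the bound, but only the first attains it.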

### Variational theorem of Quantum Mechanics 

If the (normalized) state $|\psi\rangle$ of a quantum system depends on a parameter vector $\vec\theta$, then the optimal approximation of the ground state (i.e. the eigenstate $|\phi_0\rangle$ with the minimum eigenvalue $\lambda_0$) is the one that minimizes the expectation value of the Hamiltonian $\hat{\mathcal{H}}$:

$$
\langle \hat{\mathcal{H}} \rangle(\vec\theta) := 
\langle \psi(\vec\theta) |\hat{\mathcal{H}}| \psi(\vec\theta) \rangle \geq 
\lambda_0
$$

The variational theorem is stated in terms of energy minima because of the mathematical assumptions that hold for the energy:
- For physical reasons, there needs to exist a finite lower bound to the energy $E \geq \lambda_0 > -\infty$, even for $N\rightarrow\infty$.
- Upper bounds do not exist in general.

However, mathematically speaking, there is nothing special about the Hamiltonian $\hat{\mathcal{H}}$ beyond these assumptions, so the theorem can be generalized to other quantum observables and their eigenstates provided that they satisfy the same constraints. Notice as well that, if finite upper bounds exist, the same mathematical arguments can be made for maximizing eigenvalues by swapping lower bounds for upper bounds.
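As a small illustration of the maximization case, consider again the assumed example observable $Z + 2X$ (eigenvalues $\pm\sqrt{5}$) and the hypothetical real ansatz $\cos(\theta/2)|0\rangle + \sin(\theta/2)|1\rangle$, for which the expectation value works out to $\cos\theta + 2\sin\theta$. The sketch checks that its maximum is bounded above by the highest eigenvalue, and that maximizing it is the same as minimizing the negated cost.

```python
import math


def expectation(theta):
    """<psi(theta)|(Z + 2X)|psi(theta)> for the real ansatz
    cos(theta/2)|0> + sin(theta/2)|1>, i.e. cos(theta) + 2 sin(theta)."""
    return math.cos(theta) + 2 * math.sin(theta)


LAMBDA_MAX = math.sqrt(5.0)  # highest eigenvalue of the assumed observable

thetas = [2 * math.pi * i / 2000 for i in range(2001)]
largest = max(expectation(t) for t in thetas)
smallest_of_negated = min(-expectation(t) for t in thetas)

print(largest, LAMBDA_MAX, smallest_of_negated)
```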
