## Variational algorithms

This course covers the specifics of variational algorithms: near-term hybrid quantum-classical algorithms based on the *variational theorem of quantum mechanics*. These algorithms require a relatively low amount of quantum resources, making them ideal candidates to achieve [Quantum Advantage](gloss:advantage).

Variational algorithms use a set of parameters to iteratively explore possible solutions to a problem. During each iteration, we use the current parameters to evaluate a *cost function*, then select the next iteration's parameters, repeating until we [converge](gloss:converge) on an optimal solution. The hybrid nature of this family of algorithms comes from the fact that the cost functions are evaluated using quantum resources and optimized using classical ones.

Throughout this course, we'll explore:

- Each step in the variational algorithm design workflow
- Tradeoffs associated with each step
- How to use [Qiskit Runtime primitives](https://qiskit.org/documentation/apidoc/primitives.html) to optimize for speed and accuracy

This course applies concepts covered in [Basics of Quantum Information and Computation](https://qiskit.org/learn/course/basics-quantum-information/) (also available as [a series of YouTube videos](https://www.youtube.com/playlist?list=PLOFEBzvs-VvqKKMXX4vbi4EB1uaErFMSO)). Feel free to refer back to the series at any time.

## Simplified hybrid workflow

![Variational Flow](images/variational_workflow.png)

1. **Initialize problem**: Variational algorithms start by initializing the quantum computer in a _default state_ $|0\rangle$, then transforming it to some desired (non-parametrized) state $|\rho\rangle$, which we will call the _reference state_.
   
   This transformation is represented by the application of a (unitary) _reference operator_ $U_R$ on the default state, such that  $U_R|0\rangle = |\rho\rangle$.

2. **Prepare ansatz**: To iteratively optimize from the default state $|0\rangle$ toward the target state $|\psi(\vec\theta)\rangle$, we must define a *variational form* $U_V(\vec\theta)$ that represents the collection of parametrized states our variational algorithm will explore.
   
   We refer to any particular combination of reference state and variational form as an _ansatz_, such that: $U_A(\vec\theta) := U_V(\vec\theta) U_R$. Ansatze will ultimately take the form of parametrized quantum circuits capable of taking the default state $|0\rangle$ to the target state $|\psi(\vec\theta)\rangle$.


   All in all we will have:

   $$
   \begin{aligned}
   |0\rangle \xrightarrow{U_R} U_R|0\rangle
   & = |\rho\rangle \xrightarrow{U_V(\vec{\theta})} U_A(\vec{\theta})|0\rangle \\[1mm]
   & = U_V(\vec{\theta})U_R|0\rangle \\[1mm]
   & = U_V(\vec{\theta})|\rho\rangle \\[1mm]
   & = |\psi(\vec{\theta})\rangle
   \end{aligned}
   $$

3. **Evaluate cost function**: Each value of the variational parameters $\vec\theta$ produces a different quantum state $|\psi(\vec\theta)\rangle$. To compare these states, we evaluate a problem-specific _cost function_ $C(\vec\theta)$, such as the measured expectation value of one or more observables (e.g. energy).

4. **Optimize parameters**: Evaluations are taken to a classical computer, where a classical optimizer analyzes them and chooses the next set of values for the variational parameters. If we have a pre-existing optimal solution, we can set it as an *initial point* $\vec\theta_0$ to *bootstrap* our optimization. Starting from the corresponding *initial state* $|\psi(\vec\theta_0)\rangle$ could help our optimizer find a valid solution faster.

5. **Adjust ansatz parameters with results, and re-run**: The entire process is repeated until the classical optimizer's finalization criteria are met and an optimal set of parameter values $\vec\theta^*$ is returned. The proposed solution state for our problem will then be $|\psi(\vec\theta^*)\rangle = U_A(\vec\theta^*)|0\rangle$.
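
The five steps above can be sketched end to end on a single qubit. This is a minimal illustration rather than a Qiskit implementation: the quantum evaluation is replaced by a classical simulation in plain Python, and the specific choices ($U_R$ as the identity, $U_V(\theta)$ as an RY rotation, Pauli $Z$ as the observable, finite-difference gradient descent as the optimizer, and every function name) are assumptions made only for this example.

```python
import math

def apply_ry(theta, state):
    """Apply the single-qubit rotation RY(theta) to a real two-component state."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    a, b = state
    return (c * a - s * b, s * a + c * b)

def ansatz_state(theta):
    """Steps 1-2: U_A(theta)|0> = U_V(theta) U_R |0>, with U_R = identity (assumed)."""
    default_state = (1.0, 0.0)       # |0>
    reference_state = default_state  # U_R = I, so |rho> = |0>
    return apply_ry(theta, reference_state)

def cost(theta):
    """Step 3: C(theta) = <psi(theta)| Z |psi(theta)>, simulated classically."""
    a, b = ansatz_state(theta)
    return a * a - b * b

def optimize(theta0, learning_rate=0.4, tolerance=1e-10, max_iterations=1000):
    """Steps 4-5: gradient descent with a central finite-difference gradient."""
    theta, step = theta0, 1e-6
    for iteration in range(max_iterations):
        gradient = (cost(theta + step) - cost(theta - step)) / (2 * step)
        new_theta = theta - learning_rate * gradient
        # Finalization criterion (assumed): the cost has stopped changing.
        if abs(cost(new_theta) - cost(theta)) < tolerance:
            return new_theta, iteration + 1
        theta = new_theta
    return theta, max_iterations

# theta0 plays the role of the initial point used to bootstrap the optimization.
theta_star, iterations = optimize(theta0=0.5)
print(f"theta* = {theta_star:.4f}, C(theta*) = {cost(theta_star):.6f}, "
      f"iterations = {iterations}")
```

For this toy problem the cost is $\cos\theta$, so the optimizer should settle near $\theta^* = \pi$, where the state is $|1\rangle$ and the cost approaches the lowest eigenvalue of $Z$, namely $-1$.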

## Variational theorem

A common goal of variational algorithms is to find the quantum state with the lowest or highest eigenvalue of a certain observable. A key insight we'll use is the _variational theorem_ of quantum mechanics. Before going into its full statement, let us explore some of the mathematical intuition behind it.

### Mathematical intuition for energy and ground states

In quantum mechanics, energy comes in the form of a quantum observable usually referred to as the _Hamiltonian_, which we'll denote by $\hat{\mathcal{H}}$. Let us consider its [spectral decomposition](gloss:decomposition): 

$$
\hat{\mathcal{H}} = \sum_{k=0}^{N-1} \lambda_k |\phi_k\rangle \langle \phi_k|
$$

where $N$ is the dimensionality of the space of states, $\lambda_k$ is the $k$-th eigenvalue or, physically, the $k$-th energy level, and $|\phi_k\rangle$ is the corresponding [eigenstate](gloss:eigenstate): $\hat{\mathcal{H}}|\phi_k\rangle = \lambda_k |\phi_k\rangle$. The expected energy of a system in the (normalized) state $|\psi\rangle$ will be:

$$
\begin{aligned}
\langle \psi | \hat{\mathcal{H}} | \psi \rangle
& = \langle \psi |\bigg(\sum_{k=0}^{N-1} \lambda_k |\phi_k\rangle \langle \phi_k|\bigg) | \psi \rangle \\[1mm]
& = \sum_{k=0}^{N-1} \lambda_k \langle \psi |\phi_k\rangle \langle \phi_k| \psi \rangle \\[1mm]
& = \sum_{k=0}^{N-1} \lambda_k |\langle \psi |\phi_k\rangle|^2
\end{aligned}
$$

If we take into account that $\lambda_0\leq \lambda_k, \forall k$ we have that:

$$
\begin{aligned}
\langle \psi | \hat{\mathcal{H}} | \psi \rangle
& = \sum_{k=0}^{N-1} \lambda_k |\langle \psi |\phi_k\rangle|^2 \\[1mm]
& \geq  \sum_{k=0}^{N-1} \lambda_0 |\langle \psi |\phi_k\rangle|^2 \\[1mm]
& = \lambda_0 \sum_{k=0}^{N-1} |\langle \psi |\phi_k\rangle|^2 \\[1mm]
& = \lambda_0
\end{aligned}
$$

The last step holds because $\\{|\phi_{k} \rangle\\}_{k=0}^{N-1}$ is an orthonormal basis: the probability of measuring $|\phi_{k} \rangle$ is $p_k = |\langle \psi |\phi_{k} \rangle |^2$, and the probabilities sum to one, $\sum_{k=0}^{N-1} |\langle \psi |\phi_k\rangle|^2 = \sum_{k=0}^{N-1}p_k = 1$. In short, the expected energy of any system is greater than or equal to the lowest energy, or ground state energy:

$$
\langle \psi | \hat{\mathcal{H}} | \psi \rangle \geq \lambda_0.
$$
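
The derivation above can be checked numerically on a small example. The sketch below assumes a hypothetical $2 \times 2$ real symmetric Hamiltonian (the entries are arbitrary, chosen only for illustration) and uses only the Python standard library. It samples random normalized states and compares the direct expectation value with the weighted sum $\sum_k \lambda_k p_k$, then checks the bound against $\lambda_0$.

```python
import math
import random

# Hypothetical 2x2 real symmetric Hamiltonian, chosen only for illustration.
a, b, d = 1.0, 0.5, -1.0
H = [[a, b], [b, d]]

# Closed-form eigenvalues of a 2x2 symmetric matrix, sorted so lambda_0 <= lambda_1.
mean, radius = (a + d) / 2, math.hypot((a - d) / 2, b)
eigenvalues = [mean - radius, mean + radius]

def eigenvector(lam):
    """Normalized eigenvector (b, lam - a) for eigenvalue lam; valid because b != 0."""
    norm = math.hypot(b, lam - a)
    return (b / norm, (lam - a) / norm)

eigenvectors = [eigenvector(lam) for lam in eigenvalues]

def random_state(rng):
    """Random normalized single-qubit state with complex amplitudes."""
    amps = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)]
    norm = math.sqrt(sum(abs(x) ** 2 for x in amps))
    return [x / norm for x in amps]

def expectation(psi):
    """Direct evaluation of <psi|H|psi>."""
    return sum((psi[i].conjugate() * H[i][j] * psi[j]).real
               for i in range(2) for j in range(2))

def probabilities(psi):
    """p_k = |<phi_k|psi>|^2 for each eigenstate (the eigenvectors are real here)."""
    return [abs(sum(v[i] * psi[i] for i in range(2))) ** 2 for v in eigenvectors]

def weighted_sum(psi):
    """Spectral form of the expectation value: sum_k lambda_k p_k."""
    return sum(lam * p for lam, p in zip(eigenvalues, probabilities(psi)))

rng = random.Random(0)
samples = [random_state(rng) for _ in range(1000)]

worst_normalization_error = max(abs(sum(probabilities(psi)) - 1) for psi in samples)
worst_decomposition_error = max(abs(expectation(psi) - weighted_sum(psi)) for psi in samples)
lowest_sampled = min(expectation(psi) for psi in samples)

print(f"lambda_0 = {eigenvalues[0]:.6f}, lowest sampled energy = {lowest_sampled:.6f}")
```

No sampled energy should fall below $\lambda_0$, although random sampling will generally not hit the ground state exactly.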

The above argument applies to any valid (normalized) quantum state $|\psi\rangle$, so it is perfectly possible to consider parametrized states $|\psi(\vec\theta)\rangle$ that depend on a parameter vector $\vec\theta$. This is where the "variational" part comes into play. If we consider a cost function given by $C(\vec\theta) := \langle \psi(\vec\theta)|\hat{\mathcal{H}}|\psi(\vec\theta)\rangle$ and want to minimize it, the minimum will always satisfy:

$$
\min_{\vec\theta} C(\vec\theta) = 
\min_{\vec\theta} \langle \psi(\vec\theta)|\hat{\mathcal{H}}|\psi(\vec\theta)\rangle \geq \lambda_0.
$$

This minimum is the closest one can get to $\lambda_0$ with the parametrized states $|\psi(\vec\theta)\rangle$, and equality is reached if and only if there exists a parameter vector $\vec\theta^*$ such that $|\psi(\vec\theta^*)\rangle = |\phi_0\rangle$.
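
To illustrate how the choice of parametrized states affects the attainable minimum, the sketch below compares two single-qubit families on a hypothetical Hamiltonian $\hat{\mathcal{H}} = Z + 0.5X$. Both families and the brute-force grid search are assumptions made for this example. The first family, $RY(\theta)|0\rangle$, contains the ground state, while the second, $(|0\rangle + e^{i\theta}|1\rangle)/\sqrt{2}$, does not.

```python
import cmath
import math

# Hypothetical single-qubit Hamiltonian H = Z + 0.5 X, chosen only for illustration.
H = [[1.0, 0.5], [0.5, -1.0]]
lambda_0 = -math.sqrt(1.0**2 + 0.5**2)  # exact lowest eigenvalue of this H

def expectation(psi):
    """C = <psi|H|psi> for a normalized two-component state."""
    return sum((psi[i].conjugate() * H[i][j] * psi[j]).real
               for i in range(2) for j in range(2))

def ry_ansatz(theta):
    """RY(theta)|0>: spans the real-amplitude states, which include the ground state."""
    return [complex(math.cos(theta / 2)), complex(math.sin(theta / 2))]

def phase_ansatz(theta):
    """(|0> + e^{i theta}|1>)/sqrt(2): a restricted family that misses the ground state."""
    return [complex(1 / math.sqrt(2)), cmath.exp(1j * theta) / math.sqrt(2)]

def grid_minimum(ansatz, points=3600):
    """Brute-force minimum of the cost over a uniform grid of theta in [0, 2*pi)."""
    thetas = [2 * math.pi * k / points for k in range(points)]
    return min(expectation(ansatz(theta)) for theta in thetas)

min_ry = grid_minimum(ry_ansatz)
min_phase = grid_minimum(phase_ansatz)

print(f"lambda_0          = {lambda_0:.6f}")
print(f"min, RY ansatz    = {min_ry:.6f}")
print(f"min, phase ansatz = {min_phase:.6f}")
```

Under these assumptions, the RY family reaches $\lambda_0$ up to the grid resolution, whereas the phase family stays strictly above it. Both minima respect the bound, but only the more suitable ansatz makes it tight.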

### Variational theorem of quantum mechanics

If the (normalized) state $|\psi\rangle$ of a quantum system depends on a parameter vector $\vec\theta$, then the optimal approximation of the ground state (i.e. the eigenstate $|\phi_0\rangle$ with the minimum eigenvalue $\lambda_0$) is the one that minimizes the expectation value of the Hamiltonian $\hat{\mathcal{H}}$:

$$
\langle \hat{\mathcal{H}} \rangle(\vec\theta) := 
\langle \psi(\vec\theta) |\hat{\mathcal{H}}| \psi(\vec\theta) \rangle \geq 
\lambda_0
$$

The variational theorem is stated in terms of energy minima because of two assumptions about the energy spectrum:

- For physical reasons, a finite lower bound to the energy must exist, $E \geq \lambda_0 > -\infty$, even for $N\rightarrow\infty$.
- Upper bounds do not exist in general.

However, mathematically speaking, there is nothing special about the Hamiltonian $\hat{\mathcal{H}}$ beyond these, so the theorem can be generalized to other quantum observables and their eigenstates provided that they follow the same constraints. Notice as well that, if finite upper bounds exist, the same mathematical arguments could be made for maximizing eigenvalues by virtue of swapping lower bounds for upper bounds.
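
As a sketch of the maximization case, take an observable $\hat{O}$ (a symbol introduced here for illustration) with spectral decomposition $\hat{O} = \sum_{k=0}^{N-1} \lambda_k |\phi_k\rangle \langle \phi_k|$ and a finite largest eigenvalue $\lambda_{N-1} \geq \lambda_k, \forall k$. Repeating the earlier argument with the inequality reversed gives:

$$
\begin{aligned}
\langle \psi | \hat{O} | \psi \rangle
& = \sum_{k=0}^{N-1} \lambda_k |\langle \psi |\phi_k\rangle|^2 \\[1mm]
& \leq \sum_{k=0}^{N-1} \lambda_{N-1} |\langle \psi |\phi_k\rangle|^2 \\[1mm]
& = \lambda_{N-1} \sum_{k=0}^{N-1} |\langle \psi |\phi_k\rangle|^2 \\[1mm]
& = \lambda_{N-1}
\end{aligned}
$$

Equivalently, maximizing $\langle \hat{O} \rangle$ is the same as minimizing $\langle -\hat{O} \rangle$, whose lowest eigenvalue is $-\lambda_{N-1}$.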

In this lesson, you learned the high-level view of variational algorithms. Over the following lessons, we'll explore each step and its associated tradeoffs in greater detail.