### Natural Gradient Descent

In variational calculations, the primary goal is to find the optimal set of parameters, $\theta_{opt}$, that minimizes the energy expectation value:

$$
E(\theta) = \frac{\langle\psi_{\theta}|H|\psi_{\theta}\rangle}{\langle\psi_{\theta}|\psi_{\theta}\rangle} \ge E_{0}
$$

While standard stochastic gradient descent (SGD) is an option, a more general and physically motivated approach is derived from the concept of imaginary time evolution.

### Imaginary Time Evolution as Optimization

Imaginary time evolution is a well-known method for systematically projecting an arbitrary trial state $|\psi\rangle$ onto the ground state $|0\rangle$, provided the two are not orthogonal. This is expressed as:

$$
|0\rangle \propto \lim_{\tau\rightarrow\infty} e^{-\tau H} |\psi\rangle
$$

By expanding the trial state in the energy eigenbasis ($H|n\rangle = E_n|n\rangle$) with coefficients $c_n = \langle n|\psi\rangle$, we can see why this works:

$$
e^{-\tau H}|\psi\rangle = \sum_{n} c_n e^{-\tau E_n} |n\rangle \propto |0\rangle + \sum_{n>0} \frac{c_n}{c_0} e^{-\tau(E_n - E_0)} |n\rangle
$$

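Because $E_n > E_0$ for all excited states ($n>0$, assuming a non-degenerate ground state), every term in the sum decays exponentially in $\tau$. The sum therefore vanishes as $\tau \rightarrow \infty$, leaving only the ground state. The slowest decay rate is set by the gap $E_1 - E_0$.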
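The filtering argument can be checked numerically on a small matrix. The following is a minimal JAX sketch, not part of the original derivation: the random symmetric matrix `H`, the random trial vector `psi`, and the helper name `imaginary_time_fidelity` are all illustrative assumptions. It applies $e^{-\tau H}$ in the eigenbasis and reports the overlap with the exact ground state.

```python
import jax
import jax.numpy as jnp

jax.config.update("jax_enable_x64", True)

def imaginary_time_fidelity(H, psi, tau):
    """Overlap |<0|psi(tau)>|^2 of the normalized state e^{-tau H}|psi> with the ground state."""
    energies, states = jnp.linalg.eigh(H)       # columns of `states` are the eigenvectors |n>
    c = states.conj().T @ psi                   # c_n = <n|psi>
    # Subtracting E_0 only rescales the state and avoids overflow for large tau.
    evolved = c * jnp.exp(-tau * (energies - energies[0]))
    evolved = evolved / jnp.linalg.norm(evolved)
    return jnp.abs(evolved[0]) ** 2             # weight on the lowest eigenvector

# Illustrative inputs: a random real symmetric "Hamiltonian" and a random trial state.
key_h, key_psi = jax.random.split(jax.random.PRNGKey(0))
A = jax.random.normal(key_h, (6, 6))
H = 0.5 * (A + A.T)
psi = jax.random.normal(key_psi, (6,))

for tau in (0.0, 1.0, 5.0, 50.0):
    print(tau, float(imaginary_time_fidelity(H, psi, tau)))
```

The printed overlap should grow monotonically with $\tau$; how quickly it approaches 1 depends on the gap of the particular matrix drawn.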

### Projecting onto the Variational Manifold

We apply this concept iteratively to our variational state $|\psi_{\theta}\rangle$ using small time steps $\delta\tau$. At each step the parameters are updated as $\theta' = \theta + \delta\theta$, with $\delta\theta$ chosen so that the new state approximates the imaginary-time-evolved state as closely as the variational manifold allows:

$$
|\psi_{\theta+\delta\theta}\rangle \propto e^{-\delta\tau H} |\psi_{\theta}\rangle
$$

By expanding both sides to the first order and taking the limit $\delta\tau \rightarrow 0$, we arrive at a differential equation for the parameters $\theta$:

$$
S\dot{\theta} = -g
$$

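Here, $\dot{\theta} = d\theta/d\tau$, $\langle\cdot\rangle$ denotes an expectation value in the state $|\psi_{\theta}\rangle$, and $\mathcal{O}_{\mu}$ is the log-derivative operator, which is diagonal in the computational basis with entries $\mathcal{O}_{\mu}(x) = \partial_{\theta_{\mu}}\ln\psi_{\theta}(x)$. The quantities $S$ and $g$ are defined as:
* $g_{\mu} = 2~\text{Re}\{\langle\mathcal{O}_{\mu}^{\dagger}H\rangle - \langle\mathcal{O}_{\mu}^{\dagger}\rangle\langle H\rangle\}$
* $S_{\mu\nu} = 2~\text{Re}\{\langle\mathcal{O}_{\mu}^{\dagger}\mathcal{O}_{\nu}\rangle - \langle\mathcal{O}_{\mu}^{\dagger}\rangle\langle\mathcal{O}_{\nu}\rangle\}$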

The vector $g$ is exactly the gradient of the energy with respect to the (real) parameters, $g = \nabla_{\theta}E(\theta)$.
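The identity $g = \nabla_{\theta}E(\theta)$ can be verified on a Hilbert space small enough to sum over exactly. The sketch below is illustrative only: it assumes the usual log-derivative convention $\mathcal{O}_{\mu}(x) = \partial_{\theta_{\mu}}\ln\psi_{\theta}(x)$, a made-up real ansatz `log_psi`, a random symmetric `H`, and real parameters. In a real calculation the expectation values would be estimated from Monte Carlo samples rather than exact sums.

```python
import jax
import jax.numpy as jnp

jax.config.update("jax_enable_x64", True)

dim, n_params = 8, 3
k1, k2, k3 = jax.random.split(jax.random.PRNGKey(1), 3)
A = jax.random.normal(k1, (dim, dim))
H = 0.5 * (A + A.T)                                   # toy real symmetric Hamiltonian
features = jax.random.normal(k2, (dim, n_params))     # fixed data defining the toy ansatz
theta = 0.3 * jax.random.normal(k3, (n_params,))

def log_psi(theta):
    """Hypothetical ansatz: log-amplitude of every basis state x, shape (dim,)."""
    return jnp.sin(features @ theta)

def energy(theta):
    psi = jnp.exp(log_psi(theta))
    return (psi @ H @ psi) / (psi @ psi)

def energy_gradient_from_covariance(theta):
    psi = jnp.exp(log_psi(theta))
    p = psi**2 / jnp.sum(psi**2)          # Born probabilities |psi(x)|^2 / <psi|psi>
    O = jax.jacfwd(log_psi)(theta)        # O[x, mu] = d log psi(x) / d theta_mu
    e_loc = (H @ psi) / psi               # local energy <x|H|psi> / <x|psi>
    mean_O = p @ O                        # <O_mu>
    mean_H = p @ e_loc                    # <H>
    # g_mu = 2 Re{ <O_mu^dagger H> - <O_mu^dagger><H> }
    return 2.0 * jnp.real((p[:, None] * O.conj()).T @ e_loc - mean_O.conj() * mean_H)

g_cov = energy_gradient_from_covariance(theta)
g_autodiff = jax.grad(energy)(theta)
print("covariance formula:", g_cov)
print("jax.grad          :", g_autodiff)
```

The two printed vectors should agree to numerical precision, since both are exact evaluations of the same derivative.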

---

### The Quantum Geometric Tensor (S)

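The matrix $S$ is the crucial component. It is commonly referred to as the **Quantum Geometric Tensor (QGT)**. Strictly speaking, $S$ is proportional to the real part of the QGT (the Fubini-Study metric), which is in turn proportional to the quantum Fisher information matrix.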

This matrix acts as the **metric tensor** of the variational manifold $\mathcal{M}_{\psi}$. It measures the distance between two infinitesimally close quantum states. This is evident in the expansion of the quantum fidelity $F$ between a state and its slightly perturbed version:

$$
F(\psi_{\theta+\delta\theta}, \psi_{\theta}) = \frac{|\langle\psi_{\theta+\delta\theta}|\psi_{\theta}\rangle|^2}{\langle\psi_{\theta+\delta\theta}|\psi_{\theta+\delta\theta}\rangle\langle\psi_{\theta}|\psi_{\theta}\rangle} = 1 - \frac{1}{2}\sum_{\mu\nu}S_{\mu\nu}\delta\theta^{\mu}\delta\theta^{\nu} + \dots
$$
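The expansion above also gives a way to read off $S$ numerically: it is minus the Hessian of the fidelity with respect to $\delta\theta$ at $\delta\theta = 0$. The following sketch uses that route rather than the covariance definition. The ansatz `psi`, the array `features`, and the helper names are assumptions made for illustration, and the amplitudes are taken real so that $|\langle a|b\rangle|^2 = (a\cdot b)^2$.

```python
import jax
import jax.numpy as jnp

jax.config.update("jax_enable_x64", True)

dim, n_params = 8, 3
k1, k2 = jax.random.split(jax.random.PRNGKey(2))
features = jax.random.normal(k1, (dim, n_params))
theta = 0.3 * jax.random.normal(k2, (n_params,))

def psi(theta):
    """Hypothetical real, positive ansatz: amplitudes on all `dim` basis states."""
    return jnp.exp(jnp.sin(features @ theta))

def fidelity(theta_a, theta_b):
    a, b = psi(theta_a), psi(theta_b)
    return (a @ b) ** 2 / ((a @ a) * (b @ b))   # real amplitudes assumed

def metric_from_fidelity(theta):
    # F = 1 - (1/2) dtheta^T S dtheta + ...   =>   S = -(Hessian of F at dtheta = 0)
    return -jax.hessian(lambda t: fidelity(t, theta))(theta)

S = metric_from_fidelity(theta)
for eps in (1e-1, 1e-2, 1e-3):
    d = eps * jnp.ones(n_params)
    exact = fidelity(theta + d, theta)
    quadratic = 1.0 - 0.5 * d @ S @ d
    print(eps, float(exact - quadratic))        # remainder of the second-order expansion
```

The remainder should shrink faster than $\epsilon^2$ as the displacement is reduced, which is what a correct second-order expansion predicts.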

This provides a clear geometrical picture: imaginary time evolution defines a trajectory in the full Hilbert space toward the ground state. Our optimization procedure projects this true trajectory onto the curved variational manifold $\mathcal{M}_{\psi}$, and the path along this manifold is governed by the equation $S\dot{\theta} = -g$.

### The QNG / SR Update Rule

The **Quantum Natural Gradient (QNG)**, also known as **Stochastic Reconfiguration (SR)**, is obtained by discretizing the differential equation $S\dot{\theta} = -g$ with an explicit (forward) Euler step of size $\delta\tau$.

This gives the parameter update rule:

$$
\theta^{\prime} = \theta - \delta\tau S^{-1}g
$$


This contrasts sharply with standard Stochastic Gradient Descent (SGD), which is recovered if we make the crude approximation that the metric is the identity matrix, $S \approx I$, and identify the learning rate with the time step, $\eta = \delta\tau$:

$$
\theta^{\prime} = \theta - \eta \nabla_{\theta}E(\theta)
$$


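The QNG/SR algorithm can be understood as performing steepest descent in a *curved space* defined by the metric $S$: among all updates that move the state by the same small Fubini-Study distance, it selects the one that lowers the energy the most, to leading order in the step size. Plain gradient descent instead measures step length in parameter space, which need not reflect how much the quantum state actually changes.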
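The two update rules can be compared side by side on a toy problem. This is a sketch under stated assumptions, not a reference implementation: the ansatz `psi`, the random `H`, the step sizes, and the diagonal shift added to $S$ before solving are all illustrative choices. The shift is a common regularizer that is not part of the derivation above, and $S$ is obtained here from the Hessian of the fidelity rather than from sampled covariances.

```python
import jax
import jax.numpy as jnp

jax.config.update("jax_enable_x64", True)

dim, n_params = 8, 4
k1, k2, k3 = jax.random.split(jax.random.PRNGKey(3), 3)
A = jax.random.normal(k1, (dim, dim))
H = 0.5 * (A + A.T)                                  # toy real symmetric Hamiltonian
features = jax.random.normal(k2, (dim, n_params))
theta0 = 0.1 * jax.random.normal(k3, (n_params,))

def psi(theta):
    """Hypothetical real, positive ansatz on all `dim` basis states."""
    return jnp.exp(jnp.sin(features @ theta))

def energy(theta):
    v = psi(theta)
    return (v @ H @ v) / (v @ v)

def fidelity(theta_a, theta_b):
    a, b = psi(theta_a), psi(theta_b)
    return (a @ b) ** 2 / ((a @ a) * (b @ b))        # real amplitudes assumed

def metric(theta):
    # S read off from F = 1 - (1/2) dtheta^T S dtheta + ...
    return -jax.hessian(lambda t: fidelity(t, theta))(theta)

@jax.jit
def sr_step(theta, dtau, shift):
    g = jax.grad(energy)(theta)
    S = metric(theta) + shift * jnp.eye(n_params)    # diagonal shift: an added regularizer
    return theta - dtau * jnp.linalg.solve(S, g)     # solve S x = g instead of forming S^{-1}

@jax.jit
def gd_step(theta, eta):
    return theta - eta * jax.grad(energy)(theta)     # same rule with S replaced by I

theta_sr, theta_gd = theta0, theta0
for _ in range(200):
    theta_sr = sr_step(theta_sr, 0.02, 1e-2)
    theta_gd = gd_step(theta_gd, 0.02)

print("exact E0      :", float(jnp.linalg.eigvalsh(H)[0]))
print("initial energy:", float(energy(theta0)))
print("SR energy     :", float(energy(theta_sr)))
print("GD energy     :", float(energy(theta_gd)))
```

Neither optimizer is expected to reach the exact $E_0$ here, because a four-parameter positive ansatz cannot represent a generic ground state. The variational bound $E(\theta) \ge E_0$ holds for both regardless of how the run goes.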

---

### Exercise: Write the JAX code that evaluates the QGT matrix S