# Direct minimisation

- Recall that we introduced DFT as the minimisation problem
  $$ \min_{\{\psi_i\}} \mathcal{E}_\text{DFT}(\{\psi_i\}) $$
  where the energy expression $\mathcal{E}_\text{DFT}(\{\psi_i\})$ is a known analytic function and the $\{\psi_i\}$ are orthonormal orbitals.


- Taking a closer look at this minimisation problem, one might rightfully wonder why one should bother going through the SCF procedure at all. Could one not just minimise the energy with respect to the orbitals directly?
- The answer is yes, and this leads to a procedure called **direct minimisation** (DM). In practice DM provides an alternative route to a DFT ground state.

While the basic principle of a DM algorithm is relatively straightforward,
a practical implementation faces a few subtle points that require some thought:

- Due to the orthonormality constraints on the orbitals $\{\psi_i\}$,
  the set of admissible orbitals does not form a vector space.
  Instead, the unknowns $\{\psi_i\}$ belong to a Stiefel manifold.
  This needs to be taken into account
  (via appropriate projections) in order to get the correct minimum.
  Fortunately `Optim.jl` can do that out of the box.

- DM in combination with metals (i.e. without a band gap) is tricky,
  because of possible degeneracies of the orbitals at the Fermi level ($i = N$).
  This problem manifests when computing the gradient of $\mathcal{E}_\text{DFT}$
  in such a setting, where special care is needed to avoid
  "division by zero" issues.
  
- In this workbook we will restrict ourselves to insulators,
  which have a band gap and thus don't feature this numerical problem.
  Furthermore we will restrict ourselves to a single $k$-point,
  as going beyond that requires a little more bookkeeping
  (see the [DFTK implementation](https://github.com/JuliaMolSim/DFTK.jl/blob/master/src/scf/direct_minimization.jl)).
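
To make the manifold constraint from the first point above more concrete, here is a minimal sketch of the two operations an optimiser needs on a Stiefel manifold: projecting a search direction onto the tangent space and mapping an updated iterate back to orthonormal columns. It uses only `LinearAlgebra`; the sizes `n` and `N`, the random stand-in gradient `G`, and the helper names `project_tangent` and `retract` are assumptions made for illustration, and this is not necessarily how `Optim.jl` implements these steps internally.

```julia
using LinearAlgebra
using Random

Random.seed!(0)
n, N = 30, 4   # hypothetical sizes: n basis functions, N orbitals

# Orthonormal starting point (the columns play the role of the orbitals)
ψ = Matrix(qr(randn(ComplexF64, n, N)).Q)

# Project an arbitrary direction G onto the tangent space of the
# Stiefel manifold at ψ by removing the Hermitian part of ψ'G
project_tangent(ψ, G) = G - ψ * (ψ' * G + G' * ψ) / 2

# Map a matrix with full column rank back to orthonormal columns
# using the polar factor U * V' obtained from its SVD
function retract(X)
    F = svd(X)
    F.U * F.Vt
end

G = randn(ComplexF64, n, N)      # stand-in for an energy gradient
ξ = project_tangent(ψ, G)
ψ_new = retract(ψ - 0.1 * ξ)     # one small step, then re-orthonormalise

# Both defects should vanish up to rounding errors
tangent_defect = norm(ψ' * ξ + ξ' * ψ)
orthonormality_defect = norm(ψ_new' * ψ_new - I)
```

A plain update `ψ - 0.1 * ξ` would in general lose orthonormality, which is why the step is followed by a retraction.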
  
All right, now let's consider computing the gradient.
Since the total energy can also be written as
$$ \mathcal{E}_\text{DFT}(\{\psi_i\}) = 2 \sum_i \int \psi_i^\ast H \psi_i + E_\text{nuclear} $$
(where $E_\text{nuclear}$ is a constant), the **gradient of $\mathcal{E}_\text{DFT}$**
with respect to the orbital $\psi_i$ can be represented as $4 H \psi_i$.
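
The factor of 4 can be checked numerically on a toy problem. The sketch below assumes a fixed random Hermitian matrix `Hfix` in place of the DFT Hamiltonian (so the dependence of $H$ on the orbitals is ignored) and drops the constant $E_\text{nuclear}$. It compares a central finite difference of the energy along a random direction `δ` with the real inner product of `δ` and $4 H \psi$. All names and sizes here are hypothetical.

```julia
using LinearAlgebra
using Random

Random.seed!(42)
n, N = 20, 4   # hypothetical sizes: n basis functions, N orbitals

# Fixed Hermitian stand-in for the Hamiltonian (an assumption of this toy model)
A = randn(ComplexF64, n, n)
Hfix = Hermitian((A + A') / 2)

ψ = Matrix(qr(randn(ComplexF64, n, N)).Q)   # orthonormal orbitals as columns
δ = randn(ComplexF64, n, N)                 # arbitrary perturbation direction

# Toy energy 2 Σ_i ψ_i' H ψ_i, without the constant nuclear term
energy(ψ) = 2 * real(tr(ψ' * Hfix * ψ))

# Gradient as stated in the text
grad = 4 * (Hfix * ψ)

# Directional derivative: central finite difference versus Re⟨δ, grad⟩
t = 1e-4
fd = (energy(ψ + t * δ) - energy(ψ - t * δ)) / (2t)
analytic = real(dot(δ, grad))

@show fd analytic
```

Here `dot` on two matrices is the Frobenius inner product with the first argument conjugated, so `real(dot(δ, grad))` is the first-order change of the energy along `δ`.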

We can proceed to a simple implementation:

In [None]:
using DFTK
using LinearAlgebra  # provides qr, used for the initial guess below
using Optim
using LineSearches

# Standard silicon setup
a = 10.26
lattice = a / 2 * [[0 1 1.];
                   [1 0 1.];
                   [1 1 0.]]
Si = ElementPsp(:Si, psp=load_psp("hgh/lda/si-q4"))
atoms = [Si => [ones(3)/8, -ones(3)/8]]
model  = model_DFT(lattice, atoms, [:lda_x, :lda_c_vwn])
basis  = PlaneWaveBasis(model; Ecut=10, kgrid=[1, 1, 1]);

# One unit cell has 2 silicon atoms.
# In the model we use (where only valence electrons are explicitly treated)
# this makes 8 electrons, which requires 4 bands with 2 electrons each:
occupation = [2.0, 2.0, 2.0, 2.0]

# We specify a random initial guess for the 4 orbitals:
n_G = length(G_vectors(only(basis.kpoints)))
ψ0 = Matrix(qr(randn(ComplexF64, n_G, 4)).Q);

# Function to compute energies and gradients
function fg!(E, G, ψ)
    ρ = compute_density(basis, [ψ], [occupation])
    energies, H = energy_hamiltonian(basis, [ψ], [occupation]; ρ=ρ)

    if G !== nothing
        # Optim expects the gradient in G
        G .= 4 * (H.blocks[1] * ψ)
    end
    energies.total
end

# Select a quasi-Newton algorithm with backtracking linesearches
# to avoid too many costly gradient evaluations
algorithm = Optim.LBFGS(manifold=Optim.Stiefel(),
                        linesearch=LineSearches.BackTracking())

# Set some convergence options in Optim:
options = Optim.Options(; allow_f_increases=true, show_trace=true, x_tol=1e-6)

# Run the direct minimisation
res = Optim.optimize(Optim.only_fg!(fg!), ψ0, algorithm, options)

@show res.minimum

#### More details
- [Geometry of algorithms with orthogonality constraints](https://doi.org/10.1137/S0895479895290954)
- [Convergence analysis of direct minimisation and self-consistent field iterations](https://doi.org/10.1137/20M1332864)