# Direct minimisation

- Recall that we introduced DFT as the minimisation problem
  $$ \min_{\{\psi_i\}} \mathcal{E}_\text{DFT}(\{\psi_i\}) $$
  where the energy expression $\mathcal{E}_\text{DFT}(\{\psi_i\})$ is a known analytic function and the $\{\psi_i\}$ are orthonormal orbitals.


- Taking a closer look at this minimisation problem, one might rightfully wonder why one should bother going through the SCF procedure at all. Could one not simply minimise the energy directly with respect to the orbitals?
- The answer is yes, and this leads to a procedure usually called **direct minimisation** (DM). In practice it provides an alternative and to some extent more natural route to a DFT ground state.

While the basic principle of a DM algorithm is relatively straightforward,
a practical implementation has to deal with a few subtle points:

- Due to the orthonormality constraints on the orbitals $\{\psi_i\}$,
  the set of admissible orbitals does not form a vector space,
  but rather a so-called Stiefel manifold.
  A DM algorithm needs to take this into account
  (via appropriate projections) to obtain the correct minimum.
  Fortunately `Optim.jl` can do that out of the box.

- DM in combination with metals (i.e. systems without a band gap) is tricky,
  because possible degeneracies of the orbitals at the Fermi level
  make it numerically difficult to compute the gradient of $\mathcal{E}_\text{DFT}$.
  
- In this workbook we will therefore assume our problem has a band gap
  (i.e. is an insulator) to make our life easier.
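
To make the point about the Stiefel manifold more concrete, here is a minimal sketch using only the `LinearAlgebra` standard library. It is not how `Optim.jl` or DFTK handle the constraint internally. The sizes `n`, `k` and the helper names `project_tangent` and `retract` are assumptions made up for this illustration. The sketch shows that the sum of two orthonormal sets is not orthonormal, how a direction can be projected onto the tangent space, and how a QR factorisation restores orthonormality after a step.

```julia
using LinearAlgebra
using Random

Random.seed!(0)
n, k = 6, 2   # hypothetical toy sizes: k orbitals with n coefficients each

# Two sets of orbitals with orthonormal columns (X'X = I).
X = Matrix(qr(randn(ComplexF64, n, k)).Q)
Y = Matrix(qr(randn(ComplexF64, n, k)).Q)

# Their sum is in general not orthonormal: the admissible set is not a vector space.
sum_is_orthonormal = norm((X + Y)' * (X + Y) - I) < 1e-8

# Project an arbitrary direction G onto the tangent space at X,
# i.e. onto the directions ξ satisfying X'ξ + ξ'X = 0.
project_tangent(X, G) = G - X * ((X' * G + G' * X) / 2)

# Map a full-rank matrix back to orthonormal columns spanning the same subspace.
retract(Z) = Matrix(qr(Z).Q)

G = randn(ComplexF64, n, k)
ξ = project_tangent(X, G)
X_new = retract(X + 0.1 * ξ)

println("X + Y orthonormal?      ", sum_is_orthonormal)
println("tangency residual:      ", norm(X' * ξ + ξ' * X))
println("orthonormality of step: ", norm(X_new' * X_new - I))
```

Both residuals should be at the level of floating-point round-off, whereas the sum `X + Y` fails the orthonormality check.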

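Now let's consider the gradient.
Since the total energy can also be written as
$$ \mathcal{E}_\text{DFT}(\{\psi_i\}) = 2 \sum_i \int \psi_i^\ast H \psi_i + E_\text{nuclear} $$
(where $E_\text{nuclear}$ is a constant),
the **gradient of $\mathcal{E}_\text{DFT}$** with respect to the orbital $\psi_i$
can be represented as $4 H \psi_i$:
one factor of 2 stems from the double occupation of each orbital
and another from differentiating the quadratic form $\int \psi_i^\ast H \psi_i$.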
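
The factor of 4 can be checked numerically on a toy problem. The sketch below is an illustration under simplifying assumptions, not DFTK code: `H` is a fixed random Hermitian matrix standing in for the Hamiltonian (in actual DFT, $H$ itself depends on the orbitals through the density), the constant $E_\text{nuclear}$ is dropped, and the names `energy` and `gradient` are hypothetical. The gradient is understood with respect to the real inner product $\mathrm{Re}\,\mathrm{tr}(\delta^\dagger G)$, so the directional derivative along $\delta$ should equal $\mathrm{Re}\,\mathrm{tr}(\delta^\dagger \, 4 H \Psi)$.

```julia
using LinearAlgebra
using Random

Random.seed!(1)
n, k = 8, 3                      # hypothetical toy sizes
A = randn(ComplexF64, n, n)
H = (A + A') / 2                 # fixed Hermitian stand-in for the Hamiltonian

# Toy energy 2 Σ_i ψ_i' H ψ_i (without the constant) and its claimed gradient.
energy(Ψ)   = 2 * real(tr(Ψ' * H * Ψ))
gradient(Ψ) = 4 * H * Ψ

Ψ = Matrix(qr(randn(ComplexF64, n, k)).Q)   # orthonormal orbitals as columns
δ = randn(ComplexF64, n, k)                 # arbitrary perturbation direction

# Central finite difference of the energy along δ ...
ε = 1e-6
fd = (energy(Ψ + ε * δ) - energy(Ψ - ε * δ)) / (2ε)

# ... compared with the directional derivative predicted by the gradient.
analytic = real(dot(δ, gradient(Ψ)))

println("finite difference: ", fd)
println("Re⟨δ, 4HΨ⟩:        ", analytic)
```

Since the toy energy is quadratic in `Ψ`, the two numbers should agree up to round-off.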
  
With this at hand we can proceed with a simple implementation:

```julia
using DFTK
using LinearAlgebra  # provides qr, used for the initial guess below
using Optim
using LineSearches

# Standard silicon setup
# We'll keep things simple by limiting ourselves to a single k-point.
a = 10.26
lattice = a / 2 * [[0 1 1.];
                   [1 0 1.];
                   [1 1 0.]]
Si = ElementPsp(:Si, psp=load_psp("hgh/lda/si-q4"))
atoms = [Si => [ones(3)/8, -ones(3)/8]]
model  = model_DFT(lattice, atoms, [:lda_x, :lda_c_vwn])
basis  = PlaneWaveBasis(model; Ecut=10, kgrid=[1, 1, 1]);

# One unit cell has 2 silicon atoms.
# In the model we use (where only valence electrons are explicitly treated)
# this makes 8 electrons, which requires 4 bands with 2 electrons each:
occupation = [2.0, 2.0, 2.0, 2.0]

# We specify a random initial guess for the 4 orbitals
# (the QR factorisation makes the columns orthonormal):
n_G = length(G_vectors(only(basis.kpoints)))
ψ0 = Matrix(qr(randn(ComplexF64, n_G, 4)).Q);

# to be filled in during the session ...
```
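
Independently of the DFTK implementation developed in the session, the overall structure of a DM iteration can be sketched on a toy linear problem. Everything here is an assumption made for illustration: `H` is a fixed Hermitian matrix with a prescribed spectrum `1, 2, …, n` (so there is a gap between the 4th and 5th eigenvalue, mimicking an insulator), the optimiser is a plain fixed-step projected gradient descent rather than one of the `Optim.jl` algorithms, and the names `toy_direct_minimisation`, `project_tangent` and `retract` are hypothetical.

```julia
using LinearAlgebra
using Random

Random.seed!(2)
n, k = 20, 4                                  # hypothetical toy sizes
λ = collect(1.0:n)                            # prescribed eigenvalues 1, 2, ..., n
Q = Matrix(qr(randn(ComplexF64, n, n)).Q)
H = Q * Diagonal(λ) * Q'
H = (H + H') / 2                              # remove round-off asymmetry

energy(Ψ) = 2 * real(tr(Ψ' * H * Ψ))
project_tangent(Ψ, G) = G - Ψ * ((Ψ' * G + G' * Ψ) / 2)
retract(Z) = Matrix(qr(Z).Q)

# Fixed-step projected gradient descent on the set of orthonormal orbitals.
function toy_direct_minimisation(H, Ψ; step, maxiter=10_000, tol=1e-8)
    for iter in 1:maxiter
        ξ = project_tangent(Ψ, 4 * H * Ψ)     # gradient 4HΨ, made tangent
        norm(ξ) < tol && return Ψ, iter
        Ψ = retract(Ψ - step * ξ)             # step, then restore orthonormality
    end
    return Ψ, maxiter
end

Ψ0 = Matrix(qr(randn(ComplexF64, n, k)).Q)    # random orthonormal initial guess
# Conservative step: the largest eigenvalue of H is n by construction.
Ψ_min, n_iter = toy_direct_minimisation(H, Ψ0; step=1 / (8n))

# For fixed H the minimum is twice the sum of the k lowest eigenvalues.
E_reference = 2 * sum(λ[1:k])
println("iterations:       ", n_iter)
println("DM energy:        ", energy(Ψ_min))
println("reference energy: ", E_reference)
```

A fixed step is chosen here only to keep the sketch short. In practice one would hand the energy and gradient to a quasi-Newton or conjugate-gradient method with a line search, which is why `Optim` and `LineSearches` are loaded in the setup above.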

#### More details
- [Geometry of algorithms with orthogonality constraints](https://doi.org/10.1137/S0895479895290954)
- [Convergence analysis of direct minimisation and self-consistent field iterations](https://doi.org/10.1137/20M1332864)