# Electron Correlation

Hartree-Fock theory provides a first approximation to the solution of the electronic Schrödinger equation. However, Hartree-Fock theory is not exact even at the complete basis set limit. The difference between the exact energy and the Hartree-Fock energy is called the **correlation energy**:

\begin{align}
E_{corr} = E_{Exact} - E_{HF}
\end{align}

In what sense does Hartree-Fock theory neglect electron correlation? Remember that the Hartree-Fock trial wavefunction, the Slater determinant, does account for *exchange correlation*, which arises from the Pauli exclusion principle. That is, electrons with the same spin are correlated while those with opposite spins are uncorrelated. The more approximate Hartree product neglects exchange correlation, and is a true independent particle model. Hartree-Fock theory is a mean-field theory that describes the interaction of an electron with the *average* charge density of all other electrons. The approximate nature of Hartree-Fock theory stems from the fact that it uses only a *single* Slater determinant as a trial wavefunction. Indeed, Hartree-Fock theory provides the *best* estimate of the energy given a single Slater determinant as the trial wavefunction, as guaranteed by the variational procedure. Therefore, we can improve the energy by including additional Slater determinants in the trial wavefunction.

Electron correlation is often divided into two types: dynamical and static. **Dynamical correlation** accounts for the instantaneous correlated motion of the electrons, which is neglected in the mean-field approximation of Hartree-Fock theory. In **static correlation**, the Hartree-Fock single Slater determinant method fails to qualitatively describe the wavefunction, and *multi-reference* methods are required. This generally happens when several Slater determinants have similar weights, which arises because of (near) degeneracies in the energy levels. We will first discuss dynamical correlation.

## Revisiting the H$_2$ Minimal Basis Set Model
H$_2$ in a minimal basis set has the following possible Slater determinants:

\begin{align}
|\Psi_0 \rangle = | 1 \overline{1} \rangle = & \underline{\quad} \, \psi_2 \\
                                             & \underline{\uparrow \downarrow} \, \psi_1.
\\
| \Psi_1^2 \rangle = |2\overline{1} \rangle = & \underline{\uparrow}  \, \psi_2\\
                                              & \underline{\downarrow} \, \psi_1,
\\
|\Psi_1^\overline{2} \rangle = |\overline{21} \rangle = & \underline{\downarrow}  \, \psi_2\\
                                                        & \underline{\downarrow}  \, \psi_1,
\\
|\Psi_\overline{1}^2 \rangle = |12 \rangle = & \underline{\uparrow}  \, \psi_2\\
                                                        & \underline{\uparrow}  \, \psi_1,
\\
|\Psi_\overline{1}^\overline{2} \rangle = |1\overline{2} \rangle = & \underline{\downarrow} \psi_2\\
                                                        & \underline{\uparrow}  \, \psi_1.
\\
|\Psi_{1 \overline{1}}^{2 \overline{2}} \rangle = |2 \overline{2} \rangle = 
    & \underline{\uparrow \downarrow} \, \psi_2 \\
    & \underline{\quad} \, \psi_1. 
\end{align}

In Hartree-Fock theory, only the ground state Slater determinant $|\Psi_0 \rangle$ is used and all the others are neglected. An *exact* wavefunction within the space spanned by the minimal basis set is given by the linear combination of all possible Slater determinants:

\begin{align}
|\Phi_0 \rangle = c_0 |\Psi_0 \rangle + c_1^2 | \Psi_1^2 \rangle + c_1^\overline{2} |\Psi_1^\overline{2} \rangle + 
                c_\overline{1}^2 |\Psi_\overline{1}^2 \rangle + 
                c_\overline{1}^\overline{2} |\Psi_\overline{1}^\overline{2} \rangle + 
                c_{1 \overline{1}}^{2 \overline{2}} |\Psi_{1 \overline{1}}^{2 \overline{2}} \rangle
\end{align}

Based on the symmetry of the ground state (totally symmetric and gerade), it can be shown that only $c_0$ and $c_{1 \overline{1}}^{2 \overline{2}}$ are nonzero. The singly excited configurations are antisymmetric and ungerade and therefore they do not contribute to the wavefunction. Thus, the wavefunction reduces to:
\begin{align}
|\Phi_0 \rangle = c_0 |\Psi_0 \rangle +
                c_{1 \overline{1}}^{2 \overline{2}} |\Psi_{1 \overline{1}}^{2 \overline{2}} \rangle
\end{align}

The exact values of the coefficients and the value of the exact energy ($\langle \Phi_0| \hat{H} |\Phi_0 \rangle$, for a normalized $|\Phi_0 \rangle$) can be calculated by diagonalizing the matrix:
\begin{align}
\textbf{H} = \begin{pmatrix}
                \langle \Psi_0| \hat{H} |\Psi_0 \rangle &
                \langle \Psi_0| \hat{H} |\Psi_{1 \overline{1}}^{2 \overline{2}} \rangle \\
                \langle \Psi_{1 \overline{1}}^{2 \overline{2}}| \hat{H} | \Psi_0 \rangle &
                \langle \Psi_{1 \overline{1}}^{2 \overline{2}}| \hat{H} |\Psi_{1 \overline{1}}^{2 \overline{2}} \rangle
             \end{pmatrix}
\end{align}
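For a real symmetric $2 \times 2$ matrix such as this one, the lowest eigenvalue has a closed form, so the diagonalization can be sketched without any linear algebra library. The function name and the numerical values below are illustrative placeholders, not integrals computed for H$_2$:

```python
import math

def lowest_eigenvalue_2x2(h00, h01, h11):
    """Lowest eigenvalue of the real symmetric matrix [[h00, h01], [h01, h11]]."""
    mean = 0.5 * (h00 + h11)
    half_gap = 0.5 * (h11 - h00)
    return mean - math.sqrt(half_gap**2 + h01**2)

# Placeholder matrix elements in hartree (assumed values, for illustration only)
h00, h01, h11 = -1.0, 0.1, 0.0

e_ci = lowest_eigenvalue_2x2(h00, h01, h11)
e_corr = e_ci - h00  # energy lowering relative to the reference determinant

print("Lowest eigenvalue:", e_ci)
print("Lowering relative to the reference:", e_corr)
```

Whenever the off-diagonal element is nonzero, the lowest eigenvalue lies below the diagonal element of the reference determinant; in this model that lowering plays the role of the correlation energy.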

The above example showed all the possible determinants for H$_2$ in a minimal basis set. However, if we increase the basis set size, the number of excited determinants increases rapidly. In turn, the size of the Hamiltonian matrix also increases. Below, we will generalize the discussion to describe the exact wavefunction for $N$ electron systems in any given basis.

## The Configuration Interaction Wavefunction
Consider an $N$ electron system whose spatial molecular orbitals $\psi$ are expanded using a basis set with $K$ basis functions. The Hartree-Fock procedure generates a set $\{\chi_i\}$ of $2K$ spin orbitals. The Hartree-Fock ground state is just the Slater determinant of the occupied spin orbitals:
\begin{align}
    |\Psi_0 \rangle = | \chi_1 \chi_2 \cdots \chi_a \chi_b \cdots \chi_N \rangle
\end{align}
However, as illustrated above, there is a large set of possible Slater determinants. The number of possible determinants is equal to:
\begin{align}
    \begin{pmatrix}
        2K \\
        N
    \end{pmatrix}
    = \frac{(2K)!}{N!(2K-N)!}
\end{align}

Hartree-Fock theory ignores all these possible determinants except one! How big is this number? Let's examine it for a few values of $K$ and $N$.

In [None]:
import math

def number_determinants(K, N):
    return math.comb(2*K, N)

print("Number of Determinants")
N_values = range(2, 11, 2)
K_values = range(10, 31, 4)
print("%-18s" %("K\\N") + ''.join(["%-18i" %N for N in N_values]))
for K in K_values:
    print("%-18i" %(K) + ''.join(["%-18i"%(number_determinants(K, N)) for N in N_values]))

The number of determinants increases rapidly as the number of electrons and the size of the basis set increase.

How can we write the excited determinants? Consider again the Hartree-Fock determinant:
\begin{align}
    |\Psi_0 \rangle = | \chi_1 \chi_2 \cdots \chi_a \chi_b \cdots \chi_N \rangle
\end{align}
We can excite a single electron from a given occupied orbital $a$ to a given virtual orbital $r$. The new Slater determinant can be written as:
\begin{align}
    |\Psi_a^r \rangle = | \chi_1 \chi_2 \cdots \chi_r \chi_b \cdots \chi_N \rangle
\end{align}
This is a singly excited determinant. Similarly, we can build a doubly excited determinant as:
\begin{align}
    |\Psi_{ab}^{rs} \rangle = | \chi_1 \chi_2 \cdots \chi_r \chi_s \cdots \chi_N \rangle
\end{align}
With this procedure, we can build all possible excited determinants. Then, the form of the exact wavefunction is given by:
\begin{align}
    |\Phi \rangle = c_0 |\Psi_0 \rangle + 
                    \sum_{ra} c_a^r |\Psi_a^r \rangle + 
                    \sum_{\substack{a<b \\ r<s}} c_{ab}^{rs} |\Psi_{ab}^{rs} \rangle +
                    \sum_{\substack{a<b<c \\ r<s<t}} c_{abc}^{rst} |\Psi_{abc}^{rst} \rangle +
                    \cdots
\end{align}
With this wavefunction, the *exact* non-relativistic energy of the system can be calculated. This method is called **configuration interaction (CI)**. **Full CI** is the exact energy for a given finite basis set. **Complete CI** is the exact energy for a complete basis set.
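To see how the terms of this expansion add up to the full determinant count, the sketch below counts the determinants at each excitation level. A $k$-fold excitation removes $k$ of the $N$ occupied spin orbitals and fills $k$ of the $2K-N$ virtual ones. The function name is illustrative, and no spin or spatial symmetry is used to discard determinants:

```python
import math

def determinants_by_excitation(K, N):
    """Number of determinants at each excitation level k = 0, 1, 2, ...
    for N electrons in 2K spin orbitals (no symmetry restrictions)."""
    n_virtual = 2 * K - N
    max_level = min(N, n_virtual)
    return [math.comb(N, k) * math.comb(n_virtual, k) for k in range(max_level + 1)]

# H2 in a minimal basis: K = 2 spatial orbitals, N = 2 electrons
counts = determinants_by_excitation(K=2, N=2)
print("Reference, singles, doubles:", counts)
print("Total:", sum(counts), "| binomial(2K, N):", math.comb(4, 2))
```

For the H$_2$ minimal basis model this reproduces the six determinants listed earlier: one reference, four singly excited, and one doubly excited.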

Note the above expansion provides the exact wavefunction whether or not $|\Psi_0 \rangle$ is taken to be the Hartree-Fock wavefunction. In practice, $|\Psi_0 \rangle$ is set to the Hartree-Fock wavefunction if the system does not display static correlation.

## Slater Rules for Calculating Matrix Elements
We need to calculate a huge number of elements in order to build the Hamiltonian matrix for the CI wavefunction. We can potentially reduce the number by utilizing the symmetry of the wavefunction to remove coefficients that are necessarily zero. Still, the number of determinants will be extremely large. Fortunately, there are a few sets of rules for computing the matrix elements for the one- and two-electron operators. We will consider three general Slater determinants. The first determinant is given by:
\begin{align}
    |K \rangle = | \cdots \chi_m \chi_n \cdots \rangle
\end{align}
In the second determinant, we replace spin orbital $\chi_m$ with $\chi_p$, *keeping all other orbitals in the same order*:
\begin{align}
    |L \rangle = | \cdots \chi_p \chi_n \cdots \rangle
\end{align}
In the third determinant, we also replace spin orbital $\chi_n$ with $\chi_q$:
\begin{align}
    |M \rangle = | \cdots \chi_p \chi_q \cdots \rangle
\end{align}

The matrix elements for the one-electron operator $O_1 =\sum_{i=1}^N h(i)$ are given in the following table:

| Case                        | Value               |
|------------------------------|---------------------|
|$$\langle K| O_1 |K \rangle$$ | $$\sum_m^N [m|h|m]$$|
|$$\langle K| O_1 |L \rangle$$ | $$ [m|h|p]$$        |
|$$\langle K| O_1 |M \rangle$$ | $$0$$               |

For the two-electron operator $O_2 =\sum_{i=1}^N \sum_{j>i}^N r_{ij}^{-1}$, the matrix elements are given by:

| Case                                            | Value                                               |
|-------------------------------------------------|-----------------------------------------------------|
|$$\langle K| O_2 | K \rangle$$                   | $$\frac{1}{2} \sum_m^N \sum_n^N [mm|nn] - [mn|nm]$$ |
|$$\langle K| O_2 | L \rangle$$                   | $$ \sum_n^N [mp|nn] - [mn|np]$$                     |
|$$\langle K| O_2 | M \rangle$$                   | $$ [mp|nq] - [mq|np] $$                             |

For the two-electron operator, if two determinants differ by three or more spin orbitals, the matrix element will be zero.
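Choosing the applicable rule only requires counting how many spin orbitals differ between the two determinants. The sketch below represents a determinant as a tuple of spin-orbital labels; this representation and the function names are assumptions made for illustration. It ignores the sign that arises when the orbitals are reordered into maximum coincidence:

```python
def n_differences(det_a, det_b):
    """Number of spin orbitals of det_a that are replaced in det_b."""
    a, b = set(det_a), set(det_b)
    if len(a) != len(b):
        raise ValueError("determinants must contain the same number of electrons")
    return len(a - b)

def slater_rule_case(det_a, det_b):
    """Return (number of differences, O_1 can be nonzero, O_2 can be nonzero)."""
    n = n_differences(det_a, det_b)
    one_electron = n <= 1  # O_1 vanishes for two or more differences
    two_electron = n <= 2  # O_2 vanishes for three or more differences
    return n, one_electron, two_electron

reference = (1, 2)
for other in [(1, 2), (1, 3), (3, 4)]:
    print(reference, other, slater_rule_case(reference, other))
```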


For a Hartree-Fock reference state, we can use the $a,b$ indices for occupied orbitals and the $r, s$ indices for virtual orbitals. The two tables below show the matrix elements with the Hartree-Fock ground state:

| Case                                          | Value               |
|-----------------------------------------------|---------------------|
|$$\langle \Psi_0| O_1 |\Psi_0 \rangle$$        | $$\sum_a^N [a|h|a]$$|
|$$\langle \Psi_0| O_1 |\Psi_a^r \rangle$$      | $$ [a|h|r]$$        |
|$$\langle \Psi_0| O_1 |\Psi_{ab}^{rs} \rangle$$| $$0$$               |


| Case                                            | Value                                               |
|-------------------------------------------------|-----------------------------------------------------|
|$$\langle \Psi_0| O_2 |\Psi_0 \rangle$$          | $$ \sum_a^N \sum_{b>a}^N [aa|bb] - [ab|ba]$$        |
|$$\langle \Psi_0| O_2 |\Psi_a^r \rangle$$        | $$ \sum_b^N [ar|bb] - [ab|br]$$                     |
|$$\langle \Psi_0| O_2 |\Psi_{ab}^{rs} \rangle$$  | $$ [ar|bs] - [as|br] $$                             |

## Example: The Matrix Elements for the H$_2$ Minimal Basis Set Model
Using the above Slater rules, let's write the matrix elements for the H$_2$ molecule in a minimal basis set. The Hamiltonian is just the sum of the one-electron and the two-electron operators. The Hamiltonian matrix is given by:

\begin{align}
\textbf{H} = \begin{pmatrix}
                \langle \Psi_0| \hat{H} |\Psi_0 \rangle &
                \langle \Psi_0| \hat{H} |\Psi_{1 \overline{1}}^{2 \overline{2}} \rangle \\
                \langle \Psi_{1 \overline{1}}^{2 \overline{2}}| \hat{H} | \Psi_0 \rangle &
                \langle \Psi_{1 \overline{1}}^{2 \overline{2}}| \hat{H} |\Psi_{1 \overline{1}}^{2 \overline{2}} \rangle
             \end{pmatrix} = 
             \begin{pmatrix}
                \langle \Psi_0| \hat{H} |\Psi_0 \rangle &
                \langle \Psi_0| \hat{H} |\Psi_{1 2}^{3 4} \rangle \\
                \langle \Psi_{1 2}^{3 4}| \hat{H} | \Psi_0 \rangle &
                \langle \Psi_{1 2}^{3 4}| \hat{H} | \Psi_{1 2}^{34} \rangle
             \end{pmatrix}         
\end{align}

The matrix elements are as follows:

\begin{align}
\langle \Psi_0| \hat{H} |\Psi_0 \rangle &= \langle \Psi_0| O_1 |\Psi_0 \rangle + \langle \Psi_0| O_2 |\Psi_0 \rangle \\
                                        &= \sum_a^2 [a|h|a] + \sum_a^2 \sum_{b>a}^2 [aa|bb] - [ab|ba] \\
                                        &= [1|h|1] + [2|h|2] + [11|22] - [12|21].
\end{align}
This is just the Hartree-Fock energy.

The second matrix element is:
\begin{align}
    \langle \Psi_0| \hat{H} |\Psi_{1 2}^{3 4} \rangle &=
            \langle \Psi_0| O_1 |\Psi_{1 2}^{3 4} \rangle + \langle \Psi_0| O_2 |\Psi_{1 2}^{3 4} \rangle \\
            &= [13|24] - [14|23].
\end{align}

For the third matrix element:
\begin{align}
\langle \Psi_{1 2}^{3 4}| \hat{H} | \Psi_0 \rangle = [31|42] - [32|41].
\end{align}

Finally, the fourth matrix element is:
\begin{align}
\langle \Psi_{1 2}^{3 4}| \hat{H} | \Psi_{1 2}^{34} \rangle &= 
    \langle \Psi_{1 2}^{3 4}| O_1 | \Psi_{1 2}^{34} \rangle +
    \langle \Psi_{1 2}^{3 4}| O_2 | \Psi_{1 2}^{34} \rangle \\
    &= [3|h|3] + [4|h|4] + [33|44] - [34|43].
\end{align}

All one-electron integrals survive spin integration. Let's now do the spin integration for the two-electron integrals. Let orbital $1$ have $\alpha$ spin, orbital $2$ have $\beta$ spin, orbital $3$ have $\alpha$ spin, and orbital $4$ have $\beta$ spin. If two orbitals have opposite spins and are on the same side of the bracket, the spin integration yields zero, so $[12|21]$, $[14|23]$, $[32|41]$, and $[34|43]$ vanish. Writing the surviving integrals over the spatial parts of the orbitals with parentheses, the Hamiltonian matrix becomes:

\begin{align}
    \textbf{H} &= \begin{pmatrix}
                    \langle \Psi_0| \hat{H} |\Psi_0 \rangle &
                    \langle \Psi_0| \hat{H} |\Psi_{1 2}^{3 4} \rangle \\
                    \langle \Psi_{1 2}^{3 4}| \hat{H} | \Psi_0 \rangle &
                    \langle \Psi_{1 2}^{3 4}| \hat{H} | \Psi_{1 2}^{34} \rangle
                 \end{pmatrix} \\
               &= \begin{pmatrix}
                    (1|h|1) + (2|h|2) + (11|22) & (13|24)             \\
                    (31|42)                     & (3|h|3) + (4|h|4) + (33|44)
                  \end{pmatrix}
\end{align}
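The spin-integration rule used above can be written as a small check: in the notation $[pq|rs]$, orbitals $p$ and $q$ belong to the first electron and $r$ and $s$ to the second, so the integral survives only if each pair has matching spins. The function name and the dictionary of spin labels are illustrative:

```python
def survives_spin_integration(p, q, r, s, spin):
    """True if the spin-orbital integral [pq|rs] is not forced to zero by spin.

    p and q share one electron coordinate, r and s share the other, so each
    pair must carry the same spin label.
    """
    return spin[p] == spin[q] and spin[r] == spin[s]

# Spin assignment used in the text for the H2 minimal basis model
spin = {1: "alpha", 2: "beta", 3: "alpha", 4: "beta"}

integrals = [(1, 1, 2, 2), (1, 2, 2, 1), (1, 3, 2, 4),
             (1, 4, 2, 3), (3, 3, 4, 4), (3, 4, 4, 3)]
for p, q, r, s in integrals:
    label = "[%d%d|%d%d]" % (p, q, r, s)
    print(label, "survives" if survives_spin_integration(p, q, r, s, spin) else "vanishes")
```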

Below we will calculate the *exact* full-CI energy for H$_2$ for increasingly larger basis sets. It should be emphasized that full-CI energies can be computed only for systems with small numbers of electrons. We will first calculate the full-CI energy for H$_2$ in the minimal basis set by building and diagonalizing the above Hamiltonian matrix. Then, we will calculate the full-CI energies for larger basis sets using the Psi4 package.

In [None]:
import psi4
import numpy as np

psi4.core.set_output_file("output.dat", True)

mol = psi4.geometry("""
H
H 1 0.74
symmetry c1
""")

# We will first calculate the Hartree-Fock energy and wavefunction
e, wfn = psi4.energy("scf/sto-3g", return_wfn=True)

# We will create an object that holds the molecular integrals.
mints = psi4.core.MintsHelper(wfn.basisset())

# We first build the core Hamiltonian or the one-electron integral
T = np.asarray(mints.ao_kinetic())
V = np.asarray(mints.ao_potential())
H = T + V

# Get the occupied and virtual orbital coefficients.
# Note that this is a restricted Hartree-Fock calculation,
# so the alpha electrons and beta electrons have the same orbital coefficients and density matrix
Co = wfn.Ca_subset('AO','OCC')
Cv = wfn.Ca_subset('AO','VIR')

# Calculate the density matrices from the orbital coefficients
Do = np.einsum('pi,qi->pq', Co.np, Co.np)
Dv = np.einsum('pi,qi->pq', Cv.np, Cv.np)

# Calculate the one-electron integrals in the molecular orbital basis
h_11 = h_22 = np.einsum('pq,pq->', H, Do)
h_33 = h_44 = np.einsum('pq,pq->', H, Dv)

# Now calculate the two-electron integrals in the molecular orbital basis
# This can be calculated directly in Psi4 by providing the appropriate orbital coefficients
i_1122 = float(np.squeeze(np.asarray(mints.mo_eri(Co, Co, Co, Co))))
i_1324 = float(np.squeeze(np.asarray(mints.mo_eri(Co, Cv, Co, Cv))))
i_3142 = float(np.squeeze(np.asarray(mints.mo_eri(Cv, Co, Cv, Co))))
i_3344 = float(np.squeeze(np.asarray(mints.mo_eri(Cv, Cv, Cv, Cv))))

# With all the above components, we can now build the Hamiltonian full-CI matrix in a minimal basis
H_matrix = np.array([
    [h_11 + h_22 + i_1122 , i_1324         ],
    [i_3142,                h_33 + h_44 + i_3344]
]
)

# Find the eigenvalues of the symmetric matrix. eigvalsh returns them in ascending order,
# so the first eigenvalue is the ground state energy
energies = np.linalg.eigvalsh(H_matrix)

# Add the nuclear repulsion energy between the two hydrogen atoms
E_nuc = mol.nuclear_repulsion_energy()
fci = energies[0] + E_nuc

print("The full CI energy of H2 in the STO-3G basis set is equal to", fci)

fci_psi4 = psi4.energy("fci/sto-3g")
print("The full CI energy of H2 in the STO-3G basis set calculated with Psi4 is equal to", fci_psi4)
print("The calculated value agrees with Psi4:", np.isclose(fci, fci_psi4))

We have demonstrated above how to perform the full-CI calculation in a minimal basis set. Even though this value is *exact* within the space spanned by the minimal basis set, this value is not very useful because the basis set is very small and the basis set error is likely very large. We can now use the Psi4 package to calculate the full-CI energy for different basis sets.

In [None]:
import psi4
import matplotlib.pyplot as plt

psi4.core.set_output_file('output.dat', True)

psi4.set_options({"scf_type": "pk", "reference": "rhf"})

mol = psi4.geometry("""0 1
H
H 1 0.74
""")

basis_sets = ["sto-3g", "cc-pvdz", "cc-pvtz", "cc-pvqz", "cc-pv5z", "cc-pv6z"]

hf_energies = []
fci_energies = []
print("%-12s%-24s%-24s%-20s%-20s" %("Zeta", "# Basis functions", "# CI Determinants", "HF Energy (Ha)",
                                    "Full-CI Energy (Ha)"))
for i, basis in enumerate(basis_sets):
    hf_e, wfn = psi4.energy("scf/" + basis, return_wfn=True)
    hf_energies.append(hf_e)
    
    fci_e = psi4.energy("fci/" + basis)
    fci_energies.append(fci_e)

    # Count all determinants for the two electrons of H2 in this basis
    # (number_determinants is defined in an earlier cell)
    nbf = wfn.basisset().nbf()
    n_det = number_determinants(nbf, 2)
    
    print("%-12i%-24i%-24i%-20.6f%-20.6f" %(i + 1, nbf, n_det, hf_e, fci_e))
    

zeta = range(1, 7)

plt.figure()
plt.plot(zeta, hf_energies, "-o", label="Hartree-Fock")
plt.plot(zeta, fci_energies, "--s", label="Full CI")
plt.xlabel("Zeta")
plt.ylabel("Energy (Hartree)")
plt.legend()
plt.show()

## Electron Correlation Methods

As mentioned above, the exact wavefunction is given by the following CI expansion:
\begin{align}
    |\Phi \rangle = c_0 |\Psi_0 \rangle + 
                    \sum_{ra} c_a^r |\Psi_a^r \rangle + 
                    \sum_{\substack{a<b \\ r<s}} c_{ab}^{rs} |\Psi_{ab}^{rs} \rangle +
                    \sum_{\substack{a<b<c \\ r<s<t}} c_{abc}^{rst} |\Psi_{abc}^{rst} \rangle +
                    \cdots
\end{align}

If we calculate the energy using this wavefunction, we get the exact (full-CI) energy within a given basis set and we obtain the exact wavefunction (i.e. we find the expansion coefficients). In practice, this is only possible for systems with a few electrons, so further approximations are necessary.

The various electron-correlation methods try to estimate the energy by selecting the important determinants. There are three main classes of post-Hartree-Fock methods:
- Truncated CI
- Møller-Plesset Perturbation Theory
- Coupled Cluster Theory

## Truncated CI

In truncated CI, the full CI wavefunction is truncated to a given excitation level. We thus have the following hierarchy of truncated CI methods:
- CIS: Includes up to single excitations: $|\Phi \rangle = c_0 |\Psi_0 \rangle + \sum_{ra} c_a^r |\Psi_a^r \rangle$
- CISD: Includes up to double excitations: $|\Phi \rangle = c_0 |\Psi_0 \rangle + \sum_{ra} c_a^r |\Psi_a^r \rangle + \sum_{\substack{a<b \\ r<s}} c_{ab}^{rs} |\Psi_{ab}^{rs} \rangle$
- CISDT: Includes up to triple excitations: $|\Phi \rangle = c_0 |\Psi_0 \rangle + \sum_{ra} c_a^r |\Psi_a^r \rangle + \sum_{\substack{a<b \\ r<s}} c_{ab}^{rs} |\Psi_{ab}^{rs} \rangle + \sum_{\substack{a<b<c \\ r<s<t}} c_{abc}^{rst} |\Psi_{abc}^{rst} \rangle$
- $\cdots$

In fact, CIS, which includes only single excitations, does not contribute to the correlation energy of the ground state because of **Brillouin's theorem**, which states that singly excited determinants $|\Psi_a^r \rangle$ do not interact directly with a reference Hartree-Fock determinant $|\Psi_0 \rangle$, i.e. $\langle \Psi_0| H | \Psi_a^r \rangle = 0$. However, in methods with higher excitation levels, singly excited determinants can contribute to the correlation energy by mixing with other determinants, e.g. through $\langle \Psi_a^r| H | \Psi_{ab}^{rs} \rangle $. Why does Brillouin's theorem hold? Using Slater's rules, the matrix element between the Hartree-Fock determinant and a singly excited determinant is given by:
\begin{align}
\langle \Psi_0| H | \Psi_a^r \rangle = \langle r | f | a \rangle,
\end{align}
where $f$ is the Fock operator. Since the molecular orbitals are eigenfunctions of the Fock operator, we have:
\begin{align}
\langle r | f | a \rangle = \epsilon_a \langle r | a \rangle = \epsilon_a \delta_{ar} = 0
\end{align}
because to be singly excited, $r$ must not be equal to $a$. Despite not contributing to the ground state correlation energy, the CIS method can be used to improve the description of excited electronic states.

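The above truncated CI methods scale much more steeply with system size than Hartree-Fock theory. For example, the cost of CISD scales as $N^6$ and that of CISDTQ as $N^{10}$, where $N$ here is a measure of the system size (such as the number of basis functions).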

### Advantage of Truncated CI
Besides its conceptual simplicity, the main advantage of truncated CI is that it is *variational*. The energy can be systematically improved by including determinants with higher excitation levels. For a system with $N$ electrons, including up to $N$-fold excitations is equivalent to full CI. For example, for He, CISD is equivalent to full CI as shown below:

In [None]:
import psi4
import numpy as np

psi4.geometry("""He
""")
e_fci = psi4.energy("fci/cc-pvdz")
e_cisd = psi4.energy("cisd/cc-pvdz")

print("Full CI energy is", e_fci)
print("CISD energy is", e_cisd)
print("Are they equal?", np.isclose(e_fci, e_cisd))

Now consider the Ne atom, which has ten electrons. We will compute the CI energy for different excitation levels using a small double-zeta basis set. The energy for all calculations will be an upper bound to the exact full CI energy. The energy will decrease progressively as the level of excitations increases.

In [None]:
import psi4
import matplotlib.pyplot as plt

psi4.geometry("""Ne
""")
psi4.set_options({"scf_type": "pk"})

e_hf = psi4.energy("scf/cc-pcvdz")
e_cisd = psi4.energy("cisd/cc-pcvdz")
e_cisdt = psi4.energy("cisdt/cc-pcvdz")
e_cisdtq = psi4.energy("cisdtq/cc-pcvdz")
e_fci = psi4.energy("fci/cc-pcvdz")

energies = [e_hf, e_cisd, e_cisdt, e_cisdtq, e_fci]
labels = ["HF", "CISD", "CISDT", "CISDTQ", "FCI"]

# Plot data
plt.figure()
plt.plot(range(len(energies)), energies, "-o")
plt.gca().ticklabel_format(useOffset=False)
plt.gca().set_xticks(range(len(energies)))
plt.gca().set_xticklabels(labels)
plt.xlabel("Method")
plt.ylabel("Energy (Hartree)")
plt.show()

### The Size-Consistency Problem in Truncated CI
Truncated CI has the serious deficiency that it is not **size-consistent**. Consider two helium atoms separated by a very large distance. The two helium atoms are not interacting and therefore we expect the total energy to be the sum of the energies of the two helium atoms. We will see that this is not the case for truncated CI methods. To illustrate this, we will calculate the energy below using the Hartree-Fock, CISD, CISDT, and the full CI methods.

In [None]:
import psi4
import matplotlib.pyplot as plt

psi4.set_output_file("output.dat", True)

# First calculate the energy of a single helium atom
he = psi4.geometry("""He""")
e_he_hf = psi4.energy("scf/cc-pvdz")
e_he_cisd = psi4.energy("cisd/cc-pvdz")
e_he_cisdt = psi4.energy("cisdt/cc-pvdz")
e_he_fci = psi4.energy("fci/cc-pvdz")

# Now calculate the energy of a noninteracting helium dimer
he_dimer = psi4.geometry("""
    He    
    He 1 10000
""")

e_he2_hf = psi4.energy("scf/cc-pvdz")
e_he2_cisd = psi4.energy("cisd/cc-pvdz")
e_he2_cisdt = psi4.energy("cisdt/cc-pvdz")
e_he2_fci = psi4.energy("fci/cc-pvdz")
 
# Now calculate the difference between the energy of the dimer and twice the energy of the monomer
energies = [e_he2_hf-2*e_he_hf, e_he2_cisd-2*e_he_cisd, e_he2_cisdt-2*e_he_cisdt, e_he2_fci-2*e_he_fci]

labels = ["HF", "CISD", "CISDT", "FCI"]

plt.figure()
plt.plot(range(len(energies)), energies, "o")
plt.gca().set_xticks(range(len(energies)))
plt.gca().set_xticklabels(labels)
plt.xlabel("Method")
plt.ylabel("$E_{dimer}-2 E_{monomer}$  (Hartree)")
plt.show()

Clearly, the CISD and CISDT methods do not predict the correct energy for two noninteracting helium atoms. Unfortunately, this problem becomes even worse as the system size increases. To illustrate this, we will calculate below the energy of noninteracting helium clusters of various sizes and explore how the error changes as the number of atoms increases.

In [None]:
import psi4
import matplotlib.pyplot as plt

psi4.set_output_file("output.dat", True)

# First calculate the energy of a single helium atom
he = psi4.geometry("""He""")
e_he_cisd = psi4.energy("cisd/cc-pvdz")

# Now calculate the energy of a noninteracting helium cluster of various sizes

energies = []
geometry = """He 0 0 0\n"""
for N in range(2, 11):
    geometry += """He 0 0 %f\n""" %(1000*(N-1))
    psi4.geometry(geometry)
    e_heN_cisd = psi4.energy("cisd/cc-pvdz")

    energies.append(e_heN_cisd - N*e_he_cisd)
    #energies.append((e_heN_cisd - N*e_he_cisd)/N)


plt.figure()
plt.plot(range(2, 11), energies, "-o")
plt.xlabel("Number of Helium Atoms (N)")
plt.ylabel("$E_{N}-N E_{monomer}$  (Hartree)")
plt.show()

The error increases significantly. In fact, it can be shown that the correlation energy per monomer for truncated CI methods vanishes in the limit of large $N$. This problem is present for all truncated CI methods.

### Why is truncated CI not size-consistent?
Consider a single helium atom in a model with two spatial orbitals, analogous to the minimal basis H$_2$ model above. The two relevant Slater determinants are the reference, without any excitation, and the doubly excited determinant:

\begin{align}
|\Psi_0 \rangle = | 1 \overline{1} \rangle = & \underline{\quad} \, \psi_2 \\
                                             & \underline{\uparrow \downarrow} \, \psi_1.
\\
|\Psi_{1 \overline{1}}^{2 \overline{2}} \rangle = |2 \overline{2} \rangle = 
    & \underline{\uparrow \downarrow} \, \psi_2 \\
    & \underline{\quad} \, \psi_1. 
\end{align}

The CISD method can correctly describe these two configurations. Now consider two noninteracting helium atoms. Again, each atom can have these two configurations, so we have the following possibilities (the superscript on $\psi$ labels the atom):
\begin{align}
|0 \rangle = &  \underline{\quad} \, \psi_2^1 \quad  \underline{\quad} \, \psi_2^2 \\
& \underline{\uparrow \downarrow} \, \psi_1^1 \quad \underline{\uparrow \downarrow} \, \psi_1^2
\end{align}

\begin{align}
|D_1 \rangle = &  \underline{\uparrow \downarrow} \, \psi_2^1 \quad  \underline{\quad} \, \psi_2^2 \\
& \underline{\quad} \, \psi_1^1 \quad \underline{\uparrow \downarrow} \, \psi_1^2
\end{align}

\begin{align}
|D_2 \rangle = &  \underline{\quad} \, \psi_2^1 \quad  \underline{\uparrow \downarrow} \, \psi_2^2 \\
& \underline{\uparrow \downarrow} \, \psi_1^1 \quad \underline{\quad} \, \psi_1^2
\end{align}

\begin{align}
|Q \rangle = &  \underline{\uparrow \downarrow} \, \psi_2^1 \quad  \underline{\uparrow \downarrow} \, \psi_2^2 \\
& \underline{\quad} \, \psi_1^1 \quad \underline{\quad} \, \psi_1^2
\end{align}

The last configuration requires quadruple excitation, while CISD is truncated at double excitation. Thus, CISD cannot describe two noninteracting helium atoms correctly.
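The argument above can be made quantitative with a simple model, sketched here under assumptions that go beyond the text: each monomer contributes one doubly excited determinant lying $2\Delta$ above the reference and coupled to it by a matrix element $K$, single excitations are ignored, and the monomers do not interact. In that model the lowest root of the doubles-only CI matrix is $\Delta - \sqrt{\Delta^2 + N K^2}$, whereas the size-consistent result is $N$ times the one-monomer value. The function names and the values of `delta` and `k` are illustrative placeholders:

```python
import math

def cid_correlation_energy(n_monomers, delta, k):
    """Doubles-only CI correlation energy for n noninteracting model monomers.

    Lowest root of E**2 - 2*delta*E - n*k**2 = 0, which follows from a CI
    matrix coupling the reference to n degenerate double excitations.
    """
    return delta - math.sqrt(delta**2 + n_monomers * k**2)

def exact_correlation_energy(n_monomers, delta, k):
    """Size-consistent result: n times the one-monomer correlation energy."""
    return n_monomers * cid_correlation_energy(1, delta, k)

delta, k = 0.5, 0.1  # assumed model parameters in hartree, for illustration only
for n in (1, 2, 10, 100):
    e_cid = cid_correlation_energy(n, delta, k)
    e_exact = exact_correlation_energy(n, delta, k)
    print("N=%-4i truncated per monomer=%.6f  exact per monomer=%.6f"
          % (n, e_cid / n, e_exact / n))
```

In this model the truncated correlation energy per monomer shrinks in magnitude as $N$ grows, while the exact value per monomer stays constant.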

## Perturbation Theory

**Perturbation theory** is a general approximation technique in quantum mechanics. In the context of post-Hartree-Fock methods, it is called Møller-Plesset perturbation theory (MP). Perturbation theory provides a size-consistent, *non-variational* approximation technique for calculating the correlation energy.

### General Time-Independent Perturbation Theory
Perturbation theory is an approach for obtaining an *approximate* solution for a problem by using information from an *exact* solution for a sufficiently similar problem. The new problem is formulated as a small *perturbation* from the exact problem.

In general perturbation theory, we can write the Hamiltonian as a sum of two terms: $\hat{H} = \hat{H}^0 + \lambda \hat{H}^\prime$. $\hat{H}^0$ is the unperturbed Hamiltonian with eigenfunctions $\psi^0_n$ and eigenvalues $E^0_n$, $\hat{H}^\prime$ is the perturbation, and $\lambda$ is a parameter (it will be set to 1 at the end).

We can write the wavefunction and energy as a power series in $\lambda$:

\begin{align}
\psi_n &= \psi_n^0 + \lambda \psi_n^1 + \lambda^2 \psi_n^2 + \cdots; \\
E_n &= E^0_n + \lambda E^1_n + \lambda^2 E^2_n + \cdots.
\end{align}

$E^0_n$ is the energy of the unperturbed system, while $E^1_n$ and $E^2_n$ are the first- and second-order corrections to the energy.


Substituting these expansions into the time-independent Schrödinger equation gives:
\begin{align}
\hat{H} \psi_n &= E_n \psi_n \\
\left ( \hat{H}^0 + \lambda \hat{H}^\prime \right ) \left [\psi_n^0 + \lambda \psi_n^1 + \lambda^2 \psi_n^2 + \cdots \right] &= 
\left ( E^0_n + \lambda E^1_n + \lambda^2 E^2_n + \cdots \right )  \left [ \psi_n^0 + \lambda \psi_n^1 + \lambda^2 \psi_n^2 + \cdots \right ]
\end{align}

Collecting like powers of $\lambda$:
\begin{align}
\hat{H}^0 \psi_n^0 + \lambda \left (\hat{H}^0 \psi_n^1 + \hat{H}^\prime \psi_n^0 \right) + \lambda^2 \left (\hat{H}^0 \psi_n^2 + \hat{H}^\prime \psi_n^1 \right) + \cdots =
E_n^0 \psi_n^0 + \lambda \left ( E_n^0 \psi_n^1 + E_n^1 \psi_n^0 \right ) + \lambda^2 \left ( E_n^0 \psi_n^2 + E_n^1 \psi_n^1 + E_n^2 \psi_n^0 \right ) + \cdots
\end{align}

Equating terms that have the same power of $\lambda$, we get:
\begin{align}
\hat{H}^0 \psi_n^0 &= E_n^0 \psi_n^0 \\
\hat{H}^0 \psi_n^1 + \hat{H}^\prime \psi_n^0 &= E_n^0 \psi_n^1 + E_n^1 \psi_n^0 \\
\hat{H}^0 \psi_n^2 + \hat{H}^\prime \psi_n^1 &= E_n^0 \psi_n^2 + E_n^1 \psi_n^1 + E_n^2 \psi_n^0.
\end{align}

The first equation is simply the Schrödinger equation for the unperturbed Hamiltonian.

For the second equation, multiplying by $\psi_n^{0*}$ from the left and integrating, we get:
\begin{align}
\int dx \psi_n^{0*} \left ( \hat{H}^0 \psi_n^1 + \hat{H}^\prime \psi_n^0 \right ) &=  \int dx \psi_n^{0*} \left ( E_n^0 \psi_n^1 + E_n^1 \psi_n^0 \right )
\end{align}

Utilizing the hermiticity of the Hamiltonian and the orthonormality of the eigenstates, we get:

\begin{align}
E^1_n &= \int \psi_n^{0*} \hat{H}^\prime \psi_n^0 dx.
\end{align}

Thus, the first-order correction to the energy is the expectation value of the perturbation Hamiltonian in the unperturbed eigenstate.

The first-order correction to the wavefunction can be found to be:

\begin{align}
\psi_n^1 = \sum_{m \neq n} \frac{\int dx \psi_m^{0*} \hat{H}^\prime \psi_n^0}{E_n^0 - E_m^0} \psi_m^0
\end{align}

Finally, the second-order correction to the energy is given by:
\begin{align}
E^2_n &= \int dx \psi_n^{0*} \hat{H}^\prime \psi_n^1 \\
E^2_n &= \sum_{m \neq n} \frac{|\int dx \psi_m^{0*} \hat{H}^\prime \psi_n^0|^2}{E_n^0 - E_m^0}.
\end{align}

Note that the above perturbation theory description only works if the energy levels are non-degenerate. Degeneracy occurs when multiple states have the same energy, in which case the denominators $E_n^0 - E_m^0$ vanish. In this case, a **degenerate perturbation theory** is used.
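
The first- and second-order formulas above can be checked numerically on a small matrix model. The following is a minimal sketch, assuming a diagonal unperturbed Hamiltonian (so that its eigenstates are unit vectors and the integrals reduce to matrix elements of the perturbation). The matrices `h0` and `h1` and the helper `energy_corrections` are illustrative choices, not part of the original text.

```python
import numpy as np

# Unperturbed Hamiltonian: diagonal, with non-degenerate levels (illustrative values)
h0 = np.diag([0.0, 1.0, 2.5])

# Small symmetric perturbation (illustrative values)
h1 = np.array([[0.02, 0.10, 0.05],
               [0.10, -0.01, 0.08],
               [0.05, 0.08, 0.03]])


def energy_corrections(h0, h1, n):
    """Return (E0, E1, E2) for state n of a diagonal h0 perturbed by h1."""
    e0 = np.diag(h0)
    e1 = h1[n, n]  # expectation value of the perturbation in state n
    others = [m for m in range(len(e0)) if m != n]
    # Sum over all other states of |<m|H'|n>|^2 / (E_n^0 - E_m^0)
    e2 = sum(abs(h1[m, n]) ** 2 / (e0[n] - e0[m]) for m in others)
    return e0[n], e1, e2


e0, e1, e2 = energy_corrections(h0, h1, n=0)
e_exact = np.linalg.eigvalsh(h0 + h1)[0]  # lowest eigenvalue of the full matrix

print("E0 + E1      =", e0 + e1)
print("E0 + E1 + E2 =", e0 + e1 + e2)
print("Exact        =", e_exact)
```

For a perturbation this small, the second-order estimate should lie much closer to the exact eigenvalue than the first-order one. Note also that the second-order correction to the ground state is always negative, since every denominator $E_0^0 - E_m^0$ is negative.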

### Moller-Plesset perturbation theory

In Moller-Plesset perturbation theory, the unperturbed Hamiltonian is taken to be the sum of the Fock operators:
\begin{align}
H^0 = \sum_{i=1}^{N_{elec}} f_i  &= \sum_{i=1}^{N_{elec}} \left ( h_i + v_i^{\rm HF} \right ) \\
                                  &= \sum_{i=1}^{N_{elec}} \left ( h_i + \sum_{j=1}^{N_{elec}} (J_j-K_j) \right )
\end{align}

The solution for this unperturbed Hamiltonian is known: its eigenfunctions are Slater determinants built from the Hartree-Fock molecular orbitals, and its eigenvalues are sums of the corresponding orbital energies. For the Hartree-Fock determinant $\Psi_0$:
\begin{align}
E^0 = \langle \Psi_0 | \sum_{i=1}^{N_{elec}} f_i | \Psi_0 \rangle = \sum_{i=1}^{N_{elec}} \epsilon_i
\end{align}

You may recall that the sum of the orbital energies is not equal to the Hartree-Fock energy. The sum of the Fock operators counts the electron-electron repulsion twice. In contrast, the Hartree-Fock energy is the expectation value of the correct Hamiltonian.

To obtain the correct energy, we take the perturbation Hamiltonian to be the difference between the exact electron-electron repulsion and the Hartree-Fock average electron-electron repulsion:

\begin{align}
    H^\prime = \sum_{i<j}^{N_{elec}} r_{ij}^{-1} - \sum_{i=1}^{N_{elec}} v_i^{\rm HF}
\end{align}

We can calculate the first-order correction to the energy from the above expression:
\begin{align}
E^1_n &= \int \psi_n^{0*} \hat{H}^\prime \psi_n^0 dx,
\end{align}
However, notice the following:
\begin{align}
E^0 + E^1 &= \langle \Psi_0 | H^0 | \Psi_0 \rangle +  \langle \Psi_0 | H^\prime | \Psi_0 \rangle \\
          &= \langle \Psi_0 | \sum_{i=1}^{N_{elec}} h_i + \sum_{i<j}^{N_{elec}} r_{ij}^{-1} | \Psi_0 \rangle \\
          &= \langle \Psi_0 | H | \Psi_0 \rangle \\
          &= E_{\rm HF}
\end{align}
Thus, the sum of the zeroth-order and first-order energies is equal to the Hartree-Fock energy. The first correction to the Hartree-Fock energy comes from the second-order energy.
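
The double counting can be made explicit with a short worked step. The sketch below uses the spin-orbital integral notation $[ij|kl]$ that appears later in this section, with $a$ and $b$ labeling occupied spin orbitals. The symbol $h_{aa} = \langle a | h | a \rangle$ is introduced here only for this illustration.
\begin{align}
\epsilon_a &= h_{aa} + \sum_{b}^{\rm occ} \left ( [aa|bb] - [ab|ba] \right ) \\
E^0 &= \sum_{a}^{\rm occ} \epsilon_a = \sum_{a}^{\rm occ} h_{aa} + \sum_{a}^{\rm occ} \sum_{b}^{\rm occ} \left ( [aa|bb] - [ab|ba] \right ) \\
E^1 &= \langle \Psi_0 | H^\prime | \Psi_0 \rangle = -\frac{1}{2} \sum_{a}^{\rm occ} \sum_{b}^{\rm occ} \left ( [aa|bb] - [ab|ba] \right ) \\
E^0 + E^1 &= \sum_{a}^{\rm occ} h_{aa} + \frac{1}{2} \sum_{a}^{\rm occ} \sum_{b}^{\rm occ} \left ( [aa|bb] - [ab|ba] \right ) = E_{\rm HF}
\end{align}
The first-order energy therefore removes exactly half of the electron-electron repulsion contained in the sum of the orbital energies.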

The second-order correction to the energy is given by:
\begin{align}
    E^2_n &= \sum_{m \neq n} \frac{|\int dx \psi_m^{0*} \hat{H}^\prime \psi_n^0|^2}{E_n^0 - E_m^0}.
\end{align}
or using the bra-ket notation and taking the ground state wavefunction to be the Hartree-Fock Slater determinant:
\begin{align}
    E^2 = \sum_{m>0} \frac{ | \langle \Psi_m | H^\prime | \Psi_0 \rangle |^2 } {E_0 - E_m}
\end{align}

Let's focus first on the numerator. Remember that $H^\prime = H-H^0$, and thus for each excited determinant $\Psi_m$ we can write:
\begin{align}
\langle \Psi_m | H^\prime | \Psi_0 \rangle
    &= \langle \Psi_m | H-H^0 | \Psi_0 \rangle \\
    &= \langle \Psi_m | H| \Psi_0 \rangle - \langle \Psi_m | H^0 | \Psi_0 \rangle  \\
    &= \langle \Psi_m | H| \Psi_0 \rangle - \sum_i^{\rm occ} \epsilon_i \langle \Psi_m |\Psi_0 \rangle \\
    &= \langle \Psi_m | H| \Psi_0 \rangle
\end{align}
The last simplification stems from the fact that the ground and excited state determinants are orthogonal.

Now, how do we calculate the remaining integrals? We can use the Slater rules to evaluate the matrix elements between the Hartree-Fock determinant and the excited determinants. Remember that singly excited determinants do not contribute because of Brillouin's theorem. Furthermore, because the Hamiltonian contains at most two-electron operators, matrix elements between determinants that differ by more than two spin orbitals are zero. Thus, only doubly excited determinants contribute ($\langle \Psi_0| O_2 |\Psi_{ab}^{rs} \rangle = [ar|bs] - [as|br]$). For each doubly excited determinant $\Psi_{ab}^{rs}$, with $a<b$ occupied and $r<s$ virtual, we get:
\begin{align}
    \langle \Psi_{ab}^{rs} | H^\prime | \Psi_0 \rangle = [ar|bs] - [as|br]
\end{align}

Now, how do we compute the denominator? The denominator is the difference between the zeroth-order energy of the Hartree-Fock determinant and that of a doubly excited determinant. The zeroth-order energy of each excited determinant includes the energies of the newly filled virtual orbitals and excludes the energies of the vacated occupied orbitals. Thus, for the doubly excited determinant $\Psi_{ab}^{rs}$, the energy difference is given by:
\begin{align}
E_0 - E_{ab}^{rs} = \epsilon_a + \epsilon_b - \epsilon_r - \epsilon_s
\end{align}

Thus, the expression for the second-order energy correction becomes:
\begin{align}
E^2 = \sum_{a<b}^{\rm occ} \sum_{r<s}^{\rm vir} \frac{|[ar|bs] - [as|br]|^2}{\epsilon_a + \epsilon_b - \epsilon_r - \epsilon_s}
\end{align}

The second-order Moller-Plesset (MP2) energy is the sum of the energies up to second-order:
\begin{align}
    E_{\rm MP2} = E^0 + E^1 + E^2 = E_{\rm HF} + \sum_{a<b}^{\rm occ} \sum_{r<s}^{\rm vir} \frac{|[ar|bs] - [as|br]|^2}{\epsilon_a + \epsilon_b - \epsilon_r - \epsilon_s}
\end{align}
By including higher order terms in the perturbation series, one obtains the MP3, MP4, $\cdots$, MP$n$ methods. 

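For closed-shell systems, the $E^2$ expression can be written in terms of spatial integrals as:
\begin{align}
E^2 = \sum_{ab}^{\rm occ} \sum_{rs}^{\rm vir} \frac{(ar|bs) [2 (ra|sb) - (rb|sa)]} {\epsilon_a + \epsilon_b - \epsilon_r - \epsilon_s}
\end{align}
where $a$ and $b$ run over the $N/2$ doubly occupied spatial orbitals and $r$ and $s$ run over the virtual spatial orbitals.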

We illustrate below how to calculate the MP2 energy using the above formula.

In [None]:
import psi4
import numpy as np

psi4.core.set_output_file("output.dat", True)

mol = psi4.geometry("""
H
H 1 0.74
""")

psi4.set_options({"scf_type": "pk", 
                  'mp2_type': 'conv'})

# We will first calculate the Hartree-Fock energy and wavefunction
e_scf, wfn = psi4.energy("scf/cc-pVDZ", return_wfn=True)

# We will create an object that holds the molecular integrals.
mints = psi4.core.MintsHelper(wfn.basisset())

# Get the occupied and virtual orbital coefficients.
# Note that this is a restricted Hartree-Fock calculation,
# so the alpha electrons and beta electrons have the same orbital coefficients and density matrix
Co = wfn.Ca_subset('AO','OCC')
Cv = wfn.Ca_subset('AO','VIR')

# Now calculate the two-electron integrals in the molecular orbital basis
# This can be calculated directly in Psi4 by providing the appropriate orbital coefficients
i_arbs = np.asarray(mints.mo_eri(Co, Cv, Co, Cv))
i_rasb = np.asarray(mints.mo_eri(Cv, Co, Cv, Co))
# (rb|sa) has the same (vir, occ, vir, occ) layout as (ra|sb), so the array is reused
i_rbsa = i_rasb

# Now calculate the E^2 energy 
N_occ = wfn.nalpha()
N_vir = wfn.nmo()-N_occ
occ_epsilons = wfn.epsilon_a_subset("AO", "OCC").np
vir_epsilons = wfn.epsilon_a_subset("AO", "VIR").np

e2 = 0.0
for a in range(N_occ):
    for b in range(N_occ):
        for r in range(N_vir):
            for s in range(N_vir):
                num = i_arbs[a, r, b, s]*(2*i_rasb[r, a, s, b] - i_rbsa[r, b, s, a])
                den = occ_epsilons[a] + occ_epsilons[b] - vir_epsilons[r] - vir_epsilons[s]
                e2 += num/den

# Calculate the MP2 energy
e_mp2 = e_scf + e2

# Calculate the MP2 energy with Psi4
e_mp2_psi4 = psi4.energy("mp2/cc-pVDZ")


print("The MP2 energy is ", e_mp2)
print("The calculated value agrees with Psi4:", np.isclose(e_mp2, e_mp2_psi4))
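
The four nested loops above become slow for larger systems. The following is a minimal sketch of a vectorized alternative with NumPy, assuming real orbitals so that $(ra|sb) = (ar|bs)$ and $(rb|sa) = (as|br)$. The function names are illustrative, and the test data are random numbers that stand in for molecular integrals, so the resulting energy has no physical meaning.

```python
import numpy as np


def mp2_closed_shell(ovov, eps_occ, eps_vir):
    """Closed-shell E^2 from (ar|bs) integrals stored as ovov[a, r, b, s].

    Assumes real orbitals, so that (ra|sb) = (ar|bs) and (rb|sa) = (as|br).
    """
    # Denominator eps_a + eps_b - eps_r - eps_s, built by broadcasting
    denom = (eps_occ[:, None, None, None] - eps_vir[None, :, None, None]
             + eps_occ[None, None, :, None] - eps_vir[None, None, None, :])
    # swapaxes(1, 3) maps ovov[a, r, b, s] to ovov[a, s, b, r], i.e. (as|br)
    numer = ovov * (2.0 * ovov - ovov.swapaxes(1, 3))
    return np.sum(numer / denom)


def mp2_closed_shell_loops(ovov, eps_occ, eps_vir):
    """Reference implementation with explicit loops, for comparison."""
    n_occ, n_vir = len(eps_occ), len(eps_vir)
    e2 = 0.0
    for a in range(n_occ):
        for b in range(n_occ):
            for r in range(n_vir):
                for s in range(n_vir):
                    num = ovov[a, r, b, s] * (2.0 * ovov[a, r, b, s] - ovov[a, s, b, r])
                    den = eps_occ[a] + eps_occ[b] - eps_vir[r] - eps_vir[s]
                    e2 += num / den
    return e2


# Synthetic stand-in data (random numbers, NOT real molecular integrals)
rng = np.random.default_rng(0)
n_occ, n_vir = 3, 5
ovov = rng.normal(size=(n_occ, n_vir, n_occ, n_vir))
ovov = ovov + ovov.transpose(2, 3, 0, 1)  # enforce (ar|bs) = (bs|ar)
eps_occ = -np.sort(rng.uniform(0.5, 2.0, n_occ))[::-1]  # negative occupied energies
eps_vir = np.sort(rng.uniform(0.2, 3.0, n_vir))         # positive virtual energies

e2_vec = mp2_closed_shell(ovov, eps_occ, eps_vir)
e2_loop = mp2_closed_shell_loops(ovov, eps_occ, eps_vir)
print("Vectorized and loop results agree:", np.isclose(e2_vec, e2_loop))
```

With the arrays from the Psi4 cell above, the same function could be called as `mp2_closed_shell(i_arbs, occ_epsilons, vir_epsilons)`, since `i_arbs` is stored in the (occ, vir, occ, vir) layout.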

### Scaling of the MP2 Method
The MP2 method scales as $N^5$. In order to calculate the MP2 energy, we need to express the two-electron integrals in terms of the molecular orbital (MO) basis instead of the atomic orbital (AO) basis. How is this conversion carried out? We need to perform the following contraction, where repeated AO indices are summed over:
\begin{align}
    (ar|bs) = C_{\mu a}C_{\nu r}(\mu \nu | \lambda \sigma)C_{\lambda b}C_{\sigma s}
\end{align}
As written, this contraction scales as $N^8$ because it involves eight unique indices! However, the contractions can be performed in series such that each contraction scales as $N^5$:
\begin{align}
    (a r|b s) = \left[C_{\mu a} \left[C_{\nu r} \left[C_{\lambda b} \left[C_{ \sigma s}(\mu \nu |\lambda\sigma)\right] \right] \right] \right]
\end{align}

This transformation from the AO basis to the MO basis is required in conventional implementations of post-Hartree-Fock methods. Therefore, conventional electron-correlation methods scale at least as $N^5$.
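
The equivalence of the one-shot and the stepwise contractions can be verified with `np.einsum`. The following is a minimal sketch that uses random arrays as stand-ins for the AO integrals and the MO coefficients. The array names and the tiny dimension are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6  # number of basis functions (kept tiny so the N^8 reference is cheap)

# Synthetic stand-ins: random "AO integrals" and "MO coefficients"
eri_ao = rng.normal(size=(n, n, n, n))
C = rng.normal(size=(n, n))

# One-shot contraction over all eight indices at once: O(N^8)
eri_mo_naive = np.einsum("pi,qj,pqrs,rk,sl->ijkl", C, C, eri_ao, C, C, optimize=False)

# Four successive quarter-transformations, each O(N^5)
tmp = np.einsum("pqrs,sl->pqrl", eri_ao, C)   # transform the fourth index
tmp = np.einsum("pqrl,rk->pqkl", tmp, C)      # transform the third index
tmp = np.einsum("pqkl,qj->pjkl", tmp, C)      # transform the second index
eri_mo = np.einsum("pjkl,pi->ijkl", tmp, C)   # transform the first index

print("Stepwise and one-shot results agree:", np.allclose(eri_mo, eri_mo_naive))
```

Each quarter-transformation has four free indices and one summed index, which is where the $N^5$ cost comes from. The price is the storage of the intermediate four-index arrays.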

### Applicability of the MP Method
It is not guaranteed that including higher-order perturbation terms will actually improve the energy. Perturbation theory assumes that the perturbation is small, but the perturbation in the Moller-Plesset method, which is the difference between the exact and the mean-field electron-electron repulsion, is not necessarily small, and therefore the series is not guaranteed to converge. Furthermore, the MP series is not variational. Unlike the truncated CI methods, for which the energy decreases as the excitation level increases, the MP series often displays oscillatory behavior. We show below an example of this behavior for the energy of the Ne atom. In practice, the MP2 method is widely used for computing electron correlation while higher-order MP methods are used less often. Coupled cluster theory, which is discussed below, is generally preferred for performing more accurate calculations.

The fact that perturbation theory is not variational is not a big limitation. In chemical applications, *energy differences* often determine chemical properties. Having upper bounds on absolute energies does not guarantee that the relative energy is an upper bound to the true relative energy. Therefore, what is more important is that the errors remain relatively constant across different systems so that they cancel out.

In [None]:
import psi4
import numpy as np
import matplotlib.pyplot as plt

psi4.core.set_output_file("output.dat", True)

mol = psi4.geometry("""
Ne
""")

psi4.set_options({"scf_type": "pk", 
                  'mp2_type': 'conv'})

e_fci = psi4.energy("fci/cc-pVDZ")

mp_series = range(2, 8)
energies = []

for n in mp_series:
    energies.append(psi4.energy("mp%i/cc-pVDZ" %n))

energies = np.array(energies)
errors = e_fci - energies

plt.figure()
plt.plot(mp_series, errors, "-o")
plt.xlabel("MP Level")
plt.ylabel("Energy Error (Hartree)")
plt.show()

## Coupled Cluster Theory

Coupled cluster (CC) theory is a size-consistent, highly accurate approach for estimating the correlation energy. The main idea of CC theory is that the *exact* wavefunction within a given basis can be written as:
\begin{align}
    \Psi_{\rm CC} = e^{\textbf{T}} \Psi_{\rm HF},
\end{align}
where $\textbf{T}$ is called the cluster operator and is defined as:
\begin{align}
    {\textbf{T}} = {\textbf{T}}_1 + {\textbf{T}}_2 + {\textbf{T}}_3 + \cdots + {\textbf{T}}_n
\end{align}

where $n$ is the total number of electrons and the $\textbf{T}_i$ operator generates all determinants that have $i$ excitations from the reference determinant. For example, 
\begin{align}
    \textbf{T}_1 \Psi_{\rm HF} &= \sum_a^{\rm occ} \sum_r^{\rm vir} t_a^r \Psi_a^r \\
    \textbf{T}_2 \Psi_{\rm HF} &= \sum_{a<b}^{\rm occ} \sum_{r<s}^{\rm vir} t_{ab}^{rs} \Psi_{ab}^{rs}. 
\end{align}
The $t$ coefficients are called the cluster amplitudes.

How do we define the exponential of an operator? We can use Taylor expansion to write it as:
\begin{align}
    e^{\textbf{T}} = 1 + \textbf{T} + \frac{1}{2!} \textbf{T}^2 + \frac{1}{3!} \textbf{T}^3 + \cdots.
\end{align}
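
The series definition of an operator exponential can be tried out on a small matrix. The following is a minimal sketch in which a strictly lower-triangular matrix serves as a toy analogue of an excitation operator: like an excitation operator acting on a system with a finite number of electrons, it is nilpotent, so the Taylor series terminates after a finite number of terms. The matrix `t` and the helper `exp_taylor` are illustrative and do not represent an actual cluster operator.

```python
import numpy as np


def exp_taylor(t, order):
    """Sum the Taylor series 1 + T + T^2/2! + ... + T^order/order!."""
    result = np.eye(t.shape[0])
    term = np.eye(t.shape[0])
    for k in range(1, order + 1):
        term = term @ t / k  # builds T^k / k! from the previous term
        result = result + term
    return result


# Strictly lower-triangular 4x4 matrix: nilpotent, with T^4 = 0 (toy example)
t = np.tril(np.arange(1.0, 17.0).reshape(4, 4), k=-1)

series_terminates = np.allclose(np.linalg.matrix_power(t, 4), 0.0)
higher_terms_vanish = np.allclose(exp_taylor(t, 3), exp_taylor(t, 10))
inverse_is_exp_minus_t = np.allclose(exp_taylor(t, 3) @ exp_taylor(-t, 3), np.eye(4))

print(series_terminates, higher_terms_vanish, inverse_is_exp_minus_t)
```

The last check illustrates that $e^{-\textbf{T}}$ is the inverse of $e^{\textbf{T}}$, a property used in the similarity-transformed form of the coupled cluster energy.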

The $\textbf{T}$ operator by itself can generate all possible excited determinants and thus can be used to obtain the full CI wavefunction. So what is the point of using the exponential form $e^{\textbf{T}}$? The advantage becomes evident when the $\textbf{T}$ operator is truncated to a certain level of excitation. For example, consider the approximation that $\textbf{T}=\textbf{T}_2$. The Taylor expansion of the exponential function gives:
\begin{align}
    \Psi_{\rm CCD} &= e^{\textbf{T}} \Psi_{HF} \\
               &= \left ( 1 + \textbf{T}_2 + \frac{\textbf{T}_2^2}{2!} + \frac{\textbf{T}_2^3}{3!} + \cdots \right )
                 \Psi_{HF}.
\end{align}
The first two terms in the expansion would define the CID method, in which only double excitations are included in the truncated CI expansion. The remaining terms involve products of excitation operators. For example, the $\textbf{T}_2^2$ operator generates two independent, or *disconnected*, double excitations, which together form a quadruple excitation. As discussed above for the helium dimer, the lack of such excitations *is* what makes truncated CI methods lack size-consistency. The use of the exponential form in coupled cluster theory ensures its size-consistency.
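
A short worked step makes the argument concrete for two noninteracting helium atoms $A$ and $B$. The sketch assumes orbitals localized on each atom, so that the cluster operator separates into $\textbf{T}_2 = \textbf{T}_2^A + \textbf{T}_2^B$ with commuting parts. The labels $A$ and $B$ are introduced here only for this illustration.
\begin{align}
    e^{\textbf{T}_2^A + \textbf{T}_2^B} \Psi_{\rm HF} &= e^{\textbf{T}_2^A} e^{\textbf{T}_2^B} \Psi_{\rm HF} \\
    &= \left ( 1 + \textbf{T}_2^A \right ) \left ( 1 + \textbf{T}_2^B \right ) \Psi_{\rm HF} \\
    &= \left ( 1 + \textbf{T}_2^A + \textbf{T}_2^B + \textbf{T}_2^A \textbf{T}_2^B \right ) \Psi_{\rm HF}
\end{align}
Each exponential terminates after the linear term because each atom has only two electrons, so a second double excitation on the same atom gives zero. The product $\textbf{T}_2^A \textbf{T}_2^B$, which comes from the $\frac{1}{2!}\textbf{T}_2^2$ term, is the simultaneous double excitation on both atoms. This is the quadruple excitation that CID omits, and it is what allows the dimer wavefunction to factorize into a product of the two monomer wavefunctions.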

How do we estimate the energy using the above trial wavefunction? In principle, we can use the variational method to estimate the energy. That is, we calculate the energy using the usual expectation value formula:
\begin{align}
    E_{\rm CC}^{\rm var} = \frac{\langle \Psi_{\rm CC} |\hat{H}| \Psi_{\rm CC} \rangle}
                                {\langle \Psi_{\rm CC} | \Psi_{\rm CC} \rangle }
\end{align}
and minimize the energy with respect to the cluster amplitudes $t$. Unfortunately, this leads to very complicated sets of equations. Therefore, the coupled cluster energy is usually estimated using one of the following two expressions:
\begin{align}
    E_{\rm CC} &= \langle \Psi_{\rm HF} |\hat{H} e^{\textbf{T}} | \Psi_{\rm HF} \rangle \\
    E_{\rm CC} &= \langle \Psi_{\rm HF} |e^{-\textbf{T}} \hat{H} e^{\textbf{T}} | \Psi_{\rm HF} \rangle.
\end{align}
For this reason, truncated coupled cluster theory is *non-variational*.

### Truncated Coupled Cluster Methods
In truncated coupled cluster methods, the cluster operator $\textbf{T}$ is truncated to a given excitation level. Recall that Brillouin's theorem states that singly excited determinants do not directly interact with the Hartree-Fock determinant. Therefore, the lowest level of truncated coupled cluster methods would be CCD, where $\textbf{T}=\textbf{T}_2$. Since the additional computational cost of including single excitations is outweighed by the gain in accuracy, they are usually included, defining the CCSD method. The CCSD method scales as $N^6$. Including triple and quadruple excitations (i.e., $\textbf{T}_3$ and $\textbf{T}_4$) defines the CCSDT and CCSDTQ methods.

Including the full triple (or quadruple) excitations is computationally expensive. An estimate of the triples contribution can be calculated using perturbation theory and added to the CCSD energy. This defines the CCSD(T) method, which can be applied to moderately sized systems. Because of its high accuracy, it is often called the *gold standard* of *ab initio* methods.

We will illustrate below the size-consistency of coupled cluster theory for the helium dimer.

In [None]:
import psi4
import matplotlib.pyplot as plt

psi4.set_output_file("output.dat", True)

# First calculate the energy of a single helium atom
he = psi4.geometry("""He""")
e_he_hf = psi4.energy("scf/cc-pvdz")
e_he_cisd = psi4.energy("cisd/cc-pvdz")
e_he_cisdt = psi4.energy("cisdt/cc-pvdz")
e_he_ccsd = psi4.energy("ccsd/cc-pvdz")
e_he_fci = psi4.energy("fci/cc-pvdz")

# Now calculate the energy of a noninteracting helium dimer
he_dimer = psi4.geometry("""
    He    
    He 1 10000
""")

e_he2_hf = psi4.energy("scf/cc-pvdz")
e_he2_cisd = psi4.energy("cisd/cc-pvdz")
e_he2_cisdt = psi4.energy("cisdt/cc-pvdz")
e_he2_ccsd = psi4.energy("ccsd/cc-pvdz")
e_he2_fci = psi4.energy("fci/cc-pvdz")
 
# Now calculate the difference between the energy of the dimer and twice the energy of the monomer
energies = [e_he2_hf-2*e_he_hf, e_he2_cisd-2*e_he_cisd, e_he2_cisdt-2*e_he_cisdt, 
            e_he2_ccsd-2*e_he_ccsd, e_he2_fci-2*e_he_fci]

labels = ["HF", "CISD", "CISDT", "CCSD", "FCI"]

plt.figure()
plt.plot(range(len(energies)), energies, "o")
plt.gca().set_xticks(range(len(energies)))
plt.gca().set_xticklabels(labels)
plt.xlabel("Method")
plt.ylabel("$E_{dimer}-2 E_{monomer}$  (Hartree)")
plt.show()

## Multi-reference Methods

As mentioned above, electron correlation is often divided into two types: static and dynamical. In static correlation, the Hartree-Fock single Slater determinant is not a good starting point upon which to improve the energy. This generally happens because of degeneracies in the energy levels. This applies to some molecules in their equilibrium geometries. Degeneracies also often arise from bond dissociation.

Consider, for example, the $\pi$-orbitals of trimethylenemethane (TMM) shown below. From Hund's rule, it is expected that the triplet state is lower in energy than the singlet state. However, suppose that we are interested in the singlet state. Each one of the singlet configurations is a valid way of arranging the electrons. Now suppose a restricted Hartree-Fock calculation is carried out. Because Hartree-Fock theory is a single-reference method, only one of these determinants will be chosen, and the Hartree-Fock energy will be determined based on the occupied orbitals only. This is clearly not the correct description. We need a method that treats the two singlet configurations on an equal footing. This is where multireference methods come into play.

![](TMM_orbitals.png)

In multireference methods, the occupied orbitals in each configuration are first selected. Then, the shapes of the molecular orbitals as well as the expansion coefficients are optimized during the self-consistent field procedure. For example, one starting point for the TMM system would be:
\begin{align}
    \Psi_{\rm MCSCF} = c_1 |\cdots \pi_1^2 \pi_2^2 \rangle + c_2 |\cdots \pi_1^2 \pi_3^2 \rangle
\end{align}
However, because the $\pi$ orbitals in conjugated $\pi$ systems are often close in energy, it is better to include all the $\pi$ orbitals in the expansion. Because we have four electrons that we want to distribute in four orbitals, this is called a (4, 4) calculation. The first number in this notation is the number of electrons and the second is the number of orbitals.

How can we distribute $m$ electrons in $n$ orbitals? The number of possible singlet configurations is given by:
\begin{align}
    N = \frac{n! (n+1)! }{(\frac{m}{2})! (\frac{m}{2}+1)! (n-\frac{m}{2})! (n-\frac{m}{2}+1)!}
\end{align}

For $n=m=4$, $N=20$. The number grows quickly as $n$ and $m$ increase. How do we decide which of the possible determinants to include in the expansion? Sometimes, chemical intuition can guide the choice of the most important determinants. For example, a reasonable starting guess for TMM would be:
\begin{align}
    \Psi_{\rm MCSCF} = c_1 |\cdots \pi_1^2 \pi_2^2 \pi_3^0 \pi_4^0 \rangle + c_2 |\cdots \pi_1^2 \pi_2^0 \pi_3^2 \pi_4^0 \rangle + c_3 |\cdots \pi_1^2 \pi_2^0 \pi_3^0 \pi_4^2 \rangle + c_4 |\cdots \pi_1^0 \pi_2^2 \pi_3^2 \pi_4^0 \rangle 
\end{align}
However, instead of picking and choosing based on a personal judgement, one can use *all* possible configurations. This is called the **complete active space self-consistent field (CASSCF)** method. The active space here refers to the electrons and orbitals chosen in the procedure. For example, for TMM, the active space represents the $\pi$ orbitals and $\pi$ electrons. If all the electrons and all the orbitals in the system are chosen, then CASSCF would be equivalent to full CI.
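
The growth in the number of singlet configurations given by the formula above can be tabulated with a few lines of Python. The following is a minimal sketch, and the function name `n_singlet_configurations` is an illustrative choice.

```python
from math import factorial


def n_singlet_configurations(m, n):
    """Number of singlet configurations for m electrons in n orbitals (m even)."""
    if m % 2 != 0 or not 0 <= m <= 2 * n:
        raise ValueError("m must be even and satisfy 0 <= m <= 2n")
    half = m // 2
    numerator = factorial(n) * factorial(n + 1)
    denominator = (factorial(half) * factorial(half + 1)
                   * factorial(n - half) * factorial(n - half + 1))
    return numerator // denominator


# Active spaces with as many electrons as orbitals
for size in (2, 4, 6, 8, 10, 12):
    print(f"({size}, {size}): {n_singlet_configurations(size, size)}")
```

The (4, 4) entry reproduces the value $N=20$ quoted above, and the rapid growth with the size of the active space is what limits CASSCF calculations to small active spaces in practice.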

The **restricted active space self-consistent field (RASSCF)** method limits the number of configurations and therefore is less computationally expensive than CASSCF. The active space is divided into three regions: RAS1, RAS2, and RAS3. The RAS2 space includes some of the highest occupied and the lowest unoccupied orbitals, and all configurations within it are included, as in CASSCF. On the other hand, only a limited number of excitations (e.g., 0, 1, or 2) is allowed from the occupied RAS1 space into the virtual RAS3 space. Furthermore, an additional approximation that lowers the computational cost is to *freeze* some of the occupied and virtual orbitals. For these frozen orbitals, the shape is not reoptimized during the MCSCF procedure. The figure below illustrates the possible orbital roles in the MCSCF procedure.

<img src="CASSCF.png" alt="drawing" width="500px;"/>

The above multi-reference methods are often not adequate for capturing *dynamical* correlation. Thus, often dynamical correlation is estimated using multi-reference truncated configuration interaction, multi-reference perturbation theory, or multi-reference coupled cluster theory.

## Performance of Hartree-Fock and Post-Hartree-Fock Methods
We can define the error of a given method as the difference between the calculated value for a given property and the exact value computed with the nonrelativistic Schrödinger equation. The exact value can in principle be calculated using the complete CI model. There are two sources of error in Hartree-Fock and post-Hartree-Fock calculations: (1) the use of an incomplete atomic orbital basis set; and (2) the incomplete accounting of electron correlation by neglecting some Slater determinants. It is often the case that these errors have opposite directions, and therefore favorable cancellation of errors can be leveraged.

Since the complete CI value is an unattainable theoretical quantity, we can use two approaches to evaluate the accuracy of a given method: (1) compare the predictions of the method with the best theoretical estimate available; and (2) compare the predictions with experimental data. Generally speaking, for small systems in the gas phase, theoretical predictions can often faithfully reproduce experimental data.

Benchmarking studies that evaluate the accuracy of theoretical methods usually select a set of molecules and compute the desired properties. Then, the errors in the method are reported using various statistical measures, such as the **mean absolute error** and the **maximum absolute error**. The properties investigated can be energetic, structural, spectroscopic, and so on.
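
These statistical measures are straightforward to compute. The following is a minimal sketch with NumPy in which the numbers are placeholders chosen only to illustrate the arithmetic. They are not taken from any benchmark study.

```python
import numpy as np

# Placeholder numbers chosen only to illustrate the arithmetic (not real data)
reference = np.array([100.0, 110.0, 120.0, 130.0])
calculated = np.array([101.0, 108.5, 120.5, 133.0])

errors = calculated - reference
mean_error = errors.mean()              # signed mean: reveals a systematic bias
mean_abs_error = np.abs(errors).mean()  # mean absolute error
max_abs_error = np.abs(errors).max()    # maximum absolute error

print("Mean error:            ", mean_error)
print("Mean absolute error:   ", mean_abs_error)
print("Maximum absolute error:", max_abs_error)
```

The signed mean error is included because a method can have a small mean error and still a large mean absolute error when positive and negative errors cancel.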

We report below the performance of various theoretical methods, reproducing the figures from the following reference: Helgaker, T.; Jorgensen, P.; Olsen, J. *Molecular Electronic-Structure Theory*, 1st ed.; John Wiley & Sons: Nashville, TN, 2014. The test set is 20 small organic molecules and the comparison is with respect to experimental data.

The figure below shows the distribution of the bond length errors (in pm). Generally speaking, the bond length decreases with the increase in the basis set size, and increases with the increase in the number of Slater determinants. Hartree-Fock theory underestimates the bond length and has broad error distributions. Increasing the basis set size does not improve the description. The MP2 and MP4 methods show a better performance than the MP3 method, which has a systematic error. The CCSD(T) method shows the best performance overall, with a very narrow error peak for the cc-pVTZ and the cc-pVQZ basis sets. The cc-pVDZ basis set is generally inadequate.
<img src="bond_length_error.png" alt="drawing" width="500px;"/>

The following figure shows the error in bond angle (in degrees). The errors in the bond length and bond angle are coupled. The shorter the bond length, the closer the atoms, and thus the greater the repulsion, and the larger the bond angle.
<img src="bond_angle_error.png" alt="drawing" width="500px;"/>

Finally, the following figure shows the distribution of reaction enthalpy errors (in kJ/mol) for a set of reactions at 0 K. The calculated enthalpies contain electronic and vibrational contributions. Again, the Hartree-Fock results show broad error distributions, while the CCSD(T) model has the highest accuracy.
<img src="reaction_enthalpy_error.png" alt="drawing" width="600px;"/>

Of course, the above analysis is for a small sample of molecules and for only a few properties. Careful analysis and benchmarking of theoretical methods should always be performed in order to make reliable predictions with good estimation of the errors. 

## Useful Resources

- Szabo, A.; Ostlund, N. S. *Modern Quantum Chemistry: Introduction to Advanced Electronic Structure Theory*; Dover Publications: Mineola, NY, 1996. (Sections 2.2.5-2.2.7, 2.3.3, Chapters 4 and 6)  
- Cramer, C. J. *Essentials of Computational Chemistry: Theories and Models*, 2nd ed.; John Wiley & Sons: Chichester, England, 2004. (Chapter 7)  
- Jensen, F. *Introduction to Computational Chemistry*, 3rd ed.; John Wiley & Sons: Nashville, TN, 2017. (Chapter 4) 