<a href="https://colab.research.google.com/github/jamesETsmith/2022_simons_collab_pyscf_workshop/blob/main/demos/03_TDL_Convergence.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Setting up the Jupyter notebook

* We need to install a few things before we get started
  * [PySCF](https://pyscf.org/) for the quantum chemistry
  * [NumPy](https://numpy.org/) for manipulating arrays
  * [plotly](https://plotly.com/python/) for plotting
  * [pandas](https://pandas.pydata.org/) for manipulating table data

In [None]:
%pip install numpy pyscf plotly==5.8.0 pandas

# What makes a meaningful quantum chemistry calculation for solid systems?

In calculations for solids, we consider a unit cell of atoms which we then infinitely repeat in space using periodic boundary conditions. This means that only the atoms in the unit cell are independently treated. The other atoms are periodic images. See e.g. https://en.wikipedia.org/wiki/Periodic_boundary_conditions.

Besides the same requirements as a molecular calculation, i.e.

- A proper **method** (see previous notebook)
- A proper **1e basis set** (see previous notebook),

a calculation with periodic boundary conditions needs a large enough cell that enough atoms are treated independently rather than as images. Instead of enlarging the cell in real space, we sample reciprocal space more finely, using an increasing number of **k** points. Therefore, we also need

- A large enough number of **k points** to reach the thermodynamic limit (TDL) and remove finite size errors.
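
To make the link between k points and system size concrete, the sketch below builds a uniform, Gamma-centered k x k x k mesh in fractional reciprocal coordinates and counts the atoms in the real-space supercell that such a mesh corresponds to in a mean-field calculation. The helper `uniform_mesh` is hypothetical (it is not a PySCF function), and the two atoms per cell are an assumption taken from the primitive silicon cell set up later in this notebook.

```python
from itertools import product


def uniform_mesh(k):
    """Fractional coordinates of a uniform, Gamma-centered k x k x k mesh.

    Hypothetical helper for illustration; not part of PySCF.
    """
    return [(i / k, j / k, l / k) for i, j, l in product(range(k), repeat=3)]


atoms_per_cell = 2  # assumption: the two-atom primitive silicon cell

for k in range(1, 6):
    mesh = uniform_mesh(k)
    n_k = len(mesh)  # number of k points, k**3
    # A k x k x k mesh plays the role of a k x k x k supercell in real space
    print(f"k = {k}: {n_k:3d} k points ~ {atoms_per_cell * n_k:3d} atoms in the supercell")
```

The number of k points grows as k cubed, which is why the cost of the calculations later in the notebook rises quickly with the mesh size.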


# Sources of error
  - **Method** error (see previous notebook)
  - **Basis set incompleteness** error (see previous notebook)
  - **Finite size** error

# Focus on removing the finite size error here:

Method and basis sets are chosen to be inexpensive here (DFT with LDA and gth-szv basis). Note: this is usually not enough for production calculations.

In [None]:
import numpy as np # manipulate arrays
import pandas as pd # read in and manipulate csv data
import plotly.express as px
from pyscf.pbc import gto, scf # note the pyscf.pbc for solid calculations

# Setting up our system
We initialize the solid PySCF object with coordinates, basis, and pseudopotential information.
Here, we consider silicon in a primitive face-centered cubic (FCC) cell with two atoms (see e.g. https://en.wikipedia.org/wiki/Silicon).

In [None]:
# Setting up primitive face centered cubic (FCC) cell
latt_param = 5.431  # Default units are in Angstrom; https://physics.nist.gov/cgi-bin/cuu/Value?asil
cell_lattice = 0.5*latt_param*np.asarray([[1.0, 0.0, 1.0],
                                          [1.0, 1.0, 0.0],
                                          [0.0, 1.0, 1.0]])
qlp = latt_param*0.25
cell_xyz = f"""Si        0.00000    0.00000   0.00000
               Si        {qlp}      {qlp}     {qlp}"""
cell = gto.Cell(a=cell_lattice, atom=cell_xyz, basis="gth-szv", pseudo="gth-pade", verbose=4)
cell.build()
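
As a quick sanity check on the geometry, the sketch below uses only the lattice vectors and atom positions from the cell above (no PySCF) to find the nearest-neighbor distance between the atom at the origin and the periodic images of the second atom. Searching translations of -1, 0, and 1 along each lattice vector is an assumption of this sketch; it is enough to find the nearest neighbors in this small cell.

```python
import math
from itertools import product

latt_param = 5.431  # Angstrom, same value as in the cell above
half = 0.5 * latt_param
lattice = [(half, 0.0, half), (half, half, 0.0), (0.0, half, half)]
qlp = 0.25 * latt_param
atom_b = (qlp, qlp, qlp)  # second Si atom; the first one sits at the origin

# Distance from the atom at the origin to each periodic image of the second atom
distances = []
for n in product((-1, 0, 1), repeat=3):
    # Lattice translation n1*a1 + n2*a2 + n3*a3
    shift = [sum(n[i] * lattice[i][c] for i in range(3)) for c in range(3)]
    image = [atom_b[c] + shift[c] for c in range(3)]
    distances.append(math.sqrt(sum(x * x for x in image)))

bond = min(distances)
n_neighbors = sum(1 for d in distances if abs(d - bond) < 1e-8)
print(f"nearest-neighbor distance: {bond:.4f} Angstrom")
print(f"number of nearest neighbors: {n_neighbors}")
```

Each silicon atom should have four nearest neighbors at a distance of sqrt(3)/4 times the lattice parameter, as expected for the diamond structure. Only one of them lies inside the unit cell; the other three are periodic images.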

## Density functional theory (here: LDA) energy in the thermodynamic limit
Now, we would like to evaluate the LDA energy in the thermodynamic limit, i.e., the limit of an infinite bulk crystal or, equivalently, an infinitely fine k point mesh. We start with small k point meshes and increase the number of k points until we can extrapolate to the thermodynamic limit.

In [None]:
lda_es = []
ks = list(range(1,6))
# Only running k point meshes 111 to 555 due to cost here.
for k in ks:
    mykmf = scf.KRKS(cell, cell.make_kpts([k,k,k]), xc="lda").run()
    lda_es.append(mykmf.e_tot)
print(lda_es)

# Analysis

In [None]:
# Collect data
inv_nk = [1/k**3 for k in ks] # This is 1/number of k points.
energies = lda_es

# Plotting
fig = px.line(x=inv_nk, y=energies, title="LDA TDL Convergence", markers=True)
fig.update_layout(xaxis_title="1/Nk", yaxis_title="Energy (Ha)")
fig.update_traces(marker_size=12)
fig.update_xaxes(range=[0.0, 1.01])
fig.show() # It's interactive!

In [None]:
# Since we did not have time to run larger meshes, we use some previously calculated data:
data = {"k": list(range(1,11)),
        "E_LDA": [-6.782331133176456, -7.40198089364257, -7.479069536866599, -7.494960445535517,
                  -7.499149552150502, -7.500421034558795, -7.50084684160541, -7.501000257157286,
                  -7.501058802586751, -7.501082221827545]}
# Note that using pandas is not really necessary here but this shows its use.
lda_tdl = pd.DataFrame(data)
lda_tdl
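
One simple way to judge convergence from this table is to look at how much the energy changes from one mesh to the next. The sketch below does this in plain Python with the tabulated values; the 1 mHa threshold is a hypothetical tolerance chosen for illustration, not a recommendation from this notebook. With pandas, `lda_tdl["E_LDA"].diff()` should give the same differences, with sign.

```python
ks = list(range(1, 11))
e_lda = [-6.782331133176456, -7.40198089364257, -7.479069536866599, -7.494960445535517,
         -7.499149552150502, -7.500421034558795, -7.50084684160541, -7.501000257157286,
         -7.501058802586751, -7.501082221827545]  # Ha, copied from the table above

threshold = 1e-3  # Ha; hypothetical tolerance for this illustration

# Absolute energy change between consecutive meshes
changes = [abs(e_lda[i] - e_lda[i - 1]) for i in range(1, len(e_lda))]
for k, de in zip(ks[1:], changes):
    print(f"k = {k:2d}  |E(k) - E(k-1)| = {de:.2e} Ha")

# First mesh whose change relative to the previous mesh is below the threshold
converged_k = next((k for k, de in zip(ks[1:], changes) if de < threshold), None)
print("first mesh below threshold:", converged_k)
```

A small change between two meshes does not guarantee that the energy is equally close to the thermodynamic limit, so this is a lower bound on the remaining finite size error rather than an estimate of it.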

In [None]:
# Collect data
inv_nk = 1/lda_tdl["k"]**3 # This is 1/number of k points.
energies = lda_tdl["E_LDA"]

# Plotting
fig = px.line(x=inv_nk, y=energies, title="LDA TDL Convergence", markers=True)
fig.update_layout(xaxis_title="1/Nk", yaxis_title="Energy (Ha)")
fig.update_traces(marker_size=12)
fig.update_xaxes(range=[0.0, 1.01])
fig.show()
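
The plot can be turned into a rough number for the thermodynamic limit by extrapolating to 1/Nk = 0. The sketch below draws a straight line through the two largest meshes in the table and reads off the intercept. That the finite size error is linear in 1/Nk is an assumption of this sketch, not something the notebook establishes, so the result is a rough estimate only.

```python
ks = [9, 10]
energies = [-7.501058802586751, -7.501082221827545]  # Ha, from the table above

# x axis of the plot: 1 / (number of k points)
x = [1 / k**3 for k in ks]

# Straight line through the two points, E = e_tdl + slope * x
slope = (energies[1] - energies[0]) / (x[1] - x[0])
e_tdl = energies[1] - slope * x[1]  # intercept at 1/Nk = 0

print(f"slope: {slope:.4f} Ha")
print(f"estimated TDL energy (linear in 1/Nk, assumed): {e_tdl:.6f} Ha")
print(f"difference from the k = 10 energy: {e_tdl - energies[1]:.2e} Ha")
```

Repeating the extrapolation with a different pair of meshes and comparing the intercepts gives a feel for how much the estimate depends on the assumed functional form.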