<a href="https://colab.research.google.com/github/MosaicGroupCMU/African-MRS-Tutorials/blob/main/Google-Colab/1_Quantum_Espresso_Silicon_Medium.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Introduction to electronic structure optimization of silicon using Quantum ESPRESSO

Contributors: [Seda Oturak](https://github.com/sedaoturak), [Ismaila Dabo](https://scholar.google.com/citations?user=rN299m0AAAAJ&hl=en), [Jessica Wen](https://github.com/JessicaWen-PhD), [Cierra Chandler](https://github.com/Cierra-Chandler), [Henry Eya](https://github.com/Henrynweya)




### Structure of the class

1.   Introduction to Quantum Mechanics
2.   Schrödinger Equation
3.   Density-Functional Theory (DFT)
4.   Application of DFT to Single-Particle Calculations
5.   Hands-on Workshop



### How to use these workbooks

*   Work in pairs and help others!
*   Fill in the blanks and run code blocks to see if it works.
*   Answers will be shared after the class.
*   This medium-level workbook is designed for people who have some experience with Python (including the matplotlib and numpy libraries) and with the terminal, but who might not have seen Quantum ESPRESSO before.
*   If this intermediate level is too challenging for you, you can switch to the [easy difficulty Google Colab workbook](https://github.com/MosaicGroupCMU/African-MRS-Tutorials/blob/358659230a55be893b7ae11a3cd4e862fc486342/Google-Colab/1_Quantum_Espresso_Silicon_Easy.ipynb).
*   If this intermediate level is too easy for you, you can switch to the hard difficulty worksheet, where you run Quantum ESPRESSO on your own computer.

# Install libraries, environment, and Quantum ESPRESSO

This part loads libraries for numerical calculations and plotting.

In [None]:
# load plotting libraries
import matplotlib.pyplot as plt

# load numerical libraries
import numpy as np

Quantum ESPRESSO is a plane wave code, which uses Fourier transforms to solve equations in plane wave space. This part installs libraries for fast Fourier transforms (FFTs).
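To make the link between plane waves and Fourier transforms concrete, here is a minimal sketch that uses only the Python standard library. It is a naive discrete Fourier transform, not the FFTW routines installed below, and the grid size and wave number are arbitrary choices for illustration. A single cosine wave sampled on a grid is described by just two Fourier coefficients.

```python
import cmath
import math

def dft(samples):
    """Naive discrete Fourier transform (cost grows as N^2).

    An FFT returns the same coefficients with far fewer operations.
    """
    n = len(samples)
    return [
        sum(samples[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n)) / n
        for k in range(n)
    ]

n_grid = 16  # number of real-space grid points (arbitrary)
# a single cosine wave with 3 periods across the grid
wave = [math.cos(2 * math.pi * 3 * j / n_grid) for j in range(n_grid)]
coefficients = dft(wave)

# indices of the Fourier coefficients that are not numerically zero
dominant = [k for k, c in enumerate(coefficients) if abs(c) > 1e-8]
print(dominant)
```

The two indices correspond to the wave numbers +3 and -3 (the latter stored at position `n_grid - 3`), because a cosine is the sum of two complex plane waves.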

In [None]:
%%capture
# (%%capture must be the first line of the cell; it eliminates text output during installation)

# install mathematical libraries to perform fast Fourier transforms
# (the exclamation mark means that the command is run in the Linux shell)
! apt-get install -y libfftw3-3 libfftw3-dev libfftw3-doc

The Atomic Simulation Environment (ASE) is a set of tools for running, visualizing, and analyzing simulations. This part installs ASE.

In [None]:
%%capture
# (%%capture must be the first line of the cell; it eliminates text output during installation)

# install the Atomic simulation environment
# ! apt install ase
! pip install git+https://gitlab.com/ase/ase

Compiling Quantum ESPRESSO from scratch would take a long time. This part uploads pre-compiled executable files (`.x` extension) and additional files containing the pseudopotentials.

In [None]:
%%capture
# (%%capture must be the first line of the cell; it eliminates text output during installation)

# navigate to main directory named '/content/'
%cd /content/

# download the pre-compiled files in compressed format (under Linux)
#! wget 'https://docs.google.com/uc?export=download&id=1kw_CJMjP6ggDZXDNp5phAqCPpoe2WXCA' -O qe-lite.tgz
!gdown 'https://drive.google.com/uc?export=download&id=13l-Kiyg-F6aYb5lF8M3RsE1hSnLRdGna' -O qe-lite.tgz

# unpack the compressed files (under Linux)
! tar -xvzf qe-lite.tgz

# clean up some files
! rm -rf sample_data qe-lite.tgz

# Prepare Quantum ESPRESSO input file

This is where we start setting up the data and information for our Quantum ESPRESSO calculations.

We first need to start with the Self-Consistent Field (SCF) calculation. The parameters in the Quantum ESPRESSO input file require some experimentation to achieve the best trade-off between accuracy and run time.

The calculation is for a unit cell of diamond silicon. The definition of the input parameters of the `pw.x` executable can be found at `www.quantum-espresso.org/Doc/INPUT_PW.html`.

In [None]:
# create calculation folder and navigate into it
%________ -p /content/________
%________ /content/________/

# create input and write it into the file si.scf.in
________ = """
&________
  prefix='________',
  pseudo_dir = '________',
  outdir='/content/________/'
/
&________
  ibrav = ________,
  celldm(1) = ________,
  nat = ________,
  ntyp = ________,
  ecutwfc = ________,
  ecutrho = ________,
/
&electrons
  conv_thr = ________,
/
ATOMIC_SPECIES
 ________  ________  ________
ATOMIC_POSITIONS alat
 ________ ________ ________ ________
 ________ ________ ________ ________
K_POINTS ________
   ________
CELL_PARAMETERS
  ________
"""

with open(________) as f:
    f.________(________)

# print the content of the input file (under Linux)
________

Use [ASE tools](https://wiki.fysik.dtu.dk/ase/ase/io/formatoptions.html#ase.io.espresso.read_espresso_in) to extract information from Quantum ESPRESSO input and visualize the crystal.

In [None]:
import ase.io.espresso
from ase import Atoms
from ase.visualize import view
from ase.build import make_supercell
from ase.build import bulk

# extract unit cell information from input file using ASE
input_file = ase.io.espresso.________('________')
si = Atoms(________)

# create a supercell (3 × 3 × 3) using ASE
multiplier = np.________(3) * ________
si_supercell = ________(________,________)

# visualize the supercell
view(________, ________='x3d')

HINT: if you're getting an AttributeError where the 'NoneType' object has no attribute 'append', it's probably because you're not quite formatting your input file correctly. For example, you might be missing commas.

# Run Quantum ESPRESSO using input file

Make sure that the pseudopotential file is present in the working directory. It's time to start the self-consistent field calculation, which is often the most time-consuming part of a calculation. It should produce the total energy of the system.

### What is the Self-Consistent Field (SCF) calculation?

The SCF calculation is an iterative computational method used to solve the many-body Schrödinger equation for electrons in atoms, molecules, and solids.

It starts with an initial guess for the electron distribution and iteratively refines it until convergence is reached. Each electron is treated as if it moves in the average field created by all other electrons, simplifying the complex many-body problem (mean-field approximation).

The calculation aims to minimise the total energy of the system with respect to the electron density, thus solving for the Kohn-Sham orbitals and their energies by using (in this case) a plane wave basis set to represent the electron wavefunctions.

The iteration stops when the change in energy or electron density between successive steps becomes sufficiently small (given by the convergence threshold).
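The loop structure described above can be sketched with a toy problem. This is not the Kohn-Sham problem that `pw.x` solves: `toy_output_density` is a hypothetical stand-in for "solve the single-particle equations in the field of the input density", and the mixing factor and threshold are arbitrary illustrative values. Production codes use more elaborate mixing schemes, but the guess, update, and check pattern is the same.

```python
def toy_output_density(n_in):
    """Hypothetical stand-in: returns the density produced by the field of n_in."""
    return 1.0 / (1.0 + n_in)

def run_toy_scf(n_start, mixing=0.5, conv_thr=1e-10, max_iter=200):
    """Iterate until the input and output densities agree within conv_thr."""
    n_in = n_start
    for iteration in range(1, max_iter + 1):
        n_out = toy_output_density(n_in)
        residual = abs(n_out - n_in)  # how far we are from self-consistency
        if residual < conv_thr:
            return n_in, iteration
        # linear mixing: blend the old input with the new output
        n_in = (1.0 - mixing) * n_in + mixing * n_out
    raise RuntimeError("toy SCF loop did not converge")

density, n_iterations = run_toy_scf(n_start=2.0)
print(f"converged density = {density:.8f} after {n_iterations} iterations")
```

At convergence the density reproduces itself, that is, `toy_output_density(density)` equals `density` to within the threshold. This is what "self-consistent" means.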

### What are the pseudopotential files used for?

The pseudopotential files replace the full electron system with an effective potential that accounts for the core electrons, so that we don't need to calculate where every single electron in the system goes. This reduces computational complexity. Well-designed pseudopotentials aim to balance computational efficiency with accuracy across different chemical environments.

You can read more about choosing pseudopotentials on the [VASP wiki](https://www.vasp.at/wiki/index.php/Choosing_pseudopotentials).

In [None]:
# run the pw.x executable using si.scf.in to create si.scf.out
________

# print the content of the output file (under Linux)
________

In [None]:
# first method: extract total energies in rydberg during the self-consistent-field calculation (under Linux)
________

In [None]:
# define physical constants for unit conversion
from scipy.constants import physical_constants
ha_in_ev = physical_constants["________"][0] # extract the Hartree energy in eV from physical_constants package
ry_in_ev = ________ * ________.

# second method: extract total energy at the end of the self-consistent calculation (using ASE)
output = ase.io.read("________")
total_energy = ________.________()
print("Energy = %.8f Ry " % ________) # total energy in rydberg
print("Energy = %.8f eV " % ( ________ ) ) # total energy in electronvolts

# Convergence test with respect to the cutoff energy

Convergence tests are an important part of any DFT calculation. We do convergence tests for the cutoff energy and the k-point sampling.

You can read more about convergence tests in this [Choudhary and Tavazza, 2020](https://pmc.ncbi.nlm.nih.gov/articles/PMC7066999/) paper.

### About Cutoff Energy

The wavefunctions of electrons we use in these DFT calculations are Kohn-Sham wavefunctions, and we can theoretically create these wavefunctions by summing an infinite number of plane waves.

However, we are not able to compute an infinite number of plane waves, so we use the cutoff energy to specify the maximum kinetic energy of the plane waves used (i.e. we truncate the infinite expansion of the Kohn-Sham wavefunction into plane waves, thus ignoring all plane waves with kinetic energy above this threshold). The cutoff energy therefore controls the size of the basis set: a higher cutoff means that more plane waves are used to represent the electronic wavefunctions.

Higher cutoff energies generally lead to more accurate results (because you are using more plane waves) but also increase computational time. For most systems, a cutoff of 25-40 Ry (340-544 eV) may be sufficient, but this can vary widely.
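To get a feel for how quickly the basis grows with the cutoff, the sketch below counts the plane waves whose kinetic energy lies under a given cutoff. It makes simplifying assumptions that do not match the silicon calculation exactly: a simple cubic cell (silicon is face-centred cubic), a single k-point at the zone centre, and a hypothetical helper name `count_plane_waves`. In Rydberg atomic units the kinetic energy of a plane wave with wave vector G (in inverse bohr) is |G|² Ry.

```python
import math

def count_plane_waves(ecut_ry, cell_length_bohr):
    """Count reciprocal lattice vectors G of a simple cubic cell with |G|^2 <= ecut_ry."""
    g_unit = 2.0 * math.pi / cell_length_bohr  # spacing of the reciprocal lattice
    n_max = int(math.sqrt(ecut_ry) / g_unit) + 1  # largest integer index worth checking
    count = 0
    for i in range(-n_max, n_max + 1):
        for j in range(-n_max, n_max + 1):
            for k in range(-n_max, n_max + 1):
                kinetic_ry = g_unit ** 2 * (i * i + j * j + k * k)
                if kinetic_ry <= ecut_ry:
                    count += 1
    return count

# cell length chosen for illustration only
for ecut in (10, 20, 40):
    print(f"ecut = {ecut:2d} Ry -> {count_plane_waves(ecut, 10.2)} plane waves")
```

The count grows roughly as the cutoff to the power 3/2 (the volume of a sphere in reciprocal space), which is why doubling the cutoff costs noticeably more than twice as much.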


### About Convergence Tests

Convergence tests achieve several goals:

1.   Accuracy: they make sure that the properties we calculated are sufficiently accurate and not affected by numerical artifacts.
2.   Property convergence: we want to find the point at which the properties we care about (e.g. total energy, bond lengths, electronic structure, etc.) are stabilised with increasing cutoff energy or k-point density.
3.   System-specific optimisation: we can account for the unique characteristics of different materials, such as band gap, crystal structure, and chemical composition.
4.   Validation of results: convergence tests provide confidence in the calculated results by demonstrating that further increases in cutoff energy or k-point density don't significantly change the outcomes.

Convergence tests allow us to establish a balance between accuracy and computational cost, which ensures that our DFT calculations can produce reliable and reproducible results for the properties we are investigating.

You can read more about convergence tests on [HJK Group at MIT's Convergence 101 post](https://hjkgrp.mit.edu/tutorials/2012-04-17-convergence-101/).
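One way to turn "doesn't significantly change" into a decision is sketched below. The helper `first_converged` is hypothetical, and the energies come from an artificial exponential model in arbitrary units, not from any Quantum ESPRESSO run. It returns the smallest tested parameter value from which every later energy stays within a tolerance of the energy at the largest value tested.

```python
import math

def first_converged(values, energies, tolerance):
    """Return the first value after which all energies stay within tolerance
    of the last (best converged) energy, or None if that never happens."""
    reference = energies[-1]
    converged_from = None
    for value, energy in zip(values, energies):
        if abs(energy - reference) <= tolerance:
            if converged_from is None:
                converged_from = value
        else:
            converged_from = None  # a later point broke convergence, so start over
    return converged_from

# artificial model data (NOT real results): energy decays exponentially to -1.0
cutoffs = [12, 16, 20, 24, 28, 32]
model_energies = [-1.0 + math.exp(-c / 4.0) for c in cutoffs]

print(first_converged(cutoffs, model_energies, tolerance=0.005))
```

The tolerance should reflect the accuracy needed for the property of interest, and the last point only serves as a reference if the scan extends far enough to be well converged itself.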


In [None]:
# create a list for cutoff energies to be tested
cutoff_energies = ________ # try e.g. values from 12 to 44 in increments of 4

# find the "cutoff energy" line in the input file
wfc_index = qe_input.find('________') + ________
rho_index = qe_input.find('________') + ________

total_energies = ________
for (n, cutoff_ratio) in enumerate([4,8,10]):
  for cutoff in cutoff_energies:
    # update the input file with the new cutoff energy
    new_input_file = (________)

    # overwrite the input file
    ________

    # run the DFT input file
    ________

    # read the output file
    ________

    ________ # record the calculated total energy into total_energies

In [None]:
# plot convergence graph of cutoff energies. Remember to include units in the axes labels!
________

# Convergence test with respect to the k-point sampling

The k-point sampling density specifies how many points in the [Brillouin zone](https://eng.libretexts.org/Bookshelves/Materials_Science/Supplemental_Modules_(Materials_Science)/Electronic_Properties/Brillouin_Zones) are used to calculate the property we are investigating (essentially by providing an efficient means of integrating periodic functions of the wave vector in the Brillouin zone).

This is usually done by taking a grid of k-points, often generated by methods like [Monkhorst-Pack sampling](https://journals.aps.org/prb/abstract/10.1103/PhysRevB.13.5188).

From [Choudhary and Tavazza, 2020](https://pmc.ncbi.nlm.nih.gov/articles/PMC7066999/):

> The total energy of the system is the most important output of a DFT calculation, and it is obtained by numerically integrating the Hamiltonian over the Brillouin zone. The k-points are a generic way to discretize such as integral. The quality of the results heavily depends on the number of these points on the mesh-grid as well as the method generating the mesh-grid itself used in such integration. The number of points can be arbitrarily increased to increase the precision of calculations. However, the higher the number of irreducible k-points, the higher the computational cost. Therefore, finding the optimum number of k-points to determine the total energy within a specified tolerance (i.e. “converging” on the k-point mesh) is extremely important.
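As an illustration of what such a mesh looks like, the sketch below generates the fractional coordinates of a Monkhorst-Pack grid from the formula in the original paper, u_r = (2r - q - 1) / (2q) for r = 1, ..., q. The function names are hypothetical, and grid conventions (for example whether the zone centre is included) differ between codes, so this need not match the mesh that `pw.x` generates from its input.

```python
from itertools import product

def monkhorst_pack_1d(q):
    """Fractional coordinates along one reciprocal lattice vector for q divisions."""
    return [(2 * r - q - 1) / (2 * q) for r in range(1, q + 1)]

def monkhorst_pack_grid(q1, q2, q3):
    """All k-points of a q1 x q2 x q3 mesh, in units of the reciprocal lattice vectors."""
    return list(product(monkhorst_pack_1d(q1), monkhorst_pack_1d(q2), monkhorst_pack_1d(q3)))

grid = monkhorst_pack_grid(4, 4, 4)
print(len(grid))             # total number of k-points before symmetry reduction
print(monkhorst_pack_1d(4))  # coordinates along one direction
```

In practice, crystal symmetry maps many of these points onto each other, so only the irreducible k-points mentioned in the quote need to be computed.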

In [None]:
# initialize input file with appropriate cutoffs
qe_input = """
________
"""

# write the above to an input file
________

# print the content of the input file (under Linux)
________

# create a list for k points to be tested
kpoints = ________ # enter kpoints limits
# find the "k points" line in the input file
kpoints_index = qe_input.find('________') + ________

total_energies = ________

for k in kpoints:
  # update the input file with the new k points
  new_input_file = qe_input[:________] + str(k).ljust(2) + str(k).ljust(2) + str(k).ljust(2) + qe_input[________+8:]

  # overwrite the input file
  ________

  # run the DFT input file
  ________

  # read the output file
  ________

  ________.append(________.get_total_energy()) # record the calculated total energy in total_energies

In [None]:
# plot convergence graph. Don't forget units on the axes labels!
________

# Lattice parameter

We can now use an appropriate cutoff energy and k-point sampling found from the convergence tests to find the energetically favourable lattice parameter by iterating the self-consistent field calculation over many lattice parameters.
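Because the scan only samples the energy at discrete lattice parameters, the minimum usually lies between two grid points. A simple way to estimate it, sketched here under stated assumptions, is to fit a parabola through the lowest point and its two neighbours. The helper `parabolic_minimum` is hypothetical, and the energies come from an artificial quadratic model with an arbitrarily chosen minimum, not from a silicon calculation.

```python
def parabolic_minimum(xs, ys):
    """Estimate the minimum position from the lowest sample and its two neighbours."""
    i = ys.index(min(ys))
    if i == 0 or i == len(ys) - 1:
        raise ValueError("minimum is at the edge of the scan; extend the range")
    x0, x1, x2 = xs[i - 1], xs[i], xs[i + 1]
    y0, y1, y2 = ys[i - 1], ys[i], ys[i + 1]
    # vertex of the parabola through the three points
    numerator = (x1 - x0) ** 2 * (y1 - y2) - (x1 - x2) ** 2 * (y1 - y0)
    denominator = (x1 - x0) * (y1 - y2) - (x1 - x2) * (y1 - y0)
    return x1 - 0.5 * numerator / denominator

# artificial model data (NOT real results): parabola with its minimum at 10.23
scan = [10.0 + 0.1 * i for i in range(6)]
model_energies = [0.05 * (a - 10.23) ** 2 - 1.0 for a in scan]

print(f"estimated minimum at {parabolic_minimum(scan, model_energies):.4f}")
```

A real energy curve is not exactly parabolic, so the estimate improves when the scan points near the minimum are closely spaced. Fitting an equation of state to all points is a common alternative.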

In [None]:
# initialize input file with appropriate cutoffs and k-point sampling
qe_input = """
________
"""

# write the above to an input file
________

# print the content of the input file (under Linux)
________

# create a list for lattice constants to be tested
lattice_constants = ________ # try from 9.5 to 11.1 bohr in increments of 0.1 bohr

# find the "lattice constant" line in the input file
lat_cons_index = ________

total_energies = []
for constant in lattice_constants:
  # update the input file with the new lattice constant
  new_input_file = ________

  # overwrite the input file
  ________

  # run the DFT input file
  ________

  # read the output file
  ________

  ________ # record the calculated total energy to total_energies

In [None]:
# plot potential energy curve and label the axes. Don't forget units!
________