# Electronic structure

In this exercise, you will combine everything you have learned so far: from Python basics to running and evaluating QE calculations.

### 1. Bandgap of diamond Si

**Tasks**:
Use DFT to calculate the bandgap of Si in the diamond structure. Here are some steps that may guide you:

1. Construct a structure corresponding to diamond Si. Use a primitive cell with two atoms. You can get inspiration from [Materials Project](https://materialsproject.org/materials/mp-149/) and construct an `Atoms` object manually, or just use `bulk` from `ase.build`.
2. Get the equilibrium (cubic) lattice constant by performing an [E-V curve fit](https://en.wikipedia.org/wiki/Birch%E2%80%93Murnaghan_equation_of_state). To spare you convergence tests, you can use a plane-wave cutoff energy of $40\,\mathrm{Ry}$ and a charge-density cutoff energy of $240\,\mathrm{Ry}$, and sample the reciprocal cell (1st Brillouin zone) with an automatic $k$-point mesh corresponding to a spacing parameter of $0.03\,\mathrm{Å^{-1}}$. Additionally, add the following parameters to the QE input file:
    - `occupations = 'fixed'` - we have an insulator
    - `nbnd = 10` - for the total energy, only the occupied valence bands matter (4 bands for the 8 valence electrons in the cell), but for visualization we also want some empty conduction bands

    (The reasons for these settings go beyond these exercises. It suffices to say that they are related to the (expected) semiconducting character of Si; see the [pw.x input file description](https://www.quantum-espresso.org/Doc/INPUT_PW.html).)
3. For the equilibrium structure, get a converged electronic structure. To also get the density of states (DOS) and the partial density of states (PDOS), we will call the QE postprocessing tools `dos.x` ([input file description](https://www.quantum-espresso.org/Doc/INPUT_DOS.html)) and `projwfc.x` ([input file description](https://www.quantum-espresso.org/Doc/INPUT_PROJWFC.html)) and parse the DOS information from the output text files manually. Get:
    - the band gap (Hint: look for the `highest occupied, lowest unoccupied level (ev)` line in the `espresso.pwo` output file)
    - a plot of the total density of states (Hint: inspect the `Si.tdos.dat` file; the first two columns are the energy and the DOS, and the Fermi level is printed on the header line). Conventionally, zero on the energy axis is set to the Fermi energy.
4. Which states ($s$, $p$, ...) are at the top of the valence and the bottom of the conduction band? In which energy range do the $s$ and $p$ states hybridize? (Hint: see `'Si.pdos.pdos_atm#1(Si)_wfc#1(s)'`... files)
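
To get a feel for what the spacing parameter in step 2 means, the sketch below estimates the $k$-point mesh for the two-atom primitive cell. It assumes the convention $N_i = \lfloor |b_i| / \Delta k \rfloor + 1$, with the reciprocal vectors $b_i$ defined without the factor $2\pi$, which is our understanding of how ASE interprets `kspacing`. The helper name `mesh_from_spacing` is made up for this illustration. The mesh that is actually used is written to the `K_POINTS` card of the generated QE input file, so check there if in doubt.

```python
import numpy as np

def mesh_from_spacing(cell, spacing):
    """Estimate a k-point mesh from a spacing in 1/Å (assumed convention, see text)."""
    # rows of `recip` are the reciprocal lattice vectors WITHOUT the 2*pi factor
    recip = np.linalg.inv(cell).T
    lengths = np.linalg.norm(recip, axis=1)
    return [int(length / spacing) + 1 for length in lengths]

a = 5.431  # Å, the starting lattice constant used in this notebook
# primitive fcc cell vectors (one per row) of the two-atom diamond cell
fcc_cell = 0.5 * a * np.array([[0.0, 1.0, 1.0],
                               [1.0, 0.0, 1.0],
                               [1.0, 1.0, 0.0]])

mesh_coarse = mesh_from_spacing(fcc_cell, 0.03)  # spacing used for the E-V curve
mesh_fine = mesh_from_spacing(fcc_cell, 0.01)    # denser spacing used for the DOS
print(mesh_coarse, mesh_fine)
```

Reducing the spacing by a factor of three roughly triples the number of points along each direction, which is why the DOS calculation is noticeably more expensive than a single E-V point.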

We start with loading some useful modules...

In [None]:
from ase import Atoms
from ase.build import bulk
from ase.visualize import view
from ase.eos import EquationOfState
from ase.calculators.espresso import Espresso, EspressoProfile
from ase.io import read

import numpy as np
from tqdm.notebook import tqdm
import matplotlib.pyplot as plt

# We will be using MPI for parallelization, so disable OpenMP threading
import os
os.environ['OMP_NUM_THREADS'] = '1'

### 1.1. Optimizing the Si structure

In [None]:
# Si diamond structure primitive cell
diamond_struct = bulk('Si', 'diamond', a=5.431) 
# view the structure
view(diamond_struct, viewer='ngl')

Let's create a dictionary of input parameters and define the pseudopotential.

In [None]:
# we create a dictionary representing the input parameters
pwinput = {
    'ecutwfc': 40, # required and very important: plane-wave cutoff energy for the wavefunctions, here 40 Ry
    'ecutrho': 240, # plane-wave cutoff energy for charge density
    'disk_io': 'nowf', # do not write any large files
    'occupations': 'fixed', # insulator - no smearing
    'nbnd': 10, # for total energy, only the valence bands matter, but for visualization we want more (empty) bands
}
# Si pseudopotential, based on https://www.materialscloud.org/discover/sssp/table/efficiency 
# downloaded in /opt/SSSP_1.3.0_PBE_efficiency
pseudopotentials = {'Si': 'Si.pbe-n-rrkjus_psl.1.0.0.UPF'}

We start by gathering the data for the equation-of-state fit:

In [None]:
# QE profile (set how many MPI processes we use, parallelization, and where the pseudopotentials are)
# specifically, we have 4 MPI processes (-np 4) and k-points are distributed over these 4 processes (-nk 4)
profile = EspressoProfile(
    command='mpirun -np 4 pw.x -nk 4', pseudo_dir='/opt/SSSP_1.3.0_PBE_efficiency/'
)

volumes = []
energies = []

# create the working directory
root_path='Si-EOS/'
for scale in tqdm(np.linspace(0.95, 1.05, 11)):
    struct = diamond_struct.copy()
    struct.set_cell(diamond_struct.cell.array*scale, scale_atoms=True)
    # create a separate folder for each calculation
    folder_name = f'{root_path}scale-{scale:.2f}/'
    # DFT calculations are slow, so first check if the calculation is already done
    # try to read the espresso.pwo output file with ase.io.read
    try:
        results = read(folder_name + 'espresso.pwo')
        energies.append(results.get_total_energy())
        volumes.append(results.get_volume())
    # if reading fails (missing or incomplete output), just run a new calculation
    except Exception:
        calc = Espresso(profile=profile,
                        directory=folder_name,
                        pseudopotentials=pseudopotentials,
                        kspacing=0.03,
                        input_data=pwinput)
        struct.calc = calc
        energies.append(struct.get_total_energy())
        volumes.append(struct.get_volume())

Now, let's fit the Birch-Murnaghan EoS

In [None]:
eos = EquationOfState(___, ___, eos="birchmurnaghan")
V0_fit, E0_fit, B0_fit = eos.fit()
a0_fit = (V0_fit*___)**(___)
print(f'lattice parameter is {a0_fit:.3f}Å')
print(f'bulk modulus is {B0_fit*___:.0f}GPa')
print(f'minimum energy is {E0_fit:.3f}eV')

In [None]:
eos = EquationOfState(volumes, energies, eos="birchmurnaghan")
V0_fit, E0_fit, B0_fit = eos.fit()
a0_fit = (V0_fit*4)**(1/3)
print(f'lattice parameter is {a0_fit:.3f}Å')
print(f'bulk modulus is {B0_fit*160.2:.0f}GPa')
print(f'minimum energy is {E0_fit:.3f}eV')
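
The two numerical factors in the solution have simple origins. The conventional fcc cube of edge $a$ contains four primitive cells, so $V_0 = a^3/4$ and $a = (4V_0)^{1/3}$. ASE works in eV and Å, so the fitted bulk modulus comes out in eV/Å³, and the factor 160.2 converts it to GPa. A minimal check using only the definition of the electronvolt (the variable names are ours):

```python
# 1 eV = 1.602176634e-19 J (exact SI value), 1 Å^3 = 1e-30 m^3
EV_IN_JOULE = 1.602176634e-19
ANGSTROM3_IN_M3 = 1e-30

# eV/Å^3 -> Pa -> GPa
ev_per_a3_in_gpa = EV_IN_JOULE / ANGSTROM3_IN_M3 / 1e9
print(f"1 eV/Å^3 = {ev_per_a3_in_gpa:.4f} GPa")

# round trip between the cubic lattice constant and the primitive-cell volume
a = 5.431                      # Å, the starting lattice constant used above
v_primitive = a**3 / 4         # the conventional cube holds 4 primitive cells
a_back = (4 * v_primitive) ** (1 / 3)
print(f"V = {v_primitive:.3f} Å^3, a = {a_back:.3f} Å")
```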

In [None]:
# plot results
from matplotlib import pyplot as plt

ax = eos.plot(show=False) # built-in ASE EOS plot; returns the matplotlib Axes

ax.axvline(V0_fit, ls='--', color='gray')
ax.axhline(E0_fit, ls='--', color='gray')
ax.set_xlabel("Unit cell volume [Å³]")
ax.set_ylabel("Total energy [eV]")
ax.set_title("Birch–Murnaghan EOS, diamond Si")
ax.legend(["fit", "calculated points"])

plt.show()
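
For reference, the third-order Birch-Murnaghan expression that is fitted above can be written in a few lines of plain Python. This is a sketch of the textbook formula, not ASE's internal code. The function name and the parameter values are illustrative placeholders, not results for Si.

```python
def birch_murnaghan(v, e0, b0, bp, v0):
    """Third-order Birch-Murnaghan energy E(V).

    e0: minimum energy, b0: bulk modulus, bp: pressure derivative of b0,
    v0: equilibrium volume (consistent units, e.g. eV and Å^3).
    """
    eta = (v0 / v) ** (2 / 3)
    return e0 + 9 * v0 * b0 / 16 * ((eta - 1) ** 3 * bp
                                    + (eta - 1) ** 2 * (6 - 4 * eta))

# placeholder parameters, chosen only to exercise the formula
E0, B0, BP, V0 = -10.0, 0.55, 4.2, 40.0

# the energy at the equilibrium volume is the minimum energy
e_at_v0 = birch_murnaghan(V0, E0, B0, BP, V0)

# the bulk modulus is V * d2E/dV2 at V0; check with a central finite difference
h = 1e-2
curvature = (birch_murnaghan(V0 + h, E0, B0, BP, V0)
             - 2 * e_at_v0
             + birch_murnaghan(V0 - h, E0, B0, BP, V0)) / h**2
b0_numeric = V0 * curvature
print(e_at_v0, b0_numeric)
```

The finite-difference check makes the meaning of the fitted `B0_fit` explicit: it is the curvature of the E-V curve at the minimum, scaled by the equilibrium volume.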

### 1.2. Calculating the electronic structure

We now perform a single calculation at the equilibrium lattice parameter. At the same time, we increase the number of $k$-points, to improve the numerical accuracy of the calculated electronic properties.

In [None]:
# new optimized structure at a larger k-point density (DOS converges more slowly than total energy)
diamond_struct_opt = bulk('Si', 'diamond', a=a0_fit)
pwinput['disk_io'] = 'low' # we will need wavefunctions written to the disk

calc = Espresso(profile=profile,
                directory='Si-DOS',
                pseudopotentials=pseudopotentials,
                kspacing=0.01,
                input_data=pwinput)
diamond_struct_opt.calc = calc
diamond_struct_opt.get_total_energy()

The ASE QE interface cannot provide the DOS, the projected DOS, or the band gap.
So, at this point, we will need to run the QE tools directly and parse the files manually (`espresso.pwo` is the main output file).
(Welcome to the real DFT world :-)

The band gap can be parsed from a line of the output that looks like this:

` highest occupied, lowest unoccupied level (ev):     5.9781    6.5930`

So we will do some regular expression matching magic:

In [None]:
import re
with open("Si-DOS/espresso.pwo") as f:
    for line in f:
        # capture both levels; allow a leading minus sign in case a level is negative
        match = re.search(r"unoccupied level \(ev\):\s*(-?[0-9.]+)\s+(-?[0-9.]+)", line)
        if match:
            band_gap = float(match.group(2)) - float(match.group(1))
            print(band_gap)
            break

Notice that the band gap is underestimated by almost 50% compared to the low-temperature experimental value of $\sim 1.17\,\mathrm{eV}$. Unfortunately, this is expected. While DFT is quite good at ground-state properties, properly modelling the band gap (and the conduction band in general) requires a more advanced exchange-correlation functional (meta-GGA), the addition of exact exchange (hybrid functionals), or going beyond DFT to many-body perturbation theory (such as the GW method).
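
As a quick check of both the parsing and the "almost 50%" statement, the snippet below applies a regular expression to the sample line quoted above and compares the resulting gap with the experimental value of 1.17 eV. The numbers are those of the sample line, so your own run may differ slightly. The pattern allows an optional minus sign so that negative levels would also be matched.

```python
import re

sample = " highest occupied, lowest unoccupied level (ev):     5.9781    6.5930"
pattern = r"unoccupied level \(ev\):\s*(-?[0-9.]+)\s+(-?[0-9.]+)"

match = re.search(pattern, sample)
# valence band maximum and conduction band minimum, in eV
vbm, cbm = (float(value) for value in match.groups())
gap_dft = cbm - vbm

gap_exp = 1.17  # eV, low-temperature experimental value quoted in the text
underestimation = 1 - gap_dft / gap_exp
print(f"DFT gap: {gap_dft:.3f} eV, underestimated by {underestimation:.0%}")
```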

Now we need input files for the QE tools `dos.x` (to get the total DOS) and `projwfc.x` (for the DOS projections):

In [None]:
dos_input = """&DOS
   prefix = 'pwscf'
   fildos = 'Si.tdos.dat'
   Emin   = -8.0
   Emax   =  12.0
   DeltaE = 0.05
/
"""
with open("Si-DOS/dos.in", "w") as f:
    f.write(dos_input)

# projwfc.in
projwfc_input = """&PROJWFC
   prefix  = 'pwscf'
   DeltaE  = 0.05
   Emin    = -8.0
   Emax    =  12.0
   filpdos = 'Si.pdos'
/
"""
with open("Si-DOS/projwfc.in", "w") as f:
    f.write(projwfc_input)

Now we run the postprocessing manually. `projwfc.x` is quite verbose, so we discard its standard output (the output files will be kept).

In [None]:
!cd Si-DOS; dos.x < dos.in; projwfc.x < projwfc.in > /dev/null

In [None]:
# Load data, skip lines starting with "#"
data = np.loadtxt("Si-DOS/Si.tdos.dat", comments="#")
energy = data[:, 0]   # first column: energy (eV)
dos = data[:, 1]      # second column: DOS

# the file also contains the Fermi level on the first (header) line
import re
with open("Si-DOS/Si.tdos.dat") as f:
    first_line = f.readline()
# allow a leading minus sign in case the Fermi energy is negative
match = re.search(r"EFermi\s*=\s*(-?[0-9.]+)", first_line)
efermi = float(match.group(1))

plt.plot(energy-efermi, dos)
plt.axvline(0, linewidth=1, color='gray', linestyle='--')
plt.xlabel('$E-E_F$ [eV]')
plt.ylabel('total DOS [states/eV]')
plt.title('Total DOS')
plt.show()

In [None]:
data = np.loadtxt("Si-DOS/Si.pdos.pdos_tot", comments="#")
energy = data[:, 0] - efermi   # first column: energy (eV)
dos = data[:, 1]      # second column: DOS

# the projected DOS is per atom (2 Si atoms in the primitive cell) and per character (only s and p considered in our specific pseudopotential)
pdos = {
    "s": np.zeros(len(energy)),
    "p": np.zeros(len(energy)),
}

for o in [[1, 's'], [2, 'p']]:
    for atom in range(1,3):
        tmp = np.loadtxt(f"Si-DOS/Si.pdos.pdos_atm#{atom}(Si)_wfc#{o[0]}({o[1]})", comments="#")
        pdos[o[1]] += tmp[:, 1]
plt.plot(energy, dos, label="Total DOS")
plt.plot(energy, pdos["s"], label="Si s")
plt.plot(energy, pdos["p"], label="Si p")
plt.axvline(0, linewidth=1, color='gray', linestyle='--')
plt.legend()
plt.xlabel('$E-E_F$ [eV]')
plt.ylabel('partial DOS [states/eV]')
plt.title('Projected DOS')
plt.show()

This is in quite good agreement with the literature (D.A. Papaconstantopoulos, “Handbook of the Band Structure of Elemental Solids”, Plenum, New York, 1986):

<img src="https://www.researchgate.net/profile/Nikita-Medvedev/publication/235924584/figure/fig1/AS:669086010839061@1536533944674/The-density-of-states-of-solid-silicon-extracted-from-35-36-At-the-beginning-the_W640.jpg" />
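
To answer task 4 more quantitatively, one option is to compare the $s$ and $p$ weights inside a chosen energy window. The helper below is a hypothetical sketch (the name `orbital_fractions` is ours) that expects the same kind of `energy` array and `pdos` dictionary as built in the cell above. The data used here are synthetic stand-ins, not a real Si result.

```python
import numpy as np

def orbital_fractions(energy, pdos, e_min, e_max):
    """Fraction of each orbital character in the window [e_min, e_max]."""
    window = (energy >= e_min) & (energy <= e_max)
    # on a uniform energy grid the sum is proportional to the integral,
    # and the grid spacing cancels in the ratio
    weights = {name: values[window].sum() for name, values in pdos.items()}
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}

# synthetic stand-in data, NOT a real Si result: flat s and p contributions
energy = np.linspace(-2.0, 2.0, 81)
pdos = {"s": np.full_like(energy, 0.5), "p": np.full_like(energy, 1.5)}

fractions = orbital_fractions(energy, pdos, -1.0, 0.0)
print(fractions)
```

With the arrays from the notebook, a call such as `orbital_fractions(energy, pdos, -1.0, 0.0)` would give the character of the states within 1 eV below the Fermi level. The window must contain states, otherwise the total weight is zero.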

Our DOS is a bit noisy; this is because, with the method used here, the DOS is essentially a broadened eigenvalue histogram. While we could just add more $k$-points, a much nicer integration method exists: the tetrahedron method. You can rerun with `occupations='tetrahedra'`. In that case, the band gap will no longer be printed, and the calculation (and DOS generation) will take longer, but you should end up with a nice smooth DOS.
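
To illustrate why a broadened eigenvalue histogram tends to look noisy, here is a minimal sketch of Gaussian broadening: every eigenvalue contributes one normalized Gaussian, and the DOS is their weighted sum. The eigenvalues, weights and widths are made-up numbers, and this is not how `dos.x` is implemented internally; it only shows the principle.

```python
import numpy as np

def gaussian_dos(eigenvalues, weights, grid, sigma):
    """Weighted sum of normalized Gaussians centered at the eigenvalues."""
    diff = grid[:, None] - eigenvalues[None, :]
    gauss = np.exp(-0.5 * (diff / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    return gauss @ weights

# made-up eigenvalues (eV) with equal weights that sum to 1
eigenvalues = np.array([-3.0, -2.6, -1.1, 0.4, 0.9, 2.2])
weights = np.full(len(eigenvalues), 1 / len(eigenvalues))

grid = np.arange(-8.0, 8.0, 0.05)  # same 0.05 eV step as DeltaE in dos.in
dos_narrow = gaussian_dos(eigenvalues, weights, grid, sigma=0.05)  # spiky
dos_wide = gaussian_dos(eigenvalues, weights, grid, sigma=0.5)     # smooth

# both curves integrate to the total weight; only their shape differs
step = grid[1] - grid[0]
integral_narrow = dos_narrow.sum() * step
integral_wide = dos_wide.sum() * step
print(integral_narrow, integral_wide)
```

A narrow width resolves individual eigenvalues as spikes unless the $k$-point sampling is very dense, while a wide one smears out real features such as the gap edges. The tetrahedron method avoids this trade-off by interpolating the bands between $k$-points instead.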