# Calculations of charged defects in 2D materials
a tutorial by Anne Marie Tan

Some things to note before we get started:
* Download the python scripts from https://github.com/aztan2/charged-defects-framework and place them in the same directory as this notebook on hipergator.
* You will need to launch this notebook from a virtual environment on hipergator in which you have installed python packages like numpy, matplotlib, pymatgen, pandas (+openpyxl), nglview (if you want to use the built-in crystal viewer), and of course jupyterlab.
* Follow the instructions under the section "Standalone Jupyter Notebook" at https://help.rc.ufl.edu/doc/Remote_Jupyter_Notebook to start a Jupyter notebook within a SLURM job on hipergator and connect to it from the web browser running on your local computer.
* For the purpose of this tutorial, I will try to keep everything self-contained by executing all commands within python, including navigating directories, executing python scripts, etc. \
However, when you actually apply this to a new system, you will probably find it easier to do some of these directly from command line. \
Hence, I will also include as comments the corresponding bash commands to perform certain steps.
* Please go through the tutorial **"Calculating structure and properties of pristine 2D materials"** before starting on this tutorial.

The main quantity that we’re interested in calculating is the defect formation energy $E^f[X^q]$, which we can compute in DFT using the supercell approach and the following equation:

<div>
<img src="tutorial_images/eqn_Eform.png" width="400"/>
</div>

where $E_{\textrm{tot}}[X^q]$ and $E_{\textrm{tot}}[\textrm{pristine}]$ are the total DFT-derived energies of the supercell containing the defect $X$ and the pristine supercell respectively, $n_i$ is the number of atoms of species $i$ added/removed by the defect, $\mu_i$ is the corresponding chemical potential of the species, and $E_{\textrm{F}}$ is the Fermi energy. 
The final term $E_{\textrm{corr}}$ contains corrections to the formation energy due to electrostatic interactions with periodic images and the implicit compensating background charges which are introduced in supercell calculations using plane-wave DFT approaches. We will evaluate this term using the method developed by [Freysoldt and Neugebauer](https://doi.org/10.1103/PhysRevB.97.205425).

(This tutorial is meant to focus on how to perform the calculations and corrections, hence I will not go into detail about the theory here. I recommend that you read up on your own about (1) the origin of the artefacts that we need to correct for, and (2) how this correction scheme works.)

Based on the equation above, the DFT calculations that are required to compute the formation energy are:
* $E_{\textrm{tot}}[\textrm{pristine}]$
* $E_{\textrm{tot}}[X^q]$ for every defect of interest $X$ in different charge states $q$
* the relevant $\mu_i$

So let's get started!

In [1]:
import os
import sys
import importlib
import pymatgen

In [2]:
import setup_defects
import parse_energies

#importlib.reload(myutils)

### 0.      Determine equilibrium lattice constants:

For the defect calculations, we will fix the lattice parameters and only relax the atom positions. This ensures that we have a consistent reference for all our calculations (with/without defects) and models the system in which the defect concentration is in the dilute limit.

* Based on the pristine unit cell calculations you did in the **"Calculating structure and properties of pristine 2D materials"** tutorial, estimate the converged values of the in-plane lattice parameter(s). It should be around 3.18 Å for monolayer MoS$_2$ evaluated with PBE.

* Prepare a set of unit cell POSCARs with these lattice constants at different vacuum spacings (e.g. 10, 15, 20 Å). You can do this by slightly modifying the lattice vectors at the top of your existing unit cell POSCARs. Name each of the new POSCARs `POSCAR_vac_<vacuum spacing>` (this naming convention is important!).
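
If you would rather script this than edit the files by hand, the arithmetic involved is sketched below in plain Python. It assumes that the out-of-plane lattice vector is perpendicular to the layer and that the "vacuum spacing" means the $c$ lattice parameter minus the slab thickness; this definition is an assumption on my part, so check it against how your existing POSCARs were built. The function name and the example coordinates are placeholders.

```python
def set_vacuum(c_old, frac_z, vacuum):
    """Return (c_new, new_frac_z) for a slab with the requested vacuum spacing.

    c_old  : current length of the out-of-plane lattice vector (Angstrom)
    frac_z : fractional z coordinates of all atoms (the slab must not wrap
             across the periodic boundary)
    vacuum : desired vacuum spacing (Angstrom), assumed to mean c minus
             the slab thickness
    """
    z_lo, z_hi = min(frac_z), max(frac_z)
    thickness = (z_hi - z_lo) * c_old
    c_new = thickness + vacuum
    center = 0.5 * (z_lo + z_hi)
    # Interatomic distances stay fixed; only the fractional coordinates change.
    new_frac_z = [center + (z - center) * c_old / c_new for z in frac_z]
    return c_new, new_frac_z


# Placeholder slab: three atomic planes in a cell with c = 20 Angstrom.
for vac in (10, 15, 20):
    c_new, z_new = set_vacuum(20.0, [0.42, 0.50, 0.58], vac)
    print(f"POSCAR_vac_{vac}: c = {c_new:.3f}, z = {[round(z, 4) for z in z_new]}")
```

The new $c$ value replaces the third lattice vector in the POSCAR and the rescaled fractional $z$ coordinates replace the old ones; the in-plane vectors are left untouched.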

### 1.      Total energy of a pristine supercell:

This calculation is pretty straightforward. In fact, haven’t we already computed the energy of the pristine monolayer in the previous tutorial? Why do we need to do it again? 

While it is true that we could simply compute the total energy of a pristine supercell by multiplying the energy of a pristine unit cell accordingly, our later calculations will also require the potential (LOCPOT) of the pristine supercell, so we still need to perform the pristine supercell calculation anyway.

* Create a new directory for this set of calculations. Let's assume I've named it `pristineref`.

In [3]:
%cd /ufrc/hennig/annemarietan/test/MoS2
%mkdir pristineref

/ufrc/hennig/annemarietan/test/MoS2
mkdir: cannot create directory ‘pristineref’: File exists


* Copy the POSCARs you made in Step 0 into this directory

In [4]:
%cp unitcell/POSCAR_vac_* pristineref/.
%cd pristineref
%ls

/ufrc/hennig/annemarietan/test/MoS2/pristineref
charge_0/  POSCAR_vac_10  POSCAR_vac_15  POSCAR_vac_20  POTCAR


* Prepare the POTCAR by concatenating the appropriate element POTCARs. \
In the following example, we will be creating a S vacancy in MoS$_2$, therefore we will only require the `Mo_pv` POTCAR for Mo and `S` POTCAR for S, as before. \
If you are introducing a dopant or impurity of a different species, you will have to concatenate the appropriate POTCAR. Remember, pymatgen orders the elements in the POSCAR by increasing electronegativity, hence the element POTCARs must be concatenated in the same order!
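
The ordering rule can be illustrated with a few lines of Python. This is only a sketch: the small table holds standard Pauling electronegativities for a handful of elements, and the helper name is hypothetical. The element line of the POSCAR that the scripts actually write remains the authoritative order, so compare against it before concatenating.

```python
# Pauling electronegativities for a few elements (standard tabulated values).
PAULING_X = {"Nb": 1.60, "Mo": 2.16, "W": 2.36, "Se": 2.55, "S": 2.58, "O": 3.44}


def potcar_order(elements):
    """Sort element symbols by increasing electronegativity."""
    return sorted(set(elements), key=PAULING_X.__getitem__)


print(potcar_order(["S", "Mo"]))        # ['Mo', 'S']
print(potcar_order(["S", "Mo", "Nb"]))  # ['Nb', 'Mo', 'S']
```

For pristine MoS$_2$ or a S vacancy this gives Mo first and S second, which is the order used in the concatenation.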

In [5]:
%cat /home/annemarietan/POTCAR/POT_GGA_PAW_PBE/Mo_pv/POTCAR /home/annemarietan/POTCAR/POT_GGA_PAW_PBE/S/POTCAR > POTCAR
%ls

charge_0/  POSCAR_vac_10  POSCAR_vac_15  POSCAR_vac_20  POTCAR


* Run the following python script to generate all the input files for our pristine supercell calculations: \
(run `setup_defects.main(["--h"])` or `python setup_defects.py --h` to see the description of the required and optional arguments for this script)

In [7]:
setup_defects.main(["/ufrc/hennig/annemarietan/test/MoS2/pristineref/", 
                    "--q", "0", "--cell", "3x3x1", "4x4x1", "--vacs", "10", "15", "20", 
                    "--kppa", "440", "--write_bulkref"])
## python setup_defects.py /path/to/pristineref --q 0 --cell 3x3x1 4x4x1 --vacs 10 15 20 --kppa 440 --write_bulkref

/ufrc/hennig/annemarietan/test/MoS2/pristineref/charge_0/3x3x1/vac_10
ref_3x3x1_10
/ufrc/hennig/annemarietan/test/MoS2/pristineref/charge_0/3x3x1/vac_15
ref_3x3x1_15
/ufrc/hennig/annemarietan/test/MoS2/pristineref/charge_0/3x3x1/vac_20
ref_3x3x1_20
/ufrc/hennig/annemarietan/test/MoS2/pristineref/charge_0/4x4x1/vac_10
ref_4x4x1_10
/ufrc/hennig/annemarietan/test/MoS2/pristineref/charge_0/4x4x1/vac_15
ref_4x4x1_15
/ufrc/hennig/annemarietan/test/MoS2/pristineref/charge_0/4x4x1/vac_20
ref_4x4x1_20


* This should have created 6 sub-directories in the `pristineref` main directory, following the directory structure of `charge/supercell_size/vacuum_spacing`. This is the default directory structure that all my scripts have been designed to work with. Check that each sub-directory contains a POSCAR, POTCAR, INCAR, KPOINTS, and submission script.

In [8]:
%cd charge_0/3x3x1/vac_10
%ls

/ufrc/hennig/annemarietan/test/MoS2/pristineref/charge_0/3x3x1/vac_10
INCAR  KPOINTS  POSCAR  POTCAR  submitVASP.sh
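
Rather than listing every sub-directory by hand, the check can be automated. The sketch below is not part of the downloaded scripts; it only assumes the `charge_<q>/<cell>/vac_<spacing>` layout and the file names shown in the output above.

```python
import itertools
from pathlib import Path

REQUIRED = ("POSCAR", "POTCAR", "INCAR", "KPOINTS", "submitVASP.sh")


def missing_inputs(root, charges, cells, vacs, required=REQUIRED):
    """Map each expected sub-directory to the list of input files it lacks."""
    problems = {}
    for q, cell, vac in itertools.product(charges, cells, vacs):
        subdir = Path(root) / f"charge_{q}" / cell / f"vac_{vac}"
        missing = [name for name in required if not (subdir / name).is_file()]
        if missing:
            problems[str(subdir)] = missing
    return problems
```

Calling `missing_inputs("/path/to/pristineref", [0], ["3x3x1", "4x4x1"], [10, 15, 20])` should return an empty dictionary when all six sub-directories are complete. For the defect directories created in Step 2, pass `required=REQUIRED + ("defectproperty.json",)`.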


* Submit your jobs!\
Note that the default in the `gen_submit.py` script is to submit to the priority queue (`hennig`). The number of cores available on the priority queue is limited and shared among everyone in the group, so you don't want to hog too many cores all at once. You may change the default in the python script or edit the submission script directly to submit to the burst queue (`hennig-b`) instead.

* When your jobs are done, what’s an easy sanity check you can do? (hint: see the discussion at the beginning of Step 1)
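
One way to phrase such a check in code is sketched below: it compares the supercell energy per unit cell with the unit cell energy at the same vacuum spacing. The helper name is hypothetical, and you have to supply the energies from your own calculations.

```python
def energy_per_cell_mismatch(e_supercell, e_unitcell, supercell=(3, 3, 1)):
    """Difference (eV per unit cell) between supercell and unit cell energies."""
    n_cells = supercell[0] * supercell[1] * supercell[2]
    return e_supercell / n_cells - e_unitcell
```

Small residual differences are expected because the k-point meshes of the two calculations are not exactly equivalent, but a large mismatch usually points to a problem with the setup.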

### 2.      Total energy of a defected supercell:

Finally, we get to the real meat of our calculations! 

Let’s first consider a simple intrinsic defect, a S vacancy in MoS$_2$.

* Create a main defect directory for this set of calculations. Let’s assume I’ve named it `Svac`.

In [33]:
%cd /ufrc/hennig/annemarietan/test/MoS2
%mkdir Svac

/ufrc/hennig/annemarietan/test/MoS2
mkdir: cannot create directory ‘Svac’: File exists


* Copy the same POTCAR and unit cell POSCARs as used in step 1 into this directory.

In [34]:
%cp pristineref/PO* Svac/.
%cd Svac
%ls

/ufrc/hennig/annemarietan/test/MoS2/Svac
charge_0/  charge_-1/       POSCAR_vac_10  POSCAR_vac_20
charge_1/  initdefect.json  POSCAR_vac_15  POTCAR


* We will specify the number, type, and position of the defect(s) to create in an `initdefect.json` file. \
For a simple vacancy, this is very straightforward. For example, to create a S vacancy by removing the S atom at the last position in the POSCAR, simply create the file with the following lines: 

```
{
	"def1": {"type": "vac", "index": -1, "species": "S"}
}
```

This will tell the script to remove the S atom at index -1 (in python, negative indices will count backwards from the end of the list). To learn how to specify other types of defects, please refer to [the additional document that I have to write].

You can simply copy the lines above into your text editor of choice on hipergator and save the file as `initdefect.json` (the name is important!). However, in an attempt to keep this notebook as self-contained as possible, I also provide a small python script below which will do the same thing.

In [23]:
import json

defects = {"def1":{"type": "vac", "index": -1, "species": "S"}}

with open("initdefect.json", 'w') as file:
    file.write(json.dumps(defects,indent=4))

In [24]:
%cat initdefect.json

{
    "def1": {
        "type": "vac",
        "index": -1,
        "species": "S"
    }
}

* Now, you have all you need to run the following python script to generate all the input files for our defect calculations:

In [26]:
setup_defects.main(["/ufrc/hennig/annemarietan/test/MoS2/Svac/", 
                    "--q", "0", "-1", "1", "--cell", "3x3x1", "4x4x1", "--vacs", "10", "15", "20", 
                    "--kppa", "440"])
## python setup_defects.py /path/to/Svac --q 0 -1 1 --cell 3x3x1 4x4x1 --vacs 10 15 20 --kppa 440

/ufrc/hennig/annemarietan/test/MoS2/Svac/charge_0/3x3x1/vac_10
{'type': 'vac', 'index': -1, 'species': 'S', 'index_offset_n1': 0, 'index_offset_n2': 0, 'index_offset_n1n2': 0}
vac_S_0_3x3x1_10
/ufrc/hennig/annemarietan/test/MoS2/Svac/charge_0/3x3x1/vac_15
{'type': 'vac', 'index': -1, 'species': 'S', 'index_offset_n1': 0, 'index_offset_n2': 0, 'index_offset_n1n2': 0}
vac_S_0_3x3x1_15
/ufrc/hennig/annemarietan/test/MoS2/Svac/charge_0/3x3x1/vac_20
{'type': 'vac', 'index': -1, 'species': 'S', 'index_offset_n1': 0, 'index_offset_n2': 0, 'index_offset_n1n2': 0}
vac_S_0_3x3x1_20
/ufrc/hennig/annemarietan/test/MoS2/Svac/charge_0/4x4x1/vac_10
{'type': 'vac', 'index': -1, 'species': 'S', 'index_offset_n1': 0, 'index_offset_n2': 0, 'index_offset_n1n2': 0}
vac_S_0_4x4x1_10
/ufrc/hennig/annemarietan/test/MoS2/Svac/charge_0/4x4x1/vac_15
{'type': 'vac', 'index': -1, 'species': 'S', 'index_offset_n1': 0, 'index_offset_n2': 0, 'index_offset_n1n2': 0}
vac_S_0_4x4x1_15
/ufrc/hennig/annemarietan/test/MoS2

* This should have created 3 x 2 x 3 = 18 sub-directories in the `Svac` main directory, following the directory structure of `charge/supercell_size/vacuum_spacing`. Check that each sub-directory contains a POSCAR, POTCAR, INCAR, KPOINTS, `defectproperty.json` and submission script.

In [35]:
%cd charge_-1/3x3x1/vac_10
%ls

/ufrc/hennig/annemarietan/test/MoS2/Svac/charge_-1/3x3x1/vac_10
defectproperty.json  INCAR  KPOINTS  POSCAR  POTCAR  submitVASP.sh


* You should see that the POSCARs now have 1 fewer S atom. You can also visualize the structure in VESTA/CrystalMaker/etc. to check that you did indeed create the defect as intended. This is particularly important when you create more complex defects (e.g. interstitials, defect clusters, etc.) for which the above method of defining the defect site is more complicated.

In [37]:
#import nglview
from ase.io.vasp import read_vasp
import ase.visualize.ngl as ngl

atoms = read_vasp("POSCAR")
ngl.view_ngl(atoms, w=600, h=200)
#nglview.show_ase(read_vasp("POSCAR"))

HBox(children=(NGLWidget(), VBox(children=(Dropdown(description='Show', options=('All', 'Mo', 'S'), value='All…

* Charged systems are specified in VASP by setting the `NELECT` (number of electrons) tag in the INCAR. By default, VASP sets `NELECT` to be $\sum_i n_i Z_i$, where $n_i$ and $Z_i$ are the numbers of atoms and valence electrons of element $i$, respectively. \
For example, a neutral system with 9 Mo and 17 S atoms should have 9 x 12 + 17 x 6 = 210 valence electrons. We would specify a negatively charged system by adding electrons; for example, by setting `NELECT = 211` in the INCAR. You can verify that this has been taken care of by the script by comparing the INCARs for the defect in different charge states.

In [44]:
%cat INCAR

PREC = Accurate
ALGO = Fast
LREAL = Auto
ISYM = 0
NELECT = 211
ENCUT = 520
NELM = 120
EDIFF = 1e-06
ISIF = 2
IBRION = 2
NSW = 100
ISMEAR = 1
SIGMA = 0.1
ISPIN = 2
MAGMOM = 9*5.0 17*0.6
LPLANE = True
NPAR = 4
KPAR = 2
LWAVE = False
LCHARG = False
LMAXMIX = 4
LORBIT = 11
LVTOT = True
LVHAR = True
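
The electron counting described above can be reproduced in a few lines. This is a sketch with a hypothetical helper name; the valence electron counts (12 for `Mo_pv`, 6 for `S`) are the ones quoted in the text.

```python
def nelect(composition, zval, charge=0):
    """Number of electrons for a supercell with net charge `charge` (units of e)."""
    neutral = sum(n * zval[element] for element, n in composition.items())
    # A negative charge means extra electrons, so the charge is subtracted.
    return neutral - charge


zval = {"Mo": 12, "S": 6}    # valence electrons of the Mo_pv and S POTCARs
cell = {"Mo": 9, "S": 17}    # 3x3x1 supercell with one S vacancy
for q in (0, -1, 1):
    print(q, nelect(cell, zval, q))   # prints 0 210, then -1 211, then 1 209
```

The value for `q = -1` matches the `NELECT = 211` line in the INCAR above.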


* The `defectproperty.json` is a new file we have not encountered before. It contains some basic information about the defect and the supercell it has been created in, organized in a standardized dictionary form that is easily accessible to the other python scripts, which parse through the directories, extract the relevant information, and apply the correction scheme. Let's take a look at it:

In [42]:
%cat defectproperty.json

{
    "charge": -1,
    "defect_type": [
        "vac_S"
    ],
    "defect_site": [
        [
            0.7777777777777777,
            0.8888888888888888,
            0.38088767112
        ]
    ],
    "lattice": {
        "@module": "pymatgen.core.lattice",
        "@class": "Lattice",
        "matrix": [
            [
                9.541940940734781,
                0.0,
                0.0
            ],
            [
                -4.7709704703673905,
                8.263561989784584,
                0.0
            ],
            [
                0.0,
                0.0,
                13.11
            ]
        ]
    },
    "supercell": [
        3,
        3,
        1
    ],
    "vacuum": 10
}
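
To get a feel for how other scripts might consume this file, here is a minimal sketch that reads the same fields and converts the defect site to Cartesian coordinates. It embeds a trimmed copy of the values shown above so that it runs anywhere; in practice you would load the file itself. The helper name is hypothetical, and the rows of `matrix` are taken to be the lattice vectors, following the pymatgen convention.

```python
import json

# Trimmed copy of the file shown above. In practice:
#     with open("defectproperty.json") as f:
#         prop = json.load(f)
text = """
{
    "charge": -1,
    "defect_site": [[0.7777777777777777, 0.8888888888888888, 0.38088767112]],
    "lattice": {"matrix": [[9.541940940734781, 0.0, 0.0],
                           [-4.7709704703673905, 8.263561989784584, 0.0],
                           [0.0, 0.0, 13.11]]},
    "supercell": [3, 3, 1],
    "vacuum": 10
}
"""
prop = json.loads(text)


def frac_to_cart(frac, matrix):
    """Cartesian coordinates = sum over i of frac[i] * (i-th lattice vector)."""
    return [sum(frac[i] * matrix[i][k] for i in range(3)) for k in range(3)]


matrix = prop["lattice"]["matrix"]
site_cart = frac_to_cart(prop["defect_site"][0], matrix)
a_unit = matrix[0][0] / prop["supercell"][0]   # in-plane lattice constant of the unit cell
print("defect site (Angstrom):", [round(x, 4) for x in site_cart])
print("unit cell lattice constant (Angstrom):", round(a_unit, 4))
```

The recovered unit cell lattice constant is about 3.18 Å, consistent with the value quoted in Step 0.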

* If everything looks good, submit your jobs!\
Note, as before, that you can change the defaults for the queue, number of nodes, memory, and runtime requested for each job. 

### 3.      Chemical potentials:

The defect formation energy depends on the choice of chemical potential reference(s). For the example of a S vacancy in MoS$_2$, the relevant chemical potential is that of a S atom. However, the chemical potential of a S atom depends on the environment that it is in.

In MoS$_2$, the chemical potentials of Mo ($\mu_{\textrm{Mo}}$) and S ($\mu_{\textrm{S}}$) are related by the stability of the MoS$_2$ phase, i.e. $\mu_{\textrm{Mo}} + 2\mu_{\textrm{S}} = \mu_{\textrm{MoS$_2$}}$. Within this constraint, the chemical potentials can be varied, subject to bounds determined by the competing phases.

* Let's check the phase diagram for the Mo-S system to see what the relevant competing phases are. The following phase diagram was obtained from https://materialsproject.org.
 <div>
<img src="tutorial_images/phase_diagram.png" width="800"/>
</div>

Turns out, this system is pretty straightforward, with the only competing phases being the pure elements. Hence, we can define the chemical potentials in two limits:

* Mo-rich (S-poor) limit: $\mu_{\textrm{Mo}} = \mu_{\textrm{bcc.Mo}}$; $\mu_{\textrm{S}} = (\mu_{\textrm{MoS$_2$}} - \mu_{\textrm{bcc.Mo}})/2$
* S-rich (Mo-poor) limit: $\mu_{\textrm{Mo}} = \mu_{\textrm{MoS$_2$}} - 2\mu_{\textrm{elem.S}}$; $\mu_{\textrm{S}} = \mu_{\textrm{elem.S}}$

There may be some ambiguity in choosing the reference phase for the elemental S, so for simplicity, we'll stick with the Mo-rich limit for now. $\mu_{\textrm{MoS$_2$}}$ is simply the energy of a formula unit of monolayer MoS$_2$, which we already have. Therefore, all we need is the energy per Mo atom in bcc Mo.

* Download the input files from Materials Project (mp-129) and run this calculation. If the unit cell has more than 1 atom, don't forget to divide the total energy accordingly!
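
The two limits listed above can be evaluated with a short helper once the reference energies are available. This is a sketch with a hypothetical function name; it uses the stability constraint in the form $\mu_{\textrm{Mo}} + 2\mu_{\textrm{S}} = \mu_{\textrm{MoS$_2$}}$, which is the form implied by the two limits, and all inputs are energies from your own calculations.

```python
def chemical_potential_limits(e_mos2, e_mo_bcc, e_s_elem):
    """Return {limit: {"Mo": mu_Mo, "S": mu_S}} in eV.

    e_mos2   : energy per formula unit of monolayer MoS2
    e_mo_bcc : energy per atom of bcc Mo
    e_s_elem : energy per atom of the chosen elemental S reference
    """
    return {
        "Mo-rich": {"Mo": e_mo_bcc, "S": (e_mos2 - e_mo_bcc) / 2},
        "S-rich": {"Mo": e_mos2 - 2 * e_s_elem, "S": e_s_elem},
    }
```

In both limits the returned values satisfy the constraint by construction, so summing `mu["Mo"] + 2 * mu["S"]` and comparing with the MoS$_2$ energy is a quick consistency check on the inputs.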

### 4.      Applying the charge correction:

We are basically done with the DFT calculations at this point, but there is one very important step left – applying the [charge correction method developed by Freysoldt and Neugebauer](https://doi.org/10.1103/PhysRevB.97.205425) to determine the correction $E_{\textrm{corr}}$ which corrects for the artefacts introduced by treating charged defects in the periodic supercell approach.

Remember, the equation for the defect formation energy is:

<div>
<img src="tutorial_images/eqn_Eform.png" width="400"/>
</div>

First, let’s see what happens when we don’t apply the correction:

* Run the following python script to parse the total energies from your completed DFT calculations:

In [6]:
%cd /ufrc/hennig/annemarietan/test/MoS2/Svac

parse_energies.main(["/ufrc/hennig/annemarietan/WS2/monolayer_Svac/GGA/mag/",
                     "/ufrc/hennig/annemarietan/WS2/monolayer_ref/GGA/mag/",
                     "test_Eform_Svac.xlsx"])
## python parse_energies.py /path/to/Svac /path/to/pristineref Eform_Svac.xlsx

/ufrc/hennig/annemarietan/test/MoS2/Svac

INFO:parsing neutral 3x3x1 vac_20
INFO:parsing neutral 3x3x1 vac_10
INFO:parsing neutral 3x3x1 vac_15
INFO:parsing neutral 4x4x1 vac_20
INFO:parsing neutral 4x4x1 vac_10
INFO:parsing neutral 4x4x1 vac_15
INFO:parsing charge_-1 3x3x1 vac_20
INFO:parsing charge_-1 3x3x1 vac_10
INFO:parsing charge_-1 3x3x1 vac_15
INFO:parsing charge_-1 4x4x1 vac_20
INFO:parsing charge_-1 4x4x1 vac_10
INFO:parsing charge_-1 4x4x1 vac_15
INFO:parsing restart subdirectory
INFO:parsing charge_1 3x3x1 vac_20
INFO:parsing charge_1 3x3x1 vac_10
INFO:parsing charge_1 3x3x1 vac_15
INFO:parsing charge_1 4x4x1 vac_20
INFO:parsing charge_1 4x4x1 vac_10
INFO:parsing charge_1 4x4x1 vac_15
DEBUG:Total time taken (s): 12.39
