# Finite-size corrections to charged defect supercells with FHI-aims
This notebook applies finite-size corrections to charged defect supercells using outputs from the [FHI-aims](https://aimsclub.fhi-berlin.mpg.de/) electronic structure software package and the FNV correction scheme (doi: 10.1103/PhysRevLett.102.016402), through integration with the CoFFEE Python code.

First download and install CoFFEE from [here](https://www.sciencedirect.com/science/article/pii/S0010465518300158).

The steps below are designed to make it easier to process large sets of defect calculations. All user-defined settings are in the first cell. The subsequent cells need no user input and only have to be run in order, and each is preceded by an explanation of the analysis it performs. Each charged defect still has to be processed one at a time by specifying the directories that contain the relevant data. The user must then extract the correction energy from the potential alignment plot at the end of this notebook.

# Notes to self and to-do's
- Should add error messages if no file found e.g. when trying to read defect and host geometry.in's?
- Re-read CoFFEE docs to check if steps were missed for creating in_V file and use this to test with Tong's plotting script
- Use python glob to look through all dir's when computing sigma for defect set + read in and count species
- Use above to help write script to extract coords of defect in perfect host
- Print min distances of defect from supercell boundaries to user

## 1. User inputs

In [49]:
# Dielectric properties of host crystal: electronic+ionic dielectric constants
dielectric_xx = 7.49 
dielectric_yy = 6.92
dielectric_zz = 7.19

# Location of CoFFEE code (this is the directory containing coffee.py and other Utilities of the CoFFEE package)
path_to_coffee_dir= '/Users/suzy/Desktop/DefectAnalysis/CoFFEE_1.1'
# Dir with all data for final one-shots of all defect supercells (inc. perfect supercell)
path_to_all_defects = '/Users/suzy/Desktop/DefectAnalysis/EnargiteDefects/fromLandau/Results/data/final_one_shots' 
# Dir with data for perfect host supercell
path_to_host = '/Users/suzy/Desktop/DefectAnalysis/EnargiteDefects/fromLandau/Results/data/final_one_shots/PerfectReference' 
# Dir with defect you want to perform correction for
path_to_defect = '/Users/suzy/Desktop/DefectAnalysis/EnargiteDefects/fromLandau/Results/data/final_one_shots/VacancySupercells/V-S/charged/+1/DefectSpacegroup1'
# Dir with neutral version of defect you want to correct
path_to_neutral = '/Users/suzy/Desktop/DefectAnalysis/EnargiteDefects/fromLandau/Results/data/final_one_shots/VacancySupercells/V-S/neutral/DefectSpacegroup1'
# Enter charge state of defect you want to perform the correction for
defect_charge = 1

# Files for outputs for corrections
charge_model_file = 'cm_V_S_q=+1_sg=1.dat' #Name for image-charge correction file for this defect
pa_plot_file = 'pa_V_S_q=+1_sg=1.png' #Name for plot for potential alignment plot for this defect

# Plane wave cutoff for Poisson solver will be computed based on sigma value of Gaussian charge model
# If this is found to not converge after running the script, option below to manually set the cutoff (e.g. 40.0)
# Leave as 'None' to let the script use the value computed based on the sigma value
manual_cutoff = None

### END OF USER INPUTS ###


In [None]:
# All python libraries used in notebook
import re # Python equivalent of grep
from pylab import *
from numpy.linalg import *
from numpy import dot,cross,pi
import os,sys
import matplotlib.pyplot as plt
import matplotlib as mpl
import matplotlib.colors
import numpy as np

## 2. Determine sigma for the Gaussian models for your defect set
The purpose is to determine the value of sigma for the Gaussian charge models that is small enough to ensure that the charge is contained within the supercell for all supercells, including the ones where the defect is closest to the boundary of the supercell. For consistency, it is desirable to use the same value of sigma for all defects in your set of calculations.

This step needs to only be run once for your set of defects, so if it turns out to be slow for a large set of defects just run it once and for subsequent runs replace the cell below with a cell that just contains:

`sigma = computed_value_from_first_run`

In [45]:
# Script using glob to go through master directory for defect calculations

# Locate the defect in each - may be more complicated now we do not specify defect type? 
## Actually, can infer above by reading in species name at end of coord lines and comparing to perfect
### Read from end of lines that start with 'atom' up to the first white space to determine species
#### For each first occurence of species, define new count variable (species1 and species1_count)
##### Make above into 'define_defect_type' function?

# Make a 'find_defect_coord' function? (reuse this and above in step4)

# Compute the minimum x, y, z distances from the defect to the supercell boundary, append to a list
# Use min of the above list to set sigma, must be less than min dist to supercell boundary? -- test!



# Use glob to go through all dirs
# Read in species type at end of coord line
## count no. of each type, determine defect type by comparing number of each species to perfect list

sigma= 1.4
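
The species-counting idea in the comments above could be sketched as follows. This is a minimal sketch and not part of the working notebook: the helper names `count_species` and `species_difference` are hypothetical, and it assumes atom lines in geometry.in have the form `atom x y z Species` (or `atom_frac`). The toy geometries are made up for illustration only.

```python
from collections import Counter

def count_species(geometry_text):
    """Count atoms of each species in the text of an FHI-aims geometry.in file."""
    counts = Counter()
    for line in geometry_text.splitlines():
        words = line.split()
        # Atom lines are assumed to look like: atom x y z Species
        if len(words) >= 5 and words[0] in ('atom', 'atom_frac'):
            counts[words[4]] += 1
    return counts

def species_difference(host_counts, defect_counts):
    """Return {species: change in count} for species whose count differs from the host."""
    species = set(host_counts) | set(defect_counts)
    return {s: defect_counts[s] - host_counts[s]
            for s in species if defect_counts[s] != host_counts[s]}

# Toy host and defect geometries (illustrative values only)
host_text = """lattice_vector 10.0 0.0 0.0
atom 0.0 0.0 0.0 Cu
atom 2.0 2.0 2.0 S
atom 4.0 4.0 4.0 S
"""
defect_text = """lattice_vector 10.0 0.0 0.0
atom 0.0 0.0 0.0 Cu
atom 2.0 2.0 2.0 S
"""
species_change = species_difference(count_species(host_text), count_species(defect_text))
print(species_change)  # a negative entry indicates a vacancy of that species
```

The same two helpers could be applied to every geometry.in found under `path_to_all_defects` with `glob`, reading each file's text before passing it in.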

## 3. Determine cutoff for Poisson solver based on sigma of Gaussian charge model

The script below sets the cutoff for the plane waves used in CoFFEE's Poisson solver. Smaller sigma values require more plane waves to achieve convergence and the more plane waves the slower the solver will be. 

If it turns out that you are unable to achieve convergence with the cutoff value set based on sigma, there is also the option to override the value and set this manually. This is the 'manual_cutoff' variable in step 1.
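
To illustrate why a smaller sigma needs more plane waves, the sketch below estimates how far the Fourier coefficients of a Gaussian have decayed at the cutoff. It rests on assumptions that are not taken from the CoFFEE documentation: a Gaussian of the form exp(-r^2/(2 sigma^2)), whose Fourier coefficients fall off as exp(-sigma^2 G^2/2), sigma given in bohr, and a cutoff defined as Ecut = G_max^2/2 in Hartree. The function names are hypothetical.

```python
import math

def truncation_tail(sigma_bohr, ecut_hartree):
    """Relative size of the Gaussian's Fourier coefficient at the cutoff wavevector."""
    # Assumes rho(G) ~ exp(-sigma^2 G^2 / 2) and Ecut = G_max^2 / 2 (atomic units)
    return math.exp(-sigma_bohr**2 * ecut_hartree)

def cutoff_for_tail(sigma_bohr, tail):
    """Cutoff (Hartree) at which the coefficient has decayed to the requested tail."""
    return -math.log(tail) / sigma_bohr**2

# Halving sigma quadruples the cutoff needed for the same tail
for s in (1.4, 0.7):
    print(s, cutoff_for_tail(s, 1e-6))
```

Under these assumptions the required cutoff grows as 1/sigma^2, which is one way to judge whether a manually set cutoff is likely to be large enough.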

In [46]:
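# Script to compute plane wave cutoff based on sigma (20.0 Hartree was fine for sigma=1.4)
# Scaled as 1/sigma^2 from that reference point so that smaller sigma gives a larger cutoff
# (assumes the required cutoff follows the Gaussian's Fourier transform, exp(-sigma^2 G^2/2))
reference_cutoff = 20.0 # Found to converge well in tests
reference_sigma = 1.4
cutoff = reference_cutoff*(reference_sigma/sigma)**2

# Override computed value if user has set a manual cutoff
cutoff = manual_cutoff if manual_cutoff is not None else cutoff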

## 4. Locate the defect coordinates in the perfect host crystal
This script locates the coordinates of the defect by comparing the defect supercell to the perfect host supercell.
The defect position is referenced to the species that would be in the perfect host, e.g. the S atom removed from the host to make a S vacancy.

These coordinates will be used to define the centre of the Gaussian used in the charge model in the 'in' file for coffee.py. The z-position of the defect will also be used when generating plots of planar average of potentials for the potential alignment (pa).

In [None]:
# Similar to step 2 script to identify type of defect by comparing no. of species in species lists
# Compare geometry.in in path_to_host and path_to_defect
# Once type of defect has been identified, use this to help determine coordinates of defect in host supercell


### Dummy values for script testing
defect_x = 5.1
defect_y = 5.2
defect_z = 5.3
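
For a vacancy, the comparison described above could be sketched as below. This is an illustrative sketch under stated assumptions, not the notebook's implementation: the helper names `read_atoms` and `find_vacancy_coord` and the tolerance `tol` (in angstrom) are hypothetical, atom lines are assumed to have the form `atom x y z Species`, and periodic wrapping and large relaxations around the defect are ignored.

```python
import numpy as np

def read_atoms(geometry_text):
    """Return a list of (species, xyz array) from the 'atom' lines of geometry.in text."""
    atoms = []
    for line in geometry_text.splitlines():
        words = line.split()
        if len(words) >= 5 and words[0] == 'atom':
            atoms.append((words[4], np.array(words[1:4], dtype=float)))
    return atoms

def find_vacancy_coord(host_atoms, defect_atoms, tol=0.5):
    """Return host atoms that have no same-species defect atom within tol."""
    missing = []
    for species, xyz in host_atoms:
        matched = any(s == species and np.linalg.norm(xyz - d_xyz) < tol
                      for s, d_xyz in defect_atoms)
        if not matched:
            missing.append((species, xyz))
    return missing

# Toy geometries (illustrative values only): the S atom at (4, 4, 4) is removed
host_atoms = read_atoms("atom 0.0 0.0 0.0 Cu\natom 2.0 2.0 2.0 S\natom 4.0 4.0 4.0 S\n")
defect_atoms = read_atoms("atom 0.1 0.0 0.0 Cu\natom 2.0 2.1 2.0 S\n")
vacancies = find_vacancy_coord(host_atoms, defect_atoms)
print(vacancies)
```

If exactly one atom is returned, its coordinates could replace the dummy values of `defect_x`, `defect_y` and `defect_z`.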

### Smallest distances of the defect from the supercell boundary:

In [None]:
## Also print minimum distance of this defect to the supercell boundary to the user

# x, y, z min distances of defect from supercell boundary and supercell dimensions
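
A minimal sketch of the distance check, assuming an orthorhombic supercell with one corner at the origin. The function name `min_boundary_distances` and the cell lengths in the example are hypothetical; the defect position uses the dummy values from step 4.

```python
def min_boundary_distances(defect_xyz, cell_lengths):
    """Distance from the defect to the nearest cell face along each axis."""
    distances = []
    for coord, length in zip(defect_xyz, cell_lengths):
        coord = coord % length  # wrap the coordinate into the cell
        distances.append(min(coord, length - coord))
    return distances

# Dummy defect position from step 4 and made-up cell lengths (angstrom)
boundary_dists = min_boundary_distances((5.1, 5.2, 5.3), (12.0, 13.0, 14.0))
print("x, y, z min distances to supercell boundary:", boundary_dists)
```

The smallest of these values, taken over the whole defect set, is the quantity the to-do list proposes to use as an upper bound when choosing sigma.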


## 5. Generate 'in' file for charge model calculation with coffee.py

In [54]:
defect_geom = path_to_defect+'/geometry.in'
if not os.path.isfile(defect_geom):
    raise FileNotFoundError('geometry.in not found: '+defect_geom)

### Writing 'in' file for coffee.py
coffee_in = open("in", "w")

## CELL PARAMS
coffee_in.write("&CELL_PARAMETERS\n")
coffee_in.write("\n")
coffee_in.write("Lattice_Vectors(normalized):\n")
# Read in lattice vectors from geometry.in, normalize and write to 'in' file for CoFFEE
lattice_vecs = []
for line in open(defect_geom, 'r'):
    if re.match(r'\s*lattice_vector', line):
        lattice_vecs.append([float(w) for w in line.split()[1:4]])
if len(lattice_vecs) != 3:
    raise ValueError('Expected 3 lattice_vector lines in '+defect_geom)
lattice_vecs = np.array(lattice_vecs)
# Length of each lattice vector (equal to the cell edge lengths for an orthorhombic cell)
x_tot, y_tot, z_tot = np.linalg.norm(lattice_vecs, axis=1)
# Each lattice vector is divided by its own length
for vec, length in zip(lattice_vecs, (x_tot, y_tot, z_tot)):
    coffee_in.write("   ".join(str(c) for c in vec/length)+"\n")
coffee_in.write("\n")
coffee_in.write("Cell_dimensions angstrom\n")
coffee_in.write(str(x_tot)+"  "+str(y_tot)+"   "+str(z_tot)+"\n")
coffee_in.write("\n")
coffee_in.write("Ecut="+str(cutoff)+" Hartree\n")
coffee_in.write("/\n")
coffee_in.write("\n")

## DIELECTRIC PARAMS
coffee_in.write("&DIELECTRIC_PARAMETERS Bulk\n")
coffee_in.write("Epsilon1_a1 = "+str(dielectric_xx)+"\n")
coffee_in.write("Epsilon1_a2 = "+str(dielectric_yy)+"\n")
coffee_in.write("Epsilon1_a3 = "+str(dielectric_zz)+"\n")
coffee_in.write("/\n")
coffee_in.write("\n")

## GAUSSIAN PARAMS (used for charge model)
coffee_in.write("&GAUSSIAN_PARAMETERS:\n")
coffee_in.write("Total_charge = "+str(defect_charge)+"\n")
coffee_in.write("Sigma = "+str(sigma)+"\n")
# Centre of Gaussian is set as defect location
coffee_in.write("Centre_a1 = "+str(defect_x)+"\n")
coffee_in.write("Centre_a2 = "+str(defect_y)+"\n")
coffee_in.write("Centre_a3 = "+str(defect_z)+"\n")
coffee_in.write("/\n")
coffee_in.close()


## 6. Run CoFFEE

### 6a. Obtain E_lat for the charge model
See the [CoFFEE paper](https://www.sciencedirect.com/science/article/pii/S0010465518300158) for details.

### 6b. Obtain planar average of the potential of the charge model
Outputs from step 6a are used to obtain the planar average of potential for charge model from CoFFEE. This is needed for the potential alignment step. 

The script below is intended to use the V_r.npy file generated in step 6a, together with an in_V input file, to obtain plavg_a3.plot. This procedure still needs to be checked against the CoFFEE docs.

In [5]:
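# Runs CoFFEE from Python so that the paths set in step 1 are used directly
import subprocess

coffee_exe = os.path.join(path_to_coffee_dir, 'coffee.py')

# Performing icc with CoFFEE (output is written to charge_model_file)
with open(charge_model_file, 'w') as cm_out:
    subprocess.run([coffee_exe, 'in'], stdout=cm_out, check=True)

# Obtaining planar average of potential for charge model with CoFFEE
# TO DO: the input file for this step (in_V) still needs to be generated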


## 7. Perform potential alignment (pa)
Combine plavg_a3.plot (the planar average of the charge model obtained from CoFFEE in step 6) with the 'plane_average_realspace_ESP.out' files from the FHI-aims calculations for the perfect host, neutral defect and charged defect.

With thanks to Tong Zhu for the original plotting script on which the one below is based.

In [None]:
%matplotlib inline

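### Potential files from the directories specified at the top of the notebook
host_pot = path_to_host+'/plane_average_realspace_ESP.out'
neutral_defect_pot = path_to_neutral+'/plane_average_realspace_ESP.out'
charged_defect_pot = path_to_defect+'/plane_average_realspace_ESP.out'
###################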


"""
Created on Wed Oct 24 13:11:47 2018

@author: tong

For correction  only V_0- V_Host  
The usage should be:  plot.py charge lattice_constant_z defect_pos_z  Host_potential_filename  defect_neutral_potential_filename 
charge:  defect charges 
lattice_constant_z:  z-direction lattice_constant 
defect_pos_z:  defect_position at z-direction 
e.g.   plot.py  -1 18.52  4.59  origin_plane_averge_esp.out  m0_plane_averge_esp.out  



For the whole correction by CoFFEE, i.e. two terms picture  q(V_0 - V_host) and  q((V_charge-V_0) - V_model)

The usage should be: plot.py charge lattice_constant_z defect_pos_z  Host_potential_filename  defect_neutral_potential_filename defect_charge_potential_filename  defect_model_potential_filename  

"""



#unit conversion  
Ha_eV = 27.211386   # hartree to eV
b_A = 0.529177249  # bohr to angstrom 

def read_file(Filename):  
    # reading the plane-average-file from FHI-aims output, 
    #and obtain the free-average-contribution, and the raw data 
    F = open(Filename, "r")
    res = F.readline().split()
    dim=len(res)
    average_free = float(res[dim-1])
    F.readline()
    F.readline()
    data = F.readlines()
    dim = len(data)
    result  = np.zeros([dim,2])
    i = 0 
    for line in data:
      words = line.split()
      result[i][0] = float(words[0])
      result[i][1] = float(words[1])
      i=i+1
    return average_free, result 

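# Inputs taken from notebook variables instead of command-line arguments
charge = float(defect_charge)
lattice_constant_z = z_tot   # z-direction cell length from step 5
Defect_pos = defect_z        # z-position of defect from step 4
Origin_F = host_pot
Defect_neutral_F = neutral_defect_pot
Defect_charge_F = charged_defect_pot
Defect_model_F = 'plavg_a3.plot'   # planar average of charge model from step 6b
# dim_sys mimics the argument count of the original command-line script:
# the second plot is only made if the charge model potential file exists
dim_sys = 8 if os.path.isfile(Defect_model_F) else 6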

offset_o, origin = read_file(Origin_F)
offset_d, defect_neutral = read_file(Defect_neutral_F)


X = defect_neutral[:,0]
Y = charge*(defect_neutral[:,1]*Ha_eV - offset_d - (origin[:,1]*Ha_eV-offset_o)) 

print(charge)
print(lattice_constant_z)

#begin plotting q*(V(Defect,0) - V(Host))
Y_min = min(Y)
Y_max = max(Y)
Ymin = Y_min - 0.15*(Y_max-Y_min)
Ymax = Y_max + 0.15*(Y_max-Y_min)
plt.xlim((0,lattice_constant_z))
plt.ylim((Ymin,Ymax))
plt.axvline(x=Defect_pos,linestyle='--',color='orange')
plt.plot(X,Y,'or',label='q(V(Defect,0)-V(Host))')
plt.legend()
plt.xlabel(r'Z-coordinates ($\AA$)',fontsize=15,fontname = "Times New Roman")
plt.ylabel('Potential (eV)',fontsize=15,fontname = "Times New Roman")
plt.savefig("V0-Vhost",dpi=600)

#begin plot    V(Defect,charge) - V(Defect,0) and   V_Model 
if dim_sys > 6:
   offset_d2, defect_charge = read_file(Defect_charge_F)
   Model = np.loadtxt(Defect_model_F)
   X1 = defect_charge[:,0]
   Y1 = defect_charge[:,1]*Ha_eV-offset_d2 - (defect_neutral[:,1]*Ha_eV-offset_d)
   X_model = Model[:,0]*b_A
   Y_model = Model[:,1]*-1.0  

   Y_min = min(Y_model)
   Y_max = max(Y_model)
   Ymin = Y_min - 0.15*(Y_max-Y_min)
   Ymax = Y_max + 0.15*(Y_max-Y_min)


   plt.clf()
   plt.xlim((0,lattice_constant_z))
   plt.ylim((Ymin,Ymax))
   plt.axvline(x=Defect_pos,linestyle='--',color='orange')
   plt.plot(X1,Y1,'or',label='V(Defect,q)-V(Defect,0)')
   plt.plot(X_model,Y_model,'--k',label='V(Model)')
   plt.legend()
   plt.xlabel(r'Z-coordinates ($\AA$)',fontsize=15,fontname = "Times New Roman")
   plt.ylabel('Potential (eV)',fontsize=15,fontname = "Times New Roman")
   plt.savefig("V_charge-V0",dpi=600)
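
Rather than reading the alignment off the plot by eye, the offset between the two curves far from the defect could be estimated numerically. The sketch below is one possible approach under stated assumptions and is not part of the original script: the function name `alignment_far_from_defect` and the averaging half-width `window` are hypothetical, both potentials are assumed to be in the same units on grids sorted in z, and the sign convention and multiplication by the charge should be checked against the CoFFEE paper before the value is used as a correction.

```python
import numpy as np

def alignment_far_from_defect(z_dft, v_dft, z_model, v_model, defect_z, cell_z, window=1.0):
    """Mean of (v_dft - v_model) within `window` of the point farthest from the defect along z."""
    z_dft = np.asarray(z_dft, dtype=float)
    v_dft = np.asarray(v_dft, dtype=float)
    # Put the model potential on the DFT grid (model grid must be sorted in z)
    v_model_on_dft = np.interp(z_dft, z_model, v_model)
    # Point half a cell away from the defect, wrapped back into the cell
    z_far = (defect_z + 0.5 * cell_z) % cell_z
    # Periodic distance of every grid point from z_far
    dz = np.abs(z_dft - z_far)
    dz = np.minimum(dz, cell_z - dz)
    mask = dz <= window
    return float(np.mean((v_dft - v_model_on_dft)[mask]))

# Toy data: the "DFT" curve is the model shifted by a constant 0.3
z_toy = np.linspace(0.0, 10.0, 101)
v_model_toy = np.cos(2.0 * np.pi * z_toy / 10.0)
v_dft_toy = v_model_toy + 0.3
delta_v = alignment_far_from_defect(z_toy, v_dft_toy, z_toy, v_model_toy, defect_z=2.0, cell_z=10.0)
print(delta_v)
```

With the arrays from the plotting script, the call would be `alignment_far_from_defect(X1, Y1, X_model, Y_model, Defect_pos, lattice_constant_z)`.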
