
# Creating an `AbinitInput` object

The creation of the Abinit input file is one of the most repetitive and error-prone operations we have to perform
before running our calculations.
To facilitate the creation of the input files, one can use `AbinitInput`, a dict-like object 
that stores the Abinit variables and provides helper functions to automate the specification 
of multiple parameters.

This notebook discusses how to create an `AbinitInput` and how to define the parameters of the calculation.

In [2]:
from __future__ import division, print_function, unicode_literals

import os
import warnings
warnings.filterwarnings("ignore") # to get rid of deprecation warnings

import abipy.data as abidata
import abipy.abilab as abilab
from abipy.abilab import AbinitInput

To create an Abinit input, we must specify the paths of the pseudopotential files.
In this case, we have a pseudo named `14si.pspnc` located 
in the `abidata.pseudo_dir` directory.

In [3]:
inp = AbinitInput(structure=abidata.cif_file("si.cif"), 
                  pseudos="14si.pspnc", pseudo_dir=abidata.pseudo_dir)

`print(inp)` returns a string with our input. 
In this case, the input is almost empty since only the structure and the pseudos have been specified.

In [43]:
print(inp)

############################################################################################
#                                         STRUCTURE                                         
############################################################################################
 natom 2
 ntypat 1
 typat 1 1
 znucl 14
 xred
    0.0000000000    0.0000000000    0.0000000000
    0.2500000000    0.2500000000    0.2500000000
 acell    1.0    1.0    1.0
 rprim
    6.3285005272    0.0000000000    3.6537614829
    2.1095001757    5.9665675167    3.6537614829
    0.0000000000    0.0000000000    7.3075229659


#<JSON>
#{
#    "pseudos": [
#        {
#            "basename": "14si.pspnc",
#            "type": "NcAbinitPseudo",
#            "symbol": "Si",
#            "Z": 14,
#            "Z_val": 4.0,
#            "l_max": 2,
#            "md5": "3916b143991b1cfa1542b130be320e5e",
#            "filepath": "/Users/gmatteo/git_repos/abipy/abipy/data/pseudos/14si.pspnc",
#            "@module": "py

Inside the Jupyter notebook, it's possible to visualize the input as HTML with links to the official documentation: 

In [44]:
inp

Use `set_vars` to set the value of multiple variables with a single call:

In [45]:
inp.set_vars(ecut=8, paral_kgb=0)

{'ecut': 8, 'paral_kgb': 0}

`AbinitInput` is a dict-like object, hence one can test for the presence of a variable in the input:

In [46]:
"ecut" in inp

True

To list all the variables that have been defined, use:

In [47]:
list(inp.keys())

['ecut', 'paral_kgb']

To access the value of a particular variable use the syntax:

In [48]:
inp["ecut"]

8

Iterating over keywords and values:

In [49]:
for varname, varvalue in inp.items(): 
    print(varname, "-->", varvalue)

ecut --> 8
paral_kgb --> 0


Use lists, tuples or NumPy arrays when Abinit expects arrays:

In [50]:
inp.set_vars(kptopt=1, 
             ngkpt=[2, 2, 2], 
             nshiftk=2, 
             shiftk=[0.0, 0.0, 0.0, 0.5, 0.5, 0.5]  # 2 shifts in one list
            )

# It's possible to use strings but use them only for special cases such as:
inp["istwfk"] = "*1"
inp
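
The flat `shiftk` list above holds `nshiftk` shifts of three reduced coordinates each. As a minimal pure-Python sketch of that grouping (the helper `group_shifts` is hypothetical and is not part of AbiPy):

```python
def group_shifts(flat_shiftk, nshiftk):
    """Group a flat list of reduced coordinates into nshiftk shifts of 3 numbers."""
    if len(flat_shiftk) != 3 * nshiftk:
        raise ValueError("shiftk must contain exactly 3 * nshiftk numbers")
    return [list(flat_shiftk[3 * i: 3 * i + 3]) for i in range(nshiftk)]


# Same values as in the cell above: 2 shifts given in one flat list.
shifts = group_shifts([0.0, 0.0, 0.0, 0.5, 0.5, 0.5], nshiftk=2)
print(shifts)
```

A length check like the one above catches the common mistake of changing `nshiftk` without updating `shiftk`.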

If you mistype the name of the variable, `AbinitInput` raises an error:

In [51]:
try:
    inp.set_vars(perl=0)
except Exception as exc:
    print(exc)

perl is not a valid ABINIT variable.
If the name is correct, try to remove ~/.abinit/abipy/abinit_vars.pickle
and rerun the code. If the problems persists, contact the abipy developers
or use input.set_spell_check(False)
or add the variable to ~abipy/data/variables/abinit_vars.json
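
To illustrate the idea of a dict-like container that validates variable names, here is a toy sketch built on `collections.abc.MutableMapping`. The class `CheckedInput` and the set `KNOWN_VARS` are hypothetical names introduced for this example; this is not how `AbinitInput` is implemented, and the real list of valid variables ships with AbiPy.

```python
from collections.abc import MutableMapping

# Hypothetical subset of valid names, used only for this illustration.
KNOWN_VARS = {"ecut", "paral_kgb", "kptopt", "ngkpt", "nshiftk", "shiftk", "istwfk"}


class CheckedInput(MutableMapping):
    """Toy dict-like container that rejects unknown variable names."""

    def __init__(self, spell_check=True):
        self._vars = {}
        self.spell_check = spell_check

    def __setitem__(self, name, value):
        # Validate the name before storing the value.
        if self.spell_check and name not in KNOWN_VARS:
            raise ValueError("%s is not a valid variable name" % name)
        self._vars[name] = value

    def __getitem__(self, name):
        return self._vars[name]

    def __delitem__(self, name):
        del self._vars[name]

    def __iter__(self):
        return iter(self._vars)

    def __len__(self):
        return len(self._vars)

    def set_vars(self, **kwargs):
        """Set several variables with one call and return what was set."""
        for name, value in kwargs.items():
            self[name] = value
        return kwargs


toy = CheckedInput()
toy.set_vars(ecut=8, paral_kgb=0)

try:
    toy.set_vars(perl=0)
    rejected = False
except ValueError as exc:
    rejected = True
    print(exc)
```

Because `MutableMapping` supplies `keys`, `items`, `__contains__` and friends from the five methods defined above, the toy object supports the same membership tests and iteration shown earlier in this notebook.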



# Defining the crystalline structure

The `set_structure` method sets the value of the ABINIT variables:
    
   * acell
   * rprim
   * ntypat
   * natom 
   * typat
   * znucl
   * xred
    
It is always a good idea to set the structure immediately after the creation of `AbinitInput`
because several methods use this information to facilitate the specification of other variables.
For example, the `set_kpath` method uses the structure to generate the $k$-path for band structure calculations.

<div class="alert alert-warning">
`typat` must be consistent with the list of pseudopotentials passed to `AbinitInput`
</div>

### Structure from dictionary:

In [52]:
structure = dict(
    ntypat=1,         
    natom=2,
    typat=[1, 1],
    znucl=14,
    acell=3*[10.217],
    rprim=[[0.0,  0.5,  0.5],   
           [0.5,  0.0,  0.5],
           [0.5,  0.5,  0.0]],
    xred=[[0.0 , 0.0 , 0.0],
          [0.25, 0.25, 0.25]]
)

inp = AbinitInput(structure, pseudos=abidata.pseudos("14si.pspnc"))
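
In Abinit, `acell` rescales the dimensionless `rprim` vectors, so that the $i$-th real-space lattice vector is `acell[i] * rprim[i]` (lengths are in Bohr by default). The following sketch, which uses only the standard library and assumes no further rescaling such as `scalecart`, rebuilds the lattice vectors and the cell volume from the dictionary values above; `det3` is a helper written for this example.

```python
acell = 3 * [10.217]  # Bohr, same values as in the dictionary above
rprim = [[0.0, 0.5, 0.5],
         [0.5, 0.0, 0.5],
         [0.5, 0.5, 0.0]]

# Scale each primitive vector by the corresponding acell entry.
lattice = [[acell[i] * x for x in rprim[i]] for i in range(3)]


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


# The unit cell volume is the absolute value of the determinant.
volume = abs(det3(lattice))
print("lattice vectors (Bohr):", lattice)
print("volume (Bohr^3):", volume)
```

For this fcc primitive cell the volume equals one quarter of the cubic cell volume, `acell[0]**3 / 4`.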

### Structure from file

From a CIF file:

In [53]:
inp.set_structure(abidata.cif_file("si.cif"))

From a Netcdf file produced by ABINIT:

In [54]:
inp.set_structure(abidata.ref_file("si_scf_GSR.nc"))

Supported formats include:

   * CIF
   * POSCAR/CONTCAR
   * CHGCAR 
   * LOCPOT
   * vasprun.xml
   * CSSR 
   * ABINIT Netcdf 
   * pymatgen's JSON serialized structures

### From the Materials Project database:

In [55]:
# https://www.materialsproject.org/materials/mp-149/
inp.set_structure(abilab.Structure.from_material_id("mp-149"))

Note that you can avoid the call to `set_structure` if the `structure` argument is passed to 
`AbinitInput`:

In [56]:
AbinitInput(structure=abidata.cif_file("si.cif"), pseudos=abidata.pseudos("14si.pspnc"))

## Brillouin zone sampling

There are two different types of sampling of the BZ: homogeneous and high-symmetry $k$-path.
The latter is mainly used for band structure calculations and requires the specification of: 

   * kptopt
   * kptbounds
   * ndivsm
    
whereas the homogeneous sampling is needed for all the calculations in which 
we have to compute integrals in the Brillouin zone e.g. total energy calculations, DOS, etc.
The $k$-mesh is usually specified via:

   * ngkpt
   * nshiftk
   * shiftk

### Explicit $k$-mesh

In [57]:
inp = AbinitInput(structure=abidata.cif_file("si.cif"), pseudos=abidata.pseudos("14si.pspnc"))

# Set ngkpt, shiftk explicitly
inp.set_kmesh(ngkpt=(1, 2, 3), shiftk=[0.0, 0.0, 0.0, 0.5, 0.5, 0.5])

{'kptopt': 1,
 'ngkpt': (1, 2, 3),
 'nshiftk': 2,
 'shiftk': array([[ 0. ,  0. ,  0. ],
        [ 0.5,  0.5,  0.5]])}

### Automatic $k$-mesh 

In [58]:
# Define a homogeneous k-mesh. 
# nksmall is the number of divisions to be used to sample the smallest lattice vector,
# shiftk is automatically selected from an internal database.
inp.set_autokmesh(nksmall=4)

{'kptopt': 1,
 'ngkpt': array([4, 4, 4]),
 'nshiftk': 4,
 'shiftk': array([[ 0.5,  0.5,  0.5],
        [ 0.5,  0. ,  0. ],
        [ 0. ,  0.5,  0. ],
        [ 0. ,  0. ,  0.5]])}

### High-symmetry $k$-path

In [59]:
# Generate a high-symmetry k-path (taken from an internal database)
# ndivsm divisions (10 here) are used to sample the smallest segment,
# the other segments are sampled so that proportions are preserved.
inp.set_kpath(ndivsm=10)

{'iscf': -2, 'kptbounds': array([[ 0.   ,  0.   ,  0.   ],
        [ 0.5  ,  0.   ,  0.5  ],
        [ 0.5  ,  0.25 ,  0.75 ],
        [ 0.375,  0.375,  0.75 ],
        [ 0.   ,  0.   ,  0.   ],
        [ 0.5  ,  0.5  ,  0.5  ],
        [ 0.625,  0.25 ,  0.625],
        [ 0.5  ,  0.25 ,  0.75 ],
        [ 0.5  ,  0.5  ,  0.5  ],
        [ 0.375,  0.375,  0.75 ],
        [ 0.625,  0.25 ,  0.625],
        [ 0.5  ,  0.   ,  0.5  ]]), 'kptopt': -11, 'ndivsm': 10}
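
The role of `ndivsm` can be sketched as follows: the shortest segment of the path gets `ndivsm` divisions and every other segment gets a number of divisions proportional to its length. The helper `divisions_per_segment` and the segment lengths below are hypothetical; the rounding rule is an assumption made for this illustration and may differ from what Abinit does internally.

```python
def divisions_per_segment(lengths, ndivsm):
    """Return the number of divisions for each segment of a k-path.

    lengths: segment lengths (any consistent unit).
    ndivsm: number of divisions assigned to the shortest segment.
    """
    shortest = min(lengths)
    if shortest <= 0:
        raise ValueError("segment lengths must be positive")
    # Round to the nearest integer (assumed rule), never below 1.
    return [max(1, int(ndivsm * length / shortest + 0.5)) for length in lengths]


# Hypothetical segment lengths, chosen only to show the proportions.
divs = divisions_per_segment([1.0, 0.5, 0.7], ndivsm=10)
print(divs)
```

With this choice the density of points along the path is roughly uniform, which is what one wants when plotting a band structure.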

# Utilities

### Once the structure has been defined, one can compute the number of valence electrons with:

In [60]:
print("The number of valence electrons is: ", inp.num_valence_electrons)

The number of valence electrons is:  8.0


### Generating inputs for convergence studies

In [61]:
# When using a non-integer step, such as 0.1, the results will often not
# be consistent.  It is better to use ``linspace`` for these cases.
# See also numpy.arange and numpy.linspace
ecut_inps = inp.arange("ecut", start=2, stop=5, step=2)

print([i["ecut"] for i in ecut_inps])

[2, 4]


In [62]:
tsmear_inps = inp.linspace("tsmear", start=0.001, stop=0.003, num=3)
print([i["tsmear"] for i in tsmear_inps])

[0.001, 0.002, 0.0030000000000000001]
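
The warning about non-integer steps can be reproduced without NumPy. The sketch below compares a naive range built by repeated addition with a grid defined by the number of points; both helper functions are written for this example and are not part of AbiPy or NumPy.

```python
def accumulate_range(start, stop, step):
    """Naive arange: keep adding step while the value stays below stop."""
    values, x = [], start
    while x < stop:
        values.append(x)
        x += step
    return values


def even_grid(start, stop, num):
    """linspace-like grid: num points with both end points included exactly."""
    if num < 2:
        return [start]
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num - 1)] + [stop]


# One might expect 10 values (0.0, 0.1, ..., 0.9) from the naive loop.
naive = accumulate_range(0.0, 1.0, 0.1)
grid = even_grid(0.0, 0.9, 10)

print(len(naive), naive[-1])
print(len(grid), grid[-1])
```

Rounding errors accumulate in the naive loop: the last value ends up slightly below `1.0`, so the loop returns 11 points instead of 10. Fixing the number of points, as `linspace` does, avoids this ambiguity.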


# Invoking Abinit with AbinitInput

Once you have an `AbinitInput` you can call Abinit to get useful information 
or simply to validate the input file before running the real calculation.
All the methods that invoke Abinit start with `abi`, e.g. `abiget` or `abivalidate`.

In [63]:
inp = AbinitInput(structure=abidata.cif_file("si.cif"), pseudos=abidata.pseudos("14si.pspnc"))

inp.set_vars(ecut=-2)
inp.set_autokmesh(nksmall=4)

v = inp.abivalidate() 
if v.retcode != 0: 
    # If there's a mistake in the input, one can access the log file of the run with the log_file object
    print("".join(v.log_file.readlines()[-10:]))

src_file: chkinp.F90
src_line: 3394
mpi_rank: 0
message: |
    Checking consistency of input data against itself gave 2 inconsistencies.
    The details of the problems can be found above.
...


 leave_new: decision taken to exit ...



In [64]:
# Fix the problem with the negative ecut and rerun validate!
inp["ecut"] = 2
inp["toldfe"] = 1e-10
v = inp.abivalidate()
if v.retcode == 0: 
    print("All ok")
else:
    print(v)

All ok


At this point, we have a valid input file and we can get the k-points in the irreducible zone with:

In [65]:
ibz = inp.abiget_ibz()
print("number of k-points:", len(ibz.points))
print("k-points:", ibz.points)
print("weights:", ibz.weights)
print("weights are normalized to:", ibz.weights.sum())

number of k-points: 10
k-points: [[-0.125 -0.25   0.   ]
 [-0.125  0.5    0.   ]
 [-0.25  -0.375  0.   ]
 [-0.125 -0.375  0.125]
 [-0.125  0.25   0.   ]
 [-0.25   0.375  0.   ]
 [-0.375  0.5    0.   ]
 [-0.25   0.5    0.125]
 [-0.125  0.     0.   ]
 [-0.375  0.     0.   ]]
weights: [ 0.09375  0.09375  0.09375  0.1875   0.09375  0.09375  0.09375  0.1875
  0.03125  0.03125]
weights are normalized to: 1.0


In [66]:
# To get the list of possible parallel configurations for this input using up to max_ncpus=5 CPUs
inp["paral_kgb"] = 1
pconfs = inp.abiget_autoparal_pconfs(max_ncpus=5)

In [67]:
print("best efficiency:\n", pconfs.sort_by_efficiency()[0])
print("best speedup:\n", pconfs.sort_by_speedup()[0])

best efficiency:
 {'efficiency': 0.975,
 'mem_per_cpu': 0.0,
 'mpi_ncpus': 5,
 'omp_ncpus': 1,
 'tot_ncpus': 5,
 'vars': {'bandpp': 1,
          'npband': 1,
          'npfft': 1,
          'npimage': 1,
          'npkpt': 5,
          'npspinor': 1}}

best speedup:
 {'efficiency': 0.975,
 'mem_per_cpu': 0.0,
 'mpi_ncpus': 5,
 'omp_ncpus': 1,
 'tot_ncpus': 5,
 'vars': {'bandpp': 1,
          'npband': 1,
          'npfft': 1,
          'npimage': 1,
          'npkpt': 5,
          'npspinor': 1}}



In [68]:
# To get the list of irreducible phonon perturbations at Gamma (Abinit notation)
inp.abiget_irred_phperts(qpt=(0, 0, 0))

[{'idir': 1, 'ipert': 1, 'qpt': [0.0, 0.0, 0.0]}]

# Multiple datasets

Multiple datasets are handy when you have to generate several input files sharing several common
variables e.g. the crystalline structure, the smearing value etc...
In this case, one can use the `MultiDataset` object that is essentially 
a list of `AbinitInput` objects.

In [69]:
# A MultiDataset object with two datasets (a.k.a. AbinitInput) 
multi = abilab.MultiDataset(structure=abidata.cif_file("si.cif"),
                            pseudos="14si.pspnc", pseudo_dir=abidata.pseudo_dir, ndtset=2)

# A MultiDataset is essentially a list of AbinitInput objects 
# with handy methods to perform global modifications.
# i.e. changes that will affect all the inputs in the MultiDataset
# For example:
multi.set_vars(ecut=4)

# is equivalent to
#
#   for inp in multi: inp.set_vars(ecut=4)
#
# and indeed:

for inp in multi: 
    print(inp["ecut"])

4
4


In [70]:
# To change the values in a particular dataset use:
multi[0].set_vars(ngkpt=[2,2,2], tsmear=0.004)
multi[1].set_vars(ngkpt=[4,4,4], tsmear=0.008)

multi

<div class="alert alert">
Remember that in Python counting starts from zero, hence the first dataset has index 0.
</div>
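
The difference between global and per-dataset changes can be illustrated with a toy container. `ToyMultiDataset` is a hypothetical class written for this example, with plain dictionaries standing in for `AbinitInput` objects; it is not the AbiPy implementation.

```python
class ToyMultiDataset(list):
    """Toy list of dict-based inputs with a set_vars that acts on every dataset."""

    def __init__(self, ndtset):
        # One independent dictionary per dataset.
        super().__init__({} for _ in range(ndtset))

    def set_vars(self, **kwargs):
        """Global modification: apply the same variables to all datasets."""
        for dataset in self:
            dataset.update(kwargs)


toy_multi = ToyMultiDataset(ndtset=2)
toy_multi.set_vars(ecut=4)                          # affects every dataset
toy_multi[0].update(ngkpt=[2, 2, 2], tsmear=0.004)  # affects dataset 0 only
toy_multi[1].update(ngkpt=[4, 4, 4], tsmear=0.008)  # affects dataset 1 only

for index, dataset in enumerate(toy_multi):
    print(index, dataset)
```

Subclassing `list` keeps indexing, iteration and `len` for free, which mirrors the way the `MultiDataset` is used in the cells above.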

In [71]:
# Calling set_structure on MultiDataset will set the structure of the inputs
multi.set_structure(abidata.cif_file("si.cif"))

# The structure attribute of a MultiDataset returns a list of structures 
# equivalent to [inp.structure for inp in multi]
print(multi.structure)

[Structure Summary
Lattice
    abc : 3.8669746200000001 3.8669746200000001 3.8669746200000001
 angles : 59.999999999999993 59.999999999999993 59.999999999999993
 volume : 40.888291793468909
      A : 3.3488982567096763 0.0 1.9334873100000005
      B : 1.1162994189032256 3.1573715557642927 1.9334873100000005
      C : 0.0 0.0 3.8669746200000001
PeriodicSite: Si (0.0000, 0.0000, 0.0000) [0.0000, 0.0000, 0.0000]
PeriodicSite: Si (1.1163, 0.7893, 1.9335) [0.2500, 0.2500, 0.2500], Structure Summary
Lattice
    abc : 3.8669746200000001 3.8669746200000001 3.8669746200000001
 angles : 59.999999999999993 59.999999999999993 59.999999999999993
 volume : 40.888291793468909
      A : 3.3488982567096763 0.0 1.9334873100000005
      B : 1.1162994189032256 3.1573715557642927 1.9334873100000005
      C : 0.0 0.0 3.8669746200000001
PeriodicSite: Si (0.0000, 0.0000, 0.0000) [0.0000, 0.0000, 0.0000]
PeriodicSite: Si (1.1163, 0.7893, 1.9335) [0.2500, 0.2500, 0.2500]]


The `split_datasets` method returns the list of `AbinitInput` objects stored in the `MultiDataset`:

In [72]:
inp0, inp1 = multi.split_datasets()
inp0

<div class="alert alert">
You can use `MultiDataset` to build your input files, but remember that 
`AbiPy` workflows will never support input files with more than one dataset.
As a consequence, you should always pass an `AbinitInput` to the 
AbiPy functions that are building `Tasks`, `Works` or `Flows`.
</div>

In [73]:
print("Number of datasets:", multi.ndtset)

Number of datasets: 2


In [74]:
# To create and append a new dataset (initialized from dataset number 1)
multi.addnew_from(1)
multi[-1].set_vars(ecut=42)
print("Now multi has", multi.ndtset, "datasets and the ecut in the last dataset is:", 
      multi[-1]["ecut"])

Now multi has 3 datasets and the ecut in the last dataset is: 42


# Factory functions

`abilab` provides factory functions to build input files for typical calculations.
Note that the default values do not correspond to the default behaviour of Abinit.
In particular, the majority of the factory functions construct input files 
for spin-polarized calculations (`nsppol=2`) with a Fermi-Dirac occupation scheme and 
a physical temperature of 0.1 eV. It is always possible to change the default behaviour either
by passing these options to the factory function or by changing the `AbinitInput` object returned by the factory.

Let's try to generate an input file for a standard GS calculation for silicon in which 
the structure is read from a CIF file:

In [75]:
si_cif = abidata.cif_file("si.cif")
pseudos = os.path.join(abidata.pseudo_dir, "14si.pspnc")

# Build input for GS calculation (unpolarized, no smearing, 1000 k-points per reciprocal atom) 
# ecut must be specified because this pseudopotential does not provide hints for ecut.
# kppa stands for k-points per reciprocal atom.
gs_inp = abilab.gs_input(
    si_cif, pseudos,
    kppa=1000, ecut=8, spin_mode="unpolarized", smearing=None) # change default

# Show the mnemonics of the variables for the input displayed below.
gs_inp.set_mnemonics(True)
gs_inp

## Input variables for band structure calculation + DOS

In [76]:
# A slightly more complicated example:
# GS run + NSCF on a path + NSCF run on a k-mesh to compute the DOS
multi = abilab.ebands_input(si_cif, pseudos,
                            ecut=8, spin_mode="unpolarized", smearing=None, dos_kppa=5000)

multi

## Input variables for GW calculations:

In [77]:
# Generate an input file for GW calculations with the plasmon-pole model.
# The calculation consists of a GS run to get the density followed by a 
# nscf-run to compute the WFK file with `nscf_nband` states.
# The cutoff for the screening is given by `ecuteps` while the cutoff for
# the exchange part of the self-energy is equal to ecut.
# kppa defines the k-point sampling.
kppa = 1000
ecut = ecutsigx = 8
ecuteps = 2
nscf_nband = 50

multi = abilab.g0w0_with_ppmodel_inputs(
    si_cif, pseudos, kppa, nscf_nband, ecuteps, ecutsigx,
    ecut=ecut, smearing=None, spin_mode="unpolarized")

multi