# Introduction: Welcome to the PySCF tutorial

This component of the tutorial is primarily concerned with computational chemistry calculations and will not cover the theoretical details of the methods utilized [i.e., Hartree–Fock (HF), Kohn–Sham Density Functional Theory (KS-DFT), Møller–Plesset Perturbation Theory (MP), Coupled Cluster Theory (CC), etc.]. For specifics regarding these methods, please see [TODO].

We will go through the steps necessary to carry out a variety of relevant calculations. These include:

* Single Point Energies
  * Ground State
    1. atomization energies and bond dissociation energies
    2. reaction energies
    3. electron affinities, ionization potentials, and proton affinities (vertical and adiabatic)
    4. relative energies of isomers
    5. binding energies of non-covalent complexes (weak/hydrogen/rare-gas)
    6. barrier heights
    7. potential energy curves
  * Excited State (Time-Dependent)
    1. TD-HF/TD-DFT
    2. wavefunction stability analysis
    3. orbital visualization

* Forces/Geometry Optimizations
  * Ground State
    1. force calculation of equilibrium structure
    2. force calculation of non-equilibrium structure
    3. transition state search
    4. geometry optimization
  * Excited State (Time-Dependent)
    1. ???

* Frequencies
  * Ground State
    1. vibrational modes
    2. free-energy calculation
    3. is the result of a geometry optimization a local minimum?
  * Excited State (Time-Dependent)
    1. ???

Furthermore, we will demonstrate the effect of the most important computational settings on the final calculated results using a handful of the above interactions:
  * Relevant Settings
    1. basis set (minimal vs. double- vs. triple- vs. quadruple-zeta vs. basis set limit) and the basis set incompleteness error (BSIE)
    2. DFT integration grid (coarse vs. fine, etc.)
    3. method (HF/DFT/MP2/CC/etc.)
    4. counterpoise corrections (an extension of the basis set category), including a discussion of BSIE and the basis set superposition error (BSSE)
    5. integral thresholds and convergence criteria
    6. Cartesian vs. spherical basis functions
    7. restricted vs. unrestricted
    8. density fitting
    9. optimization algorithm (DIIS, GDM, Newton, etc.)

---

# Getting started

In order to get started, we need to import the pyscf module, which is accomplished by the command in the Code cell below. If this command fails, please see the [quick setup guide](http://sunqm.github.io/pyscf/tutorial.html#quick-setup) or the [detailed installation instructions](http://sunqm.github.io/pyscf/install.html).

In [1]:
import pyscf

The next step is to define the molecule, which requires importing the gto (Gaussian-type orbital) submodule:

In [2]:
from pyscf import gto

---

# Creating a molecule (mol) object with gto.Mole() <a name="molecule"></a>

We will now go through the steps involved in setting up a molecule, including a description of the eight most important attributes. The first molecule used in this demonstration is the water dimer (two hydrogen-bonded water molecules) from the widely-used [S22 dataset](http://pubs.rsc.org/en/Content/ArticleLanding/2006/CP/B600027D#!divAbstract) of non-covalent interactions, and can be found [here](http://www.begdb.com/index.php?action=oneMolecule&state=show&id=82). The molecule is pictured here:

<img style="float: left;" src="images/water_dimer.jpg"><br clear="all" />

The first step is to create the molecule object, mol:

In [3]:
mol=gto.Mole()

Note: The molecule object can be named nearly anything (e.g., mol1, mol2, mymol, etc.).

## atom attribute
#### (NO DEFAULT)

PySCF allows for a variety of [molecular input formats](http://sunqm.github.io/pyscf/gto.html#geometry), but the one that is most suitable for the current example is the use of triple-quotes. This allows one to simply copy and paste the geometry from the BEGDB website without further modification:

In [4]:
mol.atom="""
O  -1.551007  -0.114520   0.000000
H  -1.934259   0.762503   0.000000
H  -0.599677   0.040712   0.000000
O   1.350625   0.111469   0.000000
H   1.680398  -0.373741  -0.758561
H   1.680398  -0.373741   0.758561"""

The triple-quotes can be used to read four unique molecular input formats:

1. XYZ (see geom/water_dimer.xyz) [TODO]
2. Z-matrix (see geom/water_dimer.zmat) [TODO]
3. Cartesian coordinates only (see geom/water_dimer.mol and example above)
4. Cartesian coordinates or Z-matrix prepended by a line containing the charge and spin separated by a space (see geom/water_dimer.qc) [TODO]
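For readers who prefer to build the geometry programmatically, the Cartesian block shown earlier can be converted into a list of `[symbol, (x, y, z)]` entries, which is another form that `mol.atom` accepts. The helper name `parse_cartesian` is hypothetical and is not part of PySCF. This is only a minimal sketch that assumes one atom per line with whitespace-separated fields:

```python
def parse_cartesian(block):
    """Convert 'symbol x y z' lines into [symbol, (x, y, z)] entries."""
    atoms = []
    for line in block.strip().splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        symbol = fields[0]
        x, y, z = (float(value) for value in fields[1:4])
        atoms.append([symbol, (x, y, z)])
    return atoms


water_dimer = """
O  -1.551007  -0.114520   0.000000
H  -1.934259   0.762503   0.000000
H  -0.599677   0.040712   0.000000
O   1.350625   0.111469   0.000000
H   1.680398  -0.373741  -0.758561
H   1.680398  -0.373741   0.758561"""

atoms = parse_cartesian(water_dimer)
for entry in atoms:
    print(entry)
```

The resulting list could then be assigned with `mol.atom=atoms` in place of the triple-quoted string.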

Another way to set up a molecule is to read the coordinates from a file on disk. The file containing the molecule can be formatted in any of the four ways listed above. As long as the file adheres to one of these formats, PySCF can automatically parse the file regardless of the extension. Thus, reading in a molecule file can be accomplished simply via [TODO]:

In [5]:
#mol.atom=read_file('water_dimer.mol')

## basis attribute
#### (NO DEFAULT)

The simplest way to set up the basis is to specify the same basis set for all atoms:

In [6]:
mol.basis='aug-cc-pVDZ'

In this case, the aug-cc-pVDZ basis set is used for all of the hydrogen and oxygen atoms.

Alternatively, one can use a different basis set for different elements by using Python's dictionary data structure:

In [7]:
mol.basis={'H': 'cc-pVDZ', 'O': 'aug-cc-pVDZ'}

In this case, the cc-pVDZ basis set is used for all of the hydrogen atoms, while the diffuse aug-cc-pVDZ basis set is used for all of the oxygen atoms.

One can also use different basis sets for different atoms of the same element by labeling the atoms with integers:

In [8]:
mol.atom="""
O1  -1.551007  -0.114520   0.000000
H2  -1.934259   0.762503   0.000000
H3  -0.599677   0.040712   0.000000
O4   1.350625   0.111469   0.000000
H5   1.680398  -0.373741  -0.758561
H6   1.680398  -0.373741   0.758561"""
mol.basis={'O1': 'cc-pVDZ', 'H2': 'cc-pVDZ', 'H3': 'aug-cc-pVDZ', 'O4': 'aug-cc-pVDZ', 'H5': 'cc-pVDZ', 'H6': 'cc-pVDZ'}

In this case, the cc-pVDZ basis set is used for all of the atoms except for those involved in the hydrogen bond (H3 and O4). The diffuse aug-cc-pVDZ basis set is used for H3 and O4.

Finally, one common way of reducing basis set superposition error (BSSE) is to use counterpoise corrections by employing ghost functions on ghost atoms. A useful guide to BSSE can be found [here](http://vergil.chemistry.gatech.edu/notes/cp.pdf). In PySCF, counterpoise corrections can be applied via simple modifications to the atom and basis attributes.

For example, running the water dimer calculation with ghost functions on the second water monomer can be accomplished via:

In [9]:
mol.atom="""
O  -1.551007  -0.114520   0.000000
H  -1.934259   0.762503   0.000000
H  -0.599677   0.040712   0.000000
X-O   1.350625   0.111469   0.000000
X-H   1.680398  -0.373741  -0.758561
X-H   1.680398  -0.373741   0.758561"""
mol.basis='aug-cc-pVDZ'

In this case, the parser recognizes the latter three atoms as ghost atoms because their atomic symbols are prefixed with the string 'X-'.
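A counterpoise correction needs each monomer evaluated in the full dimer basis, so two ghost geometries are required (one per monomer). The sketch below generates both strings and shows the arithmetic of the counterpoise-corrected binding energy. The function names `ghost_geometry` and `counterpoise_binding_energy` are hypothetical helpers, not PySCF functions, and the sketch assumes the monomers keep their dimer geometries:

```python
def ghost_geometry(geometry, ghost_indices, prefix="X-"):
    """Prefix the symbols of the selected atoms (0-based indices) with `prefix`."""
    lines = [line for line in geometry.strip().splitlines() if line.strip()]
    out = []
    for index, line in enumerate(lines):
        symbol, rest = line.split(None, 1)
        if index in ghost_indices:
            symbol = prefix + symbol
        out.append(symbol + "  " + rest.strip())
    return "\n".join(out)


def counterpoise_binding_energy(e_dimer, e_a_in_dimer_basis, e_b_in_dimer_basis):
    """E(AB) - E(A) - E(B), with both monomers computed in the dimer basis."""
    return e_dimer - e_a_in_dimer_basis - e_b_in_dimer_basis


dimer = """
O  -1.551007  -0.114520   0.000000
H  -1.934259   0.762503   0.000000
H  -0.599677   0.040712   0.000000
O   1.350625   0.111469   0.000000
H   1.680398  -0.373741  -0.758561
H   1.680398  -0.373741   0.758561"""

# Monomer A is real, monomer B is made of ghost atoms, and vice versa.
monomer_a_with_ghosts = ghost_geometry(dimer, {3, 4, 5})
monomer_b_with_ghosts = ghost_geometry(dimer, {0, 1, 2})
print(monomer_a_with_ghosts)
print(monomer_b_with_ghosts)
```

Each string can be assigned to `mol.atom` of a separate molecule object. The three resulting total energies are then combined with `counterpoise_binding_energy`.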

Advanced users who are interested in using custom basis sets or basis sets that are not pre-defined in PySCF are directed [here](http://sunqm.github.io/pyscf/gto.html#input-basis). Useful resources for the latter include the [EMSL basis set exchange](https://bse.pnl.gov/bse/portal), the [PSI4 basis set page](http://www.psicode.org/psi4manual/master/basissets_byfamily.html), Grant Hill's [ccRepo](http://www.grant-hill.group.shef.ac.uk/ccrepo/), and the [TURBOMOLE basis set library](http://www.cosmologic-services.de/basis-sets/basissets.php). PySCF can parse NWChem- and Gaussian94-formatted basis sets [TODO].

For example, the 'doubly-augmented' d-aug-cc-pVDZ basis set is not pre-defined in PySCF. If one wants to use the d-aug-cc-pVDZ basis set for the oxygen atoms and the cc-pVDZ basis set for the hydrogen atoms, then the appropriate basis setup would be:

In [10]:
mol.basis={'H': 'cc-pVDZ','O': gto.basis.parse('''
O    S
  11720.0000000              0.0007100             -0.0001600        
   1759.0000000              0.0054700             -0.0012630        
    400.8000000              0.0278370             -0.0062670        
    113.7000000              0.1048000             -0.0257160        
     37.0300000              0.2830620             -0.0709240        
     13.2700000              0.4487190             -0.1654110        
      5.0250000              0.2709520             -0.1169550        
      1.0130000              0.0154580              0.5573680        
O    S
      0.3023000              1.0000000        
O    S
      0.0789600              1.0000000        
O    S
      0.0206000              1.0000000        
O    P
     17.7000000              0.0430180        
      3.8540000              0.2289130        
      1.0460000              0.5087280        
O    P
      0.2753000              1.0000000        
O    P
      0.0685600              1.0000000        
O    P
      0.0171000              1.0000000        
O    D
      1.1850000              1.0000000        
O    D
      0.3320000              1.0000000        
O    D
      0.0930000              1.0000000        
''')}

Additionally, PySCF can read a file containing the desired basis set [TODO]:

In [11]:
#mol.basis={'H': gto.basis.load('basis/STO-2G.dat', 'H'),'O': 'STO-3G'}

## cart attribute
#### DEFAULT: `mol.cart=0`

The cart attribute determines whether the d and/or higher basis functions are taken to be *spherical* (i.e., d=5 functions, f=7 functions, g=9 functions, etc.) or *Cartesian* (i.e., d=6 functions, f=10 functions, g=15 functions, etc.). `mol.cart=0` specifies *spherical* functions and `mol.cart=1` specifies *Cartesian* functions. If an existing basis set in PySCF is being used, the cart attribute is automatically set based on how the basis set itself was optimized. Thus, it is not necessary to define this attribute for most calculations.

There are two instances where specifying the cart attribute is necessary:

1.  If either a custom basis set is being used or one that is not pre-defined in PySCF, it is important to determine the appropriate value of cart and set it accordingly. For example, the [UGBS basis set](https://aip.scitation.org/doi/abs/10.1063/1.475959) is commonly used to determine absolute atomic energies. This basis set was optimized for use with spherical functions. Since UGBS is not pre-defined in PySCF, it is important to set `mol.cart=0`.

2.  If the user is interested in experimenting with the effect of using spherical vs. Cartesian functions for a given basis set, it is possible to override the default for pre-defined basis sets. For example, the Dunning basis sets (cc-pVXZ and aug-cc-pVXZ) are meant to be used with spherical functions (`mol.cart=0`), yet one can set `mol.cart=1` to gauge the sensitivity of absolute and relative energies to this setting.
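The function counts quoted above follow from the angular momentum l of the shell: a spherical shell has 2l+1 functions, while a Cartesian shell has (l+1)(l+2)/2. A minimal sketch that reproduces the numbers (the function names are illustrative only):

```python
def n_spherical(l):
    """Number of spherical functions in a shell of angular momentum l."""
    return 2 * l + 1


def n_cartesian(l):
    """Number of Cartesian functions in a shell of angular momentum l."""
    return (l + 1) * (l + 2) // 2


for label, l in zip("spdfg", range(5)):
    print(label, n_spherical(l), n_cartesian(l))
```

The two counts agree for s and p shells, which is why the cart attribute only matters for d and higher functions.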

## charge attribute
#### DEFAULT: `mol.charge=0`

The charge attribute sets the charge of the molecule. One should set `mol.charge=0` for neutral systems, `mol.charge=1` for cations, `mol.charge=2` for dications, `mol.charge=-1` for anions, `mol.charge=-2` for dianions, etc.

## ecp attribute
#### DEFAULT: `mol.ecp={}`

When heavy elements are present in a molecule, standard Gaussian basis sets are oftentimes both insufficient (because they are unable to capture the relativistic nature of heavier elements) and impractical (because as one descends the rows of the periodic table, Gaussian basis sets tend to become very large and heavily-contracted due to the increased number of electrons). For this reason, a great deal of work has been devoted to developing effective core potentials (ECP) which replace core electrons around a nucleus by pseudopotentials.

In PySCF, the ecp attribute is very similar to the basis attribute. To demonstrate the use of the ecp attribute, we will switch to a system with heavy elements, namely, TeH<sub>2</sub>–HI. This system is from the [HEAVY28](http://www.thch.uni-bonn.de/tc.old/downloads/GMTKN/GMTKN55/HEAVY28.html) dataset of the [GMTKN55](https://www.chemie.uni-bonn.de/pctc/mulliken-center/software/GMTKN/gmtkn55) database. Specifically, it is [data point](http://www.thch.uni-bonn.de/tc.old/downloads/GMTKN/GMTKN55/HEAVY28ref.html) \#27, an intermolecular bond between tellurium hydride and hydrogen iodide:

<img style="float: left;" src="images/TeH2_HI.jpg" width="50%"><br clear="all" />

The atom and basis attributes can be specified as:

In [12]:
mol.atom="""
Te         0.03866320      2.16215120      0.00000000
H         -0.10453280      0.53339420      0.00000000
H         -1.58803480      2.30795520      0.00000000
H          1.61524120     -2.39760280      0.00000000
I          0.03866320     -2.60589780      0.00000000"""
mol.basis='def2-TZVPP'

It is possible to specify a single ECP for all atoms present in the system:

In [13]:
mol.ecp='def2-TZVPP'

However, if there is an atom in the system that does not have an ECP defined (hydrogen in this case), a warning will be shown once the object is built:

`ECP def2-TZVPP not found for  H`

Therefore, it is advisable to use the dictionary data structure to specify the ECP for the heavy elements:

In [14]:
mol.ecp={'Te': 'def2-TZVPP', 'I': 'def2-TZVPP'}

It is also possible to use ECPs that are not defined in PySCF. Useful resources include the four websites listed earlier, as well as the [Stuttgart/Cologne ECP web page](http://www.tc.uni-koeln.de/PP/index.en.html). For example, if one wants to use the older def-TZVPP ECP for Te only (the newer version, def2-TZVPP, is pre-defined in PySCF), it is possible to get the necessary information from the [TURBOMOLE basis set library](http://www.cosmologic-services.de/basis-sets/basissets.php):

In [15]:
mol.ecp={'Te': gto.basis.parse_ecp('''
ECP
Te nelec 46
Te F
2       1.93927000     -17.86464100
Te S
2       2.92379400      50.08380500
2       1.15275400       1.96814000
2       1.93927000      17.86464100
Te P
2       2.60308600     119.82070200
2       0.98544800      -2.03904800
2       1.93927000      17.86464100
Te D
2       1.43501900      37.75721400
2       1.93927000      17.86464100
'''),'I': 'def2-TZVPP'}

PySCF can parse both NWChem-/Dalton- and Gaussian94-formatted ECP files. In addition, PySCF can read a file containing the desired ECP [TODO]:

In [16]:
#mol.ecp={'Te': gto.basis.load_ecp('ecp/def-TZVPP.dat', 'Te'),'I': 'def2-TZVPP'}

## spin attribute
#### DEFAULT: `mol.spin=0`

The spin attribute sets the number of unpaired electrons, i.e., the difference between the number of alpha and beta electrons. By default, a closed-shell system is assumed (`mol.spin=0`). For open-shell systems, the spin must be set explicitly. For example, the boron atom has a single unpaired electron (`mol.spin=1`), the carbon atom has two (`mol.spin=2`), and the nitrogen atom has three (`mol.spin=3`). The same concept applies to open-shell molecules: the hydroxyl radical (HO•) should have `mol.spin=1`, since it has one unpaired electron.
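The charge and spin attributes must be mutually consistent: the number of electrons minus the number of unpaired electrons has to be even and non-negative. The sketch below checks this for the examples mentioned above. The small `ATOMIC_NUMBERS` table and both function names are assumptions made for illustration, and the check ignores any core electrons removed by an ECP:

```python
ATOMIC_NUMBERS = {"H": 1, "B": 5, "C": 6, "N": 7, "O": 8}


def electron_count(symbols, charge=0):
    """Total number of electrons for a list of element symbols and a net charge."""
    return sum(ATOMIC_NUMBERS[symbol] for symbol in symbols) - charge


def spin_is_consistent(n_electrons, spin):
    """spin = N_alpha - N_beta, so N_beta = (n_electrons - spin) / 2 must be a whole number >= 0."""
    return 0 <= spin <= n_electrons and (n_electrons - spin) % 2 == 0


hydroxyl = electron_count(["O", "H"])                       # 9 electrons
water_dimer = electron_count(["O", "H", "H", "O", "H", "H"])  # 20 electrons

print(hydroxyl, spin_is_consistent(hydroxyl, 1), spin_is_consistent(hydroxyl, 0))
print(water_dimer, spin_is_consistent(water_dimer, 0))
```

An odd number of electrons can never be paired with `mol.spin=0`, which is a common source of errors when a charged or radical species is set up.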

## unit attribute
#### DEFAULT: `mol.unit='Angstrom'`

The unit attribute can be either set to 'Angstrom' or 'Bohr'.
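When coordinates from different sources are mixed, it can help to convert them explicitly before assigning `mol.atom`. The following is a minimal sketch of the conversion. The constant is the CODATA 2018 value of the Bohr radius in Ångström, which is an assumption here and may differ in the last digits from the value used internally by a given PySCF version:

```python
BOHR_IN_ANGSTROM = 0.529177210903  # CODATA 2018 (assumed; PySCF may use another revision)


def angstrom_to_bohr(value):
    """Convert a length from Angstrom to Bohr."""
    return value / BOHR_IN_ANGSTROM


def bohr_to_angstrom(value):
    """Convert a length from Bohr to Angstrom."""
    return value * BOHR_IN_ANGSTROM


# x-coordinate of the first oxygen atom of the water dimer, given in Angstrom
x_angstrom = -1.551007
x_bohr = angstrom_to_bohr(x_angstrom)
print(x_bohr)
```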

## verbose attribute
#### DEFAULT: `mol.verbose=0`

The verbose attribute controls the print level for the molecule object. Setting `mol.verbose=0` will print little to no information, while setting `mol.verbose=4` prints useful information about the basis and number of basis functions [TODO]. Users who want to see detailed information should set `mol.verbose=10`.

The final step is to build the molecule object (if any attributes are specified or modified after the molecule is built, this command should be executed again):


In [17]:
mol.build()

Warn: Ipython shell catchs sys.args


<pyscf.gto.mole.Mole at 0x7f4054cb4c90>

The following Code cell combines the content covered above into a useable sample input for setting up a molecule. The more advanced attributes are listed (commented) at the bottom of the Code cell and the advanced user can find more information about them [here](http://sunqm.github.io/pyscf/gto.html).

In [18]:
del mol

In [19]:
mol=gto.Mole()

mol.atom="""
O  -1.551007  -0.114520   0.000000
H  -1.934259   0.762503   0.000000
H  -0.599677   0.040712   0.000000
O   1.350625   0.111469   0.000000
H   1.680398  -0.373741  -0.758561
H   1.680398  -0.373741   0.758561"""
mol.basis='aug-cc-pVDZ'
mol.cart=False
mol.charge=0
mol.ecp={}
mol.spin=0
mol.unit='Angstrom'
mol.verbose=0

mol.build()

#ADVANCED ATTRIBUTES – SEE MANUAL
#mol.groupname=
#mol.incore_anyway=
#mol.irrep_id=
#mol.irrep_name=
#mol.max_memory=
#mol.nucmod=
#mol.output=
#mol.symm_orb=
#mol.symmetry=
#mol.symmetry_subgroup=
#mol.topgroup=

Warn: Ipython shell catchs sys.args


<pyscf.gto.mole.Mole at 0x7fc8c2135690>

---

# Mean-field calculations

After the molecule object has been built, it is necessary to create a mean-field object [Hartree–Fock (HF) or Kohn–Sham Density Functional Theory (KS-DFT)].

In order to create a mean-field object, it is necessary to import the scf (self-consistent field) submodule:

In [20]:
from pyscf import scf

For most users, the relevant types of SCF calculations will be:

1. RHF (Restricted Hartree–Fock)
2. UHF (Unrestricted Hartree–Fock)
3. ROHF (Restricted Open-Shell Hartree–Fock)
4. RKS (Restricted Kohn–Sham)
5. UKS (Unrestricted Kohn–Sham)
6. ROKS (Restricted Open-Shell Kohn–Sham)

Initializing the mean-field object for these six types of calculations is demonstrated below. These classes require at least one argument, namely, a molecule object, mol.

1. `mf=scf.RHF(mol)`
2. `mf=scf.UHF(mol)`
3. `mf=scf.RHF(mol)` (where mol is defined to be open-shell, mol.spin!=0)
4. `mf=scf.RKS(mol)`
5. `mf=scf.UKS(mol)`
6. `mf=scf.RKS(mol)` (where mol is defined to be open-shell, mol.spin!=0)

---

# Creating a mean-field (mf) object with scf.RHF()

Since RHF, ROHF, and UHF have nearly identical attributes, the tutorial will proceed with demonstrating an RHF calculation. The first step is to create the mean-field object (RHF in this case):

In [21]:
mf=scf.RHF(mol)

The input object is the molecule object (mol) that was discussed in detail [above](#molecule).

## conv_tol attribute
#### DEFAULT: `mf.conv_tol=1e-09`
In order for an SCF calculation to converge, PySCF requires two criteria to be met. The first is controlled by the conv_tol attribute, namely, the difference in the SCF energy (in Hartrees) between two successive cycles.

## conv_tol_grad attribute
#### DEFAULT: `mf.conv_tol_grad=numpy.sqrt(mf.conv_tol)`
The second criterion for convergence is the conv_tol_grad attribute, namely, the root-mean-square [TODO] of the orbital gradient. This is a vector of length nocc\*nvirt.
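The interplay of the two criteria can be summarized in a few lines. The sketch below is not PySCF's implementation. It takes the size of the orbital gradient as a single number (how that number is obtained from the gradient vector is left open above) and uses the default relation between the two thresholds:

```python
import math


def default_conv_tol_grad(conv_tol):
    """Default gradient threshold: the square root of the energy threshold."""
    return math.sqrt(conv_tol)


def is_converged(e_prev, e_curr, grad_size, conv_tol=1e-9, conv_tol_grad=None):
    """Both the energy change and the gradient size must fall below their thresholds."""
    if conv_tol_grad is None:
        conv_tol_grad = default_conv_tol_grad(conv_tol)
    return abs(e_curr - e_prev) < conv_tol and grad_size < conv_tol_grad


print(default_conv_tol_grad(1e-9))
print(is_converged(-1.0, -1.0 - 1e-10, grad_size=1e-6))  # both criteria met
print(is_converged(-1.0, -1.0 - 1e-10, grad_size=1e-3))  # gradient still too large
```

Tightening `conv_tol` alone therefore also tightens the gradient criterion, unless `conv_tol_grad` is set explicitly.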

## direct_scf_tol attribute
#### DEFAULT: `mf.direct_scf_tol=1e-13`
The direct_scf_tol attribute is the threshold for discarding integrals.

## init_guess attribute
#### DEFAULT: `mf.init_guess='minao'`
A good initial guess is vital for the efficient completion of the SCF procedure. There are four initial guess options in PySCF:
1. `mf.init_guess='minao'`
2. `mf.init_guess='atom'`
3. `mf.init_guess='1e'`
4. `mf.init_guess='chkfile'`

The first option generates an initial guess for the density matrix based on the ANO basis, and then projects this onto the basis set specified in `mol.basis`.

The second option generates an initial guess based on a superposition of atomic densities computed in the basis set specified in `mol.basis`.

The third option diagonalizes the one-electron (core) Hamiltonian, which amounts to building the first Fock matrix from a zero density matrix. This is generally a very poor guess and should only be used for debugging purposes or for comparing values between different packages.

The last option is somewhat advanced and reads in an existing density matrix from disk. Further information on this option can be found [here](http://sunqm.github.io/pyscf/scf.html#pyscf.scf.hf.SCF.init_guess_by_chkfile).

## max_cycle attribute
#### DEFAULT: `mf.max_cycle=50`
The max_cycle attribute simply sets the maximum number of SCF cycles that should be performed before the calculation terminates. For systems that are notoriously difficult to converge, the value should be increased to 100 or even 1000.

## max_memory attribute
#### DEFAULT: `mf.max_memory=4000`
The max_memory attribute determines the maximum amount of memory (in Megabytes) that PySCF is allowed to utilize during the SCF procedure. This should be set by the user based on the memory limitations of the computer/server utilized.

## verbose attribute
#### DEFAULT: `mf.verbose=0`
The verbose attribute controls the print level for the mean-field object. Setting `mf.verbose=0` will print only the SCF energy, while setting `mf.verbose=4` prints useful information about SCF settings as well as the SCF energy per iteration, HOMO/LUMO energies, and convergence metrics. Users who want to see detailed information should set `mf.verbose=10`, which provides additional information such as molecular orbital energies.

The final step is to run the SCF calculation:

In [22]:
mf.kernel()

-152.08859934750245

After the SCF kernel has finished running, the mean-field object is updated with several useful output attributes:
1. `mf.mo_coeff` - Molecular orbital (MO) coefficients (matrix where rows are atomic orbitals (AO) and columns are MOs)
2. `mf.mo_energy` - MO energies (vector with length equal to number of MOs)
3. `mf.mo_occ` - MO occupancy (vector with length equal to number of MOs)
4. `mf.e_tot` - Total SCF energy in units of Hartrees
5. `mf.converged` - Status of SCF convergence (True indicates converged and False indicates unconverged)
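As an illustration of how these output attributes can be used, the sketch below extracts the HOMO and LUMO energies from an orbital energy vector and an occupancy vector. The numbers are made up for illustration and are not the output of a calculation. The function name `homo_lumo_gap` is hypothetical, and the Hartree-to-eV factor is the CODATA 2018 value:

```python
HARTREE_IN_EV = 27.211386245988  # CODATA 2018 (assumed)


def homo_lumo_gap(mo_energy, mo_occ):
    """Return (HOMO, LUMO, gap) from orbital energies and occupancies."""
    occupied = [e for e, occ in zip(mo_energy, mo_occ) if occ > 0]
    virtual = [e for e, occ in zip(mo_energy, mo_occ) if occ == 0]
    homo, lumo = max(occupied), min(virtual)
    return homo, lumo, lumo - homo


# Made-up values in Hartree; in practice these would be mf.mo_energy and mf.mo_occ.
mo_energy = [-20.5, -1.3, -0.7, -0.5, 0.2, 0.9]
mo_occ = [2, 2, 2, 2, 0, 0]

homo, lumo, gap = homo_lumo_gap(mo_energy, mo_occ)
print(homo, lumo, gap * HARTREE_IN_EV)
```

For a restricted calculation the two attributes are one-dimensional, so they can be passed directly. An unrestricted calculation stores separate alpha and beta sets, which would have to be handled one at a time.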

The following Code cell combines the content covered above into a useable sample input for setting up a mean-field object. The more advanced attributes are listed (commented) at the bottom of the Code cell and the advanced user can find more information about them [here](http://sunqm.github.io/pyscf/scf.html).

In [23]:
mf=scf.RHF(mol)

mf.conv_tol=1e-12
mf.conv_tol_grad=1e-8
mf.direct_scf_tol=1e-13
mf.init_guess='atom'
mf.max_cycle=100
mf.max_memory=8000
mf.verbose=0

mf.kernel()

#ADVANCED ATTRIBUTES – SEE MANUAL
#mf.chkfile=
#mf.conv_check=
#mf.damp=
#mf.diis=
#mf.diis_file=
#mf.diis_space=
#mf.diis_space_rollback=
#mf.diis_start_cycle=
#mf.direct_scf=
#mf.level_shift=

-152.08859934750365

In order to run an unrestricted Hartree–Fock calculation, the steps taken above can be followed, except the mean-field object should be initiated as:

In [24]:
mf=scf.UHF(mol)

---

# Creating a mean-field (mf) object with scf.RKS()

The most widely-used SCF method is undoubtedly density functional theory (DFT), and running a DFT calculation in PySCF is nearly as straightforward as running a Hartree–Fock calculation.

In order to gain access to additional functions that do not pertain to standard SCF/HF, it is necessary to import the dft (density functional theory) module:

In [25]:
from pyscf import dft

The first step is to create the method object (RKS in this case):

In [26]:
mf=dft.RKS(mol)

RHF and RKS (as well as UHF and UKS) share many attributes, so we will only cover the ones that pertain specifically to DFT. These additional attributes primarily concern the exchange-correlation functional and the grid used for its numerical integration.

## xc attribute
#### (NO DEFAULT)

The xc attribute sets the density functional approximation. A vast number of exchange-correlation functionals are available for use in PySCF, essentially all of those that are available in the installed version of libxc. For example, the functionals available in libxc 4.0.4 are listed [here](https://gitlab.com/libxc/libxc/wikis/Functionals-list-4.0.4).

While the parsing rules for the xc attribute are explained in the [PySCF manual](http://sunqm.github.io/pyscf/dft.html#customizing-xc-functional), we will go through the main relevant examples here:

1. Standard (pre-coded) exchange-correlation functional

  * A classic example of this scenario is the beloved B3LYP functional. To run a B3LYP calculation, simply specify `mf.xc='B3LYP'`.
  
  * As long as commas and operators are not used, PySCF will assume that the string indicates a full exchange-correlation functional. Other examples may include `mf.xc='PBE'`, `mf.xc='TPSS'`, `mf.xc='M06-L'`, etc.

2. Combination of a single exchange functional with a single correlation functional

  * This is a popular choice, particularly for using various combinations of the older exchange and correlation functionals from the 1980s and 1990s.
  
  * In PySCF, inserting a comma into the xc attribute string indicates a separation into exchange (left of the comma) and correlation (right of the comma). For instance, if one wants to run B88 exchange with PBE correlation, it is as simple as `mf.xc='B88,PBE'`.
  
  * Since the PBE exchange-correlation functional is a combination of PBE exchange and PBE correlation, it follows that `mf.xc='PBE'` and `mf.xc='PBE,PBE'` are exactly equivalent ways of using the PBE xc functional.
  
  * Other popular combinations of a single exchange functional with a single correlation functional include `mf.xc='B88,LYP'`, `mf.xc='revPBE,PBE'`, and `mf.xc='OPTX,LYP'`. These combinations are historically known as BLYP, revPBE, and OLYP, and can equivalently be called as `mf.xc='BLYP'`, `mf.xc='revPBE'`, and `mf.xc='OLYP'`.
  
  * This option is most useful when one is interested in an unconventional combination, for example, the combination of PBE exchange with TPSS correlation, `mf.xc='PBE,TPSS'`.

3. Single exchange functional or single correlation functional

  * Using a single exchange functional, for example, PBE exchange only, is as simple as `mf.xc='PBE,'`. Similarly, PBE correlation only can be specified with `mf.xc=',PBE'`. In the first case (PBE exchange only), the presence of the comma tells the parser that a pre-coded xc functional is not being utilized. Therefore, anything to the left of the comma (PBE) is taken to be the exchange functional, while anything to the right of the comma (nothing) is taken to be the correlation functional. The reverse situation applies to the second case with PBE correlation only.

4. Custom exchange-correlation functional

  * Naturally, it is possible to fully customize the xc attribute. Once again, anything to the left of the comma is assumed to refer to exchange, and anything to the right of the comma is assumed to refer to correlation.
  
  * In order to start with something simple, we will demonstrate how to specify the popular [PBE0](https://aip.scitation.org/doi/abs/10.1063/1.478522) xc functional by building it from its components (the easy way to do this is simply `mf.xc='PBE0'`). PBE0 is comprised of 25% exact (Hartree–Fock) exchange, 75% PBE exchange, and 100% PBE correlation. This can be accomplished by the following line: `mf.xc='0.25*HF+0.75*PBE,PBE'`.
  
  * A more complicated example is that of B3LYP. B3LYP is comprised of 20% exact (Hartree–Fock) exchange, 8% LDA exchange, 72% B88 exchange, 19% VWN correlation, and 81% LYP correlation. Unfortunately, there are 6 existing parameterizations of the VWN LDA correlation functional, and this has led to much confusion through the years.
  
  * PySCF, TURBOMOLE, GAMESS, ORCA, and PSI4 all use the VWN5 parameterization, and the following two assignments are equivalent:
    * `mf.xc='B3LYP'`
    * `mf.xc='0.2*HF+0.08*LDA+0.72*B88,0.19*VWN5+0.81*LYP'`
  * Q-Chem and NWChem use the VWN1RPA parameterization:
    * `mf.xc='0.2*HF+0.08*LDA+0.72*B88,0.19*VWN_RPA+0.81*LYP'`
  * Gaussian uses the VWN3 parameterization:
    * `mf.xc='0.2*HF+0.08*LDA+0.72*B88,0.19*VWN3+0.81*LYP'`
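To make the comma and operator rules concrete, the sketch below splits an xc string into its weighted exchange and correlation terms. It is a deliberately simplified stand-in for PySCF's own parser (it only understands `+`, `*`, and a single comma), and the function names `parse_terms` and `split_xc` are hypothetical:

```python
def parse_terms(part):
    """Split 'a*X+b*Y' into [(a, 'X'), (b, 'Y')]; a missing factor means 1.0."""
    terms = []
    for token in part.split("+"):
        token = token.strip()
        if not token:
            continue  # empty side of the comma, e.g. 'PBE,'
        if "*" in token:
            weight, name = token.split("*", 1)
            terms.append((float(weight), name.strip()))
        else:
            terms.append((1.0, token))
    return terms


def split_xc(xc):
    """Return (exchange_terms, correlation_terms) for a string containing a comma."""
    if "," not in xc:
        raise ValueError("no comma: the string names a full xc functional")
    exchange, correlation = xc.split(",", 1)
    return parse_terms(exchange), parse_terms(correlation)


pbe0 = split_xc("0.25*HF+0.75*PBE,PBE")
b3lyp = split_xc("0.2*HF+0.08*LDA+0.72*B88,0.19*VWN5+0.81*LYP")
print(pbe0)
print(b3lyp)
```

A quick sanity check for a hand-built hybrid is that the exchange weights and the correlation weights each sum to one, as they do for both strings above.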

Running B3LYP is as simple as:

In [27]:
mf.xc='B3LYP'

## nlc attribute
#### DEFAULT: `mf.nlc=''`

The nlc attribute is used to specify the inclusion of the VV10 nonlocal correlation functional. This can simply be accomplished by `mf.nlc='VV10'`. Certain density functionals such as [ωB97X-V](http://pubs.rsc.org/en/Content/ArticleLanding/2014/CP/c3cp54374a#!divAbstract), [ωB97M-V](https://aip.scitation.org/doi/abs/10.1063/1.4952647), and [B97M-V](https://aip.scitation.org/doi/abs/10.1063/1.4907719) require this attribute to run correctly. More information on the VV10 nonlocal correlation functional can be found [here](https://aip.scitation.org/doi/abs/10.1063/1.3521275).

For example, the following correctly specifies the ωB97M-V density functional:

In [28]:
mf.xc='wB97M-V'
mf.nlc='VV10'

The internal VV10 parameters (b and C) are automatically set by libxc.

## grids attribute/class

Although PySCF allows for the comprehensive customization of the DFT integration grid, non-expert users are advised to rely on the default settings as they have been fine-tuned to maximize accuracy and minimize computational cost.

### grids.level attribute
#### DEFAULT: `mf.grids.level=3`

This is the simplest grid attribute to manipulate in order to increase or decrease the number of grid points. While the default value is 3, the range of possible values starts at `mf.grids.level=0` (coarsest) and continues to `mf.grids.level=9` (finest).

### grids.atom_grid attribute
#### (NO DEFAULT)

This attribute can be used to set the total number of radial shells and the number of angular grid points per shell. The format is (radial,angular). It is possible to choose the same setting for all atoms:

In [29]:
mf.grids.atom_grid=(99,590)

or to use Python's dictionary data structure to specify a different setting for different atoms/elements:

In [30]:
mf.grids.atom_grid={'H': (50,194),'O': (75,302)}

The number of radial shells can be set to any number larger than 0, while the number of angular grid points per shell must adhere to a valid Lebedev quadrature:

{6,14,26,38,50,74,86,110,146,170,194,230,266,302,350,434,590,770,974,1202,1454,1730,2030,2354,2702,3074,3470,3890,4334,4802,5294,5810}

Commonly used combinations include:
* very coarse: `mf.grids.atom_grid=(23,170)`
* coarse: `mf.grids.atom_grid=(50,194)`
* moderate: `mf.grids.atom_grid=(75,302)`
* fine: `mf.grids.atom_grid=(99,590)`
* extremely fine: `mf.grids.atom_grid=(500,974)`

Setting this attribute will override the `mf.grids.level` attribute.
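Because an invalid angular count is an easy mistake to make, it can be worth validating a (radial,angular) pair before assigning it. The helper below is a hypothetical convenience function, not part of PySCF. It checks the pair against the Lebedev set listed above and returns the number of points per atom before any pruning:

```python
LEBEDEV_ORDERS = {
    6, 14, 26, 38, 50, 74, 86, 110, 146, 170, 194, 230, 266, 302, 350, 434,
    590, 770, 974, 1202, 1454, 1730, 2030, 2354, 2702, 3074, 3470, 3890,
    4334, 4802, 5294, 5810,
}


def check_atom_grid(atom_grid):
    """Validate a (radial, angular) pair and return the unpruned points per atom."""
    radial, angular = atom_grid
    if radial <= 0:
        raise ValueError("the number of radial shells must be larger than 0")
    if angular not in LEBEDEV_ORDERS:
        raise ValueError("%d is not a valid Lebedev quadrature size" % angular)
    return radial * angular


for name, grid in [("coarse", (50, 194)), ("moderate", (75, 302)), ("fine", (99, 590))]:
    print(name, grid, check_atom_grid(grid))
```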

### grids.atomic_radii attribute
#### DEFAULT: `mf.grids.atomic_radii=dft.radi.BRAGG_RADII` [TODO]

This attribute chooses the atomic radius values that will be used to adjust the spacing of the radial shells. If `grids.radii_adjust=None`, setting this parameter will have no effect. There are two built-in options in PySCF:

1. `mf.grids.atomic_radii=dft.radi.BRAGG_RADII`
2. `mf.grids.atomic_radii=dft.radi.COVALENT_RADII`

The first one corresponds to the values found in this 1964 [paper](https://aip.scitation.org/doi/10.1063/1.1725697) by J. C. Slater, while the second one is primarily constructed from the values found in this [paper](http://pubs.rsc.org/en/content/articlelanding/2008/dt/b801115j#!divAbstract).

### grids.becke_scheme attribute
#### DEFAULT: `mf.grids.becke_scheme=dft.gen_grid.original_becke` [TODO]

This attribute selects the partition function used to determine the grid point weights. PySCF allows for two options:

1. `mf.grids.becke_scheme=dft.gen_grid.original_becke`
2. `mf.grids.becke_scheme=dft.gen_grid.stratmann`

The first scheme is from the famous 1988 [paper](https://aip.scitation.org/doi/abs/10.1063/1.454033) by Axel Becke, while the second scheme devised by Stratmann, Scuseria, and Frisch is found in this [paper](https://www.sciencedirect.com/science/article/pii/0009261496006008).

### grids.prune attribute
#### DEFAULT: `mf.grids.prune=dft.gen_grid.nwchem_prune` [TODO]

Pruning integration grids offers the benefit of reducing the number of grid points (as well as the computation time) without significantly affecting the total energy. PySCF has several built-in pruning options:

1. `mf.grids.prune=dft.gen_grid.nwchem_prune`
2. `mf.grids.prune=dft.gen_grid.sg1_prune`
3. `mf.grids.prune=dft.gen_grid.treutler_prune`
4. `mf.grids.prune=None`

### grids.radi_method attribute
#### DEFAULT: `mf.grids.radi_method=dft.radi.treutler_ahlrichs` [TODO]

This attribute selects the scheme used to set up the radial shells. PySCF provides four options:

1. `mf.grids.radi_method=dft.radi.treutler_ahlrichs`
2. `mf.grids.radi_method=dft.radi.delley`
3. `mf.grids.radi_method=dft.radi.mura_knowles`
4. `mf.grids.radi_method=dft.radi.gauss_chebyshev`

### grids.radii_adjust attribute
#### DEFAULT: `mf.grids.radii_adjust=dft.radi.treutler_atomic_radii_adjust` [TODO]

There are two schemes for adjusting the atomic radii, and the adjustment can also be switched off with `None`:

1. `mf.grids.radii_adjust=dft.radi.treutler_atomic_radii_adjust`
2. `mf.grids.radii_adjust=dft.radi.becke_atomic_radii_adjust`
3. `mf.grids.radii_adjust=None`

### grids.verbose attribute
#### DEFAULT: `mf.grids.verbose=0`

The verbose attribute controls the print level for the integration grid. Setting `mf.grids.verbose=0` will print no information about the grid, while setting `mf.grids.verbose=4` prints useful information such as the total number of grid points, as well as information regarding the schemes used to set up the grid.

## nlcgrids attribute/class

The `nlcgrids` attribute functions identically to the `grids` attribute. The only difference is that `nlcgrids` controls the grid settings for the nonlocal correlation functional, whereas `grids` controls the grid for the local exchange-correlation functional. The VV10 nonlocal correlation functional can be integrated on a much coarser grid than the local xc functional needs, which is fortunate because the cost of the nonlocal term scales quadratically with the number of grid points (it loops over both r and r'). [TODO: example of specifying a coarse grid]
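
To make the cost argument concrete, the sketch below counts unpruned grid points for two `(radial, angular)` specifications and compares the size of the resulting pair loops. It is plain arithmetic rather than PySCF code: the helper `points_per_atom` and the six-atom system (a water dimer) are assumptions made for illustration, while the grid sizes `(99,590)` and `(50,194)` are the ones that appear as commented-out defaults in the `run` function later in this tutorial.

```python
def points_per_atom(atom_grid):
    """Unpruned points per atom for a (radial, angular) grid specification."""
    n_radial, n_angular = atom_grid
    return n_radial * n_angular

n_atoms = 6  # assumption: a six-atom system such as the water dimer

fine = n_atoms * points_per_atom((99, 590))    # candidate grid for the local xc part
coarse = n_atoms * points_per_atom((50, 194))  # candidate grid for the nonlocal part

# The nonlocal term loops over pairs of points (r, r'), so its cost grows as ng**2.
pair_ratio = (fine / float(coarse)) ** 2

print("fine grid points:   {}".format(fine))
print("coarse grid points: {}".format(coarse))
print("pair-loop cost ratio (fine/coarse): {:.1f}".format(pair_ratio))
```

The coarse count, 6 × 50 × 194 = 58200, is consistent with the `tot grids = 58200` line printed for the unpruned `(50,194)` grid in the example further below. In the `run` function, a grid of this kind is assigned through `mf.nlcgrids.atom_grid`.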

## small_rho_cutoff attribute
#### DEFAULT: `mf.small_rho_cutoff=1e-07`

This attribute is used to discard grid points that contribute negligibly to the total electron count as computed by the chosen integration grid. Given a vector **r** of length ng (the number of grid points) that contains the value of the electron density at each grid point, and a vector **w** of length ng that contains the weight associated with each grid point, grid point $i$ is discarded if the corresponding element of the element-wise product **r**∘**w** satisfies $r_i w_i$ ≤ `mf.small_rho_cutoff`.
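
The criterion can be illustrated with a few lines of plain Python. This is only a sketch of the rule as stated above, not PySCF's internal implementation; the helper name `keep_mask` and the density and weight values are made up for the example.

```python
def keep_mask(rho, weights, cutoff=1e-7):
    """True for grid points whose weighted density exceeds the cutoff."""
    return [r * w > cutoff for r, w in zip(rho, weights)]

# Made-up densities and weights for four grid points (illustration only)
rho = [2.0e-1, 3.0e-4, 5.0e-9, 1.0e-12]
weights = [1.0e-2, 5.0e-2, 2.0e-1, 4.0e-1]

mask = keep_mask(rho, weights)

# Electron count with all points and with only the retained points
n_all = sum(r * w for r, w in zip(rho, weights))
n_kept = sum(r * w for r, w, keep in zip(rho, weights, mask) if keep)

print(mask)
print("electron count lost by discarding points: {:.3e}".format(n_all - n_kept))
```

The two points with a weighted density below the cutoff are dropped, and the integrated electron count changes by a correspondingly tiny amount.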

In [31]:
mf=scf.RKS(mol)

mf.conv_tol=1e-12
mf.conv_tol_grad=1e-8
mf.direct_scf_tol=1e-13
mf.init_guess='atom'
mf.max_cycle=100
mf.max_memory=8000
mf.verbose=0

mf.xc='B3LYP'
mf.grids.atom_grid=(50,194)
mf.grids.becke_scheme=dft.gen_grid.original_becke
mf.grids.prune=None
mf.grids.radi_method=dft.radi.gauss_chebyshev
mf.grids.radii_adjust=None
mf.grids.verbose=4
mf.small_rho_cutoff=1e-10

mf.kernel()

radial grids: Gauss-Chebyshev (JCP, 108, 3226) radial grids
becke partition: Becke, JCP, 88, 2547 (1988)
pruning grids: None
grids dens level: 3
symmetrized grids: False
User specified grid scheme (50, 194)
tot grids = 58200


-152.82239954988552

---


In [1]:
import pyscf,numpy
from pyscf import gto,scf,dft

def read_molecule(path):

    with open(path,'r') as myfile:
        output=myfile.read()
        output=output.lstrip()
        output=output.rstrip()
        output=output.split('\n')

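    #the first line identifies the format
    try:
        natoms=int(output[0])
    except ValueError:
        #not XYZ: look for a header line made of two integers
        try:
            header=output[0].split()
            int(header[0])
            int(header[1])
        except (ValueError,IndexError):
            #plain coordinates with no header line
            mymol='\n'.join(output)
        else:
            mymol='\n'.join(output[1:])
    else:
        #XYZ: atom count, comment line, then one line per atom
        if natoms==len(output)-2:
            mymol='\n'.join(output[2:])
        else:
            raise ValueError("THIS IS NOT A VALID XYZ FILE")

    return mymol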

def run(method,molecule,basis,
        charge=0,ecp={},spin=0,unit='Angstrom',
        conv_tol=1e-12,conv_tol_grad=1e-8,direct_scf_tol=1e-13,init_guess='minao',level_shift=0,max_cycle=100,max_memory=8000,
        #xc=None,nlc='',xc_grid=(99,590),nlc_grid=(50,194),small_rho_cutoff=1e-7,
        #atomic_radii='BRAGG',becke_scheme='BECKE',prune=None,radi_method='GAUSS_CHEBYSHEV',radii_adjust=None,
        xc=None,nlc='',xc_grid=3,nlc_grid=1,small_rho_cutoff=1e-7,
        atomic_radii='BRAGG',becke_scheme='BECKE',prune='NWCHEM',radi_method='TREUTLER_AHLRICHS',radii_adjust='TREUTLER',
        algo='DIIS',lin_dep_thresh=1e-8,prop=False,save=False,scf_type=None,stable=False,stable_cyc=3,
        verbose=0,datapath=''):

    #strings
    method=method.upper()
    atomic_radii=atomic_radii.upper()
    becke_scheme=becke_scheme.upper()
    if isinstance(prune,str):
        prune=prune.upper()
    radi_method=radi_method.upper()
    if isinstance(radii_adjust,str):
        radii_adjust=radii_adjust.upper()
    algo=algo.upper()

    #create molecule object
    mol=gto.Mole()

    #set molecule attributes

    try:
        gto.Mole(atom=molecule,charge=charge,spin=spin).build()
    except KeyError:
        mol.atom=read_molecule(datapath+molecule)
    else:
        mol.atom=molecule
    mol.basis=basis
    mol.charge=charge
    mol.ecp=ecp
    mol.spin=spin
    mol.unit=unit
    mol.verbose=verbose
    mol.build()

    #check method for density functional
    DFT=False
    if method!='HF':
        try:
            dft.libxc.parse_xc(method)
        except KeyError:
            pass
        else:
            xc=method
            method='KS'

    #determine restricted/unrestricted if unspecified
    #if unspecified, automatically sets R if closed-shell and U if open-shell
    if (method=='HF' or method=='KS' or method=='DFT') and scf_type==None:
        if spin==0:
            scf_type='R'
        else:
            scf_type='U'

    #create HF/KS object
    if method=='RHF' or method=='ROHF' or (method=='HF' and (scf_type=='R' or scf_type=='RO')):
        mf=scf.RHF(mol)
        scf_type='R'
    elif method=='UHF' or (method=='HF' and scf_type=='U'):
        mf=scf.UHF(mol)
        scf_type='U'
    elif method=='RKS' or method=='ROKS' or method=='RDFT' or method=='RODFT' or ((method=='KS' or method=='DFT') and (scf_type=='R' or scf_type=='RO')):
        mf=scf.RKS(mol)
        scf_type='R'
        DFT=True
    elif method=='UKS' or method=='UDFT' or ((method=='KS' or method=='DFT') and scf_type=='U'):
        mf=scf.UKS(mol)
        scf_type='U'
        DFT=True
    else:
        raise ValueError("unrecognized method/scf_type combination: %s, %s" % (method,scf_type))

    #set HF attributes
    mf.conv_check=False
    mf.conv_tol=conv_tol
    mf.conv_tol_grad=conv_tol_grad
    mf.direct_scf_tol=direct_scf_tol
    mf.init_guess=init_guess
    mf.level_shift=level_shift
    mf.max_cycle=max_cycle
    mf.max_memory=max_memory
    mf.verbose=verbose

    #set KS attributes
    if DFT:

        mf.xc=xc
        if mf.xc is None:
            raise ValueError("a density functional (xc) must be specified for a KS calculation")
        mf.nlc=nlc
        
        if isinstance(xc_grid,int):
            mf.grids.level=xc_grid
        elif isinstance(xc_grid,tuple) or isinstance(xc_grid,dict):
            mf.grids.atom_grid=xc_grid
        else:
            raise TypeError("xc_grid must be an int (grid level), a tuple, or a dict")

        if isinstance(nlc_grid,int):
            mf.nlcgrids.level=nlc_grid
        elif isinstance(nlc_grid,tuple) or isinstance(nlc_grid,dict):
            mf.nlcgrids.atom_grid=nlc_grid
        else:
            raise TypeError("nlc_grid must be an int (grid level), a tuple, or a dict")

        if atomic_radii=='BRAGG':
            mf.grids.atomic_radii=mf.nlcgrids.atomic_radii=dft.radi.BRAGG_RADII
        elif atomic_radii=='COVALENT':
            mf.grids.atomic_radii=mf.nlcgrids.atomic_radii=dft.radi.COVALENT_RADII
        else:
            raise ValueError("unrecognized atomic_radii: %s" % atomic_radii)

        if becke_scheme=='BECKE':
            mf.grids.becke_scheme=mf.nlcgrids.becke_scheme=dft.gen_grid.original_becke
        elif becke_scheme=='STRATMANN':
            mf.grids.becke_scheme=mf.nlcgrids.becke_scheme=dft.gen_grid.stratmann
        else:
            raise ValueError("unrecognized becke_scheme: %s" % becke_scheme)

        if prune=='NWCHEM':
            mf.grids.prune=mf.nlcgrids.prune=dft.gen_grid.nwchem_prune
        elif prune=='SG1':
            mf.grids.prune=mf.nlcgrids.prune=dft.gen_grid.sg1_prune
        elif prune=='TREUTLER':
            mf.grids.prune=mf.nlcgrids.prune=dft.gen_grid.treutler_prune
        elif prune=='NONE' or prune==None:
            mf.grids.prune=mf.nlcgrids.prune=None
        else:
            raise ValueError("unrecognized prune: %s" % prune)

        if radi_method=='TREUTLER_AHLRICHS' or radi_method=='TREUTLER' or radi_method=='AHLRICHS':
            mf.grids.radi_method=mf.nlcgrids.radi_method=dft.radi.treutler_ahlrichs
        elif radi_method=='DELLEY':
            mf.grids.radi_method=mf.nlcgrids.radi_method=dft.radi.delley
        elif radi_method=='MURA_KNOWLES' or radi_method=='MURA' or radi_method=='KNOWLES':
            mf.grids.radi_method=mf.nlcgrids.radi_method=dft.radi.mura_knowles
        elif radi_method=='GAUSS_CHEBYSHEV' or radi_method=='GAUSS' or radi_method=='CHEBYSHEV':
            mf.grids.radi_method=mf.nlcgrids.radi_method=dft.radi.gauss_chebyshev
        else:
            raise ValueError("unrecognized radi_method: %s" % radi_method)

        if radii_adjust=='TREUTLER':
            mf.grids.radii_adjust=mf.nlcgrids.radii_adjust=dft.radi.treutler_atomic_radii_adjust
        elif radii_adjust=='BECKE':
            mf.grids.radii_adjust=mf.nlcgrids.radii_adjust=dft.radi.becke_atomic_radii_adjust
        elif radii_adjust=='NONE' or radii_adjust==None:
            mf.grids.radii_adjust=mf.nlcgrids.radii_adjust=None
        else:
            raise ValueError("unrecognized radii_adjust: %s" % radii_adjust)

        mf.small_rho_cutoff=small_rho_cutoff

    #finally select optimizer
    if algo=='DIIS':
        #mf.DIIS=scf.diis.DIIS
        mf.diis=True
    elif algo=='ADIIS':
        mf.diis=scf.diis.ADIIS()
    elif algo=='EDIIS':
        mf.diis=scf.diis.EDIIS()
    elif algo=='NEWTON':
        mf=mf.newton()
    else:
        raise ValueError("unrecognized algo: %s" % algo)

    en=mf.kernel()

    if stable==True:
        print 'Initial SCF energy: ', en
        print 'Performing Stability Analysis (up to ', stable_cyc, 'iterations)'
        for i in range(stable_cyc):
            new_mo_coeff=mf.stability(internal=True,external=False)[0]
            if numpy.linalg.norm(numpy.array(new_mo_coeff)-numpy.array(mf.mo_coeff))<10**-14:
                print "Stable!"
                break
            else:
                print "Unstable!"
                if scf_type=='U':
                    nalpha=numpy.count_nonzero(mf.mo_occ[0])
                    nbeta=numpy.count_nonzero(mf.mo_occ[1])
                    en=mf.kernel(dm0=(numpy.dot(new_mo_coeff[0][:,:nalpha],new_mo_coeff[0].T[:nalpha]),numpy.dot(new_mo_coeff[1][:,:nbeta],new_mo_coeff[1].T[:nbeta])))
                elif scf_type=='R' or scf_type=='RO':
                    nalpha=numpy.count_nonzero(mf.mo_occ)
                    en=mf.kernel(dm0=(2*numpy.dot(new_mo_coeff[:,:nalpha],new_mo_coeff.T[:nalpha])))
                else:
                    raise ValueError("unrecognized scf_type: %s" % scf_type)
                print 'Updated SCF energy: ', en

    #properties
    if prop==True:

        #dipole moment
        mf.dip_moment()

        #S^2
        if scf_type=='U':
            print '(S^2,2S+1):', mf.spin_square()
        elif scf_type=='R' or scf_type=='RO':
            S=float(mol.spin)/2.
            print '(S^2,2S+1):', (S*(S+1.),2.*S+1.)
        else:
            raise ValueError("unrecognized scf_type: %s" % scf_type)

        #population analysis
        #mulliken_pop is the plain Mulliken analysis; mulliken_meta uses meta-Lowdin AOs
        print "Mulliken population analysis"
        for ia in range(mol.natm):
            symb=mol.atom_symbol(ia)
            print symb,':', mf.mulliken_pop(verbose=verbose)[1][ia]

        print "Mulliken population analysis, based on meta-Lowdin AOs"
        for ia in range(mol.natm):
            symb=mol.atom_symbol(ia)
            print symb,':', mf.mulliken_meta(verbose=verbose)[1][ia]

        #atomic spins?

    return en

#return summary/JSON

# User Guide

The QuSim user interface is the easiest way to perform computational chemistry calculations using PySCF. The `run` function accepts three mandatory, ordered inputs (as well as a variety of optional ones):

* method
* geometry
* basis

This guide will cover the basics of the QuSim user interface and demonstrate the computation of various types of chemical interactions such as atomization energies, bond dissociation energies, and binding energies. Furthermore, it will highlight the sensitivity of certain chemical interactions and methods to the choice of basis set and of integration grid (the latter pertains specifically to DFT).

To begin, we will calculate the energy of H$_2$ with a bond length of 0.74 Angstrom. First, it is necessary to define the geometry:

In [2]:
h2="""
H 0.0 0.0 0.0
H 0.0 0.0 0.74"""

For this specific example, we will use Hartree–Fock (HF) and the Dunning cc-pVDZ basis set:

In [3]:
run('HF',h2,'cc-pVDZ')

-1.1287000935564406

This should return an energy of -1.1287000935564406 Hartree.

The `run` function has a variety of useful features in addition to these three basic ones. One can optionally set the charge and spin of a molecule as follows:

In [4]:
run('HF',h2,'cc-pVDZ',charge=1,spin=1)

-0.5652012007354672

The calculation above computes the energy of H$_2^+$, which has a single unpaired electron (hence `spin=1`, where `spin` = 2S).
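
The relation between `spin`, the numbers of alpha and beta electrons, and the multiplicity can be sketched in a few lines. The helper `spin_info` below is hypothetical and is not part of `run` or of PySCF; it only assumes the convention stated above, namely that `spin` equals 2S, the number of unpaired electrons.

```python
def spin_info(n_electrons, spin):
    """Translate spin (= 2S = N_alpha - N_beta) into related quantities."""
    if spin < 0 or spin > n_electrons or (n_electrons - spin) % 2 != 0:
        raise ValueError(
            "spin={} is incompatible with {} electrons".format(spin, n_electrons))
    n_beta = (n_electrons - spin) // 2
    n_alpha = n_beta + spin
    s = spin / 2.0
    return {
        "n_alpha": n_alpha,
        "n_beta": n_beta,
        "S": s,
        "multiplicity": spin + 1,   # 2S+1
        "S^2": s * (s + 1.0),       # expectation value for a pure spin state
    }

print(spin_info(2, 0))  # neutral H2: closed-shell singlet
print(spin_info(1, 1))  # H2+: one unpaired electron, doublet
```

An inconsistent request, such as two electrons with `spin=1`, raises a `ValueError` in this sketch, since an even number of electrons cannot leave exactly one electron unpaired.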

By default, `run` will perform a restricted calculation if the molecule is closed-shell (no unpaired electrons) and an unrestricted calculation if the molecule is open-shell. This default can be overridden in two ways: either the method can be stated explicitly (`RHF`, `ROHF`, or `UHF`), or the `scf_type` input can be set explicitly (`R`, `RO`, or `U`).

As an example, we will compute the energy of an open-shell molecule, namely, the hydroxyl radical, OH•.

In [5]:
oh="""
O 0.0000000000 0.0000000000 0.0000000000
H 0.0000000000 0.0000000000 0.9706601900
"""

We will first compute the energy by specifying the method simply as `HF`, followed by `RHF`, `ROHF`, and `UHF`.

In [30]:
#unrestricted
print 'HF: ', run('HF',oh,'cc-pVDZ',spin=1)
print 'UHF: ', run('UHF',oh,'cc-pVDZ',spin=1)
print 'HF + U: ', run('HF',oh,'cc-pVDZ',spin=1,scf_type='U')

#restricted
print 'RHF: ', run('RHF',oh,'cc-pVDZ',spin=1)
print 'ROHF: ', run('ROHF',oh,'cc-pVDZ',spin=1)
print 'HF + R: ', run('HF',oh,'cc-pVDZ',spin=1,scf_type='R')
print 'HF + RO: ', run('HF',oh,'cc-pVDZ',spin=1,scf_type='RO')

HF:  -75.3938226865
RHF:  -75.3899856282
ROHF:  -75.3899856282
UHF:  -75.3938226865
HF + R:  -75.3899856282
HF + RO:  -75.3899856282
HF + U:  -75.3938226865


Naturally, the `HF`, `UHF`, and `HF + U` runs all return the lower energy corresponding to a UHF calculation, while the rest return the higher energy corresponding to an ROHF calculation. This demonstration is merely used to show the different ways in which one can specify the same calculation.
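
The equivalences among these seven calls can be summarized in a small decision function. The sketch below is a simplified restatement of the branching in `run`, restricted to Hartree–Fock; the name `resolve_reference` is hypothetical, and, as in `run`, `RO` is grouped with `R` because both lead to the restricted (open-shell) object.

```python
def resolve_reference(method, spin=0, scf_type=None):
    """Return 'R' (restricted) or 'U' (unrestricted) for a Hartree-Fock request."""
    method = method.upper()
    if method in ("RHF", "ROHF"):
        return "R"
    if method == "UHF":
        return "U"
    if method != "HF":
        raise ValueError("this sketch only handles Hartree-Fock methods")
    if scf_type is None:
        # default: restricted for closed-shell, unrestricted for open-shell
        return "R" if spin == 0 else "U"
    if scf_type in ("R", "RO"):
        return "R"
    if scf_type == "U":
        return "U"
    raise ValueError("unrecognized scf_type: {}".format(scf_type))

# The seven hydroxyl-radical calls from the cell above
requests = [("HF", None), ("UHF", None), ("HF", "U"),
            ("RHF", None), ("ROHF", None), ("HF", "R"), ("HF", "RO")]
for method, scf_type in requests:
    print(method, scf_type, resolve_reference(method, spin=1, scf_type=scf_type))
```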

Another useful feature that can be turned on is `prop`. When `prop=True`, a variety of molecular properties are computed after the SCF converges. For example:

In [32]:
run('UHF',oh,'cc-pVDZ',spin=1,prop=True)

Dipole moment(X, Y, Z, Debye):  0.00000,  0.00000,  1.80400
(S^2,2S+1): (0.75461173279801885, 2.0046064280032816)
Mulliken population analysis
O : -0.184995587217
H : 0.184995587217
Mulliken population analysis, based on meta-Lowdin AOs
O : -0.323214025599
H : 0.323214025599


-75.393822686483375

The above calculation on the hydroxyl radical not only returns the energy, but also the dipole moment, S$^2$ and 2S+1 values, as well as two different population analysis results.
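
The two spin numbers in the output are related through S$^2$ = S(S+1), so the effective multiplicity follows from the S$^2$ value by solving this quadratic for S. The sketch below checks that relation against the numbers printed above and quantifies the spin contamination of the UHF solution relative to a pure doublet. The function name `multiplicity_from_s2` is hypothetical.

```python
import math

def multiplicity_from_s2(s2):
    """Solve <S^2> = S(S+1) for S and return the effective multiplicity 2S+1."""
    return math.sqrt(1.0 + 4.0 * s2)

s2_uhf = 0.75461173279801885   # <S^2> printed for UHF OH above
s2_ideal = 0.5 * (0.5 + 1.0)   # pure doublet, S = 1/2

contamination = s2_uhf - s2_ideal

print("effective 2S+1:     {:.10f}".format(multiplicity_from_s2(s2_uhf)))
print("spin contamination: {:.5f}".format(contamination))
```

For a restricted open-shell calculation the contamination is zero by construction, which is why `run` simply reports S(S+1) and 2S+1 in that case.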

When dealing with radicals or multireference systems, it is possible to land on an SCF solution that is unstable. For these notorious cases, the `stable=True` setting will attempt to resolve the instability up to 3 times (this limit can be changed by setting `stable_cyc`). As an example, consider the C$_2$ molecule:

In [33]:
c2="""
C 0.0 0.0 0.0
C 0.0 0.0 1.24"""

In [34]:
run('UHF',c2,'cc-pVDZ',stable=True)

Initial SCF energy:  -75.386817114
Performing Stability Analysis (up to  3 iterations)
Unstable!
Updated SCF energy:  -75.5053140149
Stable!


-75.505314014915271

The above calculation first converges to an unstable energy of -75.386817114 Hartree, but the following energy of -75.5053140149 Hartree is stable.
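
To put the size of this instability in perspective, the energy lowering can be converted to kcal/mol. The short calculation below uses the two energies printed above and the Hartree-to-kcal/mol factor of 627.5095 that appears later in this tutorial; the variable names are chosen for illustration only.

```python
HARTREE_TO_KCAL = 627.5095  # conversion factor used later in this tutorial

e_unstable = -75.386817114   # initial SCF energy (Hartree)
e_stable = -75.5053140149    # energy after following the instability (Hartree)

lowering = e_stable - e_unstable            # negative: the stable solution is lower
lowering_kcal = lowering * HARTREE_TO_KCAL

print("energy lowering: {:.6f} Hartree".format(lowering))
print("energy lowering: {:.2f} kcal/mol".format(lowering_kcal))
```

A lowering of this magnitude is far larger than typical reaction energies, so any energy difference built on the unstable solution would be meaningless.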

Another useful feature is the ability to read molecules from disk. The parser can automatically detect any of the three file formats contained in the geom folder (XYZ, MOL, QC). It also takes an input variable called `datapath`, which is the path to the directory that contains the files. For example, the geom folder contains:

* water_dimer.mol
* water_dimer.qc
* water_dimer.xyz

The energies for the molecules contained in these three files can be computed very easily:

In [37]:
print run('HF','water_dimer.mol','cc-pVDZ',datapath='geom/')
print run('HF','water_dimer.qc','cc-pVDZ',datapath='geom/')
print run('HF','water_dimer.xyz','cc-pVDZ',datapath='geom/')

-152.06253625
-152.06253625
-152.06253625


This tutorial also comes with all of the molecules from the MGCDB84 database, found in this [DFT review paper](https://www.tandfonline.com/doi/full/10.1080/00268976.2017.1333644). These molecules can be found in the data folder. Running any of these molecules is also very easy:

In [38]:
print run('HF','126_c2h2_W4-11.xyz','cc-pVDZ',datapath='data/')

-76.8255572993


In [26]:
mol_list=['169_f2_W4-11.xyz',"""F"""]
mol_int=[1,-2]
spin_int=[0,1]
basis_set_list=['6-31G','cc-pVDZ','cc-pVTZ','cc-pVQZ','cc-pV5Z']

for j in range(len(basis_set_list)):
    en=0.
    for i in range(len(mol_list)):
        en+=mol_int[i]*run('0.2*HF+0.08*LDA+0.72*B88,0.19*VWN_RPA+0.81*LYP',mol_list[i],basis_set_list[j],spin=spin_int[i],stable=False,datapath='/home/narbe/Tutorial_PySCF/data/',scf_type='U')
    print basis_set_list[j], ':', en*627.5095

6-31G : -30.3459946951
cc-pVDZ : -37.7296552681
cc-pVTZ : -37.9622957659
cc-pVQZ : -37.4237604124
cc-pV5Z : -37.0068454437


In [27]:
mol_list=['169_f2_W4-11.xyz',"""F"""]
mol_int=[1,-2]
spin_int=[0,1]
basis_set_list=['6-31G','cc-pVDZ','cc-pVTZ','cc-pVQZ','cc-pV5Z']

for j in range(len(basis_set_list)):
    en=0.
    for i in range(len(mol_list)):
        en+=mol_int[i]*run('HF',mol_list[i],basis_set_list[j],spin=spin_int[i],stable=False,datapath='/home/narbe/Tutorial_PySCF/data/',scf_type='U')
    print basis_set_list[j], ':', en*627.5095

6-31G : 47.4536014544
cc-pVDZ : 40.7275466101
cc-pVTZ : 37.1085010276
cc-pVQZ : 37.2732645162
cc-pV5Z : 37.3276165346
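
The two loops above accumulate a stoichiometric combination of total energies, with `mol_int` supplying the coefficients (+1 for F$_2$ and -2 for F), and convert the result from Hartree to kcal/mol with the factor 627.5095. The sketch below isolates that bookkeeping in a hypothetical helper, `combine`, which is not part of `run`. It is applied to the H$_2$ and H$_2^+$ energies printed earlier to obtain a vertical ionization energy at the HF/cc-pVDZ level, and the HF numbers printed above are then differenced to show how the result changes from one basis set to the next.

```python
HARTREE_TO_KCAL = 627.5095  # conversion factor used in the loops above

def combine(energies, coefficients):
    """Stoichiometric sum of total energies (Hartree): sum_i c_i * E_i."""
    return sum(c * e for c, e in zip(coefficients, energies))

# H2 -> H2+ at fixed geometry, using the HF/cc-pVDZ energies printed earlier
e_h2_cation = -0.5652012007354672
e_h2 = -1.1287000935564406
vertical_ie = combine([e_h2_cation, e_h2], [1, -1]) * HARTREE_TO_KCAL
print("vertical IE of H2: {:.2f} kcal/mol".format(vertical_ie))

# HF values of E(F2) - 2E(F) in kcal/mol, copied from the output above
hf_results = [("6-31G", 47.4536014544), ("cc-pVDZ", 40.7275466101),
              ("cc-pVTZ", 37.1085010276), ("cc-pVQZ", 37.2732645162),
              ("cc-pV5Z", 37.3276165346)]

# Change on moving from each basis set to the next larger one
changes = [(larger[0], larger[1] - smaller[1])
           for smaller, larger in zip(hf_results, hf_results[1:])]
for name, delta in changes:
    print("{:8s} change vs. previous basis: {:+.4f} kcal/mol".format(name, delta))
```

The shrinking changes for the HF series indicate that the result is close to converged at the quadruple-zeta level, whereas the step from 6-31G to cc-pVDZ is several kcal/mol.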


In [10]:
#mol.multiplicity=1 for closed shell
#gaussian basis parser in basis/parse_gaussian.py

#TO-DO LIST

#allow mol.atom to take something like @H or !H -- something to indicate Ghost easier (X-H updates)
#mol.cart needs to be set based on the basis set (mostly for Pople -- do we care?)
#be able to read mol.atom from any file, and the file can either be just XYZ/Zmat, coordinates, Q-Chem like, etc.

#D3 has to be implemented, and VV10 sped up, and maybe D2 D3(BJ) D4 and all that new stuff

#what about fragment monomers?
#be able to take input of Gaussian94 basis/ecp
#be able to read basis from file on disk both formats
#maybe for mol.cart instead of 0 and 1, do 'spherical' and 'Cartesian'

#there should be a verbose level for Mole that doesn't print basis set coeffs and stuff
#shouldn't the extra cycle for SCF (conv_check) only be initiated IF there is a level shift?
#should we be using the norm of the orbital gradient for convergence or the rms
#printing HOMO/LUMO on each cycle is unnecessary or at least we should be able to turn it off
#we should print like orbital energies at end, split into occ and virt

#what exactly is the minao guess? will it be expensive for large systems?
#mp.frozen and cc.frozen should have a default!! like if true, it should just do something obvious
#pt2 kernel should not return t2?!
#cc kernel shouldn't return t2?!
#it's not clear what the default SCF conv_tol is? 1e-9 or 1e-10? it is 1e-9
#nalpha and nbeta for UHF?

#if atom_grid is specified, then don't print grids.level, that is confusing
#some properties should be computed for free and by default -- atomic charges, S^2, multipole moments, mulliken population
    #group all properties into def analyze and just make it default run in kernel
#lin dep should be automatically removed!!!
#stability_analysis easier
#optimizer options: diis, ediis, adiis, newton, gdm
#breakdown of energy printed at end (nuclear rep./1e/2e/DFT/disp/PCM/EFP/total)
#for DFT, it should allow 'M06-L' not 'M06_L,M06_L' need to hard-code the XC
    #for DFT mf.xc=',' in addition to 'HF' maybe we should have 'SR-HF' and 'LR-HF'
#allow setting of b and C for VV10
#give RHF/ROHF a spin square function just to print it out so ppl dont have to compute in their head
#spin per atom for UHF calc?

#PSI4 output info:
#rotational constants?
#charge/mult/electron alph beta
#dipole in different units, both nuclear/electron/full

#NWChem - VWN1RPA
#PySCF - VWN5
#TURBOMOLE - VWN5
#GAMESS - VWN5
#MOLPRO - VWN5
#ORCA - VWN5
#PSI4 - VWN5
#Q-Chem - VWN1RPA
#Gaussian - VWN3

#generate file that run function pseudocode
#mf.RKS()
#...
#mf=mf.newton()

#scf_guess read implement