# ErgoSCF methods: NAC workflow ("step2")

## Table of Contents <a name="TOC"></a>

1. [General setups](#setups)

2. [Fock matrix diagonalization approach](#fock_matrix_mo)

  2.1. [Restricted HF (RHF) case](#case-1)  
  
  2.2. [Unrestricted HF (UHF) case](#case-2)  
  
3. [Direct MO approach](#direct_mo)

  3.1. [Restricted HF (RHF) case](#case-3)  
  
  3.2. [Unrestricted HF (UHF) case](#case-4)  

4. [Optional cleanup](#cleanup) 



### A. Learning objectives

- to set up and run RHF and UHF electronic structure calculations with the ErgoSCF package
- to produce molecular orbitals in two different ways
- to compute NACs along pre-computed MD trajectories using the ErgoSCF and Libra codes


### B. Use cases

- NAC calculations


### C. Functions

- `libra_py`
  - `workflows.nbra`
    - [`step2_ergoscf`](#run_step2-1) | [also here](#run_step2-2) | [also here](#run_step2-3)      


### D. Classes and class members

None

## 1. General setups
<a name="setups"></a>[Back to TOC](#TOC)

This tutorial explains how to compute NACs along pre-computed MD trajectories using the ErgoSCF and Libra codes.

We will consider two major approaches to produce MOs: a) by diagonalizing the Fock matrix ourselves, or b) by reading the MOs that ErgoSCF produces via its projection algorithm.

### Fock matrix diagonalization approach

In this case, we run the SCF calculations until they converge. 
We then read the last (converged) Fock matrix and diagonalize it. 
As a result of the diagonalization, we obtain both the orbital energies and the MOs.
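
To make the diagonalization step concrete, here is a minimal NumPy sketch of solving the generalized eigenvalue problem `F C = S C E` by Löwdin orthogonalization. The function name `diagonalize_fock` and the 2x2 matrices are illustrative assumptions, not ErgoSCF output or the routine Libra actually uses.

```python
import numpy as np

def diagonalize_fock(F, S):
    """Solve F C = S C E for symmetric F and symmetric positive-definite S."""
    # Build S^(-1/2) from the eigen-decomposition of the overlap matrix
    s_vals, s_vecs = np.linalg.eigh(S)
    S_inv_half = s_vecs @ np.diag(s_vals ** -0.5) @ s_vecs.T
    # Transform F to the orthogonalized basis and diagonalize it there
    F_ortho = S_inv_half @ F @ S_inv_half
    E, C_ortho = np.linalg.eigh(F_ortho)
    # Back-transform the eigenvectors to the non-orthogonal AO basis
    C = S_inv_half @ C_ortho
    return E, C

# Toy matrices (made-up numbers, for illustration only)
F = np.array([[-1.0, -0.5], [-0.5, -0.8]])
S = np.array([[1.0, 0.4], [0.4, 1.0]])
E, C = diagonalize_fock(F, S)
print(E)  # orbital energies in ascending order
```

The columns of `C` are the MOs, normalized so that `C.T @ S @ C` is the identity matrix.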


### Direct MO approach

In this case, both the MOs and the orbital energies are produced by ErgoSCF.
We simply read and use them. Internally, ErgoSCF obtains them via a projection algorithm.

Note: 
>  As of now (see the output of this tutorial), the two approaches may give somewhat different
> orbital energies and NACs. This happens for some of the orbitals, but not for all. 
>
> The two approaches may also differ in the severity of the state (orbital) reordering 
> problem: in this example, reordering is less of an issue with the first method (internal diagonalization) 
> than with the second method (direct MO).

All right, let's get started. First, we import all the needed modules.

In [1]:
import os
import sys

# First, we import the Libra core library that matches the current platform
if sys.platform=="cygwin":
    from cyglibra_core import *
elif sys.platform=="linux" or sys.platform=="linux2":
    from liblibra_core import *
import util.libutil as comn
from libra_py import ERGO_methods
from libra_py import units
from libra_py.workflows.nbra import step2_ergoscf




Also, let's add the path to the `ergo` binary to the environment variables.

Editing this path is all that needs to be done to port this tutorial to another system.

Thanks to [this link](https://stackoverflow.com/questions/1681208/python-platform-independent-way-to-modify-path-environment-variable)

In [2]:
ergo_bin = "/mnt/c/cygwin/home/Alexey-user/Soft/ergo-3.8/source/"
os.environ["PATH"] += os.pathsep + ergo_bin

In [3]:
print(os.environ["PATH"])

/home/alexey/Conda/Miniconda3/envs/libra/bin:/home/alexey/Conda/Miniconda3/condabin:/home/alexey/.rvm/gems/ruby-3.0.0/bin:/home/alexey/.rvm/gems/ruby-3.0.0@global/bin:/usr/share/rvm/rubies/ruby-3.0.0/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/mnt/c/Program Files/WindowsApps/CanonicalGroupLimited.UbuntuonWindows_2004.2021.222.0_x64__79rhkp1fndgsc:/mnt/c/ProgramData/Oracle/Java/javapath:/mnt/c/Program Files (x86)/Intel/iCLS Client:/mnt/c/Program Files/Intel/iCLS Client:/mnt/c/Windows/system32:/mnt/c/Windows:/mnt/c/Windows/System32/Wbem:/mnt/c/Windows/System32/WindowsPowerShell/v1.0:/mnt/c/Program Files (x86)/NVIDIA Corporation/PhysX/Common:/mnt/c/Program Files (x86)/Intel/Intel(R) Management Engine Components/DAL:/mnt/c/Program Files/Intel/Intel(R) Management Engine Components/DAL:/mnt/c/Program Files (x86)/Intel/Intel(R) Management Engine Components/IPT:/mnt/c/Program Files/Intel/Intel(R) Management Engine Components/IPT:/mnt/c/WINDOWS/
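
Before launching any calculations, it can help to confirm that the executable is actually reachable through `PATH`. The sketch below uses only the standard library; the helper name `find_executable` is hypothetical and is not part of Libra.

```python
import os
import shutil

def find_executable(name, extra_dir=None):
    """Return the full path of `name` if it can be found, otherwise None."""
    search_path = os.environ.get("PATH", "")
    if extra_dir is not None:
        # Also look in one extra directory without modifying the environment
        search_path += os.pathsep + extra_dir
    return shutil.which(name, path=search_path)

path = find_executable("ergo")
if path is None:
    print("ergo not found on PATH; check the ergo_bin setting")
else:
    print("using ErgoSCF binary:", path)
```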

## 2. Fock matrix diagonalization approach
<a name="fock_matrix_mo"></a>[Back to TOC](#TOC)

In this approach, we compute the MOs self-consistently for any pair of geometries `R(t)` and `R(t+dt)`. As a result, we obtain the converged Fock matrices for each geometry, `F(t)` and `F(t+dt)`, as well as the AO-basis overlaps for each geometry, `S(t)` and `S(t+dt)`, which satisfy the following conditions:

$$
\begin{aligned}
F(t) C(t) &= S(t) C(t) E(t) \\
F(t+dt) C(t+dt) &= S(t+dt) C(t+dt) E(t+dt)
\end{aligned}
$$


We then compute the overlap matrix for the super-system $R(t) + R(t+dt)$ but do not do any self-consistent calculations. The resulting super-matrix has a 2x2 block structure: the diagonal blocks are the AO overlaps `S(t)` and `S(t+dt)`, and the off-diagonal blocks contain the time-overlaps of the AO basis functions:

$$S_{ij}^{AO}(t;t+dt)= \langle \chi_i(t) | \chi_j(t+dt) \rangle$$
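
As a small illustration of the block structure, the sketch below slices the time-overlap block out of a super-system overlap matrix. It assumes that the atoms of the geometry at `t` are listed before those at `t+dt`; the helper name `time_overlap_block` and the numbers are made up for the example.

```python
import numpy as np

def time_overlap_block(S_super, n_ao):
    """Return the <chi(t)|chi(t+dt)> block of a (2*n_ao x 2*n_ao) overlap matrix."""
    if S_super.shape != (2 * n_ao, 2 * n_ao):
        raise ValueError("expected a (2*n_ao) x (2*n_ao) matrix")
    # Rows: AOs of the first geometry; columns: AOs of the second geometry
    return S_super[:n_ao, n_ao:]

# Toy example with 2 AOs per geometry (made-up numbers)
S_super = np.array([
    [1.00, 0.30, 0.98, 0.29],
    [0.30, 1.00, 0.31, 0.97],
    [0.98, 0.31, 1.00, 0.32],
    [0.29, 0.97, 0.32, 1.00],
])
S_td = time_overlap_block(S_super, n_ao=2)
print(S_td)
```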

Using the MOs `C(t)` and `C(t+dt)` determined from the SCF calculations at every geometry, we can compute the time-overlaps in the MO basis:

$$S_{ij}^{MO}(t;t+dt)= \sum_{ab} { C^{*}_{ai}(t) S_{ab}^{AO}(t; t+dt) C_{bj}(t+dt) } $$
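
The transformation to the MO basis is a pair of matrix products when the MOs are stored as columns of `C`. The sketch below also shows one common finite-difference estimate of the NACs from the time-overlaps, `d_ij ~ (S_ij - S_ji) / (2 dt)`; that formula is an assumption here (valid for real orbitals), and Libra's own implementation may differ in details. Both function names are hypothetical.

```python
import numpy as np

def mo_time_overlap(C_t, S_ao_td, C_tdt):
    """S_MO(t; t+dt) = C(t)^dagger S_AO(t; t+dt) C(t+dt), MOs stored as columns."""
    return C_t.conj().T @ S_ao_td @ C_tdt

def nac_from_overlaps(S_mo_td, dt):
    """Finite-difference NAC estimate for real orbitals: (S - S^T) / (2 dt)."""
    return (S_mo_td - S_mo_td.T) / (2.0 * dt)

# Toy model: orthonormal, static AOs; the MOs rotate by a small angle in one step
theta = 0.01
C_t = np.eye(2)
C_tdt = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
S_ao_td = np.eye(2)
dt = 41.0  # timestep in a.u.

S_mo_td = mo_time_overlap(C_t, S_ao_td, C_tdt)
nac = nac_from_overlaps(S_mo_td, dt)
print(nac)
```

The resulting NAC matrix is antisymmetric by construction, so its diagonal is zero.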

In all cases, we will be using the `step2_ergoscf.run_step2` function:
<a name="run_step2-1"></a>

In [4]:
help(step2_ergoscf.run_step2)

Help on function run_step2 in module libra_py.workflows.nbra.step2_ergoscf:

run_step2(params, run1, run2)
    Calculate the overlaps, transition dipole moments, and vibronic Hamiltonian matrix elements in the AO basis
    
    Args:
        params ( dictionary ): the control parameters of the simulation
        
        * **params["dt"]** ( double ): nuclear dynamics timestep - as encoded in the trajectory [ units: a.u., default: 41.0 ]
        * **params["isnap"]** ( int ): initial frame  [ default: 0 ]
        * **params["fsnap"]** ( int ): final frame  [ default: 1 ]
        * **params["out_dir"]** ( string ): the path to the directory that will collect all the results
            If the directory doesn't exist, it will be created  [ default: "res" ]
    
        * **params["EXE"]** ( string ): path to the ErgoSCF executable [ default: ergo ]
        * **params["md_file"]** ( string ): the name of the xyz file containing the trajectory - the 
            file should be in the gener

### 2.1. Restricted HF (RHF) case
<a name="case-1"></a>[Back to TOC](#TOC)

First, let's define the function to run the ErgoSCF calculations for each geometry. 

This function essentially defines the core of the input file to the ErgoSCF code. 

In this file we define all the key parameters, such as:

* basis - in this case, STO-3G
* spin polarization - in this case we do non-spin-polarized calculations
* electronic structure method - in this case we use one of the pure density functionals "LDA"

We also instruct ErgoSCF to produce the Fock matrix in the AO basis (with **scf.create_mtx_files_F = 1**) and the AO overlap matrix (with **scf.create_mtx_file_S = 1**).

To take advantage of parallelization, we also instruct ErgoSCF to detect the number of available threads 
(with **set_nthreads("detect")**).

Note the "Angstrom" keyword following the "molecule_inline" - it indicates that the input is provided in Angstrom units.
If your input (e.g. in the xyz trajectory file) is in the a.u. (Bohr) units, just remove the "Angstrom" keyword
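
If the units of a trajectory need to be converted rather than re-declared, the conversion is a single multiplication. The sketch below is pure Python; the helper name `angstrom_to_bohr` is hypothetical, and the Bohr radius value is the CODATA 2018 figure.

```python
# 1 Bohr = 0.529177210903 Angstrom (CODATA 2018)
ANGSTROM_TO_BOHR = 1.0 / 0.529177210903

def angstrom_to_bohr(coords):
    """Convert a list of (x, y, z) tuples from Angstrom to Bohr."""
    return [tuple(x * ANGSTROM_TO_BOHR for x in xyz) for xyz in coords]

# Made-up diatomic geometry in Angstrom
coords_ang = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.6)]
coords_bohr = angstrom_to_bohr(coords_ang)
print(coords_bohr)
```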

All of the above parameters will be kept constant during the run, only the coordinates and the name of the executable will be changed in the run.


In [5]:
def scf_restricted(EXE, COORDS):
    inp = """#!/bin/sh
%s << EOINPUT > /dev/null
set_nthreads("detect")
spin_polarization = 0
molecule_inline Angstrom
%sEOF
basis = "STO-3G"
use_simple_starting_guess=1
scf.create_mtx_files_F = 1
scf.create_mtx_file_S = 1
XC.sparse_mode = 1
run "LDA"
EOINPUT
""" % (EXE, COORDS)
    return inp

We also define a similar function to run the calculations for the "weird" geometry, in which two snapshots are overlapped and form a "super-molecule". The super-molecule is composed of the molecular geometries at two adjacent timesteps, t and t+dt, as extracted from the MD trajectory file. The off-diagonal block of the resulting matrix then contains the overlaps between AOs centered at the displaced geometries - this is something we'll need to compute transition density matrices. 

Of course, the SCF on such a super-molecule doesn't make sense and will probably not converge. That's why we instruct ErgoSCF to just compute the AO overlap matrix and quit (with **scf.create_mtx_files_S_and_quit = 1**), rather than do the costly and meaningless SCF calculations.

In [6]:
def compute_AO_overlaps(EXE, COORDS):
    inp = """#!/bin/sh
%s << EOINPUT > /dev/null
spin_polarization = 0
molecule_inline Angstrom
%sEOF
basis = "STO-3G"
use_simple_starting_guess=1
scf.create_mtx_file_S = 1
scf.create_mtx_files_S_and_quit = 1
XC.sparse_mode = 1
run "LDA"
EOINPUT
""" % (EXE, COORDS)
    return inp
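
To illustrate what the super-molecule input might look like, the sketch below concatenates the coordinates of two snapshots into one coordinate string. The `label x y z` line format and both helper names are assumptions made for this example; in the actual workflow the coordinates are prepared by `run_step2` from the trajectory file.

```python
def make_coords(labels, positions):
    """Format atoms as 'label x y z' lines, one atom per line."""
    lines = ["%s  %12.6f %12.6f %12.6f" % (lab, x, y, z)
             for lab, (x, y, z) in zip(labels, positions)]
    return "\n".join(lines) + "\n"

def make_supermolecule(labels, pos_t, pos_tdt):
    """Concatenate the geometry at t with the geometry at t+dt (t first)."""
    return make_coords(labels, pos_t) + make_coords(labels, pos_tdt)

# Made-up LiH geometries at two adjacent timesteps (Angstrom)
labels = ["Li", "H"]
pos_t = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.60)]
pos_tdt = [(0.0, 0.0, 0.01), (0.0, 0.0, 1.62)]
coords = make_supermolecule(labels, pos_t, pos_tdt)
print(coords)
```

With this ordering, the first half of the AO basis belongs to the geometry at t and the second half to the geometry at t+dt.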

Ok, so now we can run the step2 calculations! 


**IMPORTANT:** before running the "run_step2", don't forget to create the corresponding output directory, since this is where the results will be printed out.

In this case:

* we expect to produce the orbitals ourselves: (**"direct_MO":0**)
* because of this, the active space orbitals are indexed in the absolute indexing convention: (**"mo_indexing_convention":"abs"**)
* we expect that the SCF is done without spin polarization (as set up in the function above): (**"spinpolarized":0**)
<a name="run_step2-2"></a>

In [7]:
params = {"EXE":"ergo", "md_file":"md1.xyz",
          "isnap":0, "fsnap":5, "dt": 1.0 * units.fs2au,
          "out_dir": "restricted_indirect",
          "mo_indexing_convention":"abs", "direct_MO":0, "spinpolarized":0
         }

os.system("mkdir restricted_indirect")
step2_ergoscf.run_step2(params, scf_restricted, compute_AO_overlaps)
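
As a portable alternative to calling `mkdir` through the shell, the output directory can be created with `os.makedirs`, which does not fail if the directory already exists. The sketch below also spells out the timestep conversion; the constant `FS_TO_AU` is assumed to match `units.fs2au`, and the helper name `prepare_run` is hypothetical.

```python
import os
import tempfile

FS_TO_AU = 41.341374575751  # 1 fs in atomic units of time (assumed equal to units.fs2au)

def prepare_run(out_dir, dt_fs):
    """Create the output directory if needed and return the timestep in a.u."""
    os.makedirs(out_dir, exist_ok=True)
    return dt_fs * FS_TO_AU

# Demonstrate in a scratch directory so that nothing is left behind
with tempfile.TemporaryDirectory() as scratch:
    out_dir = os.path.join(scratch, "restricted_indirect")
    dt_au = prepare_run(out_dir, dt_fs=1.0)
    dt_au = prepare_run(out_dir, dt_fs=1.0)  # a second call does not raise
    created = os.path.isdir(out_dir)

print(created, dt_au)
```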

### 2.2. Unrestricted HF (UHF) case
<a name="case-2"></a>[Back to TOC](#TOC)

In this case, we want to perform a spin-polarized calculation on a singlet.

In ErgoSCF, this type of calculation can be enforced with **scf.force_unrestricted = 1**.

Moreover, for the unrestricted calculations to be really meaningful, we need to break the initial symmetry of the orbitals. This is done via **scf.starting_guess_disturbance = 0.01**.

In [8]:
def scf_unrestricted(EXE, COORDS):
    inp = """#!/bin/sh
%s << EOINPUT > /dev/null
set_nthreads("detect")
spin_polarization = 0
molecule_inline Angstrom
%sEOF
basis = "STO-3G"
use_simple_starting_guess=1
scf.create_mtx_files_F = 1
scf.create_mtx_file_S = 1
XC.sparse_mode = 1
scf.force_unrestricted = 1
scf.starting_guess_disturbance = 0.01
run "LDA"
EOINPUT
""" % (EXE, COORDS)
    return inp

**IMPORTANT:** before running the "run_step2", don't forget to create the corresponding output directory, since this is where the results will be printed out.

According to our ErgoSCF setup, we also need to change some parameters passed to the **run_step2** function.

In comparison to Case 1, we change only this:

* we expect that the SCF is done with spin polarization (as set up in the function above): (**"spinpolarized":1**)

In [9]:
params = {"EXE":"ergo", "md_file":"md1.xyz",
          "isnap":0, "fsnap":5, "dt": 1.0 * units.fs2au,
          "out_dir": "unrestricted_indirect",
          "mo_indexing_convention":"abs", "direct_MO":0, "spinpolarized":1
         }

os.system("mkdir unrestricted_indirect")
step2_ergoscf.run_step2(params, scf_unrestricted, compute_AO_overlaps)

## 3. Direct ErgoSCF MOs approach
<a name="direct_mo"></a>[Back to TOC](#TOC)

Now, we set up the calculations in such a way that ErgoSCF produces the orbital eigenvalues as well as the corresponding orbitals. 

This is done with the function below. 

Note that it differs from the one in Case 1 only by:

* **scf.output_homo_and_lumo_eigenvectors = 1** - requests the printout of the orbitals and their eigenvalues.
* **scf.number_of_occupied_eigenvectors = 2** - sets how many occupied orbitals we want to produce. The LiH molecule has only 4 electrons, so it has only 2 occupied orbitals (H-1 and H), and we request 2. Don't set this number larger than what can be accommodated, or the program may not work correctly.
* **scf.number_of_unoccupied_eigenvectors = 2** - sets how many unoccupied orbitals we want to produce. The upper limit is defined by the basis size. For instance, with STO-3G on He2 there are no unoccupied orbitals at all, and with STO-3G on H2 or HF there is only 1 unoccupied orbital. In this example, the basis is large enough to request 2 unoccupied orbitals. Don't set this number larger than what can be accommodated by the basis size, or the program may not work correctly.
* **scf.eigenvectors_method = "projection"** - this parameter has to be set if the above keywords are used. The default value of this parameter won't work.
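
The limits on the two eigenvector counts follow from simple bookkeeping, sketched below for closed-shell molecules in the STO-3G basis. The helper name `orbital_counts` and the two lookup tables are illustrative and cover only the elements mentioned above.

```python
# Number of STO-3G basis functions and electrons per atom (only the elements used here)
STO3G_FUNCTIONS = {"H": 1, "He": 1, "Li": 5, "F": 5}
ATOMIC_NUMBER = {"H": 1, "He": 2, "Li": 3, "F": 9}

def orbital_counts(atoms, charge=0):
    """Return (n_occupied, n_unoccupied) spatial orbitals for a closed-shell molecule."""
    n_basis = sum(STO3G_FUNCTIONS[a] for a in atoms)
    n_electrons = sum(ATOMIC_NUMBER[a] for a in atoms) - charge
    if n_electrons % 2 != 0:
        raise ValueError("closed-shell counting needs an even number of electrons")
    n_occ = n_electrons // 2
    return n_occ, n_basis - n_occ

for name, atoms in [("He2", ["He", "He"]), ("H2", ["H", "H"]),
                    ("HF", ["H", "F"]), ("LiH", ["Li", "H"])]:
    print(name, orbital_counts(atoms))
```

The requested numbers of occupied and unoccupied eigenvectors should not exceed the two values returned for the molecule at hand.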

### 3.1. Restricted HF (RHF) case
<a name="case-3"></a>[Back to TOC](#TOC)

We start with the setups for RHF calculations

In [10]:
def scf_restricted_direct(EXE, COORDS):
    inp = """#!/bin/sh
%s << EOINPUT > /dev/null
set_nthreads("detect")
spin_polarization = 0
molecule_inline Angstrom
%sEOF
basis = "STO-3G"
use_simple_starting_guess=1
scf.create_mtx_files_F = 1
scf.create_mtx_file_S = 1
XC.sparse_mode = 1

scf.output_homo_and_lumo_eigenvectors = 1
scf.number_of_occupied_eigenvectors = 2
scf.number_of_unoccupied_eigenvectors = 2
scf.eigenvectors_method = "projection"

run "LDA"
EOINPUT
""" % (EXE, COORDS)
    return inp

Now, we need to reflect the type of calculations in our **run_step2** function

**IMPORTANT:** before running the "run_step2", don't forget to create the corresponding output directory, since this is where the results will be printed out.

In this case:

* we expect to read the MOs and orbital energies from the ErgoSCF output: (**"direct_MO":1**)
* because of this, the active space orbitals are indexed in the relative indexing convention: (**"mo_indexing_convention":"rel"**)
* we expect that the SCF is done without spin polarization (as set up in the function above): (**"spinpolarized":0**)

<a name="run_step2-3"></a>

In [11]:
params = {"EXE":"ergo", "md_file":"md1.xyz",
          "isnap":0, "fsnap":5, "dt": 1.0 * units.fs2au,
          "out_dir": "restricted_direct",
          "mo_indexing_convention":"rel", "direct_MO":1, "spinpolarized":0
         }

os.system("mkdir restricted_direct")
step2_ergoscf.run_step2(params, scf_restricted_direct, compute_AO_overlaps)

### 3.2. Unrestricted HF (UHF) case
<a name="case-4"></a>[Back to TOC](#TOC)

... and continue with the example for the unrestricted case.

This is just a hybrid of Case 2 (how to set up spin-polarized calculations in ErgoSCF and run_step2) and Case 3 (how to set up direct MO calculations to be performed by ErgoSCF).

In [12]:
def scf_unrestricted_direct(EXE, COORDS):
    inp = """#!/bin/sh
%s << EOINPUT > /dev/null
set_nthreads("detect")
spin_polarization = 0
molecule_inline Angstrom
%sEOF
basis = "STO-3G"
use_simple_starting_guess=1
scf.create_mtx_files_F = 1
scf.create_mtx_file_S = 1
XC.sparse_mode = 1
scf.force_unrestricted = 1
scf.starting_guess_disturbance = 0.01

scf.output_homo_and_lumo_eigenvectors = 1
scf.number_of_occupied_eigenvectors = 2
scf.number_of_unoccupied_eigenvectors = 2
scf.eigenvectors_method = "projection"

run "LDA"
EOINPUT
""" % (EXE, COORDS)
    return inp

In [13]:
params = {"EXE":"ergo", "md_file":"md1.xyz",
          "isnap":0, "fsnap":5, "dt": 1.0 * units.fs2au,
          "out_dir": "unrestricted_direct",
          "mo_indexing_convention":"rel", "direct_MO":1, "spinpolarized":1
         }

os.system("mkdir unrestricted_direct")
step2_ergoscf.run_step2(params, scf_unrestricted_direct, compute_AO_overlaps)
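
The four runs in this tutorial differ only in three keys and the output directory name, which the sketch below makes explicit. The helper name `make_params` is hypothetical, and `FS_TO_AU` is assumed to match `units.fs2au`; the key names and values are the ones used in the cells above.

```python
FS_TO_AU = 41.341374575751  # 1 fs in atomic units of time (assumed equal to units.fs2au)

def make_params(direct_mo, spinpolarized, md_file="md1.xyz", isnap=0, fsnap=5, dt_fs=1.0):
    """Build the run_step2 parameter dictionary for one of the four cases."""
    spin = "unrestricted" if spinpolarized else "restricted"
    kind = "direct" if direct_mo else "indirect"
    return {
        "EXE": "ergo", "md_file": md_file,
        "isnap": isnap, "fsnap": fsnap, "dt": dt_fs * FS_TO_AU,
        "out_dir": "%s_%s" % (spin, kind),
        # Direct MOs use relative indexing; Fock diagonalization uses absolute indexing
        "mo_indexing_convention": "rel" if direct_mo else "abs",
        "direct_MO": int(direct_mo), "spinpolarized": int(spinpolarized),
    }

all_params = [make_params(d, s) for d in (0, 1) for s in (0, 1)]
for p in all_params:
    print(p["out_dir"], p["mo_indexing_convention"], p["direct_MO"], p["spinpolarized"])
```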

## 4. Optional cleanup
<a name="cleanup"></a>[Back to TOC](#TOC)

To remove all the files and folders created by this tutorial, uncomment the line below.

In [14]:
#!sh ./clean.sh