This notebook has two main steps:
- First, it generates an `input.yml` file for ARC to run single-point calculations on the experimental geometries of the 16 reference species in our database.
- After the ARC calculations have completed, the notebook uses Arkane to fit atom energy corrections (AECs).

Briefly, [ARC](https://github.com/ReactionMechanismGenerator/ARC) (Automated Rate Calculator) is software developed by the RMG team for automating electronic structure calculations, which are then used to calculate thermochemical and kinetic parameters. Installation instructions are in [ARC's documentation](https://reactionmechanismgenerator.github.io/ARC/installation.html).

In [None]:
import os

import yaml

from arc.species.converter import xyz_to_str

from arkane.encorr.reference import ReferenceDatabase
from arkane.encorr.ae import AEJob  # SPECIES_LABELS is defined below as a dictionary

In [None]:
############# USER INPUT #############
LOT = {
    'method': 'CCSD(T)-F12',
    'basis': 'cc-pVTZ-F12',
    'software': 'molpro',
}
arc_project_name = 'aec_bac'
#######################################
model_chemistry = '/'.join([LOT.get(key, '') for key in ['method', 'basis']])

# Step 1: Write an ARC input file to run single-point calculations in preparation for AEC fitting

In [None]:
# load database
database = ReferenceDatabase()
database.load()

In [None]:
# create dictionary to map folder names from ARC's output (the species formulas) to species labels in the reference database
SPECIES_LABELS = {
    'Br2': 'Dibromine',
    'BrH': 'Hydrogen bromide',
    'CH3': 'Methyl',
    'CH3Cl': 'Chloromethane',
    'CH4': 'Methane',
    'Cl2': 'Dichlorine',
    'ClH': 'Hydrogen chloride',
    'F2': 'Difluorine',
    'FH': 'Hydrogen fluoride',
    'H2': 'Dihydrogen',
    'H2O': 'Water',
    'H2S': 'Hydrogen sulfide',
    'H3N': 'Ammonia',
    'N2': 'Dinitrogen',
    'O2': 'Dioxygen',
    'S2': 'Disulfur',
}

In [None]:
# load experimental geometries from reference database
exp_geometries = {}
for label in SPECIES_LABELS.values():
    ref_spec = database.get_species_from_label(label)[0]
    xyz = ref_spec.reference_data['CCCBDB'].xyz_dict
    exp_geometries[label] = xyz
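To make the shape of each geometry concrete, the sketch below shows the ARC-style xyz dictionary layout (parallel `symbols`, `isotopes`, and `coords` entries) and how it can be flattened to text. The helper `xyz_dict_to_str` is a hypothetical stand-in for `xyz_to_str`, written only so the example runs without ARC installed. Its column widths are an assumption, and the water coordinates are illustrative rather than values from the reference database.

```python
# Assumed layout of an ARC-style xyz dictionary: parallel tuples of element
# symbols, isotope mass numbers, and Cartesian coordinates in Angstrom.
# The numbers below are illustrative, not values from the reference database.
water_xyz = {
    'symbols': ('O', 'H', 'H'),
    'isotopes': (16, 1, 1),
    'coords': ((0.0, 0.0, 0.1173),
               (0.0, 0.7572, -0.4692),
               (0.0, -0.7572, -0.4692)),
}


def xyz_dict_to_str(xyz_dict):
    """Hypothetical stand-in for arc.species.converter.xyz_to_str (format assumed)."""
    rows = []
    for symbol, (x, y, z) in zip(xyz_dict['symbols'], xyz_dict['coords']):
        # One line per atom: element symbol followed by fixed-width coordinates.
        rows.append(f'{symbol:<4}{x:14.8f}{y:14.8f}{z:14.8f}')
    return '\n'.join(rows)


print(xyz_dict_to_str(water_xyz))
```

The string has one line per atom, which is why the input-file cell below has to indent every line after the first to fit under `xyz: |`.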

In [None]:
# define input file structure
input_file = f"""project: {arc_project_name}
sp_level: {model_chemistry}

compute_thermo: false

job_types:
  conformers: false
  opt: false
  fine_grid: false
  freq: false
  sp: true
  rotors: false

species:"""

species = """
  - label: {formula}
    smiles: '{smiles}'
    charge: {charge}
    multiplicity: {multiplicity}
    xyz: |
      {xyz}"""

In [None]:
# create input yaml file for ARC
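for label, xyz_dict in exp_geometries.items():
    xyz = xyz_to_str(xyz_dict)

    # the species template already indents the first coordinate line,
    # so only the remaining lines need the block-scalar indentation
    xyz_indented = ''
    for i, line in enumerate(xyz.split('\n')):
        if i == 0:
            xyz_indented += f'{line}\n'
        else:
            xyz_indented += f'      {line}\n'
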
    ref_spec = database.get_species_from_label(label)[0]
    charge = ref_spec.charge
    multiplicity = ref_spec.multiplicity
    formula = ref_spec.formula
    smiles = ref_spec.smiles
    
    input_file += species.format(formula=formula,
                                 smiles=smiles,
                                 charge=charge,
                                 multiplicity=multiplicity,
                                 xyz=xyz_indented)

In [None]:
with open("input.yml", "w") as f:
    f.write(input_file)
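Because the geometry is written as a YAML block scalar (`xyz: |`), every coordinate line has to be indented deeper than the `xyz` key. The sketch below is one alternative way to get that indentation, using `textwrap.indent` from the standard library instead of the manual loop. The template and the `render_species` helper are hypothetical and only mirror the species entry used above, and the water geometry is illustrative.

```python
import textwrap

# Hypothetical template mirroring the notebook's species entry. Here {xyz}
# starts at column 0 because textwrap.indent() indents every line for us.
SPECIES_TEMPLATE = """
  - label: {label}
    smiles: '{smiles}'
    charge: {charge}
    multiplicity: {multiplicity}
    xyz: |
{xyz}"""


def render_species(label, smiles, charge, multiplicity, xyz):
    """Return one YAML list entry with the geometry as an indented block scalar."""
    xyz_block = textwrap.indent(xyz.strip('\n'), ' ' * 6)
    return SPECIES_TEMPLATE.format(label=label, smiles=smiles, charge=charge,
                                   multiplicity=multiplicity, xyz=xyz_block)


# Illustrative geometry only, not taken from the reference database.
water_xyz = "O 0.0 0.0 0.1173\nH 0.0 0.7572 -0.4692\nH 0.0 -0.7572 -0.4692"
entry = render_species('H2O', 'O', 0, 1, water_xyz)
print(entry)
```

Printing an entry like this before running ARC is a cheap way to spot a coordinate line that has fallen out of the block.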

Now run ARC on this input file to obtain single-point energies for the experimental geometries.

# Step 2: After the single-point calculations have finished, fit the AECs

In [None]:
from arkane.ess.factory import ess_factory
from arkane.modelchem import LevelOfTheory
import rmgpy.constants as constants

In [None]:
# define path to the "Species" folder from ARC's output
sp_dir = 'calcs/Species/'

`arkane/encorr/ae.py` says:

```
Notes:
    The species energies should be provided as a dictionary
    containing the species labels as keys and their single-
    point electronic energies in Hartree as values. The
    energies should be calculated using the experimental
    geometry provided for the species in the reference
    database, and **the zero-point energy should not be included
    in the electronic energy.**
            
```

The `load_energy()` method of Arkane's ESS adapters returns the energy in J/mol and does NOT include the zero-point energy, so the only step needed before fitting is to convert the energy from J/mol to Hartree.
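As a worked example of that conversion, the sketch below divides a molar energy by the Hartree energy times Avogadro's number. The constants are hard-coded CODATA 2018 values so the snippet runs without RMG-Py. The notebook takes them from `rmgpy.constants`, whose values may differ in the last digits, and the function name is hypothetical.

```python
# CODATA 2018 values, hard-coded so the sketch runs without RMG-Py installed;
# the notebook itself uses constants.E_h and constants.Na from rmgpy.constants.
HARTREE_IN_JOULE = 4.3597447222071e-18  # J per Hartree
AVOGADRO = 6.02214076e23                # particles per mol


def j_per_mol_to_hartree(energy_j_per_mol):
    """Convert a molar energy in J/mol to a per-molecule energy in Hartree."""
    return energy_j_per_mol / (HARTREE_IN_JOULE * AVOGADRO)


# One Hartree per molecule expressed as a molar energy.
one_hartree = HARTREE_IN_JOULE * AVOGADRO
print(f'{one_hartree / 1000:.2f} kJ/mol per Hartree')  # 2625.50 kJ/mol per Hartree
print(j_per_mol_to_hartree(one_hartree))
```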

In [None]:
# get the energy for each QM sp job
species_energies = {}
species_dirs = sorted(d for d in os.listdir(sp_dir) if os.path.isdir(os.path.join(sp_dir, d)))
print(len(species_dirs))
print(species_dirs)
for directory in species_dirs:
    # the sp job output sits in a subfolder of the species folder, so search for output.out
    for root, _, files in os.walk(os.path.join(sp_dir, directory)):
        if 'output.out' not in files:
            continue
        log = ess_factory(os.path.join(root, 'output.out'))
        energy = log.load_energy()  # J/mol, without the zero-point energy
        energy = energy / (constants.E_h * constants.Na)  # convert from J/mol to Hartrees
        species_energies[SPECIES_LABELS[directory]] = energy
species_energies
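Before fitting, it can help to confirm that every reference species actually produced an energy, since a failed or missing job would otherwise go unnoticed. The sketch below is one possible check. The helper `find_missing` is hypothetical, and the labels and energies are a made-up subset used only for illustration.

```python
def find_missing(expected_labels, energies):
    """Return the expected labels that have no energy yet, sorted for stable output."""
    return sorted(set(expected_labels) - set(energies))


# Illustrative subset of the folder-name -> label mapping and made-up energies.
labels = {'H2': 'Dihydrogen', 'H2O': 'Water', 'CH4': 'Methane'}
energies = {'Dihydrogen': -1.17, 'Water': -76.36}

missing = find_missing(labels.values(), energies)
if missing:
    print(f'No energy found for: {", ".join(missing)}')
```

In the notebook, the same call would be `find_missing(SPECIES_LABELS.values(), species_energies)`.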

In [None]:
lot = LevelOfTheory(**LOT)

In [None]:
ae = AEJob(species_energies=species_energies,
           level_of_theory=lot,
           )

In [None]:
output_file = 'AEC_' + '_'.join([LOT.get(key, '') for key in ['method', 'basis']])
ae.execute(output_file=f'{output_file}.out')

Copy the values from the output file into `RMG-database/input/quantum_corrections/data.py`.