# Loading our results 
This tutorial explains how to load the results that were published with our paper. The data can be found [here](https://doi.org/10.18126/unc8-336t). In this tutorial we'll use a snippet of that data, which contains a few tens of species that have experimental data. In a later tutorial we'll see how to load results that you run on your own molecules!

Let's first start with imports:

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import pickle
import json
import nglview as nv




There are three types of files that we can load. The file ending with `.json` contains detailed information about transition states (TSs), singlet-triplet crossings, etc., but does not contain any geometries. The two files ending with `.pickle` contain the most important summary data about TSs and crossings, plus the actual geometries.

In the file ending with `_atoms.pickle`, the geometries are represented as [ASE](https://wiki.fysik.dtu.dk/ase/) atoms. In the file ending with `_xyz.pickle`, the geometries are given in the `xyz` format.

Let's load the file with ASE atoms objects:

In [3]:
with open("data/ts_query_atoms.pickle", 'rb') as f:
    info = pickle.load(f)


The keys in the dictionary are SMILES strings (we use the *cis* isomer for each species):

In [4]:
keys = list(info.keys())
key = keys[0]
print(key)

Fc1ccccc1/N=N\c1ccccc1F


Each value is a sub-dictionary with results:

In [5]:
sub_dic = info[key]
display(sub_dic.keys())

dict_keys(['results_by_mechanism', 'summary', 'stable', 'unstable', 'cis_smiles', 'trans_smiles'])

## Miscellaneous keys

The `cis_smiles` and `trans_smiles` keys tell you the SMILES strings of the *cis* and *trans* isomers.  The `stable` and `unstable` keys tell you which isomer is thermodynamically stable, and which is unstable. Usually the unstable one is *cis* and that's also the case here:

In [6]:
display({key: val for key, val in sub_dic.items() if key not in ['results_by_mechanism', 'summary']})

{'stable': 'trans',
 'unstable': 'cis',
 'cis_smiles': 'Fc1ccccc1/N=N\\c1ccccc1F',
 'trans_smiles': 'Fc1ccccc1/N=N/c1ccccc1F'}

## `results_by_mechanism`

`results_by_mechanism` has results for each of the four mechanisms for each species. Let's see what it looks like:

In [7]:
display(sub_dic['results_by_mechanism'].keys())
display(sub_dic['results_by_mechanism']['cis'].keys())

dict_keys(['trans', 'cis'])

dict_keys(['s_t_crossing', 'ts'])

`results_by_mechanism` is divided into *cis* and *trans*. These provide activation free energies (and energies, enthalpies, etc.) relative to the *cis* and *trans* isomers, respectively. The classical activation free energies are in `ts`, while the effective activation free energies from intersystem crossing are in `s_t_crossing`.

### Classical transition states

Let's look at the classical transition states (TSs) first:

In [8]:
cis_dic = sub_dic['results_by_mechanism']['cis']
ts_list = cis_dic['ts']

print(type(ts_list))
print(len(ts_list))
display(list(ts_list[0].keys()))

<class 'list'>
4


['ts_geom_id',
 'endpoint_geom_id',
 'delta_free_energy',
 'delta_energy',
 'delta_enthalpy',
 'delta_conf_free_energy',
 'delta_avg_conf_energy',
 'delta_entropy_j_mol_k',
 'delta_conf_entropy_j_mol_k',
 'delta_vib_entropy_j_mol_k',
 'endpoint_conf_free_energy',
 'ts_conf_free_energy',
 'ts_atoms',
 'endpoint_atoms']

We see that `ts` is a list of 4 dictionaries, one for each mechanism. Here is the meaning of each key (ignore anything with `geom_id` in it):

- `delta_<quantity>`: Difference in `<quantity>` between the TS and the *cis* isomer (*cis* because we're looking at `sub_dic['results_by_mechanism']['cis']`). Different quantities are:
    - `energy`
    - `enthalpy`
    - `conf_free_energy`: Conformational free energy ($-TS_{\mathrm{conf}}$, where $S_{\mathrm{conf}}$ is the conformational entropy)
    - `entropy`
    - `conf_entropy`: Conformational entropy
    - `vib_entropy`: Vibrational entropy (entropy without the conformational component)
    - `avg_conf_energy`: The average energy of the conformer ensemble, relative to the lowest energy conformer. Given in kcal/mol, and equal to $\sum_i p_i (E_i-E_{\mathrm{min}})$, where the sum is over conformers, $p_i = \mathrm{exp} (-E_i/k_{\mathrm{B}} T) \ / \ [\sum_j \mathrm{exp}(-E_j/k_{\mathrm{B}} T)]$ is the probability of the $i^{\mathrm{th}}$ conformer, and $E_{\mathrm{min}}$ is the lowest energy in the ensemble.

- `endpoint_<quantity>`: the value of a quantity for an endpoint (i.e., reactant or product). Since we're looking at `sub_dic['results_by_mechanism']['cis']`, the endpoint here is *cis*. If `<quantity>` is `atoms`, then the value is an ASE atoms object.

- `ts_<quantity>`: the value of a quantity for the TS
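
To make the definition of `avg_conf_energy` concrete, here is a minimal sketch of the Boltzmann-weighted average for a small conformer ensemble. The three energies are hypothetical values chosen for illustration, not numbers from the dataset, and the temperature is assumed to be 298.15 K.

```python
import math

# Hypothetical conformer energies in kcal/mol (illustrative, not from the dataset)
energies = [0.0, 0.5, 1.2]

K_B_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol K)
T = 298.15               # assumed temperature in K

e_min = min(energies)

# Boltzmann weights; shifting every energy by e_min leaves the probabilities unchanged
weights = [math.exp(-(e - e_min) / (K_B_KCAL * T)) for e in energies]
partition = sum(weights)
probs = [w / partition for w in weights]

# Ensemble-average energy relative to the lowest-energy conformer
avg_conf_energy = sum(p * (e - e_min) for p, e in zip(probs, energies))

print(probs)
print(avg_conf_energy)
```

Because the probabilities depend only on energy differences, subtracting `e_min` before exponentiating gives the same result and avoids overflow for large absolute energies.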

**Units**: By default, all *absolute* energies are given in Hartrees (atomic units), while all *relative* energies are given in kcal/mol. If `j_mol_k` is specified, then entropies are given in units of $\mathrm{J} / (\mathrm{mol \  K})$. Otherwise, both absolute and relative entropies are given in atomic units as $T \cdot S$, where $T=298.15 \ \mathrm{K}$ is the temperature.
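
As a quick sanity check on these unit conventions, the sketch below recomputes relative quantities from the values of the first TS dictionary printed in this tutorial. The conversion factors (627.509 kcal/mol per Hartree, 4184 J per kcal) are assumptions about what was used to build the dataset; they reproduce the stored numbers to within rounding.

```python
HA_TO_KCAL = 627.509     # assumed Hartree -> kcal/mol conversion
J_TO_KCAL = 1 / 4184.0   # J -> kcal
T = 298.15               # K

# Values copied from the first TS dictionary shown in this tutorial
ts_dic = {
    'delta_free_energy': 30.40815908886956,
    'delta_enthalpy': 30.69544174426622,
    'delta_conf_free_energy': -0.26768503575758185,
    'delta_entropy_j_mol_k': 4.031496327954598,
    'delta_conf_entropy_j_mol_k': 3.7564789186977112,
    'endpoint_conf_free_energy': -0.001018021300492071,
    'ts_conf_free_energy': -0.0014446048805802957,
}

# Absolute values are in Hartree, so their difference must be converted to kcal/mol
delta_from_abs = (ts_dic['ts_conf_free_energy']
                  - ts_dic['endpoint_conf_free_energy']) * HA_TO_KCAL

# The conformational free energy is -T * S_conf
delta_from_entropy = -T * ts_dic['delta_conf_entropy_j_mol_k'] * J_TO_KCAL

# The free energy is H - T * S
free_energy = ts_dic['delta_enthalpy'] - T * ts_dic['delta_entropy_j_mol_k'] * J_TO_KCAL

print(delta_from_abs, delta_from_entropy, ts_dic['delta_conf_free_energy'])
print(free_energy, ts_dic['delta_free_energy'])
```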

Let's look at each mechanism and visualize the TSs:



In [9]:
for ts_dic in ts_list:
    display({key: val for key, val in ts_dic.items() if 'geom_id' not in key
             and 'atoms' not in key})
    trj = [ts_dic[key] for key in ['endpoint_atoms', 'ts_atoms']]
    display(nv.show_asetraj(trj))

{'delta_free_energy': 30.40815908886956,
 'delta_energy': 32.984694537517385,
 'delta_enthalpy': 30.69544174426622,
 'delta_conf_free_energy': -0.26768503575758185,
 'delta_avg_conf_energy': 0.23053687616104576,
 'delta_entropy_j_mol_k': 4.031496327954598,
 'delta_conf_entropy_j_mol_k': 3.7564789186977112,
 'delta_vib_entropy_j_mol_k': 0.2750174092568528,
 'endpoint_conf_free_energy': -0.001018021300492071,
 'ts_conf_free_energy': -0.0014446048805802957}

NGLWidget(max_frame=1)

{'delta_free_energy': 34.65076455645786,
 'delta_energy': 36.367195562381816,
 'delta_enthalpy': 34.98019783151161,
 'delta_conf_free_energy': -0.26768503575758185,
 'delta_avg_conf_energy': 0.23053687616104576,
 'delta_entropy_j_mol_k': 4.6230046044771065,
 'delta_conf_entropy_j_mol_k': 3.7564789186977112,
 'delta_vib_entropy_j_mol_k': 0.8665256857793607,
 'endpoint_conf_free_energy': -0.001018021300492071,
 'ts_conf_free_energy': -0.0014446048805802957}

NGLWidget(max_frame=1)

{'delta_free_energy': 34.64967445891508,
 'delta_energy': 36.36717288290034,
 'delta_enthalpy': 34.97874634469739,
 'delta_conf_free_energy': -0.26768503575758185,
 'delta_avg_conf_energy': 0.23053687616104576,
 'delta_entropy_j_mol_k': 4.61793315483195,
 'delta_conf_entropy_j_mol_k': 3.7564789186977112,
 'delta_vib_entropy_j_mol_k': 0.861454236134204,
 'endpoint_conf_free_energy': -0.001018021300492071,
 'ts_conf_free_energy': -0.0014446048805802957}

NGLWidget(max_frame=1)

{'delta_free_energy': 30.407467401305883,
 'delta_energy': 32.98490621267779,
 'delta_enthalpy': 30.69855364402701,
 'delta_conf_free_energy': -0.26768503575758185,
 'delta_avg_conf_energy': 0.23053687616104576,
 'delta_entropy_j_mol_k': 4.0848728477116065,
 'delta_conf_entropy_j_mol_k': 3.7564789186977112,
 'delta_vib_entropy_j_mol_k': 0.3283939290138611,
 'endpoint_conf_free_energy': -0.001018021300492071,
 'ts_conf_free_energy': -0.0014446048805802957}

NGLWidget(max_frame=1)

We see two inversion mechanisms and two rotations. The rotational barrier (about 30.4 kcal/mol) is roughly 4.2 kcal/mol lower than the inversion barrier (about 34.7 kcal/mol).
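
Rather than reading the barriers off by eye, we can pick out the lowest one programmatically. The sketch below uses a stand-in `ts_list` that holds only the four `delta_free_energy` values printed above; with the real data you would pass `cis_dic['ts']` directly.

```python
# delta_free_energy values (kcal/mol) copied from the four outputs above
ts_list = [
    {'delta_free_energy': 30.40815908886956},
    {'delta_free_energy': 34.65076455645786},
    {'delta_free_energy': 34.64967445891508},
    {'delta_free_energy': 30.407467401305883},
]

# Index and dictionary of the mechanism with the lowest activation free energy
best_idx, best_ts = min(enumerate(ts_list),
                        key=lambda pair: pair[1]['delta_free_energy'])

# Spread between the highest and lowest barriers
barriers = [d['delta_free_energy'] for d in ts_list]
gap = max(barriers) - min(barriers)

print(best_idx, best_ts['delta_free_energy'])
print(round(gap, 2))
```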

### Singlet-triplet crossings

Now let's look at singlet-triplet crossings:

In [10]:
s_t_list = cis_dic['s_t_crossing']
display(s_t_list)


[[{'s_t_crossing_geom_id': 243495721,
   'endpoint_geom_id': 223213660,
   'delta_free_energy': 26.069174272403334,
   'delta_enthalpy': 26.356456927799986,
   'delta_conf_free_energy': -0.26768503575758185,
   'delta_avg_conf_energy': 0.23053687616104576,
   'delta_eff_free_energy': 28.492724274717613,
   'delta_entropy_j_mol_k': 4.031496327954598,
   'delta_conf_entropy_j_mol_k': 3.7564789186977112,
   'delta_vib_entropy_j_mol_k': 0.2750174092568528,
   'delta_eff_entropy_j_mol_k': -29.978677107171762,
   'endpoint_conf_free_energy': -0.001018021300492071,
   's_t_crossing_conf_free_energy': -0.0014446048805802957,
   'endpoint': 'cis',
   't_isc': 5.83526486568076e-12,
   's_t_crossing_atoms': Atoms(symbols='FC6N2C6FH8', pbc=False),
   'endpoint_atoms': Atoms(symbols='FC6N2C6FH8', pbc=False)},
  {'s_t_crossing_geom_id': 243495723,
   'endpoint_geom_id': 248437562,
   'delta_free_energy': 25.226068091148722,
   'delta_enthalpy': 25.513350746545374,
   'delta_conf_free_energy': -0.267

Many of the keys are the same as for TSs, but the structure is different. Here we have a list of lists. Each sub-list has two dictionaries, one for the crossing closer to *cis*, and one for the crossing closer to *trans*. The closer isomer is indicated by `endpoint`. We can test to see if that's true:

In [11]:
trj = [d['s_t_crossing_atoms'] for d in s_t_list[0]]
display(nv.show_asetraj(trj))
diheds = [i.get_dihedral(9, 10, 11, 12) for i in trj]
print(diheds)


NGLWidget(max_frame=1)

[0.1832232198624463, 0.4662472575493782]


As promised, the first geometry is closer to *cis* (the first dictionary has `'endpoint': 'cis'`), and the second is closer to *trans*. We can see that both visually and from looking at the dihedral angles, which are $75^{\circ} < 90^{\circ} $ and $105^{\circ} > 90^{\circ}$, respectively.

There are also some keys that aren't in the TS dictionary:

- `delta_eff_free_energy`: The effective free energy difference between the crossing geometry and the higher energy relaxed isomer (in this case, the *cis* isomer). It is given by

$\Delta G^{\mathrm{X, eff}} = \Delta H^{\mathrm{X}} - T \Delta S^{\mathrm{app}}$

$\Delta S^{\mathrm{app}} = \Delta S^{\mathrm{X}} + k_{\mathrm{B}} \left( \mathrm{ln} \frac{h k_{\mathrm{ISC}} }{k_{\mathrm{B}} T } - \frac{1}{2} \right)$


 Here $\Delta H^{\mathrm{X}}$ is the difference in enthalpy between the crossing point ("`X`") and the endpoint. $\Delta S^{\mathrm{app}}$ is the entropy that would be inferred from experiment if one assumed an Arrhenius TS relation. It includes the actual entropy difference, $\Delta S^{\mathrm{X}}$, and a term related to the intersystem crossing rate, $k_{\mathrm{ISC}}$.

- `delta_eff_entropy`: the effective (or apparent) activation entropy defined above 
- `s_t_crossing_conf_free_energy`: the conformational free energy at the crossing point
- `t_isc`: the intersystem crossing time, defined as $1/k_{\mathrm{ISC}}$, where $k_{\mathrm{ISC}}$ is the intersystem crossing rate. Given in seconds.
    - Note that `t_isc` is about 1.3 times longer for the crossing on the *cis* side than for the crossing on the *trans* side, so the rate is about 1.3 times smaller. This is because the rate gets multiplied by 2 when going from singlet to triplet (*cis* side), and by 3 when going from triplet to singlet (*trans* side). The ratio is not exactly 1.5, since the different potential energy surfaces on either side also affect the rates.
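
The effective quantities can be recomputed from the other entries, which is a useful check on how the keys relate. The sketch below assumes $\Delta G^{\mathrm{X, eff}} = \Delta H^{\mathrm{X}} - T \Delta S^{\mathrm{app}}$ with $T = 298.15$ K, a natural logarithm, the molar gas constant in place of $k_{\mathrm{B}}$ for per-mole entropies, and 4184 J per kcal. Under those assumptions it reproduces the stored *cis*-side values to within rounding.

```python
import math

H = 6.62607015e-34     # Planck constant, J s
K_B = 1.380649e-23     # Boltzmann constant, J/K
R = 8.314462618        # molar gas constant, J/(mol K)
T = 298.15             # K
J_TO_KCAL = 1 / 4184.0

# Values copied from the cis-side crossing dictionary shown above
crossing = {
    'delta_enthalpy': 26.356456927799986,        # kcal/mol
    'delta_entropy_j_mol_k': 4.031496327954598,  # J/(mol K)
    't_isc': 5.83526486568076e-12,               # s
}
t_isc_trans = 4.428758216975692e-12              # s, trans-side crossing

k_isc = 1.0 / crossing['t_isc']

# Apparent activation entropy, in J/(mol K)
delta_s_app = crossing['delta_entropy_j_mol_k'] + R * (
    math.log(H * k_isc / (K_B * T)) - 0.5)

# Effective activation free energy, in kcal/mol
delta_g_eff = crossing['delta_enthalpy'] - T * delta_s_app * J_TO_KCAL

# Ratio of intersystem crossing times, cis side over trans side
ratio = crossing['t_isc'] / t_isc_trans

print(delta_s_app)   # compare with delta_eff_entropy_j_mol_k
print(delta_g_eff)   # compare with delta_eff_free_energy
print(ratio)
```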



## `summary` 

Now that we've seen `results_by_mechanism`, let's take a look at `summary`:

In [12]:
summary = sub_dic["summary"]
display(summary.keys())


dict_keys(['s_t_crossing', 'trans', 'cis'])

`summary` has three keys:
- `trans`: Information about the TS free energy, energy, etc. relative to the *trans* isomer
- `cis`: Information about the TS free energy, energy, etc. relative to the *cis* isomer
- `s_t_crossing`: summary information about singlet-triplet crossing


###  `trans`  and `cis`
`summary["trans"]` and `summary["cis"]` each look a lot like one of our mechanism dictionaries, but they are specifically for the TS with the lowest free energy:

In [13]:
display(list(summary["trans"].keys()))
print(summary["trans"].keys() == summary["cis"].keys())

['ts_geom_id',
 'endpoint_geom_id',
 'delta_free_energy',
 'delta_energy',
 'delta_enthalpy',
 'delta_conf_free_energy',
 'delta_avg_conf_energy',
 'delta_entropy_j_mol_k',
 'delta_conf_entropy_j_mol_k',
 'delta_vib_entropy_j_mol_k',
 'endpoint_conf_free_energy',
 'ts_conf_free_energy',
 'ts_atoms',
 'endpoint_atoms']

True


### `s_t_crossing`

Now let's look at `s_t_crossing`:

In [14]:
display(list(summary["s_t_crossing"].keys()))
display(list(summary["s_t_crossing"]["unstable_side"].keys()))
print(summary["s_t_crossing"]["unstable_side"].keys() == summary["s_t_crossing"]["stable_side"].keys())

['unstable_side', 'stable_side']

['s_t_crossing_geom_id',
 'endpoint_geom_id',
 'delta_free_energy',
 'delta_enthalpy',
 'delta_conf_free_energy',
 'delta_avg_conf_energy',
 'delta_eff_free_energy',
 'delta_entropy_j_mol_k',
 'delta_conf_entropy_j_mol_k',
 'delta_vib_entropy_j_mol_k',
 'delta_eff_entropy_j_mol_k',
 'endpoint_conf_free_energy',
 's_t_crossing_conf_free_energy',
 'endpoint',
 't_isc',
 's_t_crossing_atoms',
 'endpoint_atoms']

True


We see that `s_t_crossing` has two separate dictionaries, one with free energies (and effective free energies, enthalpies, etc.) relative to the unstable isomer, and one relative to the stable isomer.

In [15]:
for key, dic in summary["s_t_crossing"].items():
    print(key)
    display({key: val for key, val in dic.items() if 'geom_id' not in key
             and 'atoms' not in key})
    trj = [dic[key] for key in ['endpoint_atoms', 's_t_crossing_atoms']]
    display(nv.show_asetraj(trj))

unstable_side


{'delta_free_energy': 26.069174272403334,
 'delta_enthalpy': 26.356456927799986,
 'delta_conf_free_energy': -0.26768503575758185,
 'delta_avg_conf_energy': 0.23053687616104576,
 'delta_eff_free_energy': 28.492724274717613,
 'delta_entropy_j_mol_k': 4.031496327954598,
 'delta_conf_entropy_j_mol_k': 3.7564789186977112,
 'delta_vib_entropy_j_mol_k': 0.2750174092568528,
 'delta_eff_entropy_j_mol_k': -29.978677107171762,
 'endpoint_conf_free_energy': -0.001018021300492071,
 's_t_crossing_conf_free_energy': -0.0014446048805802957,
 'endpoint': 'cis',
 't_isc': 5.83526486568076e-12}

NGLWidget(max_frame=1)

stable_side


{'delta_free_energy': 25.226068091148722,
 'delta_enthalpy': 25.513350746545374,
 'delta_conf_free_energy': -0.26768503575758185,
 'delta_avg_conf_energy': 0.23053687616104576,
 'delta_eff_free_energy': 27.486210492693427,
 'delta_entropy_j_mol_k': 4.031496327954598,
 'delta_conf_entropy_j_mol_k': 3.7564789186977112,
 'delta_vib_entropy_j_mol_k': 0.2750174092568528,
 'delta_eff_entropy_j_mol_k': -27.68554478579036,
 'endpoint_conf_free_energy': -0.001018021300492071,
 's_t_crossing_conf_free_energy': -0.0014446048805802957,
 'endpoint': 'trans',
 't_isc': 4.428758216975692e-12}

NGLWidget(max_frame=1)
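
Since `summary` has the same layout for every species, it is easy to collect the headline numbers into a flat table. The sketch below is one possible way to do that; the helper name `summarize` and the row field names are hypothetical. The stand-in `info` contains a single species, and its `summary['cis']['delta_free_energy']` is assumed to equal the lowest of the four TS barriers listed earlier.

```python
def summarize(info):
    """Collect the lowest TS barrier and the effective ISC barrier per species."""
    rows = []
    for smiles, sub_dic in info.items():
        summary = sub_dic['summary']
        unstable = sub_dic['unstable']  # 'cis' or 'trans'
        crossing = summary['s_t_crossing']['unstable_side']
        rows.append({
            'smiles': smiles,
            'ts_dg': summary[unstable]['delta_free_energy'],
            'isc_eff_dg': crossing['delta_eff_free_energy'],
        })
    return rows


# Minimal stand-in for the loaded pickle, holding only the keys used above
info = {
    'Fc1ccccc1/N=N\\c1ccccc1F': {
        'unstable': 'cis',
        'summary': {
            # assumed: lowest of the four TS barriers printed earlier
            'cis': {'delta_free_energy': 30.407467401305883},
            's_t_crossing': {
                'unstable_side': {'delta_eff_free_energy': 28.492724274717613},
            },
        },
    },
}

rows = summarize(info)
for row in rows:
    print(row)
```

With the real data, the same function can be applied to the dictionary loaded from `ts_query_atoms.pickle`.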

## Detailed data

We now know our way around the `_atoms.pickle` file. The `_xyz.pickle` file is exactly the same, except that geometries are stored as `xyz` coordinates instead of ASE Atoms objects; the two representations can easily be converted into each other. Now let's look at the `json` file, which contains some more detailed information.

We'll start by loading it:

In [16]:
with open("data/ts_query.json", 'rb') as f:
    json_info = json.load(f)

Let's take a look at the keys in each sub-dictionary:

In [17]:
key = list(json_info.keys())[0]
sub_dic = json_info[key]
display(list(sub_dic.keys()))

['transition_states',
 'exp_data',
 'results_by_mechanism',
 'summary',
 'stable',
 'unstable',
 'trans',
 'cis',
 'cis_smiles',
 'trans_smiles']

Here we see some new keys:
- `exp_data`: Experimental thermal lifetime data, if available
- `transition_states`: Detailed information about all optimized TSs for this species
- `cis`: Detailed information about the optimized *cis* conformer
- `trans`: Detailed information about the optimized *trans* conformer

### `exp_data`

Let's take a look at the experimental data:

In [18]:
sub_dic = json_info[list(json_info.keys())[0]]

display(sub_dic['cis_smiles'])
display(sub_dic['exp_data'])


'Fc1ccccc1/N=N\\c1ccccc1F'

[{'DOI': '10.1002/chem.201404649',
  'dielec': 37.5,
  'solvent': 'acetonitrile',
  'temperature_C': 60.0,
  'lifetime_hours': 14.950550056524747,
  'reported_dS_cal': None,
  'inferred_dG_kcal': 26.7894153916709,
  'reported_dE_kcal': None,
  'reported_dH_kcal': None,
  'temperature_kelvin': 333.15,
  'reported_dS_cal_err': None,
  'reported_dE_kcal_err': None,
  'reported_dH_kcal_err': None,
  'dG_room_temp_from_S_and_H': None,
  'dG_room_temp_from_S_and_H_err': None}]

The data is given as a list of dictionaries. Each dictionary contains information from a different literature source. The dictionary keys have the following meaning:

- `DOI`: DOI of the literature source. You can access the original article by navigating to `doi.org/<DOI>`.
- `dielec`: Dielectric constant of the solvent used
- `solvent`: Name of the solvent
- `temperature_C`: Temperature at which the experiment was performed, in degrees Celsius
- `temperature_kelvin`: Temperature at which the experiment was performed, in Kelvin
- `lifetime_hours`: Lifetime in hours. Given by $1 / k$, where $k$ is the reaction rate. Note that this is the $1/e$ lifetime, related to the half-life through $\mathrm{half \ life} = \mathrm{lifetime } \cdot \mathrm{ln}(2)$.
- `inferred_dG_kcal`: Activation free energy. Given in kcal/mol and inferred from the experimental lifetime through $\Delta G = -k_{\mathrm{B}} T \cdot \mathrm{ln}(k \cdot \frac{h}{k_{\mathrm{B}} T})$, where $T$ is the measurement temperature.
- `reported_dS_cal`: reported $\Delta S$ in $\mathrm{cal}/\mathrm{mol \ K}$, if given
- `reported_dE_kcal`: reported $\Delta E$ in $\mathrm{kcal}/\mathrm{mol}$, if given
- `reported_dH_kcal`: reported $\Delta H$ in $\mathrm{kcal}/\mathrm{mol}$, if given
- `dG_room_temp_from_S_and_H`: $\Delta G = \Delta H - T\Delta S$, computed if $\Delta H$ and $\Delta S$ were reported experimentally
- `<item>_err`: The reported uncertainty (error) in `<item>`
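
The relation between the measured lifetime and `inferred_dG_kcal` can be sketched in a few lines. The function names below are hypothetical, and the physical constants are standard values that may differ slightly from those used to build the dataset, so the result agrees with the stored value only to within a few thousandths of a kcal/mol. The half-life uses the first-order decay relation, half-life = lifetime times ln(2).

```python
import math

H = 6.62607015e-34    # Planck constant, J s
K_B = 1.380649e-23    # Boltzmann constant, J/K
R_KCAL = 1.987204e-3  # gas constant, kcal/(mol K)


def lifetime_to_dg(lifetime_hours, temperature_kelvin):
    """Activation free energy (kcal/mol) inferred from a 1/e lifetime."""
    k = 1.0 / (lifetime_hours * 3600.0)  # rate in 1/s
    return -R_KCAL * temperature_kelvin * math.log(
        k * H / (K_B * temperature_kelvin))


def half_life_hours(lifetime_hours):
    """Half-life of a first-order decay with the given 1/e lifetime."""
    return lifetime_hours * math.log(2)


# Values copied from the exp_data entry shown above
dg = lifetime_to_dg(14.950550056524747, 333.15)
half_life = half_life_hours(14.950550056524747)

print(dg)         # compare with inferred_dG_kcal = 26.7894...
print(half_life)  # in hours
```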



### `transition_states`

This is a list of lists. The inner list is a set of TS dictionaries for a given mechanism, where each TS was optimized from a different conformer. There are four such lists, one for each mechanism:

In [19]:
print(type(sub_dic['transition_states']))
print(len(sub_dic['transition_states'])) # 4 mechanisms
print(len(sub_dic['transition_states'][0])) # 3 conformers for this mechanism

<class 'list'>
4
3


Let's see what kind of information is in each dictionary:

In [20]:
ts_dic = sub_dic['transition_states'][0][0]
display(list(ts_dic.keys()))

['job_id',
 'status',
 'parentjob_id',
 'geom_id',
 'converged',
 'rxn_path_id',
 'entropy',
 'free_energy',
 'enthalpy',
 'vibfreqs',
 'energy',
 'ts_specific_conf_free_energy',
 'ts_specific_avg_conf_energy',
 'ts_specific_conf_entropy',
 'ts_specific_entropy',
 'ts_specific_free_energy',
 'ts_specific_enthalpy',
 'conf_free_energy',
 'avg_conf_energy',
 'conf_entropy',
 'vib_entropy',
 's_t_crossing']
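
Because each mechanism comes with several TS candidates, a common first step is to pick one representative per mechanism. The sketch below shows one plausible way to do this, selecting the converged TS with the lowest free energy. The helper name `lowest_converged_ts`, the stand-in data, and the assumption that `converged` is a boolean are all illustrative; the free energies are made-up values in Hartree.

```python
def lowest_converged_ts(ts_group):
    """Return the converged TS dictionary with the lowest free energy, or None."""
    converged = [ts for ts in ts_group if ts['converged']]
    if not converged:
        return None
    return min(converged, key=lambda ts: ts['free_energy'])


# Hypothetical stand-in for sub_dic['transition_states'] with two mechanisms
transition_states = [
    [{'geom_id': 1, 'converged': True, 'free_energy': -840.100},
     {'geom_id': 2, 'converged': True, 'free_energy': -840.102},
     {'geom_id': 3, 'converged': False, 'free_energy': -840.110}],
    [{'geom_id': 4, 'converged': True, 'free_energy': -840.095}],
]

best_per_mechanism = [lowest_converged_ts(group) for group in transition_states]

for idx, ts in enumerate(best_per_mechanism):
    print(idx, None if ts is None else ts['geom_id'])
```

Filtering on `converged` first matters here: the unconverged candidate has the lowest free energy in the first group, but it is not a valid TS and is skipped.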

#### "TS-specific" keys

We see a new set of keys that start with `ts_specific`. This is related to a subtlety of the conformational entropy. One can compute the conformational entropy for a given TS using the ensemble of conformers generated for that TS. But if we have 4 possible TSs, the conformational entropy should actually take into account all conformers from all 4 TSs. So all keys that start with `ts_specific` use only the conformers from that TS, whereas the other keys take into account the ensembles of all TSs.

Let's compare them:



In [27]:
ha_to_kcal = 627.5

print(ts_dic['ts_specific_conf_free_energy'] * ha_to_kcal)
print(ts_dic['conf_free_energy'] * ha_to_kcal)

-0.45582969952020397
-0.9064895625641356


We see that the TS-specific conformational free energy is about $-0.46$ kcal/mol. This is basically because there are two rotational TS conformers. One has the fluorines near each other and one has them on opposite sides. Hence the conformational free energy is approximately $-k_{\mathrm{B}} T \ \mathrm{ln}(2) \approx -0.41$ kcal/mol. The *total* conformational free energy is about double that, because the TS ensembles of the clockwise and counter-clockwise mechanisms have identical energies. Hence we double the number of conformers. This effect is the same as accounting for the symmetry number $\sigma$ in the free energy calculation (see for example [this reference](https://iopscience.iop.org/article/10.1088/1361-648X/aa75bd/meta?casa_token=oHytsk0BqnsAAAAA:gFRdgLshNymPnuhHh1rgsDSmOT4J2jQHN26cpn5wZx6vkeZJZPJk7tcBtTjLcnxFS3LI2EsNY78)).
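
The idealized estimate quoted here is easy to reproduce. The sketch below assumes $n$ exactly degenerate conformers at $T = 298.15$ K, in which case the conformational free energy is $-k_{\mathrm{B}} T \ \mathrm{ln}(n)$; the function name is hypothetical. The values stored in the dataset are somewhat more negative than these estimates.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol K)
T = 298.15             # K


def ideal_conf_free_energy(n_conformers):
    """Conformational free energy (kcal/mol) of n degenerate conformers."""
    return -R_KCAL * T * math.log(n_conformers)


two_conformers = ideal_conf_free_energy(2)   # a single rotational TS ensemble
four_conformers = ideal_conf_free_energy(4)  # clockwise plus counter-clockwise

print(two_conformers)
print(four_conformers)
```

Doubling the number of degenerate conformers adds another $-k_{\mathrm{B}} T \ \mathrm{ln}(2)$, which is why the total is about twice the TS-specific value.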

#### Other new keys

Ignoring the keys with `id` or `status` in them and the keys defined above, we have the following new keys:
- `converged`: Whether or not eigenvector following converged for this geometry, giving forces under the tolerance and a single negative frequency with magnitude $> 200 \ \mathrm{cm}^{-1}$
- `vibfreqs`: Sorted vibrational frequencies, in $\mathrm{cm}^{-1}$
- `s_t_crossing`: The same information as in `s_t_crossing` from the `atoms` file. Here we see it associated with a specific TS, and so we know which TS the crossing calculation was launched from.

We can take a look at the vibrational frequencies and make sure we have a true TS:

In [28]:
ts_dic['vibfreqs'][:5]

[-842.0945451836966,
 42.08440759282712,
 57.518808541565306,
 62.92304032347095,
 143.68334273247208]
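
The frequency criterion can also be checked in code instead of by eye. The helper below is a hypothetical sketch that tests for exactly one negative (imaginary) frequency with magnitude above 200 $\mathrm{cm}^{-1}$. It does not check the force tolerance, and the example lists contain only the first five frequencies printed in this tutorial, not the full spectra.

```python
def looks_like_ts(vibfreqs, min_imag_magnitude=200.0):
    """True if there is exactly one negative frequency, larger in magnitude than the threshold."""
    negative = [freq for freq in vibfreqs if freq < 0]
    return len(negative) == 1 and abs(negative[0]) > min_imag_magnitude


# First five TS frequencies (cm^-1) copied from the output above
ts_freqs = [-842.0945451836966, 42.08440759282712, 57.518808541565306,
            62.92304032347095, 143.68334273247208]

# First five frequencies of the optimized cis isomer, for contrast
cis_freqs = [31.643654799284878, 45.57167595808862, 63.129765384813304,
             137.2431807817611, 146.29533108980726]

print(looks_like_ts(ts_freqs))
print(looks_like_ts(cis_freqs))
```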

### `cis` and `trans`

Lastly, we have detailed information about the optimized *cis* and *trans* isomers. Let's look at the dictionaries' content:

In [29]:
display(list(sub_dic['cis'].keys()))

['confgen_job_id',
 'hess_job_id',
 'geom_id',
 'converged',
 'status',
 'free_energy',
 'enthalpy',
 'entropy',
 'vibfreqs',
 'energy',
 'conf_free_energy',
 'avg_conf_energy',
 'conf_entropy',
 'vib_entropy']

Apart from the job IDs, all of these keys were already seen in `transition_states` above. The only difference is that, for the *cis* and *trans* endpoints, `converged` means that all frequencies are positive:

In [30]:
display(sub_dic['cis']['vibfreqs'][:5])

[31.643654799284878,
 45.57167595808862,
 63.129765384813304,
 137.2431807817611,
 146.29533108980726]

# Conclusion

In this tutorial we learned how to load and understand the published barrier data. Now you're ready for the next tutorial, in which we plot and analyze the data!