# A Demo for QM Parser

Author: Xiaorui Dong

This notebook showcases how RDMC and cclib can be combined into an interactive QM result log parser.

Notes:
- If an interactive cell warns that you need to install py3dmol even though you have already done so, don't worry: the warning should disappear once you drag the slider or change the selections.

In [None]:
from rdmc import Mol
from rdmc.external.logparser import GaussianLog, ORCALog, QChemLog
from rdtools.reaction.ts import examine_normal_mode
from rdtools.view import base_viewer, mol_viewer, reaction_viewer


def general_info(glog):
    print(f'Success?: {glog.success}')
    print(f'TS?: {glog.is_ts}')
    print(f'Involved job types: {", ".join(glog.job_type)}')
    try:
        print(f'Scanning: {", ".join(glog.scan_names)}')
    except Exception:
        # Scan names are unavailable (e.g., the log is not from a scan job)
        pass
    print(f'Charge: {glog.charge}, Multiplicity: {glog.multiplicity}')

%load_ext autoreload
%autoreload 2
%matplotlib inline

## Input the path of log file <a id='HOME'></a>
Currently, RDMC provides three parsers: `GaussianLog`, `QChemLog`, and `ORCALog`.
Assign the path of your log file to `log`. Some Gaussian results are provided for trying out this notebook.

In [None]:
############ EXAMPLES #################
# non-TS
log = 'data/well-cbsqb3.out'
# TS
# log = 'data/ts-cbsqb3.out'
# scan
# log = 'data/scan.out'
# IRC
# log = 'data/irc.out'
######################################

glog = GaussianLog(log)
general_info(glog)
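
If you are not sure which program produced a log file, one rough way to pick among the three parsers is to look for the program banner in the file text. The sketch below is only an illustration: `guess_parser_name` and the `SIGNATURES` table are hypothetical names (not part of RDMC), and the banner strings are assumptions based on typical output headers, so please check them against your own files.

```python
# Banner strings are assumptions based on typical output headers;
# verify them against your own log files before relying on this.
SIGNATURES = {
    "GaussianLog": "Entering Gaussian System",
    "ORCALog": "* O   R   C   A *",
    "QChemLog": "Welcome to Q-Chem",
}


def guess_parser_name(text):
    """Return the name of the parser whose banner appears in `text`, or None."""
    for parser_name, signature in SIGNATURES.items():
        if signature in text:
            return parser_name
    return None


# Hypothetical header lines standing in for the content of real log files
print(guess_parser_name(" Entering Gaussian System, Link 0=g16"))
print(guess_parser_name("no recognizable banner"))
```

The returned name can then be used to choose between `GaussianLog`, `ORCALog`, and `QChemLog` by hand.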

## Hyperlinks: Analyze by job types

- [Optimization](#OPT)
- [Frequency](#FREQ)
- [Scan](#SCAN)
- [IRC](#IRC)

## 1. Optimization <a id='OPT'></a>

### 1.1 Visualize molecule

If the optimization converged, the converged geometry is shown. Otherwise, the geometry closest to the convergence criteria is shown.

[back](#HOME)

In [None]:
xyz = glog.get_best_opt_geom(xyz_str=True)
if glog.success:
    print('Converged XYZ:\n')
else:
    print('Geometry that is the closest to the convergence criteria:\n')
base_viewer(xyz, 'xyz').update()

# XYZ format
print(xyz)
# Gaussian format
# g_xyz = f"{glog.charge} {glog.multiplicity}\n" + "\n".join([l for l in xyz.splitlines()[2:]]) + "\n\n"
# print(g_xyz)
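
The commented-out lines in the cell above turn the XYZ string into a Gaussian geometry specification. Here is the same idea as a small standalone function, in case you want to reuse it. The function name `xyz_to_gaussian_geometry` and the water coordinates are made up for illustration; in the notebook you would pass `xyz`, `glog.charge`, and `glog.multiplicity` instead.

```python
def xyz_to_gaussian_geometry(xyz, charge, multiplicity):
    """Convert an XYZ-format string into a Gaussian geometry specification."""
    # Skip the atom-count line and the comment line of the XYZ format
    atom_lines = xyz.splitlines()[2:]
    # Gaussian expects "charge multiplicity", the atom lines, then a blank line
    return f"{charge} {multiplicity}\n" + "\n".join(atom_lines) + "\n\n"


# Hypothetical water geometry, used only for illustration
example_xyz = """3

O 0.000 0.000 0.117
H 0.000 0.757 -0.469
H 0.000 -0.757 -0.469"""

print(xyz_to_gaussian_geometry(example_xyz, charge=0, multiplicity=1))
```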

### 1.2 Convergence analysis

Check the trend of each convergence criterion.
- `logy`: plot the y-axis on a log scale
- `relative`: plot each value relative to its convergence criterion

[back](#HOME)

In [None]:
glog.plot_opt_convergence(logy=True, relative=True)
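
To make the `relative` option more concrete: dividing each monitored quantity by its threshold puts all criteria on the same scale, where a value at or below 1 means that criterion is satisfied. The sketch below uses Gaussian's default (non-tight) thresholds and made-up step values; the function names are hypothetical, and this is a simplified picture, not how `plot_opt_convergence` is implemented.

```python
# Gaussian's default optimization thresholds (atomic units)
CRITERIA = {
    "max_force": 4.5e-4,
    "rms_force": 3.0e-4,
    "max_displacement": 1.8e-3,
    "rms_displacement": 1.2e-3,
}


def relative_to_criteria(step, criteria=CRITERIA):
    """Divide each monitored value by its threshold (<= 1 means satisfied)."""
    return {name: step[name] / criteria[name] for name in criteria}


def is_converged(step, criteria=CRITERIA):
    """Simplified check: every monitored value is at or below its threshold."""
    return all(r <= 1.0 for r in relative_to_criteria(step, criteria).values())


# Hypothetical values for an early and a late optimization step
early_step = {"max_force": 9.0e-4, "rms_force": 2.0e-4,
              "max_displacement": 5.0e-3, "rms_displacement": 1.0e-3}
late_step = {"max_force": 1.0e-4, "rms_force": 5.0e-5,
             "max_displacement": 9.0e-4, "rms_displacement": 4.0e-4}

print(relative_to_criteria(early_step))
print(is_converged(early_step), is_converged(late_step))
```

Because the relative values often span several orders of magnitude during an optimization, a log-scale y-axis (`logy=True`) usually makes the trend easier to read.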

### 1.3 Interact with opt job
[back](#HOME)

In [None]:
glog.interact_opt();

### 1.4 Modify the molecule
[back](#HOME)

In [None]:
# Get the molecule in the file
mol = glog.get_mol(converged=False)

In [None]:
# Choose the conformer you want to edit
conf_id = 4

conf = mol.GetEditableConformer(conf_id)
############  Edit Conformer #########
# These numbers correspond to the file "well-cbsqb3.out"
# Bond
conf.SetBondLength([4, 11], 1.8)

# Angle
conf.SetAngleDeg([3, 4, 11], 100)

# Torsion
conf.SetTorsionDeg([2, 3, 4, 11], 40)
######################################
# Visualize
mol_viewer(mol, conf_id=conf_id).update()
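
After editing a conformer, it can be reassuring to measure the torsion yourself from raw coordinates. The sketch below is a plain-Python dihedral calculation and does not use the RDMC API; the helper names and the four test points are hypothetical. Note that the sign convention for torsions can differ between programs.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Dihedral angle (degrees) defined by four points p0-p1-p2-p3."""
    b0 = sub(p0, p1)
    b1 = sub(p2, p1)
    b2 = sub(p3, p2)
    norm = math.sqrt(dot(b1, b1))
    b1 = tuple(c / norm for c in b1)
    # Project b0 and b2 onto the plane perpendicular to the central bond
    v = sub(b0, tuple(c * dot(b0, b1) for c in b1))
    w = sub(b2, tuple(c * dot(b2, b1) for c in b1))
    return math.degrees(math.atan2(dot(cross(b1, v), w), dot(v, w)))


# Hypothetical coordinates: the last point is rotated a quarter turn
# around the central bond relative to the first point
print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```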

## 2. Frequency <a id='FREQ'></a>

### 2.1 Summary 
[back](#HOME)

In [None]:
print(f'Number of freqs: {glog.freqs.shape[0]}')
print(f'Number of negative freqs: {glog.num_neg_freqs}')
print(f'Negative freqs: {glog.neg_freqs}')
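
The number of negative (imaginary) frequencies tells you what kind of stationary point you have: none for a minimum, exactly one for a first-order saddle point. The sketch below shows that bookkeeping on made-up frequency lists; `classify_stationary_point` is a hypothetical helper, not an RDMC function.

```python
def classify_stationary_point(freqs):
    """Classify a stationary point by its number of negative frequencies."""
    n_imag = sum(1 for f in freqs if f < 0)
    if n_imag == 0:
        return "minimum"
    if n_imag == 1:
        return "first-order saddle point (TS candidate)"
    return f"higher-order saddle point ({n_imag} imaginary modes)"


# Hypothetical frequency lists in cm^-1
print(classify_stationary_point([120.5, 450.2, 1600.0]))
print(classify_stationary_point([-850.3, 95.1, 1420.7]))
print(classify_stationary_point([-300.0, -45.2, 980.4]))
```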

### 2.2 Interact with freq job

- Select the frequency you want to visualize
- Change the number of frames (fewer frames make the animation faster; more frames make it more detailed)
- Change the amplitude of the mode

[back](#HOME)

In [None]:
glog.interact_freq();

### 2.3 Guess reaction from the imaginary frequency

Guess the reactants and the products from the imaginary frequency mode. This requires that the frequency job was run on a transition state. Please be cautious: this method is not very accurate.

- `amplitude`: The amplitude factor applied to the displacement matrix to generate the guess geometries for the reactant and the product. A smaller factor keeps the geometry close to the TS, while an excessively large factor makes the geometry nonphysical.
- `inverse`: Invert the order of the reactant and the product.

Messages about `SaturateMol` will appear. Make sure the cell that generates `r_mol` and `p_mol` reports no failure; failure messages in the other cell can be ignored. You may also ignore the SMILES generated for the TS.

[back](#HOME)

In [None]:
r_mols, p_mols = glog.guess_rxn_from_normal_mode(amplitude=[0.1, 0.25], atom_weighted=True, inverse=True)
print(f'{len(r_mols)} potential reactants and {len(p_mols)} potential products are identified.')
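
The role of `amplitude` can be pictured as follows: the TS geometry is pushed along the imaginary mode in both directions, and each displaced geometry serves as a guess for one side of the reaction. This is a minimal sketch with a made-up two-atom system; `displace` is a hypothetical helper, and the actual RDMC routine also handles details such as atom weighting and connectivity perception that are not shown here.

```python
def displace(ts_xyz, disp, amplitude):
    """Return the geometries displaced by +amplitude and -amplitude along a mode."""
    forward = [[c + amplitude * d for c, d in zip(atom, vec)]
               for atom, vec in zip(ts_xyz, disp)]
    backward = [[c - amplitude * d for c, d in zip(atom, vec)]
                for atom, vec in zip(ts_xyz, disp)]
    return forward, backward


# Hypothetical two-atom "TS" whose mode stretches the bond along x
ts_xyz = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
disp = [[-1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]

forward, backward = displace(ts_xyz, disp, amplitude=0.25)
print(forward)   # atoms move apart
print(backward)  # atoms move together
```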

In [None]:
#####  INPUT  #####
r_idx, p_idx = 0, 0
###################

assert (r_idx < len(r_mols)) and (p_idx < len(p_mols)), "invalid index of reactant/product mol provided"

ts = glog.get_mol(embed_conformers=False)
r_mol, p_mol = r_mols[r_idx], p_mols[p_idx]

print('\nReactant    TS      Product')

reaction_viewer(
    r_mol, p_mol, ts,
    broken_bond_color='red',
    formed_bond_color='green',
    broken_bond_width=0.1,
    formed_bond_width=0.1,
    viewer_size=(800, 100),
    atom_index=False,
).update()

### 2.4 Examine the imaginary frequency

Check if the displacement of the imaginary frequency mode corresponds to the bond formation/breaking.

- `r_smi`: The atom-mapped SMILES for the reactant complex.
- `p_smi`: The atom-mapped SMILES for the product complex.
- `amplitude`: The amplitude factor on the displacement matrix, usually between 0-1. This analysis is not very sensitive to its value.

[back](#HOME)

In [None]:
# Example based on ts-cbsqb3.out
r_smi = '[C:1]([C:2]([C:3](=[C:4]([H:13])[H:14])[H:12])([H:9])[H:10])([H:15])([H:16])[H:17].[O:5]=[C:6]([O:7][H:11])[C:8]([H:18])([H:19])[H:20]'
p_smi = '[C:1]([C:2]([C:3]([C:4]([H:13])([H:14])[O:5][C:6](=[O:7])[C:8]([H:18])([H:19])[H:20])([H:11])[H:12])([H:9])[H:10])([H:15])([H:16])[H:17]'
amplitude = 0.1

In [None]:
examine_normal_mode(
    Mol.FromSmiles(r_smi),
    Mol.FromSmiles(p_smi),
    ts_xyz=glog.converged_geometries[0],
    disp=glog.cclib_results.vibdisps[0],
    amplitude=amplitude,
    weights=True,
    verbose=True,
    as_factors=False,
)
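
The idea behind this check can be illustrated without any chemistry library: if the mode really describes the reaction, the distances of the forming and breaking bonds should change noticeably between the two displaced geometries. The sketch below compares selected interatomic distances for two made-up geometries. It is not the algorithm used by `examine_normal_mode`, and the helper names are hypothetical.

```python
import math


def distance(xyz, i, j):
    """Distance between atoms i and j of a geometry."""
    return math.dist(xyz[i], xyz[j])


def bond_changes(xyz_a, xyz_b, pairs):
    """Change in distance (b minus a) for each atom pair."""
    return {pair: distance(xyz_b, *pair) - distance(xyz_a, *pair)
            for pair in pairs}


# Hypothetical linear system: atom 1 moves from atom 0 toward atom 2
xyz_a = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [3.0, 0.0, 0.0]]
xyz_b = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [3.0, 0.0, 0.0]]

changes = bond_changes(xyz_a, xyz_b, pairs=[(0, 1), (1, 2)])
print(changes)  # (0, 1) lengthens, (1, 2) shortens
```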

## 3. Scan <a id='SCAN'></a>

### 3.1 Visualize the scan
- `align_scan`: whether to align the scanned coordinate to make the animation cleaner
- `align_frag_idx`: in the animation, one part of the molecule stays fixed while the other part moves. This argument sets which part moves (the value should be either 1 or 2).
- `atom_index`: whether to show the atom indices

[back](#HOME)

In [None]:
glog.view_traj(
    interval=100,
    align_scan=True,
    align_frag_idx=1,
    backend='openbabel'
).update();

### 3.2 Plot the scan energies

- `converged`: only plot energies for converged geometries
- `relative_x`: plot the x-axis as a relative value (value for initial geom as the baseline)
- `relative_y`: plot the y-axis as a relative value (value for initial geom as the baseline)

[back](#HOME)

In [None]:
glog.plot_scan_energies(converged=True, relative_x=True, relative_y=True);
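
The `relative_x` and `relative_y` options simply shift each axis so that the initial geometry sits at zero. A minimal sketch with made-up scan data (no unit conversion is applied here, and `relative_values` is a hypothetical helper):

```python
def relative_values(values):
    """Shift a series so that its first entry becomes the baseline (zero)."""
    baseline = values[0]
    return [v - baseline for v in values]


# Hypothetical scan: dihedral angles (degrees) and energies (Hartree)
angles = [180.0, 190.0, 200.0]
energies = [-1.50, -1.49, -1.48]

print(relative_values(angles))
print(relative_values(energies))
```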

### 3.3 Interact with the scan job

[back](#HOME)

In [None]:
glog.interact_scan(align_scan=True, align_frag_idx=1, backend='xyz2mol');

## 4. IRC <a id='IRC'></a>

### 4.1 Visualize the IRC
- `loop`: the way animation plays (`'backAndForth'`, `'forward'`, `'reverse'`)
- `interval`: the time interval between frames (the smaller the value, the faster the animation)

Note: you can ignore the `SaturateMol` failure messages, since we are dealing with a TS.

[back](#HOME)

In [None]:
glog.view_traj(
    loop='backAndForth',
    interval=1000,
).update();

### 4.2 Plot the IRC energies
- `converged`: only display the energies for the converged geometries

[back](#HOME)

In [None]:
glog.plot_irc_energies(converged=True);

### 4.3 Interact with the IRC job

[back](#HOME)

In [None]:
glog.interact_irc();

### 4.4 Guess the reaction

Guess the reactants and the products from the IRC results. This requires the IRC job to be bidirectional.
- `index`: the index of the conformer pair to use, which sets its distance from the TS. To use the geometries at both ends of the IRC curve, set it to `0`.
- `inverse`: Invert the order of the reactant and the product.

Messages about `SaturateMol` will appear. Make sure the cell that generates `r_mol` and `p_mol` reports no failure; failure messages in the other cell can be ignored. You may also ignore the SMILES generated for the TS.

[back](#HOME)

In [None]:
r_mol, p_mol = glog.guess_rxn_from_irc(index=0, inverse=False)

In [None]:
ts = glog.get_mol(embed_conformers=False)

print('\nReactant    TS      Product')

reaction_viewer(
    r_mol, p_mol, ts,
    broken_bond_color='red',
    formed_bond_color='green',
    broken_bond_width=0.1,
    formed_bond_width=0.1,
    viewer_size=(800, 100),
    atom_index=False,
).update()