# How RMG estimates kinetics non-deterministically

Han, Kehang (hkh12@mit.edu)

Nov. 16, 2015

## Introduction

One of RMG's main principles is to keep only the first occurrence of an object, such as a species or a reaction, and to discard its duplicates. This approach is not universally appropriate, at least for some parts of RMG.

In this post, I give an example of how this approach can lead to non-deterministic estimation of kinetics.

## Set-up
Before investigating, a few set-up steps are needed:

- import all the necessary modules

- create RMG object and load database needed

- create a reactant to react (O=C[C]=C, which reacts through the R_Addition_MultipleBond family)

In [3]:
from rmgpy.rmg.main import RMG, CoreEdgeReactionModel
from rmgpy.data.rmg import RMGDatabase, database
from rmgpy.rmg.model import Species
from rmgpy.molecule import Molecule
from rmgpy import settings
import os

In [4]:
# set-up RMG object
rmg = RMG()
rmg.reactionModel = CoreEdgeReactionModel()

# load kinetic database and forbidden structures
rmg.database = RMGDatabase()
path = settings['database.directory']

# load the forbidden structures
database.loadForbiddenStructures(os.path.join(path, 'forbiddenStructures.py'))
# load only the R_Addition_MultipleBond kinetics family
database.loadKinetics(os.path.join(path, 'kinetics'),
                      kineticsFamilies=['R_Addition_MultipleBond'])



In [5]:
spc = Species().fromSMILES("O=C[C]=C")
print spc.molecule[0].toSMILES()
print spc.molecule[0].toAdjacencyList()

C=[C]C=O
multiplicity 2
1 O u0 p2 c0 {2,D}
2 C u0 p0 c0 {1,D} {4,S} {5,S}
3 C u0 p0 c0 {4,D} {6,S} {7,S}
4 C u1 p0 c0 {2,S} {3,D}
5 H u0 p0 c0 {2,S}
6 H u0 p0 c0 {3,S}
7 H u0 p0 c0 {3,S}



In [67]:
newReactions = []
spc.generateResonanceIsomers()
newReactions.extend(rmg.reactionModel.react(database, spc))

# try to pick out the target reaction I want to show
mol_H = Molecule().fromSMILES("[H]")
mol_C3H2O = Molecule().fromSMILES("C=C=C=O")
for rxn in newReactions:
    reactants = rxn.reactants
    products = rxn.products
    rxn_specs = reactants + products
    for rxn_spec in rxn_specs:
        if rxn_spec.isIsomorphic(mol_H):
            for rxn_spec1 in rxn_specs:
                if rxn_spec1.isIsomorphic(mol_C3H2O):
                    # label every species with its SMILES for readable printing
                    for spec_to_label in rxn_specs:
                        spec_to_label.label = spec_to_label.molecule[0].toSMILES()
                    print rxn
                    print rxn.template

C=C=C=O + [H] <=> C=[C]C=O
[<Entry index=127 label="Ck_Ca">, <Entry index=915 label="HJ">]
C=C=C=O + [H] <=> C=C=C[O]
[<Entry index=6 label="Ck_O">, <Entry index=915 label="HJ">]


As you can see, the reactions `C=C=C=O + [H] <=> C=[C]C=O` and `C=C=C=O + [H] <=> C=C=C[O]` have the same left-hand side, and their right-hand sides are resonance isomers of each other. Once the molecules are wrapped into species, the current RMG design treats the two as the same reaction, even though their matched templates are quite different.
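
To see why the two entries collapse into one, it helps to picture a species as a set of resonance structures. The sketch below is a toy illustration, not RMG's implementation: `RESONANCE_FORMS`, `same_species`, and `same_reaction` are hypothetical names, and SMILES string equality stands in for graph isomorphism.

```python
# Toy illustration only: SMILES strings stand in for molecular graphs.
RADICAL_FORMS = frozenset(["C=[C]C=O", "C=C=C[O]"])
# Hypothetical lookup table: every resonance structure -> all of its forms.
RESONANCE_FORMS = {smiles: RADICAL_FORMS for smiles in RADICAL_FORMS}


def resonance_forms(smiles):
    """Return all resonance structures known for a molecule (toy lookup)."""
    return RESONANCE_FORMS.get(smiles, frozenset([smiles]))


def same_species(smiles_a, smiles_b):
    """Two molecules belong to one species if they share a resonance form."""
    return bool(resonance_forms(smiles_a) & resonance_forms(smiles_b))


def same_reaction(rxn_a, rxn_b):
    """Compare species by species; the template is never consulted.

    Assumes both reactions list their species in the same order.
    """
    for side in ("reactants", "products"):
        if len(rxn_a[side]) != len(rxn_b[side]):
            return False
        pairs = zip(rxn_a[side], rxn_b[side])
        if not all(same_species(a, b) for a, b in pairs):
            return False
    return True


rxn_1 = {"reactants": ["C=C=C=O", "[H]"], "products": ["C=[C]C=O"],
         "template": ("Ck_Ca", "HJ")}
rxn_2 = {"reactants": ["C=C=C=O", "[H]"], "products": ["C=C=C[O]"],
         "template": ("Ck_O", "HJ")}

duplicates = same_reaction(rxn_1, rxn_2)                    # same reaction...
templates_differ = rxn_1["template"] != rxn_2["template"]   # ...different templates
```

In this toy picture, the comparison only looks at species, so nothing prevents two reactions with different templates from being judged identical.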

In [68]:
rmg.reactionModel.processNewReactions(newReactions, spc, None)
for rxn in rmg.reactionModel.edge.reactions:
    # try to pick out the target reaction I want to show
    reactants = rxn.reactants
    products = rxn.products
    rxn_specs = reactants + products
    for rxn_spec in rxn_specs:
        if rxn_spec.isIsomorphic(mol_H):
            for rxn_spec1 in rxn_specs:
                if rxn_spec1.isIsomorphic(mol_C3H2O):
                    print rxn
                    print rxn.template

C=C=C=O(1) + [H](2) <=> C=[C]C=O(3)
[<Entry index=127 label="Ck_Ca">, <Entry index=915 label="HJ">]


After `processNewReactions`, however, only `[<Entry index=127 label="Ck_Ca">, <Entry index=915 label="HJ">]` is retained, because it appears first.
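
The retention rule can be mimicked in plain Python. This is a minimal sketch under stated assumptions, not what `processNewReactions` actually does internally: `SPECIES_OF`, `keep_first`, and the tuple layout of a candidate are all hypothetical.

```python
# Toy model of first-occurrence deduplication (not RMG's real code).
# SPECIES_OF maps each resonance structure to a hypothetical species label,
# mimicking the step where molecules are wrapped into species.
SPECIES_OF = {
    "C=C=C=O": "C3H2O",
    "[H]": "H",
    "C=[C]C=O": "C3H3O",
    "C=C=C[O]": "C3H3O",
}


def keep_first(candidates):
    """Return {species-level reaction: template}; the first occurrence wins."""
    kept = {}
    for reactants, products, template in candidates:
        key = (tuple(SPECIES_OF[m] for m in reactants),
               tuple(SPECIES_OF[m] for m in products))
        if key not in kept:  # a later duplicate is silently discarded
            kept[key] = template
    return kept


candidates = [
    (("C=C=C=O", "[H]"), ("C=[C]C=O",), ("Ck_Ca", "HJ")),
    (("C=C=C=O", "[H]"), ("C=C=C[O]",), ("Ck_O", "HJ")),
]

first_order = keep_first(candidates)          # keeps ("Ck_Ca", "HJ")
reverse_order = keep_first(candidates[::-1])  # keeps ("Ck_O", "HJ")
```

Both calls see exactly the same two candidates; only the list order differs, and that alone decides which template survives.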

## Changing the order of the molecule list in `spc`

As you can imagine, the order of the molecules in `spc.molecule` ends up determining the order of the reactions in the list `newReactions`.

In [69]:
spc.molecule = list(reversed(spc.molecule))

In [70]:
newReactions = []
newReactions.extend(rmg.reactionModel.react(database, spc))

mol_H = Molecule().fromSMILES("[H]")
mol_C3H2O = Molecule().fromSMILES("C=C=C=O")
for rxn in newReactions:
    reactants = rxn.reactants
    products = rxn.products
    rxn_specs = reactants + products
    for rxn_spec in rxn_specs:
        if rxn_spec.isIsomorphic(mol_H):
            for rxn_spec1 in rxn_specs:
                if rxn_spec1.isIsomorphic(mol_C3H2O):
                    # label every species with its SMILES for readable printing
                    for spec_to_label in rxn_specs:
                        spec_to_label.label = spec_to_label.molecule[0].toSMILES()
                    print rxn
                    print rxn.template

C=C=C=O + [H] <=> C=C=C[O]
[<Entry index=6 label="Ck_O">, <Entry index=915 label="HJ">]
C=C=C=O + [H] <=> C=[C]C=O
[<Entry index=127 label="Ck_Ca">, <Entry index=915 label="HJ">]


As you can see, the two reactions have swapped order, so this time `[<Entry index=6 label="Ck_O">, <Entry index=915 label="HJ">]` is the template retained, because it now appears first.

In [72]:
# set-up RMG object
rmg_new = RMG()
rmg_new.reactionModel = CoreEdgeReactionModel()

rmg_new.reactionModel.processNewReactions(newReactions, spc, None)

for rxn in rmg_new.reactionModel.edge.reactions:
    reactants = rxn.reactants
    products = rxn.products
    rxn_specs = reactants + products
    for rxn_spec in rxn_specs:
        if rxn_spec.isIsomorphic(mol_H):
            for rxn_spec1 in rxn_specs:
                if rxn_spec1.isIsomorphic(mol_C3H2O):
                    print rxn
                    print rxn.template

C=C=C=O(1) + [H](2) <=> C=C=C[O](3)
[<Entry index=6 label="Ck_O">, <Entry index=915 label="HJ">]


## Conclusion

The order of molecules in `spc.molecule` can lead to the same reaction being matched to different templates, and therefore being assigned different kinetics. This difference can surface as **model divergence** if a newly developed feature produces a different molecule order than the `master` branch.

Here is an example. Below is the `spc` constructed earlier in the post:
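
One way to spot this kind of divergence is to compare the templates that two runs assign to the same reaction. The sketch below assumes each model has already been exported to a plain `{reaction string: template labels}` dict; `template_mismatches` and that dict layout are hypothetical and not part of RMG.

```python
def template_mismatches(model_a, model_b):
    """Return reactions present in both models whose templates differ.

    Each model is a hypothetical {reaction string: template labels} dict.
    """
    shared = sorted(set(model_a) & set(model_b))
    return [(rxn, model_a[rxn], model_b[rxn])
            for rxn in shared
            if model_a[rxn] != model_b[rxn]]


# Made-up exports from two branches, using the reaction from this post.
master = {"C3H2O + H <=> C3H3O": ("Ck_Ca", "HJ")}
feature = {"C3H2O + H <=> C3H3O": ("Ck_O", "HJ")}

mismatches = template_mismatches(master, feature)
```

An empty result means the two runs agree on every shared reaction; any entry points at a reaction whose kinetics may have been estimated from a different template.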

In [74]:
spc = Species().fromSMILES("O=C[C]=C")
print spc.molecule[0].toSMILES()
print spc.molecule[0].toAdjacencyList()

C=[C]C=O
multiplicity 2
1 C u1 p0 c0 {2,S} {3,D}
2 C u0 p0 c0 {1,S} {4,D} {5,S}
3 C u0 p0 c0 {1,D} {6,S} {7,S}
4 O u0 p2 c0 {2,D}
5 H u0 p0 c0 {2,S}
6 H u0 p0 c0 {3,S}
7 H u0 p0 c0 {3,S}



If we convert it into an `InChISpecies` and then convert it back, the order is not preserved.
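
A toy picture of why a round trip through an identifier can forget ordering is sketched here. It is not the `InChISpecies` code path: `to_identifier` and `from_identifier` are hypothetical, and the only assumption is that the identifier records which structures exist but not the order in which they were listed.

```python
# Toy round trip (not the InChISpecies code path).
def to_identifier(forms):
    """Hypothetical order-free identifier: just the set of structures."""
    return frozenset(forms)


def from_identifier(identifier):
    """Rebuild a structure list in some fixed order (here: sorted)."""
    return sorted(identifier)


original = ["C=[C]C=O", "C=C=C[O]"]
rebuilt = from_identifier(to_identifier(original))
# The same structures come back, but the first entry is now "C=C=C[O]".
```

Because downstream code reads `molecule[0]` and iterates in list order, a change like this is enough to change which reaction is generated first.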

In [5]:
# to run the code below you should checkout the `edge_inchi_rxn` branch
from rmgpy.rmg.model import InChISpecies
ispc = InChISpecies(spc)

In [17]:
spc_new = Species(molecule=[Molecule().fromAugmentedInChI(ispc.getAugmentedInChI())])
print spc_new.molecule[0].toSMILES()
print spc_new.molecule[0].toAdjacencyList()

C=C=C[O]
multiplicity 2
1 C u0 p0 c0 {2,D} {5,S} {6,S}
2 C u0 p0 c0 {1,D} {3,D}
3 C u0 p0 c0 {2,D} {4,S} {7,S}
4 O u1 p2 c0 {3,S}
5 H u0 p0 c0 {1,S}
6 H u0 p0 c0 {1,S}
7 H u0 p0 c0 {3,S}



One could argue that everything should be kept the same as in `master`, but I treat this as a bug in the current `master` branch: the final kinetic data depends on the molecule list ordering of a reacting species.
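
For contrast, here is a minimal sketch of a selection rule that does not depend on input order. It is only an illustration of the idea, not a proposed patch and not a statement about which template is chemically preferable: `keep_canonical` is a hypothetical name, and the tie-break (smallest template tuple) is arbitrary.

```python
from collections import defaultdict


def keep_canonical(candidates):
    """Group duplicates, then pick one template by a fixed rule.

    The rule used here (smallest template tuple) is arbitrary and purely
    illustrative; any rule that ignores input order would behave the same way.
    """
    groups = defaultdict(list)
    for reaction, template in candidates:
        groups[reaction].append(template)
    return {reaction: min(templates) for reaction, templates in groups.items()}


candidates = [
    ("C3H2O + H <=> C3H3O", ("Ck_Ca", "HJ")),
    ("C3H2O + H <=> C3H3O", ("Ck_O", "HJ")),
]

result_a = keep_canonical(candidates)
result_b = keep_canonical(candidates[::-1])  # same answer in either order
```

The point of the sketch is only that the outcome is a function of the set of duplicates rather than of their order, which is the property the current first-occurrence rule lacks.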