In [1]:
import sys
sys.path.insert(0,'../') #enable loading rxn package

import imolecule
from rxn.data_structures import MolGraph

We can generate a molecule graph and visualize it using imolecule. Note that MolGraph inherits all the graph methods of the [networkx Graph class](https://networkx.github.io/documentation/networkx-1.10/tutorial/tutorial.html).

In [2]:
molecule = MolGraph()
molecule.generate('ONNO',data_format='smi') #generate N2O2H4 using "SMILES" string.
print(repr(molecule)) #print full dictionary representation
print(str(molecule)) # print shorthand string notation

MolGraph({'bonds': [{'order': 1, 'atoms': [0, 4]}, {'order': 1, 'atoms': [0, 3]}, {'order': 1, 'atoms': [0, 1]}, {'order': 1, 'atoms': [1, 2]}, {'order': 1, 'atoms': [1, 6]}, {'order': 1, 'atoms': [3, 5]}, {'order': 1, 'atoms': [6, 7]}], 'atoms': [{'charge': -0.115194, 'location': [0.240315, 0.798261, -0.035857], 'element': 'N'}, {'charge': -0.115194, 'location': [-0.244389, -0.517724, 0.134226], 'element': 'N'}, {'charge': 0.175753, 'location': [0.043011, -1.085406, -0.669302], 'element': 'H'}, {'charge': -0.298539, 'location': [1.677716, 0.806578, 0.117112], 'element': 'O'}, {'charge': 0.175753, 'location': [-0.06081, 1.410616, 0.734186], 'element': 'H'}, {'charge': 0.23798, 'location': [1.949494, 0.45727, -0.754832], 'element': 'H'}, {'charge': -0.298539, 'location': [-1.685265, -0.53594, -0.016613], 'element': 'O'}, {'charge': 0.23798, 'location': [-1.920072, -1.333654, 0.491081], 'element': 'H'}]})
HHOHHONN


In [3]:
imolecule.draw(molecule.to_dict(),format='json') #visualize using imolecule
print(molecule.nodes()) #show the nodes of the graph.
# Node ordering is arbitrary by default; sorting the nodes makes the workflow repeatable.
nodes = sorted(molecule.nodes())

['N[0.240315, 0.798261, -0.035857]', 'N[-0.244389, -0.517724, 0.134226]', 'H[0.043011, -1.085406, -0.669302]', 'O[1.677716, 0.806578, 0.117112]', 'H[-0.06081, 1.410616, 0.734186]', 'H[1.949494, 0.45727, -0.754832]', 'O[-1.685265, -0.53594, -0.016613]', 'H[-1.920072, -1.333654, 0.491081]']


Note that each node name is built from the element name and the atomic position of the atom. These node names are *static* and *arbitrary*. Static means that if you change the positions of atoms within the graph structure, the node names will not update to match. Arbitrary means that the graph properties do not depend on the names of the nodes.

Next we can test a few convenient features of the MolGraph class, such as identifying equivalent isomers (regardless of atomic labels or positions). We will start by generating some test data.

In [4]:
HONHNHO = molecule.copy()
HONHNHO.remove_node(nodes[0])
imolecule.draw(HONHNHO.to_dict(),format='json')

ONHNHOH = molecule.copy()
ONHNHOH.remove_node(nodes[2])
imolecule.draw(ONHNHOH.to_dict(),format='json')

HONNHOH = molecule.copy()
HONNHOH.remove_node(nodes[1])
imolecule.draw(HONNHOH.to_dict(),format='json')

All three molecules have the same composition, since each is the original molecule with one hydrogen removed. The first two also share the same bond structure, while the third has a different one. We can check that the MolGraph data structure knows this.

In [5]:
print(HONHNHO == ONHNHOH) #check that 1 and 2 are equal
print(HONHNHO == HONNHOH) #check that 1 and 3 are not equal

True
False
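The comparison above can be illustrated without the rxn package. The following is a minimal brute-force sketch of an element-aware topology check, not MolGraph's actual implementation (which is not shown in this notebook). The atom indices and bonds are copied from the `repr(molecule)` output; the helper names `remove_atom` and `same_topology` are hypothetical, and the choice of which hydrogen each test molecule loses (atoms 4, 2 and 7) is this sketch's reading of the sorted node list, so treat it as an assumption.

```python
from itertools import permutations

# Elements and bonds copied from the repr(molecule) output above.
ELEMENTS = {0: 'N', 1: 'N', 2: 'H', 3: 'O', 4: 'H', 5: 'H', 6: 'O', 7: 'H'}
BONDS = [(0, 4), (0, 3), (0, 1), (1, 2), (1, 6), (3, 5), (6, 7)]


def remove_atom(elements, bonds, atom):
    """Return copies of elements and bonds with one atom (and its bonds) removed."""
    new_elements = {k: v for k, v in elements.items() if k != atom}
    new_bonds = [b for b in bonds if atom not in b]
    return new_elements, new_bonds


def same_topology(mol_a, mol_b):
    """Brute force: look for an element-preserving relabeling of A that maps
    A's bonds exactly onto B's bonds. Only practical for small molecules."""
    (elems_a, bonds_a), (elems_b, bonds_b) = mol_a, mol_b
    if sorted(elems_a.values()) != sorted(elems_b.values()):
        return False
    if len(bonds_a) != len(bonds_b):
        return False
    keys_a = list(elems_a)
    target = {frozenset(b) for b in bonds_b}
    for perm in permutations(elems_b):
        mapping = dict(zip(keys_a, perm))
        # Skip relabelings that would turn one element into another.
        if any(elems_a[k] != elems_b[mapping[k]] for k in keys_a):
            continue
        mapped = {frozenset((mapping[i], mapping[j])) for i, j in bonds_a}
        if mapped == target:
            return True
    return False


# Assumed hydrogens: atom 4 sits on N0, atom 2 on N1, atom 7 on O6.
first = remove_atom(ELEMENTS, BONDS, 4)
second = remove_atom(ELEMENTS, BONDS, 2)
third = remove_atom(ELEMENTS, BONDS, 7)

print(same_topology(first, second))
print(same_topology(first, third))
```

Positions never enter the check: only element types and the bond list are compared, which is the sense in which the equality test is topological. A production implementation would use a proper graph isomorphism algorithm rather than enumerating permutations.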


We can also test that MolGraph only analyzes molecular *topology*, i.e. bond structure, and does not care about the actual positions of atoms, lengths of bonds, or labels of nodes.

In [6]:
ONNO_0 = molecule.copy()
atom_labels = ONNO_0.nodes()
atom_0 = ONNO_0.node[atom_labels[0]]
print('Old location: {}'.format(atom_0['location']))
atom_0['location'][0] +=1
print('New location: {}'.format(atom_0['location']))
imolecule.draw(ONNO_0.to_dict(),format='json')
print('Equivalent to original? {}'.format(ONNO_0 == molecule))

Old location: [0.240315, 0.798261, -0.035857]
New location: [1.240315, 0.798261, -0.035857]


Equivalent to original? True


The two structures clearly differ in geometry, but they have the same topology and are therefore considered equivalent. This is useful because minor differences in molecular structure are ignored, but it can also be dangerous if you do not understand its limitations.
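To make the limitation concrete, the sketch below computes the distance between the two bonded nitrogen atoms before and after the shift applied in the cell above. The coordinates are copied from the printed output; the `distance` helper is hypothetical and uses only the standard library.

```python
import math


def distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Coordinates of the two bonded nitrogen atoms, copied from the output above.
n0 = [0.240315, 0.798261, -0.035857]
n1 = [-0.244389, -0.517724, 0.134226]

before = distance(n0, n1)

# Apply the same shift as the notebook cell: move atom 0 by +1 along x.
n0_shifted = [n0[0] + 1, n0[1], n0[2]]
after = distance(n0_shifted, n1)

print('N-N distance before: {:.3f}'.format(before))
print('N-N distance after:  {:.3f}'.format(after))
```

The bonded pair ends up noticeably farther apart (about 1.41 versus 1.99 in the units of the coordinates), yet the bond list is untouched, so a topology-only comparison still reports the two structures as equivalent.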

Finally, let's make sure that MolGraph doesn't care about node labels.

In [7]:
import networkx as nx
int_molecule = nx.relabel.convert_node_labels_to_integers(molecule)
print('Original Nodes: {}'.format(molecule.nodes()))
print('Integer Nodes: {}'.format(int_molecule.nodes()))
print('Are they equivalent? {}'.format(molecule == int_molecule))

Original Nodes: ['N[0.240315, 0.798261, -0.035857]', 'N[-0.244389, -0.517724, 0.134226]', 'H[0.043011, -1.085406, -0.669302]', 'O[1.677716, 0.806578, 0.117112]', 'H[-0.06081, 1.410616, 0.734186]', 'H[1.949494, 0.45727, -0.754832]', 'O[-1.685265, -0.53594, -0.016613]', 'H[-1.920072, -1.333654, 0.491081]']
Integer Nodes: [0, 1, 2, 3, 4, 5, 6, 7]
Are they equivalent? True


So node labels do not affect the comparison. However, labels must be unique within a single MolGraph: if you add a node whose label already exists, no second node is created, and the existing node is overwritten instead. This can silently drop atoms from the molecule, so it is also very dangerous.
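The overwrite behavior follows from how networkx stores nodes: they live in a dictionary keyed by label, so adding an existing label updates that entry rather than creating a new one. The sketch below mimics this with a plain dictionary so it runs without networkx or rxn. The `add_node` helper and the index-based labels are hypothetical illustrations, not part of the MolGraph API.

```python
def add_node(nodes, label, **attrs):
    """Mimic dict-backed node storage: an existing label is updated, not duplicated."""
    nodes.setdefault(label, {}).update(attrs)


# Two distinct hydrogen atoms that were (carelessly) given the same label.
colliding = {}
add_node(colliding, 'H', element='H', location=[0.0, 0.0, 0.0])
add_node(colliding, 'H', element='H', location=[0.0, 0.0, 0.74])
print(len(colliding))               # only one node survives
print(colliding['H']['location'])   # the second call's attributes win

# One possible guard: append a running index so every label is unique.
unique = {}
atoms = [('H', [0.0, 0.0, 0.0]), ('H', [0.0, 0.0, 0.74])]
for index, (element, location) in enumerate(atoms):
    label = '{}{}'.format(element, index)
    add_node(unique, label, element=element, location=location)
print(sorted(unique))
```

Labels built from element and position, as in the node list printed earlier, avoid collisions as long as no two atoms of the same element share exactly the same coordinates.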