# Molecular Graph Model Tutorial

Learn how MolPy uses graph models to represent molecular structures and perform pattern matching with SMARTS!


## What is a Molecular Graph Model?

MolPy uses graph-based representations for:

- **SMARTS Pattern Matching**: Match chemical patterns in molecules
- **Atom Type Assignment**: Identify atom types based on connectivity
- **Reaction Modeling**: Find and transform molecular substructures

The graph model converts `Atomistic` structures into `igraph.Graph` objects for efficient pattern matching.
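
To make the conversion concrete, here is a minimal pure-Python sketch of the idea that does not depend on MolPy or igraph: atoms become numbered vertices carrying attributes, and bonds become pairs of vertex indices. The helper `to_simple_graph` and its dict layout are hypothetical and exist only for illustration; in MolPy the real conversion is done by `build_mol_graph`.

```python
def to_simple_graph(symbols, bonds):
    """Toy stand-in for a molecular graph (hypothetical helper, not MolPy API)."""
    # One vertex per atom; degree starts at 0 and is filled in from the bonds.
    vertices = [{"element": s, "degree": 0} for s in symbols]
    edges = []
    for i, j in bonds:
        edges.append((min(i, j), max(i, j)))
        vertices[i]["degree"] += 1
        vertices[j]["degree"] += 1
    return vertices, edges


# Water: oxygen is vertex 0, the hydrogens are vertices 1 and 2.
water_vertices, water_edges = to_simple_graph(["O", "H", "H"], [(0, 1), (0, 2)])
print(water_vertices)
print(water_edges)
```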


In [None]:
from molpy.typifier.adapter import build_mol_graph
from molpy.typifier.graph import SMARTSGraph

## Building Molecular Graphs

Convert an `Atomistic` structure to a graph representation:


In [None]:
# Create a simple molecule - water (H2O)
from molpy.core.atomistic import Atomistic

atomistic = Atomistic()
o = atomistic.def_atom(symbol="O", xyz=[0.0, 0.0, 0.0])
h1 = atomistic.def_atom(symbol="H", xyz=[0.96, 0.0, 0.0])
h2 = atomistic.def_atom(symbol="H", xyz=[-0.24, 0.93, 0.0])

atomistic.def_bond(o, h1)
atomistic.def_bond(o, h2)

print(f"Created molecule: {atomistic}")

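# Convert to graph. The two extra return values are lookup tables:
# vertex index -> atom id, and atom id -> vertex index.
graph, vs_to_atomid, atomid_to_vs = build_mol_graph(atomistic)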

print(f"Graph vertices: {graph.vcount()}")
print(f"Graph edges: {graph.ecount()}")

# Access vertex attributes
print("\nVertex attributes:")
for i, v in enumerate(graph.vs):
    print(f"  Vertex {i}: element={v['element']}, degree={v['degree']}")
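
The two mappings returned alongside the graph translate between vertex indices and atom identifiers. The sketch below illustrates how such a pair of lookup tables can be used. It assumes the identifiers are Python object ids (`id(atom)`) and that the mappings behave like dicts; the `DemoAtom` class and all `demo_` names are hypothetical stand-ins, not MolPy objects.

```python
class DemoAtom:
    """Hypothetical stand-in for a MolPy atom (illustration only)."""

    def __init__(self, symbol):
        self.symbol = symbol


demo_atoms = [DemoAtom("O"), DemoAtom("H"), DemoAtom("H")]

# Assumption: vertex i corresponds to the i-th atom, identified by id(atom).
demo_vs_to_atomid = {i: id(atom) for i, atom in enumerate(demo_atoms)}
demo_atomid_to_vs = {atom_id: i for i, atom_id in demo_vs_to_atomid.items()}

# Indexing atoms by id once lets a vertex index be resolved with dict lookups.
demo_atoms_by_id = {id(atom): atom for atom in demo_atoms}
symbol_of_vertex = {
    i: demo_atoms_by_id[atom_id].symbol for i, atom_id in demo_vs_to_atomid.items()
}
print(symbol_of_vertex)
```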

## Graph Attributes

The graph stores the following vertex and edge attributes:


In [None]:
# Vertex attributes include:
# - element: str (e.g., "C", "N", "O")
# - number: int (atomic number)
# - is_aromatic: bool
# - charge: int
# - degree: int (number of bonds)
# - hyb: int | None (1=sp, 2=sp2, 3=sp3)
# - in_ring: bool
# - cycles: set of tuples (ring membership)

# Edge attributes include:
# - order: int | str (1, 2, 3, or ":")
# - is_aromatic: bool
# - is_in_ring: bool

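# List the attribute names actually present on the graph built above
print(f"Vertex attributes: {graph.vertex_attributes()}")
print(f"Edge attributes: {graph.edge_attributes()}")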
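
Ring attributes such as `in_ring` are derived purely from connectivity. The sketch below shows one possible way to derive them in plain Python: an edge lies in a ring if its two endpoints remain connected after that edge is removed. This is an illustration under that definition, not a description of how MolPy computes these attributes; the helper `connected_without` is hypothetical.

```python
from collections import deque


def connected_without(edges, skip, start, goal):
    """Breadth-first search from start to goal, ignoring the edge `skip`."""
    neighbours = {}
    for a, b in edges:
        if (a, b) == skip:
            continue
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in neighbours.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# Three-membered ring 0-1-2 with a pendant vertex 3 attached to vertex 0.
ring_edges = [(0, 1), (1, 2), (0, 2), (0, 3)]

# An edge is in a ring if its endpoints stay connected after removing it.
edge_in_ring = {e: connected_without(ring_edges, e, *e) for e in ring_edges}
# A vertex is in a ring if any of its edges is.
vertex_in_ring = {
    v: any(flag for e, flag in edge_in_ring.items() if v in e) for v in range(4)
}
print(edge_in_ring)
print(vertex_in_ring)
```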

## SMARTS Graphs

`SMARTSGraph` represents a SMARTS pattern as a graph so that it can be matched against a molecular graph:


In [None]:
from molpy.parser.smarts import SmartsParser

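# Create a SMARTS pattern: an oxygen bonded to two hydrogens (water).
# The first hydrogen is written as a branch so that both hydrogens bond to
# the oxygen; "[O][H][H]" would instead describe the chain O-H-H.
smarts_string = "[O]([H])[H]"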
parser = SmartsParser()

# Parse and create SMARTSGraph
smarts_graph = SMARTSGraph(
    smarts_string=smarts_string, parser=parser, atomtype_name="water", priority=1
)

print(f"Created SMARTSGraph: {smarts_graph}")
print(f"Pattern vertices: {smarts_graph.vcount()}")
print(f"Pattern edges: {smarts_graph.ecount()}")
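
SMARTS bonds consecutive atoms in a chain and uses parentheses to open a branch, so the way a pattern is written determines the edges of the pattern graph. The toy parser below handles only bracket atoms and parentheses and is meant purely to show that difference; `toy_smarts_edges` is a hypothetical helper and is not a substitute for `SmartsParser`.

```python
def toy_smarts_edges(pattern):
    """Parse bracket atoms and parentheses only (toy subset, not SmartsParser)."""
    atoms, edges = [], []
    prev = None   # index of the atom the next atom will bond to
    stack = []    # branch points saved when "(" is seen
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "[":
            end = pattern.index("]", i)
            atoms.append(pattern[i + 1:end])
            idx = len(atoms) - 1
            if prev is not None:
                edges.append((prev, idx))
            prev = idx
            i = end + 1
        elif ch == "(":
            stack.append(prev)
            i += 1
        elif ch == ")":
            prev = stack.pop()
            i += 1
        else:
            raise ValueError(f"unsupported character {ch!r}")
    return atoms, edges


chain_atoms, chain_edges = toy_smarts_edges("[O][H][H]")
branch_atoms, branch_edges = toy_smarts_edges("[O]([H])[H]")
print(chain_edges)   # O-H-H chain
print(branch_edges)  # both H bonded to O
```

Only the branched form has both hydrogens attached to the oxygen, which is the connectivity of water.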

## Pattern Matching

Use subgraph isomorphism to find every occurrence of the pattern in the molecular graph:


In [None]:
# Match the SMARTS pattern against the molecular graph.
# get_subisomorphisms_vf2 is igraph's VF2 search. Called without
# node_compat_fn / edge_compat_fn it compares connectivity only, so the
# element constraints in the pattern are not enforced by this call.
matches = graph.get_subisomorphisms_vf2(smarts_graph)

print(f"Found {len(matches)} matches")

if matches:
    # Index atoms by id() once instead of scanning the atom list per vertex
    atoms_by_id = {id(atom): atom for atom in atomistic.atoms}
    print("\nMatch details:")
    for i, match in enumerate(matches):
        print(f"  Match {i + 1}: vertices {match}")
        # Show which atoms matched
        for v_idx in match:
            atom = atoms_by_id.get(vs_to_atomid[v_idx])
            if atom is not None:
                print(f"    Vertex {v_idx} -> {atom.get('symbol')}")
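
Chemically meaningful matching has to compare elements as well as connectivity. The sketch below is a small backtracking matcher in plain Python that checks both. It is an illustration of the idea rather than MolPy's or igraph's algorithm, and `find_matches` with its argument layout is hypothetical.

```python
def find_matches(pattern_elems, pattern_edges, mol_elems, mol_edges):
    """Return every mapping pattern vertex -> molecule vertex (toy backtracking)."""
    mol_adj = {frozenset(e) for e in mol_edges}
    matches = []

    def extend(mapping):
        k = len(mapping)  # next pattern vertex to place
        if k == len(pattern_elems):
            matches.append(list(mapping))
            return
        for cand in range(len(mol_elems)):
            # Reject used vertices and vertices with the wrong element.
            if cand in mapping or mol_elems[cand] != pattern_elems[k]:
                continue
            ok = True
            for a, b in pattern_edges:
                other = b if a == k else (a if b == k else None)
                # Each pattern edge to an already-placed vertex must also
                # exist between the corresponding molecule vertices.
                if other is not None and other < k:
                    if frozenset((mapping[other], cand)) not in mol_adj:
                        ok = False
                        break
            if ok:
                extend(mapping + [cand])

    extend([])
    return matches


mol_elems, mol_edges = ["O", "H", "H"], [(0, 1), (0, 2)]  # water

# Branched pattern: O bonded to both H.
branched = find_matches(["O", "H", "H"], [(0, 1), (0, 2)], mol_elems, mol_edges)
# Chain pattern O-H-H: requires an H-H bond that water does not have.
chain = find_matches(["O", "H", "H"], [(0, 1), (1, 2)], mol_elems, mol_edges)
print(branched)
print(chain)
```

The branched pattern matches twice because the two hydrogens are interchangeable; the chain pattern does not match at all once elements are taken into account.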

## Use Cases

Molecular graph models are used in:

1. **Typifiers**: Assign atom types based on SMARTS patterns
2. **Reaction Modeling**: Find reaction sites and transform structures
3. **Structure Analysis**: Identify functional groups and substructures

Because molecules and patterns are both represented as graphs, each of these tasks reduces to subgraph matching on vertex and edge attributes.
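
As a rough illustration of the typifier use case, the sketch below resolves competing pattern matches by priority. It assumes that a larger `priority` value wins, which is an assumption made for this example and not a statement about MolPy's typifier; `assign_types` and the rule tuples are hypothetical.

```python
def assign_types(n_atoms, rules):
    """rules: list of (type_name, priority, matched_atom_indices)."""
    best = [None] * n_atoms  # (priority, type_name) chosen so far per atom
    for name, priority, atoms in rules:
        for idx in atoms:
            # Assumption: the rule with the larger priority value wins.
            if best[idx] is None or priority > best[idx][0]:
                best[idx] = (priority, name)
    return [entry[1] if entry else None for entry in best]


demo_rules = [
    ("generic_O", 0, {0}),   # any oxygen
    ("water_O", 1, {0}),     # oxygen in water, more specific
    ("water_H", 1, {1, 2}),  # hydrogens in water
]
assigned = assign_types(3, demo_rules)
print(assigned)
```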
