# Quickstart Guide

**Get productive with MolPy in 15 minutes!**

This guide shows you the essential workflows through runnable examples:

1. **Core Data Structures** - Frame, Block, Atomistic, Box
2. **Parsing Molecules** - SMILES, BigSMILES, GBigSMILES
3. **Building Polymers** - Monomers, reactions, chains
4. **Force Fields** - Atom typing, parameters
5. **File I/O** - PDB, LAMMPS, trajectories
6. **Complete Workflow** - End-to-end polymer system

---

## 1. Setup & Imports

In [1]:
import numpy as np
import molpy as mp
from molpy import Frame, Block, Box, Atom, Atomistic, Bond

print(f"✅ MolPy version: {mp.__version__}")

✅ MolPy version: 0.2.0


---

## 2. Frame & Block: Tabular Data

`Frame` stores tabular data in `Block` objects (like pandas DataFrames but NumPy-native).

**Use cases:**
- Store coordinates, velocities, forces
- Read/write simulation files
- Analyze trajectories

In [2]:
# Create a water molecule
frame = Frame()
atoms = Block()

# Add atomic data
atoms["element"] = ["O", "H", "H"]
atoms["xyz"] = np.array([[0.0, 0.0, 0.0],
                          [0.96, 0.0, 0.0],
                          [-0.24, 0.93, 0.0]])
atoms["charge"] = [-0.834, 0.417, 0.417]

frame["atoms"] = atoms
frame.metadata["box"] = Box.cubic(20.0)

print(f"Atoms: {frame['atoms'].nrows}")
print(f"Coordinates shape: {frame['atoms']['xyz'].shape}")
print(f"Total charge: {frame['atoms']['charge'].sum():.3f}")

Atoms: 3
Coordinates shape: (3, 3)
Total charge: 0.000
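
As a quick sanity check on the coordinates stored in the `xyz` column, the O-H bond lengths and the H-O-H angle can be recomputed directly from the three rows. The sketch below uses only the standard library rather than MolPy's API; the helper names `distance` and `angle_deg` are illustrative, and the length unit is assumed to be Å.

```python
import math

# Water coordinates and charges copied from the cell above
# (length unit assumed to be angstrom)
xyz = [
    (0.0, 0.0, 0.0),     # O
    (0.96, 0.0, 0.0),    # H1
    (-0.24, 0.93, 0.0),  # H2
]
charges = [-0.834, 0.417, 0.417]


def distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.dist(a, b)


def angle_deg(a, center, b):
    """Angle a-center-b in degrees."""
    va = [p - c for p, c in zip(a, center)]
    vb = [p - c for p, c in zip(b, center)]
    dot = sum(x * y for x, y in zip(va, vb))
    cos_theta = dot / (math.hypot(*va) * math.hypot(*vb))
    return math.degrees(math.acos(cos_theta))


oh1 = distance(xyz[0], xyz[1])
oh2 = distance(xyz[0], xyz[2])
hoh = angle_deg(xyz[1], xyz[0], xyz[2])

print(f"O-H bond lengths: {oh1:.3f}, {oh2:.3f}")
print(f"H-O-H angle: {hoh:.1f} degrees")
print(f"Net charge: {sum(charges):.3f}")
```

Both bonds come out close to 0.96 and the angle close to the familiar water value of about 104.5 degrees, which suggests the hand-typed coordinates are self-consistent.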


---

## 3. Atomistic: Molecular Graph

`Atomistic` represents molecules as graphs with atoms, bonds, angles, dihedrals.

**Use cases:**
- Build molecules from scratch
- Edit structures (add/remove atoms)
- Run reactions

In [3]:
# Build methane (CH4)
methane = Atomistic()

# Add carbon
c = Atom(symbol="C", xyz=[0.0, 0.0, 0.0])
methane.add_atom(c)

# Add 4 hydrogens in tetrahedral geometry
positions = np.array([
    [ 0.63,  0.63,  0.63],
    [-0.63, -0.63,  0.63],
    [-0.63,  0.63, -0.63],
    [ 0.63, -0.63, -0.63]
])

for pos in positions:
    h = Atom(symbol="H", xyz=pos)
    methane.add_atom(h)
    methane.add_bond(Bond(c, h))

# Generate topology (angles, dihedrals)
methane.get_topo(gen_angle=True, gen_dihe=True)

print(f"Atoms: {len(list(methane.atoms))}")
print(f"Bonds: {len(list(methane.bonds))}")
print(f"Angles: {len(list(methane.angles))}")
print(f"Dihedrals: {len(list(methane.dihedrals))}")

Atoms: 5
Bonds: 4
Angles: 6
Dihedrals: 0
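
The counts above follow from the bond graph alone: an angle is any pair of bonds sharing a central atom, and a dihedral needs a chain of three consecutive bonds. The following is a minimal sketch of that enumeration in plain Python. It is not MolPy's `get_topo` implementation, and `enumerate_topology` is a hypothetical helper written for illustration.

```python
from itertools import combinations


def enumerate_topology(bonds):
    """Enumerate angles and dihedrals from a list of (i, j) bonds."""
    neighbors = {}
    for i, j in bonds:
        neighbors.setdefault(i, set()).add(j)
        neighbors.setdefault(j, set()).add(i)

    # One angle per unordered pair of neighbors around each central atom
    angles = [
        (a, center, b)
        for center, nbrs in neighbors.items()
        for a, b in combinations(sorted(nbrs), 2)
    ]

    # One dihedral per i-j-k-l path, with each bond j-k taken once as the
    # central bond
    dihedrals = [
        (i, j, k, l)
        for j, k in bonds
        for i in sorted(neighbors[j] - {k})
        for l in sorted(neighbors[k] - {j})
        if i != l
    ]
    return angles, dihedrals


# Methane: atom 0 is carbon, atoms 1-4 are hydrogens
methane_bonds = [(0, 1), (0, 2), (0, 3), (0, 4)]
methane_angles, methane_dihedrals = enumerate_topology(methane_bonds)
print(len(methane_angles), len(methane_dihedrals))

# Ethane for contrast: carbons 0 and 1, three hydrogens on each
ethane_bonds = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 5), (1, 6), (1, 7)]
ethane_angles, ethane_dihedrals = enumerate_topology(ethane_bonds)
print(len(ethane_angles), len(ethane_dihedrals))
```

Methane has four bonds on one center, giving 4 choose 2 = 6 angles, and no path of three consecutive bonds, hence zero dihedrals. Ethane, with a C-C bond in the middle, yields 12 angles and 9 dihedrals.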


---

## 4. Parser: SMILES to Molecules

Parse SMILES/BigSMILES/GBigSMILES strings into molecular structures.

**Supported:**
- SMILES: `CCO` (ethanol)
- BigSMILES: `{[<]CC[>]}` (polyethylene monomer)
- GBigSMILES: With molecular weight distributions

In [4]:
from molpy.parser.smiles import parse_bigsmiles, bigsmilesir_to_monomer
from molpy.external import RDKitAdapter, Generate3D

# Parse ethylene glycol monomer (HO-CH2-CH2-OH once hydrogens are added)
bigsmiles = "{[<]OCCO[>]}"
ir = parse_bigsmiles(bigsmiles)
monomer = bigsmilesir_to_monomer(ir)

# Generate 3D coordinates
adapter = RDKitAdapter(internal=monomer)
gen3d = Generate3D(add_hydrogens=True, embed=True, optimize=True, update_internal=True)
adapter = gen3d(adapter)
monomer_3d = adapter.get_internal()

print(f"Monomer atoms: {len(list(monomer_3d.atoms))}")
print(f"Ports: {[a.get('port') for a in monomer_3d.atoms if a.get('port')]}")

Monomer atoms: 10
Ports: ['<', '>']
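
To make the notation concrete, the sketch below pulls the bonding descriptors (`[<]`, `[>]`) out of a single-repeat-unit BigSMILES string and returns the plain SMILES that remains. It is a deliberately simplified, regex-based illustration and not MolPy's parser: `split_repeat_unit` is a hypothetical helper, and it ignores multiple repeat units, end groups, and nested objects.

```python
import re

# Matches bonding descriptors such as [<], [>], [$] and indexed forms like [<1]
DESCRIPTOR = re.compile(r"\[([<>$])\d*\]")


def split_repeat_unit(bigsmiles):
    """Return (descriptors, smiles) for a string like '{[<]OCCO[>]}'."""
    body = bigsmiles.strip()
    if not (body.startswith("{") and body.endswith("}")):
        raise ValueError("expected a single stochastic object in braces")
    body = body[1:-1]
    descriptors = DESCRIPTOR.findall(body)  # e.g. ['<', '>']
    smiles = DESCRIPTOR.sub("", body)       # descriptors stripped out
    return descriptors, smiles


descriptors, smiles = split_repeat_unit("{[<]OCCO[>]}")
print(descriptors, smiles)
```

The two descriptors recovered here correspond to the two ports reported for the monomer, and the remaining SMILES `OCCO` is the heavy-atom skeleton that is later completed with hydrogens.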


---

## 5. Reacter: Chemical Reactions

Run chemical reactions to connect monomers.

**Example:** Dehydration reaction (forms ether bond, removes H₂O)

In [5]:
from molpy.reacter import Reacter, select_hydroxyl_group, form_single_bond
from molpy.reacter.selectors import select_port_atom
from molpy.reacter.utils import find_neighbors

# Define reaction selectors
def select_carbon_from_oh(assembly, port_name):
    """Select C atom connected to -OH"""
    port_o = select_port_atom(assembly, port_name)
    c_neighbors = find_neighbors(assembly, port_o, element="C")
    return c_neighbors[0]

def select_h_from_oh(assembly, port_atom):
    """Select H from -OH"""
    h_neighbors = find_neighbors(assembly, port_atom, element="H")
    return [h_neighbors[0]]

# Create dehydration reacter
dehydration = Reacter(
    name="ether_formation",
    port_selector_left=select_carbon_from_oh,
    port_selector_right=select_port_atom,
    leaving_selector_left=select_hydroxyl_group,
    leaving_selector_right=select_h_from_oh,
    bond_former=form_single_bond,
)

# Run reaction (connect two monomers)
m1 = monomer_3d.copy()
m2 = monomer_3d.copy()

result = dehydration.run(left=m1, right=m2, port_L=">", port_R="<")

print(f"Product atoms: {len(list(result.product.atoms))}")
print(f"Removed atoms: {len(result.removed_atoms)} (water)")
print(f"New bonds: {len(result.new_bonds)}")

Product atoms: 17
Removed atoms: 3 (water)
New bonds: 1
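
The atom count of the product can be checked with simple formula bookkeeping: two monomers go in, one water leaves. The sketch below does this with `collections.Counter`, independently of MolPy. The molecular formulas are written out by hand for ethylene glycol and water, and `condense` is a hypothetical helper.

```python
from collections import Counter

ethylene_glycol = Counter({"C": 2, "H": 6, "O": 2})  # HO-CH2-CH2-OH, 10 atoms
water = Counter({"H": 2, "O": 1})                    # leaving group, 3 atoms


def condense(left, right, leaving):
    """Formula of the product after joining two fragments and losing one."""
    total = left + right
    product = total - leaving  # Counter subtraction keeps positive counts only
    if product + leaving != total:
        raise ValueError("leaving group is not contained in the reactants")
    return product


dimer = condense(ethylene_glycol, ethylene_glycol, water)
print(dict(dimer), sum(dimer.values()))
```

The result is C4H10O3 with 17 atoms, which agrees with the product atom count printed by the reacter.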


---

## 6. Builder: Polymer Chains

Build linear polymer chains from monomer sequences.

**Components:**
- `Connector`: Maps ports between monomers
- `Placer`: Positions monomers in 3D
- `linear()`: Builds chain from sequence

In [6]:
from molpy.builder.polymer.connectors import ReacterConnector
from molpy.builder.polymer.placer import create_covalent_linear_placer
from molpy.builder.polymer.linear import linear
from molpy.typifier.atomistic import OplsAtomisticTypifier

# Load force field
ff = mp.io.read_xml_forcefield("oplsaa.xml")
typifier = OplsAtomisticTypifier(ff, strict_typing=False)

# Type monomer
monomer_3d.get_topo(gen_angle=True, gen_dihe=True)
for idx, atom in enumerate(monomer_3d.atoms):
    atom["id"] = idx + 1
typifier.typify(monomer_3d)

# Create connector and placer
port_map = {("A", "A"): (">", "<")}
connector = ReacterConnector(default=dehydration, port_map=port_map)
placer = create_covalent_linear_placer()

# Build 5-mer chain
library = {"A": monomer_3d}
sequence = "AAAAA"

build_result = linear(
    sequence=sequence,
    library=library,
    connector=connector,
    typifier=typifier,
    placer=placer,
)

polymer = build_result.polymer
print(f"Polymer atoms: {len(list(polymer.atoms))}")
print(f"Polymer bonds: {len(list(polymer.bonds))}")



Polymer atoms: 38
Polymer bonds: 37
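
The atom and bond totals of a linear chain can be predicted before building it, since each of the `n - 1` joins removes a fixed number of atoms and bonds and forms one new bond. The sketch below encodes that arithmetic. The per-monomer numbers are assumptions counted by hand for ethylene glycol (10 atoms, 9 bonds) and for the dehydration above (3 atoms and 3 bonds lost per join); `linear_chain_counts` is a hypothetical helper, not part of MolPy.

```python
def linear_chain_counts(
    n_monomers,
    atoms_per_monomer=10,  # HO-CH2-CH2-OH
    bonds_per_monomer=9,   # 1 C-C, 2 C-O, 4 C-H, 2 O-H
    atoms_lost=3,          # one water per join
    bonds_lost=3,          # C-O and O-H on the left, O-H on the right
    bonds_formed=1,        # the new ether C-O bond
):
    """Return (atoms, bonds) for a linear chain of n_monomers units."""
    joins = n_monomers - 1
    atoms = n_monomers * atoms_per_monomer - joins * atoms_lost
    bonds = n_monomers * bonds_per_monomer - joins * (bonds_lost - bonds_formed)
    return atoms, bonds


for n in (1, 2, 5):
    print(n, linear_chain_counts(n))
```

For `n = 5` this gives 38 atoms and 37 bonds, matching the build output, and for `n = 2` it gives the 17 atoms seen in the single dehydration step.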


---

## 7. Typifier: Atom Typing

Assign force field atom types using SMARTS patterns.

**Automatic typing:**
- Atoms → atom types
- Bonds → bond types
- Angles → angle types
- Dihedrals → dihedral types

In [7]:
# Typifier already applied during building
# Check assigned types
atom_types = set()
for atom in polymer.atoms:
    if atom.get("type"):
        atom_types.add(atom.get("type"))

print(f"Unique atom types: {len(atom_types)}")
print(f"Types: {sorted(atom_types)[:5]}...")  # Show first 5

Unique atom types: 7
Types: ['opls_140', 'opls_154', 'opls_155', 'opls_157', 'opls_180']...


---

## 8. File I/O: Export to LAMMPS

Export structures to simulation formats.

**Supported formats:**
- LAMMPS data/trajectory
- PDB, GRO, XYZ
- Force field files

In [8]:
from molpy.io.data.lammps import LammpsDataWriter

# Prepare frame
frame = polymer.to_frame()
frame.metadata["box"] = Box.cubic(30.0)

# Add required fields
n_atoms = frame["atoms"].nrows
frame["atoms"]["mol"] = np.ones(n_atoms, dtype=int)
if "charge" in frame["atoms"]:
    frame["atoms"]["q"] = frame["atoms"]["charge"]

# Write LAMMPS data file
writer = LammpsDataWriter("polymer.data", atom_style="full")
writer.write(frame)

print("✅ Exported to polymer.data")

✅ Exported to polymer.data
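
The `mol` and `q` columns are needed because each line of the `Atoms` section for LAMMPS `atom_style full` lists atom ID, molecule ID, atom type, charge, and then x, y, z. The sketch below writes a stripped-down data file for the water example into a string to show that layout. It is an illustration only, not what `LammpsDataWriter` emits: `write_minimal_lammps_data` is a hypothetical helper, the integer atom types are made up, and sections a real file would usually carry (masses, bonds, angles) are omitted.

```python
import io


def write_minimal_lammps_data(handle, atoms, box_length):
    """Write a header and an 'Atoms # full' section to a text handle."""
    n_types = len({atom["type"] for atom in atoms})
    handle.write("Minimal LAMMPS data file (illustrative)\n\n")
    handle.write(f"{len(atoms)} atoms\n")
    handle.write(f"{n_types} atom types\n\n")
    for axis in ("x", "y", "z"):
        handle.write(f"{0.0:.4f} {box_length:.4f} {axis}lo {axis}hi\n")
    handle.write("\nAtoms # full\n\n")
    for atom in atoms:
        x, y, z = atom["xyz"]
        # atom_style full: id mol type q x y z
        handle.write(
            f'{atom["id"]} {atom["mol"]} {atom["type"]} '
            f'{atom["q"]:.4f} {x:.4f} {y:.4f} {z:.4f}\n'
        )


water_atoms = [
    {"id": 1, "mol": 1, "type": 1, "q": -0.834, "xyz": (0.0, 0.0, 0.0)},
    {"id": 2, "mol": 1, "type": 2, "q": 0.417, "xyz": (0.96, 0.0, 0.0)},
    {"id": 3, "mol": 1, "type": 2, "q": 0.417, "xyz": (-0.24, 0.93, 0.0)},
]

buffer = io.StringIO()
write_minimal_lammps_data(buffer, water_atoms, box_length=20.0)
data_text = buffer.getvalue()
print(data_text)
```

Writing to an in-memory buffer keeps the example free of side effects; passing an open file handle instead would produce a file on disk.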


---

## 9. Trajectory: Analyze Simulations

Read and analyze trajectory files.

**Features:**
- Lazy loading (memory efficient)
- Slicing and striding
- Property calculations

In [9]:
# Example: Create a simple trajectory
from molpy import Trajectory

# Simulate 10 frames with random displacements
frames = []
for i in range(10):
    f = frame.copy()
    # Add random displacement to each coordinate
    for coord in ["x", "y", "z"]:
        f["atoms"][coord] += np.random.randn(len(f["atoms"][coord])) * 0.1
    f.metadata["time"] = i * 1.0  # ps
    frames.append(f)

# Calculate RMSD
ref_x = frames[0]["atoms"]["x"]
ref_y = frames[0]["atoms"]["y"]
ref_z = frames[0]["atoms"]["z"]
rmsds = []
for f in frames:
    x = f["atoms"]["x"]
    y = f["atoms"]["y"]
    z = f["atoms"]["z"]
    rmsd = np.sqrt(np.mean((x - ref_x)**2 + (y - ref_y)**2 + (z - ref_z)**2))
    rmsds.append(rmsd)

print(f"RMSD range: {min(rmsds):.3f} - {max(rmsds):.3f} Å")

RMSD range: 0.000 - 0.000 Å
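
One thing to watch in an RMSD loop like this is aliasing. If the frame copies share the same underlying coordinate arrays, an in-place `+=` shifts the reference together with every other frame, and a range of 0.000 - 0.000 is the expected symptom. Whether `Frame.copy()` shares arrays is an assumption here, not something this guide states. The sketch below shows the same calculation on plain Python lists where every frame owns its coordinates; `displaced` and `rmsd` are illustrative helpers and no fitting or alignment is performed.

```python
import math
import random

rng = random.Random(0)

# Reference geometry (water coordinates from earlier in this guide)
base = [[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]]


def displaced(coords, sigma, rng):
    """Return a NEW coordinate list with Gaussian noise added."""
    return [[c + rng.gauss(0.0, sigma) for c in atom] for atom in coords]


def rmsd(coords, ref):
    """Root-mean-square deviation between two equally sized coordinate sets."""
    squared = [
        sum((a - b) ** 2 for a, b in zip(p, q)) for p, q in zip(coords, ref)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Each frame is built from fresh lists, so no two frames share storage
frames = [displaced(base, 0.1, rng) for _ in range(10)]
rmsds = [rmsd(f, frames[0]) for f in frames]

print(f"RMSD range: {min(rmsds):.3f} - {max(rmsds):.3f}")
```

Here the minimum is zero (frame 0 against itself) while the maximum is nonzero, which is the behaviour one would expect from independently displaced frames.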


---

## 10. Complete Workflow: Polydisperse System

**End-to-end example:** Build a polydisperse polymer system.

**Steps:**
1. Parse GBigSMILES
2. Generate molecular weight distribution
3. Build polymer chains
4. Pack into simulation box
5. Export to LAMMPS

In [10]:
from molpy.parser.smiles import parse_gbigsmiles_to_polymerspec
from molpy.builder.polymer.system import (
    create_dp_distribution_from_ir,
    PolydisperseChainGenerator,
    SystemPlanner,
)
from molpy.builder.polymer.sequence_generator import WeightedSequenceGenerator
from random import Random

# Parse GBigSMILES with distribution
gbigsmiles = "{[<]OCCO[>]}|flory_schulz(0.1)|[H].|1e5|"
spec = parse_gbigsmiles_to_polymerspec(gbigsmiles)

# Reuse monomer_3d from earlier and sum its atomic masses
# to get the mass of one monomer
from molpy.core.element import Element
avg_mw = sum(Element(a.get("symbol", "C").upper()).mass
             for a in monomer_3d.atoms)

# Create distribution
from molpy.builder.polymer.system import FlorySchulzDPDistribution
dp_dist = FlorySchulzDPDistribution(
    p=0.1,
    avg_monomer_mass=avg_mw,
    random_seed=42,
)

# Generate system plan
seq_gen = WeightedSequenceGenerator(monomer_weights={"0": 1.0})
chain_gen = PolydisperseChainGenerator(
    seq_generator=seq_gen,
    monomer_mass={"0": avg_mw},
    end_group_mass=0.0,
    dp_distribution=dp_dist,
)

planner = SystemPlanner(
    chain_generator=chain_gen,
    target_total_mass=1e5,  # 100 kDa
    max_rel_error=0.05,
)

rng = Random(42)
system_plan = planner.plan_system(rng=rng)

print(f"✅ Generated {len(system_plan.chains)} chains")
print(f"   Total mass: {system_plan.total_mass:.1f} Da")
print(f"   Chain length range: {min(c.dp for c in system_plan.chains)} - {max(c.dp for c in system_plan.chains)}")

✅ Generated 180 chains
   Total mass: 100302.1 Da
   Chain length range: 1 - 45
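
The planning step can be pictured as drawing chain lengths from a distribution until the accumulated mass reaches the target. The sketch below does this with the standard library only, assuming the number-fraction (geometric) form of the Flory-Schulz distribution, P(k) = a(1 - a)^(k - 1). Whether `FlorySchulzDPDistribution` samples this form or the weight-fraction form is not stated in this guide, so treat the numbers as illustrative. `sample_dp` and `plan_chains` are hypothetical helpers, and the stopping rule is a simple greedy one rather than MolPy's `max_rel_error` logic.

```python
import math
import random

# Standard atomic masses (Da); ethylene glycol is C2H6O2
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}
MONOMER_MASS = 2 * ATOMIC_MASS["C"] + 6 * ATOMIC_MASS["H"] + 2 * ATOMIC_MASS["O"]


def sample_dp(a, rng):
    """Draw a degree of polymerization k >= 1 with P(k) = a * (1 - a)**(k - 1)."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    return int(math.log(u) / math.log(1.0 - a)) + 1


def plan_chains(target_mass, a, monomer_mass, rng):
    """Add chains until the total mass first reaches target_mass."""
    dps = []
    total = 0.0
    while total < target_mass:
        dp = sample_dp(a, rng)
        dps.append(dp)
        total += dp * monomer_mass
    return dps, total


rng = random.Random(42)
dps, total_mass = plan_chains(1e5, a=0.1, monomer_mass=MONOMER_MASS, rng=rng)
mean_dp = sum(dps) / len(dps)

print(f"Chains: {len(dps)}")
print(f"Total mass: {total_mass:.1f} Da")
print(f"Mean DP: {mean_dp:.1f}, range: {min(dps)} - {max(dps)}")
```

With `a = 0.1` the expected mean chain length is 1/a = 10 repeat units, so a target of 100 kDa at roughly 62 Da per monomer should land in the region of 160 chains. The overshoot of the greedy rule is bounded by the mass of the last chain added.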


---

## 🎯 Next Steps

**Tutorials (hands-on):**
- [Frame & Block](../tutorials/frame-block.ipynb) - Data structures deep dive
- [Polymer Workflows](../tutorials/polymer-step-by-step.ipynb) - Complete polymer building
- [Force Fields](../tutorials/force-field.ipynb) - Atom typing and parameters

**User Guide (comprehensive):**
- [Parser](../user-guide/parser.ipynb) - SMILES/BigSMILES/GBigSMILES
- [Reacter](../user-guide/reacter.ipynb) - Chemical reactions
- [Builder](../user-guide/polymer_builder_overview.ipynb) - Polymer construction
- [Typifier](../user-guide/typifier.ipynb) - Force field assignment
- [IO](../user-guide/io.ipynb) - File formats

**API Reference:**
- [Complete API](../api/index.md) - All modules documented

**Need help?**
- [FAQ](faq.md)
- [GitHub Issues](https://github.com/MolCrafts/molpy/issues)
- [Discussions](https://github.com/MolCrafts/molpy/discussions)