# Quickstart Guide: First Steps with MolPy

Welcome to the MolPy quickstart guide. This tutorial takes you from an empty script to a simulation-ready system. We will walk through the core workflow of MolPy: **Building**, **Manipulating**, and **Exporting** molecular systems.

MolPy separates the *chemical graph* (connectivity) from the *physical data* (arrays). You build and edit on the graph, where atoms and bonds are individual objects, and convert to arrays once the structure is ready for array-based processing and file I/O.

## Objectives
By the end of this tutorial, you will be able to:
1.  **Build** a molecule from scratch using `Atomistic` APIs.
2.  **Inspect** and **Manipulate** molecular properties.
3.  **Typify** your system with a force field.
4.  **Export** your system to a LAMMPS data file for simulation.

Let's get started!

## 1. Build a Molecule (Atomistic, end‑to‑end)

This section is intentionally *not* “atom here, bond there, angle over there”. Instead, we follow the way you actually work:

1) create an `Atomistic` graph (atoms + bonds)
2) derive topology from bonds (`get_topo`)
3) do a few common edits (CRUD)
4) keep the structure in a consistent, exportable state

Key idea: **`Atomistic` is the editable graph**. You should be able to build, query, modify, and re‑derive topology without manually bookkeeping angles/dihedrals.

In [1]:
# Imports used throughout this notebook
import molpy as mp
import numpy as np

print(f"MolPy version: {mp.__version__}")

MolPy version: 0.2.0


### 1.1 Create a structure (atoms + bonds)

We’ll build a single water molecule. Two MolPy conventions to notice:

- **Atoms are entities**: they carry flexible key/value data (e.g., `symbol`, `xyz`, `charge`, `type`).
- **Bonds are links**: they connect atoms and can also carry data (e.g., bond order, typed parameters).

Tip: use `xyz=[x, y, z]` when defining atoms. MolPy normalizes it into separate `x`, `y`, and `z` fields for you.
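
To make the tip concrete, here is a minimal sketch of what this kind of normalization could look like. `normalize_xyz` is a hypothetical helper written for illustration, not MolPy's implementation. It only shows the idea of splitting an `xyz` triple into separate `x`, `y`, `z` fields.

```python
def normalize_xyz(fields):
    """Return a copy of `fields` with an `xyz` triple split into x/y/z.

    Hypothetical helper for illustration only; not part of MolPy.
    """
    out = dict(fields)
    if "xyz" in out:
        x, y, z = out.pop("xyz")  # raises ValueError unless exactly 3 values
        out.update(x=float(x), y=float(y), z=float(z))
    return out


atom = normalize_xyz({"name": "OW", "xyz": [0.0, 0.0, 0.0]})
print(sorted(atom))  # ['name', 'x', 'y', 'z']
```

After normalization, the per-axis fields can be read individually (as `atom["x"]` is in the rotation example later in this guide).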

In [2]:
# Build a water molecule as an Atomistic graph
water = mp.Atomistic(name="water")

# Atoms (graph nodes)
# Partial charges: oxygen is negative, hydrogens positive, and the total is zero
o = water.def_atom(type="O", name="OW", xyz=[0.0, 0.0, 0.0], charge=-0.834)
h1 = water.def_atom(type="H", name="HW1", xyz=[0.96, 0.0, 0.0], charge=0.417)
h2 = water.def_atom(type="H", name="HW2", xyz=[-0.24, 0.93, 0.0], charge=0.417)

# Bonds (graph edges)
b1 = water.def_bond(o, h1, type="HO")
b2 = water.def_bond(o, h2, type="HO")

# The single H-O-H angle is defined by hand here; section 1.2 derives angles from bonds instead
a1 = water.def_angle(h1, o, h2, type="HOH")

print(water)
print("Neighbor atoms of O:", [a.get("name") for a in water.get_neighbors(o, link_type=mp.Bond)])

<Atomistic, 3 atoms (?:3), 2 bonds>
Neighbor atoms of O: ['HW1', 'HW2']
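
As a quick sanity check on the coordinates used above, the following sketch computes the two O-H distances and the H-O-H angle using only the standard library. It does not call MolPy. The helper names are illustrative, and the length unit is an assumption (the guide does not state one).

```python
import math

# Coordinates copied from the water example (units assumed, not stated in the guide)
O = (0.0, 0.0, 0.0)
H1 = (0.96, 0.0, 0.0)
H2 = (-0.24, 0.93, 0.0)


def angle_deg(a, center, b):
    """Angle a-center-b in degrees."""
    u = [p - q for p, q in zip(a, center)]
    v = [p - q for p, q in zip(b, center)]
    cos_t = sum(p * q for p, q in zip(u, v)) / (math.hypot(*u) * math.hypot(*v))
    return math.degrees(math.acos(cos_t))


d1 = math.dist(O, H1)
d2 = math.dist(O, H2)
theta = angle_deg(H1, O, H2)
print(f"O-H1 = {d1:.3f}, O-H2 = {d2:.3f}, H-O-H = {theta:.1f} deg")
```

Both bonds come out at roughly 0.96 and the angle at roughly 104.5 degrees, so the hand-typed coordinates describe a sensible water geometry.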


### 1.2 Derive topology + do practical CRUD (without busywork)

You generally should **not** hand-create angles/dihedrals. Instead:

- define bonds
- call `get_topo(...)` to derive higher-order interactions

In [3]:
# Build ammonia (NH3) as an Atomistic graph: atoms and bonds only, no hand-made angles
nh3 = mp.Atomistic(name="NH3")

# Placeholder charges for illustration; N is set to -3 * 0.417 so the molecule is neutral
n = nh3.def_atom(type="N", name="N1", xyz=[0.0, 0.0, 0.0], charge=-1.251)
h1 = nh3.def_atom(type="H", name="H1", xyz=[0.96, 0.0, 0.0], charge=0.417)
h2 = nh3.def_atom(type="H", name="H2", xyz=[-0.24, 0.93, 0.0], charge=0.417)
h3 = nh3.def_atom(type="H", name="H3", xyz=[-0.24, -0.93, 0.0], charge=0.417)

b1 = nh3.def_bond(n, h1, type="NH")
b2 = nh3.def_bond(n, h2, type="NH")
b3 = nh3.def_bond(n, h3, type="NH")

# Derive angles (and dihedrals, where a four-atom chain exists) from the bond graph
topo = nh3.get_topo(gen_angle=True, gen_dihe=True)

print(f"angles derived: {len(nh3.angles)}")

angles derived: 3
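
The count of three follows directly from the bond graph: every pair of bonds sharing an atom defines one angle. The sketch below shows that enumeration in plain Python. It illustrates the general technique, not MolPy's internal algorithm, and `build_neighbors`, `derive_angles`, and `derive_dihedrals` are hypothetical names.

```python
from collections import defaultdict
from itertools import combinations


def build_neighbors(bonds):
    """Adjacency map: atom -> set of bonded atoms."""
    neighbors = defaultdict(set)
    for a, b in bonds:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors


def derive_angles(bonds):
    """Enumerate angles (i, j, k), where j is bonded to both i and k."""
    neighbors = build_neighbors(bonds)
    return [
        (i, center, k)
        for center in sorted(neighbors)
        for i, k in combinations(sorted(neighbors[center]), 2)
    ]


def derive_dihedrals(bonds):
    """Enumerate proper dihedrals (i, j, k, l) around each central bond j-k."""
    neighbors = build_neighbors(bonds)
    return [
        (i, j, k, l)
        for j, k in bonds
        for i in sorted(neighbors[j] - {k})
        for l in sorted(neighbors[k] - {j})
        if i != l
    ]


nh3_bonds = [("N1", "H1"), ("N1", "H2"), ("N1", "H3")]
print(derive_angles(nh3_bonds))     # three H-N-H angles centered on N1
print(derive_dihedrals(nh3_bonds))  # empty: NH3 has no chain of four bonded atoms
```

This is why requesting dihedrals for NH3 adds nothing: a dihedral needs three consecutive bonds, and every bond in ammonia ends at a hydrogen with no further neighbors.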


## 2. Geometric manipulation (still on Atomistic)

MolPy’s spatial operations are designed for *workflow composition*: you can translate, rotate, scale, and align entities directly on the graph, without converting to `Frame` first.

Below we:
- translate the molecule
- rotate around a chosen point (`about=...`) so you control what stays fixed

In [4]:
# Translate the whole molecule
water.move([5.0, 0.0, 0.0])

# Rotate 30° about the oxygen position (O stays fixed, up to floating-point error)
about_o = [o["x"], o["y"], o["z"]]
water.rotate(axis=[0, 0, 1], angle=np.deg2rad(30.0), about=about_o)

for atom in water.atoms:
    print(f"{atom['name']}: ({atom['x']:.3f}, {atom['y']:.3f}, {atom['z']:.3f})")

OW: (5.000, 0.000, 0.000)
HW1: (5.831, 0.480, 0.000)
HW2: (4.327, 0.685, 0.000)
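
The printed coordinates can be reproduced by hand. Rotating about a point means shifting so the pivot sits at the origin, applying the rotation, and shifting back. The sketch below does this for a rotation about the z axis with the standard library only. `rotate_z_about` is an illustrative helper, not a MolPy function.

```python
import math


def rotate_z_about(point, angle_rad, about):
    """Rotate `point` around the z axis that passes through `about`."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    dx, dy = point[0] - about[0], point[1] - about[1]  # shift pivot to origin
    return (
        about[0] + c * dx - s * dy,  # rotate, then shift back
        about[1] + s * dx + c * dy,
        point[2],                    # z is unchanged by a z-axis rotation
    )


# Water coordinates after the +5.0 translation along x
translated = {
    "OW": (5.0, 0.0, 0.0),
    "HW1": (5.96, 0.0, 0.0),
    "HW2": (4.76, 0.93, 0.0),
}
pivot = translated["OW"]
rotated = {
    name: rotate_z_about(xyz, math.radians(30.0), pivot)
    for name, xyz in translated.items()
}
for name, (x, y, z) in rotated.items():
    print(f"{name}: ({x:.3f}, {y:.3f}, {z:.3f})")
```

Because the pivot is the oxygen position, its offset from the pivot is zero and the rotation leaves it where it is. Only the hydrogens move.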


## 4. Convert to Frame and export

When you’re done editing, convert to `Frame` for array-centric workflows and file I/O.

Why `Frame`?
- It stores coordinates and attributes in contiguous NumPy arrays (fast)
- Writers/readers operate on a stable table representation
- The box (PBC) lives in `frame.metadata["box"]`
- The LAMMPS writer will handle 0-based vs 1-based indexing conventions for you

> Note: export formats vary in what columns they expect. For LAMMPS `atom_style="full"`, you typically need `id`, `mol`, and `q` (charge).
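
The following sketch illustrates the two ideas above in plain Python: turning per-atom records into equal-length columns, and shifting 0-based indices to the 1-based IDs that LAMMPS data files use. It is a conceptual illustration only. `to_columns` and the record layout are assumptions made for this example and do not mirror MolPy's `Frame` internals.

```python
# Per-atom records, as they might look on the editable graph side (illustrative values)
records = [
    {"name": "OW", "charge": -0.834},
    {"name": "HW1", "charge": 0.417},
    {"name": "HW2", "charge": 0.417},
]
bonds_0based = [(0, 1), (0, 2)]  # indices into `records`


def to_columns(rows):
    """Turn a list of row dicts into a dict of equal-length columns."""
    return {key: [row[key] for row in rows] for key in rows[0]}


atoms = to_columns(records)
n_atoms = len(records)
atoms["id"] = list(range(1, n_atoms + 1))  # atom IDs start at 1
atoms["mol"] = [1] * n_atoms               # a single molecule
atoms["q"] = list(atoms["charge"])         # charge column under the name `q`

# Bond endpoints must use the same 1-based IDs as the atoms
bonds_1based = [(i + 1, j + 1) for i, j in bonds_0based]

# Check the columns the note above lists for the "full" atom style
required = {"id", "mol", "q"}
missing = required - set(atoms)
print("missing columns:", sorted(missing))
print("bonds (1-based):", bonds_1based)
```

A check like `missing` is cheap to run before writing a file and catches a forgotten column earlier than a failed simulation input would.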

In [5]:
from molpy.io.writers import write_lammps_data

# Convert the editable graph to an array-centric Frame
frame = water.to_frame()
frame.metadata["box"] = mp.Box.cubic(10.0)

# Ensure the minimal columns for the LAMMPS "full" atom style
n_atoms = frame["atoms"].nrows
frame["atoms"]["id"] = np.arange(1, n_atoms + 1)  # 1-based atom IDs
frame["atoms"]["mol"] = np.ones(n_atoms, dtype=int)  # one molecule, so mol = 1 everywhere
if "charge" in frame["atoms"] and "q" not in frame["atoms"]:
    frame["atoms"]["q"] = frame["atoms"]["charge"]

write_lammps_data("system.data", frame, atom_style="full")
print("Wrote system.data")

# Optional: inspect available columns
print("atom columns:", sorted(frame["atoms"].keys()))
print("blocks:", list(frame.to_dict()["blocks"].keys()))

Wrote system.data
atom columns: ['charge', 'id', 'mol', 'name', 'q', 'type', 'x', 'y', 'z']
blocks: ['atoms', 'bonds', 'angles']


## 5. What you learned (and what to do next)

You now have a complete, minimal pipeline:
- Build and edit with `Atomistic` (graph CRUD)
- Derive topology from bonds (`get_topo`)
- Attach parameters with a `ForceField` + `Typifier`
- Convert to `Frame` and export for simulation

Next: read Core Concepts to learn the “why” behind the data model and then jump into Tutorials for larger systems.