# Quickstart Guide: First Steps with MolPy

Welcome to the MolPy quickstart guide. This tutorial takes you from an empty structure to a LAMMPS data file that is ready for simulation. We will walk through the core workflow of MolPy: **Building**, **Manipulating**, and **Exporting** molecular systems.

MolPy separates the *chemical graph* (atoms and their connectivity) from the *physical data* (per-atom arrays). You build and edit molecules as a graph, then convert them to array-based tables for numerical work and file output.

## Objectives
By the end of this tutorial, you will be able to:
1.  **Build** a molecule from scratch using `Atomistic` APIs.
2.  **Inspect** and **Manipulate** molecular properties.
3.  **Typify** your system by assigning force-field types to atoms, bonds, and angles.
4.  **Export** your system to a LAMMPS data file for simulation.

Let's get started!

## 1. Setting Up the Environment

First, we need to import `molpy` and `numpy`. MolPy relies heavily on NumPy for efficient numerical operations, so it's good practice to have it handy.

In [1]:
import molpy as mp
import numpy as np

print(f"MolPy version: {mp.__version__}")

MolPy version: 0.2.0


## 2. Building Molecules: The `Atomistic` Approach

When you are building a molecule, you think in terms of **Atoms** and **Bonds**. MolPy's `Atomistic` class is designed exactly for this. It acts as a container for your chemical graph, managing entities and their connections.

### Creating a Water Molecule

Let's create a simple water molecule. We start by initializing an empty `Atomistic` container. Think of this as an empty canvas for your molecule.

In [2]:
# Initialize an empty atomistic structure
water = mp.Atomistic(name="water_molecule")
print(water)

<Atomistic, 0 atoms (), 0 bonds>


### Adding Atoms

Now, we will add atoms to our structure. The `def_atom` method is your primary tool here. It initializes a new `Atom` entity and registers it with the system.

You can pass any keyword arguments to `def_atom` to set properties. Standard properties include:
*   `name`: An identifier for the atom (e.g., "OW", "HW1").
*   `symbol` or `element`: The chemical element string.
*   `x`, `y`, `z`: Cartesian coordinates.
*   `charge`: Partial charge.
*   `type`: Integer or string type ID for force fields.

Let's define the Oxygen and two Hydrogen atoms:

In [3]:
# Define Oxygen atom
o = water.def_atom(
    element="O", 
    name="OW", 
    x=0.0, y=0.0, z=0.0, 
    charge=-0.834,
    type="OW"  # Explicitly set type for later use
)

# Define two Hydrogen atoms
h1 = water.def_atom(
    element="H", 
    name="HW1", 
    x=0.96, y=0.0, z=0.0, 
    charge=0.417,
    type="HW"
)

h2 = water.def_atom(
    element="H", 
    name="HW2", 
    x=-0.24, y=0.93, z=0.0, 
    charge=0.417,
    type="HW"
)

print(f"Structure now has {len(water.atoms)} atoms.")

Structure now has 3 atoms.


### Accessing Atom Properties

The objects returned by `def_atom` (`o`, `h1`, `h2`) are **Atom instances**. They behave like Python dictionaries, so you can read or write their properties at any time.

This flexibility lets you attach custom data, such as experimental IDs or simulation parameters, without a fixed schema.

In [4]:
# Access properties using dictionary syntax
print(f"Oxygen element: {o['element']}")
print(f"Oxygen position: ({o['x']}, {o['y']}, {o['z']})")

# Modify properties dynamically
o['mass'] = 15.9994
print(f"Oxygen mass: {o['mass']}")

# Add a custom tag
o['note'] = "Central Atom"

Oxygen element: O
Oxygen position: (0.0, 0.0, 0.0)
Oxygen mass: 15.9994


### Defining Topology: Bonds

Atoms alone do not make a molecule; they need connectivity. We use `def_bond` to connect two atoms. Like atoms, bonds are entities that can hold properties such as `order` or `type`.

Each bond exposes its two endpoint atoms as `itom` and `jtom`.

In [5]:
# Create bonds between Oxygen and Hydrogens
bond1 = water.def_bond(o, h1, type="OH", order=1)
bond2 = water.def_bond(o, h2, type="OH", order=1)

print(f"Structure now has {len(water.bonds)} bonds.")
print(f"Bond 1 connects: {bond1.itom['name']} - {bond1.jtom['name']}")

Structure now has 2 bonds.
Bond 1 connects: OW - HW1
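
As a quick sanity check on the geometry, bond lengths can be computed from the endpoint coordinates. The sketch below does not use MolPy. It mimics atoms as plain dictionaries and bonds as records with `itom` and `jtom` keys, and the helper `bond_length` is a hypothetical name introduced only for illustration.

```python
import math

# Plain-dict stand-ins for the three atoms defined above (not MolPy objects)
o = {"name": "OW", "x": 0.0, "y": 0.0, "z": 0.0}
h1 = {"name": "HW1", "x": 0.96, "y": 0.0, "z": 0.0}
h2 = {"name": "HW2", "x": -0.24, "y": 0.93, "z": 0.0}

# Each bond record keeps references to its two endpoint atoms
bonds = [
    {"itom": o, "jtom": h1, "type": "OH"},
    {"itom": o, "jtom": h2, "type": "OH"},
]


def bond_length(bond):
    """Euclidean distance between the two endpoints of a bond record."""
    a, b = bond["itom"], bond["jtom"]
    return math.dist((a["x"], a["y"], a["z"]), (b["x"], b["y"], b["z"]))


lengths = [bond_length(bond) for bond in bonds]
for bond, length in zip(bonds, lengths):
    print(f"{bond['itom']['name']}-{bond['jtom']['name']}: {length:.3f}")
```

Both O-H distances come out close to 0.96, which confirms that the hand-typed coordinates describe a reasonable water geometry.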


### Automated Topology Analysis

Manually defining angles and dihedrals is tedious and error-prone. MolPy simplifies this with the `get_topo()` method.

By calling `get_topo(gen_angle=True)`, MolPy analyzes the bond graph and automatically detects all 3-body angles (and 4-body dihedrals if requested). It then registers these new entities into the structure.

**Important**: `get_topo` detects the *existence* of an angle, but it does not know the *chemistry*. We must manually assign a `type` to these generated entities if we want to export them to a force-field-compliant format.

In [6]:
# Automatically generate angles based on connectivity
water.get_topo(gen_angle=True)

# Assign a type to the generated angles
# get_topo finds angles from connectivity; the chemical type is ours to define
for angle in water.angles:
    angle["type"] = "HOH"

# Verify that the H-O-H angle was found
print(f"Structure now has {len(water.angles)} angle(s).")
for angle in water.angles:
    print(f"Angle found: {angle.itom['name']} - {angle.jtom['name']} - {angle.ktom['name']}")

Structure now has 1 angle(s).
Angle found: HW1 - OW - HW2
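
To make the idea of detecting angles from the bond graph concrete, here is a minimal standalone sketch. It is not MolPy's implementation. It assumes bonds are given as pairs of atom names, and the function name `find_angles` is hypothetical. An angle exists wherever two bonds share a central atom, so the sketch builds a neighbor map and lists every pair of neighbors around each center.

```python
from collections import defaultdict
from itertools import combinations


def find_angles(bonds):
    """Return (i, j, k) triples in which atom j is bonded to both i and k."""
    neighbors = defaultdict(set)
    for a, b in bonds:
        neighbors[a].add(b)
        neighbors[b].add(a)

    angles = []
    for center in sorted(neighbors):
        # Every unordered pair of neighbors around a center forms one angle
        for i, k in combinations(sorted(neighbors[center]), 2):
            angles.append((i, center, k))
    return angles


water_bonds = [("OW", "HW1"), ("OW", "HW2")]
angles = find_angles(water_bonds)
print(angles)
```

For water this yields a single triple centered on the oxygen. An atom with four bonded neighbors, such as the carbon in methane, would yield six angles because there are six ways to pick two neighbors out of four.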


## 3. Transformations and Manipulation

MolPy provides built-in methods for geometric transformations. Common operations like translation and rotation are available directly on the `Atomistic` instance.

Let's move our water molecule by 5 Angstroms along the X-axis.

In [7]:
# Translate the whole molecule
water.move([5.0, 0.0, 0.0])

# Check new positions
print("New positions:")
for atom in water.atoms:
    print(f"{atom['name']}: ({atom['x']:.2f}, {atom['y']:.2f}, {atom['z']:.2f})")

New positions:
OW: (5.00, 0.00, 0.00)
HW1: (5.96, 0.00, 0.00)
HW2: (4.76, 0.93, 0.00)
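
The arithmetic behind these rigid transformations is simple enough to sketch without MolPy. The helpers below (`translate` and `rotate_z`, both hypothetical names) operate on plain coordinate tuples. They illustrate the math only and make no claim about how MolPy implements `move` or its rotation method.

```python
import math


def translate(coords, delta):
    """Shift every point by the same displacement vector."""
    return [tuple(c + d for c, d in zip(point, delta)) for point in coords]


def rotate_z(coords, angle_deg):
    """Rotate every point counterclockwise about the z-axis through the origin."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return [(cos_t * x - sin_t * y, sin_t * x + cos_t * y, z) for x, y, z in coords]


# Original water coordinates: O, H1, H2
water_xyz = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]

moved = translate(water_xyz, (5.0, 0.0, 0.0))
rotated = rotate_z(water_xyz, 90.0)

# Rigid transformations must leave interatomic distances unchanged
oh_before = math.dist(water_xyz[0], water_xyz[1])
oh_moved = math.dist(moved[0], moved[1])
oh_rotated = math.dist(rotated[0], rotated[1])
print(f"{oh_before:.3f} {oh_moved:.3f} {oh_rotated:.3f}")
```

Comparing a bond length before and after a transformation is a cheap way to confirm that the molecule was moved as a rigid body.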


## 4. Converting to Frame for Output

While `Atomistic` is great for building, simulation software (like LAMMPS or GROMACS) expects structured numeric tables. This is where the **Frame** comes in.

`Frame` organizes data into contiguous memory blocks (NumPy arrays), making it efficient for I/O and large-scale processing.

We use the `to_frame()` method to convert our graph-based molecule into a table-based frame. Since we ensured all atoms, bonds, and angles have a `type`, this conversion should handle the full topology.

In [8]:
# Convert Atomistic graph to data Frame
frame = water.to_frame()

# The frame contains blocks for atoms, bonds, etc.
print("Frame blocks:", list(frame.blocks))

Frame blocks: [Block(name: shape=(3,), mass: shape=(3,), charge: shape=(3,), z: shape=(3,), element: shape=(3,), x: shape=(3,), y: shape=(3,), note: shape=(3,), type: shape=(3,)), Block(atom_i: shape=(2,), atom_j: shape=(2,), order: shape=(2,), type: shape=(2,)), Block(atom_i: shape=(1,), atom_j: shape=(1,), atom_k: shape=(1,), type: shape=(1,))]
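
The graph-to-table conversion can be pictured with a small standalone sketch. This is an illustration under stated assumptions, not MolPy's `to_frame` code: atoms are plain dictionaries, bonds are pairs of atom references, and `to_columns` is a hypothetical helper. The column names `atom_i` and `atom_j` follow the block output above. Note how a property set on only one atom (here `note`) still becomes a full-length column, with a placeholder for the atoms that lack it.

```python
# Graph-style data: one dict per atom, bonds hold references to atoms
atoms = [
    {"name": "OW", "x": 5.0, "charge": -0.834, "note": "Central Atom"},
    {"name": "HW1", "x": 5.96, "charge": 0.417},
    {"name": "HW2", "x": 4.76, "charge": 0.417},
]
bonds = [(atoms[0], atoms[1]), (atoms[0], atoms[2])]


def to_columns(records):
    """Turn a list of dicts into a dict of equal-length columns."""
    keys = sorted({key for record in records for key in record})
    # Missing values are filled with None so every column has one entry per record
    return {key: [record.get(key) for record in records] for key in keys}


# Replace object references with row indices into the atom table
index_of = {id(atom): row for row, atom in enumerate(atoms)}

atom_table = to_columns(atoms)
bond_table = {
    "atom_i": [index_of[id(a)] for a, _ in bonds],
    "atom_j": [index_of[id(b)] for _, b in bonds],
}

print(atom_table["x"])
print(bond_table)
```

The key step is replacing object references with integer row indices, which is what makes the bond table a purely numeric structure.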


### Inspecting Frame Data

You can access the data inside a `Frame` just like columns in a DataFrame or a dictionary of arrays. This is the preferred way to perform heavy numerical analysis or batched updates.

In [9]:
# Access atomic coordinates as numpy arrays
x_coords = frame['atoms']['x']
charges = frame['atoms']['charge']

print("X coordinates (array):", x_coords)
print("Total charge:", np.sum(charges))

X coordinates (array): [5.   5.96 4.76]
Total charge: 0.0
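
The total charge happens to be exactly `0.0` here, but sums of floating-point partial charges are often off by a tiny rounding error. A neutrality check is therefore usually written with a tolerance. The sketch below uses only the Python standard library; the function name `is_neutral` and the tolerance value are assumptions chosen for illustration.

```python
import math


def is_neutral(charges, tol=1e-8):
    """Return True if the charges sum to zero within an absolute tolerance."""
    total = math.fsum(charges)  # fsum reduces accumulated rounding error
    return math.isclose(total, 0.0, abs_tol=tol)


water_charges = [-0.834, 0.417, 0.417]
print(is_neutral(water_charges))

# A case where naive summation leaves a small residue, yet the system is neutral
print(is_neutral([0.1, 0.1, 0.1, -0.3]))

# A genuinely charged system is still reported as non-neutral
print(is_neutral([1.0, -0.5]))
```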


## 5. Adding Simulation Context: The Box

Most simulations need a simulation box, which sets the domain over which periodic boundary conditions apply. The `Box` class defines the size and shape of that domain.

We create a cubic box and attach it to the frame's metadata.

In [10]:
# Define a 10x10x10 Angstrom cubic box
box = mp.Box.cubic(10.0)

# Attach box to the frame
frame.metadata["box"] = box

print(f"Simulation Box: {box}")

Simulation Box: <Orthogonal Box: [10. 10. 10.]>
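
To show what periodic boundary conditions mean in practice, here is a minimal sketch for a cubic box. It does not use MolPy's `Box` class. It assumes the box spans `[0, length)` along each axis, and the helper names `wrap` and `minimum_image` are hypothetical.

```python
def wrap(position, length):
    """Map a position back into the primary cell [0, length) on each axis."""
    return tuple(c % length for c in position)


def minimum_image(a, b, length):
    """Shortest displacement from a to b, taking periodic images into account."""
    displacement = []
    for ca, cb in zip(a, b):
        delta = cb - ca
        # Shift by a whole number of box lengths to land in [-length/2, length/2]
        delta -= length * round(delta / length)
        displacement.append(delta)
    return tuple(displacement)


box_length = 10.0

wrapped = wrap((12.5, -1.0, 3.0), box_length)
print(wrapped)

# Two atoms near opposite faces are actually close neighbors across the boundary
shortest = minimum_image((0.5, 0.0, 0.0), (9.5, 0.0, 0.0), box_length)
print(shortest)
```

The second call shows why the box matters for analysis: the two atoms are 9.0 apart by naive subtraction but only 1.0 apart once the periodic image is considered.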


## 6. Exporting to LAMMPS

Finally, we are ready to save our system. MolPy supports various formats, but here we will focus on the **LAMMPS Data** format, a standard input for MD simulations.

LAMMPS expects certain per-atom columns, such as an atom `id` and a molecule ID `mol`. If the frame does not have them yet, we can generate them quickly.

In [11]:
from molpy.io.data.lammps import LammpsDataWriter

# Ensure basic IDs exist for LAMMPS
n_atoms = frame['atoms'].nrows
frame['atoms']['id'] = np.arange(1, n_atoms + 1)
frame['atoms']['mol'] = np.ones(n_atoms, dtype=int)

# Ensure charge column is named 'q' for LAMMPS full style
if 'charge' in frame['atoms'] and 'q' not in frame['atoms']:
    frame['atoms']['q'] = frame['atoms']['charge']

# Write the file
writer = LammpsDataWriter("system.data", atom_style="full")
writer.write(frame)

print("Successfully wrote 'system.data'")

Successfully wrote 'system.data'
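
It helps to know roughly what a LAMMPS data file in `full` atom style looks like, since that explains why the `id`, `mol`, and `q` columns are needed. The sketch below formats a reduced file by hand. It is not MolPy's `LammpsDataWriter`, and the function `format_lammps_full` is a hypothetical name. Sections such as Masses and Angles are left out for brevity, and string types are mapped to integer type IDs in order of first appearance, which is an assumption about one reasonable convention.

```python
def format_lammps_full(atoms, bonds, box_length):
    """Format a reduced LAMMPS data file (header, Atoms, Bonds) as a string."""
    # Map string type names to consecutive integer IDs
    atom_types = {}
    for atom in atoms:
        atom_types.setdefault(atom["type"], len(atom_types) + 1)
    bond_types = {}
    for bond_type, _, _ in bonds:
        bond_types.setdefault(bond_type, len(bond_types) + 1)

    lines = ["LAMMPS data file (illustrative sketch)", ""]
    lines += [f"{len(atoms)} atoms", f"{len(bonds)} bonds", ""]
    lines += [f"{len(atom_types)} atom types", f"{len(bond_types)} bond types", ""]
    for axis in "xyz":
        lines.append(f"0.0 {box_length} {axis}lo {axis}hi")

    # Full style columns: atom-ID molecule-ID atom-type q x y z
    lines += ["", "Atoms # full", ""]
    for a in atoms:
        lines.append(
            f"{a['id']} {a['mol']} {atom_types[a['type']]} {a['q']} "
            f"{a['x']} {a['y']} {a['z']}"
        )

    # Bond columns: bond-ID bond-type atom-ID-1 atom-ID-2
    lines += ["", "Bonds", ""]
    for n, (bond_type, i, j) in enumerate(bonds, start=1):
        lines.append(f"{n} {bond_types[bond_type]} {i} {j}")
    return "\n".join(lines) + "\n"


atoms = [
    {"id": 1, "mol": 1, "type": "OW", "q": -0.834, "x": 5.0, "y": 0.0, "z": 0.0},
    {"id": 2, "mol": 1, "type": "HW", "q": 0.417, "x": 5.96, "y": 0.0, "z": 0.0},
    {"id": 3, "mol": 1, "type": "HW", "q": 0.417, "x": 4.76, "y": 0.93, "z": 0.0},
]
bonds = [("OH", 1, 2), ("OH", 1, 3)]  # (type, atom-ID, atom-ID), IDs are 1-based

text = format_lammps_full(atoms, bonds, 10.0)
print(text)
```

Opening the real `system.data` in a text editor and comparing it against this layout is a good way to confirm that IDs, types, and charges were exported as intended.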


## Conclusion

You have successfully:
1.  **Built** a water molecule using the object-oriented `Atomistic` API.
2.  **Manipulated** its properties and analyzed its topology.
3.  **Converted** the graph into a high-performance `Frame`.
4.  **Exported** the result for simulation.

This workflow (**Build -> Frame -> Export**) is the backbone of MolPy simulations. From here, you can explore more advanced topics such as building polymer networks or applying force fields.

[Next: Core Concepts Deep Dive >](core-concepts.ipynb)