# Core Concepts

MolPy is designed with a specific philosophy: **Flexibility through Composition**. Unlike many molecular libraries that hardcode "Molecule" or "Residue" classes, MolPy provides a set of building blocks that you can assemble to create your own structures.

This guide explains the foundational pillars of MolPy's architecture and gives you the context you need to use the library effectively.

**Key Components:**
1.  **Entity & Link**: The fundamental graph elements.
2.  **Block & Frame**: The high-performance data containers.
3.  **Topology**: The connectivity graph and pattern matching.
4.  **Force Field**: The physical parameters and typing engine.

---

## 1. Entity & Link: The Graph Basics

In MolPy, an `Entity` is the most basic unit. It represents *something* in your system: an atom, a bead, a residue, or a voxel.

### Identity vs. Content
**Crucial Concept**: Entities are defined by **Identity**, not content. Two atoms with the exact same properties (same position, same element) are *different* atoms if they are different Python objects. This mimics real life: two identical hydrogen atoms are independent physical objects.

In [1]:
import molpy as mp

# Create two identical atoms
a1 = mp.Atom(symbol="H", xyz=[0, 0, 0])
a2 = mp.Atom(symbol="H", xyz=[0, 0, 0])

# Content equality check (e.g., properties)
print(f"Same symbol? {a1['symbol'] == a2['symbol']}")

# Identity check (Python 'is' operator)
print(f"Same entity? {a1 is a2}")
print(f"Hash comparison: {hash(a1)} vs {hash(a2)}")

# This allows us to use Entities as dictionary keys or graph nodes!
mapping = {a1: "First Atom", a2: "Second Atom"}
print(mapping[a1])

Same symbol? True
Same entity? False
Hash comparison: 281471649334384 vs 281471682165936
First Atom
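
The difference between identity and content semantics can be reproduced in plain Python, without MolPy. The sketch below is only an illustration: `IdentityAtom` and `ValueAtom` are hypothetical classes invented here, and nothing in it describes how MolPy implements `Entity` internally. It contrasts Python's default object hashing with a frozen dataclass, whose equality and hash are derived from its field values.

```python
from dataclasses import dataclass


class IdentityAtom:
    """Hypothetical stand-in for an Entity: default object equality and hashing."""

    def __init__(self, symbol, xyz):
        self.symbol = symbol
        self.xyz = tuple(xyz)


@dataclass(frozen=True)
class ValueAtom:
    """Contrast case: equality and hash are computed from the field values."""

    symbol: str
    xyz: tuple


# Identity semantics: two equal-looking atoms stay distinct dictionary keys.
h1 = IdentityAtom("H", (0, 0, 0))
h2 = IdentityAtom("H", (0, 0, 0))
identity_keys = {h1: "first", h2: "second"}

# Value semantics: the second atom replaces the first, so one entry is lost.
v1 = ValueAtom("H", (0, 0, 0))
v2 = ValueAtom("H", (0, 0, 0))
value_keys = {v1: "first", v2: "second"}

print(len(identity_keys), len(value_keys))  # 2 1
```

With value semantics, two coincident hydrogen atoms would collapse into a single key, which is why identity semantics suit graph nodes.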


### Links: Connectivity
A `Link` connects Entities. Common examples are `Bond` (2 atoms), `Angle` (3 atoms), and `Dihedral` (4 atoms). Links hold **references** to the entities they connect, but they do NOT own them.

In [2]:
c1 = mp.Atom(symbol="C")
c2 = mp.Atom(symbol="C")

# A bond is just a Link with 2 endpoints
bond = mp.Bond(c1, c2, order=1.0)

print(f"Bond connects: {bond.endpoints}")
print(f"Same bond instance? {bond is mp.Bond(c1, c2, order=1.0)}") # False, distinct link objects

Bond connects: (<Atom: C>, <Atom: C>)
Same bond instance? False
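
What "holding references without owning" means can be shown with a minimal sketch in plain Python. `Node` and `SimpleLink` are hypothetical classes made up for this illustration; they are not MolPy's `Atom` or `Link`, and the real classes may store endpoints differently.

```python
class Node:
    """Hypothetical stand-in for an Entity with one mutable property."""

    def __init__(self, symbol):
        self.symbol = symbol


class SimpleLink:
    """Hypothetical link: stores references to its endpoints, not copies."""

    def __init__(self, *endpoints):
        self.endpoints = tuple(endpoints)


n1, n2, n3 = Node("C"), Node("C"), Node("H")

# Two links can share the same endpoint because neither one owns it.
link_a = SimpleLink(n1, n2)
link_b = SimpleLink(n1, n3)

# Editing the node is visible through every link that references it.
n1.symbol = "N"
print(link_a.endpoints[0].symbol, link_b.endpoints[0].symbol)  # N N
print(link_a.endpoints[0] is link_b.endpoints[0])              # True
```

Because the links only reference the nodes, discarding a link never removes or copies the atoms it connected.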


## 2. Block & Frame: The Data Backbone

While Entities are great for object-oriented logic, they are slow for heavy number crunching. For performance, MolPy uses **Blocks** and **Frames**.

### Block: Columnar Tables
A `Block` is essentially a dictionary of NumPy arrays. It holds data for a collection of items (e.g., all atoms).
- **Efficient**: Each column is stored as a contiguous NumPy array, so operations vectorize over all rows.
- **Flexible**: Add any column you want ('charge', 'mass', 'velocity').

In [3]:
from molpy.core.frame import Block
import numpy as np

# Create a Block mimicking atom data
data = {
    'x': np.array([0.0, 1.0, 2.0]),
    'y': np.array([0.0, 0.0, 0.0]),
    'z': np.array([0.0, 0.0, 0.0]),
    'element': np.array(['O', 'H', 'H'])
}
atoms_block = Block(data)

# Access columns as numpy arrays
print("X coordinates:", atoms_block['x'])

# Block slicing returns a new Block
print("First 2 atoms:", atoms_block[0:2].nrows)

X coordinates: [0. 1. 2.]
First 2 atoms: 2
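
The columnar idea behind `Block` can be sketched without MolPy or NumPy. `ColumnTable` below is a hypothetical class written for this guide, with plain lists standing in for NumPy arrays. It only mirrors the two behaviors shown above (column access by name, and row slicing that returns a new table) and is not a description of MolPy's implementation.

```python
class ColumnTable:
    """Hypothetical columnar table: one sequence per column, all of equal length."""

    def __init__(self, columns):
        lengths = {len(values) for values in columns.values()}
        if len(lengths) > 1:
            raise ValueError("all columns must have the same number of rows")
        self.columns = dict(columns)

    @property
    def nrows(self):
        # Row count is the length of any column (0 for an empty table).
        return len(next(iter(self.columns.values()), []))

    def __getitem__(self, key):
        if isinstance(key, str):
            # A string key selects one whole column.
            return self.columns[key]
        if isinstance(key, slice):
            # A slice is applied to every column and yields a new table.
            sliced = {name: values[key] for name, values in self.columns.items()}
            return ColumnTable(sliced)
        raise TypeError("key must be a column name or a slice")


table = ColumnTable({"x": [0.0, 1.0, 2.0], "element": ["O", "H", "H"]})
first_two = table[0:2]
print(first_two.nrows, first_two["element"])  # 2 ['O', 'H']
```

Storing one sequence per property, rather than one object per atom, is what lets a real array library operate on a whole column at once.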


### Frame: The Container
A `Frame` bundles multiple Blocks (e.g., `atoms`, `bonds`, `angles`) and metadata (`box`, `time`, `forcefield`). It is the standard unit for file I/O (LAMMPS data, trajectory frames).

> **Analogy**: If `Block` is a DataFrame table, `Frame` is the entire Database or Excel Workbook.

In [4]:
frame = mp.Frame()
frame["atoms"] = atoms_block
frame.metadata["box"] = mp.Box.cubic(10.0)

# Frame does not expose an iterator, so we access the internal dictionary
print(f"Frame keys: {list(frame._blocks.keys())}")
print(f"Box: {frame.metadata['box']}")

Frame keys: ['atoms']
Box: <Orthogonal Box: [10. 10. 10.]>


## 3. Topology: The Connectivity Graph

The `Topology` class represents the structural graph of your system. It is powered by **igraph** for high-performance graph algorithms.

**Key Features:**
*   **Automatic Pattern Matching**: Angles, Dihedrals, and Impropers are *not* entered by hand. They are detected on the fly from the bond graph using subgraph isomorphism.
*   **Pathfinding**: Find shortest paths, rings, and connected components.

See [Topology Tutorial](../tutorials/topology.ipynb) for a deep dive.

In [5]:
from molpy.core.topology import Topology

# Create a simple linear chain 0-1-2-3
topo = Topology()
topo.add_atoms(4)
topo.add_bonds([(0, 1), (1, 2), (2, 3)])

print(f"Atoms: {topo.n_atoms}, Bonds: {topo.n_bonds}")

# Automatic detection of higher-order interactions
print(f"Angles (A-B-C): {topo.n_angles}")      # Should be 2: (0-1-2) and (1-2-3)
print(f"Dihedrals (A-B-C-D): {topo.n_dihedrals}") # Should be 1: (0-1-2-3)

# Graph algorithms
print(f"Is connected? {topo.is_connected()}")
print(f"Shortest path 0->3: {topo.get_shortest_paths(0, 3)[0]}")

Atoms: 4, Bonds: 3
Angles (A-B-C): 2
Dihedrals (A-B-C-D): 1
Is connected? True
Shortest path 0->3: [0, 1, 2, 3]
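
To see why angles and dihedrals can be derived from bonds alone, here is a minimal pure-Python sketch. It uses direct neighbor enumeration rather than the igraph subgraph isomorphism that MolPy relies on, and the helper names (`build_adjacency`, `find_angles`, `find_dihedrals`) are hypothetical. It assumes each bond appears exactly once in the input list.

```python
from collections import defaultdict


def build_adjacency(bonds):
    """Map each atom index to the set of atoms bonded to it."""
    adjacency = defaultdict(set)
    for i, j in bonds:
        adjacency[i].add(j)
        adjacency[j].add(i)
    return adjacency


def find_angles(bonds):
    """Return every i-j-k triple where j is bonded to both i and k (i < k)."""
    adjacency = build_adjacency(bonds)
    angles = []
    for j, neighbors in adjacency.items():
        ordered = sorted(neighbors)
        for position, i in enumerate(ordered):
            for k in ordered[position + 1:]:
                angles.append((i, j, k))
    return sorted(angles)


def find_dihedrals(bonds):
    """Return every i-j-k-l chain of four distinct atoms around a central bond j-k."""
    adjacency = build_adjacency(bonds)
    dihedrals = []
    for j, k in bonds:
        for i in adjacency[j] - {k}:
            for l in adjacency[k] - {j, i}:
                dihedrals.append((i, j, k, l))
    return sorted(dihedrals)


chain = [(0, 1), (1, 2), (2, 3)]
print(find_angles(chain))     # [(0, 1, 2), (1, 2, 3)]
print(find_dihedrals(chain))  # [(0, 1, 2, 3)]
```

For the linear chain 0-1-2-3 this yields two angles and one dihedral, the same counts reported by `topo.n_angles` and `topo.n_dihedrals` above.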


## 4. Force Field & Typing

MolPy separates **Topology** (what is connected) from **Parameters** (how strong is the connection).

*   **ForceField**: A database of `AtomTypes`, `BondTypes`, `AngleTypes`, etc.
*   **Typifier**: A "compiler" that applies a ForceField to a Topology/Structure.
*   **Potential**: An executable function (e.g., Harmonic Bond) derived from parameters.

**Workflow:**
1.  Define/Load ForceField (e.g., OPLS-AA).
2.  `Typifier.typify(system)`: Assigns types (`CT`, `HC`) based on chemical graph matching (SMARTS).
3.  `to_potentials()`: Converts parameters to compute-ready potentials.

See [Force Field Tutorial](../tutorials/force-field.ipynb) for more details.
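
The typing step of the workflow can be illustrated with a toy example in plain Python. The function `assign_types` and its two rules are hypothetical simplifications written for this guide: they are neither SMARTS matching nor the actual OPLS-AA definitions, and MolPy's `Typifier` is not involved. The sketch only shows the idea of deriving a type label from an atom's element and its bonded neighbors.

```python
def assign_types(elements, bonds):
    """Assign a toy type label to each atom from its element and neighbors."""
    neighbors = {index: set() for index in range(len(elements))}
    for i, j in bonds:
        neighbors[i].add(j)
        neighbors[j].add(i)

    types = []
    for index, element in enumerate(elements):
        bonded = [elements[n] for n in neighbors[index]]
        if element == "C" and len(bonded) == 4:
            types.append("CT")  # toy rule: carbon with four neighbors
        elif element == "H" and bonded == ["C"]:
            types.append("HC")  # toy rule: hydrogen bonded to exactly one carbon
        else:
            types.append(None)  # no rule matched
    return types


# Methane: one carbon (index 0) bonded to four hydrogens.
methane_elements = ["C", "H", "H", "H", "H"]
methane_bonds = [(0, 1), (0, 2), (0, 3), (0, 4)]
methane_types = assign_types(methane_elements, methane_bonds)
print(methane_types)  # ['CT', 'HC', 'HC', 'HC', 'HC']
```

A real typifier generalizes this by matching pattern graphs against the chemical graph instead of hard-coding each rule.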

In [6]:
import molpy as mp

# Create a mini force field
ff = mp.AtomisticForcefield(name="ToyFF", units="real")
atom_style = ff.def_atomstyle("full")

# Define types (parameters)
ct = atom_style.def_type("CT", mass=12.01, charge=-0.1)
hc = atom_style.def_type("HC", mass=1.008, charge=0.1)

print(f"Defined AtomTypes: {[t.name for t in ff.get_types(mp.AtomType)]}")

# Define a bond style and type
bond_style = ff.def_bondstyle("harmonic")
ch_bond = bond_style.def_type(ct, hc, k=340.0, r0=1.09)

print(f"Defined BondType: {ch_bond.name} (k={ch_bond['k']}, r0={ch_bond['r0']})")

Defined AtomTypes: ['HC', 'CT']
Defined BondType: CT-HC (k=340.0, r0=1.09)
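
The step from parameters to a potential can be sketched as a closure over `k` and `r0`. The helper `make_harmonic_bond` is hypothetical and is not MolPy's `to_potentials()`. It assumes the convention E(r) = k (r - r0)^2, with the factor 1/2 absorbed into `k` as in LAMMPS's `harmonic` bond style. Other codes write 1/2 k (r - r0)^2, and this guide does not state which convention MolPy follows.

```python
def make_harmonic_bond(k, r0):
    """Return an energy function E(r) = k * (r - r0)**2 for fixed parameters.

    Assumed convention: the factor 1/2 is already folded into k.
    """
    def energy(r):
        return k * (r - r0) ** 2

    return energy


# Parameters copied from the CT-HC bond type defined above.
ch_energy = make_harmonic_bond(k=340.0, r0=1.09)

print(ch_energy(1.09))            # 0.0 at the equilibrium length
print(round(ch_energy(1.19), 3))  # 3.4 after stretching by 0.1
```

Separating the parameter set from the executable function is what allows the same force field data to feed different back ends.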


## Summary Cheatsheet

| Component | Role | Analogy |
| :--- | :--- | :--- |
| **Entity** | Unique Identity | The "Soul" of an atom |
| **Link** | Connectivity | The "Handshake" between atoms |
| **Block** | High-perf Data | The "Spreadsheet" table |
| **Frame** | System Container | The "Snapshot" of the universe |
| **Topology** | Graph Logic | The "Schematic" or "Blueprint" |
| **ForceField** | Physical Laws | The "Rulebook" of physics |

This modularity allows MolPy to scale from simple scripts to complex polymer builders (like `PolymerBuilder`).