# TorchSim Tutorial

This tutorial introduces the atomate2 interface to TorchSim for molecular dynamics simulations and geometry optimizations. The atomate2 interface wraps TorchSim's high-level API into jobflow-compatible Makers, enabling workflow management, database storage, and reproducible simulations.

## Installing Atomate2 with TorchSim

```bash
# Set up a conda environment
conda create -n atomate2-torchsim python=3.11
conda activate atomate2-torchsim

# Install atomate2 and TorchSim
pip install atomate2
pip install torch-sim

# For MACE models (optional but recommended)
pip install mace-torch
```

To verify the installation:

```python
import torch_sim as ts
from atomate2.torchsim.core import TorchSimStaticMaker
print("Installation successful!")
```

## Understanding the Atomate2 TorchSim Interface

Atomate2 provides three primary Makers for TorchSim simulations:

1. **`TorchSimStaticMaker`** - For single-point energy/force/property calculations
2. **`TorchSimOptimizeMaker`** - For geometry optimization
3. **`TorchSimIntegrateMaker`** - For molecular dynamics simulations

These Makers wrap TorchSim's `static`, `optimize`, and `integrate` functions respectively, adding:
- Structured output via `TorchSimTaskDoc` schema
- Jobflow integration for workflow management
- Automatic tracking of calculation metadata
- Support for chaining calculations together

## Basic Static Calculation

Let's start with a simple static calculation using a Lennard-Jones potential. First, we create our atomic structure:

In [None]:
# ruff: noqa: T201
from ase.build import bulk
from pymatgen.io.ase import AseAtomsAdaptor

# Create an Argon FCC structure using ASE and convert to pymatgen
ar_atoms = bulk("Ar", "fcc", a=5.26, cubic=True)
ar_structure = AseAtomsAdaptor.get_structure(ar_atoms)

print(f"Structure: {ar_structure.formula}")
print(f"Number of atoms: {len(ar_structure)}")

Now we create a `TorchSimStaticMaker` and run the calculation. Note that unlike raw TorchSim, we specify the model type using the `TorchSimModelType` enum:

In [None]:
from jobflow import run_locally

from atomate2.torchsim.core import TorchSimStaticMaker
from atomate2.torchsim.schema import TorchSimModelType

# Create a static calculation maker with Lennard-Jones model
static_maker = TorchSimStaticMaker(
    model_type=TorchSimModelType.LENNARD_JONES,
    model_path="",  # LJ model doesn't need a path
    model_kwargs={
        "sigma": 3.405,  # Angstrom, typical for Ar
        "epsilon": 0.0104,  # eV, typical for Ar
        "compute_stress": True,
    },
)

# Create the job - accepts a single structure or a list of structures
job = static_maker.make([ar_structure])

# Run locally
response_dict = run_locally(job, ensure_success=True)

# Extract the result
result = list(response_dict.values())[-1][1].output

print(f"Energy: {result.calcs_reversed[0].output.energies[0]:.6f} eV")
print(f"Time elapsed: {result.time_elapsed:.3f} seconds")

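The `sigma` and `epsilon` values in `model_kwargs` are the two parameters of the standard 12-6 Lennard-Jones pair potential, V(r) = 4ε[(σ/r)^12 − (σ/r)^6]. The sketch below evaluates that textbook form in plain Python to show what the numbers mean. It is not TorchSim's implementation, which may additionally apply a cutoff, and the function name `lj_pair_energy` is purely illustrative.

```python
def lj_pair_energy(r: float, sigma: float = 3.405, epsilon: float = 0.0104) -> float:
    """12-6 Lennard-Jones energy (eV) of one atom pair at separation r (Angstrom)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


# Separation at the bottom of the potential well
r_min = 2 ** (1 / 6) * 3.405

print(f"V(sigma) = {lj_pair_energy(3.405):.6f} eV")
print(f"r_min = {r_min:.4f} A, V(r_min) = {lj_pair_energy(r_min):.6f} eV")
```

The pair energy crosses zero at r = σ and reaches its minimum of −ε at r = 2^(1/6)σ, roughly 3.82 Å for these parameters.
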
## Understanding the Output: TorchSimTaskDoc

The output of all TorchSim Makers is a `TorchSimTaskDoc`, which contains:
- `structures`: The final structures from the calculation
- `calcs_reversed`: List of calculation details (most recent first)
- `time_elapsed`: Total calculation time
- `uuid`: Unique identifier for this task
- `dir_name`: Directory where the calculation was run

In [None]:
# Exploring the task document structure
print(f"Task UUID: {result.uuid}")
print(f"Directory: {result.dir_name}")
print(f"Number of structures: {len(result.structures)}")
print(f"Number of calculations: {len(result.calcs_reversed)}")

# Explore the calculation details
calc = result.calcs_reversed[0]
print("\nCalculation details:")
print(f"  Model: {calc.model}")
print(f"  Task type: {calc.task_type}")
print(f"  Energies: {calc.output.energies}")
print(f"  Forces shape: {len(calc.output.all_forces[0])} atoms x 3")

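The expression `list(response_dict.values())[-1][1].output` recurs throughout this tutorial. It relies on `run_locally` returning a dictionary that maps each job UUID to a dictionary of responses keyed by job index. Under that assumption, a small helper makes the intent explicit. The name `last_output` and the mock data are hypothetical, and real responses are jobflow objects rather than `SimpleNamespace` instances.

```python
from types import SimpleNamespace


def last_output(response_dict):
    """Return the output of the last job in a run_locally-style response dict."""
    # Outer dict: job UUID -> {job index: response}; take the last job inserted.
    responses_by_index = list(response_dict.values())[-1]
    # Use the highest job index instead of hard-coding 1.
    latest = responses_by_index[max(responses_by_index)]
    return latest.output


# Stand-in data shaped like the return value of run_locally (not real jobflow objects)
mock_responses = {
    "uuid-optimize": {1: SimpleNamespace(output="optimize task doc")},
    "uuid-md": {1: SimpleNamespace(output="md task doc")},
}

print(last_output(mock_responses))
```

With a real run, `result = last_output(response_dict)` would replace the indexing expression used in the cells of this tutorial.
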
## Batch Processing Multiple Systems

One of TorchSim's strengths is processing multiple systems in a single batched calculation. Through the atomate2 interface, this only requires passing a list of structures to `make`:

In [None]:
# Create multiple structures
cu_atoms = bulk("Cu", "fcc", a=3.6, cubic=True)
fe_atoms = bulk("Fe", "bcc", a=2.87, cubic=True)

cu_structure = AseAtomsAdaptor.get_structure(cu_atoms)
fe_structure = AseAtomsAdaptor.get_structure(fe_atoms)

# Build a 2x2x2 Cu supercell
cu_supercell = cu_structure.copy()
cu_supercell.make_supercell([2, 2, 2])

structures = [ar_structure, cu_structure, fe_structure, cu_supercell]

print(f"Processing {len(structures)} structures:")
for i, s in enumerate(structures):
    print(f"  {i}: {s.formula} ({len(s)} atoms)")

In [None]:
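# Run static calculation on all structures at once.
# The Ar Lennard-Jones parameters are reused for Cu and Fe purely to demonstrate
# batching; the resulting energies are not physically meaningful for those metals.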
job = static_maker.make(structures)
response_dict = run_locally(job, ensure_success=True)
result = list(response_dict.values())[-1][1].output

# Print energies for each structure
print("Results for batch calculation:")
for i, energy in enumerate(result.calcs_reversed[0].output.energies):
    n_atoms = len(structures[i])
    print(f"  Structure {i}: {energy:.6f} eV ({energy / n_atoms:.6f} eV/atom)")

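One common way for batched simulation codes to handle systems of different sizes is to concatenate the atoms of all systems into flat arrays and keep an index recording which system each atom belongs to. The sketch below illustrates that idea in plain Python with placeholder numbers. It is a conceptual illustration only, not TorchSim's internal code, and the names `system_index` and `energies_per_system` are hypothetical.

```python
# Placeholder per-atom energies for two systems, flattened into one list
per_atom_energy = [-0.08, -0.08, -0.08, -0.08, -3.5, -3.5]
# system_index[i] records which system atom i belongs to
system_index = [0, 0, 0, 0, 1, 1]


def energies_per_system(per_atom_energy, system_index):
    """Sum per-atom contributions into one total per system."""
    totals = [0.0] * (max(system_index) + 1)
    for energy, system in zip(per_atom_energy, system_index):
        totals[system] += energy
    return totals


print(energies_per_system(per_atom_energy, system_index))
```

Because every atom carries its system index, a single pass over the flat arrays produces one energy per system, which is why the batched result above holds one entry per input structure.
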
## Geometry Optimization

The `TorchSimOptimizeMaker` provides geometry optimization capabilities. It supports various optimizers and convergence criteria:

In [None]:
import torch_sim as ts

from atomate2.torchsim.core import TorchSimOptimizeMaker
from atomate2.torchsim.schema import ConvergenceFn

# Displace each atom by 0.05 A in a random direction so the optimizer has
# nonzero forces to relax (a uniform shift of all sites leaves forces unchanged)
perturbed_structure = ar_structure.copy()
perturbed_structure.perturb(0.05)

# Create an optimization maker
optimize_maker = TorchSimOptimizeMaker(
    model_type=TorchSimModelType.LENNARD_JONES,
    model_path="",
    optimizer=ts.Optimizer.fire,  # FIRE optimizer
    convergence_fn=ConvergenceFn.FORCE,  # Converge based on forces
    convergence_fn_kwargs={"force_tol": 1e-3},  # Force tolerance in eV/A
    max_steps=500,
    steps_between_swaps=10,
    init_kwargs={"cell_filter": ts.CellFilter.unit},  # Relax the cell with positions
    model_kwargs={"sigma": 3.405, "epsilon": 0.0104, "compute_stress": True},
)

# Run optimization
job = optimize_maker.make([perturbed_structure])
response_dict = run_locally(job, ensure_success=True)
result = list(response_dict.values())[-1][1].output

print(f"Optimization completed in {result.time_elapsed:.3f} seconds")
print(f"Final energy: {result.calcs_reversed[0].output.energies[0]:.6f} eV")

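To make the `force_tol` setting concrete, the sketch below shows a typical force-based convergence test: the structure counts as converged once the largest per-atom force magnitude falls below the tolerance. This assumes `ConvergenceFn.FORCE` behaves this way; the exact criterion is defined by TorchSim, and `forces_converged` is a hypothetical helper.

```python
import math


def forces_converged(forces, force_tol=1e-3):
    """True when the largest per-atom force norm is below force_tol (eV/A)."""
    max_norm = max(math.sqrt(fx * fx + fy * fy + fz * fz) for fx, fy, fz in forces)
    return max_norm < force_tol


# Two atoms: largest force norm is 5e-4 eV/A, below the 1e-3 tolerance
print(forces_converged([(0.0, 0.0, 0.0), (3e-4, 4e-4, 0.0)]))
# One atom with a 2e-3 eV/A force, above the tolerance
print(forces_converged([(0.0, 2e-3, 0.0)]))
```

A tighter `force_tol` generally needs more optimizer steps, so it should be chosen together with `max_steps`.
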
## Molecular Dynamics

The `TorchSimIntegrateMaker` enables molecular dynamics simulations with various integrators:

In [None]:
from atomate2.torchsim.core import TorchSimIntegrateMaker

# Create an MD maker
md_maker = TorchSimIntegrateMaker(
    model_type=TorchSimModelType.LENNARD_JONES,
    model_path="",
    integrator=ts.Integrator.nvt_langevin,  # Langevin thermostat
    n_steps=100,
    temperature=300.0,  # Kelvin
    timestep=0.001,  # picoseconds
    model_kwargs={"sigma": 3.405, "epsilon": 0.0104, "compute_stress": True},
)

# Run MD simulation
job = md_maker.make([ar_structure])
response_dict = run_locally(job, ensure_success=True)
result = list(response_dict.values())[-1][1].output

print(f"MD completed in {result.time_elapsed:.3f} seconds")
print(f"Final energy: {result.calcs_reversed[0].output.energies[0]:.6f} eV")

# Inspect the MD parameters recorded in the task document
calc = result.calcs_reversed[0]
print("\nMD parameters stored:")
print(f"  Integrator: {calc.integrator}")
print(f"  n_steps: {calc.n_steps}")
print(f"  Temperature: {calc.temperature} K")
print(f"  Timestep: {calc.timestep} ps")

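A quick sanity check on the MD settings can be done by hand. The sketch below computes the total simulated time from `n_steps` and `timestep`, and compares the thermal energy k_B·T at 300 K with the Lennard-Jones well depth used in this tutorial. The variable names are illustrative and the calculation is independent of TorchSim.

```python
K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K

n_steps = 100
timestep_ps = 0.001
temperature_k = 300.0
epsilon_ev = 0.0104  # Lennard-Jones well depth used above

simulated_time_ps = n_steps * timestep_ps
thermal_energy_ev = K_B_EV_PER_K * temperature_k

print(f"Simulated time: {simulated_time_ps:.3f} ps")
print(f"k_B*T at {temperature_k:.0f} K: {thermal_energy_ev:.4f} eV")
print(f"Ratio k_B*T / epsilon: {thermal_energy_ev / epsilon_ev:.2f}")
```

The run covers only 0.1 ps, which is enough to exercise the interface but far too short for equilibrium properties. Since k_B·T at 300 K is about 2.5 times the well depth, argon would not remain a solid at this temperature in a longer simulation.
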
## Trajectory Reporting

TorchSim supports saving trajectory data during simulations. In atomate2, you configure this via the `trajectory_reporter_dict` parameter:


In [None]:
import tempfile
from pathlib import Path

# Create a temporary directory for trajectory files
tmp_dir = Path(tempfile.mkdtemp())

n_systems = 2
trajectory_reporter_dict = {
    "filenames": [tmp_dir / f"md_traj_{i}.h5md" for i in range(n_systems)],
    "state_frequency": 10,  # Save state every 10 steps
    "prop_calculators": {
        5: ["potential_energy", "kinetic_energy", "temperature"],
    },
}

# Create MD maker with trajectory reporting
md_maker_with_traj = TorchSimIntegrateMaker(
    model_type=TorchSimModelType.LENNARD_JONES,
    model_path="",
    integrator=ts.Integrator.nvt_langevin,
    n_steps=50,
    temperature=300.0,
    timestep=0.001,
    trajectory_reporter_dict=trajectory_reporter_dict,
    model_kwargs={"sigma": 3.405, "epsilon": 0.0104, "compute_stress": True},
)

# Run with trajectory reporting
job = md_maker_with_traj.make([ar_structure, ar_structure])
response_dict = run_locally(job, ensure_success=True)
result = list(response_dict.values())[-1][1].output

# Check trajectory reporter details in output
traj_details = result.calcs_reversed[0].trajectory_reporter
print("Trajectory reporter configuration:")
print(f"  State frequency: {traj_details.state_frequency}")
print(f"  Property calculators: {traj_details.prop_calculators}")
print(f"  Output files: {[str(f) for f in traj_details.filenames]}")

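In the cell above, `n_systems = 2` matches the two structures passed to `make`, so each system gets its own trajectory file. Assuming one file per system is what the reporter expects, deriving the filenames from the number of structures keeps the two in sync. The helper `build_reporter_dict` below is a hypothetical convenience, not part of atomate2 or TorchSim.

```python
import tempfile
from pathlib import Path


def build_reporter_dict(n_systems, directory, state_frequency=10):
    """Build a trajectory_reporter_dict-style mapping with one file per system."""
    return {
        "filenames": [Path(directory) / f"md_traj_{i}.h5md" for i in range(n_systems)],
        "state_frequency": state_frequency,
        "prop_calculators": {5: ["potential_energy", "kinetic_energy", "temperature"]},
    }


# Placeholders standing in for the list of structures passed to make()
placeholder_systems = ["system_a", "system_b"]
reporter_dict = build_reporter_dict(len(placeholder_systems), tempfile.mkdtemp())

for path in reporter_dict["filenames"]:
    print(path.name)
```

Calling the helper with `len(structures)` avoids a mismatch when the list of structures changes.
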
In [None]:
# Analyze the trajectory
traj_file = traj_details.filenames[0]

with ts.TorchSimTrajectory(traj_file) as traj:
    potential_energies = traj.get_array("potential_energy")
    temperatures = traj.get_array("temperature")

    print("Trajectory analysis:")
    print(f"  Number of frames: {len(potential_energies)}")
    print(f"  Initial energy: {potential_energies[0].item():.6f} eV")
    print(f"  Final energy: {potential_energies[-1].item():.6f} eV")
    print(f"  Average temperature: {temperatures.mean().item():.1f} K")

## Using Machine Learning Potentials

TorchSim is designed for machine learning potentials such as MACE. Here's how to use a MACE model with atomate2:

In [None]:
from mace.calculators.foundations_models import download_mace_mp_checkpoint

# Download MACE-MP model checkpoint
mace_model_path = Path(download_mace_mp_checkpoint("small"))

# Create a static maker with MACE
mace_static_maker = TorchSimStaticMaker(
    model_type=TorchSimModelType.MACE,
    model_path=mace_model_path,
)

# Run on a copper structure
job = mace_static_maker.make([cu_structure])
response_dict = run_locally(job, ensure_success=True)
result = list(response_dict.values())[-1][1].output

print("MACE static calculation:")
print(f"  Energy: {result.calcs_reversed[0].output.energies[0]:.6f} eV")
print(f"  Model path: {result.calcs_reversed[0].model_path}")

In [None]:
# MACE optimization example
mace_optimize_maker = TorchSimOptimizeMaker(
    model_type=TorchSimModelType.MACE,
    model_path=mace_model_path,
    optimizer=ts.Optimizer.fire,
    convergence_fn=ConvergenceFn.FORCE,
    convergence_fn_kwargs={"force_tol": 0.01},
    max_steps=200,
    init_kwargs={"cell_filter": ts.CellFilter.unit},
)

# Perturb atomic positions and optimize
perturbed_cu = cu_structure.copy()
perturbed_cu.perturb(0.02)  # random 0.02 A displacement per atom

job = mace_optimize_maker.make([perturbed_cu])
response_dict = run_locally(job, ensure_success=True)
result = list(response_dict.values())[-1][1].output

print(f"MACE optimization completed in {result.time_elapsed:.3f} seconds")
print(f"Final energy: {result.calcs_reversed[0].output.energies[0]:.6f} eV")

## Autobatching

When processing many systems, TorchSim's autobatching groups them into batches sized to fit in GPU memory. Enable it via `autobatcher_dict`:

In [None]:
# Create many structures to process
many_structures = [ar_structure.copy() for _ in range(10)]

# Enable autobatching with custom settings
autobatcher_dict = {
    "memory_scales_with": "n_atoms",
    "max_memory_scaler": 260,
}

static_maker_batched = TorchSimStaticMaker(
    model_type=TorchSimModelType.LENNARD_JONES,
    model_path="",
    autobatcher_dict=autobatcher_dict,
    model_kwargs={"sigma": 3.405, "epsilon": 0.0104},
)

job = static_maker_batched.make(many_structures)
response_dict = run_locally(job, ensure_success=True)
result = list(response_dict.values())[-1][1].output

# Check autobatcher details
if result.calcs_reversed[0].autobatcher:
    ab_details = result.calcs_reversed[0].autobatcher
    print(f"Autobatcher used: {ab_details.autobatcher}")
    print(f"Memory scales with: {ab_details.memory_scales_with}")

print(f"\nProcessed {len(result.calcs_reversed[0].output.energies)} structures")

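The idea behind the two settings can be illustrated with a simple packing routine. The sketch assumes that `"memory_scales_with": "n_atoms"` means the memory estimate of a system is its atom count and that `max_memory_scaler` caps the summed estimate per batch. TorchSim's actual batching algorithm may differ, and `pack_batches` is a hypothetical function using greedy first-fit-decreasing packing.

```python
def pack_batches(n_atoms_per_system, max_memory_scaler):
    """Group system indices into batches whose total atom count stays under the cap."""
    # Visit the largest systems first (first-fit decreasing)
    order = sorted(range(len(n_atoms_per_system)), key=lambda i: -n_atoms_per_system[i])
    batches, loads = [], []
    for idx in order:
        size = n_atoms_per_system[idx]
        if size > max_memory_scaler:
            raise ValueError(f"System {idx} alone exceeds the memory cap")
        for b, load in enumerate(loads):
            if load + size <= max_memory_scaler:
                batches[b].append(idx)
                loads[b] += size
                break
        else:
            # No existing batch has room, so open a new one
            batches.append([idx])
            loads.append(size)
    return batches


# Ten 4-atom Ar cells fit in a single batch under a cap of 260
print(pack_batches([4] * 10, 260))
# Larger systems are split across batches
print(pack_batches([200, 100, 60], 260))
```

Under these assumptions, the ten argon cells above (40 atoms in total) fit into one batch, while a mix of larger systems would be split.
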
## Chaining Calculations

One advantage of the atomate2 interface is the ability to chain calculations together. You can pass the output of one job as input to another:

In [None]:
from jobflow import Flow

# Create makers for a multi-step workflow
# Step 1: Optimize the structure (using energy convergence for simplicity)
optimize_maker = TorchSimOptimizeMaker(
    name="optimize",
    model_type=TorchSimModelType.LENNARD_JONES,
    model_path="",
    optimizer=ts.Optimizer.fire,
    convergence_fn=ConvergenceFn.ENERGY,  # Energy-based convergence
    max_steps=100,
    model_kwargs={"sigma": 3.405, "epsilon": 0.0104, "compute_stress": True},
)

# Step 2: Run MD on the optimized structure
md_maker = TorchSimIntegrateMaker(
    name="md",
    model_type=TorchSimModelType.LENNARD_JONES,
    model_path="",
    integrator=ts.Integrator.nvt_langevin,
    n_steps=50,
    temperature=300.0,
    timestep=0.001,
    model_kwargs={"sigma": 3.405, "epsilon": 0.0104, "compute_stress": True},
)

# Create jobs, starting from a structure with randomly displaced atoms
perturbed = ar_structure.copy()
perturbed.perturb(0.05)  # random 0.05 A displacement per atom

optimize_job = optimize_maker.make([perturbed])

# Chain: use optimized structures as input to MD
# The prev_task parameter allows tracking the calculation chain
md_job = md_maker.make(
    optimize_job.output.structures,
    prev_task=optimize_job.output,
)

# Create a flow
workflow = Flow([optimize_job, md_job], name="optimize_then_md")

# Run the workflow
response_dict = run_locally(workflow, ensure_success=True)

# Get the final result
final_result = list(response_dict.values())[-1][1].output

print("Workflow completed!")
print(f"Number of calculations in chain: {len(final_result.calcs_reversed)}")
print(f"Final energy: {final_result.calcs_reversed[0].output.energies[0]:.6f} eV")

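The workflow prints the number of entries in `calcs_reversed`, which lists calculations most recent first. The sketch below models how a chained document could accumulate that history, assuming `prev_task` causes the earlier calculations to be carried over behind the new one. `MiniTaskDoc` and `append_calculation` are hypothetical stand-ins, not the real `TorchSimTaskDoc` logic.

```python
from dataclasses import dataclass, field


@dataclass
class MiniTaskDoc:
    """Hypothetical stand-in for a task document."""

    calcs_reversed: list = field(default_factory=list)


def append_calculation(new_calc, prev_task=None):
    """Return a new document with new_calc placed ahead of the previous history."""
    history = prev_task.calcs_reversed if prev_task is not None else []
    return MiniTaskDoc(calcs_reversed=[new_calc, *history])


optimize_doc = append_calculation("optimize")
md_doc = append_calculation("md", prev_task=optimize_doc)

print(md_doc.calcs_reversed)  # ['md', 'optimize']
```

With this ordering, index 0 always refers to the latest step, which is why the tutorial reads the final energy from `calcs_reversed[0]`.
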
## Supported Model Types

The atomate2 TorchSim interface supports various machine learning potentials through the `TorchSimModelType` enum:

In [None]:
from atomate2.torchsim.schema import TorchSimModelType

print("Supported model types:")
for model_type in TorchSimModelType:
    print(f"  - {model_type.name}: {model_type.value}")

## Available Property Functions

For trajectory reporting, the available property functions are listed in the `PropertyFn` enum. Because the Maker configuration must be serializable, you cannot pass arbitrary property functions as you can in raw TorchSim. To support an additional property, extend the underlying `PropertyFn` code.

In [None]:
from atomate2.torchsim.schema import PropertyFn

print("Available property functions for trajectory reporting:")
for prop in PropertyFn:
    print(f"  - {prop.value}")

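The serialization constraint can be illustrated with a small registry pattern: a configuration stores only string names, which survive a JSON round trip, and a lookup table maps each name back to a callable. This is a sketch of the general technique, assuming nothing about how atomate2 implements it; `MiniPropertyFn` and `PROPERTY_REGISTRY` are hypothetical names.

```python
import json
from enum import Enum


class MiniPropertyFn(str, Enum):
    """Hypothetical stand-in for a PropertyFn-style enum."""

    POTENTIAL_ENERGY = "potential_energy"
    TEMPERATURE = "temperature"


# Registry mapping each serializable name to the callable that computes it
PROPERTY_REGISTRY = {
    MiniPropertyFn.POTENTIAL_ENERGY: lambda state: state["potential_energy"],
    MiniPropertyFn.TEMPERATURE: lambda state: state["temperature"],
}

# The configuration holds only strings, so it can be stored as JSON
config = {"prop_calculators": {"5": [MiniPropertyFn.POTENTIAL_ENERGY.value]}}
restored = json.loads(json.dumps(config))

# After loading, names are resolved back to functions through the registry
names = restored["prop_calculators"]["5"]
functions = [PROPERTY_REGISTRY[MiniPropertyFn(name)] for name in names]

print(functions[0]({"potential_energy": -1.0, "temperature": 300.0}))
```

Adding a property in this pattern means adding one enum member and one registry entry, which mirrors the advice above to extend the enum code rather than pass a function directly.
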
## Running with Databases

Like other atomate2 workflows, TorchSim jobs can be run with database storage. Configure your `jobflow.yaml` to point to your MongoDB instance:

```yaml
JOB_STORE:
  docs_store:
    type: MongoStore
    database: DATABASE
    collection_name: atomate2_docs
    host: your-mongo-host
    port: 27017
    username: USERNAME
    password: PASSWORD
```

Then run your workflows as usual - the `TorchSimTaskDoc` will be automatically stored in the database.

## Conclusion

The atomate2 TorchSim interface exposes three Makers for running molecular simulations:

1. **`TorchSimStaticMaker`** - Single-point energy/force calculations
2. **`TorchSimOptimizeMaker`** - Geometry optimization with customizable convergence
3. **`TorchSimIntegrateMaker`** - Molecular dynamics with various integrators

Key features:
- Support for multiple ML potentials (MACE, FairChem, SevenNet, etc.)
- Batch processing of multiple structures
- Autobatching for GPU memory management
- Trajectory reporting with customizable property calculations
- Structured output via `TorchSimTaskDoc` schema
- Full jobflow integration for workflow management and database storage

For more advanced usage, refer to the TorchSim documentation and the atomate2 source code.