# Using Snapshots With MPI

## Overview

### Questions

* How can I access the state of the simulation in parallel simulations? 
* What is the difference between a local and global snapshot?

### Objectives

* Describe how to write GSD files in MPI simulations.
* Show examples using **local snapshots**.
* Show examples using **global snapshots**.

In [1]:
import os

fn = os.path.join(os.getcwd(), 'trajectory.gsd')
![ -e "$fn" ] && rm "$fn"

## Writing GSD files in parallel jobs

You can write GSD files in parallel jobs just as you do in serial.
Saving the simulation trajectory to a file is useful for visualization and analysis after the simulation completes.
As mentioned in the previous section, make sure that you specify the operation with identical parameters on all ranks:

In [2]:
%pycat lj_trajectory.py

import hoomd

device = hoomd.device.CPU()
sim = hoomd.Simulation(device=device)
sim.create_state_from_gsd(filename='random.gsd')

integrator = hoomd.md.Integrator(dt=0.005)
cell = hoomd.md.nlist.Cell()
lj = hoomd.md.pair.LJ(

In [3]:
!mpirun -n 4 python3 lj_trajectory.py

notice(2): Using domain decomposition: n_x = 1 n_y = 2 n_z = 2.
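
As an illustration of specifying a writer identically on all ranks, the following minimal sketch attaches a GSD writer to a simulation. The helper name `add_gsd_writer`, the 1000-step period, and `mode='xb'` are assumptions chosen for illustration, not necessarily what `lj_trajectory.py` uses.

```python
def add_gsd_writer(sim, filename="trajectory.gsd", period=1000):
    """Attach a GSD writer to `sim`; call this with the same arguments on every rank."""
    # Imported lazily so this sketch can be loaded without HOOMD-blue installed.
    import hoomd

    gsd_writer = hoomd.write.GSD(
        filename=filename,
        trigger=hoomd.trigger.Periodic(period),  # write one frame every `period` steps
        mode="xb",  # assumed: fail rather than overwrite an existing file
    )
    sim.operations.writers.append(gsd_writer)
    return gsd_writer
```

Every rank executes the same call with the same arguments. No `if rank == 0:` guard is needed around the writer.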


## Modifying particle properties with local snapshots

Use snapshots when you need to modify particle properties during a simulation, or perform analysis where the results need to be known as the simulation progresses (e.g. umbrella sampling).
**Local snapshots** provide high-performance, direct access to the particle data stored in HOOMD-blue.
The direct access comes with several costs.
Your script can only access particles local to the **domain** of the current **rank**. 
Any analysis you perform may require MPI communication to combine results across ranks.
Furthermore, particles may appear in any order in the local snapshot and a given particle is only present on one rank.
To access the properties of a specific particle, look it up by tag and handle the case where it is not present on the current rank.

The script below demonstrates the latter two concepts with a toy example that doubles the mass of all particles and quadruples the mass of the particle with tag 100:

In [4]:
%pycat local_snapshot.py

import hoomd

device = hoomd.device.CPU()
sim = hoomd.Simulation(device=device)
sim.create_state_from_gsd(filename='random.gsd')

with sim.state.cpu_local_snapshot as snap:
    N = len(snap.particles.position)

    # double the mass of every particle
    snap.particles.

In [5]:
!mpirun -n 4 python3 local_snapshot.py

notice(2): Using domain decomposition: n_x = 1 n_y = 2 n_z = 2.


Notice how the example uses the `rtag` lookup array to efficiently find the index of the particle with the given tag.
When the particle is not present on the local rank, the `rtag` entry for that tag is set to a number greater than the local number of particles.
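
The reverse-lookup idea can be modeled with plain NumPy arrays. The following sketch does not use HOOMD-blue itself: the helper `build_rtag`, the sentinel `NOT_LOCAL`, and the tiny 8-particle system are assumptions made for illustration only.

```python
import numpy as np

# Assumed sentinel: a value larger than any possible local index.
NOT_LOCAL = np.iinfo(np.uint32).max


def build_rtag(local_tags, n_global):
    """Return a reverse-lookup array: rtag[tag] -> local index, or NOT_LOCAL."""
    rtag = np.full(n_global, NOT_LOCAL, dtype=np.uint32)
    rtag[local_tags] = np.arange(len(local_tags), dtype=np.uint32)
    return rtag


# Hypothetical rank that owns 4 of 8 particles, stored in arbitrary order.
local_tags = np.array([6, 1, 4, 3])
local_mass = np.ones(len(local_tags))
rtag = build_rtag(local_tags, n_global=8)

N = len(local_tags)
for tag in (4, 5):
    index = rtag[tag]
    if index < N:  # the particle is present on this rank
        local_mass[index] *= 2
    # otherwise another rank owns the particle and nothing happens here

print(local_mass)  # [1. 1. 2. 1.]
```

The lookup costs a single array access per tag, so there is no need to search the unordered local tag array.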

## Handling global snapshots with MPI

**Global snapshots** collect all particles onto rank 0 and sort them by tag.
This removes several of the inconveniences of the local snapshot API, but at the cost of *much slower* performance.
When you use **global snapshots** in MPI simulations, you need to add `if snapshot.communicator.rank == 0:` checks around all the code that accesses the data in the snapshot.
Note: the `get_snapshot()` call itself *MUST* be made on all ranks.
Here is a toy example that computes the total mass of the system using a global snapshot:

In [7]:
%pycat global_snapshot.py

import hoomd
import numpy

device = hoomd.device.CPU()
sim = hoomd.Simulation(device=device)
sim.create_state_from_gsd(filename='random.gsd')

snapshot = sim.state.get_snapshot()

# can only access particle data on rank 0
if snapshot.communicator.rank == 0:
    total

In [8]:
!mpirun -n 4 python3 global_snapshot.py

notice(2): Using domain decomposition: n_x = 1 n_y = 2 n_z = 2.
2048.0
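
To make the relationship between local and global data concrete, here is a NumPy-only model of what collecting particles and sorting them by tag means. It is a conceptual sketch, not HOOMD-blue's implementation. The function `gather_sorted_by_tag` and the two-rank data are hypothetical.

```python
import numpy as np


def gather_sorted_by_tag(per_rank_tags, per_rank_values):
    """Concatenate per-rank arrays and order the result by particle tag."""
    tags = np.concatenate(per_rank_tags)
    values = np.concatenate(per_rank_values)
    order = np.argsort(tags)
    return tags[order], values[order]


# Two hypothetical ranks holding disjoint particles in arbitrary order.
per_rank_tags = [np.array([3, 0]), np.array([2, 1])]
per_rank_mass = [np.array([4.0, 1.0]), np.array([3.0, 2.0])]

tags, mass = gather_sorted_by_tag(per_rank_tags, per_rank_mass)
print(tags)        # [0 1 2 3]
print(mass.sum())  # 10.0
```

After the gather, index `i` in the combined array always refers to the particle with tag `i`, which is why no reverse lookup is needed with a global snapshot.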


To do this same calculation efficiently with local snapshots, you could use [mpi4py](http://mpi4py.readthedocs.io/) to sum the locally computed masses across all ranks.
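
A minimal sketch of that reduction follows. The function name `total_mass` and the four-rank split of 512 unit masses each are assumptions for illustration. The serial stand-in at the bottom runs without MPI.

```python
import numpy as np


def total_mass(local_masses, comm=None):
    """Sum the masses on this rank, then combine across ranks if `comm` is given."""
    local_total = float(np.sum(local_masses))
    if comm is None:
        return local_total  # serial run: the local sum is the global sum
    # mpi4py's lowercase allreduce sums by default (op=MPI.SUM).
    return comm.allreduce(local_total)


# Serial stand-in: pretend four ranks each own 512 particles of mass 1.
per_rank = [np.ones(512) for _ in range(4)]
serial_check = sum(total_mass(m) for m in per_rank)
print(serial_check)  # 2048.0
```

In an MPI job, you would pass `MPI.COMM_WORLD` from `mpi4py` as `comm` and take `local_masses` from `snap.particles.mass` inside a `with sim.state.cpu_local_snapshot as snap:` block. Every rank must make the `allreduce` call.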

In this section, you have written trajectories to a GSD file, modified the state of the system efficiently using local snapshots, and analyzed the state of the system with a global snapshot, all with conditions that work in both MPI parallel and serial simulations.
The next section of this tutorial shows you how to use MPI to run many independent simulations with different inputs.