# Homework 4 - Analysis

For this homework assignment, we will look at the structural ordering of hard spheres in different phases. **Simulate 5 systems of hard spheres at different volume fractions, $\mathbf{\phi = 0.4, 0.5, 0.6, 0.7, \text{and} \: 0.74}$.**

**For each simulation, note the phase of the system and calculate the RDF (see `HW4-Analysis.ipynb`). Please submit your responses to the questions and include a plot with all 5 RDFs in a PDF file along with the rest of your homework solutions and a PDF version of this notebook to Canvas.**

Now that you've run your simulations, we can post-process the data. We're going to learn about the radial distribution function (RDF) and calculate it for our hard sphere systems. **Please answer all questions in a separate PDF document, _not_ in this notebook.**

In [None]:
import gsd.hoomd
import numpy as np
import matplotlib.pyplot as plt
import freud

The radial distribution function gives us a way to measure the structural correlation between particles in a system. The two main aspects of an RDF that give information about a system are the first peak and the frequency of subsequent peaks. Read [this article](https://en.wikibooks.org/wiki/Molecular_Simulation/Radial_Distribution_Functions) and use it to **answer the following questions about the radial distribution function.**

1. What does the radial distribution function represent?
2. What is the significance of the _location_ of the first peak in a g(r) plot?
3. What other information can you determine using the first peak?
4. What does the frequency of subsequent peaks indicate in a g(r) plot?
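
For orientation while reading, here is a minimal sketch of a histogram-based g(r) estimate for a single frame. It assumes a cubic periodic box and uses uniformly random points as stand-in data (not simulation output). The function name `rdf_single_frame` and all parameter values are hypothetical, and this is not how freud is implemented internally.

```python
import numpy as np

def rdf_single_frame(positions, box_length, n_bins, r_max):
    """Illustrative histogram-based g(r) for one frame in a cubic periodic box."""
    n = len(positions)
    # All pair separation vectors, wrapped with the minimum-image convention.
    delta = positions[:, None, :] - positions[None, :, :]
    delta -= box_length * np.round(delta / box_length)
    dist = np.linalg.norm(delta, axis=-1)
    # Keep each pair once (upper triangle, which also drops self-distances).
    dist = dist[np.triu_indices(n, k=1)]

    edges = np.linspace(0.0, r_max, n_bins + 1)
    counts, _ = np.histogram(dist, bins=edges)

    # Pair count expected in each spherical shell for uncorrelated particles
    # at the same number density (uses N^2/2 pairs, an approximation for large N).
    shell_volume = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    density = n / box_length**3
    ideal_counts = 0.5 * n * density * shell_volume

    centers = 0.5 * (edges[1:] + edges[:-1])
    return centers, counts / ideal_counts

# Stand-in data: 500 uniformly random points in a box of edge length 10.
rng = np.random.default_rng(0)
box_length = 10.0
points = rng.uniform(-box_length / 2, box_length / 2, size=(500, 3))
r, g = rdf_single_frame(points, box_length, n_bins=20, r_max=4.9)
```

Because the points here are uncorrelated, the resulting curve carries no structural features; your hard sphere trajectories will look different.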

Now that you've read about the radial distribution function, it's time to calculate it for our system. Let's start by loading our simulation trajectory. Make sure to include the filename from your simulation's GSD writer.

In [None]:
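# Define `filename` as the path you passed to your simulation's GSD writer before running this cell.
traj = gsd.hoomd.open(filename, mode='r')
snap = traj[0]  # first frame; used below to read the box dimensions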

For the RDF, we want to specify our correlation cutoff distance, or the maximum inter-particle distance we will consider in our calculation. The RDF cutoff distance should be smaller than half of the box size. To ensure this, we can find a value just under half the box size using numpy's `nextafter` function. If you'd like an explanation of how this function works, check out the numpy [documentation](https://numpy.org/doc/stable/reference/generated/numpy.nextafter.html).

In [None]:
# Use the shortest box edge so the cutoff stays valid even if the box is not cubic.
halfbox = np.nextafter(np.min(snap.configuration.box[:3])*0.5, 0, dtype=np.float32)
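
As a quick illustration of what `np.nextafter` does here, the sketch below uses a hypothetical box edge length of 10 (not a value from your simulation) and shows that the result is the closest representable `float32` below the exact half-box value.

```python
import numpy as np

box_length = np.float32(10.0)        # hypothetical cubic box edge length
half = box_length * np.float32(0.5)  # exactly half the box

# Step from `half` toward 0 by one representable float32 value.
just_under_half = np.nextafter(half, 0, dtype=np.float32)

print(half, just_under_half, just_under_half < half)
```

The difference between the two values is tiny, so the cutoff still covers essentially the full half-box range while staying strictly below it.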

Now, we can set up the RDF object from freud. Here we must decide how many bins we'd like to use in this calculation. This will also determine the size of the bins, which is $L_{bin} = \frac{r_{max}-r_{min}}{N_{bins}}$. For our cases, $r_{min} = 0$ and $r_{max}$ is just under $L_{box}/2$. It is important to choose an appropriate bin size when computing the RDF: too few bins blur the peaks, while too many leave few pairs in each bin and make the curve noisy. We're going to use 100 bins, which is generally an adequate number.

In [None]:
rdf = freud.density.RDF(bins=100, r_max=halfbox)
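
To make the bin-size formula concrete, here is a small sketch that builds evenly spaced bins by hand. The value `r_max = 5.0` is a hypothetical placeholder; in the notebook it would be `halfbox`.

```python
import numpy as np

r_min, r_max, n_bins = 0.0, 5.0, 100   # r_max is a hypothetical placeholder

# Bin width from the formula L_bin = (r_max - r_min) / N_bins.
bin_width = (r_max - r_min) / n_bins

# N_bins bins need N_bins + 1 edges; centers sit halfway between neighboring edges.
edges = np.linspace(r_min, r_max, n_bins + 1)
centers = 0.5 * (edges[:-1] + edges[1:])

print(f"bin width: {bin_width:.3f}, first center: {centers[0]:.3f}")
```

The bin centers computed this way play the same role as the `rdf.bin_centers` values used for the x-axis when plotting.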

Now, we want to actually compute the RDF. To do this, we are going to average over the last 50 frames in our simulation trajectory, which is where the `for snap in traj[-50:]` loop comes from. 

5. Why is it important to average our results over multiple timeframes?

For each snapshot, we use the RDF object's `compute` method to calculate the RDF from the positions of our particles. By setting `reset=False`, each new call is accumulated with all the previous calls, so the final result is an average over every frame in the loop.

In [None]:
for snap in traj[-50:]:
    box = snap.configuration.box
    pos = snap.particles.position
    rdf.compute(system=(box, pos), reset=False)
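
The accumulate-then-average pattern behind `reset=False` can be sketched with plain numpy histograms. This is only a conceptual illustration under assumed inputs: the per-frame distances are synthetic random numbers, the variable names are hypothetical, and freud's internal bookkeeping may differ.

```python
import numpy as np

rng = np.random.default_rng(1)
edges = np.linspace(0.0, 5.0, 11)

# Synthetic stand-in for per-frame pair distances (not simulation data).
frames = [rng.uniform(0.0, 5.0, size=200) for _ in range(50)]

running_counts = np.zeros(len(edges) - 1)
n_accumulated = 0
for distances in frames:
    counts, _ = np.histogram(distances, bins=edges)
    running_counts += counts   # add to the running total instead of overwriting
    n_accumulated += 1

# Dividing the running total by the number of frames gives the per-frame average.
average_counts = running_counts / n_accumulated
```

With `reset=True` the analogue would be zeroing `running_counts` at the start of every iteration, so only the last frame would survive.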

Now, we can plot our RDF! Make sure to properly label each curve and format your axes.

6. Compare the RDF from each simulation. What similarities and differences do you notice in the curves? What are the causes of these changes?

In [None]:
fig,ax = plt.subplots()
ax.plot(rdf.bin_centers, rdf.rdf)
plt.show()