# Part 2 - Dynamics

### In these exercises, you will analyze a real molecular dynamics trajectory of 125 water molecules at $T$=300K

# Setup
### Run the following cell to set up the plotting libraries and read in the data trajectories

#### The get_data function will read in two trajectories, one of positions and one of velocities. All data is in the "data/" folder, along with the LAMMPS inputs used to create it.
#### xyz is a numpy array containing the positions of the atoms vs time, with dimensions (N_frames, N_atoms, 3). *These positions have units of angstroms.* [Technical note: I have saved this data such that the molecules do not get wrapped back into the periodic box. This makes it easier to compute diffusion properties, as below.]

#### vxyz is a numpy array containing the velocities of the atoms vs time with dimensions (N_frames, N_atoms, 3). *These velocities are in angstroms/ps*

#### x_times are the times corresponding to each frame in the xyz array, and v_times are the times corresponding to the frames in the velocity array. *These times are in picoseconds*


#### Within the position and velocity arrays, each molecule is contiguous. The order of atoms is Oxygen, Hydrogen, Hydrogen. So water1 has indices (0,1,2), water2 has indices (3,4,5), etc.

In [1]:
%pylab inline
from utilities import get_data

xyz, vxyz, x_times, v_times = get_data()

Populating the interactive namespace from numpy and matplotlib


In [2]:
print("Shape of position data: ",xyz.shape, x_times.shape)
print("Shape of velocity data: ",vxyz.shape, v_times.shape)

Shape of position data:  (2001, 375, 3) (2001,)
Shape of velocity data:  (501, 375, 3) (501,)


# Problem 2.1 - Getting oriented (20 points)
Compute and answer the following questions:
* What is the length of time in real units for the position data, and what is the length of time for the velocity data? 
* What is the spacing between frames in each case?
* What are the starting x,y, and z positions of the atoms in the first molecule?
* What are the starting velocities, vx,vy,vz, of the atoms in the first molecule?
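As a rough illustration of the kind of array bookkeeping involved, the sketch below works on small synthetic arrays laid out like `xyz` and `x_times`. The names `xyz_demo`, `times_demo`, and `describe_trajectory` are hypothetical, and the 0.5 ps spacing is made up for the demo, not taken from the real data.

```python
import numpy as np

# Synthetic stand-ins with the same layout as the real arrays (hypothetical values).
rng = np.random.default_rng(0)
n_frames, n_atoms = 11, 6                 # two water molecules, ordered O, H, H
xyz_demo = rng.normal(size=(n_frames, n_atoms, 3))
times_demo = np.arange(n_frames) * 0.5    # made-up 0.5 ps spacing


def describe_trajectory(times):
    """Return (total length, frame spacing) in the units of `times`."""
    spacing = np.diff(times)
    # A single spacing only makes sense if the frames are evenly spaced.
    if not np.allclose(spacing, spacing[0]):
        raise ValueError("frames are not evenly spaced")
    return times[-1] - times[0], spacing[0]


total_time, dt = describe_trajectory(times_demo)

# Frame 0, atoms 0..2 are the O, H, H of the first molecule; columns are x, y, z.
first_molecule_start = xyz_demo[0, 0:3, :]

print("total time:", total_time, "spacing:", dt)
print("first molecule, frame 0:\n", first_molecule_start)
```

The same calls should carry over to `x_times`, `v_times`, `xyz`, and `vxyz` once the real data is loaded.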

# Problem 2.2 - Diffusion and mean-squared displacement
## In this problem we will compute the mean squared displacement of water and from that extract the diffusion constant
## As a simplification, we can do it for the oxygen atoms, although you could do it for the center of mass of each molecule

# Problem 2.2.1 - Simple MSD for oxygen (30 points)
## In this problem, we will compute and plot the mean squared displacement of oxygen atoms the "simple way"
### 1) First, get the coordinates of the oxygen atoms at every time
### 2) Then, compute the squared displacement compared to *time 0* only for each atom $\delta\vec{r_i}(t)\cdot\delta\vec{r_i}(t)$, where for each atom displacement $\delta \vec{r}_i(t) = \vec{r_i}(t)-\vec{r_i}(0)$,  and then take the average over the 125 atoms
### With numpy arrays you can do this pretty much all in one line, or you can do it one step at a time
### 3) Plot this MSD versus time. It should look kind of noisy, like the following
![RoughMSD](example_figures/msd_rough.png)
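The steps above might be sketched as follows. This is one possible approach, shown on a synthetic random walk rather than the real trajectory; `simple_msd`, `traj_demo`, and `oxygen_demo` are hypothetical names introduced for illustration.

```python
import numpy as np


def simple_msd(positions):
    """MSD relative to frame 0, averaged over atoms.

    positions: array of shape (n_frames, n_atoms, 3)
    returns:   array of shape (n_frames,)
    """
    disp = positions - positions[0]          # delta r_i(t) = r_i(t) - r_i(0)
    sq_disp = (disp ** 2).sum(axis=-1)       # dot product of delta r_i with itself
    return sq_disp.mean(axis=-1)             # average over atoms


# Synthetic random walk laid out like the real data: 5 "waters", O, H, H order.
rng = np.random.default_rng(1)
steps = rng.normal(scale=0.1, size=(200, 15, 3))
traj_demo = np.cumsum(steps, axis=0)

oxygen_demo = traj_demo[:, ::3, :]           # every third atom, starting at 0, is O
msd_simple = simple_msd(oxygen_demo)
print(msd_simple[:5])
```

With the notebook's arrays, the analogous call would be something like `simple_msd(xyz[:, ::3, :])`, plotted against `x_times`.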


# Problem 2.2.2 - Better MSD computation for oxygen (50 points)
## In this problem, we will compute and plot the mean squared displacement of oxygen atoms the "real way". You can do this however you want, but one procedure would be:
### 1) Compute the squared displacement compared to every possible starting time, *time s* for each atom, so for each atom displacement $\delta \vec{r}_i(t) = \vec{r_i}(t+s)-\vec{r_i}(s)$,  and then take the average over starting times $s$
### 2) Then you can average over the 125 atoms. You can also compute a standard deviation over the 125 atoms
### 3) Plot this MSD versus time on top of your previous plot, and use error bars to show the standard deviations. It should look like the following
![MSDComparison](example_figures/msd_comparison.png)
### 4) As best you can, explain the size of the error bars versus time
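A minimal sketch of the procedure described above, assuming evenly spaced frames. The function name `windowed_msd` and the synthetic input are hypothetical; the loop over lag times is one straightforward way to do it, not necessarily the fastest.

```python
import numpy as np


def windowed_msd(positions, max_lag=None):
    """MSD averaged over all start times s, then over atoms.

    positions: array of shape (n_frames, n_atoms, 3)
    returns:   (mean, std) over atoms, each of shape (max_lag + 1,)
    """
    n_frames, n_atoms = positions.shape[:2]
    if max_lag is None:
        max_lag = n_frames - 1
    per_atom = np.zeros((max_lag + 1, n_atoms))   # lag 0 stays exactly zero
    for lag in range(1, max_lag + 1):
        # r_i(s + lag) - r_i(s) for every available start time s at once
        disp = positions[lag:] - positions[:-lag]
        per_atom[lag] = (disp ** 2).sum(axis=-1).mean(axis=0)
    return per_atom.mean(axis=1), per_atom.std(axis=1)


# Synthetic random walk standing in for the oxygen coordinates.
rng = np.random.default_rng(3)
walk_demo = np.cumsum(rng.normal(scale=0.1, size=(300, 5, 3)), axis=0)
msd_mean, msd_std = windowed_msd(walk_demo, max_lag=100)
print(msd_mean[:5], msd_std[:5])
```

Note that the number of start times contributing to each lag is `n_frames - lag`, so long lags are averaged over fewer samples than short ones.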


# Problem 2.2.3 - Diffusion constant (30 points)
### 1) Recall the theoretical relationship between MSD(t) and the Diffusion constant.
### 2) Compute the diffusion constant by fitting the MSD data you computed to this formula, and make sure to give proper units. 
### You can search "fitting in python" and things like that to find out how best to do it. I used the library function "sklearn.linear_model.LinearRegression" but there are many possible ways to do it
### If you want to be sophisticated, you can use the standard deviation or other error metrics (jackknife, bootstrap) in your fit

### ** If you cannot get the previous part to work, you can read in my MSD values and still do this part, by running the following command: msd = np.load("example_figures/msd_data.npy")
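As one possible way to do the fit, the sketch below uses `np.polyfit` (swapped in here for the `LinearRegression` approach mentioned above) and assumes the three-dimensional Einstein relation, MSD(t) ≈ 6Dt at long times. The function name `diffusion_from_msd`, the `fit_start` parameter, and the synthetic MSD values are hypothetical.

```python
import numpy as np

A2_PER_PS_TO_CM2_PER_S = 1e-4   # 1 A^2/ps = 1e-16 cm^2 / 1e-12 s


def diffusion_from_msd(times_ps, msd_A2, fit_start=0.0):
    """Fit MSD = 6*D*t + c for t >= fit_start and return D.

    times_ps: times in picoseconds
    msd_A2:   MSD in square angstroms
    returns:  (D in A^2/ps, D in cm^2/s)
    """
    mask = times_ps >= fit_start              # optionally skip the short-time regime
    slope, _intercept = np.polyfit(times_ps[mask], msd_A2[mask], 1)
    d_A2_per_ps = slope / 6.0                 # Einstein relation in three dimensions
    return d_A2_per_ps, d_A2_per_ps * A2_PER_PS_TO_CM2_PER_S


# Synthetic, perfectly linear MSD with a known slope (made-up numbers).
t_demo = np.linspace(0.0, 10.0, 101)
msd_demo = 6.0 * 0.25 * t_demo + 0.1
d_ang, d_cgs = diffusion_from_msd(t_demo, msd_demo, fit_start=1.0)
print(d_ang, "A^2/ps =", d_cgs, "cm^2/s")
```

Choosing where the linear fit starts is a judgment call, since the shortest times are typically not yet in the diffusive regime.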

# Problem 2.2.4 - Velocity distribution (30 points)
## 1) Compute the distribution of velocities for "O" and "H" atoms separately in either x,y or z. You can check whether they are the same in each dimension if you want.
## 2) Plot these on top of each other. It should look something like the following
![VDist](example_figures/velocity_distribution.png)

### 3) What is the formula that should correspond to each curve? 
### 4) Compute the standard deviation of the velocity data, and compare to the ideal value $\sqrt{k_B T/m}$. How accurately is this satisfied by the simulation? Don't forget to be careful about which units you're using!
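The unit handling is the easiest place to slip, so here is a hedged sketch of the ideal value in angstroms/ps, together with a histogram of synthetic Gaussian samples. The atomic masses (15.999 and 1.008 amu) are standard values assumed here rather than read from the simulation inputs, and `ideal_sigma`, `gaussian_pdf`, and `vx_demo` are hypothetical names.

```python
import numpy as np

K_B = 1.380649e-23            # Boltzmann constant, J/K
AMU = 1.66053906660e-27       # atomic mass unit, kg
M_PER_S_TO_A_PER_PS = 1e-2    # 1 m/s = 1e10 A / 1e12 ps


def ideal_sigma(mass_amu, temperature_K):
    """sqrt(k_B T / m) for one velocity component, in angstroms/ps."""
    sigma_m_per_s = np.sqrt(K_B * temperature_K / (mass_amu * AMU))
    return sigma_m_per_s * M_PER_S_TO_A_PER_PS


def gaussian_pdf(v, sigma):
    """Zero-mean normal density, for overlaying on a normalized histogram."""
    return np.exp(-v ** 2 / (2.0 * sigma ** 2)) / (sigma * np.sqrt(2.0 * np.pi))


sigma_O = ideal_sigma(15.999, 300.0)   # assumed oxygen mass
sigma_H = ideal_sigma(1.008, 300.0)    # assumed hydrogen mass
print("ideal sigma (A/ps): O =", sigma_O, " H =", sigma_H)

# Synthetic samples standing in for one velocity component of the O atoms.
rng = np.random.default_rng(2)
vx_demo = rng.normal(scale=sigma_O, size=100_000)
hist, edges = np.histogram(vx_demo, bins=50, density=True)
centers = 0.5 * (edges[1:] + edges[:-1])
print("sample std / ideal:", vx_demo.std() / sigma_O)
```

For the real data, the samples would come from something like `vxyz[:, ::3, 0].ravel()` for oxygen, and `hist` could be plotted against `centers` alongside `gaussian_pdf(centers, sigma_O)`.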

# Problem 2.2.5 - Velocity-velocity auto correlation function (50 points)
### 1) Compute the velocity-velocity autocorrelation function $\langle \vec{v}(t)\cdot \vec{v}(0)\rangle$. This should be a very similar function to the "correct" MSD calculation above, averaging over all start times and atoms
### 2) Plot this function versus time. Is there any interesting feature in this curve that you maybe didn't expect? [Hint: look at point (a) of the abstract for this paper, which is a very famous study related to the first MD simulations of liquids. https://aip.scitation.org/doi/abs/10.1063/1.1727719 ] What do you think the reason for this feature might be? 
### 3) Using the appropriate formula, determine the diffusion constant from this correlation function by numerical integration. I used the np.trapz function to do the integration, but you can use whatever method you want. Is this value close to what you got before?
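A sketch of one way to approach these steps, assuming the Green-Kubo relation in three dimensions, $D = \frac{1}{3}\int_0^\infty \langle \vec{v}(t)\cdot\vec{v}(0)\rangle\,dt$. The names `vacf` and `green_kubo_diffusion` are hypothetical. Because `np.trapz` was renamed `np.trapezoid` in NumPy 2.0, the code picks whichever one is available.

```python
import numpy as np


def vacf(velocities, max_lag):
    """<v(t) . v(0)> averaged over all start times and atoms.

    velocities: array of shape (n_frames, n_atoms, 3)
    returns:    array of shape (max_lag + 1,)
    """
    n_frames = velocities.shape[0]
    out = np.empty(max_lag + 1)
    for lag in range(max_lag + 1):
        v0 = velocities[: n_frames - lag]     # v(s) for every start time s
        vt = velocities[lag:]                 # v(s + lag)
        out[lag] = (v0 * vt).sum(axis=-1).mean()
    return out


def green_kubo_diffusion(times, c_vv):
    """D = (1/3) * integral of the VACF, by the trapezoid rule."""
    trapezoid = getattr(np, "trapezoid", None) or np.trapz
    return trapezoid(c_vv, times) / 3.0


# Check on an analytic curve: the integral of 3*exp(-t) is about 3, so D is about 1.
t_demo = np.linspace(0.0, 20.0, 2001)
d_demo = green_kubo_diffusion(t_demo, 3.0 * np.exp(-t_demo))
print(d_demo)
```

With velocities in angstroms/ps and times in ps, the result comes out in angstroms²/ps, which can be compared directly with the MSD-based estimate before any unit conversion.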

# Problem 2.2.6 (Bonus) - Dipole-dipole correlation function (50 points)
## Compute and plot the dipole-dipole correlation function, as from your final homework. $\langle \vec{\mu}(t)\cdot \vec{\mu}(0)\rangle$. 
### First compute the dipole of every water molecule at each time
### Then you should be able to reuse your velocity-velocity correlation function, which, like the "correct" MSD calculation above, averages over all start times and molecules
### Does this function decay as we predicted theoretically for a rigid molecule in the homework? How would you get the rotational 'friction' from this plot?
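A hedged sketch of the dipole step: it assumes a water model in which the net charge distribution is symmetric about the H-O-H bisector, so the dipole direction is proportional to $\vec{r}_{H1} + \vec{r}_{H2} - 2\vec{r}_O$. The actual charges are not given here, so only unit vectors are computed. The names `dipole_directions` and `autocorrelation` are hypothetical. Because the coordinates are unwrapped, no minimum-image correction is applied.

```python
import numpy as np


def dipole_directions(positions):
    """Unit vectors along each molecule's dipole.

    positions: array of shape (n_frames, 3 * n_molecules, 3), atom order O, H, H
    returns:   array of shape (n_frames, n_molecules, 3)
    """
    o = positions[:, 0::3]
    h1 = positions[:, 1::3]
    h2 = positions[:, 2::3]
    mu = h1 + h2 - 2.0 * o                    # points from O toward the H midpoint
    return mu / np.linalg.norm(mu, axis=-1, keepdims=True)


def autocorrelation(vectors, max_lag):
    """<u(t) . u(0)> averaged over all start times and molecules."""
    n_frames = vectors.shape[0]
    return np.array([
        (vectors[: n_frames - lag] * vectors[lag:]).sum(axis=-1).mean()
        for lag in range(max_lag + 1)
    ])


# One static molecule: O at the origin, H atoms at (1, 1, 0) and (-1, 1, 0).
mol = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 0.0], [-1.0, 1.0, 0.0]])
static_demo = np.tile(mol, (10, 1, 1))        # 10 identical frames
u_demo = dipole_directions(static_demo)
c_demo = autocorrelation(u_demo, 5)
print(u_demo[0], c_demo)
```

For unit vectors the correlation function starts at 1 by construction, which makes decay curves from different runs easy to compare.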