# Analysis of Trajectories

Trajectories --|math| --> Statistics

In [2]:
# Importing the required libraries
import matplotlib.pyplot as plt


# The bread-and-butter of numerical computation aka statistics with python
import pandas as pd
import numpy as np


In [None]:
# We start by loading the trajectory file into a pandas dataframe
trajectory_file = None  # Set this to the path of your trajectory CSV file
traj = pd.read_csv(trajectory_file)

# View some of the data from the top (or the dataframe's head)
# (print it explicitly: a notebook cell only displays its last expression)
print(traj.head())

# Print some summary statistics
traj.describe()

The dataframe contains several columns. Note the columns `frame` and `particle`. Each row represents a single feature: one particle in one frame.
One frame is one photo and represents one ensemble.
To get a single frame, we can `query` the dataframe as follows:


In [None]:
# View a single frame
frame_no = 4
print(traj[traj.frame == frame_no])
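The boolean mask used above and the `DataFrame.query` method select the same rows. A minimal sketch on a small made-up dataframe (the name `toy` and all its values are hypothetical, not taken from the trajectory file):

```python
import pandas as pd

# Hypothetical toy data with the same `frame` and `particle` columns
toy = pd.DataFrame({
    "frame":    [3, 3, 4, 4, 5],
    "particle": [0, 1, 0, 1, 0],
    "x":        [1.0, 7.0, 1.5, 7.2, 2.1],
    "y":        [2.0, 5.0, 2.4, 5.1, 2.9],
})

frame_no = 4

# Boolean-mask selection, as in the cell above
by_mask = toy[toy.frame == frame_no]

# Equivalent selection with DataFrame.query; `@` refers to a Python variable
by_query = toy.query("frame == @frame_no")

print(by_query)
```

Both forms return the two rows belonging to frame 4. Which one to use is mostly a matter of taste.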

To get a particle trajectory instead, we can query for a particular particle. Please note that not every particle ID exists: many of them were filtered out in the previous notebook.

In [None]:
part_no = 4
eg_traj = traj[traj.particle == part_no]
print(eg_traj)

In [None]:
# Let's see how it moves!
plt.plot(eg_traj.x, eg_traj.y)
plt.title(f"Trajectory of particle {part_no}")

Now let's start with the analysis. We want to find the average speeds in x and y of all the particles and plot a 2D histogram (or a heatmap) with them.

## How to calculate the speeds?

We already have the trajectories of the particles, which give us the positions x and y in every frame. Speed is **distance / time**, so we calculate the distance travelled along the trajectory and divide it by the total time. That gives us the average speed. Have a look at the function below. Our units are `pixels/frame`. Here `pixel` is a unit of distance and `frame` is a unit of time (given that the `framerate` is fixed).

In [None]:
def avg_speed(df_part, framerate=None):
    
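    # We calculate the time (the number of frames elapsed) from the `frame` column;
    # after pd.read_csv the dataframe index is a row number, not a frame number
    tot_frames = df_part.frame.max() - df_part.frame.min()
    time = tot_frames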
    
    # Extract x and y positions
    xlist = np.array(df_part.x)
    ylist = np.array(df_part.y)
    
    # Calculating displacement and distance arrays (plus a little array slicing to equalize the array sizes)
    dispx = xlist[:-1] - xlist[1:] # x(sans last element) - x(sans first element)
    dispy = ylist[:-1] - ylist[1:] # y(sans last element) - y(sans first element)
    
    # Displacement -> distance: take the absolute value ("mod"), e.g. abs(-x) = x for x > 0
    distx = abs(dispx)
    disty = abs(dispy)
    
    tot_distx = sum(distx)
    tot_disty = sum(disty)
    
    # Calculating the speed
    speed_x = tot_distx / time
    speed_y = tot_disty / time
    
    return (speed_x, speed_y)
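To check the arithmetic by hand, here is the same calculation on a small made-up trajectory. This is only a sketch: the dataframe `toy_part` and its values are hypothetical, `np.diff` is used for the step between consecutive positions, and the elapsed time is taken from the `frame` column.

```python
import numpy as np
import pandas as pd

# Hypothetical trajectory of one particle over frames 0..4
toy_part = pd.DataFrame({
    "frame": [0, 1, 2, 3, 4],
    "x":     [0.0, 1.0, 3.0, 2.0, 2.0],
    "y":     [0.0, 0.5, 0.5, 1.5, 1.0],
})

# Elapsed time in frames
elapsed = toy_part.frame.max() - toy_part.frame.min()

# np.diff gives the step between consecutive positions;
# the absolute values of the steps add up to the distance travelled
tot_distx = np.abs(np.diff(toy_part.x)).sum()
tot_disty = np.abs(np.diff(toy_part.y)).sum()

speed_x = tot_distx / elapsed
speed_y = tot_disty / elapsed
print(speed_x, speed_y)
```

The steps in x are 1, 2, -1 and 0, so the distance is 4 pixels over 4 frames, or 1.0 pixels/frame. In y the distance is 2 pixels, giving 0.5 pixels/frame.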

Now modify the above function so that it returns the speeds in units of `pixels/second`:


In [None]:
def avg_speed2():
    pass

In [None]:
# Now let's do the `avg_speed` operation for all the particles we have.

speeds = list() # Create an empty list

# + 1 so that the largest particle id is included as well
for i in range(max(traj.particle) + 1):
    df_per_particle = traj[traj.particle==i]
    
    if df_per_particle.empty:
        #print(i, "Particle doesn't exist!")
        pass # Ignore this particle number
        
    else:
        # Calculate speeds and append to list
        speeds.append(avg_speed(df_per_particle))
        
speeds # We now have a list of average speeds (vx, vy) for all available particles
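As an alternative to looping over every possible id and skipping the missing ones, `groupby` visits only the particle ids that actually exist. A minimal sketch on made-up data (the names `toy`, `toy_speed` and `toy_speeds` are hypothetical; `toy_speed` is a compact stand-in for `avg_speed`, not a replacement for it):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: particle ids 0 and 2 exist, id 1 is missing
toy = pd.DataFrame({
    "frame":    [0, 0, 1, 1, 2, 2],
    "particle": [0, 2, 0, 2, 0, 2],
    "x":        [0.0, 5.0, 1.0, 5.0, 3.0, 4.0],
    "y":        [0.0, 5.0, 0.0, 6.0, 0.0, 8.0],
})

def toy_speed(df_part):
    # Hypothetical helper: same idea as avg_speed, written compactly
    elapsed = df_part.frame.max() - df_part.frame.min()
    return (np.abs(np.diff(df_part.x)).sum() / elapsed,
            np.abs(np.diff(df_part.y)).sum() / elapsed)

# groupby yields one sub-dataframe per existing particle id
toy_speeds = [toy_speed(df_part) for _, df_part in toy.groupby("particle")]
print(toy_speeds)
```

The result has one `(vx, vy)` pair per existing particle, in order of increasing particle id, so no emptiness check is needed.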

Now let's calculate the 2D histogram.

We need to restructure the data for the next step. The list `speeds` has the following form:

[(vx1, vy1), (vx2, vy2), (vx3, vy3), ...]

For the 2D histogram, we need the following form:

vx_list = [vx1, vx2, vx3, ...]

vy_list = [vy1, vy2, vy3, ...]
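One possible way to do this restructuring is to transpose the list of pairs with `zip`. The sketch below uses made-up speed values (`speeds_example` is hypothetical) and then bins them with `np.histogram2d`; the choice of two bins per axis is arbitrary.

```python
import numpy as np

# Hypothetical speeds for three particles, in pixels/frame
speeds_example = [(0.5, 1.0), (1.5, 0.2), (0.7, 0.9)]

# zip(*...) transposes the list of pairs into one sequence per component
vx_list, vy_list = (list(v) for v in zip(*speeds_example))

# Count how many particles fall into each (vx, vy) bin
counts, vx_edges, vy_edges = np.histogram2d(vx_list, vy_list, bins=2)

print(vx_list)
print(vy_list)
print(counts)
```

For plotting, matplotlib's `plt.hist2d(vx_list, vy_list)` does the binning and the drawing in one call.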