# Computational Practical - Introduction to Python, Simulations and Data Handling with Pandas

In this practical we're going to:
- Set up Anaconda/Visual Studio and make sure everyone knows how to create an environment to run code 
- Simulate and visualise Brownian Motion 
- Save and Load a large Data File with our simulator 
- Data Handling with our new data! 

## 1. Recap on Brownian Motion 

Brownian motion refers to the random and continuous movement of particles suspended in a fluid (liquid or gas), which occurs because the particles are constantly struck by the much smaller, fast-moving molecules of the fluid. This phenomenon was first observed by the Scottish botanist Robert Brown in 1827. Successive steps are independent: each one is random and unrelated to the previous one, so the motion is often modeled as a 'random walk'. Over time, the displacement of a particle from its starting position follows a Gaussian distribution whose width grows as time progresses.

Key factors that amplify the motion of particles include increased temperature (leading to more energy and hence, more motion), less viscous solutions (making it easier for particles to move), and smaller particle sizes (as smaller particles move faster and travel further compared to larger particles). In colloids, Brownian motion prevents the larger particles from settling down due to gravity, keeping them suspended in the solution.

Examples of Brownian motion are evident in the diffusion process and can be observed in experiments involving the tracking of fluorescent dyes in solutions. Notable figures in the study of Brownian motion include Albert Einstein, who published a theoretical explanation of the phenomenon in 1905, and Jean Baptiste Perrin, who was awarded the Nobel Prize in Physics in 1926 for experimental work that contributed to the understanding of Brownian motion. Research in this area continues today, including work on the mathematical modeling of the motion.

## 2. Formulas needed for Brownian Motion

#### One-Dimensional Brownian Motion:

In a standard one-dimensional Brownian motion $B_t$, the change $ \Delta B_t $ over an interval $ \Delta t$ is given by:

$$ \Delta B_t \sim N(0, \Delta t) $$

Where:
- $ \Delta B_t $: Change in position
- $N(0, \Delta t)$: Normally distributed random variable with mean $0$ and variance $ \Delta t $

#### Two-Dimensional Brownian Motion:

In two dimensions, the $x$ and $y$ coordinates, $ (X_t, Y_t) $, evolve independently as:

$$ \Delta X_t \sim N(0, \Delta t) $$
$$ \Delta Y_t \sim N(0, \Delta t) $$

Where:
- $ \Delta X_t $, $\Delta Y_t$: Change in x and y coordinates respectively
- $ N(0, \Delta t) $: Normally distributed random variable with mean 0 and variance $ \Delta t $

#### General Form:

For a motion that starts at $ B_0 = 0 $, the position $ B_t $ at time $ t $ has the same distribution as a scaled standard normal variable, which can be expressed as:

$$ B_t = \sqrt{t} \cdot Z $$

Where:
- $ B_t $: Position at time $ t $
- $ \sqrt{t} $: Square root of time
- $ Z $: Standard normal random variable ($ Z \sim N(0, 1)$)
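
As a quick numerical check of the formulas above, the sketch below sums many independent $N(0, \Delta t)$ increments and compares the variance of the final position with $t$. The step size, number of steps, number of paths and the random seed are all assumed values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed, only for reproducibility

dt = 0.01          # time step (assumed value)
n_steps = 100      # number of steps, so the final time is t = n_steps * dt
n_paths = 20_000   # number of independent paths (assumed value)

# Each increment is drawn from N(0, dt), i.e. with standard deviation sqrt(dt).
increments = rng.normal(loc=0.0, scale=np.sqrt(dt), size=(n_paths, n_steps))

# The position at the final time is the sum of all increments of a path.
b_final = increments.sum(axis=1)

t_final = n_steps * dt
print("sample mean of B_t:    ", b_final.mean())  # should be close to 0
print("sample variance of B_t:", b_final.var())   # should be close to t_final
print("expected variance t:   ", t_final)
```

If the sample variance is close to $t$, the summed increments behave like $\sqrt{t} \cdot Z$, as the general form suggests.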


## 3. So how do we transition from Theory to Simulation? Can we simulate this?

Sometimes it can be tricky to transition from theory to simulation, so let's go through it step by step. Write a function or a class (whichever you prefer) that simulates Brownian motion in one dimension as a function of time.

In [1]:
import numpy as np 
import matplotlib.pyplot as plt 

Can you plot this? (Lorenzo is going to show you how to plot this and make it pretty in the next practical!)
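
If you get stuck, here is one possible sketch rather than the only way to do it. It draws each step from $N(0, \Delta t)$ and accumulates the steps with a cumulative sum. The function name `simulate_brownian_1d` and its parameters are hypothetical choices made for this example.

```python
import numpy as np
import matplotlib.pyplot as plt


def simulate_brownian_1d(n_steps, dt, rng=None):
    """Return (times, positions) for one 1D Brownian path that starts at 0."""
    rng = np.random.default_rng() if rng is None else rng
    # Each step is N(0, dt): mean 0 and standard deviation sqrt(dt).
    steps = rng.normal(0.0, np.sqrt(dt), size=n_steps)
    # The position is the running sum of the steps, with the origin prepended.
    positions = np.concatenate(([0.0], np.cumsum(steps)))
    times = np.arange(n_steps + 1) * dt
    return times, positions


# Assumed example values: 1000 steps of size 0.01 and an arbitrary seed.
times, positions = simulate_brownian_1d(
    n_steps=1000, dt=0.01, rng=np.random.default_rng(1)
)

fig, ax = plt.subplots()
ax.plot(times, positions)
ax.set_xlabel("time t")
ax.set_ylabel("position B_t")
ax.set_title("1D Brownian motion")
plt.show()
```

Passing a seeded generator makes the path reproducible; leave `rng` out to get a new path on every call.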

## Two Dimensional Simulation: Can you simulate two dimensional Brownian motion? How would you plot this?

See if you can get this yourself! You can either extend your one-dimensional function or write a new one! It's up to you! :)

In [4]:
#put your code here to simulate and plot brownian motion
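
One way this could look, as a minimal sketch: the $x$ and $y$ steps are drawn independently from $N(0, \Delta t)$ and the path is plotted as $y$ against $x$. The function name `simulate_brownian_2d` and the parameter values are assumptions for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt


def simulate_brownian_2d(n_steps, dt, rng=None):
    """Return (times, x, y) for one 2D Brownian path that starts at the origin."""
    rng = np.random.default_rng() if rng is None else rng
    # Column 0 holds the x steps and column 1 the y steps, drawn independently.
    steps = rng.normal(0.0, np.sqrt(dt), size=(n_steps, 2))
    xy = np.vstack(([0.0, 0.0], np.cumsum(steps, axis=0)))
    times = np.arange(n_steps + 1) * dt
    return times, xy[:, 0], xy[:, 1]


# Assumed example values: 2000 steps of size 0.01 and an arbitrary seed.
times, x, y = simulate_brownian_2d(
    n_steps=2000, dt=0.01, rng=np.random.default_rng(2)
)

fig, ax = plt.subplots()
ax.plot(x, y, linewidth=0.8)
ax.scatter([x[0]], [y[0]], color="green", label="start", zorder=3)
ax.scatter([x[-1]], [y[-1]], color="red", label="end", zorder=3)
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_aspect("equal")  # same scale on both axes so the path is not distorted
ax.legend()
plt.show()
```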


Great, now you know how to simulate Brownian motion! Your next step is to see if you can save it! Usually when we handle large datasets in science, we use a Python library called pandas. It's a great way to handle our data easily and efficiently. So make sure you have it installed!

In [26]:
#put your code here to write it as a file with pandas and save that file
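
A minimal sketch of this step, assuming a single 2D trajectory stored in columns named `t`, `x` and `y`. The column names, the variable name `dataset` and the file name `brownian_2d.csv` are choices made for this example only.

```python
import numpy as np
import pandas as pd

# Simulate one 2D path (assumed values for the step count, step size and seed).
rng = np.random.default_rng(3)
n_steps, dt = 1000, 0.01
steps = rng.normal(0.0, np.sqrt(dt), size=(n_steps, 2))
xy = np.vstack(([0.0, 0.0], np.cumsum(steps, axis=0)))

# Put the trajectory into a DataFrame with one row per time point.
dataset = pd.DataFrame(
    {"t": np.arange(n_steps + 1) * dt, "x": xy[:, 0], "y": xy[:, 1]}
)

# index=False stops pandas from writing the row numbers as an extra column.
dataset.to_csv("brownian_2d.csv", index=False)
```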

Here we saved it to a csv file, but there are lots of ways you can save data, including npz, Excel etc. When you're in the lab you'll sometimes be handling unusual file formats, so make sure to look up how to save and load them in the pandas documentation!

Next step is to load them!

In [27]:
#write your code here to load the data
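
Loading could be sketched as below. So that the example runs on its own, it first writes a small file. The file name `brownian_2d.csv` and the column names `t`, `x`, `y` are assumptions, so use whatever you chose when saving.

```python
import numpy as np
import pandas as pd

# Write a small example file first so this sketch runs on its own.
# Skip this part if you already saved a file in the previous step.
rng = np.random.default_rng(4)
n_steps, dt = 500, 0.01
xy = np.vstack(
    ([0.0, 0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), size=(n_steps, 2)), axis=0))
)
pd.DataFrame(
    {"t": np.arange(n_steps + 1) * dt, "x": xy[:, 0], "y": xy[:, 1]}
).to_csv("brownian_2d.csv", index=False)

# Load the csv file back into a DataFrame.
dataset = pd.read_csv("brownian_2d.csv")

print(dataset.shape)   # (number of rows, number of columns)
print(dataset.dtypes)  # data type of each column
```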

Now our data is nice and clean and all the numbers are there. But in real life, we usually don't have such a nice dataset and data is usually missing! So I'm going to make our data look a bit more realistic. 

In [28]:
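def introduce_missing_values(df, missing_fraction=0.1):
    """Return a copy of df with roughly missing_fraction of its entries set to NaN."""
    df_with_nans = df.copy()
    total_values = df_with_nans.size
    num_missing = int(total_values * missing_fraction)
    # Pick random (row, column) positions. The same position can be drawn twice,
    # so the final fraction of NaNs may be slightly below missing_fraction.
    for _ in range(num_missing):
        row_idx = np.random.randint(0, len(df_with_nans))
        col_idx = np.random.randint(0, len(df_with_nans.columns))
        df_with_nans.iat[row_idx, col_idx] = np.nan

    return df_with_nans

# This assumes the DataFrame you loaded above is called `dataset`.
messy_dataset = introduce_missing_values(dataset, missing_fraction=0.1)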


How to view the dataset?

In [5]:
#print the dataset 
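
A few common ways to look at a DataFrame are sketched below. The small `messy_dataset` built here, with columns `t`, `x`, `y` and a handful of hand-placed NaNs, is a stand-in so the example runs on its own.

```python
import numpy as np
import pandas as pd

# Stand-in data: a short 2D path with a few values removed by hand.
rng = np.random.default_rng(5)
xy = np.cumsum(rng.normal(0.0, 0.1, size=(200, 2)), axis=0)
messy_dataset = pd.DataFrame(
    {"t": np.arange(1, 201) * 0.01, "x": xy[:, 0], "y": xy[:, 1]}
)
messy_dataset.loc[[3, 10, 57], "x"] = np.nan
messy_dataset.loc[[5, 10], "y"] = np.nan

print(messy_dataset.head(12))  # first 12 rows, where NaN marks a missing value
print(messy_dataset.tail())    # last 5 rows
print(messy_dataset.shape)     # (rows, columns)
messy_dataset.info()           # column types and non-null counts
```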


How to identify missing values?

In [6]:
#write code here to identify the missing values
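
One possible approach uses `isna()`, which returns `True` wherever a value is missing. The stand-in `messy_dataset` below, with columns `t`, `x`, `y` and hand-placed NaNs, is an assumption so the sketch runs on its own.

```python
import numpy as np
import pandas as pd

# Stand-in data with NaNs at known positions.
rng = np.random.default_rng(5)
xy = np.cumsum(rng.normal(0.0, 0.1, size=(200, 2)), axis=0)
messy_dataset = pd.DataFrame(
    {"t": np.arange(1, 201) * 0.01, "x": xy[:, 0], "y": xy[:, 1]}
)
messy_dataset.loc[[3, 10, 57], "x"] = np.nan
messy_dataset.loc[[5, 10], "y"] = np.nan

# Number of missing values in each column.
n_missing_per_column = messy_dataset.isna().sum()
print(n_missing_per_column)

# Boolean mask that is True for every row with at least one missing value.
row_has_missing = messy_dataset.isna().any(axis=1)
rows_with_missing = messy_dataset[row_has_missing]
print(rows_with_missing)
```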

## How to handle those missing values?

In [32]:
#write code to get rid of the missing values
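
Two common options are sketched below: dropping incomplete rows, or filling the gaps by linear interpolation. Which one is appropriate depends on your analysis; interpolating a random walk is only an approximation. The stand-in `messy_dataset` with columns `t`, `x`, `y` is an assumption.

```python
import numpy as np
import pandas as pd

# Stand-in data with NaNs at known positions.
rng = np.random.default_rng(5)
xy = np.cumsum(rng.normal(0.0, 0.1, size=(200, 2)), axis=0)
messy_dataset = pd.DataFrame(
    {"t": np.arange(1, 201) * 0.01, "x": xy[:, 0], "y": xy[:, 1]}
)
messy_dataset.loc[[3, 10, 57], "x"] = np.nan
messy_dataset.loc[[5, 10], "y"] = np.nan

# Option 1: drop every row that contains at least one missing value.
dropped = messy_dataset.dropna()

# Option 2: fill each gap with a straight line between its neighbours.
interpolated = messy_dataset.interpolate(method="linear")

print("rows before:", len(messy_dataset))
print("rows after dropna:", len(dropped))
print("missing after interpolation:", interpolated.isna().sum().sum())
```

Both calls return a new DataFrame and leave `messy_dataset` unchanged.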

## Or how about selecting certain columns or rows? Or calculating statistics?

In [34]:
#write code to get some stats on the dataset 
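
A short sketch of selection and summary statistics is given below, assuming a DataFrame called `dataset` with columns `t`, `x` and `y` (names chosen for this example).

```python
import numpy as np
import pandas as pd

# Stand-in data: one 2D path with assumed step count, step size and seed.
rng = np.random.default_rng(6)
n_steps, dt = 300, 0.01
xy = np.vstack(
    ([0.0, 0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), size=(n_steps, 2)), axis=0))
)
dataset = pd.DataFrame(
    {"t": np.arange(n_steps + 1) * dt, "x": xy[:, 0], "y": xy[:, 1]}
)

x_only = dataset["x"]                    # one column (a Series)
positions = dataset[["x", "y"]]          # several columns (a DataFrame)
first_rows = dataset.iloc[:5]            # rows selected by integer position
late_rows = dataset[dataset["t"] > 1.0]  # rows selected by a condition

# Count, mean, std, min, quartiles and max for every numeric column.
summary = dataset.describe()
print(summary)
print("mean x:", dataset["x"].mean(), "std x:", dataset["x"].std())
```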

# Analysis of Brownian Motion Dataset

In this section, we will perform various calculations and analyses on the 2D Brownian motion dataset to understand various aspects of Brownian motion.

## 1. Calculation of Mean Squared Displacement (MSD)

One of the fundamental quantities to compute for particles undergoing Brownian motion is the Mean Squared Displacement (MSD). It gives insight into the spatial extent of the particle's motion, which is central in determining the diffusive properties of the particle.

### Calculating MSD:

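$$ MSD(t) = \langle (x(t) - x(0))^2 + (y(t) - y(0))^2 \rangle $$

Where $(x(t), y(t))$ and $(x(0), y(0))$ are the positions of the particle at time $t$ and at the initial time respectively, and $\langle \cdot \rangle$ denotes an average over many particles (or over many starting times of a single trajectory).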


In [35]:
#write the above to calculate the MSD
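
The sketch below computes an ensemble-averaged MSD, which means averaging the squared displacement over many independent particles at each time. It simulates its own particles so that it runs on its own. The number of particles, the step size and the array layout `(particle, time, coordinate)` are assumptions. If your dataset holds a single trajectory, you would average over starting times instead.

```python
import numpy as np

rng = np.random.default_rng(7)
n_particles, n_steps, dt = 2000, 200, 0.01  # assumed example values

# positions has shape (n_particles, n_steps + 1, 2); the last axis is (x, y).
steps = rng.normal(0.0, np.sqrt(dt), size=(n_particles, n_steps, 2))
positions = np.concatenate(
    [np.zeros((n_particles, 1, 2)), np.cumsum(steps, axis=1)], axis=1
)
times = np.arange(n_steps + 1) * dt

# Displacement of every particle from its own starting point.
displacement = positions - positions[:, :1, :]

# Squared distance (x part plus y part), then the average over particles.
squared_displacement = (displacement ** 2).sum(axis=2)
msd = squared_displacement.mean(axis=0)

print("MSD at the final time:", msd[-1])
print("expected 2 * t for unit-variance steps:", 2 * times[-1])
```

With increments of variance $\Delta t$ in each coordinate, the MSD in two dimensions should grow roughly as $2t$.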

## 2. Estimating Diffusion Coefficient

The diffusion coefficient is a fundamental parameter in the analysis of Brownian motion. It can be estimated from the MSD as:
 
$$D= \frac{MSD(t)}{4t}$$
 
This is for a 2D Brownian motion.

In [37]:
#calculate the diffusion coefficient 
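
One hedged way to estimate $D$ is to fit a straight line to the MSD and divide its slope by 4, which is less sensitive to noise than using a single time point. The sketch simulates its own particles with assumed parameter values. For increments of variance $\Delta t$ per coordinate, the estimate should come out near $0.5$.

```python
import numpy as np

rng = np.random.default_rng(8)
n_particles, n_steps, dt = 2000, 200, 0.01  # assumed example values

# Simulate the particles and compute the ensemble-averaged MSD.
steps = rng.normal(0.0, np.sqrt(dt), size=(n_particles, n_steps, 2))
positions = np.concatenate(
    [np.zeros((n_particles, 1, 2)), np.cumsum(steps, axis=1)], axis=1
)
times = np.arange(n_steps + 1) * dt
msd = ((positions - positions[:, :1, :]) ** 2).sum(axis=2).mean(axis=0)

# Fit MSD(t) = slope * t + intercept; in 2D the slope equals 4 * D.
slope, intercept = np.polyfit(times, msd, deg=1)
D_fit = slope / 4.0

# Pointwise estimate MSD(t) / (4 t), skipping t = 0 to avoid dividing by zero.
D_pointwise = msd[1:] / (4.0 * times[1:])

print("D from linear fit:", D_fit)
print("D from the last time point:", D_pointwise[-1])
```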


## 3. Velocity Autocorrelations

Calculate the velocity autocorrelations to analyze the correlations in the motion of the particles. This will give insight into the memory and the randomness of the motion.

In [42]:
#calculate the velocity autocorrelations
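
A minimal sketch of a time-averaged velocity autocorrelation for a single 2D trajectory follows. Velocities are estimated as finite differences of the positions, and the function name `velocity_autocorrelation`, the maximum lag and the simulation parameters are assumptions for this example. For ideal Brownian motion the steps are independent, so the normalised autocorrelation should be 1 at lag 0 and close to 0 at every other lag.

```python
import numpy as np

rng = np.random.default_rng(9)
n_steps, dt = 5000, 0.01  # assumed example values

# Simulate one 2D path and estimate velocities by finite differences.
steps = rng.normal(0.0, np.sqrt(dt), size=(n_steps, 2))
positions = np.vstack(([0.0, 0.0], np.cumsum(steps, axis=0)))
velocities = np.diff(positions, axis=0) / dt  # shape (n_steps, 2)


def velocity_autocorrelation(v, max_lag):
    """Average of v(t) . v(t + lag) over all t, for lag = 0 .. max_lag."""
    n = len(v)
    vacf = np.empty(max_lag + 1)
    for lag in range(max_lag + 1):
        # Dot product of each velocity with the velocity `lag` steps later.
        dots = (v[: n - lag] * v[lag:]).sum(axis=1)
        vacf[lag] = dots.mean()
    return vacf


vacf = velocity_autocorrelation(velocities, max_lag=20)
vacf_normalised = vacf / vacf[0]  # equals 1 at lag 0 by construction

print(vacf_normalised[:5])
```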
