# Workshop 4: Introduction to Stochastic Methods

A **stochastic** process (also known as a **random** process) is governed by random variables. The outcome of any single event in such a process can only be known probabilistically, yet a large collection of events exhibits robust statistical behavior. For example, flipping a balanced coin should produce heads 50% of the time, even though one cannot predict the outcome of a single flip.

import scipy         # Another numerical library
from scipy import integrate

import matplotlib    # Library used for plotting
import numpy as np   # Numerical library
import matplotlib.pyplot as plt # Plot commands

# Define some colors using the RGB format

CF_red = (204/255, 121/255, 167/255)
CF_vermillion = (213/255, 94/255, 0)
CF_orange = (230/255, 159/255, 0)
CF_yellow = (240/255, 228/255, 66/255)
CF_green = (0, 158/255, 115/255)
CF_sky = (86/255, 180/255, 233/255)
CF_blue = (0, 114/255, 178/255)
CF_black = (0, 0, 0)

### Coin Flipping

Let us start by defining a function for flipping coins:
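The original cell is not preserved here, so the following is only a minimal sketch of what such a function might look like. The `p_heads` parameter, the `"H"`/`"T"` return values, and the use of NumPy's `default_rng` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng()  # assumed random number generator


def coin_flip(p_heads=0.5):
    """Flip a coin once: return "H" with probability p_heads, otherwise "T"."""
    # rng.random() is uniform on [0, 1), so the comparison succeeds
    # with probability p_heads
    return "H" if rng.random() < p_heads else "T"
```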

Now we do some flippin':
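As a sketch of what this step could look like, the cell below flips the coin ten times and prints the outcomes. The string-returning `coin_flip()` is an assumed implementation, repeated so that the cell runs on its own.

```python
import numpy as np

rng = np.random.default_rng()


def coin_flip(p_heads=0.5):
    """Assumed implementation: "H" with probability p_heads, otherwise "T"."""
    return "H" if rng.random() < p_heads else "T"


# Flip the coin ten times and show the sequence of outcomes
results = [coin_flip() for _ in range(10)]
print(" ".join(results))
```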

This is nice, but it's not very useful for a large number of flips. Instead, let us define a new ```coin_flip()``` function that returns [1,0] for heads and [0,1] for tails:
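One possible version of this function is sketched below. Returning a NumPy array (rather than a plain list) and the `p_heads` parameter are assumptions; the [1,0] / [0,1] convention follows the text.

```python
import numpy as np

rng = np.random.default_rng()


def coin_flip(p_heads=0.5):
    """Flip a coin once: return [1, 0] for heads and [0, 1] for tails."""
    if rng.random() < p_heads:
        return np.array([1, 0])  # heads
    return np.array([0, 1])      # tails
```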

The new output for the ```coin_flip()``` function might seem a bit weird, but it will make sense shortly. First, we check that the function still works:
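A quick check might look like the following sketch, which prints a handful of flips. The array-returning `coin_flip()` is an assumed implementation, repeated so that the cell is self-contained.

```python
import numpy as np

rng = np.random.default_rng()


def coin_flip(p_heads=0.5):
    """Assumed implementation: [1, 0] for heads, [0, 1] for tails."""
    return np.array([1, 0]) if rng.random() < p_heads else np.array([0, 1])


# Each printed line should be either [1 0] (heads) or [0 1] (tails)
for _ in range(5):
    print(coin_flip())
```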

We can use this output type to keep track of the number of heads and tails in a series of coin flips:
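Because each flip is a two-component vector, summing the vectors counts heads and tails at the same time. The sketch below uses a cumulative sum for this; the name ```flips()``` is taken from the text, but its signature and return shape here are assumptions.

```python
import numpy as np

rng = np.random.default_rng()


def coin_flip(p_heads=0.5):
    """Assumed implementation: [1, 0] for heads, [0, 1] for tails."""
    return np.array([1, 0]) if rng.random() < p_heads else np.array([0, 1])


def flips(n, p_heads=0.5):
    """Return an (n, 2) array of running [heads, tails] counts after each flip."""
    outcomes = np.array([coin_flip(p_heads) for _ in range(n)])
    # Cumulative sum along the flip axis turns single outcomes into running totals
    return np.cumsum(outcomes, axis=0)


counts = flips(100)
print("Heads, tails after 100 flips:", counts[-1])
```

Row `k` of the result holds the totals after `k + 1` flips, so the two entries of every row add up to the number of flips performed so far.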

Now, we perform a series of coin flips and plot the numbers of heads and tails:
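A sketch of such a plot is given below. To keep the cell short, ```flips()``` is written in a vectorized form that is assumed to be equivalent to flipping one coin at a time; the number of flips and the axis labels are illustrative choices.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng()

# Two colors with the same RGB values as CF_blue and CF_vermillion above
CF_blue = (0, 114/255, 178/255)
CF_vermillion = (213/255, 94/255, 0)


def flips(n, p_heads=0.5):
    """Assumed vectorized version: (n, 2) running [heads, tails] counts."""
    heads = rng.random(n) < p_heads
    outcomes = np.column_stack([heads, ~heads]).astype(int)
    return np.cumsum(outcomes, axis=0)


n = 1000
counts = flips(n)
flip_number = np.arange(1, n + 1)

fig, ax = plt.subplots()
ax.plot(flip_number, counts[:, 0], color=CF_blue, label="Heads")
ax.plot(flip_number, counts[:, 1], color=CF_vermillion, label="Tails")
ax.set_xlabel("Number of flips")
ax.set_ylabel("Count")
ax.legend()
```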

To make the difference between the number of heads and tails clearer, we define a function that computes it from the output of ```flips()```:
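The function below is one way to do this. The name `flip_difference` is hypothetical, and ```flips()``` is an assumed vectorized implementation included so that the cell runs on its own.

```python
import numpy as np

rng = np.random.default_rng()


def flips(n, p_heads=0.5):
    """Assumed implementation: (n, 2) running [heads, tails] counts."""
    heads = rng.random(n) < p_heads
    outcomes = np.column_stack([heads, ~heads]).astype(int)
    return np.cumsum(outcomes, axis=0)


def flip_difference(n, p_heads=0.5):
    """Return the running (heads - tails) difference after each of n flips."""
    counts = flips(n, p_heads)
    return counts[:, 0] - counts[:, 1]


difference = flip_difference(50)
print(difference)
```

Each flip changes the difference by exactly one, up for heads and down for tails, so the result is a one-dimensional random walk.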

Now, we can see how the difference changes with the number of flips:
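A single realization could be plotted as in the sketch below. Here the hypothetical `flip_difference` helper is written as a cumulative sum of +1 (heads) and -1 (tails) steps, which is assumed to give the same result as subtracting the two running counts.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng()


def flip_difference(n, p_heads=0.5):
    """Hypothetical helper: running (heads - tails) after each of n flips."""
    steps = np.where(rng.random(n) < p_heads, 1, -1)
    return np.cumsum(steps)


n = 1000
difference = flip_difference(n)

fig, ax = plt.subplots()
ax.plot(np.arange(1, n + 1), difference, color=(0, 114/255, 178/255))
ax.axhline(0, color="gray", linewidth=0.8)  # reference line at zero
ax.set_xlabel("Number of flips")
ax.set_ylabel("Heads $-$ Tails")
```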

Let us plot several realizations on the same axes:
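The sketch below overlays a few independent realizations. The number of realizations, the number of flips, and the hypothetical `flip_difference` helper are all illustrative assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng()


def flip_difference(n, p_heads=0.5):
    """Hypothetical helper: running (heads - tails) after each of n flips."""
    steps = np.where(rng.random(n) < p_heads, 1, -1)
    return np.cumsum(steps)


n = 5000
n_realizations = 8
flip_number = np.arange(1, n + 1)

fig, ax = plt.subplots()
for _ in range(n_realizations):
    # Every call draws fresh random numbers, so each curve is independent
    ax.plot(flip_number, flip_difference(n), linewidth=0.8)
ax.set_xlabel("Number of flips")
ax.set_ylabel("Heads $-$ Tails")
```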

We see that the more flips there are, the larger the excursion of the (Heads $-$ Tails) quantity from zero. How far should we expect it to deviate?

To answer this question, let us define $\Delta\left(n\right) = H\left(n\right) - T\left(n\right)$. 

Its average is given by
$$
\left\langle\Delta\left(n\right)\right\rangle = \left\langle H \left(n\right)\right\rangle - \left\langle T \left(n\right)\right\rangle = np_H - np_T\,,
$$

where $p_H$ and $p_T$ are the probabilities of getting heads or tails. For a fair coin, $p_H = p_T$ so the average is zero, as expected.

To get the deviation from zero, we need the variance:
$$
\sigma^2 = \left\langle \left[\Delta\left(n\right) - \left\langle\Delta\left(n\right)\right\rangle\right]^2 \right\rangle
= \left\langle \Delta^2\left(n\right) + \left\langle\Delta\left(n\right)\right\rangle^2 - 2\Delta\left(n\right)  \left\langle\Delta\left(n\right)\right\rangle\right\rangle = \left\langle \Delta^2\left(n\right)\right\rangle  - \left\langle \Delta \left(n\right)\right\rangle^2\,.
$$

We already know the second term. The first one is:
$$
\begin{aligned}
\left\langle \Delta^2\left(n\right)  \right\rangle &= \left\langle \left[H\left(n\right) - T\left(n\right) \right]^2 \right\rangle = \left\langle H^2\left(n\right)+T^2\left(n\right) - 2H\left(n\right)T\left(n\right) \right\rangle
\\
&= \left\langle H^2\left(n\right)\right\rangle + \left\langle T^2\left(n\right) \right\rangle - 2\left\langle H\left(n\right)T\left(n\right) \right\rangle
\\
&= \sum_{j = 0}^n j^2 p_H^j p_T^{n - j} \frac{n!}{j!\left(n - j\right)!}  + \sum_{j = 0}^n j^2 p_T^j p_H^{n - j}\frac{n!}{j!\left(n - j\right)!} - 2 \sum_{j = 0}^n  p_H^j p_T^{n - j}j\left(n - j\right) \frac{n!}{j!\left(n - j\right)!}
\\
&=\left( p_H + p_T\right)^{n-2}\left[n^2\left(p_H - p_T\right)^2 + 4 np_H p_T  \right]\,.
\end{aligned}
$$

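If $p_H + p_T = 1$, as expected for a coin, the prefactor equals one. Subtracting $\left\langle\Delta\left(n\right)\right\rangle^2 = n^2\left(p_H - p_T\right)^2$ then gives the variance: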
$$
\left\langle \Delta^2\left(n\right)  \right\rangle
=\left[n^2\left(p_H - p_T\right)^2 + 4 np_H p_T  \right]\rightarrow \sigma^2 = 4 np_H p_T\rightarrow \sigma = 2\sqrt{n p_H p_T}\,.
$$

This means that for a fair coin ($p_H = p_T = 1/2$), the standard deviation of $\Delta\left(n\right)$ is $\sigma = \sqrt{n}$, so typical excursions from zero grow as $\sqrt{n}$.
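The result $\sigma = 2\sqrt{n p_H p_T}$ can be checked numerically. The sketch below is an illustration under stated assumptions: it draws the number of heads directly from a binomial distribution instead of flipping coins one by one, and the values of `n`, `p_heads`, `n_trials`, and the seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed only for reproducibility

n = 400          # flips per experiment
p_heads = 0.3    # deliberately biased coin to test the general formula
n_trials = 20000

# Heads count in n flips is binomial; tails = n - heads, so Delta = 2*heads - n
heads = rng.binomial(n, p_heads, size=n_trials)
delta = 2 * heads - n

predicted_mean = n * p_heads - n * (1 - p_heads)
predicted_sigma = 2 * np.sqrt(n * p_heads * (1 - p_heads))

print("mean : simulated", delta.mean(), "predicted", predicted_mean)
print("sigma: simulated", delta.std(), "predicted", predicted_sigma)
```

The simulated values should approach the predicted ones as `n_trials` grows, up to statistical fluctuations.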

### Central Limit Theorem

1. Start by defining a function that performs a specified number of flips $n$ with the probability of getting heads $p_H$ and returns $\Delta$.
2. Choose a sufficiently large (play with it) value of $n$ and, for now, set $p_H = 1/2$. Run the function you wrote $N$ times and plot the histogram of the results.
3. What is the fraction of your results that fall within $\pm\sigma$? What about $\pm2\sigma$ and $\pm 3\sigma$? Do these numbers look familiar?
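One possible way to approach steps 1 and 3 is sketched below. The function name `delta`, the values of `n` and `N`, and the use of the analytic $\sigma = 2\sqrt{n p_H p_T}$ derived above are assumptions; treat it as a starting point rather than a reference solution.

```python
import numpy as np

rng = np.random.default_rng()


def delta(n, p_heads=0.5):
    """Perform n flips and return Delta = (number of heads) - (number of tails)."""
    heads = np.count_nonzero(rng.random(n) < p_heads)
    return heads - (n - heads)


n = 1000        # flips per realization
N = 5000        # number of realizations
p_heads = 0.5

results = np.array([delta(n, p_heads) for _ in range(N)])

# Analytic mean and standard deviation of Delta
mean = n * (2 * p_heads - 1)
sigma = 2 * np.sqrt(n * p_heads * (1 - p_heads))

# Fraction of realizations within k standard deviations of the mean
fractions = {}
for k in (1, 2, 3):
    fractions[k] = np.mean(np.abs(results - mean) <= k * sigma)
    print(f"within ±{k}σ: {fractions[k]:.3f}")
```

For comparison, a normal distribution places roughly 68%, 95%, and 99.7% of its probability within one, two, and three standard deviations of the mean.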

Next, we perform multiple realizations to see the distribution of $\Delta$:
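A sketch of this step follows. As a shortcut, it samples the number of heads from a binomial distribution, which is assumed to be statistically equivalent to flipping $n$ coins one at a time. The bin width and the Gaussian overlay with $\sigma = 2\sqrt{n p_H p_T}$ are illustrative choices.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng()

n, N, p_heads = 1000, 5000, 0.5

# Delta = heads - tails = 2 * heads - n for each of the N realizations
heads = rng.binomial(n, p_heads, size=N)
deltas = 2 * heads - n

# For even n, Delta takes only even values; bins of width 8 with odd edges
# each contain exactly four possible values, which avoids a comb-like histogram
edges = np.arange(deltas.min() - 1, deltas.max() + 8, 8)

mean = n * (2 * p_heads - 1)
sigma = 2 * np.sqrt(n * p_heads * (1 - p_heads))
x = np.linspace(deltas.min(), deltas.max(), 400)
gaussian = np.exp(-(x - mean) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

fig, ax = plt.subplots()
ax.hist(deltas, bins=edges, density=True, color=(86/255, 180/255, 233/255),
        label="Simulation")
ax.plot(x, gaussian, color=(0, 0, 0), label="Normal distribution")
ax.set_xlabel(r"$\Delta$")
ax.set_ylabel("Probability density")
ax.legend()
```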