<!-- dom:TITLE: Monte Carlo Methods -->
# Monte Carlo Methods
<!-- dom:AUTHOR: Aksel Hiorth, the National IOR Centre & Institute for Energy Resources, -->
<!-- Author: -->  
**Aksel Hiorth, the National IOR Centre & Institute for Energy Resources,**
University of Stavanger

Date: **Jan 27, 2019**

<!-- Common Mako variables and functions -->


# Monte Carlo Methods
Usually we use differential equations to describe physical systems, and the solutions to these equations are continuous functions. For these solutions
to be useful, the differential equation must describe our physical system sufficiently well. In many practical cases we have no control over many
of the parameters entering the differential equation, or stated differently, *our system is not deterministic*. This means that there could be some random
fluctuations, occurring at different times and points in space, that we have no control over. In a practical situation we would like to investigate how these fluctuations
affect the behavior of our system.

# Monte Carlo Integration
Let us start with a simple illustration of one use of the Monte Carlo Method (MCM): Monte Carlo integration. To the left
in [figure](#fig:mc:mci) there is the shape of a pond. Imagine that we wanted to estimate the area of the pond, how could
we do it? Assume further that we did not have a phone or any other electronic device to help us.

<!-- dom:FIGURE: [fig-mc/mci.png, width=400 frac=1.0] Two ponds to illustrate the MCM. <div id="fig:mc:mci"></div> -->
<!-- begin figure -->
<div id="fig:mc:mci"></div>

<p>Two ponds to illustrate the MCM.</p>
<img src="fig-mc/mci.png" width=400>

<!-- end figure -->


One possible approach is as follows. First, walk around the pond and put up some bands (illustrated by the black dotted line).
Then measure the area inside the bands (e.g. 4$\times$3 meters), so that we know the area of the pond is less than 12 m$^2$. Finally,
and this is the difficult part, throw rocks *uniformly* inside the bands. If we are able to throw the rocks randomly, and count the
number of rocks hitting the water, the area of the pond should be:

<!-- Equation labels as ordinary links -->
<div id="eq:mc:mci"></div>

$$
\begin{equation}
A\simeq\text{Area of rectangle}\times\frac{\text{Number of rocks hitting the pond}}{\text{Number of rocks thrown}}.
\label{eq:mc:mci} \tag{1}
\end{equation}
$$
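
Equation ([1](#eq:mc:mci)) can be tried out on a computer before throwing any rocks. The sketch below is only an illustration: the pond is assumed to be an ellipse (a hypothetical shape chosen so that the exact area is known), and the names `estimate_area` and `in_pond` are not part of the original text.

```python
import math
import random

def estimate_area(inside, width, height, n_rocks, seed=None):
    """Hit-or-miss estimate of the area of a shape inside a width x height rectangle."""
    rng = random.Random(seed)          # private generator, optional fixed seed
    hits = 0
    for _ in range(n_rocks):
        x = rng.uniform(0.0, width)    # uniform position along the rectangle
        y = rng.uniform(0.0, height)
        if inside(x, y):               # did the rock hit the water?
            hits += 1
    # Area of rectangle times fraction of rocks hitting the pond
    return width * height * hits / n_rocks

def in_pond(x, y):
    # Hypothetical pond: ellipse centred at (2, 1.5) with semi-axes 1.8 m and 1.2 m
    return ((x - 2.0) / 1.8) ** 2 + ((y - 1.5) / 1.2) ** 2 <= 1.0

area = estimate_area(in_pond, width=4.0, height=3.0, n_rocks=100_000, seed=1)
exact = math.pi * 1.8 * 1.2            # exact area of the assumed ellipse
print("Estimated pond area:", area, "m^2, exact:", exact, "m^2")
```

Any other shape can be tested by replacing `in_pond` with a different function that returns `True` for points in the water.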

It is important that we throw the rocks randomly, otherwise equation ([1](#eq:mc:mci)) is not correct. Now, let us
investigate this in more detail, and use the idea of throwing rocks to estimate $\pi$. To the right in [figure](#fig:mc:mci),
there is a well-known shape, a circle. The area of the circle is $\pi d^2/4$, and, with the origin at its center, its shape is given by $x^2+y^2=d^2/4$. Assume that
the circle is inscribed in a square with sides of length $d$. Throwing rocks randomly inside the square is equivalent to picking random points
with coordinates $(x,y)$, where $x\in[0,d]$ and $y\in[0,d]$; in these coordinates the center of the circle is at $(d/2,d/2)$. We want all the $x$- and $y$-values to be chosen with equal probability,
which is equivalent to picking random numbers from a *uniform* distribution. Since the fraction of points landing inside the circle approaches the area ratio $(\pi d^2/4)/d^2=\pi/4$, four times this fraction is an estimate of $\pi$. Below is a Python implementation:

In [2]:
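import numpy as np
import random

def estimate_pi(N, d):
    """Estimate pi from N random points thrown in a d x d square."""
    # random.seed(2)  # uncomment to get the same sequence on every run
    D2 = d*d/4   # radius squared
    dc = 0.5*d   # center coordinate of the circle
    A = 0        # number of points inside the circle
    for k in range(N):
        x = random.uniform(0, d)
        y = random.uniform(0, d)
        if (x-dc)**2 + (y-dc)**2 <= D2:
            A += 1
    # estimated area of circle is d*d*A/N; dividing by d*d/4 gives pi
    return 4*A/N

N = 1000
d = 1
pi_est = estimate_pi(N, d)
# error defined as estimate minus exact value
print('Estimate for pi= ', pi_est, ' Error=', pi_est-np.pi)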
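
As an aside, the explicit loop can become slow for large $N$. A possible alternative, sketched here under the assumption that NumPy's `default_rng` generator is acceptable in place of the `random` module, draws all coordinates at once. The name `estimate_pi_vectorized` is introduced only for this illustration.

```python
import numpy as np

def estimate_pi_vectorized(n, d=1.0, seed=None):
    """Vectorized hit-or-miss estimate of pi using n points in a d x d square."""
    rng = np.random.default_rng(seed)      # NumPy generator, optional fixed seed
    x = rng.uniform(0.0, d, size=n)        # all x-coordinates in one call
    y = rng.uniform(0.0, d, size=n)        # all y-coordinates in one call
    r = 0.5 * d                            # radius, also the center coordinate
    inside = (x - r) ** 2 + (y - r) ** 2 <= r ** 2
    return 4.0 * np.count_nonzero(inside) / n

pi_vec = estimate_pi_vectorized(1_000_000, seed=0)
print("Estimate for pi =", pi_vec, " Error =", pi_vec - np.pi)
```

The logic is the same as in the loop version; only the bookkeeping is moved into array operations.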

In the table below, we have run the code for $d=1$ and different values of $N$. 

<table border="1">
<thead>
<tr><th align="center">MC estimate</th> <th align="center">   Error   </th> <th align="center"> $N$  </th> <th align="center">$1/\sqrt{N}$</th> </tr>
</thead>
<tbody>
<tr><td align="center">   3.04           </td> <td align="center">   -0.10159       </td> <td align="center">   10$^2$    </td> <td align="center">   0.100           </td> </tr>
<tr><td align="center">   3.176          </td> <td align="center">   $\,$0.03441    </td> <td align="center">   10$^3$    </td> <td align="center">   0.032           </td> </tr>
<tr><td align="center">   3.1584         </td> <td align="center">   $\,$0.01681    </td> <td align="center">   10$^4$    </td> <td align="center">   0.010           </td> </tr>
<tr><td align="center">   3.14072        </td> <td align="center">   -0.00087       </td> <td align="center">   10$^5$    </td> <td align="center">   0.003           </td> </tr>
</tbody>
</table>

We clearly see that a fairly large number of rocks (random numbers) is needed in order to get a good estimate, and that the error is roughly of the same order as $1/\sqrt{N}$. If you run this code several
times you will see that the results change from run to run. This makes sense, as the coordinates $x$ and $y$ are chosen at random. There is
much to be said about random number generators. The MCM depends on a good random number generator, otherwise we cannot use the results from
statistics to develop our algorithms. Below, we briefly summarize some important points that you should be aware of:

1. Random number generators are generally of two types: *hardware random number generator* (HRNG) or *pseudo random number generator* (PRNG).

2. An HRNG uses a physical process to generate random numbers. This could be atmospheric noise, radioactive decay, or microscopic fluctuations, which are translated into an electrical signal. The electrical signal is converted to digital numbers (1 or 0), and by sampling the random signal, random numbers can be generated. HRNGs are often called *true random number generators*, and their main use is in *cryptography*.

3. A PRNG uses a mathematical algorithm to generate an (apparently) random sequence. The algorithm uses an initial number, or a *seed*, to start the sequence of random numbers. The sequence is deterministic, and the algorithm will generate the same sequence of numbers if the same seed is used. At some point the sequence will repeat itself, i.e. it has a certain period. For some seeds the period may be much shorter.

4. Many PRNGs are not considered to be cryptographically secure, because if a sufficiently long sequence of random numbers is generated from them, the rest of the sequence can be predicted.

5. Python uses the [Mersenne Twister](https://en.wikipedia.org/wiki/Mersenne_Twister) algorithm to generate random numbers, which has a period of $2^{19937}-1\simeq4.3\cdot10^{6001}$. It is not considered to be cryptographically secure.

By default, Python's `random` module picks a new seed each time the program is run, so functions such as `random.uniform` give different numbers in every run. If we instead set
e.g. `random.seed(2)` before drawing any numbers, the code will generate the same sequence of numbers each time it is run.
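
The effect of the seed can be checked directly. The short sketch below uses only the standard `random` module; the variable names are chosen for this illustration.

```python
import random

random.seed(2)                                   # fix the seed
a = [random.uniform(0, 1) for _ in range(3)]

random.seed(2)                                   # same seed again
b = [random.uniform(0, 1) for _ in range(3)]

random.seed()                                    # no argument: reseed from the system
c = [random.uniform(0, 1) for _ in range(3)]

print(a == b)   # True: the same seed reproduces the same sequence
print(a, c)     # c will in general differ from a, and from run to run
```

Fixing the seed is useful when debugging or when results must be reproducible. For production runs one usually wants independent sequences.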

## Errors on Monte Carlo Integration and the Binomial Distribution
How many rocks do we need to throw in order to reach a certain accuracy? To answer this question we need some results from statistics. Our problem of calculating the integral is closely related to the *binomial distribution*. When we throw a rock, one of two things can happen: i) the rock falls into the water, or ii) it falls outside the pond. If we denote the probability that the rock falls into the pond as $p$, then the probability that it falls outside the pond, $q$, has to be $q=1-p$.
This is simply because there are no other possibilities, and the sum of the two probabilities has to be one: $p+q=p+(1-p)=1$. The binomial distribution is given by:

<!-- Equation labels as ordinary links -->
<div id="eq:mc:bin"></div>

$$
\begin{equation}
p(k)=\frac{n!}{k!(n-k)!}p^k(1-p)^{n-k}.
\label{eq:mc:bin} \tag{2}
\end{equation}
$$

$p(k)$ is the probability that an event happens $k$ times in $n$ trials. The mean, $\mu$, and the variance, $\sigma^2$, of the binomial distribution are:

<!-- Equation labels as ordinary links -->
<div id="eq:mc:binm"></div>

$$
\begin{equation}
\mu=\sum_{k=0}^{n}kp(k)=np, \label{eq:mc:binm} \tag{3}
\end{equation}
$$

<!-- Equation labels as ordinary links -->
<div id="eq:mc:binv"></div>

$$
\begin{equation}  
\sigma^2=\sum_{k=0}^{n}(k-\mu)^2p(k)=np(1-p). \label{eq:mc:binv} \tag{4}
\end{equation}
$$
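
The closed-form results for the mean and the variance can be checked numerically by summing over all possible outcomes $k=0,1,\ldots,n$. The sketch below assumes Python 3.8 or later (for `math.comb`); the values $n=20$ and $p=0.3$ are arbitrary choices made for this illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 20, 0.3                                   # arbitrary example values
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

total = sum(pmf)                                 # should be 1 (normalization)
mean = sum(k * pk for k, pk in enumerate(pmf))   # should equal n*p
var = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))  # should equal n*p*(1-p)

print("sum of p(k):", total)
print("mean:", mean, " n*p:", n * p)
print("variance:", var, " n*p*(1-p):", n * p * (1 - p))
```

Up to floating-point rounding, the summed values agree with $np$ and $np(1-p)$.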

Before we proceed, we should take a moment and look a little more into the meaning of the formulas above, to appreciate their usefulness. A classical example of the use of the binomial formula is tossing a coin. If the coin is fair, it will have an equal probability of giving us a head or a tail, hence $p=0.5$. Equation ([2](#eq:mc:bin)) can answer questions like: "What is the probability of getting only heads after 4 tosses?". Let us calculate this answer using equation ([2](#eq:mc:bin)): the number of tosses is $n=4$, and the number of successes is also $k=4$ (heads each time), so

<!-- Equation labels as ordinary links -->
<div id="eq:mc:coin"></div>

$$
\begin{equation}
p(k=4)=\frac{4!}{4!(4-4)!}\left(\frac{1}{2}\right)^4\left(1-\frac{1}{2}\right)^{4-4}=\frac{1}{2^4}=\frac{1}{16}.
\label{eq:mc:coin} \tag{5}
\end{equation}
$$

and in a more explicit manner:

The mathematical formulas 
In our case we throw rocks into the square $N$ times, and record each time it falls into the pond