# ASTR 598 Monte Carlo Methods April 8th, 2016

---

### Random numbers, Generating them, and You

In [2]:
```python
from __future__ import print_function, division

import numpy as np
import time
import math
import matplotlib as mpl
import matplotlib.pyplot as plt
```

### Monte Carlo Integration

---

Monte Carlo integration is most useful when the number of dimensions is large, roughly d > 10 or so.

If you sample the function with N points, the error (the standard deviation of the estimate, not its variance) scales as $\sim 1/\sqrt{N}$, regardless of dimension.  Slow, but guaranteed.  Make sure to vary the number of points, for example run with both N and 2N, so you can check how the error changes.  You should also run multiple independent simulations (different RNG seeds, of course!) to get a better answer and a more robust error.  To do this, run one copy per core (easy to parallelize!) and plot the distribution of answers: its mean gives the answer and its width gives the error.  We expect that width to decrease with N following the scaling given above.
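A minimal sketch of that procedure, assuming a toy integrand ($x^2$ on [0,1], exact answer 1/3) and 200 seeds, neither of which comes from the lecture. The helper names `mc_integrate` and `spread_over_seeds` are made up for illustration. It compares N and 4N rather than N and 2N so that the expected error ratio is a clean factor of 2.

```python
import numpy as np

def integrand(x):
    """Toy integrand (an assumption for this sketch); its integral over [0, 1] is 1/3."""
    return x**2

def mc_integrate(f, n_points, seed):
    """Estimate the integral of f over [0, 1] from n_points uniform samples."""
    rng = np.random.RandomState(seed)      # one generator per simulation
    x = rng.uniform(0.0, 1.0, n_points)
    return np.mean(f(x))

def spread_over_seeds(f, n_points, seeds):
    """Run one simulation per seed; return the mean and width of the answers."""
    estimates = np.array([mc_integrate(f, n_points, s) for s in seeds])
    return estimates.mean(), estimates.std(ddof=1)

seeds = list(range(200))                   # 200 independent simulations
mean_n, err_n = spread_over_seeds(integrand, 1000, seeds)
mean_4n, err_4n = spread_over_seeds(integrand, 4000, seeds)

print("N  = 1000: %.4f +/- %.4f" % (mean_n, err_n))
print("4N = 4000: %.4f +/- %.4f" % (mean_4n, err_4n))
print("error ratio: %.2f (expect roughly 2)" % (err_n / err_4n))
```

Each call to `mc_integrate` is independent of the others, so in practice the loop over seeds is the part you would farm out to different cores.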

Pros:
```
1. Good for large number of dimensions
2. error ~ 1/sqrt(N)
3. easy to run multiple independent simulations for comparison/error estimate
4. easy to parallelize
5. easy to checkpoint (save state if calculation is interrupted)
    -Ex: every x steps, print out RNG seed, point information to some file
```

Cons:
```
1. Can be slow for large N
2. Can require large N to beat down errors
```

Ex. Estimating pi using Monte Carlo Integration
```
1. Draw random x, y from [0,1] uniformly
2. Check to see if it's within the circle (if x^2 + y^2 < 1)
3. Count total points, total points within circle
4. (area of circle quadrant)/(area of 1 by 1 square) = pi/4 ~ (number of points within circle)/(total number of points), so pi ~ 4 * (number of points within circle)/(total number of points)
```
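The four steps above can be sketched in a few vectorized lines. The function name `estimate_pi`, the point count, and the seed are choices made for this example, not part of the lecture.

```python
import numpy as np

def estimate_pi(n_points, seed=None):
    """Estimate pi from the fraction of uniform points that land inside the unit quadrant."""
    rng = np.random.RandomState(seed)
    x = rng.uniform(0.0, 1.0, n_points)    # step 1: random x, y in [0, 1)
    y = rng.uniform(0.0, 1.0, n_points)
    n_inside = np.count_nonzero(x**2 + y**2 < 1.0)   # steps 2 and 3
    # step 4: quadrant area / square area = pi/4 ~ n_inside / n_points
    return 4.0 * n_inside / n_points

print(estimate_pi(1000000, seed=42))
```

With a million points the statistical error on the estimate is a few times 0.001, consistent with the $1/\sqrt{N}$ scaling.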

Algorithms like this are really easy to generalize to higher dimensions.  Imagine some 10-sphere of "radius" 1.  Finding its 10-volume (whatever that is) via Monte Carlo integration uses the same scheme as above, but the if statement becomes
```
if x1**2 + x2**2 + ... + x10**2 < 1:
    area_count += 1
```
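As a rough sketch of the 10-dimensional case, the version below samples the unit hypercube $[0,1]^{10}$, which contains $1/2^{10}$ of the ball, and compares against the closed-form volume $\pi^{d/2}/\Gamma(d/2+1)$. The function names and the number of points are assumptions made for illustration.

```python
import math
import numpy as np

def ball_volume_mc(dim, n_points, seed=None):
    """Monte Carlo estimate of the volume of the unit ball in `dim` dimensions."""
    rng = np.random.RandomState(seed)
    # Sample the hypercube [0, 1)^dim, as in the pi example
    x = rng.uniform(0.0, 1.0, size=(n_points, dim))
    n_inside = np.count_nonzero(np.sum(x**2, axis=1) < 1.0)
    # The hypercube holds 1/2^dim of the ball, so scale the fraction by 2^dim
    return 2.0**dim * n_inside / n_points

def ball_volume_exact(dim):
    """Closed-form volume of the unit ball: pi^(d/2) / Gamma(d/2 + 1)."""
    return math.pi**(dim / 2.0) / math.gamma(dim / 2.0 + 1.0)

print(ball_volume_mc(10, 500000, seed=0), ball_volume_exact(10))
```

Only about 0.25% of the points fall inside the ball in 10 dimensions, which is why this needs far more points than the pi example for the same relative error.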

### Random numbers in python
Has science gone too far?

---

Numpy random is great! See here: http://docs.scipy.org/doc/numpy/reference/routines.random.html.  But all these algorithms are pseudorandom: their output can pass statistical tests of randomness (how well depends on the algorithm, of course), but it is totally deterministic.

In [6]:
```python
# print random numbers
# omg im so rndom

for i in range(10):
    print(np.random.randint(1,1000)) # Return random integers from low (inclusive) to high (exclusive).
```

```
659
229
775
182
545
46
340
933
886
111
```


### For actual calculations, we want to control the random seed.
This allows us to reproduce results and weed out bugs if need be.

In [10]:
```python
# print random numbers

# Set the seed!
np.random.seed(123)

for i in range(10):
    print(np.random.randint(1,1000)) # Return random integers from low (inclusive) to high (exclusive).
```

```
511
366
383
323
989
99
743
18
596
107
```


With a given seed, we get the same random number sequence!
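A quick way to check this yourself is sketched below, using separate `np.random.RandomState` objects instead of the global generator so that the three streams cannot interfere with each other. The seed 456 is an arbitrary choice for the comparison.

```python
import numpy as np

rng_a = np.random.RandomState(123)
rng_b = np.random.RandomState(123)   # same seed as rng_a
rng_c = np.random.RandomState(456)   # different seed

draws_a = rng_a.randint(1, 1000, size=10)
draws_b = rng_b.randint(1, 1000, size=10)
draws_c = rng_c.randint(1, 1000, size=10)

print(np.array_equal(draws_a, draws_b))   # same seed: identical sequences
print(np.array_equal(draws_a, draws_c))   # different seed: sequences differ
```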

### Good way to seed?

---

Typically, we use the system time, which works as long as your simulations start at least a second apart.  Note: a system time stored as a signed 32-bit integer will overflow in 2038!

In [18]:
```python
seed = time.time()
print("Seed: %lf" % seed)
```

```
Seed: 1460136939.812944
```
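One caveat when actually feeding this to the generator: as far as I know, `np.random.seed` expects an integer between 0 and 2**32 - 1 rather than a float. A minimal sketch, assuming whole-second resolution is good enough:

```python
import time
import numpy as np

# Truncate the float timestamp to whole seconds and wrap it into the
# range np.random.seed accepts (0 to 2**32 - 1)
seed = int(time.time()) % 2**32
np.random.seed(seed)

# Record the seed so the run can be reproduced later
print("Seed: %d" % seed)
```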


If you want to see what your RNG is doing, call get_state().  Below, the output is suppressed, but it tells us that we're using the Mersenne Twister algorithm.

In [19]:
```python
np.random.get_state();
```
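The same state tuple is what you would save for checkpointing. The sketch below keeps the state in memory and restores it with `np.random.set_state`; writing it to a file every few steps is left out, and the seed and draw counts are arbitrary.

```python
import numpy as np

np.random.seed(123)
np.random.random(5)                    # advance the generator a bit

state = np.random.get_state()          # checkpoint the generator
print(state[0])                        # name of the algorithm

expected = np.random.random(3)         # what the run would produce next

np.random.set_state(state)             # pretend we crashed and restarted here
resumed = np.random.random(3)          # picks up exactly where we left off

print(np.array_equal(expected, resumed))
```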

### Get a random normal deviate:

In [22]:
```python
mu = 0.0
sigma = 1.0
num_vals = 10
np.random.normal(mu, sigma, num_vals)
```

```
array([-0.76943347,  0.57674602,  0.12652592, -1.30148897,  2.20742744,
        0.52274247,  0.46564476,  0.72491523,  1.49582653,  0.74658059])
```

# Math

Suppose some distribution P(z) is normalized so that $\int_0^{\infty}P(z)dz = 1$, with a cumulative distribution function of
$$
F(x) = \int_0^x P(z)dz 
$$

$$
dF = P(z)dz
$$

$$
\int_0^{F_0} dF = \int_0^{x_0} P(z)dz 
$$

$$
F_0 = \int_0^{x_0} P(z)dz 
$$

So suppose you can only sample a uniform distribution on [0,1].  Draw a uniform variate, call it $F_0$, and solve the equation above for $x_0$; the resulting $x_0$ will be distributed according to P(z).  This method handles any distribution whose CDF you can invert (analytically or numerically) and lets you randomly sample P(z) given nothing more than a random uniform variate from [0,1]!  This works because the CDF evaluated at a sample is itself uniformly distributed on [0,1] (proof: http://stats.stackexchange.com/questions/161635/why-is-the-cdf-of-a-sample-uniformly-distributed)

### Ex: Exponential distribution

---

$$
P(z) = \lambda e^{-\lambda z}
$$

for $0 \leq z < \infty$

$$
F_0 = \int_0^{x_0} \lambda e^{-\lambda z} dz
$$
We can evaluate this analytically!

...math...

$$
F_0 - 1 = -e^{-\lambda x_0} 
$$

...more math...

$$
x_0 = \frac{-1}{\lambda} \log{(1-F_0)}
$$

where log is the natural log.

Therefore, all you have to do is plug $F_0$ (a random uniform variate from [0,1]) into the above expression and it gives you a random variate drawn from the exponential distribution!
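In code, the final expression is a one-liner. This is a minimal sketch; the function name `sample_exponential` and the choice $\lambda = 2$ are made up for the example. The sample mean should come out close to $1/\lambda$.

```python
import numpy as np

def sample_exponential(lam, n_samples, seed=None):
    """Draw exponential variates via x0 = -log(1 - F0) / lambda."""
    rng = np.random.RandomState(seed)
    f0 = rng.uniform(0.0, 1.0, n_samples)   # F0 in [0, 1), so 1 - F0 is never 0
    return -np.log(1.0 - f0) / lam

samples = sample_exponential(lam=2.0, n_samples=100000, seed=0)
print(samples.mean(), 1.0 / 2.0)            # sample mean vs. expected 1/lambda
```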

### Ex: Lorentz distribution

---

$$
P(z) = \frac{2}{\pi}\frac{1}{1 + z^2}
$$

for $0 \leq z < \infty$

$$
F_0 = \frac{2}{\pi} \int_0^{x_0} \frac{dz}{1 + z^2} 
$$
We can evaluate this analytically (thanks to integral tables)!

...math...

$$
F_0 = \frac{2}{\pi}\arctan(x_0)
$$

...more math...

$$
x_0 = \tan\left(\frac{\pi}{2} F_0\right)
$$

Therefore, all you have to do is plug $F_0$ (a random uniform variate from [0,1]) into the above expression and it gives you a random variate drawn from the Lorentz distribution!

If we instead want $-\infty < z < \infty$, use a second uniform variate to pick the sign:

```
if x1 < 0.5:
    x = -tan(pi/2 * x2)
else:
    x = +tan(pi/2 * x2)
```
for independent uniform random x1 and x2 from [0,1].
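Both the one-sided and two-sided cases can be sketched with a single function. The name `sample_lorentz` and its `two_sided` flag are hypothetical. Because the Lorentz distribution has no finite mean, the check uses the median instead: it should be near 1 for the one-sided case (since $\tan(\pi/4) = 1$) and near 0 for the two-sided case.

```python
import numpy as np

def sample_lorentz(n_samples, seed=None, two_sided=False):
    """Draw Lorentz variates via x0 = tan(pi/2 * F0), optionally with a random sign."""
    rng = np.random.RandomState(seed)
    x = np.tan(0.5 * np.pi * rng.uniform(0.0, 1.0, n_samples))
    if two_sided:
        # A second, independent uniform variate decides the sign
        sign = np.where(rng.uniform(0.0, 1.0, n_samples) < 0.5, -1.0, 1.0)
        x = sign * x
    return x

one_sided = sample_lorentz(100000, seed=0)
both_sides = sample_lorentz(100000, seed=0, two_sided=True)
print(np.median(one_sided), np.median(both_sides))
```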

## Sampling From Gaussian: Box-Muller Transform
See here: https://en.wikipedia.org/wiki/Box%E2%80%93Muller_transform
where U = 1 - x for uniform random x.
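A minimal sketch of the basic (trigonometric) form of the transform described on that page. The function name `box_muller` is made up for illustration. My reading of the U = 1 - x note is that it shifts a uniform draw from [0,1) to (0,1], which keeps the logarithm finite; treat that interpretation as an assumption.

```python
import numpy as np

def box_muller(n_samples, seed=None):
    """Turn pairs of uniform variates into pairs of independent standard normals."""
    rng = np.random.RandomState(seed)
    u1 = 1.0 - rng.uniform(0.0, 1.0, n_samples)   # U = 1 - x lies in (0, 1], so log(u1) is finite
    u2 = rng.uniform(0.0, 1.0, n_samples)
    r = np.sqrt(-2.0 * np.log(u1))                # radius
    theta = 2.0 * np.pi * u2                      # angle
    return r * np.cos(theta), r * np.sin(theta)

z0, z1 = box_muller(100000, seed=0)
print(z0.mean(), z0.std())                        # should be near 0 and 1
```

To get a normal deviate with mean mu and width sigma, scale the result as mu + sigma * z0.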