# Basic Monte Carlo Integration

## Overview

In this section we look at <a href="https://en.wikipedia.org/wiki/Monte_Carlo_integration">Monte Carlo integration</a>. Applications frequently require integrals for which no analytical solution exists. Numerical methods let us approximate such integrals, and Monte Carlo integration is one of these methods.

## Monte Carlo integration

Let's assume that we want to evaluate the integral

$$I=\int_a^b h(x) dx$$

If $h$ is a polynomial or a trigonometric function, then this integral can be calculated in closed form.
However, in many cases there is no closed-form solution for $I$. Numerical techniques, such as <a href="https://en.wikipedia.org/wiki/Gaussian_quadrature">Gaussian quadrature</a> or the <a href="https://en.wikipedia.org/wiki/Trapezoidal_rule">trapezoid rule</a>, can be
employed to evaluate $I$. Monte Carlo integration is another technique for evaluating complex integrals, and it is
notable for its simplicity and generality [1].

Let's begin by rewriting $I$ as follows

$$I=\int_a^b \omega(x)f(x) dx$$

where $\omega(x)=h(x)(b-a)$, $f(x) = 1/(b-a)$, and $f$ is the probability density of a uniform random variable over $(a,b)$ [1].
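
To check that this rewrite leaves the integral unchanged, substitute the two definitions; the factors of $(b-a)$ cancel:

$$\int_a^b \omega(x)f(x)\,dx = \int_a^b h(x)(b-a)\,\frac{1}{b-a}\,dx = \int_a^b h(x)\,dx = I$$
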
Recall that the expectation of a function $g(X)$ of a continuous random variable $X$ with density $f$ is given by

$$E\left[g(X)\right]=\int g(x)f(x)dx$$

Hence, 

$$I=E\left[\omega(X)\right]$$

This is the basic Monte Carlo integration method [1]. To estimate the integral $I$, we evaluate the sample mean

$$\hat{I} = \frac{1}{N}\sum_{i=1}^{N}\omega(x_i)$$

where the $x_i \sim U(a,b)$ are independent draws. By the
<a href="https://en.wikipedia.org/wiki/Law_of_large_numbers">law of large numbers</a> it follows that, as $N \rightarrow \infty$ [1],

$$\hat{I}\rightarrow E\left[\omega(X)\right] = I$$
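
As a minimal sketch of the estimator above, the following function draws $N$ uniform points and averages $\omega(x_i)$. The name `mc_integrate`, the fixed seed, and the use of NumPy's `default_rng` are illustrative assumptions rather than part of the method itself.

```python
import numpy as np


def mc_integrate(h, a, b, n, rng):
    """Basic Monte Carlo estimate of the integral of h over (a, b)."""
    x = rng.uniform(a, b, size=n)   # x_i ~ U(a, b)
    omega = h(x) * (b - a)          # omega(x_i) = h(x_i) * (b - a)
    return omega.mean()             # I_hat = (1/N) * sum of omega(x_i)


# Fixed seed, assumed here only so that reruns give the same numbers.
rng = np.random.default_rng(42)
for n in (10, 1_000, 100_000):
    estimate = mc_integrate(lambda x: x**3, 0.0, 1.0, n, rng)
    print(f"N={n:>6}: I_hat={estimate:.4f} (exact 0.25)")
```

The estimates should move closer to the exact value as $N$ grows, although any single run can still land above or below it.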

The standard error, $\hat{se}$, for the estimate is [1]

$$\hat{se} = \frac{s}{\sqrt{N}}$$

where

$$s^2 = \frac{\sum_{i=1}^{N}(\omega(x_i) - \hat{I})^2}{N - 1}$$

An approximate $1-\alpha$ confidence interval for $I$ is given by [1]

$$\hat{I} \pm z_{\alpha/2}\hat{se}$$
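
The standard error and the confidence interval can be computed from the same sample that gives $\hat{I}$. The sketch below is one possible arrangement: the helper name `mc_confidence_interval` is hypothetical, and $z_{\alpha/2}$ is taken from the standard library's `statistics.NormalDist` as the $1-\alpha/2$ quantile of the standard normal.

```python
from statistics import NormalDist

import numpy as np


def mc_confidence_interval(h, a, b, n, alpha, rng):
    """Return (I_hat, se_hat, lower, upper) for the basic Monte Carlo estimate."""
    x = rng.uniform(a, b, size=n)                  # x_i ~ U(a, b)
    omega = h(x) * (b - a)                         # omega(x_i)
    i_hat = omega.mean()
    s = omega.std(ddof=1)                          # sample std. dev. (N - 1 denominator)
    se_hat = s / np.sqrt(n)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)    # about 1.96 for alpha = 0.05
    return i_hat, se_hat, i_hat - z * se_hat, i_hat + z * se_hat


rng = np.random.default_rng(0)  # seed chosen arbitrarily for reproducibility
i_hat, se_hat, lower, upper = mc_confidence_interval(
    lambda x: x**3, 0.0, 1.0, 10_000, 0.05, rng
)
print(f"I_hat={i_hat:.4f}, se_hat={se_hat:.4f}, 95% C.I.=[{lower:.4f}, {upper:.4f}]")
```

Because $\hat{se}$ shrinks like $1/\sqrt{N}$, halving the width of the interval requires roughly four times as many samples.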

## Python example

In [3]:
import numpy as np
from numpy import random  # numpy.random replaces the deprecated scipy.random alias

In [4]:
def f(x):
    # integrand h(x) = x^3 (not the density f(x) from the text above)
    return x*x*x

In [5]:
# sample sizes, integration limits and the exact value of the integral
N = [10, 100, 1000, 10000]
a = 0.0
b = 1.0
exact_answer = 1.0 / 4.0

In [12]:
for n in N:
    print(f"Sample size={n}")

    # omega(x_i) = h(x_i) * (b - a) for n points drawn from U(a, b)
    y = []
    for _ in range(n):
        point = random.uniform(a, b)
        y.append(f(point) * (b - a))

    # Monte Carlo estimate: sample mean of omega(x_i)
    ans = sum(y) / float(n)
    print(f"Calculated answer={ans}")

    # sum of squared deviations from the estimate
    sum_2 = 0.0
    for yi in y:
        sum_2 += (yi - ans) * (yi - ans)

    # sample standard deviation s (N - 1 denominator) and standard error
    s = np.sqrt(sum_2 / (n - 1))
    se_hat = s / np.sqrt(n)
    print(f"Standard error of estimate {se_hat}")
    print(f"95% C.I. for estimate=[{ans - 1.96*se_hat}, {ans + 1.96*se_hat}]")
    
    
    

Sample size=10
Calculated answer=0.4089009285012767
Standard error of estimate 0.02962949840243374
95% C.I. for estimate=[0.35082711163250657, 0.4669747453700468
Sample size=100
Calculated answer=0.2333997858544021
Standard error of estimate 0.007428551162622636
95% C.I. for estimate=[0.21883982557566173, 0.24795974613314248
Sample size=1000
Calculated answer=0.24887124843892253
Standard error of estimate 0.002554323880305161
95% C.I. for estimate=[0.2438647736335244, 0.2538777232443206
Sample size=10000
Calculated answer=0.2557902663480203
Standard error of estimate 0.0008199960304503451
95% C.I. for estimate=[0.2541830741283376, 0.257397458567703

Note that the standard errors printed above are smaller than $s/\sqrt{N}$ gives for this integrand. For $h(x)=x^3$ on $(0,1)$ the variance of $\omega(X)$ is $1/7 - 1/16 \approx 0.080$, so $s \approx 0.28$ and $\hat{se}$ should be roughly $0.0028$ at $N=10000$. The printed values are instead consistent with $s^2/\sqrt{N}$, so the intervals shown are narrower than they should be.


## Summary

In this section we reviewed Monte Carlo integration, a numerical integration method.
In the basic form discussed here, we sample points $x_i$ from the uniform distribution $U(a,b)$ and evaluate the expression

$$\hat{I}=\frac{1}{N}\sum_{i=1}^{N} \omega(x_i)$$

According to the law of large numbers, $\hat{I}$ will converge in probability to $I$. 
We also saw that an approximate $1-\alpha$ confidence interval is given by $\hat{I} \pm z_{\alpha/2}\hat{se}$.

Standard Monte Carlo integration works well if we can sample from
the target distribution (i.e. the desired distribution). However, this is not always possible.
<a href="https://en.wikipedia.org/wiki/Importance_sampling">Importance sampling</a> is a method we can use to
overcome the problem of sampling from a difficult target distribution.

Finally, one advantage of this basic Monte Carlo integrator is that it is very easy to parallelize: the samples are independent, so partial sums can be computed separately and then combined.
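
One way to picture the parallel version is to split the $N$ samples into independent chunks, sum $\omega(x_i)$ within each chunk, and divide the combined total by $N$. The sketch below assumes NumPy's `SeedSequence.spawn` for independent random streams and uses a thread pool only to keep the example short; the worker count, chunk size, seed, and the helper name `partial_sum` are illustrative choices.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def partial_sum(task):
    """Sum of omega(x_i) over one independent chunk of samples."""
    seed_seq, n_chunk, a, b = task
    rng = np.random.default_rng(seed_seq)   # separate random stream per chunk
    x = rng.uniform(a, b, size=n_chunk)
    return np.sum(x**3 * (b - a))           # integrand h(x) = x^3, as in the example


a, b = 0.0, 1.0
n_workers, n_chunk = 4, 25_000

# spawn() derives one independent child seed per chunk from a single root seed
seeds = np.random.SeedSequence(123).spawn(n_workers)
tasks = [(seed, n_chunk, a, b) for seed in seeds]

with ThreadPoolExecutor(max_workers=n_workers) as pool:
    total = sum(pool.map(partial_sum, tasks))

# Combine the partial sums into a single estimate over all samples
i_hat = total / (n_workers * n_chunk)
print(f"Parallel estimate: {i_hat:.4f} (exact 0.25)")
```

The same pattern carries over to a process pool or a cluster scheduler, since each chunk needs only its own seed and the integration limits.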

## References

1. Larry Wasserman, _All of Statistics. A Concise Course in Statistical Inference_, Springer 2003.