## The Monte Carlo method

What is the average height of all people living in the United States?

This is difficult to determine exactly, but it is easy to estimate well:<br>
Sample $n=100$ (say) people at random. Then use the average height of these $n$ people as an estimate of the average height of all people in the US.

This is an example of the general problem where we are interested in an unknown **parameter** $\theta$ of a population.

We estimate $\theta$ with a statistic (**estimator**) $\hat{\theta}$ which is based on a sample of $n$ observations $X_1, ..., X_n$ drawn at random from the population:

$\hat{\theta}=$ average of the sample $=\frac{1}{n}\sum_{i=1}^{n}X_i$

$\hat{\theta}=\frac{1}{n}\sum_{i=1}^{n}X_i$ tends to be close to the uncomputable population mean $\theta$, even for moderate sample sizes such as $n=100$.

This example is a special case of the **Monte Carlo Method** or **Simulation**:

* We approximate a fixed quantity $\theta$ by the average of independent random variables that have expected value $\theta$.

* By the law of large numbers, the approximation error can be made arbitrarily small by using a large enough sample size.
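The two points above can be illustrated with a minimal sketch. It assumes a hypothetical population in which heights are normally distributed with mean 170 cm and standard deviation 10 cm; these numbers, and the helper names `monte_carlo_mean` and `draw_height`, are made up for illustration. Because the population is simulated, the true $\theta = 170$ is known and the approximation error can be printed for several sample sizes.

```python
import random

def monte_carlo_mean(draw, n, rng):
    """Approximate E[X] by the average of n independent draws of X."""
    return sum(draw(rng) for _ in range(n)) / n

def draw_height(rng):
    # Assumption: heights follow Normal(mean=170 cm, sd=10 cm).
    # This is a made-up population, not real US data.
    return rng.gauss(170.0, 10.0)

rng = random.Random(0)  # fixed seed so the run is reproducible
for n in (100, 1_000, 100_000):
    estimate = monte_carlo_mean(draw_height, n, rng)
    error = abs(estimate - 170.0)
    print(f"n={n:>7}: estimate={estimate:.3f}, error={error:.3f}")
```

The error for any single run is random, so it need not shrink at every step, but it tends to get smaller as $n$ grows.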

The Monte Carlo Method can also be used for more involved quantities. For example, we can use it to compute the standard error (SE) of a statistic $\hat{\theta}$.

Recall that the standard error tells us roughly how far off the statistic will be from its expected value. The precise definition is

$$SE(\hat{\theta}) = \sqrt{E\left[\left(\hat{\theta} - E(\hat{\theta})\right)^2\right]}$$

To estimate the standard error by Monte Carlo, we proceed as follows:

* Get many (say 1,000) samples of 100 observations each.

* Compute $\hat{\theta}$ for each sample, resulting in 1,000 estimates $\hat{\theta}_1, ..., \hat{\theta}_{1000}$.

* Compute the standard deviation of these 1,000 estimates:

$$s(\hat{\theta}_1,...,\hat{\theta}_{1000})=\sqrt{\frac{1}{999}\sum_{i=1}^{1000}\left(\hat{\theta}_i - \operatorname{avg}(\hat{\theta}_1,...,\hat{\theta}_{1000})\right)^2}$$

Note that this is not an average of independent random variables. But it can be shown that the law of large numbers still applies and Monte Carlo works:

$$s(\hat{\theta}_1,...,\hat{\theta}_{1000}) \approx SE(\hat{\theta})$$
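The procedure for the standard error can be sketched as follows. This sketch again assumes a hypothetical Normal(170, 10) population, and the helper names `draw_sample` and `monte_carlo_se` are illustrative. The statistic is the sample mean, chosen because for independent observations with standard deviation $\sigma$ its standard error is known to be $\sigma/\sqrt{n}$, which gives a value to compare against.

```python
import random
import statistics

def draw_sample(n, rng):
    # Assumption: a made-up Normal(mean=170, sd=10) population.
    return [rng.gauss(170.0, 10.0) for _ in range(n)]

def monte_carlo_se(statistic, n, n_samples, rng):
    """Estimate SE(statistic) from n_samples independent samples of size n."""
    estimates = [statistic(draw_sample(n, rng)) for _ in range(n_samples)]
    # statistics.stdev divides by (n_samples - 1), matching the formula for s.
    return statistics.stdev(estimates)

rng = random.Random(0)  # fixed seed so the run is reproducible
se_hat = monte_carlo_se(statistics.fmean, n=100, n_samples=1000, rng=rng)
print(f"Monte Carlo SE: {se_hat:.3f}")
print(f"sigma/sqrt(n):  {10.0 / 100 ** 0.5:.3f}")
```

Any other function of a sample, such as `statistics.median`, can be passed as `statistic`, which is where the approach is most useful, since a simple formula for the standard error is often not available.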

We can use Monte Carlo only if we can draw many samples of size 100!