# Stochastic Simulation

*Winter Semester 2024/25*

06.12.2024

Prof. Sebastian Krumscheid<br>
Assistants: Stjepan Salatovic, Louise Kluge

<h3 align="center">
Exercise sheet 05
</h3>

---

<h1 align="center">
The Monte Carlo method
</h1>

In [1]:
import matplotlib.pyplot as plt
import numpy as np

from ipywidgets import interact
from scipy.special import factorial
from scipy.stats import uniform, norm, pareto, lognorm, rv_continuous
from scipy.optimize import newton
from typing import Callable, Tuple
from tqdm.notebook import tqdm

In [2]:
plt.rc('axes', labelsize=14)     # fontsize of the x and y labels
plt.rc('xtick', labelsize=12)    # fontsize of the tick labels
plt.rc('ytick', labelsize=12)    # fontsize of the tick labels
plt.rc('legend', fontsize=14)    # legend fontsize

## Exercise 1

Let $X = [X_1, X_2, \dots, X_n] \sim \mathcal{U}([-1,1]^n)$ be a random vector uniformly distributed over the $n$-dimensional hypercube $\Gamma=[-1,1]^n$ (equivalently, its components are i.i.d. $\mathcal{U}([-1,1])$), and define the random variable $Z=\mathbb{1}_{\|X\|_{l^2}<1}.$ Observe that

\begin{align*}
I=\mathbb{E}[Z]=\int_\Gamma \mathbb{1}_{\|x\|_{l^2}<1}p(x)\mathrm{d}x=\frac{1}{|\Gamma|}\left|B(0,1)\right|,
\end{align*}

where $p(x)$ is the PDF of $\mathcal{U}([-1,1]^n),$ and $\left|B(0,1)\right|$ is the volume of the $n$-dimensional ball with center $0$ and radius $1$.
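
For reference, the exact value can be written in closed form using the standard volume formula for the unit ball (a well-known fact that is not stated on the sheet itself), with $|\Gamma| = 2^n$:

\begin{align*}
\left|B(0,1)\right| = \frac{\pi^{n/2}}{\Gamma_{\!f}\left(\frac{n}{2}+1\right)},
\qquad
I = \frac{\pi^{n/2}}{2^n\,\Gamma_{\!f}\left(\frac{n}{2}+1\right)},
\end{align*}

where $\Gamma_{\!f}$ denotes the gamma function (the subscript is only used here to distinguish it from the hypercube $\Gamma$). This gives $I=\pi/4\approx 0.785$ for $n=2$ and $I=\pi^3/384\approx 0.0807$ for $n=6$.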

1. Let $n=2$. Use Monte Carlo to approximate the value of $I$:
$$
\overline{I}_N := \frac{1}{N} \sum_{k=1}^N Z_k
$$
For $N=10,100,1000,10000,$ compute $\overline{I}_N$ as well as an approximate confidence interval and compare with the exact value $I$. In addition, plot the relative error $\frac{|\overline I_N -I|}{I}$ versus $N$ on a log-log scale and verify the convergence rate.

In [3]:
def get_sample(N: int, n: int=2) -> np.ndarray:
    """
    Generates a uniform sample of size `N` in the hypercube `[-1, 1]^n`.
    """
    # TODO
    return

In [6]:
def I(N: int, n: int=2) -> float:
    """
    Estimates `I` using `N` Monte Carlo samples in the hypercube `[-1, 1]^n`.
    """
    # TODO
    return
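
One possible way to fill in the two stubs above is sketched below. It is only an illustration: the function name `mc_ball_fraction`, the default level `alpha=0.05` for the confidence interval, and the use of `numpy.random.default_rng` are assumptions that do not come from the sheet.

```python
import numpy as np
from scipy.stats import norm


def mc_ball_fraction(N, n=2, alpha=0.05, rng=None):
    """Estimate I = P(||X||_2 < 1) for X ~ U([-1, 1]^n), with a CLT-based half-width."""
    rng = np.random.default_rng() if rng is None else rng
    x = rng.uniform(-1.0, 1.0, size=(N, n))         # N points in the hypercube
    z = (np.sum(x**2, axis=1) < 1.0).astype(float)  # indicator of the unit ball
    estimate = z.mean()
    # Half-width of the approximate (1 - alpha) confidence interval,
    # using the sample standard deviation of the indicators.
    half_width = norm.ppf(1 - alpha / 2) * z.std(ddof=1) / np.sqrt(N)
    return estimate, half_width


rng = np.random.default_rng(0)
exact = np.pi / 4
for N in (10, 100, 1000, 10000):
    est, hw = mc_ball_fraction(N, rng=rng)
    rel_err = abs(est - exact) / exact
    print(f"N={N:6d}  estimate={est:.4f} +/- {hw:.4f}  relative error={rel_err:.2e}")
```

The relative errors from a single run fluctuate considerably, so averaging over several independent runs before plotting makes the $O(N^{-1/2})$ slope easier to see.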

2. **(On the choice of $N$):** By _a priori_ analysis (knowing that $Z\sim \text{Bernoulli}(p)$ with $p=\pi/4$), determine three lower bounds for $N = N(\alpha, \epsilon)$ with $\epsilon =10^{-2}$ and $\alpha = 10^{-4}$ for ensuring that
$$
\mathbb{P}\left(\left| \overline{I}_N - \pi/4 \right| > \epsilon \right) < \alpha
$$
using Chebycheff's inequality (rigorous), the Berry-Esseen Theorem
(rigorous) and the leap of faith
$$
\frac{ \overline{I}_N - \pi/4}{\sqrt{\mathrm{Var}(Z)/N}} \sim \mathcal{N}(0,1).
$$

    Discuss the advantages and disadvantages of using each bound.

In [7]:
def central_limit_bound(alpha: float, eps: float) -> int:
    """
    Calculates the minimum sample size for estimating `π / 4` based on the CLT.
    """
    # TODO
    return

In [8]:
def chebycheff_bound(alpha: float, eps: float) -> int:
    """
    Calculates the minimum sample size for estimating `π / 4` based on the Chebycheff bound.
    """
    # TODO
    return

In [9]:
def berry_essen_bound(alpha: float, eps: float) -> int:
    """
    Calculates the minimum sample size for estimating `π / 4` based on the Berry-Esseen theorem.
    """
    # TODO
    return
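
The three bounds can be sketched as follows. The names `n_chebyshev`, `n_clt` and `n_berry_esseen` are hypothetical, the Berry-Esseen constant `C=0.4748` is an assumption (the lecture may use a different constant), and `scipy.optimize.brentq` is used here as a root finder instead of the `newton` routine imported at the top of the notebook. The Berry-Esseen version bounds the probability by $2\left(1-\Phi(\epsilon\sqrt{N}/\sigma)\right) + 2C\rho/(\sigma^3\sqrt{N})$ with $\rho=\mathbb{E}|Z-p|^3$ and solves for the smallest $N$ that pushes this below $\alpha$.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

P = np.pi / 4                      # success probability of Z ~ Bernoulli(P)
VAR = P * (1 - P)                  # Var(Z)
RHO = VAR * (P**2 + (1 - P)**2)    # third absolute central moment E|Z - P|^3


def n_chebyshev(alpha, eps):
    # Chebyshev: P(|I_N - P| > eps) <= VAR / (N * eps^2), required to be below alpha.
    return int(np.ceil(VAR / (alpha * eps**2)))


def n_clt(alpha, eps):
    # Leap of faith: treat the standardized error as exactly standard normal.
    return int(np.ceil(norm.ppf(1 - alpha / 2) ** 2 * VAR / eps**2))


def n_berry_esseen(alpha, eps, C=0.4748):
    # C is an assumed value of the Berry-Esseen constant, not taken from the sheet.
    def excess(N):
        gaussian_tail = 2 * norm.sf(eps * np.sqrt(N / VAR))
        correction = 2 * C * RHO / (VAR**1.5 * np.sqrt(N))
        return gaussian_tail + correction - alpha

    # excess is decreasing in N; the bracket is wide enough for small alpha.
    root = brentq(excess, 1.0, 1e12, xtol=0.5, maxiter=500)
    return int(np.ceil(root))


alpha, eps = 1e-4, 1e-2
print("CLT:         ", n_clt(alpha, eps))
print("Chebyshev:   ", n_chebyshev(alpha, eps))
print("Berry-Esseen:", n_berry_esseen(alpha, eps))
```

With these inputs the CLT-based value is by far the smallest, but it is the only one of the three that is not a rigorous guarantee.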

3. An important property of the MC method is that, under
very weak regularity assumptions, an $O(N^{-1/2})$
convergence rate holds independently of the dimensionality of the
underlying problem. To illustrate this, repeat the approximation of $\mathbb{E}[Z]$ from part 1 for $n=6$ and check that the observed convergence rate is unchanged.

In [10]:
# TODO
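
A minimal sketch of such an experiment for $n=6$ is given below. It assumes the standard closed form for the volume of the unit ball, and it averages the error over `reps` independent runs (a choice made here to smooth out the noise of single runs, not something the sheet prescribes). The names `exact_fraction` and `rms_relative_error` are hypothetical.

```python
import numpy as np
from scipy.special import gamma


def exact_fraction(n):
    """Volume of the unit n-ball divided by the volume 2^n of the hypercube."""
    return np.pi ** (n / 2) / (gamma(n / 2 + 1) * 2**n)


def rms_relative_error(N, n, reps, rng):
    """Root-mean-square relative error of the MC estimator over `reps` independent runs."""
    exact = exact_fraction(n)
    errors = np.empty(reps)
    for r in range(reps):
        x = rng.uniform(-1.0, 1.0, size=(N, n))
        errors[r] = (np.mean(np.sum(x**2, axis=1) < 1.0) - exact) / exact
    return np.sqrt(np.mean(errors**2))


rng = np.random.default_rng(0)
Ns = np.array([10**2, 10**3, 10**4])
errs = np.array([rms_relative_error(N, n=6, reps=100, rng=rng) for N in Ns])

# Slope of log(error) against log(N); a value near -1/2 indicates the MC rate.
slope = np.polyfit(np.log(Ns), np.log(errs), 1)[0]
print("RMS relative errors:", errs)
print("fitted slope:", slope)
```

The rate is the same as for $n=2$, but the relative error at a fixed $N$ is larger because the ball occupies a much smaller fraction of the hypercube.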

## Exercise 2

A simulator would like to produce an unbiased estimate of
$\mathbb{E}(XY)$, where the two independent random variables $X$ and
$Y$ have finite second moments and can be generated by a stochastic
simulation. To this end, she simulates $R\in\mathbb{N}$ replications
$X_1,\dots, X_R$ of $X$ and, independently of this, $R$ replications
$Y_1,\dots, Y_R$ of $Y$. She thus has the following two natural
estimators for $\mathbb{E}(XY)$ at her disposal:

\begin{equation*}
  \text{Est}_1 := \Biggl(\frac{1}{R}\sum_{r=1}^RX_r \Biggr) \Biggl(\frac{1}{R}\sum_{r=1}^RY_r \Biggr)\quad\text{and}\quad \text{Est}_2 := \frac{1}{R}\sum_{r=1}^RX_rY_r\;.
\end{equation*}

1. Verify that both estimators $\text{Est}_1$ and $\text{Est}_2$ are unbiased.

2. Show that $\mathrm{Var}(\text{Est}_1)<\mathrm{Var}(\text{Est}_2)$ for $R>1$, assuming that neither $X$ nor $Y$ is almost surely constant.

3. Use the delta method to show that $\sqrt{R}(\text{Est}_1-\mu_x\mu_y)\overset{d}{\rightarrow}\mathcal{N}(0,\tau^2)$, where $\mu_x=\mathbb{E}[X]$ and $\mu_y=\mathbb{E}[Y]$. Find $\tau^2$ explicitly and derive an asymptotic $1-\alpha$ confidence interval for $\mathbb{E}(XY)$.
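
The three statements can be checked numerically before proving them. The sketch below uses normal distributions with illustrative parameters chosen here (they are not part of the exercise) and a hypothetical helper `est1_confidence_interval`. The helper assumes the plug-in variance $\hat\tau^2 = \bar Y^2 s_x^2 + \bar X^2 s_y^2$, which is what the delta method yields for $g(a,b)=ab$ with independent components.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
mu_x, sd_x, mu_y, sd_y = 2.0, 1.0, 3.0, 2.0  # illustrative choice, not from the sheet
R, M = 50, 20000                             # replications per estimate, repeated experiments

X = rng.normal(mu_x, sd_x, size=(M, R))
Y = rng.normal(mu_y, sd_y, size=(M, R))
est1 = X.mean(axis=1) * Y.mean(axis=1)       # product of the sample means
est2 = (X * Y).mean(axis=1)                  # sample mean of the products

print("means:    ", est1.mean(), est2.mean(), "target:", mu_x * mu_y)
print("variances:", est1.var(ddof=1), est2.var(ddof=1))


def est1_confidence_interval(x, y, alpha=0.05):
    """Asymptotic (1 - alpha) interval for E[X]E[Y] with a plug-in delta-method variance."""
    xb, yb = x.mean(), y.mean()
    tau2 = yb**2 * x.var(ddof=1) + xb**2 * y.var(ddof=1)
    half_width = norm.ppf(1 - alpha / 2) * np.sqrt(tau2 / len(x))
    return xb * yb - half_width, xb * yb + half_width


print("interval from the first experiment:", est1_confidence_interval(X[0], Y[0]))
```

Both empirical means should be close to $\mu_x\mu_y$, and the empirical variance of $\text{Est}_1$ should come out smaller than that of $\text{Est}_2$.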

## Exercise 3

Algorithm 1 proposes a sequential Monte Carlo method to compute the expectation $\mathbb{E}[X]$ of a random variable $X$, where the sample size is doubled at each iteration until the half-width of the estimated $1-\alpha$ confidence interval, based on a central limit theorem approximation, is smaller than a prescribed tolerance $\epsilon$. The algorithm then outputs the final sample size $N(\epsilon,\alpha)$, as well as the estimated value $\bar X_N$. Throughout, $C_{1-\alpha/2}$ denotes the $(1-\alpha/2)$-quantile of the standard normal distribution.

---
**Algorithm 1** Sample Variance Based SMC

- **Input:** $N_0$, distribution $\lambda$, accuracy $\epsilon>0$, confidence $1-\alpha>0$.
- **Output:** $\overline X_{\epsilon,\alpha} $ (i.e., approximation of $\mathbb E [ X]$ with $X\sim\lambda$), $N$.

- Set $k=0$, generate $N_k$ i.i.d. replicas $\{X_i\}_{i=1}^{N_k}$ of $X\sim\lambda$, and compute the sample mean and the sample variance
    \begin{equation}\tag{1}
    \overline X_{N_k}=\frac{1}{N_k}\sum_{i=1}^{N_k}X_i,
    \end{equation}

    \begin{equation}\tag{2}
    \overline \sigma^2_{N_k} := \frac{1}{N_k-1} \sum_{i=1}^{N_k} (X_i -\overline X_{N_k})^2.
    \end{equation}

- **while** $\bar{\sigma}_{N_k}C_{1-\alpha/2}/\sqrt{N_k} > \epsilon$ **do**
    - Set $k =k+1$ and $N_{k}=2N_{k-1}$.
    - Generate a new batch of $N_k$ i.i.d. replicas $\{X_i\}_{i=1}^{N_k}$ of $X\sim \lambda$.
    - Compute the sample mean $\overline X_{N_k}$ by (1) and the sample variance $\overline \sigma^2_{N_k}$ by (2).
- **end while**

- Set $N =N_k$, generate i.i.d. samples $\{X_i\}_{i=1}^{N}$ of $\lambda$ and compute the output sample mean $\overline X_{\epsilon,\alpha}$.

---

Algorithm 1 can be particularly sensitive to the choice of the initial sample size $N_0$, so we would like to assess its robustness in estimating $\mathbb{E}[X]$ for different distributions of $X$. For several values of $N_0$ between $10$ and $50$, consider $\alpha = 10^{-1.5}$ and $\epsilon =1/10$, and the following random variables:

1. $X \sim \text{Pareto}(x_m=1,\gamma=3.1)$ (i.e. with PDF $p(y) = \mathbb{1}_{y>x_m} x_m^\gamma \gamma y^{-(\gamma+1)}$), $\,\mathbb{E}[X]=\frac{\gamma x_m}{\gamma-1}.$
2. $X \sim \text{Lognormal}(\mu=0,\sigma=1)$,  $\,\mathbb{E}[X]=\exp\left(\mu+\frac{\sigma^2}{2}\right).$
3. $X \sim U([-1,1]),$ $\,\mathbb{E}[X]=0$.

Repeat the simulation $K=20\alpha^{-1}$ times and record the sample sizes $\{N^{(i)}\}_{i=1}^K$ as well as the computed sample means $\{\bar{X}_{\epsilon,\alpha}^{(i)}\}_{i=1}^K$ returned by the algorithm for each run $i=1,...,K$. Estimate the probability of failure $\overline p$ of the algorithm:

$$
\large
\overline p_K(N_0,\epsilon,\alpha)=\frac{1}{K}\sum_{i=1}^{K}\mathbb{1}_{|\bar{X}^{(i)}_{\epsilon, \alpha}-\mathbb{E}[X]|>\epsilon}.
$$

Then check whether  $\overline p_K(N_0,\epsilon,\alpha) \le \alpha$ holds. Repeat your experiment for different values of $\epsilon$ and $\alpha$. Discuss your results. 

**Hint 1:** You may generate $\text{Pareto}(x_m,\gamma)$ random variables by inversion or have a look at [`scipy.stats.pareto`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.pareto.html).
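
As a sketch of the inversion route: the PDF given above corresponds to the CDF $F(y)=1-(x_m/y)^{\gamma}$ for $y>x_m$, so $F^{-1}(u)=x_m(1-u)^{-1/\gamma}$. The function name `pareto_by_inversion` is hypothetical.

```python
import numpy as np
from scipy.stats import pareto


def pareto_by_inversion(size, x_m=1.0, gamma=3.1, rng=None):
    """Sample Pareto(x_m, gamma) via X = F^{-1}(U) = x_m * (1 - U)^(-1/gamma)."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.random(size)  # U in [0, 1), so 1 - U lies in (0, 1] and the power is finite
    return x_m * (1.0 - u) ** (-1.0 / gamma)


samples = pareto_by_inversion(10**6, rng=np.random.default_rng(0))
print("sample mean:  ", samples.mean())
print("SciPy mean:   ", pareto(b=3.1).mean())
print("gamma/(gamma-1):", 3.1 / 2.1)
```

The three printed values should roughly agree, which also confirms that SciPy's shape parameter `b` plays the role of $\gamma$ when `scale` is left at its default of 1.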

**Hint 2:** The type hint [`rv_continuous`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.rv_continuous.html) for the `dist` argument of the `algorithm_1` function below indicates that you can use any of [SciPy's continuous distributions](https://docs.scipy.org/doc/scipy/tutorial/stats/continuous.html). For instance, you can pass `dist=pareto(b=3.1)` or `dist=lognorm(s=1)`. This is only one of many possible solutions, but it allows for a modular implementation.

In [12]:
def algorithm_1(n: int, dist: rv_continuous, alpha: float, eps: float) -> Tuple[float, int]:
    """
    Estimates the mean of a random variable with distribution `dist` using the SMC Algorithm 1,
    starting from the initial sample size `n` (i.e. `N_0`).
    Returns the estimated mean and the final sample size.
    """
    # TODO
    return

In [13]:
def prob_of_failure(algo: Callable, n: int, dist: rv_continuous, alpha: float, eps: float) -> float:
    """
    Estimates the probability of failure for a given algorithm.
    """
    # TODO
    return
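
The following is one possible sketch of Algorithm 1 and of the failure-probability estimate, not a reference solution. The names `smc_doubling` and `failure_rate`, the extra `rng` and `K` arguments, and the reduced number of repetitions `K=200` in the usage lines are assumptions made here for illustration (the sheet asks for $K=20\alpha^{-1}$). The sketch relies on frozen SciPy distributions, whose `rvs` method accepts `size` and `random_state`.

```python
import numpy as np
from scipy.stats import norm, uniform


def smc_doubling(n0, dist, alpha, eps, rng=None):
    """Doubling scheme: enlarge the sample until the CLT half-width is at most eps."""
    rng = np.random.default_rng() if rng is None else rng
    c = norm.ppf(1 - alpha / 2)               # standard normal quantile C_{1-alpha/2}
    N = n0
    x = dist.rvs(size=N, random_state=rng)
    while c * x.std(ddof=1) / np.sqrt(N) > eps:
        N *= 2
        x = dist.rvs(size=N, random_state=rng)  # fresh batch, as in Algorithm 1
    # Final, independent sample of size N for the output mean.
    x = dist.rvs(size=N, random_state=rng)
    return x.mean(), N


def failure_rate(algo, n0, dist, alpha, eps, K, rng=None):
    """Fraction of K independent runs whose output misses the true mean by more than eps."""
    rng = np.random.default_rng() if rng is None else rng
    true_mean = dist.mean()
    fails = sum(abs(algo(n0, dist, alpha, eps, rng)[0] - true_mean) > eps for _ in range(K))
    return fails / K


alpha, eps = 10**-1.5, 0.1
dist = uniform(loc=-1, scale=2)               # U([-1, 1])
p_fail = failure_rate(smc_doubling, 10, dist, alpha, eps, K=200, rng=np.random.default_rng(0))
print("estimated failure probability:", p_fail, " target alpha:", alpha)
```

Replacing `dist` by `pareto(b=3.1)` or `lognorm(s=1)` runs the same experiment for the heavy-tailed cases, where small $N_0$ tends to make the variance estimate less reliable.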

## Exercise 4

Compare Algorithm 1 with the sequential Monte Carlo method in Algorithm 2, where one realization is added at a time. 

---
**Algorithm 2** One-at-a-time Sample Variance Based SMC

- **Input:** $N_0$, distribution $\lambda$, accuracy $\epsilon>0$, confidence $1-\alpha>0$.
- **Output:** $\overline X_{\epsilon,\alpha} $ (i.e., approximation of $\mathbb E [ X]$ with $X\sim\lambda$), $N$.

- Set $k=0$, generate $N_k$ i.i.d. samples $\{X_i\}_{i=1}^{N_k}$ of $\lambda$ and compute the sample mean $\overline{X}_{N_k}=\frac{1}{N_k}\sum_{i=1}^{N_k}X_i$ and the sample variance 
    $$
    \overline \sigma^2_{N_k} := \frac{1}{N_k-1} \sum_{i=1}^{N_k} (X_i -\overline{X}_{N_k})^2.
    $$

- **while** $\bar{\sigma}_{N_k}C_{1-\alpha/2}/\sqrt{N_k} > \epsilon$ **do**
    - Generate a new i.i.d. sample $X^{(N_k+1)}$ of $\lambda$.
    - Compute
    \begin{align}
    \overline{X}_{N_k+1}&=\frac{N_k}{N_k+1}\overline{X}_{N_k} +\frac{1}{N_k+1}X^{(N_k+1)},\\
    \overline\sigma^2_{N_k+1}&=\frac{N_k-1}{N_k}\overline\sigma^2_{N_k}+
    \frac{1}{N_k+1}(X^{(N_k+1)}-\overline{X}_{N_k})^2.
    \end{align}
    - Set $N_{k+1}=N_k+1$ and $k=k+1$.
- **end while**

- Set $N =N_k$, generate i.i.d. samples $\{X_i\}_{i=1}^{N}$ of $\lambda$ and compute the output sample mean $\overline X_{\epsilon,\alpha}$.

---


In [14]:
def algorithm_2(n: int, dist: rv_continuous, alpha: float, eps: float) -> Tuple[float, int]:
    """
    Estimates the mean of a random variable with distribution `dist` using the SMC Algorithm 2.
    Returns the estimated mean and the final sample size.
    """
    # TODO
    return
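
A possible sketch of Algorithm 2 follows, using the one-sample update of the running mean and variance from the algorithm box. The names `update_moments` and `smc_one_at_a_time`, the `rng` argument, and the safeguard `max_n` are assumptions added here and are not part of the sheet.

```python
import numpy as np
from scipy.stats import norm, uniform


def update_moments(N, mean, var, x_new):
    """Update the mean and unbiased variance of N samples after adding one sample."""
    new_var = (N - 1) / N * var + (x_new - mean) ** 2 / (N + 1)  # uses the old mean
    new_mean = N / (N + 1) * mean + x_new / (N + 1)
    return new_mean, new_var


def smc_one_at_a_time(n0, dist, alpha, eps, rng=None, max_n=10**7):
    """Add one sample at a time until the CLT half-width is at most eps."""
    rng = np.random.default_rng() if rng is None else rng
    c = norm.ppf(1 - alpha / 2)
    x = dist.rvs(size=n0, random_state=rng)
    N, mean, var = n0, x.mean(), x.var(ddof=1)
    # max_n is a hypothetical safeguard against very long runs.
    while c * np.sqrt(var / N) > eps and N < max_n:
        mean, var = update_moments(N, mean, var, dist.rvs(random_state=rng))
        N += 1
    # Final, independent sample of size N for the output mean.
    x = dist.rvs(size=N, random_state=rng)
    return x.mean(), N


alpha, eps = 10**-1.5, 0.1
est, N = smc_one_at_a_time(10, uniform(loc=-1, scale=2), alpha, eps,
                           rng=np.random.default_rng(0))
print("estimate:", est, " final sample size:", N)
```

Because the stopping rule is checked after every single sample, this variant can stop as soon as the variance estimate happens to dip, which is worth keeping in mind when comparing its failure probability with that of Algorithm 1.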