#### Exercise 5: Variance reduction methods


1. Estimate the integral $\int_{0}^{1} e^x dx $ by simulation (the _crude_ Monte Carlo estimator). Use, for example, an estimator based on 100 samples and present your result as the point estimator and confidence interval  


In [51]:
import numpy as np
from scipy import stats
from matplotlib import pyplot as plt

#### Crude Monte Carlo estimator

We can interpret this interval as

$$
\theta = \int_{0}^{1} e^{x} \: dx = \mathrm{E}(e^{u}) \hspace{0.3cm} , \hspace{0.3cm} U \in \mathrm{U}(0,1)
$$

To estimate the integral obtain the sample of the random variable $e^{U}$ and take the average.

$$
X_{i} = e^{U_{i}} \hspace{0.3cm} , \hspace{0.3cm} \bar{X} = \frac{\sum_{i=1}^{n} X_{i}}{n}
$$

In [None]:
# the integral

# no. of samples
n = 100

# no. of samples obtained from uniform distribution
U = np.random.uniform(0, 1, n)

#  sample of the random variable e^U
eU = np.exp(U)

# Point estimate 𝜃_hat
X_bar = sum(eU) / n

# analytic solution
analytic_sol = np.exp(1) - 1


In [None]:
# CI for 95%
def CI(data, alpha = 0.05):
    mean = np.mean(data)                                          # mean
    std = np.std(data)                                            # standard deviation
    variance = np.var(data)                                       # variance
    chi_squared = stats.t.ppf(1 - alpha/2, n -1)                  # variance
    lower = mean - chi_squared * (std / np.sqrt(len(data)))       # lower
    upper = mean + chi_squared * (std / np.sqrt(len(data)))       # upper    
    return mean, lower, upper, variance

# Obtain the CI and variance from crude method
mean, lower, upper, variance = CI(np.exp(U))

In [70]:
print(f"The estimate for the integral of exp(x) from 0 -> 1 is: {X_bar:.4f}")
print(f"The variance of the estimate is: {variance:.4f}")
print(f"The 95% CI for the crude Monte Carlo method is: [{lower:.4f}, {upper:.4f}]")
print(f"The analytical solution for the integral: {analytic_sol:.4f}")

The estimate for the integral of exp(x) from 0 -> 1 is: 1.7142
The variance of the estimate is: 0.2488
The 95% CI for the crude Monte Carlo method is: [1.6152, 1.8131]
The analytical solution for the integral: 1.7183


Confident Interval

In [21]:
# Degrees of freedom
df = 100 - 1

# Significance level
alpha = 0.05

# Calculate critical value
chi_squared = chi2.ppf(1 - alpha, df)
print("The critical value is:", chi_squared)

# Calculate the p-value under the observed test statistic
p_value = chi2.sf(T, df)
print("The p-value is:", p_value)

NameError: name 'chi2' is not defined

2. Estimate the integral $\int_{0}^{1} e^x dx $  using antithetic variables, with comparable computer ressources.

3. Estimate the integral $\int_{0}^{1} e^x dx $ using a control variable, with comparable computer ressources.

4. Estimate the integral $\int_{0}^{1} e^x dx $ using stratified sampling, with
comparable computer ressources.

5. Use control variates to reduce the variance of the estimator in
exercise 4 (Poisson arrivals).

6. Demonstrate the effect of using common random numbers in
exercise 4 for the difference between Poisson arrivals (Part 1) and a
renewal process with hyperexponential interarrival times. Remark:
You might need to do some thinking and some re-programmin

7. For a standard normal random variable Z ∼ N(0, 1) using the crude
Monte Carlo estimator estimate the probability Z > a. Then try
importance sampling with a normal density with mean a and
variance σ
2
. For the expirements start using σ
2 = 1, use different
values of a (e.g. 2 and 4), and different sample sizes. If time
permits experiment with other values for σ
2
. Finally discuss the
efficiency of the methods.

8. Use importance sampling with g(x) = λ exp (−λ ∗ x) to calculate
the integral $\int_{0}^{1} e^x dx $ of Question 1. Try to find the optimal value of
λ by calculating the variance of $ h(X)f(X)/g(X) $ and verify by
simulation. Note that importance sampling with the exponential
distribution will not reduce the variance.

9. For the Pareto case derive the IS estimator for the mean using the
first moment distribution as sampling distribution. Is the approach
meaningful? and could this be done in general? With this insight
could you change the choice of $ g(x) $ in the previous question
(Question 8) such that importance sampling would reduce the
variance? You do not need to implement this, as long as you can
argue, what should happe