# HW 1-2 Vladimir Saraikin

## Task 1

1) Bias of this estimator: the bias of $\hat{\theta}$ is defined as $Bias(\hat{\theta}) = E[\hat{\theta}] - \theta$.
   - $E[\hat{\theta}] = \int_{0}^{\theta} x \cdot f_{\hat{\theta}}(x) \, dx$,
     where $f_{\hat{\theta}}(x) = n x^{n-1} / \theta^n$ for $0 \le x \le \theta$ is the PDF of the maximum of $n$ i.i.d. $U[0, \theta]$ random variables.
   - $E[\hat{\theta}] = \int_{0}^{\theta} x \cdot \frac{n x^{n-1}}{\theta^n} \, dx = \frac{n}{\theta^n} \int_{0}^{\theta} x^n \, dx = \frac{n}{\theta^n} \cdot \frac{\theta^{n+1}}{n+1} = \frac{n \theta}{n+1}$
   - $Bias(\hat{\theta}) = \frac{n \theta}{n+1} - \theta = \theta \left(\frac{n}{n+1} - 1\right) = -\frac{\theta}{n+1}$

2) SE of the estimator:
   - The standard error (SE) of an estimator is its standard deviation, $SE(\hat{\theta}) = \sqrt{Var(\hat{\theta})}$.
   - Variance of $\hat{\theta}$:
     - $Var(\hat{\theta}) = E[\hat{\theta}^2] - (E[\hat{\theta}])^2$
     - $E[\hat{\theta}^2]$:
       $E[\hat{\theta}^2] = \int_{0}^{\theta} x^2 \cdot \frac{n x^{n-1}}{\theta^n} \, dx = \frac{n}{\theta^n} \cdot \frac{\theta^{n+2}}{n+2} = \frac{n \theta^2}{n+2}$
     - $Var(\hat{\theta}) = \frac{n \theta^2}{n+2} - \left(\frac{n \theta}{n+1}\right)^2 = \theta^2 \left(\frac{n}{n+2} - \frac{n^2}{(n+1)^2}\right) = \frac{n \theta^2}{(n+1)^2 (n+2)}$
   - SE: $SE(\hat{\theta}) = \sqrt{Var(\hat{\theta})} = \frac{\theta}{n+1} \sqrt{\frac{n}{n+2}}$

3) MSE of the estimator:
   - $MSE(\hat{\theta}) = E[(\hat{\theta} - \theta)^2]$.
   - $MSE(\hat{\theta}) = Var(\hat{\theta}) + (Bias(\hat{\theta}))^2$
   - $MSE(\hat{\theta}) = Var(\hat{\theta}) + \left(-\frac{\theta}{n+1}\right)^2 = \frac{n \theta^2}{(n+1)^2 (n+2)} + \frac{\theta^2}{(n+1)^2} = \frac{2 \theta^2}{(n+1)(n+2)}$

4) Is this estimator consistent?
   - An estimator is consistent if it converges in probability to the *true* parameter value as the sample size $n$ goes to infinity.
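   - $\lim_{n \to \infty} Bias(\hat{\theta}) = 0$ and $\lim_{n \to \infty} Var(\hat{\theta}) = 0$, so $MSE(\hat{\theta}) = Var(\hat{\theta}) + (Bias(\hat{\theta}))^2 \to 0$.
   - By Markov's inequality, $P(|\hat{\theta} - \theta| > \varepsilon) \le MSE(\hat{\theta}) / \varepsilon^2 \to 0$ for every $\varepsilon > 0$.
   - Thus, the estimator $\hat{\theta}$ is consistent.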

5) Is this estimator strongly consistent?
   - An estimator is strongly consistent if it converges almost surely to the true parameter value as the sample size $n$ goes to infinity.
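     Writing $\hat{\theta}_n$ for the estimator based on $n$ observations, this means $P\left(\lim_{n \to \infty} \hat{\theta}_n = \theta\right) = 1$.
   - Note that $P(\hat{\theta}_n = \theta) = 0$ for every $n$, because $\hat{\theta}_n$ has a continuous distribution, so the argument must go through the tail probabilities instead.
   - For any $0 < \varepsilon < \theta$, $P(\hat{\theta}_n \le \theta - \varepsilon) = \left(1 - \frac{\varepsilon}{\theta}\right)^n$, and $\sum_{n=1}^{\infty} \left(1 - \frac{\varepsilon}{\theta}\right)^n < \infty$.
   - By the Borel-Cantelli lemma, almost surely $\hat{\theta}_n \le \theta - \varepsilon$ happens for only finitely many $n$. Since $\hat{\theta}_n \le \theta$ always, $\hat{\theta}_n \to \theta$ almost surely.
   - Thus, the estimator $\hat{\theta}$ is strongly consistent.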
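
The closed forms for the maximum, $Bias = -\frac{\theta}{n+1}$, $Var = \frac{n \theta^2}{(n+1)^2 (n+2)}$ and $MSE = \frac{2 \theta^2}{(n+1)(n+2)}$, can be sanity-checked by simulation. The following is a minimal sketch, assuming $\hat{\theta} = \max_i X_i$ with $X_i \sim U[0, \theta]$; the values $\theta = 2$, $n = 20$, the seed, and the helper names are illustrative choices, not part of the assignment.

```python
import numpy as np

def simulate_max_estimator(theta, n, m, rng):
    """Return m realizations of max(X_1, ..., X_n) with X_i ~ U[0, theta]."""
    samples = rng.uniform(0.0, theta, size=(m, n))
    return samples.max(axis=1)

def theoretical_max_estimator(theta, n):
    """Closed-form bias, SE and MSE of the maximum of n i.i.d. U[0, theta]."""
    bias = -theta / (n + 1)
    var = n * theta**2 / ((n + 1) ** 2 * (n + 2))
    mse = 2 * theta**2 / ((n + 1) * (n + 2))
    return bias, np.sqrt(var), mse

# Illustrative (assumed) settings
theta, n, m = 2.0, 20, 200_000
rng = np.random.default_rng(0)

est = simulate_max_estimator(theta, n, m, rng)
emp_bias = est.mean() - theta           # Monte Carlo bias
emp_se = est.std(ddof=1)                # Monte Carlo standard error
emp_mse = np.mean((est - theta) ** 2)   # Monte Carlo MSE

bias, se, mse = theoretical_max_estimator(theta, n)
print(f"bias: simulated {emp_bias:.4f}, formula {bias:.4f}")
print(f"SE:   simulated {emp_se:.4f}, formula {se:.4f}")
print(f"MSE:  simulated {emp_mse:.4f}, formula {mse:.4f}")
```

The simulated and closed-form values should agree up to Monte Carlo error, which shrinks like $1/\sqrt{m}$.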

## Task 2

1) Bias of this estimator:
   - Since $X_i \sim U[0, \theta]$, $E[X_i] = \frac{\theta}{2}$.
   - $E[\hat{\theta}] = E\left[2 \cdot \frac{1}{n} \sum_{k=1}^{n} X_k \right] = \frac{2}{n} \sum_{k=1}^{n} E[X_k] = 2 E[X_i]$ by linearity of expectation, because the $X_i$ are identically distributed.
   - $E[\hat{\theta}] = 2 \cdot \frac{\theta}{2} = \theta$.
   - Therefore, $\text{Bias}(\hat{\theta}) = E[\hat{\theta}] - \theta = \theta - \theta = 0$.

2) SE of this estimator:
   - For $X_i \sim U[0, \theta]$, $\text{Var}(X_i) = \frac{\theta^2}{12}$.
   - Because the $X_i$ are independent, the variance of the sample mean is $\text{Var}\left(\frac{1}{n} \sum_{k=1}^{n} X_k\right) = \frac{1}{n^2} \sum_{k=1}^{n} \text{Var}(X_k) = \frac{\theta^2}{12n}$.
   - $\text{Var}(\hat{\theta}) = 4 \cdot \text{Var}\left(\frac{1}{n} \sum_{k=1}^{n} X_k\right) = \frac{\theta^2}{3n}$.
   - $\text{SE}(\hat{\theta}) = \sqrt{\text{Var}(\hat{\theta})} = \sqrt{\frac{\theta^2}{3n}} = \frac{\theta}{\sqrt{3n}}$.

3) MSE of the estimator:
   - $\text{MSE}(\hat{\theta}) = \text{Var}(\hat{\theta}) + (\text{Bias}(\hat{\theta}))^2$.
   - As $\text{Bias}(\hat{\theta}) = 0$, $\text{MSE}(\hat{\theta}) = \text{Var}(\hat{\theta})$.
   - Therefore, $\text{MSE}(\hat{\theta}) = \frac{\theta^2}{3n}$.

4) Is this estimator consistent?
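   - A sufficient condition for consistency is that both $\text{Bias}(\hat{\theta}) \to 0$ and $\text{Var}(\hat{\theta}) \to 0$ as $n \to \infty$: then $\text{MSE}(\hat{\theta}) \to 0$, which implies convergence in probability to $\theta$.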
   - Since $\text{Bias}(\hat{\theta}) = 0$ and $\text{Var}(\hat{\theta}) = \frac{\theta^2}{3n} \to 0$ as $n \to \infty$, the estimator $\hat{\theta}$ is consistent.

5) Is this estimator strongly consistent?
   - The $X_i$ are i.i.d. with finite mean, so the strong law of large numbers (SLLN) implies that $\frac{1}{n} \sum_{k=1}^{n} X_k \to E[X_i]$ almost surely as $n \to \infty$.
   - Since $E[X_i] = \frac{\theta}{2}$, $2 \cdot \frac{1}{n} \sum_{k=1}^{n} X_k \to \theta$ almost surely as $n \to \infty$.
   - Thus, the estimator $\hat{\theta}$ is strongly consistent.
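
The results for $\hat{\theta} = 2 \bar{X}$ can be checked numerically in the same way. The sketch below is illustrative only: $\theta = 2$, $n = 20$, the seed, and the variable names are assumptions. It compares simulated bias, SE, and MSE with $0$, $\frac{\theta}{\sqrt{3n}}$, and $\frac{\theta^2}{3n}$, and then follows the running estimate along one long sample path, which the SLLN says should settle at $\theta$.

```python
import numpy as np

# Illustrative (assumed) settings
theta, n, m = 2.0, 20, 200_000
rng = np.random.default_rng(1)

# m independent samples of size n; one estimate 2 * mean per sample
samples = rng.uniform(0.0, theta, size=(m, n))
est = 2.0 * samples.mean(axis=1)

emp_bias = est.mean() - theta
emp_se = est.std(ddof=1)
emp_mse = np.mean((est - theta) ** 2)

theo_se = theta / np.sqrt(3 * n)
theo_mse = theta**2 / (3 * n)

print(f"bias: simulated {emp_bias:.4f}, formula 0")
print(f"SE:   simulated {emp_se:.4f}, formula {theo_se:.4f}")
print(f"MSE:  simulated {emp_mse:.4f}, formula {theo_mse:.4f}")

# One long path: running estimate 2 * (X_1 + ... + X_k) / k for k = 1..N
path = rng.uniform(0.0, theta, size=100_000)
running = 2.0 * np.cumsum(path) / np.arange(1, path.size + 1)
print(f"running estimate after {path.size} draws: {running[-1]:.4f}")
```

A single path cannot prove almost-sure convergence, but it shows the behaviour the SLLN predicts.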

## Task 5

In [3]:
import numpy as np
from scipy.stats import norm

n = 100  # observations
m = 1000  # simulations
confidence_level = 0.95

# calculate epsilon for the DKW inequality
epsilon = np.sqrt(np.log(2 / (1 - confidence_level)) / (2 * n))

epsilon

0.13581015157406193
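
The DKW band half-width is $\varepsilon = \sqrt{\ln(2/\alpha) / (2n)}$ with $\alpha = 1 - \text{confidence level}$. As a small sketch, the same formula can be wrapped in a function and inverted to find the sample size needed for a target half-width. The function names `dkw_epsilon` and `dkw_sample_size` are hypothetical helpers introduced here for illustration.

```python
import math

def dkw_epsilon(n, alpha):
    """Half-width of the (1 - alpha) DKW confidence band for n observations."""
    return math.sqrt(math.log(2.0 / alpha) / (2.0 * n))

def dkw_sample_size(epsilon, alpha):
    """Smallest n for which the DKW half-width is at most epsilon."""
    return math.ceil(math.log(2.0 / alpha) / (2.0 * epsilon**2))

for n_obs in (100, 400, 1600):
    # The half-width shrinks like 1 / sqrt(n): quadrupling n halves it
    print(n_obs, round(dkw_epsilon(n_obs, 0.05), 4))

print(dkw_sample_size(0.05, 0.05))
```

Because $\varepsilon$ scales as $1/\sqrt{n}$, halving the band width requires roughly four times as many observations.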

In [4]:
from scipy.stats import norm

def ecdf(data):
    """
    Compute the ECDF of a one-dimensional array of measurements.

    Returns the sorted data x and the ECDF values y at those points.
    """
    # number of data points
    n = len(data)
    # x-data for the ECDF
    x = np.sort(data)
    # y-data for the ECDF
    y = np.arange(1, n+1) / n
    return x, y

contain_true_cdf = 0
contain_ecdf = 0

true_cdf = norm.cdf

for _ in range(m):
    samples = np.random.normal(0, 1, n)
    x, y_ecdf = ecdf(samples)
    lower_band = y_ecdf - epsilon
    upper_band = y_ecdf + epsilon

    # check if the true CDF lies within the band at every sample point
    # (uses the right-continuous ECDF values i/n only)
    y_true_cdf = true_cdf(x)
    if np.all((y_true_cdf >= lower_band) & (y_true_cdf <= upper_band)):
        contain_true_cdf += 1

    # check if ECDF is within its own band (this is guaranteed)
    if np.all((y_ecdf >= lower_band) & (y_ecdf <= upper_band)):
        contain_ecdf += 1

(contain_true_cdf / m, contain_ecdf / m)


(0.967, 1.0)
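
The check above compares the true CDF with the ECDF values $i/n$ at the sorted sample points. The ECDF jumps from $(i-1)/n$ to $i/n$ at each point, so $\sup_x |F_n(x) - F(x)|$ can also be attained just to the left of a jump. The following is a sketch of a stricter variant that looks at both sides of each jump. It is not the check used above; the function name `ks_sup_distance` and the seed are assumptions made for illustration, and its coverage estimate may differ slightly from the one reported.

```python
import numpy as np
from scipy.stats import norm

def ks_sup_distance(samples, cdf):
    """sup_x |F_n(x) - F(x)| for a continuous CDF F, using both sides of each jump."""
    x = np.sort(samples)
    n = x.size
    f = cdf(x)
    above = np.arange(1, n + 1) / n - f   # ECDF value at x_(i) minus F
    below = f - np.arange(0, n) / n       # F minus the ECDF's left limit at x_(i)
    return max(above.max(), below.max())

n, m, alpha = 100, 1000, 0.05
epsilon = np.sqrt(np.log(2 / alpha) / (2 * n))
rng = np.random.default_rng(0)

# Count the simulations whose band contains the true CDF everywhere
covered = sum(
    ks_sup_distance(rng.normal(0, 1, n), norm.cdf) <= epsilon
    for _ in range(m)
)
coverage = covered / m
print(coverage)
```

The DKW inequality guarantees that the true coverage is at least $1 - \alpha$, so the simulated fraction should be close to or above 0.95, up to Monte Carlo error.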