## Intermediate Exam 2024 Solutions
Robert Mackevič

In [1]:
import numpy as np
from scipy.stats import gaussian_kde

# Individualized parameters
N = 6
S = 8
I1 = 5
I2 = 4

### Problem 1

#### Task (a)
$N = 6 + 8 = 14$\
$q = 6 / (6 + 8) = 3/7 \approx 0.4286$\
$j = 8$

The quantile $y_{(q)}$ is the $3/7$-quantile of $Y$.\
To locate the closest order statistic to $y_{(q)}$ we can calculate:
$$q \cdot N = 3/7 \cdot 14 = 6$$
So $y_{(q)}$ approximately represents $Y_{(6)}$

We get this:
$$P\{Y_{(8)} \leq y_{(q)} < Y_{(11)}\}=0$$

If $y_{(q)}$ approximately represents $Y_{(6)}$, then $Y_{(8)} \leq y_{(q)}$ is impossible by the definition of order statistics.

---
#### Task (b)
Original dataset size: $N = 2 + 6 = 8$\
New dataset size: $N = 4 + 6 = 10$

- The mean of the dataset is affected significantly by the addition of extremes, as it is sensitive to outliers.
- The median remains unaffected by the addition of these extremes, as it is based on the ranks of the middle observations.
- The range of the dataset increases, as the new observations extend the dataset in both directions.
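
A small numerical sketch of these three effects, using a hypothetical sample of eight values (not the exam data) with one extreme added on each side:

```python
import numpy as np

# Hypothetical sample of size 8 (illustrative values, not from the exam).
original = np.array([3.0, 5.0, 6.0, 7.0, 8.0, 9.0, 11.0, 12.0])
# Add one extreme below the minimum and one far above the maximum.
extended = np.concatenate([original, [-50.0, 200.0]])


def summary(x):
    """Return (mean, median, range) of a 1-D sample."""
    return float(np.mean(x)), float(np.median(x)), float(x.max() - x.min())


mean0, med0, range0 = summary(original)
mean1, med1, range1 = summary(extended)

print(f"mean:   {mean0:.3f} -> {mean1:.3f}")
print(f"median: {med0:.3f} -> {med1:.3f}")
print(f"range:  {range0:.3f} -> {range1:.3f}")
```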
---


### Problem 2

#### Task (a)

If $D \leqslant Q$, all demand is fulfilled and the revenue is $pD$.\
If $D > Q$, only $Q$ kg is sold and the revenue is $pQ$.\
The cost is always $cQ$, since the goods are acquired regardless of demand. The expected gain is therefore:
$$
G(Q) = p \cdot \mathbb{E}[\min(D, Q)] - cQ
$$
$$
G(Q) = p \int_0^Q x \, dF_D(x) + pQ \int_Q^\infty dF_D(x) - cQ
$$
$$
G(Q) = p \int_0^Q x \, dF_D(x) + pQ \cdot (1 - F_D(Q)) - cQ
$$

The optimal $Q^*$ maximizes $G(Q)$, which requires:
$$
\frac{dG(Q)}{dQ} = 0
$$
Differentiating $G(Q)$, assuming $F_D$ has a density $f_D$ (the two boundary terms cancel):
$$
\frac{dG(Q)}{dQ} = p Q f_D(Q) - p Q f_D(Q) + p \cdot (1 - F_D(Q)) - c = 0
$$
$$
p \cdot (1 - F_D(Q)) - c = 0
$$

Solve for $Q^*$:
$$
F_D(Q^*) = 1 - \frac{c}{p}
$$
$$
Q^* = F_D^{-1}\left(1 - \frac{c}{p}\right)
$$
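
As a sanity check of this quantile formula, the sketch below assumes exponentially distributed demand with mean $m$ (an illustrative choice, not part of the problem). In that case $\mathbb{E}[\min(D, Q)] = m(1 - e^{-Q/m})$ and $F_D^{-1}(u) = -m \ln(1 - u)$. The values of `p` and `c` are those of Task (c); `m = 30` is hypothetical.

```python
import numpy as np

p, c = 14.0, 8.0   # price and unit cost, as in Task (c)
m = 30.0           # hypothetical mean of the exponential demand


def expected_gain(Q):
    # G(Q) = p * E[min(D, Q)] - c * Q for D ~ Exp(mean m)
    return p * m * (1.0 - np.exp(-Q / m)) - c * Q


# Closed form: F_D(Q*) = 1 - c/p  =>  Q* = -m * ln(c / p)
Q_star = -m * np.log(c / p)

# Brute-force maximizer on a fine grid, for comparison
grid = np.linspace(0.0, 200.0, 200001)
Q_grid = grid[np.argmax(expected_gain(grid))]

print(f"closed form Q* = {Q_star:.3f}, grid maximizer = {Q_grid:.3f}")
```

The two values should agree up to the grid resolution, since $G$ is concave in $Q$.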

---
#### Task (b)
We can use the EDF $\hat{F}_D(x)$ in place of $F_D$ to estimate $G(Q)$:
$$
\hat{F}_D(x) = \frac{1}{n} \sum_{i=1}^n \mathbb{1}_{\{D_i \leq x\}}
$$
where $\mathbb{1}_{\{D_i \leq x\}}$ is the indicator function, which equals $1$ if $D_i \leq x$ and $0$ otherwise.

$$
\hat{G}(Q) = p \sum_{i=1}^n \frac{D_i \cdot \mathbb{1}_{\{D_i \leq Q\}}}{n} + pQ \cdot \left[1 - \frac{\sum_{i=1}^n \mathbb{1}_{\{D_i \leq Q\}}}{n}\right] - cQ
$$

$$
\hat{Q}^* = \hat{F}_D^{-1}\left(1 - \frac{c}{p}\right)
$$

$$
\hat{G}(\hat{Q}^*) = p \sum_{i=1}^n \frac{D_i \cdot \mathbb{1}_{\{D_i \leq \hat{Q}^*\}}}{n} + p \hat{Q}^* \cdot \left[1 - \frac{\sum_{i=1}^n \mathbb{1}_{\{D_i \leq \hat{Q}^*\}}}{n}\right] - c \hat{Q}^*
$$
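
Because $D \cdot \mathbb{1}_{\{D \leq Q\}} + Q \cdot \mathbb{1}_{\{D > Q\}} = \min(D, Q)$, the plug-in estimator can equivalently be computed as $p \cdot \frac{1}{n}\sum_i \min(D_i, Q) - cQ$. A minimal sketch of both estimators follows; the function names are illustrative, and the sample and the values of $p$ and $c$ are those listed in Task (c).

```python
import numpy as np


def gain_hat(demand, Q, p, c):
    """Plug-in estimate of G(Q): p * mean(min(D_i, Q)) - c * Q."""
    return p * np.mean(np.minimum(demand, Q)) - c * Q


def edf_quantile(demand, level):
    """Smallest sample value x whose EDF value is at least `level` (0 < level <= 1)."""
    xs = np.sort(demand)
    edf_at_xs = np.arange(1, len(xs) + 1) / len(xs)
    return xs[np.argmax(edf_at_xs >= level)]


demand = np.array([24, 35, 26, 51, 39, 16, 21, 44, 26])
p, c = 14, 8

Q_hat = edf_quantile(demand, 1 - c / p)
G_hat = gain_hat(demand, Q_hat, p, c)
print(f"Q_hat = {Q_hat}, G_hat(Q_hat) = {G_hat:.4f}")
```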

---
#### Task (c)
$c = 8$\
$p = 6 + 8 = 14$\
$Q = 21 + 5 = 26$\
$D = \{24,35,26,51,39,16,21,44,26\}$

Plugging the numbers into the formula for $\hat{G}(Q)$ from Task (b):
$$ \hat{G}(Q) = 129\frac{5}{9} $$
Find the smallest $\hat{Q}^*$ such that $\hat F_D(\hat{Q}^*) \geq 1-\frac{c}{p} = \frac{3}{7}$:
$$\hat{Q}^*=26$$
Repeat the first calculation with $\hat{Q}^*$ in place of $Q$; since $\hat{Q}^* = Q = 26$, the value is the same:
$$ \hat{G}(\hat{Q}^*) = 129\frac{5}{9} $$

The computation is given in the code block below.

In [2]:
c = S
p = N + S
Q = 21 + I1

data = np.array([19 + I1, 35, 26, 51, 39, 10 + N, 21, 44, 10 + 2 * S])
n = len(data)


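def calculate_gain(Q):
    """Plug-in estimate of G(Q) from Task (b), using the global data, p and c."""
    indicator_leq_Q = (data <= Q).astype(int)  # 1{D_i <= Q}
    sum_D_leq_Q = np.sum(data * indicator_leq_Q)  # total of the demands not exceeding Q
    sum_indicator_leq_Q = np.sum(indicator_leq_Q)  # number of D_i <= Q
    return p * (sum_D_leq_Q / n) + p * Q * (1 - (sum_indicator_leq_Q / n)) - c * Q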


hat_G_Q = calculate_gain(Q)

# EDF evaluated at each sorted observation
F_hat_D = np.array([np.sum(data <= x) / n for x in sorted(data)])
threshold = 1 - (c / p)
# Q* is the smallest observation whose EDF value reaches the threshold
Q_star_candidates = sorted(data)
Q_star_index = np.argmax(F_hat_D >= threshold)
Q_star = Q_star_candidates[Q_star_index]

hat_G_Q_star = calculate_gain(Q_star)

print("Results:")
print(f"G(Q) = {hat_G_Q:.4f}")
print(f"Q* = {Q_star}")
print(f"G(Q*) = {hat_G_Q_star:.4f}")


Results:
G(Q) = 129.5556
Q* = 26
G(Q*) = 129.5556


---
#### Task (d)

Treating the two sample means as uncorrelated (the covariance term is neglected), the variance of $\hat{G}(Q)$ for fixed $Q$ is approximated by:
$$
\text{Var}(\hat{G}(Q)) \approx p^2 \cdot \text{Var}\left(\frac{1}{n} \sum_{i=1}^n D_i \cdot \mathbb{1}_{\{D_i \leq Q\}}\right) + p^2 Q^2 \cdot \text{Var}\left(\frac{1}{n} \sum_{i=1}^n \mathbb{1}_{\{D_i \leq Q\}}\right).
$$

1. **First Term**:
   $$
   \text{Var}\left(\frac{1}{n} \sum_{i=1}^n D_i \cdot \mathbb{1}_{\{D_i \leq Q\}}\right) = \frac{1}{n^2} \sum_{i=1}^n \text{Var}(D_i \cdot \mathbb{1}_{\{D_i \leq Q\}}),
   $$
    where:
   $$
   \text{Var}(D_i \cdot \mathbb{1}_{\{D_i \leq Q\}}) = \mathbb{E}[(D_i \cdot \mathbb{1}_{\{D_i \leq Q\}})^2] - (\mathbb{E}[D_i \cdot \mathbb{1}_{\{D_i \leq Q\}}])^2.
   $$

2. **Second Term**:

   $$
   \text{Var}\left(\frac{1}{n} \sum_{i=1}^n \mathbb{1}_{\{D_i \leq Q\}}\right) = \frac{1}{n^2} \sum_{i=1}^n \text{Var}(\mathbb{1}_{\{D_i \leq Q\}}),
   $$

   where:

   $$
   \text{Var}(\mathbb{1}_{\{D_i \leq Q\}}) = \mathbb{E}[\mathbb{1}_{\{D_i \leq Q\}}] \cdot (1 - \mathbb{E}[\mathbb{1}_{\{D_i \leq Q\}}]).
   $$
The variance of $\hat{G}(\hat{Q}^*)$ is computed with the same formula, treating $\hat{Q}^*$ as fixed.

Answer:
$$
\text{Var}(\hat{G}(Q)) = 6553.7668
$$
$$
\text{Var}(\hat{G}(\hat{Q}^*)) = 6553.7668
$$
The computation is given in the code block below.

In [3]:
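def variance_of_gain(Q):
    """Plug-in variance of G_hat(Q) for fixed Q; the covariance between the two sample means is not included."""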
    indicator_leq_Q = (data <= Q).astype(int)
    D_leq_Q = data * indicator_leq_Q

    # First term variance
    E_D_leq_Q = np.mean(D_leq_Q)
    E_D_leq_Q_squared = np.mean(D_leq_Q ** 2)
    var_D_leq_Q = E_D_leq_Q_squared - E_D_leq_Q ** 2

    # Second term variance
    E_indicator = np.mean(indicator_leq_Q)
    var_indicator = E_indicator * (1 - E_indicator)

    # Total variance
    return (p ** 2 / n) * var_D_leq_Q + (p ** 2 * Q ** 2 / n) * var_indicator


var_G_Q = variance_of_gain(Q)
var_G_Q_star = variance_of_gain(Q_star)

# Print results
print(f"Variance of G(Q): {var_G_Q:.4f}")
print(f"Variance of G(Q*): {var_G_Q_star:.4f}")

Variance of G(Q): 6553.7668
Variance of G(Q*): 6553.7668
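
The decomposition above treats the two sample means as uncorrelated. Since both are built from the same $D_i$, a cross-check that keeps the covariance is to write $\hat{G}(Q) = p \cdot \frac{1}{n}\sum_i \min(D_i, Q) - cQ$, whose plug-in variance for fixed $Q$ is $\frac{p^2}{n} \widehat{\text{Var}}(\min(D, Q))$. The sketch below is an alternative under that reading, not the computation used above; the variable names are illustrative.

```python
import numpy as np

demand = np.array([24, 35, 26, 51, 39, 16, 21, 44, 26])
p, Q = 14, 26
n = len(demand)

# Direct form: for fixed Q, Var(G_hat) = p^2 / n * Var(min(D, Q));
# the (biased, divide-by-n) sample variance is plugged in.
var_direct = p**2 / n * np.var(np.minimum(demand, Q))

# Same quantity from the two-term decomposition with the covariance included.
ind = (demand <= Q).astype(float)     # 1{D_i <= Q}
A = demand * ind                      # D_i * 1{D_i <= Q}
cov_A_ind = np.mean(A * ind) - A.mean() * ind.mean()
var_decomp = p**2 / n * (np.var(A) + Q**2 * np.var(ind) - 2 * Q * cov_A_ind)

print(f"direct: {var_direct:.4f}, decomposition with covariance: {var_decomp:.4f}")
```

The two numbers agree with each other and are much smaller than the value obtained without the covariance term, because the two sums are strongly positively correlated and enter $\hat{G}(Q)$ with opposite signs.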


---
#### Task (e)
Robustness of $\hat{G}(Q)$:
- It is sensitive to outliers because it directly incorporates $D$ values in its computation. For instance, if most demands are around 20, but one is 100, it will heavily skew the gain calculation if $Q \geq 100$.
- With fewer data points it will become less reliable because the empirical distribution may poorly approximate the true distribution.

Robustness of $\hat{Q}^*$:
- The optimal quantity depends on a quantile of the empirical distribution. Outliers can distort the EDF, shifting $\hat{Q}^*$ away from the true optimal value.
- A small sample size can lead to inaccurate estimates of the quantiles of the EDF.

Robustness of $\hat{G}(\hat{Q}^*)$:
- Combines the effects of outliers on both $\hat{G}(Q)$ and $\hat{Q}^*$.
- Suffers from similar problems as the other estimators.
---

### Problem 3

#### Task (a)
From the context of the problem we can assume that $X$ and $Y$ follow binomial distributions:
- There are $n$ independent trials.
- Each trial results in a "success" (e.g., infection) or "failure" (e.g., no infection).
- The probability of success, $p$, is the same for each trial.

$X \sim \text{Binomial}(M, p_m)$ - $p_m$ infection rate for women.\
$Y \sim \text{Binomial}(N, p_n)$ - $p_n$ infection rate for men.

The formula for $\hat{\rho}$ is:

$$
\hat{\rho} = \frac{XN}{YM}.
$$

We rewrite it as a function of two variables:

$$
g(X, Y) = \frac{XN}{YM}.
$$

The variance is approximated by the delta method (a first-order Taylor expansion), assuming $X$ and $Y$ are independent:

$$
\text{Var}(\hat{\rho}) \approx \left( \frac{\partial g}{\partial X} \right)^2 \text{Var}(X) + \left( \frac{\partial g}{\partial Y} \right)^2 \text{Var}(Y),
$$

where:
- $ \frac{\partial g}{\partial X} = \frac{N}{YM} $,
- $ \frac{\partial g}{\partial Y} = -\frac{XN}{Y^2M} $.

Substituting these derivatives:

$$
\text{Var}(\hat{\rho}) \approx \left( \frac{N}{YM} \right)^2 \text{Var}(X) + \left( -\frac{XN}{Y^2M} \right)^2 \text{Var}(Y).
$$

For binomial random variables:
- $ \text{Var}(X) = Mp_m(1 - p_m) $,
- $ \text{Var}(Y) = Np_n(1 - p_n) $.

Substitute these variances:

$$
\text{Var}(\hat{\rho}) \approx \left( \frac{N}{YM} \right)^2 \cdot Mp_m(1 - p_m) + \left( \frac{XN}{Y^2M} \right)^2 \cdot Np_n(1 - p_n).
$$

The variance estimator for $\hat{\rho}$ replaces the unknown rates with $\hat{p}_m = X/M$ and $\hat{p}_n = Y/N$:

$$
\widehat{\text{Var}}(\hat{\rho}) = \left( \frac{N}{YM} \right)^2 \cdot M\hat{p}_m(1 - \hat{p}_m) + \left( \frac{XN}{Y^2M} \right)^2 \cdot N\hat{p}_n(1 - \hat{p}_n).
$$
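
A minimal sketch of this estimator, assuming independent binomial counts and using the counts from Task (c) as input (the function names are illustrative):

```python
def rho_hat(X, M, Y, N):
    """Ratio of the sample infection rates, (X / M) / (Y / N)."""
    return (X * N) / (Y * M)


def var_rho_hat(X, M, Y, N):
    """Delta-method variance estimate of rho_hat for independent binomial X and Y."""
    pm, pn = X / M, Y / N
    d_dX = N / (Y * M)              # partial derivative of g with respect to X
    d_dY = -X * N / (Y**2 * M)      # partial derivative of g with respect to Y
    return d_dX**2 * M * pm * (1 - pm) + d_dY**2 * N * pn * (1 - pn)


X, M, Y, N = 6, 56, 8, 108  # counts from Task (c)
rho = rho_hat(X, M, Y, N)
var = var_rho_hat(X, M, Y, N)
print(f"rho_hat = {rho:.4f}, estimated variance = {var:.4f}")
```

Algebraically the same quantity can be written as $\hat{\rho}^2 \left[ (1 - \hat{p}_m)/X + (1 - \hat{p}_n)/Y \right]$, which is convenient for checking the computation by hand.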

---
#### Task (b)
- $H_0 : p_m = p_n$ - Infection rates for women and men are equal.
- $H_1 : p_m \ne p_n$ - Infection rates for women and men are not equal.

The test statistic is:
$$
z = \frac{\hat{\rho} - 1}{\sqrt{\widehat{\text{Var}}(\hat{\rho})}}.
$$

1. Determine the critical value for a two-tailed test at a significance level of 0.1:
   $$
   z_{\alpha/2} = \Phi^{-1}(1 - \alpha/2) \quad \text{with } \alpha = 0.1.
   $$

2. Reject $ H_0 $ if:
   $$
   |z| > z_{\alpha/2}.
   $$

Otherwise, fail to reject $ H_0 $.


The statistic $ \frac{YM}{XN} $ is the reciprocal of $ \hat{\rho} $. While it could also be used to test $ H_0: p_m = p_n $, its interpretation flips because, if $ \hat{\rho} $ is large (e.g., $ p_m > p_n $), then $ \frac{YM}{XN} $ will be small, and vice versa.
For this specific test both will lead to the same decision.
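
The decision rule for $\hat{\rho}$ can be sketched as follows. This is an illustration under the normal approximation, using the counts from Task (c); `ratio_z_test` is a hypothetical helper name, and the standard-library `statistics.NormalDist` is used for $\Phi^{-1}$.

```python
from math import sqrt
from statistics import NormalDist


def ratio_z_test(X, M, Y, N, alpha=0.1):
    """Two-sided z-test of H0: p_m = p_n based on rho_hat = (X/M) / (Y/N)."""
    pm, pn = X / M, Y / N
    rho = pm / pn
    # Delta-method variance estimate, written in its simplified form.
    var = rho**2 * ((1 - pm) / X + (1 - pn) / Y)
    z = (rho - 1) / sqrt(var)
    crit = NormalDist().inv_cdf(1 - alpha / 2)
    return z, crit, abs(z) > crit


z, crit, reject = ratio_z_test(6, 56, 8, 108)
print(f"z = {z:.3f}, critical value = {crit:.3f}, reject H0: {reject}")
```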


---
#### Task (c)
The estimator $\hat\rho$ is sensitive to extreme values because if $Y$ is very small or close to 0, then $\hat\rho$ becomes very large or undefined. Similarly, extreme values of $X$, $M$ or $N$ can distort the statistic as well.

The maximal bias occurs when $\hat\rho$ deviates significantly from its expected value due to extreme changes in $Y$, $X$, $M$ or $N$.

When:\
$X = 6$\
$M = 50 + 6 = 56$\
$Y = 8$\
$N = 100 + 8 = 108$

$$
\hat{\rho} = \frac{6 \cdot 108}{8 \cdot 56} = \frac{648}{448} \approx 1.446.
$$

**Maximal Bias**\
Case 1: $ X $ is doubled $ X \to 2X $
$$
\hat{\rho}_{\text{bias}} = \frac{12 \cdot 108}{8 \cdot 56} = \frac{1296}{448} \approx 2.893.
$$
Case 2: $ Y \to Y/2 $
$$
\hat{\rho}_{\text{bias}} = \frac{6 \cdot 108}{4 \cdot 56} = \frac{648}{224} \approx 2.893.
$$

As $ Y \to 0 : \hat{\rho} \to \infty $.
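
A short sketch of this sensitivity, using the counts given above. Doubling $X$ and halving $Y$ both scale $\hat{\rho}$ by a factor of two, and $\hat{\rho}$ grows without bound as $Y$ shrinks.

```python
def rho_hat(X, M, Y, N):
    """Ratio of the sample infection rates, (X / M) / (Y / N)."""
    return (X * N) / (Y * M)


X, M, Y, N = 6, 56, 8, 108
base = rho_hat(X, M, Y, N)
x_doubled = rho_hat(2 * X, M, Y, N)
y_halved = rho_hat(X, M, Y // 2, N)

print(f"baseline : {base:.3f}")
print(f"X doubled: {x_doubled:.3f}")
print(f"Y halved : {y_halved:.3f}")

# rho_hat is unbounded as Y decreases towards zero.
for y in (4, 2, 1):
    print(f"Y = {y}: rho_hat = {rho_hat(X, M, y, N):.3f}")
```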

**Break Point**
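   - $ X = 0 $ (so $ \hat{\rho} = 0 $): requires changing the $6$ infected observations among the $M = 56$ in the first group, i.e. $ \frac{6}{56} \approx 10.7\% $ of that sample.
   - $ Y = 0 $ (so $ \hat{\rho} $ is undefined or infinite): requires changing the $8$ infected observations among the $N = 108$ in the second group, i.e. $ \frac{8}{108} \approx 7.4\% $ of that sample.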

---

### Problem 4

#### Task (a)
The parameter $\alpha_2$ represents the variance of $X$, expressed as:
$$
\alpha_2 = \mathbb{E}[X^2] - (\mathbb{E}[X])^2.
$$

The standard estimator for $\alpha_2$ is:
$$
\hat{\alpha}_2 = S_2 - S_1^2,
$$
where:
- $S_1 = \frac{1}{n} \sum_{j=1}^n X_j$: Sample mean,
- $S_2 = \frac{1}{n} \sum_{j=1}^n X_j^2$: Mean of squared observations.

The jackknife method estimates the bias by systematically removing one observation at a time. For each $i$-th observation, calculate:
$$
\hat{\alpha}_2^{(-i)} = S_2^{(-i)} - \left(S_1^{(-i)}\right)^2,
$$
where:
- $S_1^{(-i)} = \frac{1}{n-1} \sum_{j \neq i} X_j$: Mean without the $i$-th observation,
- $S_2^{(-i)} = \frac{1}{n-1} \sum_{j \neq i} X_j^2$: Mean of squares without the $i$-th observation.

The bias-adjusted jackknife estimator is:
$$
\hat{\alpha}_2^{\text{jack}} = n \cdot \hat{\alpha}_2 - (n-1) \cdot \bar{\alpha}_2,
$$
where:
- $\hat{\alpha}_2 = S_2 - S_1^2$: Original estimator,
- $\bar{\alpha}_2 = \frac{1}{n} \sum_{i=1}^n \hat{\alpha}_2^{(-i)}$: Average of leave-one-out estimators.


The jackknife variance estimator for $\hat{\alpha}_2$ is:
$$
\hat{V}^2 = \frac{n-1}{n} \sum_{i=1}^n \left(\hat{\alpha}_2^{(-i)} - \bar{\alpha}_2\right)^2,
$$
with the leave-one-out estimators $\hat{\alpha}_2^{(-i)}$ and their mean $\bar{\alpha}_2$ as defined above.


---
#### Task (b)
1. **Bias-Adjusted Estimator**:
   - The jackknife bias correction works by averaging leave-one-out estimators $ \hat{\alpha}_2^{(-i)} $ and adjusting the original estimator $ \hat{\alpha}_2 $ to account for the bias.

2. **Variance Estimator**:
   - The variance $ \hat{V}^2 $ reflects the variability of the leave-one-out estimators around their mean $ \bar{\alpha}_2 $, scaled for the sample size.

3. **Assumptions**:
   - Observations are i.i.d.,
   - The second moment $ \mathbb{E}[X^2] $ exists and is finite,
   - The bias of $ \hat{\alpha}_2 $ is of order $ 1/n $, so that the correction removes its leading term.

---
#### Task (c)

In [4]:
D = np.array([24, 35, 26, 51, 39, 16, 21, 44, 26])
n = len(D)

S1 = np.mean(D)
S2 = np.mean(D ** 2)

alpha_2 = S2 - S1 ** 2

alpha_2_leave_one_out = []
for i in range(n):
    D_minus_i = np.delete(D, i)
    S1_minus_i = np.mean(D_minus_i)
    S2_minus_i = np.mean(D_minus_i ** 2)
    alpha_2_leave_one_out.append(S2_minus_i - S1_minus_i ** 2)

alpha_2_leave_one_out = np.array(alpha_2_leave_one_out)
alpha_2_bar = np.mean(alpha_2_leave_one_out)

alpha_2_jackknife = n * alpha_2 - (n - 1) * alpha_2_bar

V2 = (n - 1) / n * np.sum((alpha_2_leave_one_out - alpha_2_bar) ** 2)

print(f"Standard alpha_2 estimator: {alpha_2:.4f}")
print(f"Bias-adjusted alpha_2 (jackknife): {alpha_2_jackknife:.4f}")
print(f"Variance of alpha_2 (jackknife): {V2:.4f}")

Standard alpha_2 estimator: 119.1111
Bias-adjusted alpha_2 (jackknife): 134.0000
Variance of alpha_2 (jackknife): 2152.9141
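
For the plug-in variance, the jackknife bias correction reproduces the usual unbiased sample variance (the $n/(n-1)$ rescaling), which explains the value $134$. A short self-contained check on the same data, with illustrative names:

```python
import numpy as np

D = np.array([24, 35, 26, 51, 39, 16, 21, 44, 26], dtype=float)
n = len(D)


def plug_in_var(x):
    """Plug-in variance S2 - S1^2 (divides by the sample size)."""
    return np.mean(x**2) - np.mean(x) ** 2


theta = plug_in_var(D)
# Leave-one-out estimates and the bias-adjusted jackknife estimate
loo = np.array([plug_in_var(np.delete(D, i)) for i in range(n)])
theta_jack = n * theta - (n - 1) * loo.mean()

unbiased = np.var(D, ddof=1)  # sample variance with divisor n - 1
print(f"jackknife: {theta_jack:.4f}, unbiased sample variance: {unbiased:.4f}")
```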


---

### Problem 5

#### Task (1)
**Step 1: Define the Test Statistic**

- **For the mean**: Use the sample mean $\bar{Y}$:
  $$
  T(Y) = \bar{Y}
  $$

- **For the median**: Use the sample median $\text{med}(Y)$:
  $$
  T(Y) = \text{med}(Y)
  $$

Under $H_0$, $T(Y)$ should be close to $N$.

**Step 2: Compute the Observed Test Statistic**

Compute the observed test statistic $T_{\text{obs}}$ from the given data:

$$
T_{\text{obs}} =
\begin{cases}
\bar{Y} & \text{for the mean} \\
\text{med}(Y) & \text{for the median}
\end{cases}
$$

**Step 3: Generate Bootstrap Samples**

Resample the observed data $Y_1, Y_2, \dots, Y_n$ with replacement to generate bootstrap samples $Y_1^*, Y_2^*, \dots, Y_n^*$.

For each bootstrap sample $Y^*$, compute the test statistic $T(Y^*)$:

$$
T^* =
\begin{cases}
\bar{Y}^* & \text{for the mean} \\
\text{med}(Y^*) & \text{for the median}
\end{cases}
$$

Repeat this $B$ times (e.g., $B = 1000$) to generate a bootstrap distribution of $T^*$.

**Step 4: Compute the $p$-Value**

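Because the samples are drawn from the uncentered data, the bootstrap distribution of $T^*$ is centered at $T_{\text{obs}}$ rather than at $N$. The $p$-value is therefore the proportion of bootstrap statistics whose deviation from $T_{\text{obs}}$ is at least as large as the observed deviation from $N$:

$$
p = \frac{\# \left\{ T^* : \left| T^* - T_{\text{obs}} \right| \geq \left| T_{\text{obs}} - N \right| \right\}}{B}
$$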

**Step 5: Decision Rule**

If $p \leq \alpha$ (e.g., $\alpha = 0.1$), reject $H_0$: The mean or median significantly differs from $N$.

Otherwise, fail to reject $H_0$: No significant evidence to conclude $\mu(F_Y) \neq N$.
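
The steps above can be sketched as follows. The data are hypothetical and `bootstrap_pvalue` is an illustrative helper; the sketch measures extremeness around $T_{\text{obs}}$, which is equivalent to resampling data that have been shifted so that their center equals the hypothesized value.

```python
import numpy as np


def bootstrap_pvalue(y, center0, stat=np.mean, B=1000, seed=0):
    """Two-sided bootstrap p-value for H0: stat(F_Y) = center0."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    t_obs = stat(y)
    # Resample with replacement and recompute the statistic B times.
    idx = rng.integers(0, len(y), size=(B, len(y)))
    t_star = np.apply_along_axis(stat, 1, y[idx])
    # The bootstrap distribution is centered at t_obs, so deviations are
    # measured from t_obs and compared with the observed deviation from center0.
    return float(np.mean(np.abs(t_star - t_obs) >= abs(t_obs - center0)))


# Hypothetical data, for illustration only.
y = np.random.default_rng(1).normal(loc=10.0, scale=1.0, size=50)
p_mean = bootstrap_pvalue(y, center0=6.0, stat=np.mean)
p_median = bootstrap_pvalue(y, center0=6.0, stat=np.median)
print(f"p-value (mean): {p_mean:.3f}, p-value (median): {p_median:.3f}")
```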

---
#### Task (2)
**Step 1: Define the Test Statistic**

- **For the mean**: Use the transformed data $h(Y)$ to compute the sample mean:

$$
T(Y) = \overline{h(Y)} = \frac{1}{n} \sum_{j=1}^{n} h(Y_j)
$$

- **For the median**: Use the transformed data $h(Y)$ to compute the sample median:

$$
T(Y) = \text{med}(h(Y))
$$

Under $H_0$, the symmetry of $h(Y)$ about $a$ ensures that $T(Y)$ reflects the central tendency of $h(Y)$, and its expected value aligns with $N$.

**Step 2: Compute the Observed Test Statistic**

Transform the data $h(Y)$ using the known function $h$. Compute the observed test statistic $T_{\text{obs}}$:

$$
T_{\text{obs}} =
\begin{cases}
\overline{h(Y)} & \text{for the mean} \\
\text{med}(h(Y)) & \text{for the median}
\end{cases}
$$

**Step 3: Center the Transformed Data**

Center the transformed data $h(Y)$ around its observed central tendency:

$$
h(Y)_{\text{centered}} = h(Y) - T_{\text{obs}} + N
$$

where $N$ is the hypothesized mean/median under $H_0$.

Resample $h(Y)_{\text{centered}}$ with replacement to generate bootstrap samples:

$$
h(Y)_{\text{centered}}^*
$$

For each bootstrap sample, compute the test statistic $T^*$:

$$
T^* =
\begin{cases}
\overline{h(Y)_{\text{centered}}^*} & \text{for the mean} \\
\text{med}(h(Y)_{\text{centered}}^*) & \text{for the median}
\end{cases}
$$

Repeat this $B$ times (e.g., $B = 1000$) to generate a bootstrap distribution of $T^*$.

**Step 4: Compute the $p$-Value**

Calculate the $p$-value as the proportion of bootstrap statistics $T^*$ that are at least as extreme as $T_{\text{obs}}$ under $H_0$:

$$
p = \frac{\# \left\{ T^* : \left| T^* - N \right| \geq \left| T_{\text{obs}} - N \right| \right\}}{B}
$$

**Step 5: Decision Rule**

If $p \leq \alpha$ (e.g., $\alpha = 0.1$), reject $H_0$: The mean or median significantly differs from $N$.

Otherwise, fail to reject $H_0$: No significant evidence to conclude $\mu(F_Y) \neq N$.
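
A sketch of the centered version, with hypothetical positive data and $h = \log$ chosen only for illustration (the problem's $h$ is assumed known but is not specified here). `centered_bootstrap_pvalue` is an illustrative name.

```python
import numpy as np


def centered_bootstrap_pvalue(y, h, center0, stat=np.mean, B=1000, seed=0):
    """Bootstrap p-value for H0: stat of h(Y) equals center0, resampling centered data."""
    rng = np.random.default_rng(seed)
    hy = h(np.asarray(y, dtype=float))
    t_obs = stat(hy)
    # Shift the transformed data so that its mean/median equals center0.
    hy_centered = hy - t_obs + center0
    idx = rng.integers(0, len(hy), size=(B, len(hy)))
    t_star = np.apply_along_axis(stat, 1, hy_centered[idx])
    # Proportion of bootstrap statistics at least as far from center0 as t_obs.
    return float(np.mean(np.abs(t_star - center0) >= abs(t_obs - center0)))


# Hypothetical data and transformation, for illustration only.
y = np.random.default_rng(2).lognormal(mean=2.0, sigma=0.5, size=40)
p_val = centered_bootstrap_pvalue(y, np.log, center0=2.0, stat=np.median)
print(f"p-value: {p_val:.3f}")
```

Centering matters because resampling the uncentered data would produce statistics scattered around $T_{\text{obs}}$, not around the value hypothesized under $H_0$.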

---

### Problem 6

#### Task (a)
We need to estimate the unknown label of the element $\omega^\circ$, which is drawn from one of two classes: $\Omega_1$ or $\Omega_2$.

$Y^\circ = 6 - \frac{8}{2} = 6 - 4 = 2$\
$b = 3 + N = 3 + 6 = 9$\
$p_1 = \frac{6}{6 + 8} = \frac{6}{14} \approx 0.4286$

The density of class $\Omega_1$ is:
$$
f_1(u) = \frac{1}{9} \left( 1 - \frac{|u|}{9} \right) \cdot 1\{ |u| < 9 \}
$$
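
As a quick check, the sketch below evaluates this triangular density at $Y^\circ = 2$ and verifies numerically that it integrates to one. The helper name `f1` mirrors the notation above; the grid is an arbitrary choice.

```python
import numpy as np


def f1(u, b=9.0):
    """Triangular density on (-b, b): (1/b) * (1 - |u|/b) for |u| < b, else 0."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) < b, (1.0 / b) * (1.0 - np.abs(u) / b), 0.0)


# Riemann sum over [-b, b]; the density vanishes at both endpoints.
grid = np.linspace(-9.0, 9.0, 18001)
du = grid[1] - grid[0]
total_mass = float(np.sum(f1(grid)) * du)
value_at_2 = float(f1(2.0))

print(f"integral = {total_mass:.6f}, f1(2) = {value_at_2:.5f}")
```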

A training sample $T = \{ Y_t \}_{t=1}^{20}$ is available for $\Omega_2$, and the differences for 16 of the closest observations to $Y^\circ$ are given as:

$$
\{-0.8, -1.1, -2.8, 3.4, -4.2, -2.7, 1.9, 2.9, -3.7, 2.0, 0.4, -1.4, 4.9, 5.7, -3.3, -4.4\}
$$

In [5]:
Y_circ = N - S / 2
b = 3 + N
p1 = N / (N + S)
p2 = 1 - p1

differences = np.array([
    -I2 / 5, -1.1, -2.8, 3.4, -4.2, -2.7, 1.9, 2.9, -3.7, 2.0, 0.4, -1.4, 4.9, 5.7, -3.3, -4.4
])

if abs(Y_circ) < b:
    f1_Y_circ = (1 / b) * (1 - abs(Y_circ) / b)
else:
    f1_Y_circ = 0

kde = gaussian_kde(differences)
f2_Y_circ = kde(Y_circ)

numerator_1 = p1 * f1_Y_circ
numerator_2 = p2 * f2_Y_circ
denominator = numerator_1 + numerator_2

P_Omega_1 = numerator_1 / denominator
P_Omega_2 = numerator_2 / denominator

classification = "Omega_1" if P_Omega_1 > P_Omega_2 else "Omega_2"

print("Classification Results:")
print(f"f1(Y^circ): {f1_Y_circ:.5f}")
print(f"f2(Y^circ): {f2_Y_circ.item():.5f}")
print(f"P(Omega_1 | Y^circ): {P_Omega_1.item():.5f}")
print(f"P(Omega_2 | Y^circ): {P_Omega_2.item():.5f}")
print(f"Classification: {classification}")

Classification Results:
f1(Y^circ): 0.08642
f2(Y^circ): 0.07545
P(Omega_1 | Y^circ): 0.46207
P(Omega_2 | Y^circ): 0.53793
Classification: Omega_2


- **$P(\Omega_1 \mid Y^\circ)$** = 0.46207: The probability that $\omega^\circ$ belongs to class $\Omega_1$ given $Y^\circ$.
- **$P(\Omega_2 \mid Y^\circ)$** = 0.53793: The probability that $\omega^\circ$ belongs to class $\Omega_2$ given $Y^\circ$.

Based on the posterior probabilities:

$$
P(\Omega_2 \mid Y^\circ) > P(\Omega_1 \mid Y^\circ)
$$

Since $P(\Omega_2 \mid Y^\circ) > P(\Omega_1 \mid Y^\circ)$, $\omega^\circ$ is classified into class $\Omega_2$.

---
#### Task (b)
The Epanechnikov kernel for $f_2$ is:
$$
K(u) = \frac{3}{4b} \left( 1 - \left( \frac{u}{b} \right)^2 \right) \cdot 1\{ |u| < b \}
$$

where $b = 3 + N = 9$.

Using the given training sample $T$ for $\Omega_2$, the kernel estimate $\hat{f}_2(Y^\circ)$ is computed as:

$$
\hat{f}_2(Y^\circ) = \frac{1}{n} \sum_{i=1}^{n} K\left( Y^\circ - Y_i \right)
$$

where $Y_i$ are the training observations for $\Omega_2$.
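
Since the kernel above already contains the bandwidth scaling ($1/b$ in front and $u/b$ inside), it is applied directly to the differences $Y^\circ - Y_i$. The sketch below checks that this scaled kernel integrates to one and shows the estimator on a small hypothetical sample; the function names are illustrative.

```python
import numpy as np


def epanechnikov(u, b):
    """Scaled Epanechnikov kernel: 3/(4b) * (1 - (u/b)^2) for |u| < b, else 0."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) < b, 0.75 / b * (1.0 - (u / b) ** 2), 0.0)


def kde_at(x0, sample, b):
    """Kernel density estimate at x0: average of the kernel at x0 - Y_i."""
    return float(np.mean(epanechnikov(x0 - np.asarray(sample, dtype=float), b)))


b = 9.0
grid = np.linspace(-b, b, 18001)
mass = float(np.sum(epanechnikov(grid, b)) * (grid[1] - grid[0]))

toy_sample = [-1.0, 0.5, 2.0, 3.5]  # hypothetical observations
print(f"kernel mass = {mass:.6f}, estimate at 2.0 = {kde_at(2.0, toy_sample, b):.5f}")
```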

In [6]:
def epanechnikov_kernel(u, b):
    return (3 / (4 * b)) * (1 - (u / b) ** 2) * (np.abs(u) < b)


f2_Y_circ_kernel = np.mean([epanechnikov_kernel((Y_circ - y_i), b) for y_i in differences])

if abs(Y_circ) < b:
    f1_Y_circ = (1 / b) * (1 - abs(Y_circ) / b)
else:
    f1_Y_circ = 0

numerator_1 = p1 * f1_Y_circ
numerator_2 = p2 * f2_Y_circ_kernel
denominator = numerator_1 + numerator_2

P_Omega_1 = numerator_1 / denominator
P_Omega_2 = numerator_2 / denominator

classification = "Omega_1" if P_Omega_1 > P_Omega_2 else "Omega_2"

print("Kernel Classification Results:")
print(f"f1(Y^circ): {f1_Y_circ:.5f}")
print(f"Kernel Estimate of f2(Y^circ): {f2_Y_circ_kernel:.5f}")
print(f"P(Omega_1 | Y^circ): {P_Omega_1:.5f}")
print(f"P(Omega_2 | Y^circ): {P_Omega_2:.5f}")
print(f"Classification: {classification}")

Kernel Classification Results:
f1(Y^circ): 0.08642
Kernel Estimate of f2(Y^circ): 0.06780
P(Omega_1 | Y^circ): 0.48874
P(Omega_2 | Y^circ): 0.51126
Classification: Omega_2


- **$P(\Omega_1 \mid Y^\circ)$** = 0.48874: The probability that $\omega^\circ$ belongs to class $\Omega_1$, considering the prior probability and class density $f_1$.
- **$P(\Omega_2 \mid Y^\circ)$** = 0.51126: The probability that $\omega^\circ$ belongs to class $\Omega_2$, considering the prior probability and kernel-estimated class density $f_2$.

Since:

$$
P(\Omega_2 \mid Y^\circ) > P(\Omega_1 \mid Y^\circ)
$$

The element $\omega^\circ$ is classified into class $\Omega_2$.

---