# Exercise 2: Bayes Risk with absolute loss

## Question 0
Let $f^*$ be the following function:

$
f^* :
\begin{cases}
\mathcal{X} \to \mathcal{Y}\\
x \mapsto x^3
\end{cases}
$

We can see that it has a zero derivative at $x=0$, but this point is not a local extremum.
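As a quick sanity check, here is a minimal sketch that evaluates the function and its hand-computed derivative $3x^2$ around $0$. The names `f_star`, `f_star_prime` and the step `eps` are only illustrative and are not part of the exercise.

```python
def f_star(x):
    """The example function x -> x^3."""
    return x ** 3


def f_star_prime(x):
    """Its derivative 3 x^2, computed by hand."""
    return 3 * x ** 2


eps = 1e-3  # illustrative small step around 0 (assumption)

# The derivative vanishes at x = 0 ...
derivative_at_zero = f_star_prime(0.0)

# ... but f is smaller just to the left of 0 and larger just to the right,
# so 0 is neither a local minimum nor a local maximum.
left_value, right_value = f_star(-eps), f_star(eps)

print(derivative_at_zero, left_value < f_star(0.0) < right_value)
```

The same inequality holds for any `eps > 0`, which is why $0$ cannot be an extremum.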

## Question 1

For this question let's take the same supervised learning setting as the first exercise: 

In [6]:
import numpy as np

SAMPLE_SIZE = 1000

# First we draw our input samples from the Poisson distribution
X = np.random.poisson(lam=2, size=SAMPLE_SIZE)

# Then Y is taken from the Gamma distribution with shape X and scale 1000
# (when X = 0 we set Y = 0)
Y = np.where(X > 0, np.random.gamma(shape=X, scale=1000), 0.0)

# Here we compute the predictions of the Bayes estimator for the squared loss,
# i.e. the conditional mean E[Y | X = x] = shape * scale = 1000 * x
Y_pred_Bayes = 1000 * X

# Here we create the predictions of f bar (the constant predictor 2000)
Y_pred_f_bar = np.full(SAMPLE_SIZE, 2000)

# Now we can compute the empirical risks of both estimators using the squared loss
# (np.mean already returns a scalar, so no extra .sum() is needed)
Bayes_emp_risk = np.mean((Y_pred_Bayes - Y) ** 2)
f_bar_emp_risk = np.mean((Y_pred_f_bar - Y) ** 2)

# We also compute the empirical risks using the absolute loss
f_bar_abs_risk = np.mean(np.abs(Y_pred_f_bar - Y))
Bayes_abs_risk = np.mean(np.abs(Y_pred_Bayes - Y))

print("Measured empirical risks with squared loss:")
print(f"Bayes: {Bayes_emp_risk}")
print(f"F bar: {f_bar_emp_risk}")
print("--------------------------------------------")
print("Measured empirical risks with absolute loss:")
print(f"F bar abs: {f_bar_abs_risk}")
print(f"Bayes: {Bayes_abs_risk}")

Measured empirical risks with squared loss:
Bayes: 2337430.0598996803
F bar: 4539020.707325756
--------------------------------------------
Measured empirical risks with absolute loss:
F bar abs: 1609.640461514917
Bayes: 1048.8806573790391


Here we can see that the empirical risks differ between the two losses, $R_{l_{absolute}} \neq R_{l_{squared}}$, for both the Bayes estimator and $\bar{f}$.

## Question 2

We want to find $f^*_{l_{absolute}}(x) = \underset{z \in \mathbb{R}}{\operatorname{argmin}}\ \mathbb{E}[|Y-z| \mid X=x]=\underset{z \in \mathbb{R}}{\operatorname{argmin}}\ g(z)$

with

$g(z) = \int_{y \in \mathbb{R}}|y - z|p_{Y|X=x}(y)dy$

\
\
**1. Let's find the minimum of $g(z)$ by finding its derivative:**

First, the integrand $|y - z|$ is not differentiable at $y=z$, so we split the integral at $y=z$:

$g(z) = \int_{-\infty}^{z}|y - z|p_{Y|X=x}(y)dy + \int_{z}^{+\infty}|y - z|p_{Y|X=x}(y)dy$

$g(z) = \int_{-\infty}^{z}(z - y)p_{Y|X=x}(y)dy + \int_{z}^{+\infty}(y - z)p_{Y|X=x}(y)dy$

\
\
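**2. Now we can compute the two derivatives:**

We write $F$ for the conditional cumulative distribution function of $Y \mid X=x$. By the Leibniz integral rule, the boundary terms vanish because the integrand is zero at $y=z$.

- When $y < z$:\
    $\frac{d}{dz}\int_{-\infty}^{z}(z - y)p_{Y|X=x}(y)dy = \int_{-\infty}^{z}p_{Y|X=x}(y)dy = F(z)$

- When $y > z$:\
    $\frac{d}{dz}\int_{z}^{+\infty}(y - z)p_{Y|X=x}(y)dy = -\int_{z}^{+\infty}p_{Y|X=x}(y)dy = -(1-F(z))$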

So adding the two gives:

$g^{'}(z) = F(z)-(1-F(z))$
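The formula for $g^{'}(z)$ can be checked numerically. The sketch below is only an illustration under stated assumptions: it picks a hypothetical conditional distribution (Gamma with shape 2 and scale 1000), replaces $g$ and $F$ by their empirical versions on a large sample, and compares a central finite difference of $g$ with $F(z)-(1-F(z)) = 2F(z)-1$. The helper names `g`, `F` and `g_prime_fd` are made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical conditional distribution of Y given X = x (assumption)
y = rng.gamma(shape=2.0, scale=1000.0, size=100_000)


def g(z):
    """Empirical version of g(z) = E[|Y - z| | X = x]."""
    return np.mean(np.abs(y - z))


def F(z):
    """Empirical conditional CDF at z."""
    return np.mean(y <= z)


def g_prime_fd(z, h=1e-3):
    """Central finite-difference approximation of g'(z)."""
    return (g(z + h) - g(z - h)) / (2 * h)


# Compare the finite difference with 2 F(z) - 1 at a few arbitrary points
zs = [500.0, 1000.0, 2000.0, 4000.0]
max_gap = max(abs(g_prime_fd(z) - (2 * F(z) - 1)) for z in zs)
print(max_gap)
```

The gap should be close to zero, since the empirical $g$ is piecewise linear with slope exactly $2F(z)-1$ between sample points.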

\
\
**3. We find the extremum with the derivative:**

At an extremum we have $g^{'}(z) = 0 \implies F(z) = 1 - F(z) \implies 2F(z) = 1 \implies F(z) = \frac{1}{2}$

The derivative vanishes when $F(z) = \frac{1}{2}$, which means that $z$ must be the median of the conditional distribution of $Y \mid X=x$.

\
\
**4. Let's check that the solution we found is a minimum:**

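- When $z < \text{median}$: $F(z) < \frac{1}{2}$, so $g^{'}(z) < 0$ and $g$ is decreasing
- When $z > \text{median}$: $F(z) > \frac{1}{2}$, so $g^{'}(z) > 0$ and $g$ is increasing

(assuming $F$ is strictly increasing around the median). So we have indeed found a minimum: the Bayes predictor for the absolute loss is the conditional median.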
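To illustrate the result, here is a minimal sketch that minimizes the empirical absolute risk over a grid of candidate values $z$ and compares the minimizer with the sample median. The distribution (Gamma with shape 2 and scale 1000), the grid and the names `abs_risk`, `candidates` and `z_best` are assumptions chosen for the example, not part of the exercise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical conditional distribution of Y given X = x (assumption)
y = rng.gamma(shape=2.0, scale=1000.0, size=100_000)


def abs_risk(z):
    """Empirical risk of the constant prediction z under the absolute loss."""
    return np.mean(np.abs(y - z))


# Grid search over candidate predictions, with a step of 10
candidates = np.linspace(0.0, 6000.0, 601)
risks = np.array([abs_risk(z) for z in candidates])
z_best = candidates[np.argmin(risks)]

# The grid minimizer should sit next to the sample median, not the sample mean
print(z_best, np.median(y), np.mean(y))
print(abs_risk(np.median(y)), abs_risk(np.mean(y)))
```

For this skewed distribution the median and the mean differ noticeably, so the conditional mean $1000x$ used in Question 1 would not be the minimizer under the absolute loss.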