### Exercise
Let $X\sim N(3, 16)$.
1. Find $\mathbb{P}(X < 7)$.
2. Find $\mathbb{P}(X > -2)$.
3. Find $x$ such that $\mathbb{P}(X > x) = 0.05$.
4. Find $\mathbb{P}(0\leqslant X < 4)$.
5. Find $x$ such that $\mathbb{P} ( |X| > |x| ) = 0.05$.

In [1]:
import numpy as np
from scipy.stats import norm

X = norm(loc=3, scale=4)

Theoretical computations

In [2]:
# cdf gives P(X <= x), sf is the survival function 1 - cdf,
# and ppf is the inverse cdf (quantile function).
print(
    f"1. P(X < 7) = {X.cdf(7):.2f}\n"
    f"2. P(X > -2) = {1 - X.cdf(-2):.2f}\n"
    f"   Alternatively, P(X > -2) = {X.sf(-2):.2f}\n"
    f"3. P(X > x) = 0.05 occurs for x = {X.ppf(1-0.05):.2f}\n"
    f"4. P(0 <= X < 4) = {X.cdf(4) - X.cdf(0):.2f}\n"
)

1. P(X < 7) = 0.84
2. P(X > -2) = 0.89
   Alternatively, P(X > -2) = 0.89
3. P(X > x) = 0.05 occurs for x = 9.58
4. P(0 <= X < 4) = 0.37
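
The same four values can be checked by hand by standardizing. As a sketch, write $Z = (X - 3)/4 \sim N(0, 1)$ and let $\Phi$ denote the standard normal CDF (notation introduced here, not used elsewhere in this exercise); the numerical values of $\Phi$ are the usual table values:

$$
\begin{aligned}
\mathbb{P}(X < 7) &= \Phi\!\left(\tfrac{7 - 3}{4}\right) = \Phi(1) \approx 0.8413, \\
\mathbb{P}(X > -2) &= 1 - \Phi\!\left(\tfrac{-2 - 3}{4}\right) = \Phi(1.25) \approx 0.8944, \\
\mathbb{P}(X > x) = 0.05 &\iff x = 3 + 4\,\Phi^{-1}(0.95) \approx 3 + 4 \times 1.6449 \approx 9.58, \\
\mathbb{P}(0 \leqslant X < 4) &= \Phi(0.25) - \Phi(-0.75) \approx 0.5987 - 0.2266 = 0.3721.
\end{aligned}
$$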



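The last item, number 5, is more delicate because the distribution of $X$ is *not* symmetric about zero, so the two tails $\{X > x\}$ and $\{X < -x\}$ carry different probabilities.
Writing $F$ for the CDF of $X$, we have $\mathbb{P}(|X| \leqslant x) = F(x) - F(-x)$ for $x \geqslant 0$, so we search a grid for the smallest $x$ for which $F(x) - F(-x) > 0.95$.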

In [3]:
xpoints = np.linspace(-10, 10, 1000)
ypoints = X.cdf(xpoints) - X.cdf(-xpoints)
x = xpoints[next(i for i, y in enumerate(ypoints) if y > 0.95)]
print(f"5. P(|X| > x) = 0.05 for x = {x:.2f}")

5. P(|X| > x) = 0.05 for x = 9.62
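
The grid search above resolves $x$ only up to the grid spacing (about 0.02) and returns the first grid point past the true solution. As a cross-check, the equation $F(x) - F(-x) = 0.95$ can also be solved with a bracketing root finder. The sketch below uses `scipy.optimize.brentq`; the helper name `coverage_gap` and the bracket `[0, 20]` are choices made for this illustration, not part of the original solution.

```python
from scipy.optimize import brentq
from scipy.stats import norm

X = norm(loc=3, scale=4)

def coverage_gap(x):
    """P(|X| <= x) - 0.95, which is increasing in x for x >= 0."""
    return X.cdf(x) - X.cdf(-x) - 0.95

# The gap is negative at x = 0 and positive at x = 20,
# so the bracket [0, 20] contains exactly one root.
x_root = brentq(coverage_gap, 0, 20)
print(f"5. P(|X| > x) = 0.05 for x = {x_root:.2f}")
```

The root found this way should be no larger than the grid value of 9.62, and the two should agree to within one grid step.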


Empirical validation

In [4]:
N = 1000000
samples = X.rvs(size=N)

In [5]:
# np.mean of a boolean array is the fraction of True entries.
print(
    f"1. {np.mean(samples < 7) * 100:.0f}% of the samples are below 7.\n"
    f"2. {np.mean(samples > -2) * 100:.0f}% of the samples are above -2.\n"
    f"3. {np.mean(samples > 9.58) * 100:.1f}% of the samples are above 9.58.\n"
    f"4. {np.mean((samples >= 0) & (samples < 4)) * 100:.0f}% of the samples are between 0 and 4.\n"
    f"5. {np.mean(np.abs(samples) > 9.62) * 100:.1f}% of the samples are below -9.62 or above 9.62.\n"
)


1. 84% of the samples are below 7.
2. 89% of the samples are above -2.
3. 5.0% of the samples are above 9.58.
4. 37% of the samples are between 0 and 4.
5. 5.0% of the samples are below -9.62 or above 9.62.



We note that, for the last item, the condition $|X| > 9.62$ is symmetric in form but does not exclude positive and negative samples equally, because $X$ is centered at 3 rather than 0. Indeed, the condition is much more likely to exclude positive samples than negative ones, as shown below.

In [6]:
print(
    f"{np.mean(samples < -9.62) * 100:.1f}% of the samples are below -9.62.\n"
    f"{np.mean(samples > 9.62) * 100:.1f}% of the samples are above 9.62."
)

0.1% of the samples are below -9.62.
4.9% of the samples are above 9.62.
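
These empirical tail frequencies can be compared with the exact tail probabilities of $N(3, 16)$. The following is a minimal sketch that takes the grid-search value 9.62 as given; the variable names `lower_tail` and `upper_tail` are introduced here for illustration only.

```python
from scipy.stats import norm

X = norm(loc=3, scale=4)
x = 9.62  # value found by the grid search for item 5

lower_tail = X.cdf(-x)  # P(X < -9.62)
upper_tail = X.sf(x)    # P(X > 9.62), i.e. 1 - cdf(9.62)

print(
    f"P(X < -{x}) = {lower_tail * 100:.1f}%\n"
    f"P(X > {x}) = {upper_tail * 100:.1f}%"
)
```

Because the mean sits at 3, the lower cutoff is more than three standard deviations from the center while the upper cutoff is fewer than two, which is why nearly all of the 5% falls in the upper tail.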
