# B7 Normal distribution

## Support libraries and functions

In [1]:
from scipy import stats
import m248_support as m248

## 1 Linear functions of normal variables

### Note

If a random variable $X$ has a normal distribution with mean $\mu$ and variance $\sigma^{2}$, and $Y=aX+b$ for constants $a$ and $b$, then

$$
Y=aX+b \sim N(a \mu + b, a^{2} \sigma^{2}).
$$
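The rule can be checked numerically without `m248_support`. The following is a minimal sketch using `statistics.NormalDist` from the Python standard library, which supports scaling and shifting a normal distribution. The values of `mu`, `var`, `a` and `b` are those of Example 7.1.1; the variable names are chosen here for illustration only.

```python
from math import isclose, sqrt
from statistics import NormalDist

# Parameters of X ~ N(4, 2) and the transformation Y = 0.2X + 5
mu, var = 4, 2
a, b = 0.2, 5

# NormalDist expects the standard deviation, not the variance
x = NormalDist(mu=mu, sigma=sqrt(var))

# Scaling and shifting a NormalDist returns another NormalDist
y = a * x + b

# Compare against E(Y) = a*mu + b and V(Y) = a^2 * var
print(isclose(y.mean, a * mu + b), isclose(y.variance, a**2 * var))
```

Note that the variance is scaled by $a^{2}$ and is unaffected by the shift $b$.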

### Example 7.1.1

Suppose that $X \sim N(4,2)$, and that the random variable $Y$ is related to $X$ by the equation

$$
Y = 0.2X + 5.
$$

**(a)** What is the mean of $Y$?

**(b)** What is the variance of $Y$?

#### (a)

If $Y=0.2X + 5$, then $E(Y) = 0.2E(X) + 5 = \cdots$

In [2]:
m248.get_y_mean(x_mean=4, a=0.2, b=5)

5.8

#### (b)

If $Y=0.2X + 5$, then $V(Y) = (0.2^{2}) 2 = \cdots$

In [3]:
round(m248.get_y_var(x_var=2, a=0.2), 4)

0.08

### Example 7.1.2

Suppose the fuel consumption of a car can be modelled by a normal distribution with mean 48 miles per gallon (mpg) and variance 4.5 mpg$^{2}$.

Given that 1 mpg corresponds to approximately 0.354 kilometres per litre (km/l), calculate

**(a)** the mean,

**(b)** the variance

of the distribution of the car’s fuel consumption in km/l.

#### Define the transformation

Let $G$ be a random variable that represents the fuel consumption of a car in miles per gallon, where $G \sim N(48, 4.5)$.

Since 1 mpg is approximately 0.354 km/l, the car's fuel consumption in km/l, $L$, is calculated as

$$
L = 0.354G.
$$

#### (a)

If $L=0.354G$, then $E(L) = 0.354E(G) = \cdots$

In [4]:
round(m248.get_y_mean(x_mean=48, a=0.354, b=0), 4)

16.992

#### (b)

If $L=0.354G$, then $V(L) = (0.354^{2}) 4.5 = \cdots$

In [5]:
round(m248.get_y_var(x_var=4.5, a=0.354), 4)

0.5639

### Example 7.1.3

Three independent random variables $X_{1}$, $X_{2}$ and $X_{3}$ are all normally distributed so that $X_{1} \sim N(4,6)$, $X_{2} \sim N(-8,2)$ and $X_{3} \sim N(-1,1)$.

Let another random variable $S$ be $S = X_{1} + X_{2} + X_{3}$.

**(a)** What is the mean of $S$?

**(b)** What is the variance of $S$?

#### Note

If $X_{1}, X_{2}, \ldots , X_{n}$ are independent normally distributed random variables, with means $\mu_{1}, \mu_{2}, \ldots , \mu_{n}$ and variances $\sigma_{1}^{2}, \sigma_{2}^{2}, \ldots , \sigma_{n}^{2}$, then

$$
X_{1} + X_{2} + \cdots + X_{n} \sim N(
    \mu_{1} + \mu_{2} + \cdots + \mu_{n}, \>
    \sigma_{1}^{2} + \sigma_{2}^{2} + \cdots + \sigma_{n}^{2}).
$$
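As a cross-check of this result, the sketch below adds three `statistics.NormalDist` objects from the standard library. Adding `NormalDist` objects treats them as independent, which matches the assumption in the note. The parameters are those of Example 7.1.3; the variable names are illustrative.

```python
from math import isclose, sqrt
from statistics import NormalDist

# Variances are 6, 2 and 1; NormalDist expects standard deviations
x1 = NormalDist(4, sqrt(6))
x2 = NormalDist(-8, sqrt(2))
x3 = NormalDist(-1, sqrt(1))

# The sum of independent normal variables is again normal:
# the means add, and the variances add
s = x1 + x2 + x3

print(s.mean, round(s.variance, 4))
```

The means add and the variances add; the standard deviations do not.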

#### (a)

If $S = X_{1} + X_{2} + X_{3}$, then

$$
E(S) = E(X_{1}) + E(X_{2}) + E(X_{3}) = \cdots
$$

In [6]:
4-8-1

-5

#### (b)

If $S = X_{1} + X_{2} + X_{3}$, then

$$
V(S) = V(X_{1}) + V(X_{2}) + V(X_{3}) = 6 + 2 + 1 = \cdots
$$

In [7]:
6+2+1

9

### Example 7.1.4

Two independent random variables $X$ and $Y$ are both normally distributed so that $X \sim N(-3,3)$ and $Y \sim N(-4,4)$.

**(a)** What is the mean of $X-Y$?

**(b)** What is the variance of $X-Y$?

#### (a)

If $X$ and $Y$ are two independent normally distributed random variables, then $X-Y$ is also normally distributed with $E(X-Y) = E(X) - E(Y)$, so

$$
E(X-Y) = E(X) - E(Y) = -3 - (-4) = 1.
$$

#### (b)

If $X$ and $Y$ are two independent normally distributed random variables, then $X-Y$ is also normally distributed with $V(X-Y) = V(X) + V(Y)$, so

$$
V(X-Y) = V(X) + V(Y) = 3 + 4 = 7.
$$

### Example 7.1.5

Let $X_{1}$ and $X_{2}$ denote the weights, in grams, of two randomly selected 750-gram bags of carrots, each of which is normally distributed with mean 751 and standard deviation 1.1. The random variable $Y = X_{1} + X_{2}$ is the total weight of these two bags of carrots.

Let $W$ denote the weight, in grams, of one randomly selected 1500-gram bag of carrots: $W$ is normally distributed with mean 1501 and standard deviation 2.

The random variables $X_{1}$ and $X_{2}$ can be assumed to be independent of each other (since the two bags are selected randomly), and random variables $Y$ and $W$ can also be considered independent of each other.

**(a)** What is the distribution of $Y$, the sum of two randomly selected 750-gram bags of carrots?

**(b)** What is the distribution of the difference $Y-W$ between two 750-gram bags of carrots and one 1500-gram bag of carrots, selected at random?

#### (a)

If $Y = X_{1} + X_{2}$, then

$$
E(Y) = E(X_{1}) + E(X_{2}) = 751 + 751 = 1502,
$$

and

$$
V(Y) = V(X_{1}) + V(X_{2}) = 1.1^{2} + 1.1^{2} = 2.42.
$$

So, $Y \sim N(1502, 2.42)$.

#### (b)

For the difference $Y - W$,

$$
E(Y-W) = E(Y) - E(W) = 1502 - 1501 = 1,
$$

and

$$
V(Y-W) = V(Y) + V(W) = 2.42 + 4 = 6.42.
$$

Therefore $Y-W \sim N(1, 6.42)$.
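Both parts can be reproduced with a short sketch based on `statistics.NormalDist` from the standard library, under the same independence assumptions as the example. The variable names are illustrative and not part of `m248_support`.

```python
from math import isclose
from statistics import NormalDist

# Weights of the two 750-gram bags and the one 1500-gram bag
x1 = NormalDist(751, 1.1)
x2 = NormalDist(751, 1.1)
w = NormalDist(1501, 2)

# (a) total weight of the two smaller bags
y = x1 + x2

# (b) difference: the means subtract, but the variances still add
d = y - w

print(round(y.mean, 2), round(y.variance, 2))
print(round(d.mean, 2), round(d.variance, 2))
```

The key point in part (b) is that subtracting independent variables adds their variances.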

### Example 7.1.6 (June 2018)

Independent random variables $X$ and $Y$ have normal distributions $X \sim N(2, 4)$ and $Y \sim N(1, 3)$.

State the distribution of $U = X - Y$.

If $U = X-Y$, then

$$
E(U) = E(X-Y) = E(X) - E(Y) = 2 - 1 = 1,
$$

and

$$
V(U) = V(X-Y) = V(X) + V(Y) = 4+3 = 7.
$$

Therefore $U \sim N(1,7)$.

## 2 Calculating probabilities

### Note

For $Z \sim N(0,1)$,

$$
P(Z \leq z) = P(Z < z) = \Phi{(z)},
$$

and because of the symmetry of $N(0,1)$, for any $z > 0$,

$$
P(Z \leq -z) = P(Z \geq z)
$$

so that

$$
\Phi(-z) = 1 - \Phi(z).
$$
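The symmetry identity can be verified numerically. The sketch below uses `statistics.NormalDist` from the standard library as a stand-in for the standard normal distribution; the chosen values of `z` are arbitrary.

```python
from math import isclose
from statistics import NormalDist

std_norm = NormalDist()  # N(0, 1)

for z in (0.5, 1.25, 2.0):
    lower_tail = std_norm.cdf(-z)      # Phi(-z) = P(Z <= -z)
    upper_tail = 1 - std_norm.cdf(z)   # 1 - Phi(z) = P(Z >= z)
    # The two tails agree up to floating-point error
    print(z, isclose(lower_tail, upper_tail))
```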

### Example 7.2.1

Let $Z$ be a random variable distributed $Z \sim N(0,1)$.

Calculate

**(a)** $P(-1.25 \leq Z \leq 2)$

**(b)** $P(-1.5 \leq Z \leq 0.5)$

**(c)** $P(-1.86 \leq Z \leq 1.26)$

In [8]:
# declare the distribution
x = stats.norm()

#### (a)

The probability

$$
P(-1.25 \leq Z \leq 2) = \Phi(2) - \Phi(-1.25)\> = \{ \Phi(2) - (1 - \Phi(1.25)) \} = \cdots
$$

In [9]:
round(x.cdf(x=2) - x.cdf(-1.25), 4)

0.8716

#### (b)

The probability

$$
P(-1.5 \leq Z \leq 0.5) = \Phi(0.5) - \Phi(-1.5) = \{ \Phi(0.5) - (1 - \Phi(1.5)) \} = \cdots
$$

In [10]:
round(x.cdf(x=0.5) - x.cdf(-1.5), 4)

0.6247

#### (c)

The probability

$$
P(-1.86 \leq Z \leq 1.26) = \Phi(1.26) - \Phi(-1.86) = \{ \Phi(1.26) - (1 - \Phi(1.86)) \} = \cdots
$$

In [11]:
round(x.cdf(x=1.26) - x.cdf(-1.86), 4)

0.8647

### Example 7.2.2

Suppose that the birth weight (in kilograms) of newborn babies in a population may be reasonably modelled by a normal distribution with mean 3.4 and standard deviation 0.5.

**(a)** According to the model, what is the proportion of birth weights that are between 2.9 and 3.9 kilograms?

**(b)** According to the model, what is the proportion of birth weights in the population of newborn babies that are less than 3.9 kilograms?

In [12]:
# declare the std normal distribution
std_norm = stats.norm()

#### (a)

The probability $P(2.9 < X < 3.9) = P(z_{1} < Z < z_{2})$, where $z_{1}$ is

In [13]:
z1 = m248.get_z_from_x(x=2.9, mean=3.4, std=0.5)
z1

-1.0

and $z_{2}$ is

In [14]:
z2 = m248.get_z_from_x(x=3.9, mean=3.4, std=0.5)
z2

1.0

Therefore $P(2.9 < X < 3.9) = P(-1 < Z < 1) = \Phi (1) - \Phi (-1) = \cdots$

In [15]:
round(std_norm.cdf(x=z2) - std_norm.cdf(x=z1), 4)

0.6827

#### (b)

The probability $P(X < 3.9) = P(Z < z)$, where $z$ is

In [16]:
z = m248.get_z_from_x(x=3.9, mean=3.4, std=0.5)
z

1.0

Therefore $P(X < 3.9) = P(Z<1) = \Phi(1) = \cdots$

In [17]:
round(std_norm.cdf(x=z), 4)

0.8413
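Standardising is not the only route to these probabilities. The sketch below compares the standardised calculation with evaluating the c.d.f. of $N(3.4, 0.5^{2})$ directly, using `statistics.NormalDist` from the standard library. The function `standardise` is a hypothetical helper written for this illustration; it is not the `m248.get_z_from_x` function, whose source is not shown here.

```python
from math import isclose
from statistics import NormalDist

mean, std = 3.4, 0.5
weight = NormalDist(mean, std)   # birth weight model
std_norm = NormalDist()          # N(0, 1)

def standardise(value, mean, std):
    """Hypothetical helper: return z = (value - mean) / std."""
    return (value - mean) / std

z1 = standardise(2.9, mean, std)
z2 = standardise(3.9, mean, std)

# (a) via standardisation, and directly on the original scale
p_via_z = std_norm.cdf(z2) - std_norm.cdf(z1)
p_direct = weight.cdf(3.9) - weight.cdf(2.9)

# (b) directly on the original scale
p_less = weight.cdf(3.9)

print(round(p_via_z, 4), round(p_direct, 4), round(p_less, 4))
```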

### Example 7.2.3

The ages (in years) of women in a clinical trial may reasonably be modelled by a normal distribution with mean 43.4 and standard deviation 1.4.

**(a)** According to the model, what proportion of ages of women in the clinical trial are between 40 and 43 years?

**(b)** According to the model, what proportion of ages of women in the clinical trial are less than 45 years?

In [18]:
# declare the std normal distribution
std_norm = stats.norm()

#### (a)

The probability $P(40 < X < 43) = P(z_{1} < Z < z_{2})$, where $z_{1}$ is

In [19]:
z1 = m248.get_z_from_x(x=40, mean=43.4, std=1.4)
z1

-2.43

and $z_{2}$ is

In [20]:
z2 = m248.get_z_from_x(x=43, mean=43.4, std=1.4)
z2

-0.29

Therefore $P(40 < X < 43) = P(-2.429 < Z < -0.286) = \Phi (-0.286) - \Phi (-2.429) = \cdots$

In [21]:
round(std_norm.cdf(x=z2) - std_norm.cdf(x=z1), 4)

0.3784
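The values of `z1` and `z2` displayed above are shown to two decimal places, whereas the text quotes them to three. Assuming the helper returns the rounded values (an assumption, since the source of `m248.get_z_from_x` is not shown here), the result depends slightly on that rounding. The sketch below compares the two calculations using `statistics.NormalDist` from the standard library.

```python
from statistics import NormalDist

std_norm = NormalDist()
mean, std = 43.4, 1.4

# Unrounded z-values
z1 = (40 - mean) / std
z2 = (43 - mean) / std

# Probability with unrounded and with two-decimal z-values
p_unrounded = std_norm.cdf(z2) - std_norm.cdf(z1)
p_rounded = std_norm.cdf(round(z2, 2)) - std_norm.cdf(round(z1, 2))

print(round(p_unrounded, 4), round(p_rounded, 4))
```

The two results differ in the third decimal place, so it is worth stating which convention is used when quoting an answer.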

#### (b)

The probability $P(X < 45) = P(Z < z)$, where $z$ is

In [22]:
z = m248.get_z_from_x(x=45, mean=43.4, std=1.4)
z

1.14

Therefore $P(X < 45) = P(Z < 1.143) = \Phi(1.143) = \cdots$

In [23]:
round(std_norm.cdf(x=z), 4)

0.8729

### Example 7.2.4

An IQ test is designed so that the scores attained in the general
population are normally distributed with mean 100
and standard deviation 15.

Calculate the probability that a randomly chosen person from the general
population will have an IQ between 90 and 125.

In [24]:
# declare the std normal distribution
std_norm = stats.norm()

The probability $P(90 < X < 125) = P(z_{1} < Z < z_{2})$, where $z_{1}$ is

In [25]:
z1 = m248.get_z_from_x(x=90, mean=100, std=15)
z1

-0.67

and $z_{2}$ is

In [26]:
z2 = m248.get_z_from_x(x=125, mean=100, std=15)
z2

1.67

Therefore $P(90 < X < 125) = \Phi (1.67) - \Phi (-0.67) = \cdots$

In [27]:
round(std_norm.cdf(x=z2) - std_norm.cdf(x=z1), 4)

0.7011

## 3 Calculating quantiles

### Example 7.3.1

Let $Z$ be a random variable distributed $Z \sim N(0,1)$.

Calculate

**(a)** $q_{0.996}$

**(b)** $q_{0.96}$

**(c)** $q_{0.6}$

**(d)** $q_{0.72}$

In [28]:
# declare the distribution
z = stats.norm()

#### (a)

In [29]:
round(z.ppf(q=0.996), 4)

2.6521

#### (b)

In [30]:
round(z.ppf(q=0.96), 4)

1.7507

#### (c)

In [31]:
round(z.ppf(q=0.6), 4)

0.2533

#### (d)

In [32]:
round(z.ppf(q=0.72), 4)

0.5828

### Example 7.3.2

Let $Z$ be a random variable distributed $Z \sim N(0,1)$.

Calculate

**(a)** $q_{0.003}$

**(b)** $q_{0.15}$

**(c)** $q_{0.22}$

**(d)** $q_{0.4}$

In [33]:
# declare the distribution
z = stats.norm()

#### Note

Because of the symmetry of $N(0,1)$, the quantile $q_{\alpha}$ for $\alpha < 0.5$ is given by $q_{\alpha} = - q_{1 - \alpha}$.
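This identity can be checked for the four values of $\alpha$ in this example. The sketch uses `inv_cdf` from `statistics.NormalDist` in the standard library, which plays the same role as `ppf` in `scipy.stats`.

```python
from math import isclose
from statistics import NormalDist

std_norm = NormalDist()  # N(0, 1)

for alpha in (0.003, 0.15, 0.22, 0.4):
    q_alpha = std_norm.inv_cdf(alpha)            # q_alpha
    q_mirror = -std_norm.inv_cdf(1 - alpha)      # -q_{1 - alpha}
    # The two agree up to floating-point error
    print(alpha, isclose(q_alpha, q_mirror, rel_tol=1e-6))
```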

#### (a)

In [34]:
round(z.ppf(q=0.003), 4)

-2.7478

#### (b)

In [35]:
round(z.ppf(q=0.15), 4)

-1.0364

#### (c)

In [36]:
round(z.ppf(q=0.22), 4)

-0.7722

#### (d)

In [37]:
round(z.ppf(q=0.4), 4)

-0.2533

### Example 7.3.3

Suppose that the birth weight (in kilograms) of newborn babies may reasonably be modelled by a normal distribution with mean 3.4 and standard deviation 0.5.

Calculate the value of the birth weight (in kilograms) that, according to this model, is exceeded by only 3% of newborn babies.

#### Note

Let $X$ be a normal random variable with distribution $X \sim N(\mu, \sigma^{2})$, and let $Z$ be a standard normal random variable, $Z \sim N(0, 1)$.

If $q_{\alpha}$ is the $\alpha$-quantile of $N(0,1)$, then the $\alpha$-quantile, $x$, of $N(\mu, \sigma^{2})$ is given by

$$
x = \sigma q_{\alpha} + \mu.
$$
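The formula can be compared with taking the quantile of $N(\mu, \sigma^{2})$ directly. The sketch below does this for the birth weight model of this example, using `statistics.NormalDist` from the standard library; the variable names are illustrative.

```python
from math import isclose
from statistics import NormalDist

mu, sigma = 3.4, 0.5
alpha = 0.97   # exceeded by only 3% means the 0.97-quantile

# Quantile of N(0, 1), then transform back to the original scale
q_alpha = NormalDist().inv_cdf(alpha)
x_via_formula = sigma * q_alpha + mu

# Quantile of N(mu, sigma^2) taken directly
x_direct = NormalDist(mu, sigma).inv_cdf(alpha)

print(round(x_via_formula, 2), round(x_direct, 2))
```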

In [38]:
# declare the distribution
w = stats.norm(loc=3.4, scale=0.5)

In [39]:
std_norm = stats.norm()

#### (a)

The $0.97$-quantile $q_{0.97}$ for $Z \sim N(0,1)$ is

In [43]:
a = round(std_norm.ppf(q=0.97), 3)
a

1.881

Therefore the $0.97$-quantile for $X \sim N(3.4, 0.5^{2})$ is

$$
x = 0.5 (1.881) + 3.4 = \cdots
$$

In [44]:
m248.get_x_from_a(a=a, mean=3.4, std=0.5)

4.34

### Example 7.3.4

Suppose that the size (in inches) of skipjack tuna may reasonably be modelled by a normal distribution with mean 22 and standard deviation 2.

According to the model,

**(a)** what is the interquartile range of the distribution of skipjack tuna in this population?

**(b)** what size $x$ is such that only 5% of skipjack tuna in the population are bigger than this size?

In [45]:
# declare the standard normal
std_norm = stats.norm()

#### (a)

The interquartile range (iqr) is defined as $q_{U} - q_{L} = q_{0.75} - q_{0.25}$. For the standard normal distribution, $q_{0.75} = \cdots$

In [52]:
q_U = round(std_norm.ppf(q=0.75), 4)
q_U

0.6745

and $q_{0.25} = \cdots$

In [51]:
q_L = round(std_norm.ppf(q=0.25), 4)
q_L

-0.6745

Therefore the iqr of the distribution of skipjack tuna in the population, in inches, will be

In [63]:
# upper quartile of skipjack
x_U = m248.get_x_from_a(a=q_U, mean=22, std=2)

In [64]:
# lower quartile of skipjack
x_L = m248.get_x_from_a(a=q_L, mean=22, std=2)

In [66]:
# get iqr
round(x_U - x_L, 3)

2.698
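The same interquartile range can be obtained without passing through the standard normal quartiles. The sketch below takes the quartiles of $N(22, 2^{2})$ directly with `statistics.NormalDist` from the standard library; it is offered as a cross-check, not as the method used by `m248_support`.

```python
from statistics import NormalDist

tuna = NormalDist(22, 2)   # size of skipjack tuna, in inches

# Lower and upper quartiles on the original scale
x_lower = tuna.inv_cdf(0.25)
x_upper = tuna.inv_cdf(0.75)

# Interquartile range
iqr = x_upper - x_lower
print(round(iqr, 3))
```

Because the quartiles of a normal distribution are symmetric about the mean, the interquartile range depends only on the standard deviation.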