## Exercise 6: Normal Distribution

In [174]:
import numpy as np
import scipy.stats as stats

# Problem 1


**a) Probability $P(Z < 1.71)$:**

We can look up the cumulative probability for $Z = 1.71$ in the standard normal distribution table.

Using a table or a calculator, we find that $P(Z < 1.71) \approx 0.9564$.

**b) Probability $P(1.68 < Z < 1.71)$:**

To find this probability, we can calculate the difference between $P(Z < 1.71)$ and $P(Z < 1.68)$.

Using the previous result from part (a), $P(Z < 1.71) \approx 0.9564$.

Now, we need to find $P(Z < 1.68)$ using the standard normal distribution table or calculator.

$P(Z < 1.68) \approx 0.9535$.

Finally, subtract $P(Z < 1.68)$ from $P(Z < 1.71)$ to get $P(1.68 < Z < 1.71)$:

$$P(1.68 < Z < 1.71) = P(Z < 1.71) - P(Z < 1.68)$$

$$P(1.68 < Z < 1.71) \approx 0.9564 - 0.9535 \approx 0.0029$$

So, $P(1.68 < Z < 1.71) \approx 0.0029$.

In [175]:
# For part (a): Probability P(Z < 1.71)
z_a = 1.71
probability_less_than_1_71 = stats.norm.cdf(z_a)

# For part (b): Probability P(1.68 < Z < 1.71)
z_lower_b = 1.68
probability_less_than_1_68 = stats.norm.cdf(z_lower_b)

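# Calculate the difference to find the probability between 1.68 and 1.71.
# Each CDF value is rounded to 4 decimals first to mirror the table lookup above;
# subtracting the unrounded values can differ in the last printed digit.
probability_between_1_68_and_1_71 = round(probability_less_than_1_71, 4) - round(probability_less_than_1_68, 4)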

print(f"Probability P(Z < 1.71): {probability_less_than_1_71:.4f}")
print(f"Probability P(1.68 < Z < 1.71): {probability_between_1_68_and_1_71:.4f}")


Probability P(Z < 1.71): 0.9564
Probability P(1.68 < Z < 1.71): 0.0029


# Problem 2

To find the probabilities for the normal distribution with a mean ($\mu$) of 3.3 and a standard deviation ($\sigma$) of 2.5, we can first standardize the random variable $X$ to a standard normal variable $Z$ and then use the standard normal distribution tables or a calculator.

**a) Probability $P(X > 2.5)$:**

First, standardize the value 2.5 to $Z$ using the formula:

$$Z = \frac{X - \mu}{\sigma}$$

Where:
- $X = 2.5$ (the value we want to find the probability for)
- $\mu = 3.3$ (mean)
- $\sigma = 2.5$ (standard deviation)

Now, calculate $Z$:

$$Z = \frac{2.5 - 3.3}{2.5} = -0.32$$

Using this value, $P(X > 2.5) = P(Z > -0.32)$, which we can find from the standard normal distribution tables.

In [176]:

# Parameters for the original distribution
mu = 3.3
sigma = 2.5

# Value we want to find the probability for
x = 2.5

# Standardize x to Z
z_calculated = (x - mu) / sigma

# Calculate the probability P(X > 2.5)
probability_x_greater_than_2_5 = 1 - stats.norm.cdf(z_calculated)

print(f"Probability P(X > 2.5): {probability_x_greater_than_2_5:.4f}")

Probability P(X > 2.5): 0.6255


**b) Probability $P(1.5 < X < 2.5)$:**

Standardize both values 1.5 and 2.5 to $Z$ using the same formula as above. Then, we can find $P(Z_{\text{lower}} < Z < Z_{\text{upper}})$ from the standard normal distribution tables.

In [177]:
# Values we want to find the probability for
x_lower = 1.5
x_upper = 2.5

# Standardize x_lower and x_upper to Z
z_lower = (x_lower - mu) / sigma
z_upper = (x_upper - mu) / sigma

# Calculate the probability P(1.5 < X < 2.5)
probability_between_1_5_and_2_5 = stats.norm.cdf(z_upper) - stats.norm.cdf(z_lower)

print(f"Probability P(1.5 < X < 2.5): {probability_between_1_5_and_2_5:.4f}")

Probability P(1.5 < X < 2.5): 0.1387
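
As a cross-check on the manual standardization above, the same probabilities can be obtained by passing the mean and standard deviation directly to `scipy.stats.norm` through its `loc` and `scale` arguments. This is only a sketch of an equivalent route; the variable names `p_a` and `p_b` are illustrative and not part of the original solution.

```python
from scipy import stats

mu, sigma = 3.3, 2.5

# sf is the survival function, i.e. 1 - cdf; loc/scale are the mean and standard deviation
p_a = stats.norm.sf(2.5, loc=mu, scale=sigma)

# P(1.5 < X < 2.5) as a difference of two CDF values on the original scale
p_b = stats.norm.cdf(2.5, loc=mu, scale=sigma) - stats.norm.cdf(1.5, loc=mu, scale=sigma)

print(f"P(X > 2.5): {p_a:.4f}")
print(f"P(1.5 < X < 2.5): {p_b:.4f}")
```

Both routes should agree, since `loc` and `scale` perform the same standardization internally.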


# Problem 3

To solve these probability questions, we can use the normal distribution with the given mean ($\mu$) and standard deviation ($\sigma$).

**a) Probability that the sample value exceeds the threshold value $P(Y \geq 12)$:**

First, standardize the threshold value of 12 μg/dm³ to a z-score using the formula:

$$Z = \frac{Y - \mu}{\sigma}$$

Where:
- $Y = 12$ μg/dm³ (the threshold value)
- $\mu = 13$ μg/dm³ (mean)
- $\sigma = 2.0$ μg/dm³ (standard deviation)

Now, calculate $Z$:

$$Z = \frac{12 - 13}{2.0} = -0.5$$

Using this value of $Z$, we can find $P(Y \geq 12)$ from the standard normal distribution tables or using a calculator.

In [178]:
import scipy.stats as stats

# Parameters for the distribution
mu = 13  # μg/dm³
sigma = 2.0  # μg/dm³

# Threshold value
threshold_value = 12  # μg/dm³

# Standardize the threshold value to Z
z_calculated = (threshold_value - mu) / sigma

# Calculate the probability P(Y ≥ 12)
probability_y_geq_12 = 1 - stats.norm.cdf(z_calculated)

print(f"Probability P(Y ≥ 12): {probability_y_geq_12:.3f}")

Probability P(Y ≥ 12): 0.691


**b) Probability that the sample value is greater than 14 μg/dm³ given that it exceeds the threshold value $P(Y > 14 | Y > 12)$:**

First, standardize the value 14 μg/dm³ to a z-score using the same formula as above.

Now, calculate $P(Y > 14 | Y > 12)$:

$$P(Y > 14 | Y > 12) = \frac{P(Y > 14 \cap Y > 12)}{P(Y > 12)}$$

Because the event $Y > 14$ implies $Y > 12$, the intersection reduces to the smaller event: $P(Y > 14 \cap Y > 12) = P(Y > 14)$.

So, we have:

$$P(Y > 14 | Y > 12) = \frac{P(Y > 14)}{P(Y > 12)}$$

In [179]:
# Value we want to find the probability for
value_greater_than_14 = 14  # μg/dm³

# Standardize the value to Z
z_greater_than_14 = (value_greater_than_14 - mu) / sigma

# Calculate the probability P(Y > 14 | Y > 12)
probability_y_gt_14_given_y_gt_12 = (1 - stats.norm.cdf(z_greater_than_14)) / (1 - stats.norm.cdf(z_calculated))

print(f"Probability P(Y > 14 | Y > 12): {probability_y_gt_14_given_y_gt_12:.3f}")

Probability P(Y > 14 | Y > 12): 0.446
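
The conditional probability can also be sanity-checked by simulation. The sketch below assumes the same normal model ($\mu = 13$, $\sigma = 2.0$); the seed and the sample size are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Draw many sample values from the assumed N(13, 2.0^2) model
samples = rng.normal(loc=13.0, scale=2.0, size=1_000_000)

# Keep only the draws that exceed the threshold, then see how many of those exceed 14
above_threshold = samples[samples > 12.0]
estimate = np.mean(above_threshold > 14.0)

print(f"Simulated P(Y > 14 | Y > 12): {estimate:.3f}")
```

The simulated value should land close to the exact ratio computed above, up to Monte Carlo noise.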


# Problem 4

To solve these problems, we can use the binomial distribution for part (a) and the normal distribution for parts (b) and (c) with the given parameters.

**a) For the binomially distributed variable X, calculate $P(X \le 8)$:**

We can calculate this directly using the binomial cumulative distribution function (CDF) for $X$ with $n = 35$ and $p = 0.2$. In Python, we can use the `scipy.stats` library to compute the CDF.

In [180]:
n = 35
p = 0.2
x_a = 8

probability_x_leq_8 = stats.binom.cdf(x_a, n, p)

print(f"P(X ≤ 8): {probability_x_leq_8:.3f}")

P(X ≤ 8): 0.745


**b) For the normally distributed variable Y, calculate $P(Y \le 8)$:**

We can calculate this using the cumulative distribution function (CDF) of a normal distribution whose mean and standard deviation match those of $X$: $\mu = np = 7$ and $\sigma = \sqrt{np(1-p)} \approx 2.37$.

In [181]:
mu = n * p
sigma = (n * p * (1 - p))**0.5
x_b = 8

probability_y_leq_8 = stats.norm.cdf(x_b, loc=mu, scale=sigma)

print(f"P(Y ≤ 8): {probability_y_leq_8:.3f}")

P(Y ≤ 8): 0.664


**c) For the normally distributed variable Y, calculate $P(Y \le 8.5)$:**

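Calculate this by finding the cumulative probability up to $x = 8.5$ using the normal distribution. Evaluating at 8.5 rather than 8 applies a continuity correction, which brings the normal approximation closer to the binomial result in part (a).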

In [182]:
x_c = 8.5

probability_y_leq_8_5 = stats.norm.cdf(x_c, loc=mu, scale=sigma)

print(f"P(Y ≤ 8.5): {probability_y_leq_8_5:.3f}")

P(Y ≤ 8.5): 0.737
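
To make the comparison between the three parts explicit, the sketch below puts the exact binomial value next to the two normal approximations and reports how far each one is from the exact result. The names `exact`, `approx_plain` and `approx_corrected` are illustrative only.

```python
from scipy import stats

n, p = 35, 0.2
mu = n * p
sigma = (n * p * (1 - p)) ** 0.5

exact = stats.binom.cdf(8, n, p)                             # part (a)
approx_plain = stats.norm.cdf(8, loc=mu, scale=sigma)        # part (b)
approx_corrected = stats.norm.cdf(8.5, loc=mu, scale=sigma)  # part (c)

print(f"Exact binomial:           {exact:.3f}")
print(f"Normal at 8:              {approx_plain:.3f} (error {abs(approx_plain - exact):.3f})")
print(f"Normal at 8.5 (corrected): {approx_corrected:.3f} (error {abs(approx_corrected - exact):.3f})")
```

The half-unit shift accounts for the fact that the discrete value 8 covers the interval from 7.5 to 8.5 on the continuous scale.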


# Problem 5

**a) Probability that a randomly selected aluminum plate does not comply with the specification:**

The specification is that the weight should not deviate by more than ±1 gram. To find the probability that a plate does not comply, we need to find the probability that the weight deviates by more than ±1 gram from 100 grams. In other words, we want to find $P(|X - 100| > 1)$, where $X$ is the weight of a plate.

Since the weight follows a normal distribution with $\mu = 100$ grams and $\sigma = 0.6$ grams, we can standardize this to the standard normal distribution and find the probability.


In [183]:
import scipy.stats as stats

mu = 100  # mean
sigma = 0.6  # standard deviation

# Calculate the z-scores for the upper and lower bounds
z_upper = (101 - mu) / sigma
z_lower = (99 - mu) / sigma

# Use the z-scores to find the probabilities
probability_not_comply = 1 - (stats.norm.cdf(z_upper) - stats.norm.cdf(z_lower))

print(f"Probability that a plate does not comply: {probability_not_comply:.4f}")

Probability that a plate does not comply: 0.0956


**b) Expected weight (in grams) of the box of 25 aluminum plates:**

The expected weight of a single aluminum plate is 100 grams. Since there are 25 plates in a box, the expected weight of the box is $25 \times 100 + 50\text{ (weight of the box)} = 2550$ grams.

**c) Variance (in grams²) of the box of 25 aluminum plates:**

The variance of a single aluminum plate is $\sigma^2 = (0.6)^2 = 0.36$ grams². Since the plates are independent, the variance of the box of 25 plates is $25 \times 0.36 = 9$ grams².
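
The arithmetic in parts (b) and (c) can be written out as a short sketch. It assumes, as the text above does, independent plates and a box that weighs exactly 50 grams; the variable names are illustrative.

```python
n_plates = 25
mu_plate = 100.0     # expected weight of one plate (grams)
sigma_plate = 0.6    # standard deviation of one plate (grams)
box_tare = 50.0      # weight of the empty box (grams), treated as a constant

# Expectations add, and the constant box weight shifts the total
expected_box = n_plates * mu_plate + box_tare

# Variances of independent plates add; the constant box weight contributes none
variance_box = n_plates * sigma_plate**2
std_box = variance_box ** 0.5

print(f"Expected weight: {expected_box:.0f} g")
print(f"Variance: {variance_box:.2f} g^2, standard deviation: {std_box:.2f} g")
```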

**d) Probability that the box of 25 aluminum plates weighs less than 2545 grams:**

To find this probability, we need to calculate $P(X < 2545)$, where $X$ now denotes the total weight of the box with 25 aluminum plates. As a sum of independent normal variables plus a constant, the total weight is itself normally distributed.

The variance of the box is $25 \cdot (0.6)^2 = 9\text{ grams}^2$, so its standard deviation is $\sqrt{9} = 3\text{ grams}$.

In [184]:
box_weight = 25 * mu + 50  # Expected total weight: 25 plates plus the 50 g box
box_std_dev = 3  # Standard deviation of the total weight: sqrt(25 * 0.6**2)

# Calculate the z-score for 2545 grams
z = (2545 - box_weight) / box_std_dev

# Use the z-score to find the probability
probability_box_weight_less_than_2545 = stats.norm.cdf(z)

print(f"Probability that the box weighs less than 2545 grams: {probability_box_weight_less_than_2545:.4f}")


Probability that the box weighs less than 2545 grams: 0.0478


# Problem 6

The central limit theorem tells us that the distribution of the sample mean ($\bar{X}$) for a sufficiently large sample size (in this case, 33 machines) will be approximately normally distributed, regardless of the underlying distribution of individual machine stops.

Given that the individual machines have a Poisson distribution with an expected value of 4, the expected value of the sample mean ($\bar{X}$) is equal to the population mean ($\mu$):

$$\text{Expected value of } \bar{X} = \mu = 4$$

Now, for the standard deviation of the sample mean ($\bar{X}$), you can use the following formula:

$$\text{Standard Deviation of } \bar{X} = \frac{\sigma}{\sqrt{n}}$$

Where:
- $\sigma$ is the standard deviation of the individual machine stops, which is the square root of the expected value for a Poisson distribution: $\sigma = \sqrt{4} = 2$.
- $n$ is the sample size (the number of machines), which is 33 in this case.

Plugging in these values:

$$\text{Standard Deviation of } \bar{X} = \frac{2}{\sqrt{33}} \approx 0.348$$

In [185]:
mu = 4  # Expected value of individual machine stops
n = 33  # Number of machines

expected_value_sample_mean = mu
std_dev_sample_mean = np.sqrt(mu) / np.sqrt(n)

print(f"Expected value of sample mean: {expected_value_sample_mean:.3f}")
print(f"Standard Deviation of sample mean: {std_dev_sample_mean:.3f}")

Expected value of sample mean: 4.000
Standard Deviation of sample mean: 0.348
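
A small simulation can illustrate these two values. The sketch below repeatedly draws the number of stops for 33 machines from a Poisson distribution with expected value 4 and looks at the spread of the resulting means; the seed and the number of repetitions are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

mu = 4               # expected number of stops per machine
n = 33               # number of machines
n_repeats = 100_000  # number of simulated sets of 33 machines

# Each row is one simulated set of 33 machines; take the mean of every row
sample_means = rng.poisson(lam=mu, size=(n_repeats, n)).mean(axis=1)

print(f"Simulated mean of sample mean: {sample_means.mean():.3f}")
print(f"Simulated std dev of sample mean: {sample_means.std(ddof=1):.3f}")
```

The simulated values should sit close to $\mu = 4$ and $\sigma/\sqrt{n} \approx 0.348$, and a histogram of `sample_means` would look roughly bell-shaped.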


# Problem 7

To find the probability that the mean of a sample of size 11 falls between 67.7 cm and 68.3 cm, given that the process is under control, we can use the properties of the normal distribution. 

First, we need to calculate the standard deviation of the sample mean ($\bar{X}$) for a sample size of 11. The standard deviation of the sample mean ($\sigma_{\bar{X}}$) is given by:

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

Where:
- $\sigma$ is the standard deviation of the individual measurements, which is 0.5 cm.
- $n$ is the sample size, which is 11 in this case.

Plugging in these values:

$$\sigma_{\bar{X}} = \frac{0.5}{\sqrt{11}}$$

Now, we can calculate the z-scores for both 67.7 cm and 68.3 cm:

$$Z_{\text{lower}} = \frac{67.7 - \mu}{\sigma_{\bar{X}}}$$
$$Z_{\text{upper}} = \frac{68.3 - \mu}{\sigma_{\bar{X}}}$$

Where:
- $\mu$ is the population mean, which is 68 cm.
- $\sigma_{\bar{X}}$ is the standard deviation of the sample mean.

Now, we can use the z-scores to find the probabilities:

$$P(67.7 \le \bar{X} \le 68.3) = P(Z_{\text{lower}} \le Z \le Z_{\text{upper}})$$

In [186]:
mu = 68  # population mean
sigma = 0.5  # standard deviation of individual measurements
n = 11  # sample size

# Calculate the standard deviation of the sample mean
std_dev_sample_mean = sigma / (n**0.5)

# Calculate the z-scores for 67.7 cm and 68.3 cm
z_lower = (67.7 - mu) / std_dev_sample_mean
z_upper = (68.3 - mu) / std_dev_sample_mean

# Use the z-scores to find the probability
probability_between_67_7_and_68_3 = stats.norm.cdf(z_upper) - stats.norm.cdf(z_lower)

print(f"Probability between 67.7 cm and 68.3 cm: {probability_between_67_7_and_68_3:.3f}")

Probability between 67.7 cm and 68.3 cm: 0.953


# Problem 8

To find the highest speed you can go without being stopped, you need to find the critical value (z-score) of the normal distribution that corresponds to the 80th percentile. In other words, you want to find the speed ($X$) such that only 20% of motorists drive faster than $X$.

Given the information:
- The speed of motorists on this road follows a normal distribution with an expectation ($\mu$) of 90 km/h and a standard deviation ($\sigma$) of 6 km/h.

To find $X$, you can use the cumulative distribution function (CDF) of the standard normal distribution, denoted $\Phi(z)$, where $z$ is the z-score.

The z-score is calculated as:

$$z = \frac{X - \mu}{\sigma}$$

You want to find the z-score ($z$) such that $\Phi(z) = 0.80$, since only the fastest 20% of motorists should lie above it.

Now, you can use the z-score to find $X$:

$$X = \mu + z \cdot \sigma$$

Substitute the given values:

$$X = 90 + z \cdot 6$$

To find $z$ for $\Phi(z) = 0.80$, you can use the percent point function (PPF) or quantile function, which is the inverse of the CDF:

$$z = \text{PPF}(0.80)$$

In [187]:
mean = 90  # Mean speed (in km/h)
std_dev = 6  # Standard deviation of speed (in km/h)
percentile = 0.80  # Desired percentile (80th percentile)

# Use the percent point function (PPF) to find the critical value (z-score)
critical_value = stats.norm.ppf(percentile)

# Calculate the highest speed without being stopped
highest_speed = mean + (critical_value * std_dev)

print(f"Highest speed without being stopped: {highest_speed:.3f} km/h")


Highest speed without being stopped: 95.050 km/h
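
One way to check the result is to plug the computed speed back into the survival function and confirm that about 20% of motorists drive faster. This sketch passes the mean and standard deviation directly via `loc` and `scale` instead of converting from a z-score; `share_faster` is an illustrative name.

```python
from scipy import stats

mean, std_dev = 90, 6

# 80th percentile on the original km/h scale
highest_speed = stats.norm.ppf(0.80, loc=mean, scale=std_dev)

# Share of motorists expected to drive faster than that speed
share_faster = stats.norm.sf(highest_speed, loc=mean, scale=std_dev)

print(f"Highest speed: {highest_speed:.3f} km/h")
print(f"Share driving faster: {share_faster:.3f}")
```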


# Problem 9

To find the probability that a randomly selected nut can be screwed onto a randomly selected bolt, we need to calculate the probability that the nut's diameter is greater than the bolt's diameter. In other words, we want to find $P(\text{Nut Diameter} > \text{Bolt Diameter})$, which can be rewritten as $P([\text{Nut Diameter} - \text{Bolt Diameter}] > 0)$. We can simplify this further by writing $Y = \text{Nut Diameter} - \text{Bolt Diameter}$:

$$P(Y > 0)$$

Let's write the information given:
- $\mu_B$ = Expectation (mean) of bolt diameter = 15.8 mm
- $\sigma_B$ = Standard deviation of bolt diameter = 0.07 mm
- $\mu_N$ = Expectation (mean) of nut diameter = 16.0 mm
- $\sigma_N$ = Standard deviation of nut diameter = 0.07 mm

We can now calculate the mean of $Y$ ($\mu_Y$):

$$\mu_Y = \mu_N - \mu_B = 16.0\text{ mm} - 15.8\text{ mm} = 0.2\text{ mm}$$

Then, assuming the nut and bolt diameters are independent, the standard deviation of $Y$ ($\sigma_Y$):

$$\sigma_Y = \sqrt{\sigma_N^2 + \sigma_B^2} = \sqrt{0.07^2 + 0.07^2} \approx 0.099\text{ mm}$$

We can now use the cumulative distribution function (CDF) of the normal distribution:

$$P(Y > 0) = 1 - P(Y \le 0)$$

The CDF of a normal distribution with mean $\mu$ and standard deviation $\sigma$ is given by:

$$F(x) = \frac{1}{2}\cdot\left[1 + \operatorname{erf}\left(\frac{x - \mu}{\sigma \cdot \sqrt{2}}\right)\right]$$

The error function $\operatorname{erf}(z)$ is defined as:
$$\operatorname{erf}(z)={\frac {2}{\sqrt {\pi }}}\int _{0}^{z}e^{-t^{2}}\,\mathrm {d} t$$

In [188]:
# Define the parameters
mu_B = 15.8
sigma_B = 0.07
mu_N = 16.0
sigma_N = 0.07

mean_Y = mu_N - mu_B  # Mean of Y (0.2 mm)
std_dev_Y = (sigma_N**2 + sigma_B**2) ** 0.5  # Standard deviation of Y (about 0.099 mm), assuming independence

# Calculate P(Y > 0)
prob_Y_greater_than_zero = 1 - stats.norm.cdf(0, loc=mean_Y, scale=std_dev_Y)

# Print the result
print(f'The probability that a randomly selected nut can be screwed onto a randomly selected bolt is: {prob_Y_greater_than_zero:.3f}')

The probability that a randomly selected nut can be screwed onto a randomly selected bolt is: 0.978
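
The erf-based CDF formula given earlier in this problem can be evaluated directly with the standard library and compared with SciPy. This is a minimal sketch; `normal_cdf` is a helper defined here for illustration, not a SciPy function.

```python
import math
from scipy import stats

def normal_cdf(x, mu, sigma):
    """Normal CDF written in terms of the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

mu_Y = 16.0 - 15.8
sigma_Y = math.sqrt(0.07**2 + 0.07**2)

# P(Y > 0) from the erf formula and from SciPy's survival function
p_erf = 1.0 - normal_cdf(0.0, mu_Y, sigma_Y)
p_scipy = stats.norm.sf(0.0, loc=mu_Y, scale=sigma_Y)

print(f"erf formula: {p_erf:.3f}")
print(f"scipy:       {p_scipy:.3f}")
```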


# Problem 10


**a) What proportion of the production must the company expect to be defective?**

Cylinders with compressive strength below 180.5 kg/cm² are considered defective. To find the proportion of defective cylinders, we need to calculate the probability that a cylinder has a compressive strength below 180.5 kg/cm² using the given normal distribution parameters.

In [189]:
mean = 199.0  # Mean compressive strength (in kg/cm²)
std_dev = 9.5  # Standard deviation (in kg/cm²)
cutoff = 180.5  # Cutoff for defective cylinders (in kg/cm²)

# Calculate the z-score for the cutoff value
z = (cutoff - mean) / std_dev

# Use the z-score to find the probability of being defective
probability_defective = stats.norm.cdf(z)

print(f"Proportion of production expected to be defective: {probability_defective:.4f}")

Proportion of production expected to be defective: 0.0257


**b) What is the probability that such a controlled cylinder has a compressive strength below 187.625 kg/cm²?**

Now, let's find the probability that a controlled cylinder (one that has already withstood the control load of 180.5 kg/cm²) has a compressive strength below 187.625 kg/cm².

First, standardize the control load of 180.5 kg/cm² and the target strength of 187.625 kg/cm² to z-scores using the formula:

$$Z = \frac{Y - \mu}{\sigma}$$

Now, calculate $P(Y < 187.625 | Y > 180.5)$:

$$P(Y < 187.625 | Y > 180.5) = \frac{P(Y < 187.625 \cap Y > 180.5)}{P(Y > 180.5)}$$

Using the complement rule for conditional probabilities:

$$P(Y < 187.625 | Y > 180.5) = 1 - P(Y > 187.625 | Y > 180.5)$$

Since $Y > 187.625$ implies $Y > 180.5$, we have:

$$P(Y < 187.625 | Y > 180.5) = 1 - \frac{P(Y > 187.625)}{P(Y > 180.5)}$$

In [204]:
controlled_load = 180.5  # Control load applied to every cylinder (in kg/cm²)
target_strength = 187.625  # Target compressive strength (in kg/cm²)

# Calculate the z-scores for the control load and the target strength
z_control = (controlled_load - mean) / std_dev
z_target = (target_strength - mean) / std_dev

# P(Y < target | Y > control load) = 1 - P(Y > target) / P(Y > control load)
probability_below_target = 1 - ((1 - stats.norm.cdf(z_target)) / (1 - stats.norm.cdf(z_control)))

print(f"Probability that a controlled cylinder has strength below 187.625 kg/cm²: {probability_below_target:.4f}")

Probability that a controlled cylinder has strength below 187.625 kg/cm²: 0.0922


**c) What is the probability that a customer who buys 12 controlled cylinders gets at most 2 with a compressive strength below 187.625 kg/cm²?**

To find the probability that a customer who buys 12 controlled cylinders gets at most 2 with a compressive strength below 187.625 kg/cm², we can use the binomial distribution, assuming the cylinders are independent:

$$P(X = k) = \binom{n}{k} \cdot p^k \cdot (1 - p)^{n - k}$$

where:
- $n$ is the number of controlled cylinders: $12$
- $k$ is the number of cylinders below the target strength
- $p$ is the probability of one cylinder being below target (the output of the previous code block, about $0.0922$)

Since "at most 2" covers $k = 0, 1, 2$, the answer is the cumulative probability $P(X \le 2) = P(X = 0) + P(X = 1) + P(X = 2)$.

In [191]:
n = 12  # Number of controlled cylinders purchased
p = probability_below_target  # Probability of a single cylinder being below target

# Use the cumulative distribution function (CDF) of the binomial distribution
probability_at_most_2_below_target = stats.binom.cdf(2, n, p)

print(f"Probability of getting at most 2 cylinders below 187.625 kg/cm²: {probability_at_most_2_below_target:.4f}")

Probability of getting at most 2 cylinders below 187.625 kg/cm²: 0.9083
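
To connect the cumulative result with the binomial formula, the sketch below adds up the individual terms for $k = 0, 1, 2$ and compares the sum with `stats.binom.cdf`. It recomputes $p$ from the normal parameters so that it runs on its own; the names `terms` and `total` are illustrative.

```python
from scipy import stats

mean, std_dev = 199.0, 9.5

# P(Y < 187.625 | Y > 180.5), written with the survival function sf = 1 - cdf
p = 1 - stats.norm.sf(187.625, loc=mean, scale=std_dev) / stats.norm.sf(180.5, loc=mean, scale=std_dev)

n = 12

# Individual binomial probabilities for k = 0, 1, 2
terms = [stats.binom.pmf(k, n, p) for k in range(3)]
total = sum(terms)

for k, term in enumerate(terms):
    print(f"P(X = {k}) = {term:.4f}")
print(f"Sum: {total:.4f}")
```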
