### The Standard Normal Distribution

The standard normal distribution, also called the z-distribution, is a special normal distribution where the mean is 0 and the standard deviation is 1.

Any normal distribution can be standardized by converting its values into z-scores. Z-scores tell you how many standard deviations from the mean each value lies.

### How to calculate a z-score

![image.png](attachment:image.png)
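
The conventional z-score formula is $z = (x - \mu) / \sigma$, where $\mu$ is the mean and $\sigma$ is the standard deviation. Below is a minimal sketch of that calculation. The `values` array is made up purely for illustration and is not taken from any dataset used in this notebook.

```python
import numpy as np

# Hypothetical sample values, chosen only for illustration.
values = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

mu = values.mean()     # mean of the values
sigma = values.std()   # NumPy's default ddof=0 gives the population standard deviation

# z-score: how many standard deviations each value lies from the mean
z_scores = (values - mu) / sigma
print(z_scores)
```

After standardizing, the z-scores have mean 0 and standard deviation 1, which is what makes them comparable to the standard normal distribution.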

The standard normal distribution is a probability distribution, so the area under the curve between two points gives the probability that a variable takes a value in that range. The total area under the curve is 1, or 100%.
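
As a rough illustration of "area equals probability", the area between two z-values can be computed as a difference of cumulative probabilities. This sketch uses `scipy.stats.norm.cdf`; the interval from -1 to 1 is just an example choice.

```python
from scipy import stats

# P(-1 <= Z <= 1): area under the standard normal curve between z = -1 and z = 1
p_within_one_sd = stats.norm.cdf(1) - stats.norm.cdf(-1)
print(round(p_within_one_sd, 4))  # 0.6827

# Total area under the curve, from -infinity to +infinity
total_area = stats.norm.cdf(float("inf")) - stats.norm.cdf(float("-inf"))
print(total_area)  # 1.0
```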

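Every z-score has an associated p-value: the probability, under the standard normal distribution, of observing a value at least that extreme.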

By converting a value in a normal distribution into a z-score, you can easily find the p-value for a z-test.
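
A minimal sketch of going from a z-score to a p-value, assuming the z-score has already been computed. The value `1.96` is a hypothetical example, and `stats.norm.sf` is SciPy's survival function (1 minus the CDF).

```python
from scipy import stats

z = 1.96  # hypothetical z-score, chosen for illustration

# Right-tailed p-value: P(Z >= z)
p_right = stats.norm.sf(z)

# Two-sided p-value: a result at least this extreme in either tail
p_two_sided = 2 * stats.norm.sf(abs(z))

print(round(p_right, 4), round(p_two_sided, 4))  # 0.025 0.05
```

Whether the one-tailed or two-tailed value applies depends on how the alternative hypothesis is stated.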

### How to use a z-table

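Once you have a z-score, you can look up the corresponding probability in a z-table. In Python, `scipy.stats.norm` plays the same role: `cdf` maps a z-score to a cumulative probability, and `ppf` maps a cumulative probability back to a z-score.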

In [21]:
import scipy.stats as stats
import pandas as pd

In [22]:
stats.norm.ppf(.95)

1.6448536269514722

In [23]:
stats.norm.cdf(1.64)

0.9494974165258963

In [24]:
alpha = 0.05

In [25]:
z_critical = stats.norm.ppf(1 - alpha)  # one-sided critical value; use 1 - alpha / 2 for a two-sided test

In [26]:
z_critical

1.6448536269514722
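
The value above is the one-sided critical value. For comparison, here is a short sketch of the two-sided case, where `alpha` is split evenly between the two tails. The name `z_critical_two_sided` is introduced here only for illustration.

```python
from scipy import stats

alpha = 0.05

# Two-sided test: alpha / 2 of the probability sits in each tail
z_critical_two_sided = stats.norm.ppf(1 - alpha / 2)
print(round(z_critical_two_sided, 2))  # 1.96
```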

### How to Calculate Confidence Intervals

![image-2.png](attachment:image-2.png)
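
For reference, the usual z-based confidence interval for a mean can be written as follows. That the image shows this exact form is an assumption; the notation is the conventional one.

```latex
\bar{x} \pm z_{\alpha/2} \cdot \frac{s}{\sqrt{n}}
```

Here $\bar{x}$ is the sample mean, $s$ the sample standard deviation, $n$ the sample size, and $z_{\alpha/2}$ the two-sided critical value for significance level $\alpha$.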

In [27]:
from math import sqrt

def confidence_interval(df: pd.DataFrame, feature: str, alpha: float) -> tuple:
    """Two-sided z-based confidence interval for the mean of df[feature]."""
    mean = df[feature].mean()
    standard_error = df[feature].std() / sqrt(df.shape[0])
    # A confidence interval is two-sided, so alpha is split across both tails
    z_critical = stats.norm.ppf(1 - alpha / 2)
    lower = mean - z_critical * standard_error
    upper = mean + z_critical * standard_error
    return lower, upper

In [28]:
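# `df` was never defined in this notebook; these values are hypothetical stand-ins for the real data
df = pd.DataFrame({'X1': [4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7, 5.0]})
lower, upper = confidence_interval(df, 'X1', 0.05)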
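
One way to sanity-check a hand-rolled interval is to compare it with `scipy.stats.norm.interval`. The sketch below is self-contained and uses a small made-up DataFrame named `sample` (hypothetical values, not the notebook's data). The confidence level is passed positionally because the keyword name for that argument has differed between SciPy versions.

```python
from math import sqrt

import pandas as pd
from scipy import stats

# Hypothetical data, used only to make the example runnable.
sample = pd.DataFrame({"X1": [4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7, 5.0]})

alpha = 0.05
mean = sample["X1"].mean()
std_error = sample["X1"].std() / sqrt(len(sample))  # pandas std uses ddof=1

# Manual two-sided interval: mean +/- z_critical * standard error
z_critical = stats.norm.ppf(1 - alpha / 2)
manual = (mean - z_critical * std_error, mean + z_critical * std_error)

# The same interval computed by SciPy
scipy_interval = stats.norm.interval(1 - alpha, loc=mean, scale=std_error)

print(manual)
print(scipy_interval)
```

The two tuples should agree up to floating-point error, which is a quick check that the critical value uses `alpha / 2` rather than `alpha`.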