## In general

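A z-test compares a sample mean either with a hypothesized population mean (one-sample test) or with the mean of a second sample (two-sample test). The data are assumed to follow a Gaussian distribution. A z-test is used when population parameters such as the standard deviation are known.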

z = (x - μ) / (σ / √n)

x = sample mean

μ = population mean

σ = population standard deviation

n = sample size

σ / √n = standard error of the mean
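As a small illustration of the formula above, the z statistic can be computed directly. This is only a sketch: the function name `z_statistic` and the numbers passed to it are hypothetical and do not come from the blood pressure dataset.

```python
import math

def z_statistic(sample_mean, pop_mean, pop_sd, n):
    """Return z = (x - mu) / (sigma / sqrt(n))."""
    standard_error = pop_sd / math.sqrt(n)  # sigma / sqrt(n)
    return (sample_mean - pop_mean) / standard_error

# Hypothetical values: sample mean 158, population mean 156,
# population standard deviation 12, sample size 36.
z = z_statistic(sample_mean=158.0, pop_mean=156.0, pop_sd=12.0, n=36)
print(z)  # 1.0, since the standard error is 12 / 6 = 2
```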

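If the p-value is lower than 0.05 (the chosen significance level), reject the null hypothesis; otherwise, fail to reject it. A large p-value does not prove the null hypothesis, so "accept" should be read as "fail to reject".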

Null Hypothesis: The population mean equals the hypothesized value

Alternate Hypothesis: The population mean differs from the hypothesized value
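The decision rule can be sketched in a few lines using only the standard library. The helper names `two_sided_p_value` and `decide` are made up for this illustration, and the 0.05 threshold is the one used in this article.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """Probability of a standard normal value at least as extreme as z."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

def decide(p_value, alpha=0.05):
    """Apply the p-value rule at significance level alpha."""
    if p_value < alpha:
        return "reject null hypothesis"
    return "fail to reject null hypothesis"

p = two_sided_p_value(1.0)  # hypothetical z statistic
print(round(p, 4))          # 0.3173
print(decide(p))            # fail to reject null hypothesis
```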

## One-Sample Z-test

A one-sided z-test has one critical boundary, while a two-sided z-test has two.
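The difference between one-sided and two-sided critical boundaries can be made concrete with the standard normal quantile function. This sketch assumes a significance level of 0.05 and uses only the standard library; the variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()  # mean 0, standard deviation 1

# Two-sided test: alpha is split between both tails,
# so there are two boundaries at -c and +c.
two_sided_boundary = std_normal.inv_cdf(1 - alpha / 2)

# One-sided (upper-tail) test: all of alpha sits in one tail,
# so there is a single boundary.
one_sided_boundary = std_normal.inv_cdf(1 - alpha)

print(round(two_sided_boundary, 2))  # 1.96
print(round(one_sided_boundary, 3))  # 1.645
```

The one-sided boundary is closer to zero, so a z statistic in the expected direction needs to be less extreme to reject the null hypothesis.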

A one-sample z-test lets us check whether a particular group of data plausibly comes from a larger population with a given mean.

A z-test is appropriate when:
- Sample size greater than 30
- Independent data points
- Normally distributed data
- Randomly selected data
- Equal sample sizes, where possible (relevant only when two samples are compared)

Let’s test whether the mean of the bp_before column in this blood pressure dataset is 156.

Null Hypothesis: The mean of bp_before is 156

Alternate Hypothesis: The mean of bp_before is not 156

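In [1]:
import pandas as pd
from scipy import stats
from statsmodels.stats import weightstats as stests

# Load the blood pressure data (expects a 'bp_before' column)
df = pd.read_csv("ztest_data.csv")

# One-sample z-test: is the mean of bp_before equal to 156?
# Note: ztest estimates the standard deviation from the sample itself.
ztest, pval = stests.ztest(df['bp_before'], x2=None, value=156)
print(float(pval))
if pval < 0.05:
    print("reject null hypothesis")
else:
    print("fail to reject null hypothesis")

0.6651614730255063
fail to reject null hypothesis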
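To show what a one-sample z-test computes, here is a minimal standard-library sketch. It assumes the standard deviation is estimated from the sample (with n - 1 in the denominator), which is my understanding of what `ztest` does by default. The function name `one_sample_ztest` and the randomly generated `readings` are hypothetical stand-ins, not the contents of ztest_data.csv.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def one_sample_ztest(values, value=0.0):
    """Two-sided one-sample z-test using the sample standard deviation."""
    n = len(values)
    standard_error = stdev(values) / math.sqrt(n)  # s / sqrt(n)
    z = (mean(values) - value) / standard_error
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical data: 40 simulated readings, so the n > 30 guideline holds.
rng = random.Random(0)
readings = [rng.gauss(156, 10) for _ in range(40)]

z_stat, p_value = one_sample_ztest(readings, value=156)
print(z_stat, p_value)
```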


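## Two-Sample Z-test

Here we compare the means of bp_before and bp_after. Note that ztest treats the two columns as independent samples.

H0: The means of the two samples are the same

H1: The means of the two samples are not the same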

In [2]:
# Two-sample z-test: is the difference between the two means equal to 0?
ztest, pval1 = stests.ztest(df['bp_before'], x2=df['bp_after'], value=0, alternative='two-sided')
print(float(pval1))
if pval1 < 0.05:
    print("reject null hypothesis")
else:
    print("fail to reject null hypothesis")

0.002162306611369422
reject null hypothesis
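The two-sample statistic can be sketched by hand in the same way. This version assumes a pooled variance estimate, which I believe matches the default (`usevar='pooled'`) of `ztest`. The function name `two_sample_ztest` and the simulated `before` and `after` lists are hypothetical and are not the real dataset.

```python
import math
import random
from statistics import NormalDist, mean, variance

def two_sample_ztest(x1, x2, value=0.0):
    """Two-sided two-sample z-test with a pooled variance estimate."""
    n1, n2 = len(x1), len(x2)
    # Pool the two sample variances, weighted by their degrees of freedom.
    pooled_var = ((n1 - 1) * variance(x1) + (n2 - 1) * variance(x2)) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    z = (mean(x1) - mean(x2) - value) / standard_error
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical data: two simulated groups of 40 readings each.
rng = random.Random(1)
before = [rng.gauss(156, 10) for _ in range(40)]
after = [rng.gauss(150, 10) for _ in range(40)]

z_stat, p_value = two_sample_ztest(before, after)
print(z_stat, p_value)
```

Swapping the two samples flips the sign of z but leaves the two-sided p-value unchanged.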
