# Z-test
- Compares sample average to population average
- Calculates Z-score
- Determines whether the difference between the averages is within normal sampling variation
- Most useful with a sample of over 30 items
- Data must be approximately normally distributed
- Must have equal variance between samples
- All data points must be independent

## One-sample
- Population standard deviation known
- Sample size > 30
- Approximately normally distributed
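
When these conditions hold, the test statistic can be written as a worked formula. This is a sketch using the symbols from the code comments in this section ($\bar{x}$ for the sample mean, $\mu$ for the claimed population mean, $\sigma$ for the known population standard deviation, $n$ for the sample size); the critical-value notation $z_{1-\alpha/2}$ is an added convention, not something from the original notes.

$$
z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}
$$

For a two-sided test at significance level $\alpha$, the null hypothesis is rejected when $|z| > z_{1-\alpha/2}$, which is roughly $1.96$ for $\alpha = 0.05$.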

In [5]:
%conda install statsmodels

In [None]:
import numpy as np
from scipy.stats import norm

# Claim (null hypothesis): this smartphone model has an average battery life of 12 hours
# Alternative hypothesis: the average battery life is not 12 hours
# Population: all smartphones of that model
population_mean = 12  # μ

# Known population standard deviation
population_std_dev = 0.5  # σ

# 100 phones tested, with an average battery life of 11.8 hours
sample_size = 100  # n
sample_mean = 11.8  # x̄

# Only summary statistics are available, so compute z directly from the known σ.
# (statsmodels' ztest estimates the standard deviation from the sample itself, so a
# constant placeholder list such as [11.8] * 100 has zero variance and gives a
# meaningless statistic.)
# z = (x̄ - μ) / (σ / sqrt(n))
z_statistic = (sample_mean - population_mean) / (
    population_std_dev / np.sqrt(sample_size)
)

print(f"Z-Statistic: {z_statistic:.4f}")
# Either compare the z-statistic to a critical value table, or convert it to a p-value

# Two-sided p-value: probability of a |z| at least this large if the null is true
p_value = 2 * norm.sf(abs(z_statistic))
print(f"P-Value: {p_value:.4f}")

alpha = 0.05
if p_value < alpha:
    print(
        "Reject the null hypothesis: The average battery life is different from 12"
        " hours."
    )
else:
    print(
        "Fail to reject the null hypothesis: The average battery life is not"
        " significantly different from 12 hours."
    )

Z-Statistic: -4.0000
P-Value: 0.0001
Reject the null hypothesis: The average battery life is different from 12 hours.
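
The code comment in the cell above mentions a second route: comparing the z-statistic against a critical value instead of computing a p-value. The following is a minimal sketch of that route for the battery-life numbers, assuming the population standard deviation of 0.5 is used directly. It relies on the standard library's `statistics.NormalDist` rather than the packages used elsewhere in this notebook, purely so that it runs without extra installs; the variable names `z_critical` and `reject_null` are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the battery-life example
population_mean = 12.0    # μ
population_std_dev = 0.5  # σ (known)
sample_size = 100         # n
sample_mean = 11.8        # x̄
alpha = 0.05              # significance level

# z = (x̄ - μ) / (σ / sqrt(n))
z_statistic = (sample_mean - population_mean) / (
    population_std_dev / sqrt(sample_size)
)

# Two-sided critical value: alpha / 2 of the standard normal lies beyond it in each tail
z_critical = NormalDist().inv_cdf(1 - alpha / 2)

# Reject the null hypothesis when |z| exceeds the critical value
reject_null = abs(z_statistic) > z_critical
print(f"|z| = {abs(z_statistic):.4f}, critical z = {z_critical:.4f}")
print("Reject the null hypothesis." if reject_null else "Fail to reject the null hypothesis.")
```

For a two-sided test, the critical-value check and the check `p_value < alpha` always lead to the same decision; they are two views of the same comparison.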


## Two-sample
- Two normally distributed independent populations
- Samples from both populations
- Again, the null hypothesis would be that they have the same average
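
The two-sample statistic can be written out using the same names as the code cell in this section ($\bar{x}_1, \bar{x}_2$ for the sample means, $s_1, s_2$ for the standard deviations, $n_1, n_2$ for the sample sizes, and $D$ for the hypothesized difference between the population means). Treating the sample standard deviations as stand-ins for the population values is an assumption that is usually considered reasonable only for large samples.

$$
z = \frac{(\bar{x}_1 - \bar{x}_2) - D}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
$$

Under the null hypothesis that both populations have the same average, $D = 0$.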

In [7]:
import numpy as np
import scipy.stats as stats

# Null hypothesis: There is no difference in test scores between online classes and offline classes

# Group 1 - offline classes
n1 = 50  # Sample size
x1 = 75  # Sample mean
s1 = 10  # Sample standard deviation

# Group 2 - online classes
n2 = 60  # Sample size
x2 = 80  # Sample mean
s2 = 12  # Sample standard deviation

D = 0  # Hypothesized difference between the population means under the null

z_score = ((x1 - x2) - D) / np.sqrt((s1**2 / n1) + (s2**2 / n2))
print("Z-Score:", np.abs(z_score))

alpha = 0.05  # Significance level
z_critical = stats.norm.ppf(1 - alpha / 2)
print("Critical Z-Score:", z_critical)

if np.abs(z_score) > z_critical:
    print("Reject the null hypothesis.")
else:
    print("Fail to reject the null hypothesis.")

Z-Score: 2.3836564731139807
Critical Z-Score: 1.959963984540054
Reject the null hypothesis.
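
The cell above decides by comparing against a critical value. As a complement, here is a minimal sketch of the p-value route for the same online/offline numbers. It uses the standard library's `statistics.NormalDist` so that it runs without extra installs; that choice is an assumption of this sketch, not part of the original notebook.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics from the online/offline example
n1, x1, s1 = 50, 75, 10  # offline: sample size, mean, standard deviation
n2, x2, s2 = 60, 80, 12  # online: sample size, mean, standard deviation
D = 0                    # hypothesized difference between the population means
alpha = 0.05             # significance level

# Same z formula as in the cell above
z_score = ((x1 - x2) - D) / sqrt(s1**2 / n1 + s2**2 / n2)

# Two-sided p-value: probability of a |z| at least this large if the null is true
p_value = 2 * (1 - NormalDist().cdf(abs(z_score)))

print(f"Z-Score: {z_score:.4f}")
print(f"P-Value: {p_value:.4f}")
print("Reject the null hypothesis." if p_value < alpha else "Fail to reject the null hypothesis.")
```

The p-value comes out below 0.05, which agrees with the critical-value decision to reject the null hypothesis.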
