### One-Sample Z-Test

**1 - What is it?**

A `One-Sample Z-Test` checks whether the mean of your sample is significantly different from a known population mean. It applies when the population standard deviation is known and the sample size is large (usually n >= 30).

`Think of it like`
"Is my small group's average weight `really different` from the entire country's average weight, or is it just random chance?"

**2 - Real-Life Example**

Suppose the `average time on site (population mean)` for users = 5 minutes.
- From analytics, we take a sample of 50 visitors and find the `average time spent = 5.5 minutes`.
- We know the `population standard deviation (sigma) = 1 minute`.

**Question: Is this increase real (significant) or just random noise?**

**3 - Hypotheses**
- H0 (Null Hypothesis): population mean (μ) = 5 (no difference; the sampled visitors come from a population whose mean is still 5)
- H1 (Alternative Hypothesis): population mean (μ) != 5 (there is a difference)

This is a `two-tailed test`.

- One tail = one direction -> we care only whether it's higher (or only whether it's lower).
- Two tails = both directions -> we care whether it's different at all.
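
To make the difference between the tails concrete, here is a minimal sketch of how the p-value changes with the direction of the test. The helper name `z_p_value`, its `alternative` argument, and the z value of 1.8 are illustrative assumptions, not part of the example in this notebook.

```python
from scipy import stats

def z_p_value(z_stat, alternative="two-sided"):
    """Return the p-value of a z statistic under the standard normal distribution."""
    if alternative == "two-sided":
        # Area in both tails beyond |z|
        return 2 * stats.norm.sf(abs(z_stat))
    if alternative == "greater":
        # Area in the upper tail only (H1: the mean is higher)
        return stats.norm.sf(z_stat)
    if alternative == "less":
        # Area in the lower tail only (H1: the mean is lower)
        return stats.norm.cdf(z_stat)
    raise ValueError(f"unknown alternative: {alternative!r}")

z = 1.8  # hypothetical z statistic, chosen only for illustration
for alt in ("two-sided", "greater", "less"):
    print(f"{alt:>9}: p = {z_p_value(z, alt):.4f}")
```

With this z value, the upper-tail test falls below 0.05 while the two-tailed test does not, which is why the direction of the test should be chosen before looking at the data.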

**4 - Formula**

Z = (sample mean - population mean) / (population standard deviation / sqrt(sample size))

Z = (x bar - μ) / (σ/sqrt(n))

**Where:**
- x bar = sample mean
- μ (mu) = population mean
- σ (sigma) = population standard deviation
- n = sample size
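
As a quick check of the formula, plugging in the numbers from the real-life example (x bar = 5.5, μ = 5, σ = 1, n = 50) gives the following worked calculation. The denominator σ/sqrt(n) is the standard error (SE), and the values are rounded:

$$
SE = \frac{\sigma}{\sqrt{n}} = \frac{1}{\sqrt{50}} \approx 0.1414
$$

$$
Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{5.5 - 5}{0.1414} \approx 3.536
$$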

**5 - Python Implementation (Step-by-Step)**

In [1]:
import numpy as np
from scipy import stats

# -------------------------------------
# Step 1: Define population parameters
# -------------------------------------
population_mean = 5 # μ = population mean: average time spent (in minutes)
population_std = 1  # σ = population standard deviation
sample_size = 50    # n = number of visitors in our sample

# ------------------------------------------
# Step 2: Define our sample mean (from data)
# ------------------------------------------
sample_mean = 5.5 # The average time spent from the sample

# -------------------------------------------
# Step 3: Calculate Standard Error (SE)
# -------------------------------------------
# Formula: SE = σ / sqrt(n)
standard_error = population_std / np.sqrt(sample_size)
print("Standard Error (SE):", standard_error)

# --------------------------------------------
# Step 4: Calculate Z-Statistic
# --------------------------------------------
# Formula: Z = (x bar - μ) / SE
z_stat = (sample_mean - population_mean) / standard_error
print("Z-Statistic: ", z_stat)

# --------------------------------------------
# Step 5: Calculate p-value
# --------------------------------------------
# Since this is a two-tailed test:
p_value = 2 * (1 - stats.norm.cdf(abs(z_stat)))
print("P-Value: ", p_value)

# ---------------------------------------------
# Step 6: Set significance level (α)
# ---------------------------------------------
alpha = 0.05

# ---------------------------------------------
# Step 7: Decision
# ---------------------------------------------
if p_value < alpha:
    print("Reject H0: The sample mean is significantly different from the population mean.")
else:
    print("Fail to Reject H0: No significant difference between sample mean and population mean.")

Standard Error (SE): 0.1414213562373095
Z-Statistic:  3.5355339059327378
P-Value:  0.00040695201744500586
Reject H0: The sample mean is significantly different from the population mean.
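
The steps above start from a sample mean that is already computed. When you have the raw observations instead, the same logic can be wrapped in a small function. This is only a sketch: the function name `one_sample_z_test`, the returned dictionary keys, and the simulated `visits` data are all assumptions made for illustration.

```python
import numpy as np
from scipy import stats

def one_sample_z_test(sample, population_mean, population_std, alpha=0.05):
    """Two-tailed one-sample z-test with a known population standard deviation."""
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    sample_mean = sample.mean()
    standard_error = population_std / np.sqrt(n)
    z_stat = (sample_mean - population_mean) / standard_error
    # sf(x) is 1 - cdf(x), computed in a numerically stable way
    p_value = 2 * stats.norm.sf(abs(z_stat))
    return {
        "n": n,
        "sample_mean": sample_mean,
        "z_stat": z_stat,
        "p_value": p_value,
        "reject_h0": bool(p_value < alpha),
    }

# Hypothetical data: 50 simulated visit durations (in minutes)
rng = np.random.default_rng(seed=42)
visits = rng.normal(loc=5.5, scale=1.0, size=50)

result = one_sample_z_test(visits, population_mean=5, population_std=1)
print(result)
```

Returning a dictionary rather than printing inside the function keeps the test reusable, for example when looping over several samples.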


**6 - Walkthrough of Results**

1. Standard Error -> measures how much the sample mean is expected to vary around the population mean from one sample to the next.
2. Z-Statistic -> tells us how many "standard errors" away the sample mean is from the population mean (here about 3.54).
3. p-value -> probability of observing a difference this large (or larger) if the null hypothesis is true (here about 0.0004).
4. Decision -> if p < α (0.05), the result is **statistically significant** and we reject H0.
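
One way to build intuition for these four quantities is a small simulation. The sketch below assumes visit times are normally distributed and that H0 is actually true (the real mean is 5). It then checks that the spread of the sample means matches σ/sqrt(n) and that roughly 5% of tests reject by chance alone. The seed and the number of simulated experiments are arbitrary choices.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=0)  # fixed seed so the run is repeatable
mu, sigma, n = 5.0, 1.0, 50          # values from the example in this notebook
n_experiments = 20_000               # arbitrary number of repeated samples

# Draw many samples under H0 and keep the mean of each one
sample_means = rng.normal(mu, sigma, size=(n_experiments, n)).mean(axis=1)

# 1. The spread of the sample means should be close to sigma / sqrt(n)
theoretical_se = sigma / np.sqrt(n)
empirical_se = sample_means.std(ddof=1)

# 2-3. Z statistic and two-tailed p-value for every simulated sample
z_stats = (sample_means - mu) / theoretical_se
p_values = 2 * stats.norm.sf(np.abs(z_stats))

# 4. Under H0, about alpha of the tests should reject purely by chance
alpha = 0.05
false_positive_rate = np.mean(p_values < alpha)

print(f"theoretical SE: {theoretical_se:.4f}")
print(f"empirical SE:   {empirical_se:.4f}")
print(f"share of p < {alpha}: {false_positive_rate:.3f}")
```

The last number illustrates what α means in practice: it is the rate of false alarms you accept when there is no real difference.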

**7 - Analogy to Remember**

Think of it like a **class exam**:

- Population mean = the **average score of the entire country**.
- Sample mean = your class's **average**.
- Z-test checks -> "Is your class performing **really differently** than the country average, or is this difference just luck?"