### Two-Sample (Right-Tailed) Z-Test -- A/B Test Example

#### 1 - What is it?

The `Right One-Tailed Two-Sample Z-Test` checks whether the mean of **group B** (treatment/new) is **significantly higher** than the mean of **group A** (control/old). Use it when the **population standard deviations (sigma_A, sigma_B)** are known (or assumed known) and both sample sizes are large (n_A, n_B > 30). *This tests the difference between two independent groups,* not a sample against a population.

`Think of it like` comparing two different buckets of users - "is the new landing page's average session time really higher than the old one, or just random noise?"

#### 2 - Real-Life Example: Right-Tailed Two-Sample Z-Test (A/B)

`Scenario:` You ran an A/B test on session time.

* Group A (Control - old page)
    - sample mean (x_A) = 5.00 minutes
    - population std (sigma_A) = 1.10 minutes
    - sample size (n_A) = 200

* Group B (Treatment - new page)
    - sample mean (x_B) = 5.40 minutes
    - population std (sigma_B) = 1.20 minutes
    - sample size (n_B) = 220

`Claim to test:`
    **The new landing page increased average session time** -> the claim is directional, so the test is **right-tailed**

#### 3 - Hypotheses

* *H0 (Null):* mu_B = mu_A (no difference)
    (equivalently: mu_B - mu_A = 0)
* *H1 (Alternative, right-tailed):* mu_B > mu_A
    (we want evidence that the treatment mean is larger)

This is **right-tailed** because the alternative says *"greater than"*.
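
To make the role of H0 concrete, here is a minimal simulation sketch. It assumes the sample means are normally distributed (reasonable at these sample sizes) and draws them directly rather than simulating individual users. The seed, the number of simulated experiments (`n_sims`), and the shared true mean of 5.0 are illustrative choices, not part of the original example. When H0 is true, a right-tailed test at alpha = 0.05 should reject in roughly 5% of experiments.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)   # arbitrary fixed seed for reproducibility

sigma_A, sigma_B = 1.10, 1.20    # known population stds from the scenario
n_A, n_B = 200, 220              # sample sizes from the scenario
alpha = 0.05
n_sims = 20_000                  # hypothetical number of simulated A/B tests

se = np.sqrt(sigma_A**2 / n_A + sigma_B**2 / n_B)

# H0 is true here: both groups share the same population mean (5.0).
# Assumption: each sample mean is drawn from its normal sampling distribution.
sim_mean_A = rng.normal(5.0, sigma_A / np.sqrt(n_A), size=n_sims)
sim_mean_B = rng.normal(5.0, sigma_B / np.sqrt(n_B), size=n_sims)

z = (sim_mean_B - sim_mean_A) / se
false_positive_rate = float(np.mean(z > stats.norm.ppf(1 - alpha)))
print(f"Share of H0-true experiments rejected: {false_positive_rate:.3f}")
```

The printed share should land close to alpha, which is exactly what the significance level promises: it caps how often a no-effect page would be declared a winner.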

#### 4 - Formula

* Difference of sample means:
   - diff = x_B - x_A (signed, not absolute: a negative value would count against H1)

* Standard error of the difference (known standard deviation case):
   - SE = SQRT((sigma_A^2/n_A)+(sigma_B^2/n_B)), where sigma_A and sigma_B are the known population standard deviations

* Z-statistic:
   - Z = diff / SE
   - = (x_B - x_A) / SQRT((sigma_A^2/n_A)+(sigma_B^2/n_B))

* Right-tailed p-value:
    - p = 1 - cdf(Z), where cdf is the standard normal CDF

* Critical value for alpha = 0.05 (right-tail):
    - z_crit = z_(1-alpha) = stats.norm.ppf(0.95) ~ 1.645
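
The formulas above can be bundled into one small helper. This is a sketch, not a library function: the name `right_tailed_two_sample_z` and its return keys are made up for illustration. It uses `stats.norm.sf(z)`, SciPy's survival function, which equals `1 - cdf(z)` but is computed more accurately for large Z.

```python
import math
from scipy import stats

def right_tailed_two_sample_z(x_A, x_B, sigma_A, sigma_B, n_A, n_B, alpha=0.05):
    """Hypothetical helper: right-tailed two-sample Z-test with known sigmas."""
    se = math.sqrt(sigma_A**2 / n_A + sigma_B**2 / n_B)  # standard error of (x_B - x_A)
    z = (x_B - x_A) / se                                  # signed difference in SE units
    p_value = float(stats.norm.sf(z))                     # right tail: P(Z >= z) under H0
    z_crit = float(stats.norm.ppf(1 - alpha))             # right-tail critical value
    return {"z": z, "p_value": p_value, "z_crit": z_crit,
            "reject_H0": bool(z > z_crit)}

# Usage with the session-time scenario from section 2
result = right_tailed_two_sample_z(5.00, 5.40, 1.10, 1.20, 200, 220)
print(result)
```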

**Manual Step-by-step Calculation (digit-by-digit)**

In [5]:
import numpy as np
from scipy import stats

In [9]:
## Inputs:

# Group A
x_A = 5.00
std_A = 1.10
n_A = 200

# Group B
x_B = 5.40
std_B = 1.20
n_B = 220

# Calc starts here
# Square the input variables instead of repeating the literal values
std_A_square = std_A**2
std_B_square = std_B**2
print(f"std squares: std_A:{std_A_square} and std_B:{std_B_square}")

# 1. Difference of means:
diff = x_B - x_A
print(f"\ndifference: {diff}")

# 2. Variance contributions:
v_c_A = std_A_square/n_A
v_c_B = std_B_square/n_B
print(f"\nVariance Contribution: Group A: {v_c_A:.6f}")
print(f"\nVariance Contribution: Group B: {v_c_B:.6f}")

# 3. Sum of contributions:
total_contributions = v_c_A+v_c_B
print(f"\nTotal Contributions: {total_contributions}")

# 4. Standard error (SE):
SE = np.sqrt(total_contributions)
print(f"\nStandard Error: {SE}")

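# 5. Z-statistic
z_stat = diff / SE

# 6. Right-tailed p-value:
p = 1 - stats.norm.cdf(z_stat)
print(f"\nRight-tailed p_value = {p}")

# 7. Compare to critical z (alpha=0.05): 1.645
# Decide from the computed value rather than printing a fixed conclusion
z_crit = 1.645
if z_stat > z_crit:
    print(f"\nSince Z = {z_stat:.4f} > {z_crit}, reject H0.")
else:
    print(f"\nSince Z = {z_stat:.4f} <= {z_crit}, fail to reject H0.")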


std squares: std_A:1.2100000000000002 and std_B:1.44

difference: 0.40000000000000036

Variance Contribution: Group A: 0.006050

Variance Contribution: Group B: 0.006545

Total Contributions: 0.012595454545454545

Standard Error: 0.11222947271307367

Right-tailed p_value = 0.00018253517121069596

Since Z = 3.5641 > 1.645, reject H0.


#### 5 - Python Implementation (Step-by-Step)

In [11]:
import numpy as np
from scipy import stats

# ---------------------------------------------
# Step 1: Inputs
# ---------------------------------------------
mean_A = 5.00   # Control mean
sigma_A = 1.10  # Control population std
n_A = 200       # Control observation size

mean_B = 5.40   # treatment mean (test variable)
sigma_B = 1.20  # treatment population std (test population)
n_B = 220       # treatment observation size

alpha = 0.05    # Significance level (right-tailed)

# ---------------------------------------------
# Step 2: Difference & SE
# ---------------------------------------------
diff = mean_B - mean_A
se = np.sqrt((sigma_A**2)/n_A+(sigma_B**2)/n_B)
print(f"Difference (B - A): {diff:.4f}")
print(f"Standard Error (SE): {se:.6f}")

# ---------------------------------------------
# Step 3: Z-statistic
# ---------------------------------------------
z_stat = diff / se
print(f"Z-statistic: {z_stat:.4f}")

# ---------------------------------------------
# Step 4: P-value (right-tailed)
# ---------------------------------------------
p_value = 1 - stats.norm.cdf(z_stat)
print(f"P-Value (right-tailed): {p_value:.6f}")

# ---------------------------------------------
# Step 5: Critical Value
# ---------------------------------------------
crit = stats.norm.ppf(1-alpha)
print(f"Critical Value (right, alpha={alpha}): {crit:.6f}")

# ---------------------------------------------
# Step 6: Decisions
# ---------------------------------------------
print("\nDecision (p-value):", "Reject H0" if p_value < alpha else "Fail to Reject H0")
print("Decision (z vs critical):",
      "Reject H0" if z_stat > crit else "Fail to Reject H0")

Difference (B - A): 0.4000
Standard Error (SE): 0.112229
Z-statistic: 3.5641
P-Value (right-tailed): 0.000183
Critical Value (right, alpha=0.05): 1.644854

Decision (p-value): Reject H0
Decision (z vs critical): Reject H0
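
The two decision rules in Step 6 always agree for a right-tailed test, because the right-tail probability shrinks as Z grows: `p < alpha` holds exactly when `Z > z_crit`. The short check below illustrates this over a grid of Z values. The grid itself (`z_values`) is an arbitrary illustrative choice.

```python
import numpy as np
from scipy import stats

alpha = 0.05
crit = stats.norm.ppf(1 - alpha)            # right-tail critical value (~1.645)

z_values = np.linspace(-4, 4, 801)          # hypothetical grid of Z-statistics

reject_by_p = stats.norm.sf(z_values) < alpha   # rule 1: p-value below alpha
reject_by_crit = z_values > crit                # rule 2: Z above critical value

rules_agree = bool(np.all(reject_by_p == reject_by_crit))
print("Both rules give the same decision on every grid point:", rules_agree)
```

Reporting both is therefore a readability choice, not a second test: the p-value says how strong the evidence is, while the critical value gives a quick threshold.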


#### 6 - Business Interpretation

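* The observed increase of `0.40 minutes` (from 5.00 -> 5.40) is **3.56 standard errors above** what we'd expect if there were no effect.
* p = 0.00018 -> if the new page truly had no effect (H0), there would be only about a **0.018%** chance of seeing an increase at least this large from random noise alone.
* Since **Z (3.56) > 1.645** and **p < 0.05**, we **reject H0**: the data support the claim that the new landing page increases average session time.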
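
A business reader will usually also ask "how big is the lift, at minimum?". One common companion to a right-tailed test is a one-sided lower confidence bound on the difference, computed as `diff - z_crit * SE`. The sketch below is an addition to the original analysis and reuses the same inputs and the same known-sigma assumption.

```python
import numpy as np
from scipy import stats

# Same inputs as the A/B scenario above
diff = 5.40 - 5.00
se = np.sqrt(1.10**2 / 200 + 1.20**2 / 220)
alpha = 0.05

# One-sided 95% lower bound on (mu_B - mu_A)
lower_bound = diff - stats.norm.ppf(1 - alpha) * se
print(f"95% one-sided lower bound on (B - A): {lower_bound:.3f} minutes")
```

With these numbers the bound comes out at roughly 0.2 minutes, so even the conservative end of the estimate stays above zero, which is consistent with rejecting H0.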