Cell 1 (imports + DB connection)

In [1]:
from pathlib import Path
import sqlite3

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

project_root = Path.cwd().parent
db_path = project_root / "data" / "Synthetic Dataset" / "product_analytics.db"
conn = sqlite3.connect(db_path)

Cell 2 (pull customer LTV by acquisition channel)

    - Pulls net revenue per order alongside each user's acquisition channel, which sets up a comparison problem.

In [2]:
query = """
SELECT
    u.acquisition_channel,
    o.net_revenue
FROM orders o
JOIN users u ON u.user_id = o.user_id;
"""

df = pd.read_sql_query(query, conn)
df.head()

  acquisition_channel  net_revenue
0             organic       249.36
1             organic       152.00
2             organic        32.59
3             organic        55.14
4         paid_search        23.41
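
Note: the query above returns one row per order, so the comparison that follows is on net revenue per order, not on lifetime value per customer. If per-customer LTV is the target, one possible sketch is to sum revenue per user before comparing channels. The table and column names below follow the query above; the `order_id` column and every row of data are made up for illustration, and the real schema may differ.

```python
import sqlite3

import pandas as pd

# Toy in-memory database (hypothetical rows, assumed order_id column).
toy_conn = sqlite3.connect(":memory:")
toy_conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, acquisition_channel TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, user_id INTEGER, net_revenue REAL);
INSERT INTO users VALUES (1, 'organic'), (2, 'organic'), (3, 'paid_search');
INSERT INTO orders VALUES (1, 1, 100.0), (2, 1, 50.0), (3, 2, 30.0), (4, 3, 80.0);
""")

# One row per customer: total net revenue across all of that customer's orders.
ltv_query = """
SELECT
    u.user_id,
    u.acquisition_channel,
    SUM(o.net_revenue) AS ltv
FROM orders o
JOIN users u ON u.user_id = o.user_id
GROUP BY u.user_id, u.acquisition_channel
ORDER BY u.user_id;
"""

ltv_df = pd.read_sql_query(ltv_query, toy_conn)
toy_conn.close()
print(ltv_df)
```

With this shape, each customer contributes one observation, so heavy repeat buyers no longer count several times within their channel.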


Cell 3 (compare two groups)

    - Choose two real channels that have enough data.
    - Think: two strategies, two signals, two cohorts.

In [3]:
df["acquisition_channel"].value_counts()


acquisition_channel
organic        3434
paid_search    1556
referral       1164
paid_social    1104
email          1046
affiliate       696
Name: count, dtype: int64

In [10]:
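group_a = df.loc[df["acquisition_channel"] == "organic", "net_revenue"]
group_b = df.loc[df["acquisition_channel"] == "paid_search", "net_revenue"]

# A notebook cell displays only its last expression, so the output below is
# group_b (paid_search); wrap each call in print() to see both summaries.
group_a.describe()
group_b.describe()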

count    1556.000000
mean      156.831112
std        98.660399
min         9.930000
25%        78.960000
50%       139.310000
75%       215.907500
max       564.310000
Name: net_revenue, dtype: float64

 "The question"

## Hypothesis

    - **Null hypothesis (H₀):** The mean net revenue per order is the same for both acquisition channels.
    - **Alternative hypothesis (H₁):** The mean net revenue per order differs between channels.

Cell 4 (confidence intervals)

    - Building a 95% confidence interval for each group's mean.

In [11]:
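def mean_ci(x, alpha=0.05):
    """Normal-approximation confidence interval for the mean of x."""
    mean = x.mean()
    std = x.std(ddof=1)  # sample standard deviation
    n = len(x)
    # Hard-coded critical value for 95%; alpha is not used yet, so changing it has no effect.
    z = 1.96
    margin = z * std / np.sqrt(n)  # z times the standard error of the mean
    return mean, mean - margin, mean + margin

mean_a, lo_a, hi_a = mean_ci(group_a)
mean_b, lo_b, hi_b = mean_ci(group_b)

(mean_a, lo_a, hi_a), (mean_b, lo_b, hi_b)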

((np.float64(151.94065521258008),
  np.float64(148.698626058195),
  np.float64(155.18268436696516)),
 (np.float64(156.8311118251928),
  np.float64(151.92887663569934),
  np.float64(161.73334701468627)))
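
The `mean_ci` helper above accepts `alpha` but always uses 1.96. A minimal sketch of a variant that actually honors `alpha` is shown here, using the Student's t critical value from `scipy.stats.t.ppf`. The function name `mean_ci_t` and the synthetic sample are assumptions for illustration, not part of the original notebook.

```python
import numpy as np
from scipy import stats


def mean_ci_t(x, alpha=0.05):
    """Two-sided (1 - alpha) confidence interval for the mean, using Student's t."""
    x = np.asarray(x, dtype=float)
    n = x.size
    mean = x.mean()
    sem = x.std(ddof=1) / np.sqrt(n)               # standard error of the mean
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)  # critical value for the chosen alpha
    margin = t_crit * sem
    return mean, mean - margin, mean + margin


# Synthetic, right-skewed sample standing in for revenue data (made up).
rng = np.random.default_rng(0)
sample = rng.gamma(shape=2.5, scale=60.0, size=500)

m, lo95, hi95 = mean_ci_t(sample, alpha=0.05)
_, lo99, hi99 = mean_ci_t(sample, alpha=0.01)
print(f"95% CI: ({lo95:.2f}, {hi95:.2f})")
print(f"99% CI: ({lo99:.2f}, {hi99:.2f})")
```

For groups with thousands of rows the t and normal critical values are nearly identical, so the intervals above would barely move; the difference matters mainly for small samples.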

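Interpretation Rule

      - If the intervals overlap a lot --> weak evidence of a difference.
      - If the intervals are well separated --> strong evidence of a difference.
      - Here organic is roughly (148.70, 155.18) and paid_search roughly (151.93, 161.73); they overlap, so the evidence is weak.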
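
Eyeballing overlap between two separate intervals is only a rough check. A more direct sketch is a single confidence interval for the difference in means: if it contains 0, the data are compatible with no difference. The helper name `diff_mean_ci` and the synthetic groups are assumptions for illustration; the standard error and degrees of freedom follow the Welch (unequal variance) approach.

```python
import numpy as np
from scipy import stats


def diff_mean_ci(a, b, alpha=0.05):
    """Welch confidence interval for mean(a) - mean(b)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    va = a.var(ddof=1) / a.size  # squared standard error of mean(a)
    vb = b.var(ddof=1) / b.size  # squared standard error of mean(b)
    diff = a.mean() - b.mean()
    se = np.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    dof = (va + vb) ** 2 / (va ** 2 / (a.size - 1) + vb ** 2 / (b.size - 1))
    t_crit = stats.t.ppf(1 - alpha / 2, dof)
    return diff, diff - t_crit * se, diff + t_crit * se


# Synthetic groups (made up), loosely on the scale of the notebook's data.
rng = np.random.default_rng(42)
a = rng.normal(150.0, 100.0, size=3000)
b = rng.normal(155.0, 100.0, size=1500)

diff, lo_d, hi_d = diff_mean_ci(a, b)
print(f"difference = {diff:.2f}, 95% CI = ({lo_d:.2f}, {hi_d:.2f})")
```

This interval and the Welch t-test answer the same question: the 95% interval excludes 0 exactly when the two-sided p-value is below 0.05.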


Cell 6 (two-sample Welch t-test, the "formal test")

In [12]:
from scipy import stats

# equal_var=False selects Welch's t-test, which does not assume equal variances.
t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=False)
t_stat, p_value


(np.float64(-1.6309009079195809), np.float64(0.10301774249793194))
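
To see what `ttest_ind(..., equal_var=False)` computes, here is a rough by-hand sketch of the Welch statistic and its two-sided p-value, checked against SciPy. The helper name `welch_t` and the synthetic data are assumptions for illustration only.

```python
import numpy as np
from scipy import stats


def welch_t(a, b):
    """Welch t statistic, degrees of freedom, and two-sided p-value."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    va = a.var(ddof=1) / a.size
    vb = b.var(ddof=1) / b.size
    t = (a.mean() - b.mean()) / np.sqrt(va + vb)
    dof = (va + vb) ** 2 / (va ** 2 / (a.size - 1) + vb ** 2 / (b.size - 1))
    p = 2 * stats.t.sf(abs(t), dof)  # two-sided tail probability
    return t, dof, p


# Synthetic groups (made up) with different sizes and spreads.
rng = np.random.default_rng(7)
x = rng.normal(150.0, 95.0, size=800)
y = rng.normal(157.0, 100.0, size=400)

t_manual, dof_manual, p_manual = welch_t(x, y)
t_scipy, p_scipy = stats.ttest_ind(x, y, equal_var=False)
print(t_manual, p_manual)
print(t_scipy, p_scipy)
```

The sign of the statistic follows the order of the arguments: a negative value, as in the output above, means the first group's mean is lower than the second's.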

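How to read p-values

     - p-value < 0.05 --> statistically significant difference (reject the null at the 5% level)
     - p-value ≥ 0.05 --> cannot reject the null
     - Here p ≈ 0.103, so the null is not rejected: the data do not show a significant difference between organic and paid_search.

     "This does not mean 'true or false'; it measures the strength of evidence against the null."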

Cell 7 (why significance ≠ usefulness)

In [13]:
# Raw difference in means (organic minus paid_search), in revenue units.
effect_size = mean_a - mean_b
effect_size

np.float64(-4.890456612612724)

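Compare

    - Effect size magnitude
    - Variability of the data

    Here the difference is about -4.89 against a standard deviation of roughly 98.66 for paid_search, so the effect is small relative to the spread.
    A tiny but "significant" effect can be useless in practice. This is quant-level skepticism.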
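
One common way to put the effect size and the variability on the same scale is a standardized effect size such as Cohen's d (mean difference divided by the pooled standard deviation). This is a sketch of that swapped-in measure, not something the notebook computes; the helper name `cohens_d` and the toy arrays are assumptions.

```python
import numpy as np


def cohens_d(a, b):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    na, nb = a.size, b.size
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)


# Toy data (made up): means differ by 1, each sample has variance 2.5.
toy_a = [1, 2, 3, 4, 5]
toy_b = [2, 3, 4, 5, 6]
d = cohens_d(toy_a, toy_b)
print(round(d, 4))
```

Conventional rules of thumb treat |d| around 0.2 as small, 0.5 as medium, and 0.8 as large, though what counts as useful still depends on the business context.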

## Day 6 - Hypothesis Testing Insights

   - Confidence intervals provide an intuitive range of plausible values for the mean.
   - Statistical significance depends on both the effect size and the sample size.
   - A low p-value does not guarantee practical importance.
   - Hypothesis testing helps distinguish signal from noise but must be paired with domain judgement.

   These concepts are critical for validating strategies and avoiding overfitting. 
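
The point about sample size can be illustrated with a small sketch: hold the mean difference and standard deviation fixed, and vary only the number of observations per group. The numbers below (a difference of 5 on a standard deviation of 100) are made up and only loosely echo the scale of this notebook's data. `scipy.stats.ttest_ind_from_stats` runs the test from summary statistics alone.

```python
from scipy import stats

base_mean = 150.0  # hypothetical group mean
diff = 5.0         # hypothetical fixed difference in means
sd = 100.0         # hypothetical standard deviation in both groups

p_values = {}
for n in (100, 1_000, 10_000, 100_000):
    result = stats.ttest_ind_from_stats(
        mean1=base_mean + diff, std1=sd, nobs1=n,
        mean2=base_mean, std2=sd, nobs2=n,
        equal_var=False,  # Welch's t-test, as in the cell above
    )
    p_values[n] = result.pvalue
    print(f"n per group = {n:>7}: p = {result.pvalue:.4g}")
```

The effect never changes, yet the p-value shrinks as n grows. That is why a significant result on a very large sample still needs a separate check on whether the effect is big enough to matter.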