# A/B Experiment Evaluation Utility

#### `Purpose`
Runs a two-proportion z-test to compare conversion rates between control and
treatment marketing cohorts. The script requires at least two variants, guards
against zero-impression groups, and can pick the two variants automatically by
impression volume, which limits manual bias in variant selection.
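
For reference, a pooled two-proportion z-test is usually written as below. This is the standard textbook form, stated here as an aid to reading the results rather than as a description of library internals. The symbols are notation introduced for this note: $x_A, n_A$ are the conversions and impressions of the control group and $x_B, n_B$ those of the treatment group, mirroring `conv_A`, `n_A`, `conv_B`, `n_B` in the code.

$$
\hat{p}_A = \frac{x_A}{n_A}, \qquad
\hat{p}_B = \frac{x_B}{n_B}, \qquad
\hat{p} = \frac{x_A + x_B}{n_A + n_B}
$$

$$
z = \frac{\hat{p}_B - \hat{p}_A}{\sqrt{\hat{p}\,(1 - \hat{p})\left(\dfrac{1}{n_A} + \dfrac{1}{n_B}\right)}},
\qquad
p\text{-value} = 2\,\bigl(1 - \Phi(|z|)\bigr)
$$

Here $\Phi$ is the standard normal CDF, and the two-sided p-value corresponds to `alternative='two-sided'` in the script.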

#### `Context`
This component adds statistical validation to the campaign performance
analytics pipeline, so that variant decisions are repeatable and rest on
measured uplift rather than directional assumptions.


In [None]:
import psycopg2
import pandas as pd
import numpy as np
from statsmodels.stats.proportion import proportions_ztest
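import os
import sys

# Defining database connection settings
# Purpose: read connection parameters from environment variables so credentials
# need not be hardcoded; the defaults target a local development database
DB_NAME = os.environ.get("DB_NAME", "sql_project_p10")
DB_USER = os.environ.get("DB_USER", "postgres")
DB_PASSWORD = os.environ.get("DB_PASSWORD", "password")
DB_HOST = os.environ.get("DB_HOST", "localhost")
DB_PORT = os.environ.get("DB_PORT", "5432")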

def main():
    print("Starting A/B test script")

    # Establishing connection to PostgreSQL database
    try:
        conn = psycopg2.connect(
            dbname=DB_NAME,
            user=DB_USER,
            password=DB_PASSWORD,
            host=DB_HOST,
            port=DB_PORT
        )
    except Exception as e:
        print("ERROR: could not connect to Database. Check credentials.")
        print(e)
        sys.exit(1)

    # Reading variant-level counts from the ab_counts table
    try:
        df = pd.read_sql("SELECT * FROM ab_counts ORDER BY impressions DESC;", conn)
    except Exception as e:
        print("ERROR: failed to read ab_counts table. Make sure table exists.")
        print(e)
        conn.close()
        sys.exit(1)
    
    # Ensuring minimum sample for A/B testing to enforce requirement of at least two experiment groups
    if df.shape[0] < 2:
        print("Not enough variants in ab_counts. Need at least two rows.")
        conn.close()
        sys.exit(1)

    print("\nFound the Top 10 variants:")
    print(df[['variant','conversions','impressions','n_campaigns']].head(10).to_string(index=False))

    # Requesting control and treatment variant names to allow user-driven experiment configuration or fallback to automatic selection
    print("\nType the exact name of the CONTROL variant (or hit Enter to use 2nd top):")
    control = input("CONTROL > ").strip()
    print("Type the exact name of the TREATMENT variant (or hit Enter to use top):")
    treatment = input("TREATMENT > ").strip()

    # If either name is left blank, picking the top two by impressions
    if control == "" or treatment == "":
        print("No variants provided, using the top two variants by impressions...")
        treatment_row = df.iloc[0]
        control_row = df.iloc[1]
    else:
        # finding rows by name (case-sensitive match)
        if treatment not in df['variant'].values or control not in df['variant'].values:
            print("One or both provided variant names not found. Please copy exact variant names from the list above.")
            conn.close()
            sys.exit(1)
        treatment_row = df[df['variant'] == treatment].iloc[0]
        control_row = df[df['variant'] == control].iloc[0]

    # Printing the selected experiment groups
    print(f"\nTesting CONTROL='{control_row['variant']}' vs TREATMENT='{treatment_row['variant']}'")
    print(f"Control: conversions={int(control_row['conversions'])}, impressions={int(control_row['impressions'])}")
    print(f"Treat  : conversions={int(treatment_row['conversions'])}, impressions={int(treatment_row['impressions'])}")

    # Preventing invalid test conditions
    conv_A = int(control_row['conversions'])
    n_A = int(control_row['impressions'])
    conv_B = int(treatment_row['conversions'])
    n_B = int(treatment_row['impressions'])

    if n_A == 0 or n_B == 0:
        print("WARNING: one group has zero impressions; test will be unreliable.")
        # avoiding division by zero
        n_A = max(1, n_A)
        n_B = max(1, n_B)

    # Enforcing logical bounds on conversions to maintain data integrity by preventing impossible conversion counts

    conv_A = min(conv_A, n_A)
    conv_B = min(conv_B, n_B)

    # Executing two-proportion z-test for conversion rate difference
    # Purpose: statistically evaluate performance gap between variants

    try:
        count = np.array([conv_B, conv_A])
        nobs = np.array([n_B, n_A])
        stat, pval = proportions_ztest(count, nobs, alternative='two-sided')
    except Exception as e:
        print("ERROR: z-test failed. Details:")
        print(e)
        conn.close()
        sys.exit(1)

    # Computing performance metrics and lift
    # Purpose: quantify directional improvement and scale of treatment impact

    rate_A = conv_A / n_A
    rate_B = conv_B / n_B
    abs_lift = rate_B - rate_A
    rel_lift = (abs_lift / rate_A) if rate_A != 0 else float('nan')

    # Displaying experiment outcome
    print("\n--- A/B Test Results ---")
    print(f"Control rate  = {rate_A:.4%}")
    print(f"Treatment rate = {rate_B:.4%}")
    print(f"Absolute lift (B - A) = {abs_lift:.4%}")
    print(f"Relative lift = {rel_lift:.2%}")
    print(f"z-statistic = {stat:.4f}")
    print(f"p-value = {pval:.6f}")

    # Applying significance threshold to draw inference
    # Purpose: determine whether observed uplift is statistically credible

    alpha = 0.05
    if pval < alpha:
        print(f"\n=> Verdict: There is a statistically significant difference (p < {alpha}).")
    else:
        print(f"\n=> Verdict: No statistically significant difference (p >= {alpha}).")

    # Closing Database Connection
    print("\nCompleted. Close the database connection and exit.")
    conn.close()

# Executing main program entrypoint
if __name__ == "__main__":
    main()


Starting A/B test script

Found the Top 10 variants:
     variant  conversions  impressions  n_campaigns
      Search       157625    221415139        40157
  Influencer       158468    220769081        40169
       Email       156882    220144927        39870
     Display       157063    220074756        39987
Social Media       156885    219056401        39817

Type the exact name of the CONTROL variant (or hit Enter to use 2nd top):
Type the exact name of the TREATMENT variant (or hit Enter to use top):
No variants provided, using the top two variants by impressions...

Testing CONTROL='Influencer' vs TREATMENT='Search'
Control: conversions=158468, impressions=220769081
Treat  : conversions=157625, impressions=221415139

--- A/B Test Results ---
Control rate  = 0.0718%
Treatment rate = 0.0712%
Absolute lift (B - A) = -0.0006%
Relative lift = -0.82%
z-statistic = -2.3217
p-value = 0.020250

=> Verdict: There is a statistically significant difference (p < 0.05).

Completed. Close the database connection and exit.


This notebook performs a controlled experiment analysis to determine whether differences in marketing campaign performance are statistically significant. Campaign variants are retrieved from a PostgreSQL table (`ab_counts`) containing impressions and conversions aggregated by channel/type.

The procedure includes:
- Establishing a secure database connection
- Retrieving experimental groups and validating data availability
- Selecting control and treatment variants (user-specified or automatically based on highest impressions)
- Computing conversion rates and lift metrics
- Conducting a two-proportion Z-test to assess whether observed performance differences are statistically meaningful
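
The rate and lift step in the list above can be sketched as a small standalone helper. This is a minimal illustration, not the notebook's own code: `lift_metrics` is a hypothetical name, and unlike the script (which clamps a zero impression count to 1) this sketch assumes it is preferable to reject groups with no impressions outright. The counts are the ones shown in the recorded run.

```python
import math


def lift_metrics(conv_a, n_a, conv_b, n_b):
    """Return (rate_a, rate_b, abs_lift, rel_lift) for control A and treatment B."""
    if n_a <= 0 or n_b <= 0:
        raise ValueError("each group needs at least one impression")
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    abs_lift = rate_b - rate_a
    # Relative lift is undefined when the control rate is zero.
    rel_lift = abs_lift / rate_a if rate_a != 0 else math.nan
    return rate_a, rate_b, abs_lift, rel_lift


# Counts from the recorded run: Influencer (control) vs Search (treatment).
rate_a, rate_b, abs_lift, rel_lift = lift_metrics(
    158468, 220769081, 157625, 221415139
)
print(f"Control rate   = {rate_a:.4%}")
print(f"Treatment rate = {rate_b:.4%}")
print(f"Absolute lift  = {abs_lift:.4%}")
print(f"Relative lift  = {rel_lift:.2%}")
```

Absolute lift is a difference in rates (percentage points), while relative lift scales that difference by the control rate, which is why the two figures differ by orders of magnitude when conversion rates are this small.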

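The null hypothesis assumes equal conversion rates between the two variants. The test is two-sided, so a p-value below 0.05 is taken as sufficient evidence to reject the null hypothesis and conclude that the rates differ, in either direction. In the recorded run, p = 0.0203 with a relative lift of -0.82%, so the treatment (Search) converts significantly worse than the control (Influencer), not better.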
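
As a sanity check on the reported statistic, the test can be recomputed by hand from the counts in the recorded run. The sketch below uses only the standard library and the usual pooled-variance form of the two-proportion z-test. `pooled_two_proportion_ztest` is a hypothetical helper written for this illustration, not the `statsmodels` routine the notebook calls, so small numerical differences are possible.

```python
import math


def pooled_two_proportion_ztest(conv_b, n_b, conv_a, n_a):
    """Two-sided pooled z-test for H0: rate_b == rate_a. Returns (z, p_value)."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Under H0 both groups share one rate, estimated from the combined data.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("pooled rate is 0 or 1; the z-statistic is undefined")
    z = (rate_b - rate_a) / std_err
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Treatment = Search, control = Influencer, as in the recorded run.
z, p_value = pooled_two_proportion_ztest(157625, 221415139, 158468, 220769081)
print(f"z-statistic = {z:.4f}")
print(f"p-value = {p_value:.6f}")
```

The negative sign of the z-statistic comes from ordering the groups as treatment minus control, the same order the script uses when it builds `count` and `nobs`.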