# Week 7: Statistics Foundations for Marketing Analytics

**Goal:** Master statistical concepts to make data-driven decisions with confidence.

**Time Commitment:** ~1 hour per day × 7 days = 7 hours total

**What You'll Learn:**
- Probability fundamentals and distributions
- Confidence intervals and margin of error
- Hypothesis testing framework
- T-tests for comparing campaigns
- Chi-square tests for categorical data
- Correlation vs causation
- Statistical analysis of marketing experiments

**Why This Matters:**
As a Marketing Measurement Partner, you need to answer questions like:
- Is Campaign A truly better than Campaign B, or just lucky?
- What's the probability our ROAS will exceed 3.0 next month?
- How confident are we in this conversion rate improvement?
- Did our creative change cause the performance increase?
- Is this week's drop statistically significant or normal variance?

Statistics separates signal from noise, enabling confident decision-making.

---

## Setup: Load Libraries and Data

We'll work with marketing campaign data and add statistical analysis.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
from datetime import datetime, timedelta
import warnings
warnings.filterwarnings('ignore')

# Set defaults
pd.set_option('display.max_columns', None)
pd.set_option('display.precision', 4)
plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette('Set2')

print("✅ Libraries imported successfully!")

In [None]:
# Generate marketing dataset
np.random.seed(42)

dates = pd.date_range('2024-10-01', periods=90, freq='D')
campaigns = ['Search_Brand', 'Search_Generic', 'FB_Prospecting', 'FB_Retargeting',
             'Display_Remarketing', 'Instagram_Stories', 'LinkedIn_B2B', 
             'TikTok_Awareness', 'YouTube_Video', 'Search_Competitor']
channels = ['Google', 'Google', 'Meta', 'Meta', 'Google', 'Meta', 'LinkedIn', 'TikTok', 'Google', 'Google']
campaign_types = ['Search', 'Search', 'Social', 'Social', 'Display', 'Social', 'Social', 'Video', 'Video', 'Search']

data = []
for i, campaign in enumerate(campaigns):
    for date in dates:
        dow_multiplier = 1.3 if date.dayofweek < 5 else 0.7
        
        if campaign_types[i] == 'Search':
            impressions = int(np.random.normal(5000, 1000) * dow_multiplier)
            ctr = np.random.normal(0.05, 0.01)
            cvr = np.random.normal(0.04, 0.008)
        elif campaign_types[i] == 'Social':
            impressions = int(np.random.normal(25000, 5000) * dow_multiplier)
            ctr = np.random.normal(0.03, 0.008)
            cvr = np.random.normal(0.025, 0.006)
        elif campaign_types[i] == 'Display':
            impressions = int(np.random.normal(40000, 8000) * dow_multiplier)
            ctr = np.random.normal(0.015, 0.004)
            cvr = np.random.normal(0.02, 0.005)
        else:  # Video
            impressions = int(np.random.normal(50000, 10000) * dow_multiplier)
            ctr = np.random.normal(0.02, 0.005)
            cvr = np.random.normal(0.015, 0.004)
        
        clicks = int(impressions * max(0.001, ctr))
        conversions = int(clicks * max(0.001, cvr))
        cpc = np.random.uniform(0.5, 3.0) if campaign_types[i] == 'Search' else np.random.uniform(0.3, 1.5)
        cost = clicks * cpc
        aov = np.random.uniform(80, 200)
        revenue = conversions * aov
        
        data.append({
            'date': date,
            'campaign': campaign,
            'channel': channels[i],
            'campaign_type': campaign_types[i],
            'impressions': max(0, impressions),
            'clicks': max(0, clicks),
            'conversions': max(0, conversions),
            'cost': max(0, cost),
            'revenue': max(0, revenue)
        })

df = pd.DataFrame(data)

# Calculate metrics
df['ctr'] = df['clicks'] / df['impressions']
df['cvr'] = df['conversions'] / df['clicks'].replace(0, np.nan)
df['cpa'] = df['cost'] / df['conversions'].replace(0, np.nan)
df['roas'] = df['revenue'] / df['cost'].replace(0, np.nan)
df['cpc'] = df['cost'] / df['clicks'].replace(0, np.nan)

print(f"✅ Dataset loaded: {len(df)} rows, {len(campaigns)} campaigns")

## 📅 Day 43: Probability Basics (~60 min)

### Learning Objectives
- Understand probability fundamentals
- Work with probability distributions
- Calculate expected values
- Apply probability to marketing decisions

### The Business Problem
Marketing is inherently probabilistic:
- What's the probability a user will click?
- What's the expected revenue from 1000 impressions?
- How likely is it our campaign will hit targets?

### 📖 Concept: Basic Probability

Probability measures the likelihood of events (0 to 1, or 0% to 100%).

In [None]:
# Empirical probability from data
campaign_data = df[df['campaign'] == 'Search_Brand']

# What's the probability of getting >15 conversions in a day?
total_days = len(campaign_data)
high_conversion_days = len(campaign_data[campaign_data['conversions'] > 15])
probability = high_conversion_days / total_days

print(f"Probability Analysis: Search_Brand Campaign")
print(f"Total days observed: {total_days}")
print(f"Days with >15 conversions: {high_conversion_days}")
print(f"Probability of >15 conversions: {probability:.2%}")

# Expected value (mean)
expected_conversions = campaign_data['conversions'].mean()
print(f"\nExpected daily conversions: {expected_conversions:.2f}")
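
One of the business questions above, the expected revenue from 1,000 impressions, can be sketched by chaining probabilities. The click rate, conversion rate, and order value below are illustrative assumptions, not values read from the dataset.

```python
# Illustrative assumptions (hypothetical, not taken from the dataset)
n_impressions = 1000
p_click = 0.05            # assumed probability that an impression is clicked
p_convert = 0.04          # assumed probability that a click converts
avg_order_value = 120.0   # assumed revenue per conversion, in dollars

# Probability that a single impression ends in a conversion
p_conv_per_impression = p_click * p_convert

# Expected counts and revenue scale linearly with the number of impressions
exp_clicks = n_impressions * p_click
exp_conversions = n_impressions * p_conv_per_impression
exp_revenue = exp_conversions * avg_order_value

print(f"P(conversion per impression): {p_conv_per_impression:.4f}")
print(f"Expected clicks: {exp_clicks:.1f}")
print(f"Expected conversions: {exp_conversions:.2f}")
print(f"Expected revenue: ${exp_revenue:.2f}")
```

These are long-run averages: any single batch of 1,000 impressions will land above or below them.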

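### 📖 Concept: Normal Distribution

Many marketing metrics are approximately normal (bell-shaped), especially averages over many days. Ratio metrics such as ROAS are often right-skewed, so treat the normal curve as an approximation and compare it against the histogram.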

In [None]:
# Analyze ROAS distribution
roas_data = df['roas'].dropna()
mean_roas = roas_data.mean()
std_roas = roas_data.std()

print(f"ROAS Distribution:")
print(f"Mean: {mean_roas:.2f}")
print(f"Std Dev: {std_roas:.2f}")

# Using normal distribution, what's probability of ROAS > 4.0?
z_score = (4.0 - mean_roas) / std_roas
prob_above_4 = 1 - stats.norm.cdf(z_score)

print(f"\nProbability of ROAS > 4.0: {prob_above_4:.2%}")
print(f"(Z-score: {z_score:.2f})")

# Visualize
plt.figure(figsize=(10, 6))
plt.hist(roas_data, bins=30, density=True, alpha=0.7, edgecolor='black', label='Actual ROAS')

# Overlay normal distribution
x = np.linspace(roas_data.min(), roas_data.max(), 100)
plt.plot(x, stats.norm.pdf(x, mean_roas, std_roas), 'r-', linewidth=2, label='Normal Distribution')
plt.axvline(4.0, color='green', linestyle='--', linewidth=2, label='Target ROAS = 4.0')

plt.title('ROAS Distribution with Normal Curve Overlay', fontweight='bold')
plt.xlabel('ROAS')
plt.ylabel('Density')
plt.legend()
plt.tight_layout()
plt.show()
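
A quick way to judge the normal approximation is to compare it with the empirical share of observations above the threshold. The sketch below uses simulated, deliberately right-skewed values (a hypothetical lognormal sample, not the course dataset) to show how far the two estimates can drift apart.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical right-skewed ROAS values, for illustration only
sim_roas = rng.lognormal(mean=1.0, sigma=0.5, size=5000)
roas_threshold = 4.0

# Empirical estimate: share of simulated days above the threshold
empirical_prob = (sim_roas > roas_threshold).mean()

# Normal approximation built from the sample mean and standard deviation
normal_prob = stats.norm.sf(
    roas_threshold, loc=sim_roas.mean(), scale=sim_roas.std(ddof=1)
)

sample_skew = stats.skew(sim_roas)  # a value above 0 indicates a right tail

print(f"Skewness: {sample_skew:.2f}")
print(f"Empirical P(ROAS > {roas_threshold}): {empirical_prob:.2%}")
print(f"Normal-approx P(ROAS > {roas_threshold}): {normal_prob:.2%}")
```

When the two numbers disagree noticeably, the empirical proportion is usually the safer estimate to report.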

### 📖 Concept: Expected Value and Risk

Expected value = sum of (probability × outcome) over all possible outcomes

In [None]:
# Scenario: Should we increase budget on this campaign?
current_daily_spend = 500
current_avg_conversions = 15
conversion_value = 100

# Option 1: Keep current budget
option1_expected_revenue = current_avg_conversions * conversion_value
option1_expected_profit = option1_expected_revenue - current_daily_spend

# Option 2: Increase budget 50% (assume conversions increase 30% due to diminishing returns)
option2_daily_spend = current_daily_spend * 1.5
option2_expected_conversions = current_avg_conversions * 1.3
option2_expected_revenue = option2_expected_conversions * conversion_value
option2_expected_profit = option2_expected_revenue - option2_daily_spend

print("Budget Decision Analysis:")
print(f"\nOption 1 (Current):")
print(f"  Spend: ${current_daily_spend}")
print(f"  Expected Revenue: ${option1_expected_revenue:.2f}")
print(f"  Expected Profit: ${option1_expected_profit:.2f}")

print(f"\nOption 2 (Increase 50%):")
print(f"  Spend: ${option2_daily_spend}")
print(f"  Expected Revenue: ${option2_expected_revenue:.2f}")
print(f"  Expected Profit: ${option2_expected_profit:.2f}")

print(f"\nRecommendation: {'INCREASE BUDGET' if option2_expected_profit > option1_expected_profit else 'KEEP CURRENT'}")
print(f"Expected additional profit: ${option2_expected_profit - option1_expected_profit:.2f}/day")
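
The budget comparison above uses a single point estimate per option. When several outcomes are possible, expected value is the probability-weighted sum, and the spread around it is a simple measure of risk. The scenarios and probabilities below are hypothetical.

```python
# Hypothetical daily scenarios: name -> (probability, profit in dollars)
profit_scenarios = {
    "strong day": (0.25, 1800.0),
    "typical day": (0.55, 1000.0),
    "weak day": (0.20, -200.0),
}

# Probabilities of mutually exclusive, exhaustive scenarios must sum to 1
total_prob = sum(p for p, _ in profit_scenarios.values())

# Expected value: probability-weighted average of the outcomes
scenario_ev = sum(p * v for p, v in profit_scenarios.values())

# Variance: probability-weighted squared deviation from the expected value
scenario_var = sum(p * (v - scenario_ev) ** 2 for p, v in profit_scenarios.values())
scenario_std = scenario_var ** 0.5

print(f"Total probability: {total_prob:.2f}")
print(f"Expected daily profit: ${scenario_ev:.2f}")
print(f"Std dev of daily profit: ${scenario_std:.2f}")
```

Two options with the same expected profit can carry very different standard deviations, which is why risk belongs next to the expected value in any recommendation.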

### ✏️ Exercise 1: Probability Analysis

In [None]:
# YOUR CODE HERE
# For the FB_Prospecting campaign:
# 1. Calculate the probability of daily ROAS > 3.5
# 2. Calculate the probability of daily conversions between 10 and 20
# 3. What's the expected daily revenue?
# 4. Calculate 68% probability range for conversions (mean ± 1 std dev)
#    (This is the range we expect conversions to fall 68% of the time)



### 🎯 Day 43 Mini-Project: Risk Assessment

Assess the risk and expected value of different campaign strategies.

In [None]:
# YOUR CODE HERE
# You're deciding between two strategies:
# 
# Strategy A: Conservative
# - Focus on Search campaigns (more predictable, lower variance)
# - Expected daily ROAS: mean of Search campaigns
# - Standard deviation: std of Search campaigns
#
# Strategy B: Aggressive  
# - Focus on Social campaigns (higher potential, higher variance)
# - Expected daily ROAS: mean of Social campaigns
# - Standard deviation: std of Social campaigns
#
# For each strategy:
# 1. Calculate expected ROAS
# 2. Calculate probability of ROAS < 2.0 (loss scenario)
# 3. Calculate probability of ROAS > 5.0 (win scenario)
# 4. Visualize the distributions
# 5. Make a recommendation based on risk tolerance



### 🎓 Day 43 Key Takeaways

✅ Probability quantifies uncertainty  
✅ Expected value = average outcome over many trials  
✅ Normal distribution describes many marketing metrics  
✅ Z-scores measure standard deviations from mean  
✅ Risk assessment uses probability distributions  

**Next:** Tomorrow we'll learn confidence intervals!

---

## 📅 Day 44: Confidence Intervals (~60 min)

### Learning Objectives
- Understand confidence intervals
- Calculate confidence intervals for means
- Interpret confidence levels (90%, 95%, 99%)
- Apply to marketing metrics

### The Business Problem
Sample statistics (from your data) estimate population parameters (true values):
- Our campaign's true CTR is unknown - we only have a sample
- Confidence intervals express uncertainty in our estimates
- "We're 95% confident the true ROAS is between 3.2 and 3.8"

### 📖 Concept: Confidence Interval for Mean

CI = mean ± (critical value × standard error)

In [None]:
# Calculate 95% confidence interval for Search_Brand ROAS
campaign_data = df[df['campaign'] == 'Search_Brand']['roas'].dropna()

n = len(campaign_data)
mean = campaign_data.mean()
std_error = campaign_data.sem()  # Standard error of mean
confidence_level = 0.95

# Calculate confidence interval
ci = stats.t.interval(confidence_level, n-1, mean, std_error)

print(f"Search_Brand ROAS Confidence Interval (95%):")
print(f"Sample size: {n} days")
print(f"Mean ROAS: {mean:.3f}")
print(f"Standard Error: {std_error:.3f}")
print(f"\n95% Confidence Interval: [{ci[0]:.3f}, {ci[1]:.3f}]")
print(f"\nInterpretation: We are 95% confident the true mean ROAS")
print(f"for this campaign is between {ci[0]:.2f} and {ci[1]:.2f}")
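
To see what `stats.t.interval` computes, the formula can be applied by hand and compared with the SciPy result. This is a minimal sketch on a simulated sample (hypothetical ROAS values), not the course data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
roas_sample = rng.normal(loc=3.5, scale=0.8, size=90)  # hypothetical daily ROAS

n_obs = roas_sample.size
sample_mean = roas_sample.mean()
sample_se = roas_sample.std(ddof=1) / np.sqrt(n_obs)   # standard error of the mean

# Two-sided 95% critical value from the t distribution with n - 1 degrees of freedom
t_crit = stats.t.ppf(0.975, df=n_obs - 1)
margin_of_error = t_crit * sample_se
manual_ci = (sample_mean - margin_of_error, sample_mean + margin_of_error)

# Same interval via SciPy
scipy_ci = stats.t.interval(0.95, n_obs - 1, loc=sample_mean, scale=sample_se)

print(f"Critical value: {t_crit:.3f}")
print(f"Margin of error: {margin_of_error:.3f}")
print(f"Manual CI: [{manual_ci[0]:.3f}, {manual_ci[1]:.3f}]")
print(f"SciPy CI:  [{scipy_ci[0]:.3f}, {scipy_ci[1]:.3f}]")
```

The margin of error is the half-width of the interval, which is the number usually quoted alongside a point estimate.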

### 📖 Concept: Different Confidence Levels

Higher confidence = wider interval

In [None]:
# Compare 90%, 95%, and 99% confidence intervals
confidence_levels = [0.90, 0.95, 0.99]

print(f"Confidence Intervals for Search_Brand ROAS:")
print(f"Mean: {mean:.3f}\n")

for conf_level in confidence_levels:
    ci = stats.t.interval(conf_level, n-1, mean, std_error)
    width = ci[1] - ci[0]
    print(f"{conf_level*100:.0f}% CI: [{ci[0]:.3f}, {ci[1]:.3f}]  (width: {width:.3f})")

print("\nNote: Higher confidence = wider interval (a wider range is the price of being more sure)")

### 📖 Concept: Sample Size and Confidence

Larger samples = narrower confidence intervals

In [None]:
# Demonstrate effect of sample size
sample_sizes = [10, 30, 60, 90]

print("Effect of Sample Size on Confidence Interval Width:\n")
for sample_size in sample_sizes:
    sample_data = campaign_data.sample(min(sample_size, len(campaign_data)))
    ci = stats.t.interval(0.95, len(sample_data)-1, sample_data.mean(), sample_data.sem())
    width = ci[1] - ci[0]
    print(f"n={sample_size:3d}: CI width = {width:.3f}  [{ci[0]:.3f}, {ci[1]:.3f}]")

print("\nLarger sample → Narrower CI → More precise estimate")
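
Random subsamples can be noisy, so the same relationship is easier to see with the standard deviation held fixed: the half-width is the critical value times s/√n. The standard deviation below is an assumed value chosen for illustration.

```python
import numpy as np
from scipy import stats

assumed_std = 0.8  # hypothetical sample standard deviation of daily ROAS
half_widths = {}

for n_days in [10, 30, 90, 360]:
    t_crit_n = stats.t.ppf(0.975, df=n_days - 1)   # two-sided 95% critical value
    half_widths[n_days] = t_crit_n * assumed_std / np.sqrt(n_days)
    print(f"n={n_days:3d}: 95% CI half-width = {half_widths[n_days]:.3f}")

# Because of the square root, quadrupling n roughly halves the half-width
shrink_ratio = half_widths[90] / half_widths[360]
print(f"Half-width at n=90 divided by half-width at n=360: {shrink_ratio:.2f}")
```

The square-root relationship means precision gets more expensive as it improves: each halving of the interval needs about four times the data.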

### 📖 Concept: Visualizing Confidence Intervals

In [None]:
# Calculate 95% CI for each campaign's ROAS
campaign_cis = []

for campaign in df['campaign'].unique():
    campaign_roas = df[df['campaign'] == campaign]['roas'].dropna()
    if len(campaign_roas) > 2:
        mean = campaign_roas.mean()
        ci = stats.t.interval(0.95, len(campaign_roas)-1, mean, campaign_roas.sem())
        campaign_cis.append({
            'campaign': campaign,
            'mean': mean,
            'ci_lower': ci[0],
            'ci_upper': ci[1],
            'error': ci[1] - mean
        })

ci_df = pd.DataFrame(campaign_cis).sort_values('mean', ascending=False)

# Visualize
plt.figure(figsize=(12, 8))
y_pos = range(len(ci_df))
plt.errorbar(ci_df['mean'], y_pos, xerr=ci_df['error'], fmt='o', markersize=8, capsize=5)
plt.yticks(y_pos, ci_df['campaign'])
plt.axvline(3.0, color='red', linestyle='--', label='Target ROAS = 3.0')
plt.xlabel('ROAS')
plt.title('Campaign ROAS with 95% Confidence Intervals', fontweight='bold', fontsize=14)
plt.legend()
plt.grid(axis='x', alpha=0.3)
plt.tight_layout()
plt.show()

print("\nInterpretation: If a campaign's CI doesn't contain the target (3.0),")
print("we can be confident its mean ROAS differs from the target.")

### ✏️ Exercise 2: Confidence Intervals for Metrics

In [None]:
# YOUR CODE HERE
# For each channel (Google, Meta, LinkedIn, TikTok):
# 1. Calculate 95% CI for mean CPA
# 2. Calculate 95% CI for mean CTR
# 3. Create a visualization showing channels with error bars
# 4. Which channels have statistically different CPAs?
#    (Non-overlapping CIs indicate a difference; overlapping CIs are inconclusive)



### 🎯 Day 44 Mini-Project: Confidence in Recommendations

Use confidence intervals to make data-driven recommendations.

In [None]:
# YOUR CODE HERE
# Create a recommendation framework using confidence intervals:
#
# For each campaign:
# 1. Calculate 90% CI for ROAS
# 2. Classify campaigns:
#    - "Strong Performer": Entire CI above 4.0
#    - "Solid Performer": Entire CI above 3.0
#    - "Acceptable": Entire CI above 2.0  
#    - "Needs Improvement": CI overlaps with 2.0
#    - "Underperformer": Entire CI below 2.0
# 3. Provide specific recommendations with confidence levels
# 4. Visualize the classification
#
# Example: "We are 90% confident Search_Brand ROAS is above 4.0.
#           Recommendation: SCALE this campaign."



### 🎓 Day 44 Key Takeaways

✅ Confidence intervals quantify estimate uncertainty  
✅ 95% CI: the procedure captures the true value in about 95% of repeated samples  
✅ Larger samples → narrower CIs → more precision  
✅ Non-overlapping CIs suggest real differences  
✅ Always report CIs with point estimates  

**Next:** Tomorrow we'll learn hypothesis testing!

---

## 📅 Day 45: Hypothesis Testing Intro (~60 min)

### Learning Objectives
- Understand null and alternative hypotheses
- Learn p-values and significance levels
- Understand Type I and Type II errors
- Make statistical decisions

### The Business Problem
Answer questions with a known error rate instead of gut feel:
- Did the new creative actually improve CTR?
- Is Campaign A truly better than Campaign B?
- Was this week's performance change significant or luck?

### 📖 Concept: Hypothesis Testing Framework

**Steps:**
1. State null hypothesis (H₀) and alternative (H₁)
2. Choose significance level (α, usually 0.05)
3. Calculate test statistic
4. Calculate p-value
5. Make decision: reject H₀ if p < α

In [None]:
# Example: Is Search_Brand ROAS significantly greater than 3.0?
campaign_roas = df[df['campaign'] == 'Search_Brand']['roas'].dropna()

# H₀: μ = 3.0 (ROAS is 3.0)
# H₁: μ > 3.0 (ROAS is greater than 3.0)
# α = 0.05 (5% significance level)

null_value = 3.0
alpha = 0.05

# One-sample t-test (one-sided)
t_statistic, p_value = stats.ttest_1samp(campaign_roas, null_value, alternative='greater')

print(f"Hypothesis Test: Is Search_Brand ROAS > 3.0?")
print(f"\nH₀: ROAS = 3.0")
print(f"H₁: ROAS > 3.0")
print(f"Significance level (α): {alpha}")
print(f"\nSample mean: {campaign_roas.mean():.3f}")
print(f"Sample size: {len(campaign_roas)}")
print(f"\nTest statistic (t): {t_statistic:.3f}")
print(f"P-value: {p_value:.4f}")
print(f"\nDecision: ", end="")
if p_value < alpha:
    print(f"REJECT H₀ (p={p_value:.4f} < {alpha})")
    print(f"Conclusion: ROAS is significantly greater than 3.0")
else:
    print(f"FAIL TO REJECT H₀ (p={p_value:.4f} >= {alpha})")
    print(f"Conclusion: Insufficient evidence that ROAS > 3.0")
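
The test statistic and p-value returned by `stats.ttest_1samp` can be reproduced by hand, which makes steps 3 and 4 of the framework concrete. The sample below is simulated (hypothetical ROAS values) so the sketch runs on its own.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
test_sample = rng.normal(loc=3.3, scale=0.9, size=90)  # hypothetical daily ROAS
mu0 = 3.0                                              # value claimed by the null hypothesis

n_test = test_sample.size
se_test = test_sample.std(ddof=1) / np.sqrt(n_test)

# Step 3: the test statistic is the distance from mu0 in standard-error units
t_manual = (test_sample.mean() - mu0) / se_test

# Step 4: one-sided p-value is the upper-tail area of the t distribution
p_manual = stats.t.sf(t_manual, df=n_test - 1)

# Same test via SciPy
t_scipy, p_scipy = stats.ttest_1samp(test_sample, mu0, alternative='greater')

print(f"Manual: t = {t_manual:.3f}, p = {p_manual:.4f}")
print(f"SciPy:  t = {t_scipy:.3f}, p = {p_scipy:.4f}")
```

For a two-sided test the p-value would instead be twice the tail area beyond the absolute value of t.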

### 📖 Concept: P-Value Interpretation

P-value = probability of observing data at least as extreme as ours if H₀ were true

In [None]:
print("P-Value Interpretation Guide:")
print("\np < 0.01:  Very strong evidence against H₀")
print("p < 0.05:  Strong evidence against H₀ (commonly used threshold)")
print("p < 0.10:  Moderate evidence against H₀")
print("p >= 0.10: Weak or no evidence against H₀")
print("\nRemember: p-value is NOT the probability H₀ is true!")
print("It's the probability of seeing data at least this extreme IF H₀ were true.")

### 📖 Concept: Type I and Type II Errors

- **Type I Error (α)**: Reject true H₀ (false positive)
- **Type II Error (β)**: Fail to reject false H₀ (false negative)

In [None]:
print("Error Types in Marketing Context:")
print("\nScenario: Testing if new creative improves CTR")
print("H₀: New creative has same CTR as old")
print("H₁: New creative has higher CTR")
print("\nType I Error (False Positive):")
print("  - Conclude new creative is better when it's actually not")
print("  - Risk: Scale bad creative, waste budget")
print("  - Probability: α (significance level, usually 5%)")
print("\nType II Error (False Negative):")
print("  - Fail to detect that new creative is actually better")
print("  - Risk: Miss opportunity to improve performance")
print("  - Probability: β (power = 1-β, usually aim for 80% power)")
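
Both error rates can be estimated by simulation: run the same test many times when the null is true, then again when it is false. The true means, standard deviation, and sample size below are hypothetical choices for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
sim_alpha = 0.05
n_sims, n_days_per_test = 2000, 30

# Case 1: H0 is true (true mean ROAS really is 3.0)
null_samples = rng.normal(loc=3.0, scale=1.0, size=(n_sims, n_days_per_test))
_, p_null = stats.ttest_1samp(null_samples, 3.0, axis=1)
type1_rate = (p_null < sim_alpha).mean()   # share of false positives

# Case 2: H0 is false (true mean ROAS is 3.5, but we still test against 3.0)
alt_samples = rng.normal(loc=3.5, scale=1.0, size=(n_sims, n_days_per_test))
_, p_alt = stats.ttest_1samp(alt_samples, 3.0, axis=1)
sim_power = (p_alt < sim_alpha).mean()     # share of correct rejections
type2_rate = 1 - sim_power                 # share of missed effects

print(f"Estimated Type I error rate: {type1_rate:.3f} (target {sim_alpha})")
print(f"Estimated power: {sim_power:.3f}")
print(f"Estimated Type II error rate: {type2_rate:.3f}")
```

The Type I rate should sit near the chosen significance level, while power depends on the effect size, the noise, and the sample size.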

### ✏️ Exercise 3: Hypothesis Tests

In [None]:
# YOUR CODE HERE
# Test these hypotheses:
# 
# 1. Is FB_Prospecting CPA significantly less than $30?
#    H₀: CPA = 30
#    H₁: CPA < 30
#
# 2. Is Display_Remarketing CTR significantly different from 2%?
#    H₀: CTR = 0.02
#    H₁: CTR ≠ 0.02 (two-sided test)
#
# For each test:
# - Calculate test statistic and p-value
# - Make decision at α=0.05
# - Write business interpretation



### 🎯 Day 45 Mini-Project: Campaign Performance Testing

Test multiple hypotheses about campaign performance.

In [None]:
# YOUR CODE HERE
# Systematic hypothesis testing for all campaigns:
#
# For each campaign, test:
# 1. H₀: ROAS ≤ 3.0 vs H₁: ROAS > 3.0
# 2. H₀: CPA ≥ 25 vs H₁: CPA < 25
#
# Create a summary table with:
# - Campaign name
# - Sample mean
# - P-value
# - Decision (Reject/Fail to Reject)
# - Recommendation
#
# Visualize the results
# Provide an executive summary of findings



### 🎓 Day 45 Key Takeaways

✅ Hypothesis testing provides statistical decision framework  
✅ P-value measures evidence against null hypothesis  
✅ α (significance level) controls Type I error rate  
✅ Never "accept" H₀, only fail to reject it  
✅ Always state hypotheses before looking at data  

**Next:** Tomorrow we'll learn t-tests to compare campaigns!

---

## 📅 Day 46-48: Advanced Statistical Tests

**Day 46: T-Tests**
- Independent samples t-test (compare two campaigns)
- Paired t-test (before/after comparisons)
- Assumptions and when to use
- Effect size (Cohen's d)
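
As a preview of the independent-samples t-test and Cohen's d listed above, a minimal sketch on simulated data might look like this. The two samples are hypothetical, and Welch's variant (`equal_var=False`) is one reasonable default rather than the only valid choice.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
roas_a = rng.normal(loc=3.6, scale=0.9, size=90)  # hypothetical Campaign A daily ROAS
roas_b = rng.normal(loc=3.2, scale=0.9, size=90)  # hypothetical Campaign B daily ROAS

# Welch's t-test: does not assume the two campaigns share a variance
t_ind, p_ind = stats.ttest_ind(roas_a, roas_b, equal_var=False)

# Cohen's d: mean difference expressed in pooled standard deviations
n_a, n_b = roas_a.size, roas_b.size
var_a, var_b = roas_a.var(ddof=1), roas_b.var(ddof=1)
pooled_sd = np.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
cohens_d = (roas_a.mean() - roas_b.mean()) / pooled_sd

print(f"t = {t_ind:.3f}, p = {p_ind:.4f}")
print(f"Cohen's d = {cohens_d:.2f}")
```

The p-value says whether a difference is detectable; Cohen's d says whether it is large enough to matter. A common rule of thumb treats 0.2 as small, 0.5 as medium, and 0.8 as large.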

**Day 47: Chi-Square Tests**
- Chi-square test for independence
- Testing categorical relationships
- Conversion rate comparisons
- Contingency tables
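
A conversion rate comparison via a contingency table could be sketched as follows. The counts are hypothetical, and `correction=False` turns off the Yates continuity correction, which is an assumption made here for simplicity.

```python
import numpy as np
from scipy import stats

# Hypothetical counts: rows = creative variant, columns = [converted, did not convert]
observed = np.array([
    [120, 2880],   # Variant A: 3000 clicks
    [165, 2835],   # Variant B: 3000 clicks
])

chi2_stat, chi2_p, chi2_dof, expected_counts = stats.chi2_contingency(
    observed, correction=False
)

# Conversion rate per variant = converted / total clicks in that row
conv_rates = observed[:, 0] / observed.sum(axis=1)

print(f"Conversion rates: A = {conv_rates[0]:.2%}, B = {conv_rates[1]:.2%}")
print(f"Chi-square = {chi2_stat:.3f}, dof = {chi2_dof}, p = {chi2_p:.4f}")
print("Expected counts if variant and conversion were independent:")
print(expected_counts)
```

The test compares the observed counts with the counts expected if conversion were unrelated to the variant.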

**Day 48: Correlation vs Causation**
- Understanding correlation limitations
- Confounding variables
- Simpson's paradox
- Establishing causality in marketing
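
Simpson's paradox is easiest to grasp with numbers. In the hypothetical counts below, Campaign A converts better within each device segment yet worse overall, because most of its traffic comes from the low-converting segment.

```python
# Hypothetical (clicks, conversions) by campaign and device segment
segment_counts = {
    "Campaign A": {"mobile": (9000, 180), "desktop": (1000, 100)},
    "Campaign B": {"mobile": (1000, 15), "desktop": (9000, 810)},
}

segment_cvr = {}   # conversion rate per campaign and segment
overall_cvr = {}   # conversion rate per campaign, all segments pooled

for campaign_name, segments in segment_counts.items():
    segment_cvr[campaign_name] = {
        seg: conv / clicks for seg, (clicks, conv) in segments.items()
    }
    total_clicks = sum(clicks for clicks, _ in segments.values())
    total_conv = sum(conv for _, conv in segments.values())
    overall_cvr[campaign_name] = total_conv / total_clicks

for campaign_name in segment_counts:
    rates = segment_cvr[campaign_name]
    print(
        f"{campaign_name}: mobile {rates['mobile']:.2%}, "
        f"desktop {rates['desktop']:.2%}, overall {overall_cvr[campaign_name]:.2%}"
    )
```

Device mix acts as a confounding variable here, which is why pooled comparisons should be checked against segment-level results before drawing causal conclusions.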

---

## 📅 Day 49: Week 7 Capstone - Statistical Campaign Analysis (~60 min)

### Project: Comprehensive Statistical Analysis of Marketing Campaigns

**Scenario:**  
You're presenting to the CMO about Q4 performance. They want statistical rigor:
- Which campaigns truly outperformed?
- Are channel differences statistically significant?
- What can we conclude with confidence?
- What are the risks and uncertainties?

**Deliverables:**
Use all statistical techniques learned this week to provide a comprehensive analysis.

### Analysis 1: Campaign Performance with Confidence

In [None]:
# YOUR CODE HERE
# For each campaign:
# 1. Calculate mean ROAS with 95% CI
# 2. Test if ROAS is significantly > 3.0 (target)
# 3. Calculate probability of hitting target next month
# 4. Classify campaign confidence:
#    - High Confidence Winner: p < 0.01 and entire CI > 3.0
#    - Solid Performer: p < 0.05 and mean > 3.0
#    - Uncertain: p >= 0.05
#    - Underperformer: Significantly below 3.0
# 5. Create visualization with error bars and significance markers



### Analysis 2: Channel Comparisons

In [None]:
# YOUR CODE HERE
# Compare channels statistically:
# 1. For each pair of channels, run t-test comparing ROAS
# 2. Create a matrix showing p-values for all pairwise comparisons
# 3. Identify which channel differences are statistically significant
# 4. Calculate effect sizes (Cohen's d) for significant differences
# 5. Rank channels by performance with statistical backing
#
# Example output:
# "Google significantly outperforms TikTok (p=0.003, Cohen's d=0.85)"



### Analysis 3: Before/After Analysis

In [None]:
# YOUR CODE HERE
# Compare first month vs last month performance:
# 
# For each campaign:
# 1. Split data: First 30 days vs Last 30 days
# 2. Run paired or independent t-test comparing ROAS
# 3. Test if there's significant improvement/decline
# 4. Calculate magnitude of change
# 5. Visualize trends with statistical significance markers
#
# Identify campaigns that:
# - Significantly improved
# - Significantly declined
# - Remained stable



### Analysis 4: Risk Assessment

In [None]:
# YOUR CODE HERE
# Statistical risk analysis:
#
# For each campaign, calculate:
# 1. Probability of ROAS < 2.0 (loss scenario)
# 2. Probability of ROAS > 5.0 (exceptional scenario)
# 3. 90% prediction interval for next month's ROAS
# 4. Variance/volatility score
# 5. Risk-adjusted performance score
#
# Create risk-return matrix:
# - X-axis: Expected ROAS
# - Y-axis: Risk (std dev)
# - Size: Sample size (confidence)
# 
# Provide investment recommendations based on risk tolerance



### Analysis 5: Correlation Analysis with Caution

In [None]:
# YOUR CODE HERE
# Correlation analysis with statistical testing:
#
# 1. Calculate correlations between metrics
# 2. Test significance of each correlation
# 3. Calculate correlation p-values
# 4. Identify significant relationships
# 5. Discuss correlation vs causation
#
# Questions to answer:
# - Is CTR significantly correlated with CVR?
# - Does higher spend correlate with better ROAS?
# - Which metric correlations are strongest and most reliable?
#
# IMPORTANT: Distinguish correlation from causation in your conclusions!



### Executive Summary: Statistical Findings

In [None]:
# YOUR CODE HERE
# Create an executive summary including:
#
# 1. Top 3 statistically significant findings
# 2. Campaigns to scale (with confidence levels)
# 3. Campaigns to optimize or pause (with statistical backing)
# 4. Risk assessment and mitigation strategies
# 5. Predictions for next period with confidence intervals
# 6. Recommended actions with statistical justification
#
# Format: Executive-ready, clear, actionable
# Include: Numbers, confidence levels, p-values where appropriate
# Avoid: Jargon that executives won't understand

print("""
EXECUTIVE SUMMARY: STATISTICAL CAMPAIGN ANALYSIS
================================================

KEY FINDINGS:
1. [Your statistically significant finding with p-value]
2. [Your finding]
3. [Your finding]

HIGH CONFIDENCE RECOMMENDATIONS:
1. [Recommendation] - 95% confident [metric] will [outcome]
2. [Recommendation]
3. [Recommendation]

RISK ASSESSMENT:
- [Risk identified with probability]
- [Mitigation strategy]

PREDICTIONS (with 90% confidence):
- [Metric] expected to be between [lower] and [upper]

""")

### 🎓 Week 7 Complete!

**Congratulations!** You've completed Week 7 of the Marketing Measurement Partner Academy.

**What You've Mastered:**
- ✅ Probability fundamentals and distributions
- ✅ Confidence intervals for uncertainty quantification
- ✅ Hypothesis testing framework
- ✅ T-tests for campaign comparisons
- ✅ Chi-square tests for categorical data
- ✅ Correlation vs causation understanding
- ✅ Statistical analysis of marketing campaigns

**Your Statistical Toolkit:**
You can now:
- Quantify uncertainty in your estimates
- Make statistically rigorous decisions
- Compare campaigns with confidence
- Assess risks probabilistically
- Avoid common statistical pitfalls
- Communicate findings with appropriate confidence levels

**What's Next:**
You've completed the core curriculum! Future weeks could cover:
- Week 8: A/B Testing and Experimentation
- Week 9: Attribution Modeling
- Week 10: Predictive Analytics
- Week 11: Marketing Mix Modeling
- Week 12: Advanced Analytics & ML

**You're now equipped with professional-grade marketing analytics skills!** 🚀

---

### 🎉 Marketing Measurement Partner Academy - Weeks 1-7 Complete!

**Your Journey:**
- Week 1: Python Foundations ✅
- Week 2: Pandas & Data Manipulation ✅
- Week 3: SQL Basics ✅
- Week 4: SQL Advanced ✅
- Week 5: EDA Fundamentals ✅
- Week 6: Data Visualization ✅
- Week 7: Statistics Foundations ✅

**You can now:**
- Write Python code for data analysis
- Query databases with SQL
- Explore and profile datasets
- Create compelling visualizations
- Apply statistical methods rigorously
- Make data-driven marketing decisions

**Keep practicing, keep learning, keep growing!** 📊🎯