# 📊 A/B/C Marketing Campaign Analysis
---
**Objective:** Analyze the effectiveness of three marketing campaigns (A, B, C) across multiple stores and customer segments over four weeks.

**Key Questions:**
- Are the groups balanced in terms of location, store size, and historical performance?
- Which campaign has the highest conversion rate and revenue per customer?
- Are differences statistically significant?


In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
import statsmodels.api as sm
from statsmodels.formula.api import ols
import warnings
warnings.filterwarnings('ignore')

# set_theme is the current name; sns.set is kept only as an alias
sns.set_theme(style='whitegrid')

## 📥 Load & Explore Data

In [None]:
df = pd.read_csv('ab_marketing_campaign_v2.csv')
df.head()

## 📊 Group Balance Check

In [None]:
# Number of rows (observations) in each campaign group
df['campaign_group'].value_counts()

In [None]:
# Store location mix within each campaign group
pd.crosstab(df['campaign_group'], df['store_location'])

In [None]:
# Store size mix within each campaign group
pd.crosstab(df['campaign_group'], df['store_size_category'])

In [None]:
# Distribution of baseline revenue (average over the last 3 months) by campaign group
sns.boxplot(x='campaign_group', y='avg_revenue_last_3_months', data=df)
plt.title('Baseline Revenue by Campaign Group')
plt.show()
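
The crosstabs and boxplot above give a visual sense of balance; a formal test can back that impression up. The following is a minimal sketch, not part of the original analysis: it runs a chi-square test of independence for each categorical covariate and a one-way ANOVA for baseline revenue. The helper name `balance_report` and the synthetic `demo` frame are hypothetical; the column names follow the ones used above, and in this notebook `df` would be passed instead of `demo`.

```python
import numpy as np
import pandas as pd
from scipy import stats


def balance_report(data, group_col="campaign_group",
                   categorical_cols=("store_location", "store_size_category"),
                   numeric_cols=("avg_revenue_last_3_months",)):
    """Return one p-value per covariate; small values hint at imbalance."""
    p_values = {}
    for col in categorical_cols:
        # Chi-square test of independence: group vs. categorical covariate
        table = pd.crosstab(data[group_col], data[col])
        _, p, _, _ = stats.chi2_contingency(table)
        p_values[col] = float(p)
    for col in numeric_cols:
        # One-way ANOVA: does the covariate mean differ across groups?
        samples = [g[col].dropna().to_numpy() for _, g in data.groupby(group_col)]
        _, p = stats.f_oneway(*samples)
        p_values[col] = float(p)
    return p_values


# Synthetic stand-in for df (hypothetical values, for illustration only)
rng = np.random.default_rng(0)
n = 300
demo = pd.DataFrame({
    "campaign_group": rng.choice(["A", "B", "C"], size=n),
    "store_location": rng.choice(["Urban", "Suburban", "Rural"], size=n),
    "store_size_category": rng.choice(["Small", "Medium", "Large"], size=n),
    "avg_revenue_last_3_months": rng.normal(1000, 200, size=n),
})

report = balance_report(demo)
for covariate, p in report.items():
    print(f"{covariate}: p = {p:.4f}")
```

A large p-value here does not prove the groups are balanced; it only means the test found no evidence of imbalance at the given sample size.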

## 🧠 Behavioral Analysis

In [None]:
# Per-group summary: conversion rate, mean revenue, mean order count, unique customers.
# 'purchase' is assumed to be coded 0/1, so its mean is the conversion rate.
# These are per-row means; if a customer appears in several rows, aggregate per customer first.
df_grouped = df.groupby('campaign_group').agg(
    conversion_rate=('purchase', 'mean'),
    avg_revenue=('revenue', 'mean'),
    avg_order_count=('order_count', 'mean'),
    unique_customers=('customer_id', 'nunique'),
)
df_grouped

In [None]:
# Visualize conversion rate
sns.barplot(x=df_grouped.index, y=df_grouped['conversion_rate'])
plt.title('Conversion Rate by Campaign Group')
plt.show()
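
Point estimates alone can make small differences between groups look more meaningful than they are. As a hedged illustration (not part of the original analysis), the sketch below attaches a 95% Wilson score interval to each group's conversion rate. The helper `wilson_interval` and the synthetic `demo` frame are hypothetical, and `purchase` is assumed to be coded 0/1.

```python
import numpy as np
import pandas as pd
from scipy import stats


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for a binomial proportion."""
    z = stats.norm.ppf(1 - (1 - confidence) / 2)
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * np.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Synthetic stand-in for df (hypothetical conversion rates, for illustration only)
rng = np.random.default_rng(1)
demo = pd.DataFrame({
    "campaign_group": np.repeat(["A", "B", "C"], 200),
    "purchase": np.concatenate([
        rng.binomial(1, 0.10, 200),
        rng.binomial(1, 0.14, 200),
        rng.binomial(1, 0.12, 200),
    ]),
})

# Conversions and sample size per group, then the interval around each rate
summary = demo.groupby("campaign_group")["purchase"].agg(conversions="sum", n="count")
summary["conversion_rate"] = summary["conversions"] / summary["n"]
lower, upper = wilson_interval(summary["conversions"], summary["n"])
summary["ci_lower"] = lower
summary["ci_upper"] = upper
print(summary)
```

Heavily overlapping intervals suggest the observed ranking of campaigns may not be stable; the formal tests in the next section address that question directly.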

## 🧪 Statistical Testing

In [None]:
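# One-way ANOVA: does mean revenue differ across campaign groups?
# C() makes statsmodels treat campaign_group as categorical even if it is numerically coded.
anova_model = ols('revenue ~ C(campaign_group)', data=df).fit()
sm.stats.anova_lm(anova_model, typ=2)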
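
ANOVA assumes roughly normal residuals and similar variances across groups, and revenue data is often right-skewed. As a hedged robustness check (not part of the original analysis), the sketch below runs Levene's test for equal variances and a Kruskal-Wallis test, which compares the groups without the normality assumption. The synthetic `demo` frame is hypothetical; in this notebook the samples would come from `df`.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic, right-skewed revenue (hypothetical values, for illustration only)
rng = np.random.default_rng(2)
demo = pd.DataFrame({
    "campaign_group": np.repeat(["A", "B", "C"], 150),
    "revenue": np.concatenate([
        rng.exponential(40, 150),
        rng.exponential(55, 150),
        rng.exponential(45, 150),
    ]),
})

# One array of revenue values per campaign group
samples = [g["revenue"].to_numpy() for _, g in demo.groupby("campaign_group")]

# Levene's test (median-centered variant) for equality of variances
levene_stat, levene_p = stats.levene(*samples, center="median")

# Kruskal-Wallis H-test: rank-based alternative to one-way ANOVA
kw_stat, kw_p = stats.kruskal(*samples)

print(f"Levene: W = {levene_stat:.2f}, p = {levene_p:.4f}")
print(f"Kruskal-Wallis: H = {kw_stat:.2f}, p = {kw_p:.4f}")
```

If the ANOVA and Kruskal-Wallis results agree, the conclusion is less likely to be an artifact of the distributional assumptions.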

In [None]:
# Chi-square test for purchase rate
contingency_table = pd.crosstab(df['campaign_group'], df['purchase'])
chi2, p, dof, expected = stats.chi2_contingency(contingency_table)
print(f"Chi2 = {chi2:.2f}, p = {p:.4f}")
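
The chi-square test above only indicates whether purchase rates differ somewhere among the three groups, not which campaigns differ from each other. A typical follow-up is a set of pairwise tests with a multiple-comparison correction. The sketch below is one hedged way to do that using a Bonferroni adjustment; the helper `pairwise_chi2` and the synthetic `demo` frame are hypothetical and not part of the original analysis.

```python
from itertools import combinations

import numpy as np
import pandas as pd
from scipy import stats


def pairwise_chi2(data, group_col="campaign_group", outcome_col="purchase"):
    """Chi-square test for every pair of groups, with Bonferroni-adjusted p-values."""
    groups = sorted(data[group_col].unique())
    pairs = list(combinations(groups, 2))
    rows = []
    for a, b in pairs:
        subset = data[data[group_col].isin([a, b])]
        table = pd.crosstab(subset[group_col], subset[outcome_col])
        # For 2x2 tables, chi2_contingency applies Yates' continuity correction by default
        _, p, _, _ = stats.chi2_contingency(table)
        rows.append({
            "pair": f"{a} vs {b}",
            "p_raw": float(p),
            # Bonferroni: multiply by the number of comparisons, cap at 1
            "p_bonferroni": min(float(p) * len(pairs), 1.0),
        })
    return pd.DataFrame(rows)


# Synthetic stand-in for df (hypothetical conversion rates, for illustration only)
rng = np.random.default_rng(3)
demo = pd.DataFrame({
    "campaign_group": np.repeat(["A", "B", "C"], 200),
    "purchase": np.concatenate([
        rng.binomial(1, 0.10, 200),
        rng.binomial(1, 0.20, 200),
        rng.binomial(1, 0.12, 200),
    ]),
})

result = pairwise_chi2(demo)
print(result)
```

Bonferroni is conservative; with only three comparisons the loss of power is usually modest, but other corrections such as Holm's method are a reasonable alternative.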

## 📌 Conclusion & Recommendation
- Summarize key insights from the analysis
- Which campaign is most effective?
- Any recommendations for future rollouts?
