# Marketing Campaign Effectiveness Analysis
**Dataset:** Kaggle Bank Marketing

This notebook demonstrates how to:
1. Load and clean the dataset
2. Explore campaign performance with SQL‑style Pandas operations
3. Run hypothesis testing to validate findings

Made for interview demonstration purposes.

In [None]:
import os
import sqlite3

import pandas as pd
import scipy.stats as stats

# Path to dataset (uploaded)
csv_path = '/mnt/data/bank[1].csv'

# Load CSV (semicolon delimiter)
df = pd.read_csv(csv_path, sep=';')
print('Rows:', len(df))
df.head()
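
If the file at hand is comma-separated rather than semicolon-separated, `pd.read_csv(..., sep=';')` does not raise an error; it quietly returns a single wide column. A minimal sketch of a guard against that follows. The helper name `read_with_fallback` and the toy file are assumptions for illustration, not part of the original notebook.

```python
import os
import tempfile

import pandas as pd


def read_with_fallback(path, separators=(';', ',')):
    """Hypothetical helper: return (frame, sep) for the first separator
    that splits the file into more than one column."""
    for sep in separators:
        frame = pd.read_csv(path, sep=sep)
        if frame.shape[1] > 1:
            return frame, sep
    raise ValueError(f'No separator in {separators} produced multiple columns')


# Toy comma-separated file with made-up rows (not the real dataset).
with tempfile.TemporaryDirectory() as tmp:
    toy_path = os.path.join(tmp, 'toy_bank.csv')
    with open(toy_path, 'w') as fh:
        fh.write('age,job,deposit\n30,admin.,yes\n41,technician,no\n')
    toy_df, used_sep = read_with_fallback(toy_path)

print(used_sep, list(toy_df.columns))
```

Checking `df.shape[1]` right after loading gives the same protection without a helper.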

In [None]:
# Normalize column names: strip whitespace, lowercase, replace dots with underscores
df.columns = [c.strip().lower().replace('.', '_') for c in df.columns]
df.columns
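
The same normalization can be written with the vectorized `.str` accessor on the column index. The sketch below uses made-up headers (assumptions, not the dataset's actual column names) and also checks that renaming did not make two columns collide.

```python
import pandas as pd

# Hypothetical raw headers with stray whitespace, mixed case, and dots.
raw = pd.DataFrame(columns=[' Age', 'Emp.Var.Rate', 'deposit '])

# regex=False treats '.' as a literal dot rather than "any character".
raw.columns = (
    raw.columns.str.strip()
               .str.lower()
               .str.replace('.', '_', regex=False)
)

# Renaming can merge two distinct headers into one, so verify uniqueness.
assert raw.columns.is_unique
print(list(raw.columns))
```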

## Quick Exploration

In [None]:
df.info()
df.describe(include='all').T.head()

In [None]:
# Conversion rate by contact method
contact = df.groupby('contact')['deposit'].value_counts().unstack().fillna(0)
contact['conversion_rate'] = contact['yes'] / contact.sum(axis=1)
contact.sort_values('conversion_rate', ascending=False)
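
An alternative sketch of the same calculation: because the mean of a boolean column is the share of `True` values, the conversion rate can be computed directly, without `unstack` or a row sum. The toy rows below are invented for illustration and assume that `deposit` holds the strings `'yes'` and `'no'`.

```python
import pandas as pd

# Invented toy data, not the real dataset.
toy = pd.DataFrame({
    'contact': ['cellular'] * 4 + ['telephone'] * 4 + ['unknown'] * 2,
    'deposit': ['yes', 'yes', 'yes', 'no',
                'yes', 'no', 'no', 'no',
                'no', 'no'],
})

rates = (
    toy.assign(converted=toy['deposit'].eq('yes'))   # boolean flag per call
       .groupby('contact')['converted']
       .agg(conversion_rate='mean', calls='count')   # share of True, group size
       .sort_values('conversion_rate', ascending=False)
)
print(rates)
```

Keeping the call count next to the rate makes it easier to see when a high rate rests on very few calls.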

In [None]:
# Conversion rate by month
monthly = df.groupby('month')['deposit'].value_counts().unstack().fillna(0)
monthly['conversion_rate'] = monthly['yes'] / monthly.sum(axis=1)
monthly.sort_values('conversion_rate', ascending=False)
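
For comparison, the monthly breakdown can also be expressed as literal SQL through an in-memory SQLite database. This is a sketch under assumptions: the table name `calls` and the toy rows are invented, and `deposit` is assumed to hold `'yes'`/`'no'` strings.

```python
import sqlite3

import pandas as pd

# Invented toy data, not the real dataset.
toy = pd.DataFrame({
    'month': ['mar', 'mar', 'may', 'may', 'may', 'may', 'oct', 'oct'],
    'deposit': ['yes', 'yes', 'yes', 'no', 'no', 'no', 'yes', 'no'],
})

query = """
SELECT month,
       AVG(CASE WHEN deposit = 'yes' THEN 1.0 ELSE 0.0 END) AS conversion_rate,
       COUNT(*) AS calls
FROM calls
GROUP BY month
ORDER BY conversion_rate DESC
"""

conn = sqlite3.connect(':memory:')
try:
    toy.to_sql('calls', conn, index=False)       # hypothetical table name
    monthly_sql = pd.read_sql_query(query, conn)
finally:
    conn.close()

print(monthly_sql)
```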

In [None]:
# Call duration stats by conversion
df.groupby('deposit')['duration'].agg(['mean','std','count'])
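
The difference in mean duration can be tested rather than only described. One option, sketched here on synthetic data, is Welch's t-test, which does not assume equal variances across the two groups. The group sizes, means, and spreads below are invented, and this test is a suggestion rather than something the original notebook runs.

```python
import numpy as np
import scipy.stats as stats

rng = np.random.default_rng(0)

# Synthetic call durations in seconds (invented parameters).
converted = rng.normal(loc=550, scale=100, size=200)
not_converted = rng.normal(loc=220, scale=100, size=200)

# equal_var=False selects Welch's t-test instead of Student's t-test.
result = stats.ttest_ind(converted, not_converted, equal_var=False)
print('t statistic:', result.statistic)
print('p-value:', result.pvalue)
```

On the real data this would be called as `stats.ttest_ind(df.loc[df['deposit'] == 'yes', 'duration'], df.loc[df['deposit'] == 'no', 'duration'], equal_var=False)`.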

In [None]:
# Chi‑square test: contact vs conversion
contingency = pd.crosstab(df['contact'], df['deposit'])
chi2, p, dof, exp = stats.chi2_contingency(contingency)
print('Chi-square p-value:', p)
contingency
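
With a large sample, a small p-value alone says little about how strong the association is. A minimal sketch of an effect-size check using Cramér's V, computed as $V = \sqrt{\chi^2 / (n \cdot (\min(r, c) - 1))}$ for an $r \times c$ table, follows. The counts are invented for illustration.

```python
import numpy as np
import pandas as pd
import scipy.stats as stats

# Invented contingency table: rows are contact methods, columns are outcomes.
table = pd.DataFrame(
    {'no': [300, 80, 120], 'yes': [200, 20, 30]},
    index=['cellular', 'telephone', 'unknown'],
)

chi2, p, dof, expected = stats.chi2_contingency(table)

n = table.to_numpy().sum()          # total number of observations
min_dim = min(table.shape) - 1      # min(rows, cols) - 1
cramers_v = np.sqrt(chi2 / (n * min_dim))

print('p-value:', p)
print("Cramér's V:", cramers_v)
# The chi-square approximation is usually considered reliable when
# expected counts are at least 5 in every cell.
print('Smallest expected count:', expected.min())
```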

### Conclusion
* Highest converting channel: cellular (based on conversion rate)
* Chi-square p-value < 0.05 → contact method and conversion are **not independent** (an association; the test alone does not show that the channel causes the difference)
* Converted calls are longer on average than non-converted calls

These insights help marketers allocate resources to the best-performing outreach channel.