# Week 2 Day 4 Assignment: Marketing & Sales Dataset

**Dataset:** `Advertising.csv`

Tasks:
- Compute correlation between ad spend and sales
- Compare correlation vs causal reasoning
- Discuss possible confounders

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

sns.set_theme(style="whitegrid")

# Load dataset
path = "Advertising.csv"
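df = pd.read_csv(path)

# Some copies of this file carry an unnamed row-index column; drop it if present
# so it does not show up in the correlations below (no-op when the column is absent).
df = df.drop(columns=["Unnamed: 0"], errors="ignore")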

df.head()

In [None]:
# Basic overview
print(df.shape)
print(df.columns)

df.describe()

In [None]:
# Pearson correlation (the pandas default) between each ad channel and sales
corr = df.corr(numeric_only=True)

corr["sales"].sort_values(ascending=False)

In [None]:
# Correlation heatmap
plt.figure(figsize=(6, 4))
sns.heatmap(corr, annot=True, cmap="Blues", fmt=".2f")
plt.title("Correlation Matrix")
plt.tight_layout()
plt.show()

# Scatterplots: ad spend vs sales
fig, axes = plt.subplots(1, 3, figsize=(12, 4), sharey=True)

sns.scatterplot(ax=axes[0], data=df, x="TV", y="sales")
axes[0].set_title("TV vs Sales")

sns.scatterplot(ax=axes[1], data=df, x="radio", y="sales")
axes[1].set_title("Radio vs Sales")

sns.scatterplot(ax=axes[2], data=df, x="newspaper", y="sales")
axes[2].set_title("Newspaper vs Sales")

plt.tight_layout()
plt.show()

## Correlation vs Causal Reasoning

Correlation measures the strength and direction of the linear association between ad spend and sales in this dataset. A high positive correlation (for example, between TV or radio spend and sales) indicates that the two move together, but it does **not** prove that ad spend caused sales to rise. Causal reasoning requires ruling out alternative explanations, controlling for confounders, and ideally using experiments (A/B tests) or quasi-experimental designs.

In this dataset, we can say that some ad channels are strongly associated with higher sales, but we cannot conclude causality without additional evidence.
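To make the distinction concrete, here is a minimal simulated sketch. It does not use `Advertising.csv`. The variable `season`, the function `simulate`, and the coefficients 2.0 and 3.0 are all hypothetical choices for illustration. Sales are generated with no ad-spend term at all, so any correlation between ad spend and sales in the "observational" case comes purely from the shared driver. In the "randomized" case, ad spend is assigned independently of the driver, as an A/B test would do.

```python
import numpy as np

def simulate(n=5000, randomized=False, seed=0):
    """Simulate ad spend and sales where ads have NO causal effect on sales."""
    rng = np.random.default_rng(seed)
    season = rng.normal(size=n)  # hypothetical hidden driver (assumption)
    if randomized:
        # Experiment: ad spend is assigned independently of the hidden driver
        ad_spend = rng.normal(size=n)
    else:
        # Observational data: budgets follow the same driver as sales
        ad_spend = 2.0 * season + rng.normal(size=n)
    # Sales depend only on the hidden driver, never on ad_spend
    sales = 3.0 * season + rng.normal(size=n)
    return ad_spend, sales

obs_r = np.corrcoef(*simulate(randomized=False))[0, 1]
exp_r = np.corrcoef(*simulate(randomized=True))[0, 1]

print(f"observational corr: {obs_r:.2f}")
print(f"randomized corr:    {exp_r:.2f}")
```

The observational correlation comes out strongly positive even though ads do nothing in this simulation, while the randomized one sits near zero. This is why a high correlation in the real dataset cannot be read as a causal effect by itself.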

## Possible Confounders

Potential confounders that could influence both ad spend and sales include:
- **Seasonality:** holidays or peak periods increase both ad budgets and sales.
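- **Pricing/Promotions:** discounts or promotions drive sales directly and are often scheduled alongside ad campaigns, so ad spend and sales can rise together without ads being the cause.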
- **Product availability or distribution changes:** more store coverage boosts sales and may trigger more ad spend.
- **Competitor actions:** competitor campaigns can affect both budget decisions and sales outcomes.
- **Brand strength or market trend:** stronger brands invest more and sell more even without extra ads.

To establish causality, you would need controlled experiments or models that explicitly adjust for these factors.
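As a rough illustration of what "adjusting for a factor" means, the sketch below fits two least-squares models on simulated data: one with ad spend only, and one that also includes the confounder. Everything here is assumed for illustration: `season` is a made-up variable that the notebook above does not use, `ols_coefs` is a hypothetical helper, and the true effect of ad spend on sales is set to zero by construction.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
season = rng.normal(size=n)                    # hypothetical confounder (assumption)
ad_spend = 2.0 * season + rng.normal(size=n)   # budgets follow the confounder
sales = 3.0 * season + rng.normal(size=n)      # true ad effect is 0 by construction

def ols_coefs(y, *predictors):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs

# Slope on ad_spend when the confounder is ignored
naive_slope = ols_coefs(sales, ad_spend)[1]
# Slope on ad_spend when the confounder is included as a second predictor
adjusted_slope = ols_coefs(sales, ad_spend, season)[1]

print(f"naive slope:    {naive_slope:.2f}")
print(f"adjusted slope: {adjusted_slope:.2f}")
```

The naive slope is clearly positive, while the adjusted slope shrinks toward the true value of zero. Adjustment only works for confounders that are actually measured, which is one reason controlled experiments remain the stronger evidence.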