# AdSpend Regression Predictor — Ipsha Gautam
This notebook demonstrates multivariate linear regression applied to advertising spend channels to forecast sales. It emphasizes business interpretation of coefficients, residual analysis, and cost-function intuition.
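
As a sketch of the model being fit, assuming the five channel columns used in the fitting cell (TV, Radio, Social, Online, Outdoor) and an additive error term, the regression can be written as follows. The symbols are notation introduced for this note, not names from the code.

$$
\text{Sales}_i = \beta_0 + \beta_1\,\text{TV}_i + \beta_2\,\text{Radio}_i + \beta_3\,\text{Social}_i + \beta_4\,\text{Online}_i + \beta_5\,\text{Outdoor}_i + \varepsilon_i
$$

Here $\beta_0$ is the intercept, each $\beta_j$ is the coefficient on one channel's spend, and $\varepsilon_i$ is the error for observation $i$.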

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Load the ad spend dataset (path is relative to the notebook's working directory)
df = pd.read_csv('sample_data/ad_spend_data.csv')
df.head()


## Model fitting and interpretation

In [None]:
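# Predictors: spend per advertising channel; target: sales
X = df[['TV', 'Radio', 'Social', 'Online', 'Outdoor']]
y = df['Sales']
# Fit ordinary least squares on the full dataset
model = LinearRegression().fit(X, y)
print('Intercept:', model.intercept_)
# Pair each coefficient with its channel name for readability
for name, coef in zip(X.columns, model.coef_):
    print(f'{name}: {coef:.4f}')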


## Predicted vs Actual and Residual Analysis

In [None]:
# In-sample predictions: the model is scored on the same rows it was fit on
y_pred = model.predict(X)
mse = mean_squared_error(y, y_pred)
r2 = r2_score(y, y_pred)
print('MSE:', mse, ' R2:', r2)
# Scatter of actual vs predicted; the dashed red line marks perfect prediction (y = x)
plt.figure(figsize=(7,5))
plt.scatter(y, y_pred, alpha=0.6)
plt.plot([y.min(), y.max()], [y.min(), y.max()], 'r--')
plt.xlabel('Actual Sales ($)')
plt.ylabel('Predicted Sales ($)')
plt.title('Predicted vs Actual Sales — Ipsha Gautam')
plt.show()
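
The section heading mentions residual analysis, so a residuals-versus-predicted plot is a natural companion to the scatter above. The following is a minimal sketch, not the notebook's own method: it runs on hypothetical synthetic data (`X_demo`, `y_demo`, `true_coefs` and `demo_model` are all made-up names and values) so that it is self-contained.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# Hypothetical stand-in data: 200 rows, 5 spend channels, linear signal plus noise
rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 100, size=(200, 5))
true_coefs = np.array([0.05, 0.10, 0.03, 0.02, 0.01])  # made-up values
y_demo = 5.0 + X_demo @ true_coefs + rng.normal(0, 1.0, size=200)

demo_model = LinearRegression().fit(X_demo, y_demo)
y_hat = demo_model.predict(X_demo)

# Residual = actual minus predicted
residuals = y_demo - y_hat

# With an intercept, in-sample least-squares residuals average to zero (up to rounding)
print('Mean residual:', residuals.mean())

# A patternless band around zero is the desired picture;
# a funnel shape suggests heteroscedasticity, a curve suggests non-linearity
plt.figure(figsize=(7, 4))
plt.scatter(y_hat, residuals, alpha=0.6)
plt.axhline(0, color='r', linestyle='--')
plt.xlabel('Predicted Sales')
plt.ylabel('Residual (actual - predicted)')
plt.title('Residuals vs Predicted')
plt.show()
```

To apply this to the notebook's data, replace `y_demo` and `y_hat` with `y` and `y_pred` from the cell above.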


## Cost-function intuition (visualizing MSE over a range of a single coefficient)

In [None]:
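# Sweep the TV coefficient around its fitted value, holding everything else fixed
tv_coef = model.coef_[0]
slopes = np.linspace(tv_coef - 0.001, tv_coef + 0.001, 80)
costs = []
for m in slopes:
    coefs = model.coef_.copy()
    coefs[0] = m  # replace only the TV slope
    # Recompute predictions by hand with the modified coefficient vector
    y_temp = X.values.dot(coefs) + model.intercept_
    costs.append(mean_squared_error(y, y_temp))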
plt.figure(figsize=(7,4))
plt.plot(slopes, costs)
plt.xlabel('TV Coefficient (slope)')
plt.ylabel('MSE')
plt.title('Cost vs TV Coefficient — Ipsha Gautam')
plt.show()
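
The curve plotted above is a parabola, which a short derivation makes explicit. As a sketch, write $r_i = y_i - \hat{y}_i$ for the residuals of the fitted model, $x_i$ for the TV spend in row $i$, and $\delta$ for the amount by which the TV coefficient is shifted from its fitted value (all notation introduced here, not taken from the code). Shifting only that coefficient changes each prediction by $\delta x_i$, so:

$$
\mathrm{MSE}(\delta) = \frac{1}{n}\sum_{i=1}^{n}\left(r_i - \delta x_i\right)^2 = \frac{1}{n}\sum_{i=1}^{n} r_i^2 \;-\; \frac{2\delta}{n}\sum_{i=1}^{n} r_i x_i \;+\; \frac{\delta^2}{n}\sum_{i=1}^{n} x_i^2
$$

For a least-squares fit, the residuals are orthogonal to every predictor column, so the middle term vanishes. What remains is the minimum MSE plus $\delta^2 \cdot \frac{1}{n}\sum_{i} x_i^2$: the minimum sits at the fitted coefficient, and the curvature is set by the mean squared TV spend.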


## Business interpretation (concise)

- **Intercept** represents the expected sales when spend on every channel is zero. Zero total spend often lies outside the range of the observed data, so treat this value as an extrapolation rather than a measured baseline.
- **Coefficients** quantify the expected change in sales for a one-unit increase in a channel's spend, holding spend on the other channels constant (e.g., per dollar invested, if spend is recorded in dollars).
- **Residual analysis** helps detect heteroscedasticity or non-linear patterns, either of which may call for a transformation or a different model.
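
The intercept and coefficient readings above can be checked numerically. The sketch below assumes hypothetical synthetic data and a made-up baseline spend plan (`X_demo`, `y_demo`, `demo_model`, `baseline` and all numeric values are illustrative, not from the notebook's dataset). It shows that adding one unit of TV spend moves the prediction by exactly the TV coefficient, and that the prediction at zero spend equals the intercept.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

channels = ['TV', 'Radio', 'Social', 'Online', 'Outdoor']

# Hypothetical stand-in data with made-up coefficients
rng = np.random.default_rng(1)
X_demo = rng.uniform(0, 100, size=(150, 5))
y_demo = (10.0
          + X_demo @ np.array([0.04, 0.08, 0.02, 0.03, 0.01])
          + rng.normal(0, 1.0, size=150))
demo_model = LinearRegression().fit(X_demo, y_demo)

# A made-up baseline spend plan, and the same plan with one extra unit on TV
baseline = np.array([[50.0, 20.0, 10.0, 15.0, 5.0]])
bumped = baseline.copy()
bumped[0, channels.index('TV')] += 1.0

# Other channels are held constant, so the difference isolates the TV effect
delta_sales = demo_model.predict(bumped)[0] - demo_model.predict(baseline)[0]
print('Change in predicted sales:', delta_sales)
print('Fitted TV coefficient:    ', demo_model.coef_[0])

# The prediction at zero spend on every channel is the intercept
zero_spend = np.zeros((1, 5))
pred_at_zero = demo_model.predict(zero_spend)[0]
print('Prediction at zero spend: ', pred_at_zero)
print('Fitted intercept:         ', demo_model.intercept_)
```

Because the model is linear, the one-unit change is the same at any baseline; only the other channels being held fixed matters.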
