# Marketing mix model
Using statsmodels to model, optimize, and forecast

## Data Collection and Preparation

In [1]:
import pandas as pd

# Load data
df = pd.read_csv(r'C:\Users\charl\OneDrive\Desktop\nick notes 2024\data\mmm_data.csv', sep=',')

## Model Selection

In [2]:
# Collect the media-spend columns (those whose names contain 'mdsp_')
mdsp_cols = [col for col in df.columns if 'mdsp_' in col]

We build two datasets: ‘X’ contains the advertising spend for each channel, while ‘y’ holds the sales data.

To create our predictive model, we add one essential component with sm.add_constant(X): an intercept column. The intercept tells us where sales start when we are not advertising, that is, a sales baseline in the absence of advertising spend.

In [3]:
import statsmodels.api as sm

# Create model
X = df[['mdsp_vidtr', 'mdsp_on', 'mdsp_audtr']]
y = df['sales']
X = sm.add_constant(X)  # Add a constant for the intercept
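
To see what the constant column buys us, the sketch below fits a one-channel least-squares line with and without an intercept. The data are synthetic (a baseline of 100 plus 5 per unit of spend), and the helper names fit_with_constant and fit_without_constant are hypothetical; the closed-form formulas stand in for what OLS computes on a design matrix with and without a column of ones.

```python
# Synthetic data, assumed purely for illustration:
# baseline sales of 100 plus 5 extra sales per unit of spend.
spend = [0.0, 1.0, 2.0, 3.0, 4.0]
sales = [100.0 + 5.0 * s for s in spend]


def fit_with_constant(x, y):
    """Least squares for y = b0 * 1 + b1 * x (design matrix has a column of ones)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x)
    sxy = sum(a * b for a, b in zip(x, y))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return intercept, slope


def fit_without_constant(x, y):
    """Least squares for y = b1 * x, which forces the line through the origin."""
    return sum(a * b for a, b in zip(x, y)) / sum(v * v for v in x)


intercept, slope = fit_with_constant(spend, sales)
slope_no_const = fit_without_constant(spend, sales)

print(f"with constant:    baseline={intercept:.2f}, slope={slope:.2f}")
print(f"without constant: slope={slope_no_const:.2f}")
```

With the constant, the fit recovers the baseline and the per-unit effect separately. Without it, the baseline has nowhere to go and is absorbed into the slope, which overstates the effect of spend.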

## Training the Model

sm.OLS(y, X): sets up a linear regression model in which y is the variable we want to predict and X holds the predictors.

.fit(): fits the model to our data, which means it estimates the coefficients that minimize the squared error between predicted and actual sales.

model.summary(): displays the fitted coefficients together with the fit statistics.

In [4]:
model = sm.OLS(y, X).fit()
model.summary()

| Statistic | Value | Statistic | Value |
|---|---|---|---|
| Dep. Variable: | sales | R-squared: | 0.456 |
| Model: | OLS | Adj. R-squared: | 0.448 |
| Method: | Least Squares | F-statistic: | 57.27 |
| Date: | Fri, 24 May 2024 | Prob (F-statistic): | 6.21e-27 |
| Time: | 19:29:24 | Log-Likelihood: | -3955.9 |
| No. Observations: | 209 | AIC: | 7920.0 |
| Df Residuals: | 205 | BIC: | 7933.0 |
| Df Model: | 3 | | |
| Covariance Type: | nonrobust | | |

| | coef | std err | t | P>\|t\| | [0.025 | 0.975] |
|---|---|---|---|---|---|---|
| const | 6.679e+07 | 7.24e+06 | 9.225 | 0.000 | 5.25e+07 | 8.11e+07 |
| mdsp_vidtr | 194.5686 | 22.564 | 8.623 | 0.000 | 150.082 | 239.055 |
| mdsp_on | 74.3041 | 27.124 | 2.739 | 0.007 | 20.826 | 127.782 |
| mdsp_audtr | -61.0892 | 49.578 | -1.232 | 0.219 | -158.837 | 36.659 |

| Statistic | Value | Statistic | Value |
|---|---|---|---|
| Omnibus: | 43.666 | Durbin-Watson: | 2.154 |
| Prob(Omnibus): | 0.0 | Jarque-Bera (JB): | 92.168 |
| Skew: | 0.986 | Prob(JB): | 9.68e-21 |
| Kurtosis: | 5.588 | Cond. No. | 901000.0 |
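
The t and confidence-interval columns of the coefficient table can be reproduced from coef and std err alone. A minimal sketch, assuming a two-sided 95% critical value of about 1.9716 for Student's t with 205 residual degrees of freedom (that constant is an assumption, not something printed in the summary):

```python
# (coef, std err) pairs copied from the coefficient table above.
estimates = {
    "const": (6.679e07, 7.24e06),
    "mdsp_vidtr": (194.5686, 22.564),
    "mdsp_on": (74.3041, 27.124),
    "mdsp_audtr": (-61.0892, 49.578),
}

# Assumed: approximate 97.5th percentile of Student's t with 205 degrees of freedom.
T_CRIT = 1.9716

rows = {}
for name, (coef, std_err) in estimates.items():
    t_stat = coef / std_err            # the "t" column
    ci_low = coef - T_CRIT * std_err   # the "[0.025" column
    ci_high = coef + T_CRIT * std_err  # the "0.975]" column
    rows[name] = (t_stat, ci_low, ci_high)
    print(f"{name:<12} t={t_stat:8.3f}  95% CI=({ci_low:.3f}, {ci_high:.3f})")
```

The interval for mdsp_audtr spans zero, which is the same information as its p-value of 0.219: the data do not distinguish that coefficient from zero at the 5% level.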


## Model Evaluation

In [5]:
r_squared = model.rsquared
print(f"R-squared: {r_squared}")

R-squared: 0.45596815441052385
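
R-squared on its own does not account for the number of predictors. As a worked check, the adjusted R-squared and the F-statistic reported in the summary can both be derived from R-squared, the number of observations, and the number of predictors, using the standard OLS formulas:

```python
r_squared = 0.45596815441052385  # printed by the cell above
n_obs = 209                      # "No. Observations" in the summary
k = 3                            # "Df Model": number of predictors, excluding the constant

df_resid = n_obs - k - 1         # 205, matches "Df Residuals"

# Adjusted R-squared penalizes R-squared for the number of predictors.
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / df_resid

# F-statistic: explained variance per predictor over unexplained variance per residual df.
f_statistic = (r_squared / k) / ((1 - r_squared) / df_resid)

print(f"Adjusted R-squared: {adj_r_squared:.3f}")
print(f"F-statistic: {f_statistic:.2f}")
```

An R-squared of about 0.456 means the three spend variables explain a little under half of the variation in sales; the rest is left in the residuals.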


## Attribution Analysis

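coefficients = model.params: This line retrieves the coefficients (also known as parameter estimates) of our regression model and stores them in the variable coefficients. Each media coefficient is the estimated change in sales per additional unit of spend on that channel, holding the other channels fixed, and const is the estimated baseline.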

In [6]:
coefficients = model.params
print("Model Coefficients:")
print(coefficients)

Model Coefficients:
const         6.678621e+07
mdsp_vidtr    1.945686e+02
mdsp_on       7.430407e+01
mdsp_audtr   -6.108916e+01
dtype: float64
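
One way to turn these coefficients into an attribution is to multiply each one by the spend on its channel and compare the results with the baseline. The sketch below uses the coefficients printed above; the spend levels are illustrative only (they are the anticipated figures used in the Forecasting section), and the dictionary names are hypothetical.

```python
# Coefficients copied from the output above.
coefficients = {
    "const": 6.678621e07,
    "mdsp_vidtr": 1.945686e02,
    "mdsp_on": 7.430407e01,
    "mdsp_audtr": -6.108916e01,
}

# Illustrative spend per channel (an assumption, not observed data).
spend = {"mdsp_vidtr": 60000, "mdsp_on": 35000, "mdsp_audtr": 17000}

# Contribution of each channel = coefficient * spend; the constant is the baseline.
contributions = {"baseline": coefficients["const"]}
for channel, amount in spend.items():
    contributions[channel] = coefficients[channel] * amount

total = sum(contributions.values())

for name, value in contributions.items():
    print(f"{name:<12}{value:>16,.0f}{value / total:>8.1%}")
print(f"{'total':<12}{total:>16,.0f}")
```

At these spend levels the baseline accounts for most of the modeled sales. The negative contribution of mdsp_audtr should be read with caution, since that coefficient is not statistically significant (p = 0.219 in the summary).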


## Budget Optimization

In [7]:
total_budget = 1000000  # Total advertising budget for a period
allocation_weights = [0.4, 0.3, 0.3]  # Allocation weights for TV, Online, and Radio

# Multiply each allocation weight by the total budget to determine
# how much money should be spent on each channel.
budget_allocation = [w * total_budget for w in allocation_weights]
# budget_allocation == [400000.0, 300000.0, 300000.0]
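
The weights above are fixed by hand. As a rough sketch of how the fitted coefficients could be used to compare allocations, the code below scores the 40/30/30 split and searches a coarse grid of alternatives. The function name incremental_sales, the 10-point grid, and the 10% to 60% bounds per channel are all hypothetical choices made for illustration, and the weights are assumed to map to mdsp_vidtr, mdsp_on, and mdsp_audtr in that order.

```python
from itertools import product

# Media coefficients copied from the model output (sales per unit of spend).
coefficients = {
    "mdsp_vidtr": 1.945686e02,
    "mdsp_on": 7.430407e01,
    "mdsp_audtr": -6.108916e01,
}
total_budget = 1_000_000


def incremental_sales(weights):
    """Modeled sales above baseline for a given split of the budget."""
    return sum(
        coef * w * total_budget
        for coef, w in zip(coefficients.values(), weights)
    )


current = incremental_sales([0.4, 0.3, 0.3])

# Hypothetical constraint: every channel gets between 10% and 60% of the budget.
best_weights, best_value = None, float("-inf")
for i, j in product(range(1, 7), repeat=2):
    k = 10 - i - j  # remaining share, in tenths, so the weights sum to 1
    if not 1 <= k <= 6:
        continue
    weights = (i / 10, j / 10, k / 10)
    value = incremental_sales(weights)
    if value > best_value:
        best_weights, best_value = weights, value

print(f"40/30/30 split: {current:,.0f}")
print(f"Best on grid:   {best_weights} -> {best_value:,.0f}")
```

Because the model is linear, the search simply pushes as much budget as the bounds allow toward the channel with the largest coefficient. That makes the bounds, not the model, the deciding factor, and any budget far outside the spend levels seen in the data is an extrapolation.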

## Forecasting

We set up a list called new_advertising_spend with the anticipated spending for the next month on the three advertising channels, then create a new dataset called new_data. It is structured like our original data but holds the anticipated spending values for the next month. The ‘const’ column is added with a value of 1 to account for the baseline.

We then use our previously trained regression model (model) to predict sales based on the anticipated advertising spending for the next month. The predict function takes new_data as input and calculates the expected sales.

In [8]:
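# Predict next month's sales based on anticipated advertising spend
new_advertising_spend = [60000, 35000, 17000]  # Anticipated spend for the next month

# Column names and order must match the training matrix X (const first):
# predict() pairs values with coefficients by position, not by name.
new_data = pd.DataFrame({'const': [1.0],  # Include a constant for forecasting
                         'mdsp_vidtr': [new_advertising_spend[0]],
                         'mdsp_on': [new_advertising_spend[1]],
                         'mdsp_audtr': [new_advertising_spend[2]]})
predicted_sales = model.predict(new_data)

# Print results
print(f"Budget Allocation: {budget_allocation}")
print(f"Predicted Sales: {predicted_sales[0]}")

Budget Allocation: [400000.0, 300000.0, 300000.0]

With the coefficients printed in In [6], the prediction works out to roughly 6.679e7 + 194.57 × 60000 + 74.30 × 35000 − 61.09 × 17000 ≈ 8.0e7. If ‘const’ is placed last instead of first, the 60000 spend is multiplied by the intercept and the prediction balloons to about 4.007e12 (4007180768677.104), which is not a meaningful forecast.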
