# Marketing Mix Modeling with Python Project


### Overview
- Import Libraries
- Import df
- Exploratory analysis
- OLS analysis
- Training, Testing, and Splitting the data
- Linear Regression Modeling
- Post modeling analysis



### Import Libraries

In [None]:
import pandas as pd
import plotly.express as px
import statsmodels.api as sm
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

### Import df

In [None]:
df = pd.read_csv('/dataset.csv')
df

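| | Time | sales | FB | TV | Radio |
|---|---|---|---|---|---|
| 0 | Week 1 | 22 | 230.1 | 37.8 | 69.2 |
| 1 | Week 2 | 10 | 44.5 | 39.3 | 45.1 |
| 2 | Week 3 | 9 | 17.2 | 45.9 | 69.3 |
| 3 | Week 4 | 19 | 151.5 | 41.3 | 58.5 |
| 4 | Week 5 | 13 | 180.8 | 10.8 | 58.4 |
| 5 | Week 6 | 7 | 8.7 | 48.9 | 75.0 |
| 6 | Week 7 | 12 | 57.5 | 32.8 | 23.5 |
| 7 | Week 8 | 13 | 120.2 | 19.6 | 11.6 |
| 8 | Week 9 | 5 | 8.6 | 2.1 | 1.0 |
| 9 | Week 10 | 11 | 199.8 | 2.6 | 21.2 |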


### Exploratory Analysis

In [None]:
# Plot only the numeric columns; df.columns would also pass 'Time' as a y series
fig = px.line(df, x='Time', y=['sales', 'FB', 'TV', 'Radio'])
fig.show()

In [None]:
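# numeric_only=True skips the text column 'Time', which pandas 2.x would otherwise reject
df.corr(numeric_only=True)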





| | sales | FB | TV | Radio |
|---|---|---|---|---|
| sales | 1.0 | 0.798049 | 0.396754 | 0.139209 |
| FB | 0.798049 | 1.0 | -0.106758 | -0.159078 |
| TV | 0.396754 | -0.106758 | 1.0 | 0.648448 |
| Radio | 0.139209 | -0.159078 | 0.648448 | 1.0 |


In [None]:
fig1 = px.scatter(df, x = "FB", y = "sales")
fig2 = px.scatter(df, x = "TV", y = "sales")
fig3 = px.scatter(df, x = "Radio", y = "sales")

fig1.show()
fig2.show()
fig3.show()

### OLS Analysis

In [None]:
inputs = ['FB', 'TV', 'Radio']

X = df[inputs]
y = df['sales']
# statsmodels does not fit an intercept by default, so add a 'const' column
X = sm.add_constant(X)

result = sm.OLS(y, X).fit()

print(result.summary())

                            OLS Regression Results                            
Dep. Variable:                  sales   R-squared:                       0.875
Model:                            OLS   Adj. R-squared:                  0.860
Method:                 Least Squares   F-statistic:                     60.60
Date:                Sun, 17 Dec 2023   Prob (F-statistic):           7.25e-12
Time:                        05:50:16   Log-Likelihood:                -58.368
No. Observations:                  30   AIC:                             124.7
Df Residuals:                      26   BIC:                             130.3
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          3.3613      0.909      3.697      0.0

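- R-squared measures the share of the variance in sales that the model explains, so it is the headline measure of fit. Values closer to 1 indicate a better in-sample fit, but what counts as acceptable depends on the context. R-squared never decreases when predictors are added, so adjusted R-squared is the fairer figure for comparing models.
- The F-statistic tests whether the model as a whole explains sales better than an intercept-only model. Judge it by its p-value, Prob (F-statistic), rather than by a fixed cutoff: here it is 7.25e-12, so the model is statistically significant.
- Radio has a negative coefficient, meaning that, with FB and TV held constant, sales decrease as Radio ad spending increases. Radio was dropped from the next model for this reason.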
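
The adjusted R-squared and the F-statistic can both be recovered from R-squared, the number of observations, and the number of predictors. The sketch below is a minimal pure-Python check using the rounded figures from the first summary (R-squared 0.875, 30 observations, 3 predictors). The helper names are illustrative and are not part of statsmodels, and the results differ slightly from the printed summary because the input R-squared is rounded.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Penalise R-squared for the number of predictors used."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


def f_statistic(r2, n_obs, n_predictors):
    """Explained variance per predictor over unexplained variance per residual df."""
    return (r2 / n_predictors) / ((1 - r2) / (n_obs - n_predictors - 1))


# Rounded values read off the first OLS summary (FB, TV, Radio)
adj = adjusted_r2(0.875, n_obs=30, n_predictors=3)
f_stat = f_statistic(0.875, n_obs=30, n_predictors=3)

print(adj, f_stat)  # close to the reported 0.860 and 60.60
```

The same functions applied to a model with fewer predictors show why adjusted R-squared can rise even when R-squared falls slightly.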

In [None]:
inputs = ['FB', 'TV']

X = df[inputs]
y = df['sales']
# Re-add the intercept column for the reduced model
X = sm.add_constant(X)

result = sm.OLS(y, X).fit()

print(result.summary())

                            OLS Regression Results                            
Dep. Variable:                  sales   R-squared:                       0.872
Model:                            OLS   Adj. R-squared:                  0.862
Method:                 Least Squares   F-statistic:                     91.84
Date:                Sun, 17 Dec 2023   Prob (F-statistic):           9.01e-13
Time:                        05:53:21   Log-Likelihood:                -58.729
No. Observations:                  30   AIC:                             123.5
Df Residuals:                      27   BIC:                             127.7
Df Model:                           2                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          3.1774      0.873      3.639      0.0

### Training, Testing, and Splitting the Data

In [None]:
# shuffle=False keeps the weeks in order: the first half trains the model, the second half tests it
X_train, X_test, y_train, y_test = train_test_split(X, y, shuffle=False, test_size=0.5)
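
With `shuffle=False`, the split is chronological: the earliest rows form the training set and the latest rows form the test set. The sketch below imitates that behaviour in plain Python so the resulting week ranges are easy to see. `chronological_split` is a hypothetical helper written for illustration, not a scikit-learn function, and the week labels assume the 30 weekly observations reported in the OLS summary.

```python
import math


def chronological_split(rows, test_size):
    """Keep the original order: first rows for training, last rows for testing."""
    n_test = math.ceil(len(rows) * test_size)
    n_train = len(rows) - n_test
    return rows[:n_train], rows[n_train:]


# Assumed labels for the 30 weekly observations
weeks = [f"Week {i}" for i in range(1, 31)]
train_weeks, test_weeks = chronological_split(weeks, test_size=0.5)

print(train_weeks[0], "to", train_weeks[-1])  # Week 1 to Week 15
print(test_weeks[0], "to", test_weeks[-1])    # Week 16 to Week 30
```

Keeping the order means the model is always evaluated on weeks that come after the ones it was trained on.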

### Linear Regression Modeling

In [None]:
model = LinearRegression()

model.fit(X_train, y_train)

# R-squared on the training half, then on the held-out test half
print(
    model.score(X_train, y_train),
    model.score(X_test, y_test)
)

0.8852707384190358 0.7585913209805945


In [None]:
model.coef_

array([0.        , 0.05432613, 0.1286808 ])

In [None]:
model.intercept_

3.131964924715666

In [None]:
X.columns

Index(['const', 'FB', 'TV'], dtype='object')
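
The coefficients are listed in the same order as `X.columns`. The `const` coefficient is 0 because `LinearRegression` fits its own intercept by default, which makes the constant column added by `sm.add_constant` redundant. The sketch below pairs the reported values with their column names and applies the resulting equation to the Week 1 spend from the data preview. `predict_sales` is an illustrative helper, not part of scikit-learn.

```python
# Values copied from the model.coef_, model.intercept_ and X.columns outputs
columns = ["const", "FB", "TV"]
coefs = [0.0, 0.05432613, 0.1286808]
intercept = 3.131964924715666

coef_by_name = dict(zip(columns, coefs))


def predict_sales(fb_spend, tv_spend):
    """sales = intercept + coef_FB * FB + coef_TV * TV"""
    return intercept + coef_by_name["FB"] * fb_spend + coef_by_name["TV"] * tv_spend


# Week 1 in the data preview: FB = 230.1, TV = 37.8, actual sales = 22
week1_prediction = predict_sales(230.1, 37.8)
print(week1_prediction)  # about 20.5
```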

### Post Modeling Analysis

In [None]:
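# Predict over the full data set so fitted and actual sales can be compared
df['prediction'] = model.predict(X)

# List the numeric columns explicitly; df.columns would also pass 'Time' as a y series
fig = px.line(df, x='Time', y=['sales', 'FB', 'TV', 'Radio', 'prediction'])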
fig.show()

In [None]:
# FB spend needed per additional unit of sales: reciprocal of the FB coefficient (about 0.05)
fb = 1/0.05
fb

20.0

In [None]:
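# TV spend needed per additional unit of sales: reciprocal of the TV coefficient (truncated to 0.12)
tv = 1/0.12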
tv

8.333333333333334
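
The two cells above take reciprocals of hand-typed coefficient values. A minimal sketch of the same calculation using the full coefficients reported by `model.coef_` is shown below. The variable names are illustrative, and the interpretation assumes the linear relationship holds across the whole range of spend.

```python
# Coefficients copied from the model.coef_ output (order: const, FB, TV)
coef_fb = 0.05432613
coef_tv = 0.1286808

# A coefficient is the sales gained per unit of spend,
# so its reciprocal is the spend needed per additional unit of sales
spend_per_sale_fb = 1 / coef_fb
spend_per_sale_tv = 1 / coef_tv

print(spend_per_sale_fb, spend_per_sale_tv)  # about 18.4 and 7.8
```

In the notebook itself, `1 / model.coef_[1]` and `1 / model.coef_[2]` would give the same figures without retyping the numbers.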