# Notes on MMM

### Table of contents
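[Technical implementation](#Technical-implementation)  
[1.1 Adstock function](#1.1-Adstock-function)  
[1.2 Hill Function](#1.2-Hill-Function)  
[1.3 Combining the carryover and the shape effect](#1.3-Combining-the-carryover-and-the-shape-effect)  
  
[Appendix](#Appendix)  
[Highlights and rough work](#Highlights-and-rough-work)

## Technical implementation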

### 1.1 Adstock function

The adstock function models the delayed (carryover) effect of media spend and is given by:
$$w_{t-l}= D^{(l-P)^2} \quad \text{for each } l \in [0,L)$$  
$$x_t^{*}=Adstock(x_t,\cdots,x_{t-L+1};L,P,D)=\frac{\sum_{l=0}^{L-1}w_{t-l}\,x_{t-l}}{\sum_{l=0}^{L-1}w_{t-l}}$$  
<ul>
    <li>D=retention rate</li>
    <li>P=Peak effect</li>
    <li>L=duration of media effect</li>
</ul>

In [1]:
#useful imports
import numpy as np
import pandas as pd

In [None]:
def make_weights(P,L,D):
    #w_l = D**((l-P)**2) for each lag l = 0,...,L-1
    lags=np.arange(L)
    weights=D**((lags-P)**2)

    #return a reversed version of the weights: within a window of X the
    #oldest observation (lag L-1) comes first and x_t (lag 0) comes last
    return weights[::-1]


def adstock_transform(X,P,L,D):
    transformed=[]
    #the weights are the same for every window of size L in X
    weights=make_weights(P,L,D)
    #convert to a float array so that the element-wise products work
    X=np.asarray(X,dtype=float)

    for idx in range(len(X)):
        if idx>=L-1:
            #full window: x_{t-L+1},...,x_t
            x_subarr=X[idx-L+1:idx+1]
            weights_subarr=weights
        else:
            #fewer than L-1 observations before x_t: use what is available
            #together with the weights of the most recent lags
            x_subarr=X[0:idx+1]
            weights_subarr=weights[(L-1)-idx:]
        #always normalise by the full weight sum, which treats the missing
        #early observations as zero spend
        adstocked_xi=np.sum(x_subarr*weights_subarr)/np.sum(weights)
        transformed.append(adstocked_xi)

    return np.array(transformed)
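
The loop in `adstock_transform` can also be expressed as a single convolution. The following is a minimal sketch rather than a replacement for the code above: `adstock_conv` is a hypothetical name, and it assumes the same convention as the loop, namely that spend before the start of the series counts as zero and every output is normalised by the full weight sum.

```python
import numpy as np

def adstock_conv(x, P, L, D):
    # weights indexed by lag l = 0, ..., L-1 (no reversal needed here)
    w = D ** ((np.arange(L) - P) ** 2)
    # np.convolve(x, w)[t] = sum_l w[l] * x[t - l]; terms before the start are implicitly zero
    return np.convolve(np.asarray(x, dtype=float), w)[: len(x)] / w.sum()

# a single burst of spend whose effect peaks one period later (P=1)
spend = [100, 0, 0, 0]
adstocked = adstock_conv(spend, P=1, L=3, D=0.5)
print(adstocked)
```

With these parameters the lag weights are 0.5, 1 and 0.5, so the burst of 100 is spread over three periods as 25, 50 and 25.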

### 1.2 Hill Function

The Hill function models diminishing returns (the shape effect) and is given by $$ Hill(x;K,S)=\frac{1}{1+(x/K)^{-S}}$$
- S = slope  
- K = half saturation point, i.e. the spend at which $Hill(x;K,S)=0.5$

Saturation is the state where extra spending no longer increases sales.  

In [None]:
def hill_transform(x,K,S):
    return 1 / (1 + (x / K)**(-S))

The Hill function suffers from poor identifiability; to mitigate this we fix S to 1. The resulting function is hereafter referred to as the reach transformation.
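
To make the roles of K and S concrete, here is a small sketch. The helper `hill` is a hypothetical stand-in for `hill_transform`: it uses the algebraically equivalent form $x^S/(x^S+K^S)$, which is also defined at zero spend, and the values of K and x are made up for illustration.

```python
import numpy as np

def hill(x, K, S):
    # equal to 1 / (1 + (x / K) ** (-S)) for x > 0, and returns 0 at x = 0
    x = np.asarray(x, dtype=float)
    return x**S / (x**S + K**S)

K = 50.0  # illustrative half saturation point
# at x = K the curve is at one half regardless of the slope S
half = [float(hill(K, K, S)) for S in (0.5, 1.0, 3.0)]

# with S fixed to 1 the curve simplifies to x / (x + K)
x = np.array([0.0, 10.0, 50.0, 200.0])
s1 = hill(x, K, S=1.0)
print(half, s1)
```

Fixing S to 1 leaves K as the only shape parameter, which is what makes the curve easier to identify from data.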

### 1.3 Combining the carryover and the shape effect 
  
Now we need to decide whether to apply the adstock or the shape (Hill) transform first. For datasets where spend is spread evenly across periods, the shape effect within any single period might not matter much, so we would apply the adstock first. In scenarios where ad spend is concentrated in a few periods you would do the inverse and apply the shape transform first.
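
The two orders are not interchangeable, which a toy example can illustrate. This is a sketch under assumptions: `adstock` and `hill` are compact hypothetical re-implementations of the transforms from sections 1.1 and 1.2, and the spend series and parameter values are invented.

```python
import numpy as np

def adstock(x, P, L, D):
    # normalised weighted sum over the last L periods (zero spend assumed before the start)
    w = D ** ((np.arange(L) - P) ** 2)
    return np.convolve(np.asarray(x, dtype=float), w)[: len(x)] / w.sum()

def hill(x, K, S):
    # equivalent to 1 / (1 + (x / K) ** (-S)), but defined at x = 0
    x = np.asarray(x, dtype=float)
    return x**S / (x**S + K**S)

spend = [10.0, 0.0, 0.0, 0.0]  # all spend concentrated in one period
params = dict(P=0, L=2, D=0.5)

adstock_first = hill(adstock(spend, **params), K=5.0, S=1.0)
shape_first = adstock(hill(spend, K=5.0, S=1.0), **params)
print(adstock_first, shape_first)
```

For this concentrated series the two orders give visibly different curves, so the choice should be made deliberately for each dataset.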

*Note: we assume media effects to be additive.*

The generic equation is given by:$$y_t= \tau\;+\sum_{m=1}^{M}\beta_m Hill(x_{t,m}^*;K_m,S_m) \;+ \sum_{c=1}^{C} \gamma_c z_{t,c} +\epsilon_t$$
where $\tau$ is the baseline sales, $x_{t,m}^*$ is the adstocked spend of media channel $m$ at time $t$, $\gamma_c$ is the effect of control variable $z_c$,  
and $\beta_m$ is the regression coefficient of media channel $m$.
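
The deterministic part of this equation (everything except $\epsilon_t$) can be sketched as follows, applying the adstock first and the Hill transform second. All names here (`predict_sales`, `adstock`, `hill`) and the array layout are assumptions made for illustration; a single lag length `L` shared by all channels is a simplification.

```python
import numpy as np

def adstock(x, P, L, D):
    w = D ** ((np.arange(L) - P) ** 2)
    return np.convolve(np.asarray(x, dtype=float), w)[: len(x)] / w.sum()

def hill(x, K, S):
    x = np.asarray(x, dtype=float)
    return x**S / (x**S + K**S)

def predict_sales(media, controls, tau, beta, gamma, P, L, D, K, S):
    """media: (T, M) spend, controls: (T, C); per-channel parameters have length M."""
    media = np.asarray(media, dtype=float)
    controls = np.asarray(controls, dtype=float)
    y = np.full(media.shape[0], tau, dtype=float)        # baseline tau
    for m in range(media.shape[1]):
        x_star = adstock(media[:, m], P[m], L, D[m])     # carryover effect
        y += beta[m] * hill(x_star, K[m], S[m])          # shape effect, scaled by beta_m
    return y + controls @ np.asarray(gamma, dtype=float) # control variables

# one channel with constant spend equal to K, no carryover (L=1), one inactive control
y_hat = predict_sales(
    media=np.full((3, 1), 2.0), controls=np.zeros((3, 1)),
    tau=1.0, beta=[4.0], gamma=[0.0], P=[0], L=1, D=[0.5], K=[2.0], S=[1.0],
)
print(y_hat)
```

In the toy call the spend sits exactly at the half saturation point, so the channel contributes half of $\beta_m$ on top of the baseline in every period.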

In [4]:
data = pd.read_csv("data.csv", index_col="wk_strt_dt", parse_dates=True)

## Appendix

### Highlights and rough work

<ul>
    <li>46 control variables, and 13 media channels</li>
    <li>ROAS = return on ad spend</li>
    <li>MROAS= marginal ROAS</li>
</ul>

The procedure is as follows:

1. Fit a regression based on our priors for our parameters  
2. Calculate the sales attributable to each channel using our model: a channel's contribution is the predicted sales with that channel minus the predicted sales without it (its spend set to zero).   
3. Use channel contribution to calculate ROAS and MROAS
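
Steps 2 and 3 can be sketched as below. This is an illustration under assumptions, not the notebook's implementation: `predict` stands for any fitted model that maps a `(T, M)` spend array to predicted sales, `channel_roas` and `channel_mroas` are hypothetical helpers, the 1% uplift used for the marginal version is an assumed choice, and the toy linear model exists only to make the arithmetic easy to follow.

```python
import numpy as np

def channel_roas(predict, media, m):
    """Contribution of channel m (with minus without) divided by its total spend."""
    media = np.asarray(media, dtype=float)
    without = media.copy()
    without[:, m] = 0.0                        # counterfactual: channel switched off
    contribution = predict(media) - predict(without)
    return contribution.sum() / media[:, m].sum()

def channel_mroas(predict, media, m, uplift=0.01):
    """Extra predicted sales per extra unit of spend for a small relative increase."""
    media = np.asarray(media, dtype=float)
    bumped = media.copy()
    bumped[:, m] *= 1.0 + uplift               # counterfactual: slightly higher spend
    extra_sales = (predict(bumped) - predict(media)).sum()
    return extra_sales / (uplift * media[:, m].sum())

# toy stand-in for a fitted model: sales = 10 + 2 * channel_0 + 0.5 * channel_1
toy_predict = lambda X: 10.0 + X @ np.array([2.0, 0.5])
media = np.array([[1.0, 4.0], [3.0, 0.0]])

roas = [channel_roas(toy_predict, media, m) for m in range(2)]
mroas_0 = channel_mroas(toy_predict, media, 0)
print(roas, mroas_0)
```

For a linear toy model ROAS and MROAS coincide; with a saturating Hill curve the marginal value would typically be lower than the average one.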

- SEM = paid search ads

- A good model should take previous touch points into account.

Since the model is multiplicative it is of the form:
$$ y= \beta_{0}\cdot x_{var1}^{\beta_{var1}} \cdots x_{control1}^{\beta_{control1}}$$
  
hence we take the log of both sides and get a model that is linear in the logged variables:
$$\log y = \log\beta_{0}+\beta_{var1}\log x_{var1} + \cdots +\beta_{control1}\log x_{control1}$$
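
Because the logged model is linear, its exponents can be recovered with ordinary least squares on the logged variables. The sketch below uses synthetic, noise-free data with made-up coefficients purely to show the mechanics; `x_var1` and `x_control1` mirror the symbols in the equation and are not columns of the real dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
x_var1 = rng.uniform(1.0, 10.0, size=50)
x_control1 = rng.uniform(1.0, 10.0, size=50)

# synthetic multiplicative model: beta_0 = 3, exponents 0.4 and 0.2 (invented values)
y = 3.0 * x_var1**0.4 * x_control1**0.2

# regress log y on an intercept and the logged regressors
design = np.column_stack([np.ones_like(x_var1), np.log(x_var1), np.log(x_control1)])
coef, *_ = np.linalg.lstsq(design, np.log(y), rcond=None)

beta_0 = np.exp(coef[0])  # the intercept estimates log(beta_0)
print(beta_0, coef[1], coef[2])
```

Note that the intercept of the logged regression is $\log\beta_0$, so it has to be exponentiated to get back to the original scale.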

- Media spend has a delayed effect; this is modeled through the adstock function  
$$w_{t-l}= D^{(l-P)^2} \quad \text{for each } l \in [0,L)$$  
$$x_t^{*}=Adstock(x_t,\cdots,x_{t-L+1};L,P,D)=\frac{\sum_{l=0}^{L-1}w_{t-l}\,x_{t-l}}{\sum_{l=0}^{L-1}w_{t-l}}$$  
<ul>
    <li>D=retention rate</li>
    <li>P=Peak effect</li>
    <li>L=duration of media effect</li>
</ul>
 

- Diminishing returns are modeled by the Hill function  
- Hill function: $$ Hill(x;K,S)=\frac{1}{1+(x/K)^{-S}}$$
