Adstock and Saturation transformations #94

Closed
fedemendez1 opened this issue Jun 14, 2021 · 13 comments

Comments

@fedemendez1

Hi team,

I have a question: do you apply the adstock and saturation (Hill) functions to the spend vector or to the execution vector? If the latter, will it work the same way with GRPs as with impressions data?

Thanks,

R

@gufengzhou
Contributor

Hi, what do you mean by "execution vector"? If you mean non-spend media variables, then yes: we apply adstock and saturation to all variables specified in set_mediaVarName, regardless of whether they are GRPs, impressions, or spend.

@fedemendez1
Author

My questions would be:

  1. Do the adstock and saturation functions need to be applied to the GRPs/impressions data or to the spend data?
  2. If the answer to 1) is GRPs/impressions data, do the following lines of code work the same way whether I have GRPs or impressions data?

I am asking because when I give adstocked_impressions to the below function, the x_scurve vector that I get has most of its elements close to zero, and I thought this might be because impressions data is usually "big figures" and maybe I should be giving GRPs values instead?

Thanks a lot for clarifying this.

    ## step 3: s-curve transformation
    gammaTrans <- round(quantile(seq(range(x_normalized)[1], range(x_normalized)[2], length.out = 100), gamma), 4)
    x_scurve <- x_normalized**alpha / (x_normalized**alpha + gammaTrans**alpha)

@gufengzhou
Contributor

Robyn's adstock can be applied to spend or to GRPs, impressions, clicks, etc. The kind of metric and the size of the numbers don't really matter, as long as the data is numeric. "Close to zero" is not really a problem either, because the Hill function maps any input to the 0-1 range, AFAIK. Does that help? If not, what error do you see exactly?
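
To illustrate why the magnitude of the input should not matter, here is a minimal sketch based on the two lines quoted earlier in this thread. The function name `hill_transform`, the sample figures, and the assumption that GRPs are a constant multiple of impressions are all hypothetical, and the `round(..., 4)` step from the quoted code is left out so that the comparison is exact.

```r
# Hill (s-curve) transformation, following the snippet quoted in this thread.
# hill_transform is a hypothetical helper name, not a Robyn function.
hill_transform <- function(x, alpha, gamma) {
  # Inflexion point: the gamma-quantile of an evenly spaced grid over the range of x
  gamma_trans <- quantile(seq(min(x), max(x), length.out = 100), gamma, names = FALSE)
  x^alpha / (x^alpha + gamma_trans^alpha)
}

# Hypothetical weekly impressions, and the same series on a GRP-like scale
impressions <- c(0, 120000, 450000, 3000000, 800000, 50000)
grps <- impressions / 25000

s_imp <- hill_transform(impressions, alpha = 2, gamma = 0.5)
s_grp <- hill_transform(grps, alpha = 2, gamma = 0.5)

all.equal(s_imp, s_grp)  # the two transformed series coincide
range(s_imp)             # all values lie between 0 and 1
```

Because the inflexion point is derived from the range of the input, multiplying the input by a constant multiplies the inflexion point by the same constant, and the ratio is unchanged. A single very large week will still push the other weeks toward zero, whatever the unit.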

@fedemendez1
Author

[Image: plot of x_scurve by week for one channel]

This is the x_scurve I get for one of my channels after applying adstock and saturation (using impressions).
See how most of the weeks are close to zero? Can that be an issue?
Is this x_scurve vector going to be used directly as an independent variable in the model?

@gufengzhou
Contributor

gufengzhou commented Jun 15, 2021

x_scurve is just a transformation of your x_adstocked, and ultimately of your raw x, so it reflects the shape of your input data. Does the raw impressions series look similar? If yes, it only means you had a big campaign around Christmas 2018, so it looks fine to me.
Yes, x_scurve goes into the fitting by glmnet.

@fedemendez1
Author

Ok, that makes sense, thanks a lot for your quick responses!

@fedemendez1
Author

Hello, sorry for re-opening, but I have another question.

Is there a reason why the output of the saturation transformation, x_scurve, is bounded between 0 and 1?

Thanks.

R

@gufengzhou
Contributor

gufengzhou commented Jul 1, 2021

Hi, this is just how the Hill function works: x ** alpha / (x ** alpha + gamma ** alpha). For a fixed x > 0, when gamma ** alpha approaches 0 the function approaches 1, and when gamma ** alpha approaches positive infinity the function approaches 0.
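
The limiting behaviour described above can be checked numerically. This is a small sketch with hypothetical values; `hill` is an illustrative helper that takes the inflexion point `gamma` directly, not a Robyn function.

```r
# Plain Hill function with the inflexion point passed in directly
hill <- function(x, alpha, gamma) x^alpha / (x^alpha + gamma^alpha)

x <- 10
near_one  <- hill(x, alpha = 2, gamma = 1e-6)  # gamma^alpha near 0: value close to 1
near_zero <- hill(x, alpha = 2, gamma = 1e6)   # gamma^alpha very large: value close to 0
half      <- hill(x, alpha = 2, gamma = x)     # x equal to gamma: value is 0.5

c(near_one, near_zero, half)
```

For non-negative x the numerator can never exceed the denominator, which is why the output always stays between 0 and 1.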

@fedemendez1
Author

Do you think it would be correct to scale up x after the saturation transformation?

I am trying a multiplicative model approach, but when I apply log() to x_saturated, I get negative contributions even with a positive coefficient.

Would be great to get your expert opinion. Thanks!

@gufengzhou
Contributor

Cool that you're trying to customise. Because x_saturated is between 0 and 1, log(x_saturated) will be <= 0, so when the coefficient is positive, the contribution = coef * log(x_saturated) is negative. This post might help you specify a log link in glmnet, which should look like glmnet(x, log(y), ...), so you should probably log the dependent variable directly in cv.glmnet and glmnet. BUT after this, the whole decomposition in Robyn probably won't work, because a multiplicative equation is quite difficult to decompose, which is one of the reasons we haven't touched this part yet. You will probably need to customise the decomp as well. Just so you know.
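
A quick numeric check of the sign argument above, using made-up values; `x_saturated` and `coef_media` here are hypothetical and not taken from any fitted model.

```r
x_saturated <- c(0.05, 0.20, 0.60, 0.95)  # hypothetical Hill outputs in (0, 1]
coef_media  <- 0.8                        # hypothetical positive coefficient

log_x        <- log(x_saturated)   # every value is <= 0
contribution <- coef_media * log_x # so every contribution is <= 0

data.frame(x_saturated, log_x, contribution)
```

Note that a week with zero media gives x_saturated = 0 and log(0) = -Inf in R, so zeros would need separate handling before taking logs.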

@fedemendez1
Author

Exactly. Yes, I am in fact testing log-log, so log in both y and x.
Yes, I noticed your decomp function won't work, which is why I am trying another decomp function, but that led me to realize that log(x_saturated) is negative and therefore the contribution will be negative.
But if I rescale your x_saturated upward, I can make log(x_saturated) positive. Does doing that make sense? Variance of x_saturated will remain the same of course...

@gufengzhou
Contributor

Yes, you can try to upscale. Decomp is still difficult for log-log, though, because the current decomp function only works for a linear additive model. Let me know if you make progress!
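
For anyone trying the same thing, here is a rough sketch of what upscaling does in a log-log fit. It uses simulated data and base R's lm() rather than glmnet, purely as an assumption to keep the example self-contained; the scaling factor of 100 and all variable names are hypothetical.

```r
set.seed(1)
n <- 200
x_saturated <- runif(n, min = 0.01, max = 1)  # hypothetical Hill outputs, strictly positive
# Simulated response following a log-log relationship with slope 0.5
y <- exp(2 + 0.5 * log(x_saturated) + rnorm(n, sd = 0.1))

fit_raw    <- lm(log(y) ~ log(x_saturated))
x_scaled   <- 100 * x_saturated               # hypothetical upscaling factor
fit_scaled <- lm(log(y) ~ log(x_scaled))

slope_raw    <- unname(coef(fit_raw)[2])
slope_scaled <- unname(coef(fit_scaled)[2])
intercept_shift <- unname(coef(fit_raw)[1] - coef(fit_scaled)[1])

all.equal(slope_raw, slope_scaled)                  # the slope is unchanged
all.equal(intercept_shift, slope_raw * log(100))    # the intercept absorbs slope * log(100)
```

Since log(100 * x) = log(100) + log(x), upscaling only moves a constant between the intercept and the media term; the fitted values are identical. The sign and size of a media "contribution" computed as coef * log(x) therefore depend on the chosen scale, which is worth keeping in mind when designing a custom decomposition.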

@fedemendez1
Author

Will do, many thanks!
