
Decomposition vs. Prediction #85

Closed
wpro-ds opened this issue May 26, 2021 · 2 comments

@wpro-ds

wpro-ds commented May 26, 2021

Hi Robyn team,

Thanks for the great package! I have been experimenting with it and seeing positive results. My question is somewhat related to #79. In resolving that issue, you mentioned that Robyn is meant to be used as a decomposition tool rather than a prediction tool. I think it would be useful to have some predictive functionality in the model. My questions are the following:

  1. How do we ensure that the model is reliable (i.e. validate the model) and that we can trust its recommendations? In classic ML approaches, we answer this question based on prediction error on hold-out data (a rough sketch of what I mean is below this list). In the absence of this predictive functionality in Robyn, what approaches do you recommend? P.S. This is a critical issue for getting buy-in when requesting increased budgets :)

  2. In Make predictions #79, you mentioned that it is controversial how best to provide the future dataframe for intercept/trend/season/other baselines. Could you shed some light on that? What are the issues?
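
To illustrate what I mean in point 1, this is roughly the kind of time-ordered hold-out check I have in mind. It's only a sketch with made-up file and column names and a generic ridge fit, not anything from Robyn's API:

```python
# Rough sketch of hold-out validation (hypothetical column names; this is
# not a Robyn function, just a generic time-ordered split):
# fit on the earliest 80% of weeks, measure prediction error on the rest.
import pandas as pd
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_percentage_error

df = pd.read_csv("mmm_data.csv").sort_values("ds")  # hypothetical weekly data
cutoff = int(len(df) * 0.8)
train, test = df.iloc[:cutoff], df.iloc[cutoff:]

features = ["tv_spend", "search_spend"]             # hypothetical predictors
model = Ridge(alpha=1.0).fit(train[features], train["revenue"])
print("hold-out MAPE:", mean_absolute_percentage_error(
    test["revenue"], model.predict(test[features])))
```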

Again, thanks for this wonderful package; I'm looking forward to future releases.

@gufengzhou
Contributor

Hi, thanks for trying out Robyn!

  1. For your information, we removed the time-series out-of-sample validation about a month ago. One important reason is that we want to build a new feature that lets MMM users refresh the initial model with new data, which directly conflicts with our previous OOS validation approach. As you know, Robyn uses ridge regression, an approach that prevents overfitting by design. To be precise, we do run a 100-fold cross-validation for the ridge lambda (see the first sketch below this list). This is the main reason we're confident going without time-series OOS validation.
  2. For example, if you use Prophet for forecasting, you'll need to provide the future dataframe (see the Prophet sketch below this list). While for some predictors (trend/season/weekday etc.) you can use the default predicted values from Prophet, for other predictors you have to make some strong assumptions. For example, if you have competitor activity as a predictor, you'd have to somehow predict the future competition itself first. Weather as a predictor is another example: you'd need to forecast it, which is a topic of its own. Another example would be Covid, if you have it in the model; we all know it's not easy to predict Covid. That's why.
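
To make the lambda point above a bit more concrete, here's a minimal sketch of cross-validating the ridge penalty over a grid of candidate values. It uses scikit-learn rather than Robyn's actual implementation, and the data file, column names, fold count, and grid are hypothetical:

```python
# Minimal sketch (not Robyn's actual code): pick the ridge penalty strength
# by cross-validation over a grid of candidate values, analogous to the
# lambda cross-validation described above. Column names are hypothetical.
import numpy as np
import pandas as pd
from sklearn.linear_model import RidgeCV

df = pd.read_csv("mmm_data.csv")                          # hypothetical input file
X = df[["tv_spend", "search_spend", "competitor_index"]]  # hypothetical predictors
y = df["revenue"]                                         # hypothetical target

alphas = np.logspace(-3, 3, 100)                # 100 candidate penalty values
model = RidgeCV(alphas=alphas, cv=5).fit(X, y)  # evaluated by cross-validation
print("selected penalty:", model.alpha_)
```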
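
And here's a minimal sketch of the future-dataframe problem with Prophet: trend and seasonality are filled in by make_future_dataframe() and the model itself, but any extra regressor (competition, weather, Covid) has to be supplied for the future periods by you. Again, the file and column names are made up, and this is not Robyn code:

```python
# Minimal sketch (not Robyn's actual code) of why forecasting needs a
# future dataframe: Prophet handles trend/seasonality itself, but every
# extra regressor must be provided for the future rows by the user.
import pandas as pd
from prophet import Prophet

df = pd.read_csv("mmm_data.csv")  # hypothetical: columns ds, y, competitor_index

m = Prophet()
m.add_regressor("competitor_index")
m.fit(df[["ds", "y", "competitor_index"]])

future = m.make_future_dataframe(periods=13, freq="W")
# Prophet knows nothing about future competition -- supplying these values is
# an assumption the modeller must make (here: naively carry the last value forward).
future["competitor_index"] = df["competitor_index"].reindex(future.index).ffill()
forecast = m.predict(future)
```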

To summarise, Robyn's recommendations are ultimately based on your model choices. If you ask how you can know whether you've selected the "right" decay and saturation for your media, the only way to know is experimental calibration. In the spirit of "all models are wrong, some are useful", we believe only experiments can give you certainty; a model that is closer to experiment is therefore "more correct". Hope it makes sense.

@wpro-ds
Author

wpro-ds commented Jun 2, 2021

Thanks for the responses! I appreciate it and look forward to the new features.

wpro-ds closed this as completed Jun 2, 2021