Integration with Modeltime #2

Closed · mdancho84 opened this issue Jul 10, 2020 · 7 comments
@mdancho84 (Collaborator) commented Jul 10, 2020

Hey @simonpcouch & @topepo

I'd like to open this issue to keep track of how I plan to use stacks within the modeltime forecasting framework. There shouldn't be anything additional required on your part to make the integration happen. On my end, I'll simply allow a model_stack that has been "fitted" (i.e., one that contains a "member_fits" list element) into modeltime_table().
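For reference, a minimal sketch of the kind of check I have in mind (the helper name `is_fitted_stack` is hypothetical, and it assumes fitted stacks keep their trained members in a `member_fits` element):

```r
# Hypothetical helper sketching how modeltime could verify that a stack
# is fitted before accepting it; assumes fitted stacks carry their
# trained members in a "member_fits" element.
is_fitted_stack <- function(object) {
  inherits(object, "model_stack") && !is.null(object[["member_fits"]])
}
```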

Once stacks is released, just be aware that changing argument names or object class names will break modeltime until I can catch up.


Plan

The goal is to integrate model_stack objects into the modeltime forecasting workflow similar to how I integrate workflow objects.

It's quite simple - add the fitted model_stack to a Modeltime Table just like you add a fitted workflow.

[Screenshot: a fitted model stack added to a Modeltime Table alongside fitted workflows]

Then the fitted model stack will follow the same forecasting workflow.

[Screenshot: the standard modeltime forecasting workflow applied to the fitted model stack]
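A rough sketch of the intended usage, assuming the integration lands as planned (`fitted_stack`, `test_tbl`, and `full_tbl` are placeholder objects):

```r
library(dplyr)
library(modeltime)

# A fitted model_stack drops into a Modeltime Table alongside fitted
# workflows (this is the planned integration, not current behavior).
models_tbl <- modeltime_table(fitted_stack)

# From there, the standard calibrate/forecast steps apply unchanged.
models_tbl %>%
  modeltime_calibrate(new_data = test_tbl) %>%
  modeltime_forecast(new_data = test_tbl, actual_data = full_tbl)
```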

To achieve this result, there are only a few requirements: intricacies of time series cross-validation and of the modeltime forecasting workflow that you need to be aware of.

Modeltime Forecasting Workflow Requirements:

  • Objects must be "fitted" stacks, meaning a predict() method can be called on them and works just like calling predict(workflow, new_data). There should therefore be an easy way to determine whether a stack has been fitted, since only a fitted stack can be added to a Modeltime Table. It looks like this can be detected by checking whether the model has a "member_fits" element.
  • For sequential models (e.g. ARIMA, Exponential Smoothing, RNN, LSTM), stacks must preserve the time-based sequence. Note that this is only required for sequential models; non-sequential models like Random Forest can use ordinary cross-validation. rsample and timetk should already take care of this (see the resampling sketch after this list).
    • Cross Validation: rsample::rolling_origin() or timetk::time_series_cv() as the grid tuning strategy
    • Final Evaluation: rsample::initial_time_split() or timetk::time_series_split() as the final training and evaluation sets.
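A sketch of the time-ordered resampling setup for sequential models (`data_tbl`, its `date` column, and the period strings are all placeholders):

```r
library(timetk)

# Grid-tuning resamples that preserve time order, for sequential models;
# the period arguments are illustrative.
resamples <- time_series_cv(
  data       = data_tbl,
  date_var   = date,
  initial    = "12 months",
  assess     = "3 months",
  skip       = "3 months",
  cumulative = TRUE
)

# Final evaluation: an ordered train/test split rather than a random one.
splits <- time_series_split(
  data_tbl,
  date_var = date,
  initial  = "12 months",
  assess   = "3 months"
)
```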
@simonpcouch (Collaborator)

Sounds great! Thanks for all the detail. The features you're relying on feel like they should be pretty stable right now, but I'll make sure to let you know if any of them change.

@mdancho84 (Collaborator, Author)

The only thing I need is to get the predict.model_stack() function working. Once that happens, I can begin testing with modeltime.
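For context, the call I need mirrors the workflows interface (`fitted_stack` and `future_tbl` are placeholders):

```r
# Assuming predict.model_stack() ends up mirroring predict() on a
# fitted workflow, modeltime would call it the same way:
preds <- predict(fitted_stack, new_data = future_tbl)
```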

@simonpcouch (Collaborator)

Yeah, that's up for the next week or two! I'll drop a note here when we get the basics working.

@mdancho84 (Collaborator, Author)

Ok, that would be great. I'll then work on modeltime integration. I'm excited about this!

@mdancho84 (Collaborator, Author)

Once we get the naming conventions down in #13, I’m going to begin working on the Modeltime integration.

One concern I have is the butcher integration (#10). I have a modeltime_refit() method that retrains models on new data. I'm thinking the stack's member models will need to be retrained to refit the stack on the full time series. My plan is to use the member models, so I'm hoping butcher won't chop those out.
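For reference, the refit step in question looks roughly like this (`calibration_tbl` and `full_tbl` are placeholders for a calibrated Modeltime Table and the complete series):

```r
library(dplyr)
library(modeltime)

# Retrain every model in the table on the complete time series. For a
# stack, this is where the member models would still need to be present
# (i.e., not removed by butcher).
refit_tbl <- calibration_tbl %>%
  modeltime_refit(data = full_tbl)
```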

@simonpcouch (Collaborator)

Fair warning that development will probably slow down quite a bit for the next month or so and then pick back up. I don't imagine we'd undergo any changes re: #13 beyond finding and replacing function names, so the API should otherwise remain stable in that sense. :-)

Still need to spend more time on the butcher methods before I'll have a good sense of which operations will still be able to be carried out. Thinking about what a refit would look like, though: if that method uses a new training set, you'd probably need to start back up at the tuning-candidates step, since the data stack is made up of the collated assessment-set predictions from the tune results.
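To make that concrete, a sketch of rebuilding the stack under a new training set (function names per the current plan in #13, so they may shift; `model_res_1` and `model_res_2` stand in for tuning results computed on the new resamples):

```r
library(dplyr)
library(stacks)

# The data stack is built from out-of-sample (assessment set)
# predictions collected during tuning, so new training data means
# re-tuning the candidates before the stack can be re-blended.
model_stack <- stacks() %>%
  add_candidates(model_res_1) %>%
  add_candidates(model_res_2) %>%
  blend_predictions() %>%  # re-estimate the stacking coefficients
  fit_members()            # refit member models on the new training data
```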

github-actions bot commented Mar 6, 2021

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

github-actions bot locked and limited conversation to collaborators on Mar 6, 2021