Validation Strategy and in/out of sample dataframes #13
Simply put, the training data is never returned to the user; it serves as both the training and validation set. The training portion is hardcoded at 70%, from what I can recall (in the future this could be exposed as a parameter). The in-sample forecast covers the remaining 30%, which acts as the test set; the out-of-sample forecast is the extrapolation beyond the data, without any test labels. Imagine predicting whether bankruptcies will occur without having labels to confirm it. This project needs an immense amount of attention, but thank you for appreciating it in its current form. Once some of my commitments ease up, I intend to turn it into something more accessible and lightweight.
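For concreteness, here is a minimal sketch of the chronological 70/30 split described above. The `chronological_split` helper and its `train_frac` argument are hypothetical illustrations of the idea, not part of the package's API.

```python
import pandas as pd

def chronological_split(df: pd.DataFrame, train_frac: float = 0.70):
    """Split a time-indexed frame into train/test windows without shuffling.

    `train_frac` defaults to the ~70% figure mentioned above; it is a
    hypothetical parameter here, not one the package currently exposes.
    """
    cutoff = int(len(df) * train_frac)
    return df.iloc[:cutoff], df.iloc[cutoff:]

# Hypothetical daily series to illustrate the split.
df = pd.DataFrame({"y": range(100)},
                  index=pd.date_range("2020-01-01", periods=100, freq="D"))
train, test = chronological_split(df)
# `train` is what the models are fit on; the in-sample forecast is compared
# against `test`; the out-of-sample forecast extends past the last date in `df`.
```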
Thanks, it's clear now. About the models being fit on the training data: isn't there value in knowing and returning the performance of the models on the training set? And how is validation done for the in-sample forecasts? Are the forecasts produced all at once (i.e., as multi-step-ahead forecasts), or do you apply something like rolling/walk-forward validation (re-training and forecasting for each next time step in the test set)? I assume the former is applied, but would like to confirm. Nice to hear about the project's future potential; I could make some contributions as well.
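For reference, this is a minimal sketch of the rolling/walk-forward scheme being asked about here (the reply below confirms the package instead produces one multi-step forecast). The `walk_forward_validation` helper and its `fit_and_forecast` callable are hypothetical names used only for illustration.

```python
import pandas as pd

def walk_forward_validation(series: pd.Series, n_test: int, fit_and_forecast):
    """Refit on all data observed so far and forecast one step ahead,
    repeating for each point in the test window.

    `fit_and_forecast(history)` is a placeholder callable that trains a
    model on `history` and returns the next-step prediction.
    """
    history = list(series.iloc[:-n_test])
    predictions = []
    for t in range(n_test):
        # Predict the next step from everything seen so far.
        predictions.append(fit_and_forecast(pd.Series(history)))
        # Reveal the true value before moving to the next step.
        history.append(series.iloc[len(series) - n_test + t])
    return pd.Series(predictions, index=series.index[-n_test:])
```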
I guess it all depends on your purpose. Would you mind looking into how one can return the training decompositions for GluonTS and AutoArima, both used in the package? I think I know how to do it for Prophet.
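For Prophet, one plausible way to recover the in-sample fit and decomposition is simply to predict over the training dates; GluonTS and AutoArima would need their own equivalents. The `train_df` frame below is a made-up example, and the import path depends on the Prophet release.

```python
import pandas as pd
from prophet import Prophet  # `from fbprophet import Prophet` in older releases

# Hypothetical training frame with Prophet's expected `ds`/`y` columns.
train_df = pd.DataFrame({
    "ds": pd.date_range("2019-01-01", periods=200, freq="D"),
    "y": range(200),
})

m = Prophet()
m.fit(train_df)

# Predicting over the training dates returns the fitted values (`yhat`)
# together with the trend and seasonal components, i.e. the in-sample decomposition.
fitted = m.predict(train_df[["ds"]])
print(fitted[["ds", "yhat", "trend"]].head())
```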
One multi-step forecast; the parameter called len is the number of steps. You're very helpful in making me consider how to approach the future course of this project. Please keep playing with it and let me know if you come across additional steps that need clarification, suggestions, and so on. I would be happy to have someone on board. Let me know if you want to take some of these development tasks onto your own shoulders (I have a list of ideas at the end of the package). Best,
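A minimal usage sketch of the workflow discussed in this thread, written from memory of the AtsPy README: the multi-step horizon is set once, and the in-sample and out-of-sample dataframes come from separate calls. The file name is hypothetical and the constructor arguments may differ between versions.

```python
import pandas as pd
from atspy import AutomatedModel

# Hypothetical univariate daily series; AtsPy expects a date-indexed frame
# with a single value column.
df = pd.read_csv("example_series.csv", index_col=0, parse_dates=True)

# `forecast_len` is the "len" parameter referred to above: the number of
# multi-step-ahead points produced in one shot.
am = AutomatedModel(df=df, model_list=["Prophet", "HWAMS"], forecast_len=20)

forecast_in, performance = am.forecast_insample()  # forecasts over the held-out ~30%
forecast_out = am.forecast_outsample()             # extrapolation beyond the observed data
```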
That's good to hear. It's really nice work so far, and as you said there are of course a lot of ideas and plenty of open space for constant improvement and enhancement of the project. At the moment I'm experimenting with a certain use case that might lead to a published research paper, probably also using Atspy, so it could be cited as well. My work/personal schedule is a bit hectic this period, so I will get back to you when it's more convenient for contributing. Regards,
Thank you for this interesting and promising project. I've been using it lately and I have a couple of questions:
How are the forecasts generated in the in-sample and out-of-sample dataframes? As one-step-ahead or multi-step-ahead forecasts?
From my experiments I perceive the forecast_in dataframe as the forecasts generated on the validation set and the forecast_out dataframe as the forecasts generated on the test set, as shown in the attached image. Do I have this right?
Is there a way to access the fitted values on the training set?