StatsForecast.predict expects wrong dataframe shape on X_df #798
Comments
Actually the quick fix |
Hey. The X_df argument is for future values of exogenous features and you don't have any, so you don't need to provide it. |
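A rough sketch of why the shape check complains here. Only the formula comes from the line quoted in the issue body; the helper name expected_x_shape is hypothetical, and the reading of the columns is an assumption: that the stored data holds the target plus any exogenous columns, and that X_df should carry the id and timestamp columns plus one column per exogenous feature.

```python
def expected_x_shape(h: int, n_series: int, n_data_cols: int) -> tuple:
    """Hypothetical helper mirroring the check quoted in the issue:
    expected_shape = (h * len(self.ga), self.ga.data.shape[1] + 1)
    """
    return (h * n_series, n_data_cols + 1)


H = 12                                     # forecast horizon
N_SERIES = 1                               # one time series
TRAIN_COLUMNS = ["unique_id", "ds", "y"]   # training frame without exogenous features

# Assumption: the stored data keeps everything except the id and timestamp
# columns, i.e. the target plus any exogenous features (here: only y).
n_data_cols = len(TRAIN_COLUMNS) - 2

expected = expected_x_shape(H, N_SERIES, n_data_cols)
# Passing rows of the training frame as X_df brings the y column along.
passed = (H * N_SERIES, len(TRAIN_COLUMNS))

print(expected, passed)  # (12, 2) (12, 3)
```

Under these assumptions the extra column is y itself, which is why leaving X_df out is a cleaner fix than relaxing the check.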
@jmoralez so what is the correct way to predict future values for some input other than the training dataframe? |
@Vitorbnc You can just run |
@elephaint but I would like to do the following:
How can I supply an unseen input sequence to the model and get it to predict the next sequence? |
In an ARIMA model, we train on [x_t, x_{t+1}, ..., x_{t+T}] for a particular time series. We can then make predictions for that series only, for an arbitrarily long period following x_{t+T} (by setting the horizon in our predict function). We can optionally add exogenous variables during training, and during the prediction period (the latter by including them in
You can't 'supply an unseen input sequence' to an ARIMA model. If you have an unseen input sequence, you would normally train an ARIMA model on that sequence and then create forecasts for a horizon using the newly trained model. Each new time series requires a new ARIMA model. |
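The "one model per series" point can be illustrated without any forecasting library. The sketch below is not the statsforecast ARIMA implementation: it swaps in a plain AR(1) model, y[t] = c + phi * y[t-1], fitted by least squares, and the names fit_ar1 and forecast_ar1 are hypothetical. It only shows that parameters are estimated from one series and then rolled forward for h steps from the end of that same series.

```python
from typing import List, Tuple


def fit_ar1(y: List[float]) -> Tuple[float, float]:
    """Least-squares fit of y[t] = c + phi * y[t-1] on a single series."""
    prev, curr = y[:-1], y[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    var = sum((a - mean_prev) ** 2 for a in prev)
    cov = sum((a - mean_prev) * (b - mean_curr) for a, b in zip(prev, curr))
    phi = cov / var
    c = mean_curr - phi * mean_prev
    return c, phi


def forecast_ar1(params: Tuple[float, float], last_value: float, h: int) -> List[float]:
    """Roll the fitted recursion forward h steps from the last observation."""
    c, phi = params
    out = []
    value = last_value
    for _ in range(h):
        value = c + phi * value
        out.append(value)
    return out


series_a = [1.0, 2.0, 4.0, 8.0, 16.0]   # doubles at every step
series_b = [10.0, 9.0, 8.0, 7.0, 6.0]   # drops by one at every step

# Each series gets its own fit; parameters are not shared between them.
params_a = fit_ar1(series_a)
params_b = fit_ar1(series_b)

forecast_a = forecast_ar1(params_a, series_a[-1], h=3)
forecast_b = forecast_ar1(params_b, series_b[-1], h=3)
```

Forecasting series_b with params_a would continue the doubling pattern of series_a, which is why an unseen sequence needs its own fit.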
@elephaint ok then, that makes sense. Thanks for the explanation. I am trying to do the same with I am sure there is no |
Thanks - it's very hard to debug based on this picture alone. Can you share a minimal working example of your code? Based on the picture, I can only suggest double-checking whether train_df contains any NaN values. Note that you are training and testing on the same timestamps - I assume that is on purpose (as it's something you'd normally want to avoid in forecasting)? That is, you're supplying the full train_df as the training set and using a subset of train_df as the test set. Hence, any test results you get will not be representative of the actual forecasting performance. Normally one would do something like this:
in order to properly separate the train and test sets. |
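A minimal, library-free sketch of such a split, assuming a single series stored as rows with the usual unique_id, ds and y fields. The function name temporal_split and the sample data are hypothetical and are not the code originally posted in this comment; with a DataFrame sorted by ds the same idea is a row slice.

```python
from datetime import date


def temporal_split(rows, h):
    """Hold out the last h observations of one series as the test set."""
    ordered = sorted(rows, key=lambda r: r["ds"])
    if not 0 < h < len(ordered):
        raise ValueError("h must be between 1 and len(rows) - 1")
    return ordered[:-h], ordered[-h:]


# Hypothetical data: 24 monthly observations of one series.
rows = [
    {"unique_id": "series_1", "ds": date(2022 + i // 12, i % 12 + 1, 1), "y": float(i)}
    for i in range(24)
]

train_rows, test_rows = temporal_split(rows, h=12)

# Every training timestamp precedes every test timestamp.
print(train_rows[-1]["ds"], test_rows[0]["ds"])  # 2022-12-01 2023-01-01
```

The model is then fitted on train_rows only, and its h-step forecast is compared against test_rows.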
@elephaint Yes, I will separate them in the real use case, for now I am just trying to write a single function that can take ML, Stats and Neural models and predict future data for comparison.
|
Thanks for the code. I ran into three issues when running it; fixing all three produces correct forecasts.
I'd advise you to read the end-to-end walkthrough of ML Forecast, which may help you avoid these and potential further issues. |
What happened + What you expected to happen
I am trying to supply a subset of the training dataset to X_df in sf.predict, but I am getting the error
ValueError: Expected X to have shape (12, 2), but got (12, 3)
Changing
expected_shape = (h * len(self.ga), self.ga.data.shape[1] + 1)
to
expected_shape = (h * len(self.ga), self.ga.data.shape[1] + 2)
seems to fix the issue.
Versions / Dependencies
pytorch 2.2.0 cpu_py311hd080823_0
pytorch-lightning 2.2.1 pyhd8ed1ab_0 conda-forge
statsforecast 1.7.3 pyhd8ed1ab_0 conda-forge
statsmodels 0.14.1 py311h59ca53f_0 conda-forge
Reproduction script
Issue Severity
High: It blocks me from completing my task.