Prediction on unseen data #83
Hi @randomgitdude, check out #67 and the updated tutorial at https://github.com/jdb78/pytorch-forecasting/blob/master/docs/source/tutorials/stallion.ipynb.
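The pattern the stallion tutorial uses for building prediction data can be sketched roughly as follows — a minimal pandas sketch, not the tutorial's exact code; the `df`, `price`, and `max_prediction_length` names here are hypothetical:

```python
import pandas as pd

# Hypothetical history frame with the package's usual integer time index
df = pd.DataFrame({"time_idx": [0, 1, 2], "price": [9.0, 9.5, 10.0]})
max_prediction_length = 2  # hypothetical forecast horizon

# Take the last observed row and repeat it for each future step,
# bumping time_idx so the decoder rows cover the forecast horizon
last = df[df["time_idx"] == df["time_idx"].max()]
decoder_data = pd.concat(
    [last.assign(time_idx=last["time_idx"] + step)
     for step in range(1, max_prediction_length + 1)],
    ignore_index=True,
)

print(decoder_data["time_idx"].tolist())  # → [3, 4]
```

The known covariates for those future rows would then be filled in with their true future values where available.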
@AlexMRuch Thank you for pointing out that updated tutorial. However, I have a few questions:
But I wonder if this is the correct approach, or am I missing something?
For point 1, do you mean scaling the future target and covariate data, or scaling the past historical data that the model was trained on? If you mean the latter, I'm not sure, as my time-series skills are not that great, and @jdb78 may have more thoughts. For historical data, I think you can still do all the scaling you need on the …

For point 2, I'm glad to hear you got the predictions working. I implemented the forecasting methods just as @jdb78 did in the tutorial, and the results have face validity with what I'd expect (and they do differ from the evaluation plots), so that's about all I can say on whether the approach is correct.
Future target data.

I was referring to future data. For the historical data, the data is actually scaled in the training class, so my assumption is that those same statistics should be used to scale the future data. Why? Simply because you have to scale the future values according to the mean and std of the training data, not according to the mean and std of the future values themselves. As for point 2, maybe @jdb78 can elaborate?
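The point above can be sketched in a few lines of NumPy — a toy example with made-up numbers; `train` and `future` are hypothetical series, not anything from the library:

```python
import numpy as np

train = np.array([10.0, 12.0, 14.0, 16.0])   # data the model was trained on
future = np.array([18.0, 20.0])              # unseen future values

# Fit the scaling statistics on the TRAINING data only...
mu, sigma = train.mean(), train.std()

# ...and apply them to the future values; never compute the
# mean/std from the future values themselves
future_scaled = (future - mu) / sigma
```

Scaling the future values with their own statistics would leak information and put them on a different scale than the inputs the model saw during training.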
Ah, yeah, I definitely see your point and am curious to know what best practice is as well! Thanks for clarifying!
Issue #51 sheds some light on this. There are basically two approaches, and both are implemented in PyTorch Forecasting.
The first option is off the table for various reasons, at least IMHO.
In practice, there should be minimal leakage from normalising on the entire training set instead of the encoder if the variable in question is not the target. Normalising anything else on the encoder sequence only would probably not work, because the normalisation would not be stable. If you want to contribute this feature, feel invited to raise a PR!
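The trade-off between the two normalisation approaches can be illustrated with a toy NumPy sketch — the values are made up, and the dataset-level configuration PyTorch Forecasting actually uses for this is not shown here:

```python
import numpy as np

series = np.arange(1.0, 11.0)    # toy training series of 10 points
encoder = series[-4:]            # pretend the last 4 points are one encoder window

# Approach 1: normalise with statistics of the ENTIRE training series
# (stable statistics, but uses information from outside the window)
z_global = (encoder - series.mean()) / series.std()

# Approach 2: normalise with statistics of the encoder window only
# (no leakage, but unstable for short or nearly constant windows)
z_encoder = (encoder - encoder.mean()) / encoder.std()
```

With only a handful of points per window, the encoder-only statistics jump around from sample to sample, which is the instability mentioned above.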
Ok, so a few questions:
Sure.
Hope this is helpful.
Hi,

Firstly, thank you for your time, work, and commitment that went into this package; it's all good stuff. Yet one thing I'm struggling with is how to check predictions on data that is not visible to the trainer class (from the documentation). I guess I should append it to the original data, but do you have any good practices you can share?
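Appending the unseen rows to the history the model has already seen can be sketched like this — a hypothetical pandas example; the frame and column names are illustrative, not part of the library:

```python
import pandas as pd

# `history` was visible to the trainer; `unseen` was not
history = pd.DataFrame({"time_idx": [0, 1, 2], "value": [1.0, 2.0, 3.0]})
unseen = pd.DataFrame({"time_idx": [3, 4], "value": [4.0, 5.0]})

# Append the unseen rows so the model still gets enough encoder history
full = pd.concat([history, unseen], ignore_index=True)
```

The combined frame can then be handed to a prediction dataset, since the model needs the trailing encoder window of history in front of the new rows.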