
Choose proper moving average kernel for short input #4

Closed
PeihanDou opened this issue Nov 25, 2021 · 5 comments

Comments

@PeihanDou

Hello! Thank you for your well-commented code! I'm currently using Autoformer on data with a very short input length, e.g. only 8 timestamps. I noticed that the default moving average kernel size in the series decomposition part is 25, which may be too long for the input in this case. I tried smaller kernels such as 3, 5, and 7, but the model performed worse on the validation dataset. Do you have any suggestions for adjusting hyperparameters for short inputs? Any suggestion would be appreciated. Thank you!

@wuhaixu2016
Collaborator

Hi, thanks for using this repo.
(1) Input length
I think the input length should be reconsidered. We discuss the input length in Appendix C of the paper. Generally speaking, longer inputs provide more information, which can benefit forecasting. The appropriate input length is also affected by the sampling rate. Thus, I suggest re-checking the data pattern when determining the input length.

(2) Use only the decoder (if your prediction horizon is long)
If your limitation is only on the input length while the forecasting horizon is long, you can adopt only the Autoformer decoder and use the input length as the 'label_len' in this repo.

(3) Both input and output are short
In this condition, you can remove the moving average and use only the Auto-Correlation (see the sketch below). Shorter time series contain simpler temporal patterns, so you may not need the decomposition.
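For reference, a minimal sketch of what "removing the moving average" could look like, assuming the decomposition layer follows the `series_decomp` interface in `layers/Autoformer_EncDec.py` (returning a `(seasonal, trend)` pair); this is a hypothetical modification, not an option the repo exposes:

```python
import torch
import torch.nn as nn

class identity_decomp(nn.Module):
    """Hypothetical drop-in for series_decomp that disables the moving average.

    It keeps the assumed (seasonal, trend) return signature, but passes the
    input through unchanged as the seasonal part and returns an all-zero
    trend, so only the Auto-Correlation path acts on the series.
    """
    def forward(self, x):
        # x: [batch, length, channels]
        return x, torch.zeros_like(x)
```

Substituting this wherever the encoder/decoder layers instantiate `series_decomp(kernel_size)` keeps the layers' expected interface intact, rather than deleting the decomposition calls outright.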

@PeihanDou
Author

Thank you for your response! It is very insightful.

For (2), could you clarify a bit more? If we only use the decoder, how should the encoder's output be handled? Or do you mean that the decoder generates all of Q, K, and V and serves as the whole model? Thank you!

@wuhaixu2016
Collaborator

wuhaixu2016 commented Nov 25, 2021

I mean the latter case: the decoder generates all of Q, K, and V and serves as the whole model.
In your case, the encoder seems meaningless if it only captures the information of 8 time points. You can adopt the Autoformer decoder to aggregate the past information and generate the future. Note that in this case the decoder does not have the cross information, so it only contains one Auto-Correlation block, which makes it more like an encoder (a rough sketch follows).
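As a rough illustration (not code from this repo), a decoder-style layer with only self Auto-Correlation might look like the following, assuming an `AutoCorrelationLayer` with the usual `(queries, keys, values, attn_mask)` call from `layers/AutoCorrelation.py`:

```python
import torch.nn as nn

class SelfOnlyLayer(nn.Module):
    """Hypothetical single-branch layer: self Auto-Correlation only.

    Q, K, and V all come from the same short input sequence and there is
    no cross attention, so the block behaves like an encoder layer even
    though it plays the decoder's role of generating the future.
    """
    def __init__(self, self_attention, d_model, dropout=0.1):
        super().__init__()
        self.self_attention = self_attention  # e.g. an AutoCorrelationLayer
        self.norm = nn.LayerNorm(d_model)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, attn_mask=None):
        # Queries, keys, and values all come from the same input x.
        new_x, _ = self.self_attention(x, x, x, attn_mask=attn_mask)
        return self.norm(x + self.dropout(new_x))
```

In such a setup, the 8-step input would be fed to the decoder as the 'label_len' part, and the model would predict the horizon directly from it.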

@PeihanDou
Author

Thank you very much! That makes sense!

@Med-Rokaimi

For (3), how can I remove the moving average? I tried removing it from the settings, but I get an error.
