Ran Single Dataset with UniST, got weird results. #14
Hi Yilun,

Thank you for your question and for trying out our code! The issue you're encountering is related to the initialization and training of certain modules in the model. During the pretraining stage, we primarily focus on training the core model. The prompting stage, however, introduces new components, such as the memory pools, which are crucial for effective performance. These components are not optimized during pretraining and are randomly initialized. Therefore, if you run the zero-shot evaluation script immediately after the pretraining stage, without going through the prompt-tuning stage, the results will not be correct, because these newly introduced modules have not been trained yet.

The correct pipeline for zero-shot inference is:

1. Pretrain the core model (`pretrain.sh`).
2. Run the prompt-tuning stage to optimize the newly introduced components (memory pools, etc.).
3. Run the zero-shot evaluation (`zero_shot.sh`).

By following this pipeline, you ensure that all components of the model are properly trained and optimized.
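For concreteness, here is the pipeline written out as commands, using the BikeNYC example from the original report below. The pretrain and zero-shot lines mirror the ones quoted in this thread; the prompt-tuning flag set is a sketch assembled from flags that appear elsewhere in this thread, and the `<...>` path is a placeholder, so consult the repository's prompt-tuning script for the exact arguments:

```sh
# Stage 1: pretrain the core model with prompting disabled (--prompt_ST 0).
python main.py --device_id 0 --machine machine --dataset BikeNYC --task short --size middle \
  --mask_strategy_random 'batch' --lr 3e-4 --used_data 'diverse' --prompt_ST 0 --few_ratio 1.0

# Stage 2: prompt-tune -- enable prompting (--prompt_ST 1) and load the pretrained
# checkpoint so that the memory pools and other prompting modules actually get
# trained. This flag set is an assumption; see the repo's prompt-tuning script.
python main.py --device_id 0 --machine machine --dataset BikeNYC --task short --size middle \
  --prompt_ST 1 --num_memory_spatial 512 --num_memory_temporal 512 --prompt_content 's_p_c' \
  --used_data 'diverse' \
  --file_load_path experiments/Pretrain_Dataset_BikeNYC_Task_short_FewRatio_1.0/model_save/model_best

# Stage 3: zero-shot evaluation (--few_ratio 0.0 means no target training data),
# loading the prompt-tuned checkpoint rather than the pretrained one.
python main.py --device_id 0 --machine machine --dataset BikeNYC --task short --size middle \
  --prompt_ST 1 --pred_len 6 --his_len 6 --num_memory_spatial 512 --num_memory_temporal 512 \
  --prompt_content 's_p_c' --used_data 'diverse' \
  --file_load_path <path_to_prompt_tuned_model_best> --few_ratio 0.0
```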
Hi Yuan, thanks for your reply! An additional question: if I want to train on one dataset (say, BikeNYC) and zero-shot evaluate on a different dataset (say, TrafficCD), do I need to run the prompt-tuning script? If so, I should run prompt-tuning on BikeNYC, right?
You have two options:

1. Evaluate the base model without prompting: remove the prompting design by modifying the corresponding setting (the `--prompt_ST` flag in the commands quoted in this thread) when running the evaluation.
2. Retain the prompting design: run the prompt-tuning script on the source dataset. This step is necessary to optimize the additional model parameters, such as the memory pools, before performing zero-shot evaluation on the target dataset.
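A minimal sketch of both options for the BikeNYC → TrafficCD case; the exact flags for disabling prompting at evaluation time and for the prompt-tuning stage are assumptions based on the flags used elsewhere in this thread, and the `<...>` paths are placeholders:

```sh
# Option 1: zero-shot evaluate the pretrained base model on the target dataset,
# with the prompting design removed. Assumes --prompt_ST 0 also disables
# prompting at evaluation time.
python main.py --device_id 0 --machine machine --dataset TrafficCD --task short --size middle \
  --prompt_ST 0 --pred_len 6 --his_len 6 --used_data 'diverse' \
  --file_load_path <path_to_BikeNYC_pretrained_model_best> --few_ratio 0.0

# Option 2: run the prompt-tuning stage on the source dataset (BikeNYC), exactly
# as in the pipeline sketched earlier, then rerun the zero-shot command with
# --dataset TrafficCD and --file_load_path pointing at the prompt-tuned checkpoint.
```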
I will try both. Thanks.

Thanks, I succeeded in getting reasonable results.
Hello Yuan,

This is Yilun Jin (HKUST). Thanks very much for your insightful work and for sharing the code!

I was trying to get a glimpse of how your code works, so I ran a simple experiment: training (`pretrain.sh`) on one dataset and zero-shot evaluating (`zero_shot.sh`) on the same dataset, which (I suppose) should be equivalent to ordinary, dataset-specific spatio-temporal forecasting. What I did was:

1. Run `pretrain.sh` with the following line:

   ```sh
   python main.py --device_id 0 --machine machine --dataset BikeNYC --task short --size middle --mask_strategy_random 'batch' --lr 3e-4 --used_data 'diverse' --prompt_ST 0 --few_ratio 1.0
   ```

   The model is saved at `experiments/Pretrain_Dataset_BikeNYC_Task_short_FewRatio_1.0/model_save/model_best`.

2. Run `zero_shot.sh` with the following line:

   ```sh
   python main.py --device_id 0 --machine machine --task short --size middle --prompt_ST 1 --pred_len 6 --his_len 6 --num_memory_spatial 512 --num_memory_temporal 512 --prompt_content 's_p_c' --dataset BikeNYC --used_data 'diverse' --file_load_path experiments/Pretrain_Dataset_BikeNYC_Task_short_FewRatio_0.5/model_save/model_best --few_ratio 0.0
   ```

   The evaluation results (I suppose) should be in `src/experiments/Test_Dataset_BikeNYC_Task_short_FewRatio_0.0/result.txt`.

However, the results I got were even below HA (the historical-average baseline).

I think I might be doing something wrong here, so I have listed everything I did; maybe you can help me see where I went wrong.

Best,
Yilun