How can I test other settings on few-shot ability of OFA? #11
Hi @karin0018 , thank you for your interest! Yes, your config for inference is correct. However, you might also want to change the training n_way like here and here. This parameter controls the maximum n_way that the model sees during training. Direct inference on 40-way is technically possible, but you might want to train a model that has seen 40-way classification before inference. Hope this helps. We will push a major update and some bug fixes along with our camera-ready version, stay tuned.
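The constraint described above can be sketched as follows (the function and parameter names here are hypothetical, for illustration only; the real config keys live in the files linked in the reply): each training episode samples at most `train_max_n_way` classes, so a model trained with a 20-way cap never sees an episode as wide as a 40-way test.

```python
import random

def sample_episode_classes(all_classes, train_max_n_way):
    """Sample the class set for one training episode (illustrative sketch).

    n_way is drawn up to train_max_n_way, so a model trained with
    train_max_n_way=20 never sees a 40-way episode during training.
    """
    n_way = random.randint(2, min(train_max_n_way, len(all_classes)))
    return random.sample(all_classes, n_way)

# With train_max_n_way=20, episodes never exceed 20 classes even if
# the dataset (e.g. ogbn-arxiv) has 40 labeled classes available.
classes = list(range(40))
episode = sample_episode_classes(classes, train_max_n_way=20)
assert 2 <= len(episode) <= 20
```

This is why raising only the inference-time n_way can underperform: the wider test episodes are out of distribution for the trained model.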
Thanks for your reply! I tried training with max n_ways=20 on the arXiv dataset and max n_ways=40 on FB15K-237. After training for 20 epochs, I got results like this:
Is that reasonable? Also, I attempted to use your original training settings with
Hi @karin0018, we just identified a dumb bug in the few-shot scenario; it was fixed by this commit. Can you try pulling the repo again and retraining the model? In case you are interested in the cause of the bug: for a node in the i-th class, only the i-th class node should be labeled as positive; however, this implementation accidentally labeled the i-th through the last node as positive (see the commit for details). Hence, you were getting near random-guess results. We just tried training with the updated code, and you should be able to reproduce the paper's results. For the training time, do you mean end-to-end training or few-shot training? The end-to-end training script is expected to run for roughly 2 days for 50 epochs on an NVIDIA A100. However, it is abnormal for the low-resource experiment to take that long; did you also use a 40-class scenario? Again, sorry for the bug. We are serious about the reproducibility of our work and will make sure similar things don't happen again. Meanwhile, we have implemented a multi-GPU version that should be online in the next few days; hope that will alleviate the training time issue.
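A minimal sketch of the labeling bug described above (illustrative, not the repository's actual code): for a query node of class i in an n-way episode, the one-hot target should mark only position i, but a slice assignment marks classes i through n-1 as positive, which destroys the training signal.

```python
import numpy as np

def make_target_buggy(class_idx: int, n_way: int) -> np.ndarray:
    """Reproduces the bug: classes i..n-1 are all flagged positive."""
    labels = np.zeros(n_way)
    labels[class_idx:] = 1  # slice assignment instead of a single index
    return labels

def make_target_fixed(class_idx: int, n_way: int) -> np.ndarray:
    """Correct one-hot target: only class i is positive."""
    labels = np.zeros(n_way)
    labels[class_idx] = 1
    return labels

# For a node of class 2 in a 5-way episode:
print(make_target_buggy(2, 5))  # [0. 0. 1. 1. 1.] -- multiple classes "positive"
print(make_target_fixed(2, 5))  # [0. 0. 1. 0. 0.]
```

With the buggy targets, most classes receive positive labels most of the time, so the trained classifier degenerates toward the near-random accuracy reported above.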
Thanks for your detailed reply, I'll try again. ^-^ About the training time: I ran the low-resource experiment using the given command:
I see, that might be it. If you don't really care about graph tasks, you can remove chemblpre from the tasks, which should speed up training a lot.
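The suggestion above amounts to dropping `chemblpre` from the configured task list. A rough sketch of what that could look like (the key names below are illustrative only; match the actual schema in `config/task_config.yaml`):

```yaml
# illustrative only -- check config/task_config.yaml for the real structure
tasks:
  - arxiv
  - FB15K237
  # - chemblpre   # removed to skip molecule pretraining and speed up training
```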
Thanks for your advice; I tried it again. ^-^
Hi @karin0018 , sorry for the late reply. Interestingly, we are getting better results on arXiv and worse results on FB, but I think the margins are reasonable. The most likely cause is that you used more ways for testing, which changed the split, and consequently the results differ. (Few-shot experiments are sensitive to the class split.) We just pushed a new version; if you'd like, you can pull and try again.
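The split sensitivity mentioned here can be seen with a toy example (hypothetical splitting logic, for illustration only): which classes end up held out for testing depends on how many ways you request, so accuracies measured at different n_way are not directly comparable.

```python
import random

def split_classes(all_classes, n_test_way, seed=0):
    """Hold out n_test_way classes for few-shot testing; the rest train."""
    rng = random.Random(seed)
    test = rng.sample(all_classes, n_test_way)
    train = [c for c in all_classes if c not in test]
    return train, test

classes = list(range(40))
_, test_5 = split_classes(classes, 5)
_, test_20 = split_classes(classes, 20)
# Different n_way => different held-out class sets => different reported
# accuracy, even with the same seed and the same underlying model.
print(sorted(test_5))
print(sorted(test_20))
```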
Hi! First of all, thanks for your amazing work!
In Section E.2 you provide some experiment results spanning more ways and shots on ogbn-arxiv and FB15K237 datasets, such as 3/5-ways on ogbn-arxiv and 10/20-ways on FB15K237. I want to test other ways on these datasets, like 10/20/30/40-ways, how should I do it?
Can I just modify `config/data_config.yaml` and `config/task_config.yaml` like this?
[config snippets for `config/data_config.yaml` and `config/task_config.yaml`]
If not, what should I do to test other few-shot settings?