Thanks for your excellent work. Recently I have been trying to reproduce some results of VILLA and run pre-training on in-domain datasets. I am curious whether it is possible to simply adapt the adversarial training code in train_vqa_adv.py to the pre-training stage. Is there any specific configuration for adversarial training during pre-training?
Sorry for the late response due to the holiday season. Yes, you can basically follow the adversarial training code in train_vqa_adv.py to get adversarial pre-training working. We also plan to release the pre-training code; thanks for the reminder. Please stay tuned, and we will get this done as soon as possible.
Meanwhile, you can also try it yourself. There is nothing specific that you need to worry about: basically, follow the pre-training configuration file of the UNITER code base, and then add the adversarial-training-related hyper-parameters. Hope this helps!
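For readers adapting this themselves, the core loop in train_vqa_adv.py is a FreeLB-style adversarial scheme: perturb the input embeddings, refine the perturbation for a few PGD ascent steps while accumulating the loss, then take one model update. The sketch below illustrates that pattern on a toy logistic-regression "encoder" in plain NumPy; all names and hyper-parameter values (adv_lr, adv_steps, adv_max_norm) are illustrative, not the repo's actual API.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 2)) * 0.1     # toy stand-in for model parameters
x = rng.normal(size=(4, 16))           # stand-in for input embeddings
y = np.array([0, 1, 0, 1])             # labels

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Hypothetical adversarial hyper-parameters, mirroring the kind of
# fields one would add to the pre-training config.
adv_lr, adv_steps, adv_max_norm, lr = 1e-1, 3, 0.5, 1e-1

delta = np.zeros_like(x)               # adversarial perturbation
grad_W_total = np.zeros_like(W)        # gradients accumulated across PGD steps

for _ in range(adv_steps):
    p = softmax((x + delta) @ W)
    p_err = p.copy()
    p_err[np.arange(len(y)), y] -= 1.0           # dL/dlogits for cross-entropy
    # Accumulate the parameter gradient of the (averaged) adversarial loss.
    grad_W_total += (x + delta).T @ p_err / (len(y) * adv_steps)
    # Gradient *ascent* on delta to make the perturbation adversarial.
    grad_delta = p_err @ W.T / len(y)
    delta = delta + adv_lr * grad_delta / (np.linalg.norm(grad_delta) + 1e-12)
    # Project delta back onto the L2 ball of radius adv_max_norm.
    n = np.linalg.norm(delta)
    if n > adv_max_norm:
        delta *= adv_max_norm / n

W -= lr * grad_W_total                 # single model update after all PGD steps
```

The same structure carries over to pre-training: only the loss (masked language modeling, image-text matching, etc.) and the embedding layers being perturbed change, while the inner PGD loop stays the same.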
Hi, Zhe,
Thanks for your response. I have a follow-up question. When I run the pre-training code in this VILLA repo with the default setting (workers=4), training is very slow and GPU utilization is very low. When I set the workers to 8 or higher, it raises the following error.
Did you observe the same phenomenon during training? What was your training speed during pre-training?
Thanks for trying our code. Empirically, we did not encounter the problem you mentioned. How low is your GPU utilization?
We ran the pre-training code on our internal Microsoft GPU clusters and did not observe low utilization. It may be caused by your RAM size, disk speed, or other constraints. When you ran the fine-tuning code, did you also see the same low-utilization problem? Thanks.
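One quick way to tell whether the input pipeline (rather than the GPU) is the bottleneck is to time iteration over the data loader by itself, with no model step. The helper below is a generic sketch (not part of the VILLA repo) that works with any iterable of batches; if this alone is much slower than your forward/backward step, raising workers, pinning memory, or moving the data to a faster disk is the place to look.

```python
import time

def time_loader(loader, n_batches=50):
    """Time iteration over a data loader alone.

    If this takes much longer than n_batches model steps, data loading
    is the bottleneck rather than GPU compute.
    """
    it = iter(loader)
    start = time.perf_counter()
    for _ in range(n_batches):
        try:
            next(it)
        except StopIteration:
            break
    return time.perf_counter() - start

# Usage with a trivial stand-in loader (a real run would pass the
# actual pre-training DataLoader here):
elapsed = time_loader([list(range(1000)) for _ in range(50)])
```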