Hi, I'm trying to fine-tune the HERO pretrained model on the TVR dataset, but with 5000 or 10000 training steps I failed to reach the performance reported in the paper.

1. How many training steps are needed to fine-tune on TVR?
2. Is the number of GPUs critical to performance? I'm running this fine-tuning with 4 GPUs.
3. The paper doesn't say anything about hard negative sampling, but it seems to be important. Have you done an ablation study on hard negatives? Could you share your experience?
We have provided the best training config. The performance reported in the paper is from 5000 steps on 8 GPUs.

The number of GPUs does affect performance, because our hard negative sampling is conducted across all GPUs. With fewer GPUs, each query sees fewer examples per training step.
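To make the GPU-count effect concrete, here is a minimal sketch of why the hard negative pool shrinks with fewer GPUs. The function names and the mining strategy (top-k by similarity) are illustrative assumptions, not HERO's actual implementation; in practice the candidates would be gathered across processes with a distributed all-gather.

```python
# Illustrative sketch (not HERO's actual code): when negatives are mined
# across the global batch, the candidate pool scales with the GPU count.

def negative_pool_size(per_gpu_batch, num_gpus):
    # Every query can be contrasted against all other examples
    # in the global batch (per_gpu_batch * num_gpus), minus its positive.
    return per_gpu_batch * num_gpus - 1

def mine_hard_negatives(sim_row, positive_idx, k):
    """Pick the k highest-similarity candidates, excluding the positive."""
    candidates = [(score, i) for i, score in enumerate(sim_row)
                  if i != positive_idx]
    candidates.sort(reverse=True)  # hardest (most similar) first
    return [i for _, i in candidates[:k]]

# With a per-GPU batch of 8:
print(negative_pool_size(8, 4))  # 31 candidate negatives on 4 GPUs
print(negative_pool_size(8, 8))  # 63 candidate negatives on 8 GPUs

# Toy similarity row: index 0 is the positive, mine the 2 hardest negatives.
print(mine_hard_negatives([0.1, 0.9, 0.5, 0.3], positive_idx=0, k=2))  # [1, 2]
```

So at 4 GPUs the model trains against roughly half the negatives per step, which is one plausible reason the paper's numbers are hard to match without 8 GPUs.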
For hard negatives, we strictly followed the original TVR work on how the model is trained. Please have a look at their repo.

Thanks.