
How many train steps are needed to get the performance of the paper when finetuning TVR dataset? #14

Closed
liveseongho opened this issue Mar 10, 2021 · 2 comments


@liveseongho

Hi, I'm trying to finetune on the TVR dataset with the HERO pretrained model, but with 5000 or 10000 training steps I fail to reach the performance reported in the paper.

  1. How many training steps are needed to finetune on the TVR dataset?
  2. Is the number of GPUs critical to performance? I'm running this finetuning with 4 GPUs.

Also, the paper doesn't describe the hard negative sampling at all, but it seems to be important.
3. Have you done an ablation study on hard negatives? Could you share your experience?

@linjieli222
Owner

Hi,

Thanks for your interest in this project.

  1. We have provided the best training config. The performance reported in the paper is from 5000 steps on 8 GPUs.

  2. The number of GPUs will affect performance, as our hard negative sampling is conducted across all GPUs. With fewer GPUs, the model sees fewer examples in a single training step (see the sketch below).

  3. For hard negatives, we strictly followed how the model is trained in the original TVR work. Please check their repo.

Thanks.
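To make point 2 concrete, here is a minimal sketch of how hard-negative pooling across GPUs can work, assuming a standard PyTorch `torch.distributed` setup. The function names (`gather_candidates`, `hardest_negative_scores`) are hypothetical illustrations, not HERO's actual API:

```python
import torch
import torch.distributed as dist


def gather_candidates(local_emb: torch.Tensor) -> torch.Tensor:
    """Concatenate candidate embeddings from all GPUs.

    Falls back to the local batch when not running distributed,
    so the pool of negatives scales with the number of GPUs."""
    if not (dist.is_available() and dist.is_initialized()):
        return local_emb
    gathered = [torch.zeros_like(local_emb) for _ in range(dist.get_world_size())]
    dist.all_gather(gathered, local_emb)
    # all_gather does not carry gradients from other ranks; re-insert the
    # local tensor so its own gradient path is preserved.
    gathered[dist.get_rank()] = local_emb
    return torch.cat(gathered, dim=0)


def hardest_negative_scores(query_emb: torch.Tensor, cand_emb: torch.Tensor) -> torch.Tensor:
    """Score each query against the pooled candidates and return, per query,
    the highest-scoring non-matching candidate (assumes query i matches
    candidate i within this rank's slice of the pool)."""
    pool = gather_candidates(cand_emb)          # (B * world_size, D)
    scores = query_emb @ pool.t()               # (B, B * world_size)
    b = query_emb.size(0)
    rank = dist.get_rank() if dist.is_initialized() else 0
    idx = torch.arange(b, device=scores.device)
    scores[idx, rank * b + idx] = float("-inf")  # mask out the positives
    return scores.max(dim=1).values              # hardest negative per query
```

With 4 GPUs instead of 8 at the same per-GPU batch size, the pooled candidate set above is half as large, so the hardest negative found per query tends to be easier; training longer or increasing the per-GPU batch size may partially compensate, but may not exactly reproduce the 8-GPU numbers.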

@liveseongho
Author

Thanks for your quick response! 😃
