
joint training hyper parameters #7

Closed
lxa9867 opened this issue Apr 4, 2022 · 9 comments

lxa9867 commented Apr 4, 2022

Hi,

Thank you for sharing your work. I am writing to inquire about the hyperparameters used for joint training. The arXiv paper mentions that
[screenshot of the paper's training details]
which says that the joint training uses 32 V100 GPUs and 2 video clips for each GPU.

I assume this means 32 GB V100 GPUs, but I don't think it is possible to fit 2 video clips within 32 GB of memory. I cannot reproduce the results using 8 V100 32 GB GPUs with 1 clip per GPU. Could you give me some advice? Thank you!

lxa9867 (Author) commented Apr 5, 2022

BTW, could you share your script for pretraining? Thanks!

wjn922 (Owner) commented Apr 5, 2022

Hi,

We use 32 V100 GPUs with 32 GB memory. And sorry for the mistake: it is actually 1 clip per GPU for joint training and 2 for pretraining.
The joint training and pretraining stages each take around 1~2 days, depending on the backbone.

We do not support multi-node training in the repo for now. For single-node training, the pretraining script is:

python3 -m torch.distributed.launch --nproc_per_node=8  --use_env \
main_pretrain.py  --dataset_file all --binary --with_box_refine \
--batch_size 2 --num_frames 1 \
--epochs 12 --lr_drop 8 10 \
[backbone]
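
For reference, the effective batch size implied above is 32 GPUs × 1 clip = 32 clips per iteration for joint training, versus 8 GPUs × 1 clip = 8 in the setup described in this issue. Below is a hypothetical sketch (generic PyTorch, not this repo's code) of approximating a larger effective batch on fewer GPUs with gradient accumulation; the model, data, and learning rate are placeholders.

# Hypothetical sketch, not from this repo: approximate the paper's joint-training
# effective batch (32 GPUs x 1 clip = 32 clips/iter) on 8 GPUs x 1 clip by
# accumulating gradients over 4 micro-batches per GPU before each update.
import torch

paper_gpus, paper_clips_per_gpu = 32, 1   # setting reported in the paper
local_gpus, local_clips_per_gpu = 8, 1    # setup described in this issue
accum_steps = (paper_gpus * paper_clips_per_gpu) // (local_gpus * local_clips_per_gpu)  # = 4

model = torch.nn.Linear(16, 1)            # placeholder for the real model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

for step in range(8):                     # placeholder data/training loop
    x = torch.randn(local_clips_per_gpu, 16)
    y = torch.randn(local_clips_per_gpu, 1)
    loss = torch.nn.functional.mse_loss(model(x), y) / accum_steps  # rescale for accumulation
    loss.backward()
    if (step + 1) % accum_steps == 0:     # one optimizer update per 4 micro-batches
        optimizer.step()
        optimizer.zero_grad()

Note that accumulation only equalizes the effective batch size; it does not guarantee identical optimization dynamics to the multi-GPU run (e.g. batch-norm statistics still see the smaller per-step batch).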

lxa9867 (Author) commented Apr 5, 2022

Thank you for your information!

lxa9867 closed this as completed Apr 5, 2022

lxa9867 (Author) commented Apr 9, 2022

Did you freeze the text encoder during pretraining?

lxa9867 reopened this Apr 9, 2022

wjn922 (Owner) commented Apr 9, 2022

The text encoder is not frozen for pretraining, while for joint training it is frozen.
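
For clarity, here is a minimal sketch of what freezing the text encoder typically looks like, assuming a Hugging Face RoBERTa text encoder as described in the paper; the attribute names and the head module are illustrative placeholders, not this repo's actual code.

# Illustrative sketch only (not the repo's code): freeze a RoBERTa text encoder
# so that training updates everything except the language model.
import torch
from transformers import RobertaModel

text_encoder = RobertaModel.from_pretrained("roberta-base")
for p in text_encoder.parameters():
    p.requires_grad_(False)
text_encoder.eval()  # keep dropout disabled for the frozen encoder

# Placeholder container for the rest of the model; only trainable parameters
# are handed to the optimizer.
model = torch.nn.ModuleDict({
    "text_encoder": text_encoder,
    "head": torch.nn.Linear(768, 256),
})
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-4)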

lxa9867 (Author) commented Apr 9, 2022

I think I found the reason I cannot reproduce the results... The paper is somewhat ambiguous about this part. Thank you.

lxa9867 (Author) commented Apr 18, 2022

Hi,

May I ask why you freeze the text encoder during joint training? Since joint training doesn't require a pretraining stage, wouldn't that mean the text encoder is never trained? Thank you!

lxa9867 reopened this Apr 18, 2022

wjn922 (Owner) commented Apr 19, 2022

We experimented with whether to freeze the text encoder using the R50 backbone, and we found that freezing the text encoder gives a gain of 1+ points.

lxa9867 (Author) commented Apr 19, 2022

Thank you very much!
