Is EMA used in this work? #5

JacobYuan7 · 2022-01-06T07:35:30Z

Hello author, thanks for your great work. I have a question about the use of Exponential Moving Average (EMA) in this paper, and I hope you can give me some pointers. The paper does not seem to go into detail on this. As far as I know, MDETR uses EMA and evaluates with the EMA model. Is EMA used in this work as well? If it is, why should we evaluate with the EMA model rather than the original one?

mmaaz60 · 2022-01-06T09:46:32Z

Hi @JacobYuan7,

Thank you for your interest in this work. Similar to MDETR, MDef-DETR also uses EMA during training, and for evaluation the weights are loaded from the original model. You can try the EMA model for testing by loading the weights from checkpoint["model_ema"] instead of checkpoint["model"]; it should give almost the same results. Let me know if you have any questions.
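For reference, switching between the two sets of weights could look roughly like the sketch below. Only the keys checkpoint["model"] and checkpoint["model_ema"] come from this thread. The helper name `select_state_dict` and the toy checkpoint are hypothetical, and the PyTorch calls in the comments assume a standard dictionary-style checkpoint.

```python
def select_state_dict(checkpoint, use_ema=False):
    """Return the weights to load from a checkpoint dictionary.

    Hypothetical helper: falls back to the regular weights when no EMA
    copy is stored under "model_ema".
    """
    if use_ema and checkpoint.get("model_ema") is not None:
        return checkpoint["model_ema"]
    return checkpoint["model"]


# Toy stand-in for a real checkpoint (assumption, for illustration only).
# With PyTorch this would typically be:
#   checkpoint = torch.load(path, map_location="cpu")
#   model.load_state_dict(select_state_dict(checkpoint, use_ema=True))
checkpoint = {"model": {"w": 1.00}, "model_ema": {"w": 0.98}}

regular_weights = select_state_dict(checkpoint)
ema_weights = select_state_dict(checkpoint, use_ema=True)
```

Evaluating once with each set of weights is a simple way to check the "almost the same results" claim on your own data.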

JacobYuan7 · 2022-01-07T07:37:07Z

As I understand it, MDETR uses 'model_ema' to evaluate the model, as shown in:
https://github.com/ashkamath/mdetr/blob/bf09d98b0b41cd615185dcb0082299a5ba24c319/scripts/eval_lvis.py#L101
Correct me if I am wrong. Many thanks!

BTW, the training of the language model follows MDETR, right? That is, a warmup schedule, followed by a linear decrease back to zero for the rest of the training.

mmaaz60 · 2022-01-09T13:31:01Z

> As I understand it, MDETR uses 'model_ema' to evaluate the model, which is shown in: https://github.com/ashkamath/mdetr/blob/bf09d98b0b41cd615185dcb0082299a5ba24c319/scripts/eval_lvis.py#L101 Correct me if I am wrong, many thanks!

Hi, my apologies for the delayed reply. Yes, your understanding is correct. MDETR uses model_ema for evaluation during training and model for inference (hubconf.py). However, I think using model_ema for inference as well would be more appropriate.
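As background on why the two sets of weights differ: an EMA model is generally a smoothed running average of the training weights, which tends to fluctuate less from step to step. The sketch below shows the generic update rule only. The function name `ema_update`, the decay value, and the scalar "weight" are assumptions for illustration, not the settings used in MDETR or MDef-DETR.

```python
def ema_update(ema_params, params, decay=0.999):
    """Update the averaged weights in place: ema = decay * ema + (1 - decay) * param."""
    for name, value in params.items():
        ema_params[name] = decay * ema_params[name] + (1.0 - decay) * value
    return ema_params


# Toy example with a single scalar "weight" (assumption, for illustration).
params = {"w": 0.0}
ema_params = dict(params)  # the EMA copy starts as a copy of the model

for new_value in [1.0, 1.0, 1.0]:
    params["w"] = new_value  # pretend the optimizer updated the weight
    ema_update(ema_params, params, decay=0.5)

# The averaged weight lags behind the raw one: 0.5, then 0.75, then 0.875.
```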

> BTW, the training of the language model follows MDETR, right? With a warmup schedule and then decrease linearly back to zero for the rest of the training.

Yes, this is the case. Further, we are planning to release the training scripts by the end of this month. Stay tuned!
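Until the training scripts are out, the schedule described above can be pictured with the minimal sketch below: a linear warmup to a peak learning rate, then a linear decrease to zero. The function name `lr_at_step` and all numeric values are hypothetical and are not taken from the MDETR or MDef-DETR configuration.

```python
def lr_at_step(step, total_steps, warmup_steps, peak_lr):
    """Linear warmup to peak_lr, then linear decay to zero at total_steps.

    Assumes 0 <= warmup_steps < total_steps.
    """
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / (total_steps - warmup_steps))


# Toy schedule: 10 steps in total, 2 of them warmup (illustrative values).
schedule = [
    lr_at_step(s, total_steps=10, warmup_steps=2, peak_lr=1.0)
    for s in range(11)
]
```

In this toy run the rate rises from 0 to the peak over the first two steps and then falls linearly, reaching 0 at step 10.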

JacobYuan7 · 2022-01-30T07:15:50Z

Sure, I will! Thx so much for your kind response.

mmaaz60 closed this as completed Feb 3, 2022
