
Can you explain why YOLOS-Small has 30 million parameters while DeiT-S has 22 million parameters #3

Closed
gaopengcuhk opened this issue Jun 14, 2021 · 8 comments
Labels
good first issue Good for newcomers question Further information is requested

Comments

@gaopengcuhk

As the title suggests.

@gaopengcuhk gaopengcuhk added the question Further information is requested label Jun 14, 2021
@Yuxin-CV Yuxin-CV added the good first issue Good for newcomers label Jun 15, 2021
@Yuxin-CV
Member

Yuxin-CV commented Jun 15, 2021

Hi @gaopengcuhk, thanks for your interest in our work and good question!

For the small- and base-sized models, the added parameters mainly come from positional embeddings (PE): to align with the DETR settings, we initially added randomly initialized (512 / 16) x (864 / 16) PEs at every Transformer layer. But we later found that interpolating only the pre-trained first-layer PE to a larger size, i.e., (800 / 16) x (1344 / 16), without adding other PEs in the intermediate layers, strikes a better accuracy & parameter trade-off: 36.6 AP vs. 36.1 AP, and 24.6 M (22.1 M + 2.5 M 😄) vs. 30.7 M (22.1 M + 8.6 M 😭). The tiny-sized model adopts this configuration.
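For concreteness, here is a minimal PyTorch sketch of this kind of PE interpolation. It is only an illustration, not the YOLOS implementation; the helper name `interpolate_pos_embed`, the shapes, and the `num_extra_tokens` handling are assumptions.

```python
import torch
import torch.nn.functional as F

def interpolate_pos_embed(pos_embed, old_hw, new_hw, num_extra_tokens=1):
    """pos_embed: (1, num_extra_tokens + old_h * old_w, dim)."""
    old_h, old_w = old_hw
    new_h, new_w = new_hw
    extra = pos_embed[:, :num_extra_tokens]        # e.g. [CLS] (and [DET]) token PEs, kept as-is
    patch_pe = pos_embed[:, num_extra_tokens:]     # per-patch embeddings to be resized
    dim = patch_pe.shape[-1]

    # (1, old_h * old_w, dim) -> (1, dim, old_h, old_w) for 2-D interpolation
    patch_pe = patch_pe.reshape(1, old_h, old_w, dim).permute(0, 3, 1, 2)
    patch_pe = F.interpolate(patch_pe, size=(new_h, new_w),
                             mode="bicubic", align_corners=False)
    # back to (1, new_h * new_w, dim)
    patch_pe = patch_pe.permute(0, 2, 3, 1).reshape(1, new_h * new_w, dim)
    return torch.cat([extra, patch_pe], dim=1)

# Usage: stretch a 224x224 pre-training grid (14 x 14 patches of size 16)
# to an 800x1344 detection grid (50 x 84 patches).
pe = torch.randn(1, 1 + 14 * 14, 384)              # DeiT-S embedding dim = 384
pe_det = interpolate_pos_embed(pe, (14, 14), (50, 84))
print(pe_det.shape)                                # torch.Size([1, 4201, 384])
```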

We have added a detailed description in the Appendix and will submit it to arXiv soon (next week, hopefully). The pre-trained models will also be released soon, please stay tuned :)

This issue won't be closed until we update our manuscript on arxiv.

@gaopengcuhk
Author

Another question: why add the prediction head only on the last layer? Have you tried adding prediction heads to the last several layers, like DETR?

@Yuxin-CV
Member

Another question: why add the prediction head only on the last layer? Have you tried adding prediction heads to the last several layers, like DETR?

Thanks for your valuable question.
We tried this configuration in our early study, and it gave no improvement.

Our guess at the reason: for DETR, deep supervision works because the supervision is "deep enough", i.e., the decoders are stacked upon at least a 50- / 101-layer ResNet backbone and 6 Transformer encoder layers. YOLOS, with a much shallower network, cannot benefit from deep supervision.
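For illustration, a minimal sketch of the difference between supervising only the last layer's [DET] tokens (YOLOS-style) and applying the heads to every layer's output (DETR-style deep supervision). The module name `AuxHeads` and the token layout are hypothetical, not the YOLOS or DETR implementation.

```python
import torch
import torch.nn as nn

class AuxHeads(nn.Module):
    def __init__(self, dim=384, num_classes=91, num_det_tokens=100):
        super().__init__()
        self.class_head = nn.Linear(dim, num_classes + 1)            # +1 for "no object"
        self.box_head = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(),
                                      nn.Linear(dim, 4))             # (cx, cy, w, h)
        self.num_det_tokens = num_det_tokens

    def forward(self, layer_outputs, deep_supervision=False):
        # layer_outputs: list of (B, num_tokens, dim), one per Transformer layer,
        # with the [DET] tokens assumed to sit at the end of the sequence.
        if deep_supervision:
            # DETR-style: predictions (and auxiliary losses) at every layer
            dets = [x[:, -self.num_det_tokens:] for x in layer_outputs]
        else:
            # YOLOS default: predictions only from the last layer
            dets = [layer_outputs[-1][:, -self.num_det_tokens:]]
        return [(self.class_head(d), self.box_head(d).sigmoid()) for d in dets]

# Usage with dummy features from a 12-layer ViT:
feats = [torch.randn(2, 1 + 196 + 100, 384) for _ in range(12)]
heads = AuxHeads()
last_only = heads(feats)                          # 1 set of predictions (YOLOS)
deep_sup = heads(feats, deep_supervision=True)    # 12 sets (DETR-style)
```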

@gaopengcuhk
Author

Another question: it seems you add the position embedding to x at every layer, while in DeiT only the first layer adds the position embedding. Is this important in YOLOS?

@Yuxin-CV
Member

Another question: it seems you add the position embedding to x at every layer, while in DeiT only the first layer adds the position embedding. Is this important in YOLOS?

We actually answered this above in #3 (comment): YOLOS with only the first-layer PE added is better in terms of both AP and parameter efficiency :)

@gaopengcuhk
Author

Thank you very much for your reply.

@Yuxin-CV
Member

Yuxin-CV commented Jun 15, 2021

This issue won't be closed until we update our manuscript on arxiv.

@Yuxin-CV
Member

This issue won't be closed until we update our manuscript on arxiv.

We have updated our manuscript on arXiv, so I'm closing this issue. Let us know if you have further questions.
