Hi,
When you compute the FLOPs in Table 6 for baseline models such as ViLBERT, do you also include the FLOPs of the feature extraction models?
Yes. For object detection-based vision-and-language models, we calculated FLOPs by summing those of the object detection backbone, the object detection RCNN, NMS, and the modality interaction transformer.
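For anyone reproducing this kind of number, the accounting discussed in this thread (backbone + RCNN + NMS + modality interaction transformer) can be sketched roughly as below. This is only an illustration: the class name, field names, and the numeric values are hypothetical placeholders, not figures from the paper or the repository, and the per-component FLOPs would have to be measured separately with whatever profiler you use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectionPipelineFlops:
    """Per-component FLOPs for an object detection-based VLP model.

    All field names are hypothetical; each value is assumed to be
    measured separately for a single forward pass.
    """

    backbone: float                 # object detection backbone
    rcnn: float                     # object detection RCNN head
    nms: float                      # non-maximum suppression
    interaction_transformer: float  # modality interaction transformer

    def total(self) -> float:
        # Total cost is the plain sum of the four stages.
        return (
            self.backbone
            + self.rcnn
            + self.nms
            + self.interaction_transformer
        )


# Placeholder values for illustration only (NOT measured numbers).
example = DetectionPipelineFlops(
    backbone=1.0,
    rcnn=2.0,
    nms=0.5,
    interaction_transformer=4.0,
)
print(example.total())
```

Keeping the four terms separate makes it easy to see how much of the reported total comes from feature extraction rather than from the transformer itself.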