We add model interaction using cross-model attention in the Masked Video Encoder (MAE) and the Multimodal Video Encoder, which is based on UniFormerV2. The model with the largest number of parameters has not yet been made public. You can try it yourself.
Hi, where is the action recognition code for Table 1 in the paper ("Action recognition results on Kinetics & Something-Something")? Thanks.