I checked the file model_video_caption_mplug.py, but I could not find a universal layer module in the code. I do see that images are first fed into the visual encoder and then into the text encoder. Does this mean the universal layer is actually the text encoder?