Hello,
First of all, I would like to thank you for your excellent work on InternVideo-Next. I found the paper very insightful.
In the paper, you mention the model’s zero-shot video–text retrieval capability using MobileCLIP-B. Do you plan to release the text encoder that was used for this setup?
Thank you very much for your time and consideration.