Hi, thanks for the great work!
Out of curiosity, did you experiment with CLIP's text encoder in place of the pretrained language model?
I assume the CLIP text encoder only learns alignment with visual features, but I'm still curious whether it has any generation ability.
Thanks