Unofficial implementation for Sigmoid Loss for Language Image Pre-Training
Updated Sep 26, 2023 - Python
Bias-to-Text: Debiasing Unknown Visual Biases through Language Interpretation
VTC: Improving Video-Text Retrieval with User Comments
📍 Official PyTorch implementation of the paper "ProtoCLIP: Prototypical Contrastive Language Image Pretraining" (IEEE TNNLS)
Code for ACL 2023 Oral Paper: ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning
[NeurIPS 2023] Bootstrapping Vision-Language Learning with Decoupled Language Pre-training
Code and models for COSA: Concatenated Sample Pretrained Vision-Language Foundation Model
Multi-Aspect Vision Language Pretraining - CVPR2024
A codebase for flexible and efficient Image Text Representation Alignment
Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023)
Official repository for "CLIP model is an Efficient Continual Learner".
Set-level Guidance Attack: Boosting Adversarial Transferability of Vision-Language Pre-training Models. [ICCV 2023 Oral]
Demographic Bias of Vision-Language Foundation Models in Medical Imaging
SVL-Adapter: Self-Supervised Adapter for Vision-Language Pretrained Models
Recognize Any Regions
[CVPR 2023] The code for "Position-guided Text Prompt for Vision-Language Pre-training"
Official repository of the paper "VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding"
FLAIR: A Foundation LAnguage-Image model of the Retina for fundus image understanding.
PyTorch implementation of ICML 2023 paper "SegCLIP: Patch Aggregation with Learnable Centers for Open-Vocabulary Semantic Segmentation"