PyTorch code for the Findings of NAACL 2022 paper "Probing the Role of Positional Information in Vision-Language Models".
Updated Jul 20, 2022 - Python
PyTorch implementation of NeuralTwinsTalk, presented at IEEE HCCAI 2020.
[ICCV 2023 CLVL Workshop] Zero-Shot and Few-Shot Video Question Answering with Multi-Modal Prompts
[TIP 2022] Official code of paper “Video Question Answering with Prior Knowledge and Object-sensitive Learning”
Official code of the paper ORacle: Large Vision-Language Models for Knowledge-Guided Holistic OR Domain Modeling accepted at MICCAI 2024.
[NeurIPS 2023] Rewrite Caption Semantics: Bridging Semantic Gaps for Language-Supervised Semantic Segmentation
Mixed vision-language Attention Model that gets better by making mistakes
[NLPCC'23] ZeroGen: Zero-shot Multimodal Controllable Text Generation with Multiple Oracles PyTorch Implementation
VizWiz Challenge Term Project for Multi Modal Machine Learning @ CMU (11777)
Official PyTorch implementation and benchmark dataset for IGARSS 2024 ORAL paper: "Composed Image Retrieval for Remote Sensing"
The code for generating natural distribution shifts on image and text datasets.
This repository contains a spatial understanding test suite for vision-language models
[ICCV 2021] On the hidden treasure of dialog in video question answering
Python scripts to use for captioning images with VLMs
Unofficial implementation for Sigmoid Loss for Language Image Pre-Training
Vision-language model example code.
Authors' official PyTorch implementation of "ContraCLIP: Interpretable GAN generation driven by pairs of contrasting sentences".
Actor-agnostic Multi-label Action Recognition with Multi-modal Query [ICCVW '23]
[ICLR'24] Official code for "C-TPT: Calibrated Test-Time Prompt Tuning for Vision-Language Models via Text Feature Dispersion"
Vision Language Dataset Construction Library for Remote Sensing Domain