Vision-language code for solving the GQA (Visual Reasoning in the Real World) dataset.
Official PyTorch Implementation of RITC
[ICCV2021 & TPAMI2023] Vision-Language Transformer and Query Generation for Referring Segmentation
[ICCV 2021] On the hidden treasure of dialog in video question answering
Repository for the ECCV 2020 paper "Active Visual Information Gathering for Vision-Language Navigation"
💐Kaleido-BERT: Vision-Language Pre-training on Fashion Domain
PyTorch code for the NAACL 2022 Findings paper "Probing the Role of Positional Information in Vision-Language Models".
This repository contains a spatial understanding test suite for vision-language models
PyTorch code for "Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners"
Video as Conditional Graph Hierarchy for Multi-Granular Question Answering (AAAI'22, Oral)
PyTorch implementation of NeuralTwinsTalk, presented at IEEE HCCAI 2020.
Authors' official PyTorch implementation of "ContraCLIP: Interpretable GAN generation driven by pairs of contrasting sentences".
DramaQA Starter Code (2021)
Improving Visual Grounding with Visual-Linguistic Verification and Iterative Reasoning, CVPR 2022
MixGen: A New Multi-Modal Data Augmentation
PyTorch version of DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization (NAACL 2021)
PyTorch code for BagFormer: Better Cross-Modal Retrieval via bag-wise interaction
Code for the EMNLP 2021 Oral paper "Are Gender-Neutral Queries Really Gender-Neutral? Mitigating Gender Bias in Image Search" https://arxiv.org/abs/2109.05433
Code for "Learning the Best Pooling Strategy for Visual Semantic Embedding", CVPR 2021 (Oral)