PyTorch code for Finding in NAACL 2022 paper "Probing the Role of Positional Information in Vision-Language Models".
Source code and documentation for the LREC-COLING'24 paper "Sharing the Cost of Success: A Game for Evaluating and Learning Collaborative Multi-Agent Instruction Giving and Following Policies"
Training and inference code for a model that extracts license plate numbers
alt text for lazy people
A comparative study of two of the best-performing open-source Vision-Language Models: Google Gemini Vision and CogVLM
Code and models for the paper 'Exploring Multi-Modal Representations for Ambiguity Detection & Coreference Resolution in the SIMMC 2.0 Challenge' published at AAAI 2022 DSTC10 Workshop
An end-to-end multimodal framework incorporating explicit knowledge graphs and OOD-detection. (NeurIPS23)
VinVL+L: Enriching Visual Representation with Location Context in Visual Question Answering (VQA)
Vision-Controllable Natural Language Generation
PyTorch implementation of the paper: All For One: Multi-modal Multi-Task Learning
Multilingual vision and language research. Fork of MMF, Facebook AI Research's (FAIR) modular framework for vision & language research.
Code for the paper "Learning English with Peppa Pig" https://doi.org/10.48550/arXiv.2202.12917
Multimodal Learning - using CLIP (Internship Project)
Under review. [IROS 2024] PGA: Personalizing Grasping Agents with Single Human-Robot Interaction
An end-to-end vision and language model incorporating explicit knowledge graphs and OOD-detection.
Code for the paper "Rethinking Task Sampling for Few-shot Vision-Language Transfer Learning" (COLING 2022 workshop)
Probe Vision-Language Models
Adaptively fine-tuning transformer-based models for multiple domains and multiple tasks