Language Models as Knowledge Bases for Visual Word Sense Disambiguation (VWSD)

Install

git clone https://github.com/anastasiakrith/llm-for-vwsd.git
cd llm-for-vwsd

In the project folder, run the following commands:

$ virtualenv env to create a virtual environment
$ source env/bin/activate to activate the environment
$ pip install -r requirements.txt to install packages
Create a .env file with the environment variables. The project needs an OPENAI_API_KEY set to the API key of your OpenAI account, and optionally a DATASET_PATH set to the absolute path of the VWSD dataset.
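To check the .env file before running the scripts, something like the following sketch may help. It is not taken from the repository: the helper names load_env_file and check_settings are hypothetical, and it assumes the file holds plain NAME=value lines. The project itself may load the file differently (for example with a dotenv library).

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple NAME=value lines from a .env file into a dict.

    Hypothetical helper for illustration; assumes one assignment per line.
    """
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def check_settings(values):
    """Return (api_key, dataset_path); raise if the required key is missing."""
    api_key = values.get("OPENAI_API_KEY") or os.environ.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set in .env or the environment")
    # DATASET_PATH is optional, but should be absolute when present.
    dataset_path = values.get("DATASET_PATH") or os.environ.get("DATASET_PATH")
    if dataset_path is not None and not os.path.isabs(dataset_path):
        raise RuntimeError("DATASET_PATH should be an absolute path")
    return api_key, dataset_path


if __name__ == "__main__":
    api_key, dataset_path = check_settings(load_env_file())
    print("OPENAI_API_KEY found; DATASET_PATH =", dataset_path)
```

Running this from the project folder raises an error if the required key is missing, or if DATASET_PATH is set but is not an absolute path.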

python vl_retrieval_eval.py -llm "gpt-3" -vl "clip" -baseline -penalty

python qa_retrieval_eval.py -llm "gpt-3.5" -captioner "git" -strategy "greedy" -prompt "no_CoT" -zero_shot

The implementation relies on resources from the OpenAI API and Hugging Face Transformers.