This notebook contains the basic steps needed to replicate the results of the original paper using the provided checkpoints.

In [16]:
# clone the repository (forked from the original one)
!git clone https://github.com/cuevascarlos/CoVR.git

In [3]:
# some adjustments to keep paths consistent: copy the repository contents into the working directory
!cp -r CoVR/configs ./
!cp -r CoVR/tools ./
!cp -r CoVR/src ./
!cp -r CoVR/test.py ./
!cp -r CoVR/train.py ./
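
The copy commands above can also be written in Python, which makes the cell safe to re-run. The following is a minimal sketch: `sync_repo_files` is a hypothetical helper (it is not part of the CoVR repository), and it assumes Python 3.8 or later for the `dirs_exist_ok` argument.

```python
import shutil
from pathlib import Path


def sync_repo_files(repo_dir, dest_dir, names):
    """Copy the listed files and directories from repo_dir into dest_dir.

    Hypothetical helper: entries that do not exist in repo_dir are skipped,
    and existing destination directories are merged rather than rejected.
    Returns the names that were actually copied.
    """
    repo_dir, dest_dir = Path(repo_dir), Path(dest_dir)
    copied = []
    for name in names:
        src = repo_dir / name
        if not src.exists():
            continue  # nothing to copy for this entry
        dst = dest_dir / name
        if src.is_dir():
            # dirs_exist_ok lets the cell be re-run without an error
            shutil.copytree(src, dst, dirs_exist_ok=True)
        else:
            shutil.copy2(src, dst)
        copied.append(name)
    return copied
```

For this notebook the call would be `sync_repo_files("CoVR", ".", ["configs", "tools", "src", "test.py", "train.py"])`.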

In [5]:
# install requirements (if using colab)
!python -m pip install -r CoVR/requirements.txt

Collecting pytorch-lightning==2.4.0 (from -r CoVR/requirements.txt (line 1))
  Downloading pytorch_lightning-2.4.0-py3-none-any.whl.metadata (21 kB)
Collecting lightning-utilities==0.11.6 (from -r CoVR/requirements.txt (line 2))
  Downloading lightning_utilities-0.11.6-py3-none-any.whl.metadata (5.2 kB)
Collecting lightning==2.3.3 (from -r CoVR/requirements.txt (line 3))
  Downloading lightning-2.3.3-py3-none-any.whl.metadata (35 kB)
Collecting hydra-core==1.3.2 (from -r CoVR/requirements.txt (line 4))
  Downloading hydra_core-1.3.2-py3-none-any.whl.metadata (5.5 kB)
Collecting pandas==2.0.3 (from -r CoVR/requirements.txt (line 6))
  Downloading pandas-2.0.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (18 kB)
Collecting opencv-python-headless==4.5.5.64 (from -r CoVR/requirements.txt (line 7))
  Downloading opencv_python_headless-4.5.5.64-cp36-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (18 kB)
Collecting transformers==4.26.1 (from -r CoVR/requir

In [6]:
# download annotations if needed
!bash CoVR/tools/scripts/download_annotation_cirr.sh

Downloading CIRR Train annotations...
Downloading CIRR Val annotations...
Downloading CIRR Test annotations...


In [8]:
# load the drive (if using colab)
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive
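
Since the Drive mount only works on Colab, the data location can be chosen at run time. This is a minimal sketch: `in_colab` and `data_root` are hypothetical helpers, and the default `local_root` value is an assumed placeholder for wherever the CIRR data lives outside Colab.

```python
from pathlib import Path


def in_colab():
    """Return True when the google.colab package can be imported."""
    try:
        import google.colab  # noqa: F401  (only available on Colab)
        return True
    except ImportError:
        return False


def data_root(colab_root="drive/MyDrive/CIRR", local_root="data/CIRR"):
    """Pick the CIRR folder: the mounted Drive on Colab, a local path otherwise.

    local_root is an assumed placeholder, not a path defined by the repository.
    """
    return Path(colab_root if in_colab() else local_root)
```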


Compute BLIP 2 embeddings for the train, dev and test sets (uncomment the ones you need to compute)

In [None]:
# !python tools/embs/save_blip2_embs_imgs.py --image_dir drive/MyDrive/CIRR/images/train --save_dir drive/MyDrive/CIRR/blip2-embs-large/train

In [None]:
# !python tools/embs/save_blip2_embs_imgs.py --image_dir drive/MyDrive/CIRR/images/test1 --save_dir drive/MyDrive/CIRR/images/test1_blip2

In [3]:
# !python tools/embs/save_blip2_embs_imgs.py --image_dir drive/MyDrive/CIRR/images/dev --save_dir drive/MyDrive/CIRR/blip2-embs-large/dev

Compute BLIP embeddings for the train, dev and test sets (uncomment the ones you need to compute)

In [None]:
# !python tools/embs/save_blip_embs_imgs.py --image_dir drive/MyDrive/CIRR/images/train/ --save_dir drive/MyDrive/CIRR/blip-embs-large/train

In [17]:
# !python tools/embs/save_blip_embs_imgs.py --image_dir drive/MyDrive/CIRR/images/test1/ --save_dir drive/MyDrive/CIRR/blip-embs-large/test1

In [None]:
# !python tools/embs/save_blip_embs_imgs.py --image_dir drive/MyDrive/CIRR/images/dev/ --save_dir drive/MyDrive/CIRR/blip-embs-large/dev
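
To decide which of the commands above still need to be run, it can help to compare an image folder with its embedding folder. The sketch below assumes that the embedding scripts write one file per image with the same file stem; that layout is an assumption and should be checked against the actual output. `missing_embeddings` and `IMAGE_EXTS` are hypothetical names.

```python
from pathlib import Path

# Image extensions considered when scanning a split (assumed, adjust as needed)
IMAGE_EXTS = {".png", ".jpg", ".jpeg"}


def missing_embeddings(image_dir, emb_dir):
    """Return the sorted stems of images that have no same-named file in emb_dir.

    Assumes one embedding file per image, named after the image stem.
    """
    image_dir, emb_dir = Path(image_dir), Path(emb_dir)
    images = set()
    if image_dir.exists():
        images = {
            p.stem for p in image_dir.rglob("*") if p.suffix.lower() in IMAGE_EXTS
        }
    embs = set()
    if emb_dir.exists():
        embs = {p.stem for p in emb_dir.rglob("*") if p.is_file()}
    return sorted(images - embs)
```

For example, `len(missing_embeddings("drive/MyDrive/CIRR/images/dev", "drive/MyDrive/CIRR/blip-embs-large/dev"))` gives the number of dev images that still lack a BLIP embedding under that assumption.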

The next cell downloads the checkpoints needed to replicate the results from the paper. The numbers in parentheses are the choices to enter in the menus of the download script. So far we have reproduced the results for the following checkpoints:

- BLIP (1) + CIRR (WebVid-CoVR) (3) (i.e., BLIP pretrained on WebVid-CoVR and fine-tuned on CIRR)
- BLIP 2 (2) + WebVid-CoVR+CC-CoIR (2) (i.e., BLIP 2 pretrained on WebVid-CoVR+CC-CoIR)
- BLIP 2 (2) + CIRR (WebVid-CoVR+CC-CoIR) (5) (i.e., BLIP 2 pretrained on WebVid-CoVR+CC-CoIR and fine-tuned on CIRR)

In [22]:
!bash tools/scripts/download_pretrained_models.sh

Select the model to download:
1) BLIP 1 (CoVR)
2) BLIP 2 (CoVR-2)
Press Enter for default (BLIP 2)
Enter your choice (1/2): 2

Select the BLIP 2 model finetuned with dataset:
1) All
2) WebVid-CoVR + CC-CoIR
3) CC-CoIR
4) WebVid-CoVR
5) CIRR (WebVid-CoVR + CC-CoIR)
6) CIRR (CC-CoIR)
7) FashionIQ (WebVid-CoVR + CC-CoIR)
8) FashionIQ (CC-CoIR)
Press Enter for default (All)
Enter your choice (1/2/3/4/5/6/7/8): 5
Downloading ckpt_5.ckpt checkpoint...
Download successful.

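The download script is interactive, which is awkward when the notebook is run unattended. A possible workaround is to feed the menu choices through standard input. This is only a sketch: it assumes the script reads its prompts from stdin, which has not been verified here, and `menu_answers` and `download_checkpoint` are hypothetical helpers.

```python
import subprocess


def menu_answers(*choices):
    """Join menu choices into the text a user would type, one choice per line."""
    return "".join(f"{choice}\n" for choice in choices)


def download_checkpoint(script, *choices):
    """Run the download script and answer its prompts from stdin.

    Assumes the script reads its menu choices from standard input.
    Raises subprocess.CalledProcessError if the script exits with an error.
    """
    return subprocess.run(
        ["bash", script],
        input=menu_answers(*choices),
        text=True,
        check=True,
    )
```

The session shown above would correspond to `download_checkpoint("tools/scripts/download_pretrained_models.sh", 2, 5)`.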

In [20]:
# Run this cell to replicate the results for the first checkpoint listed above (BLIP fine-tuned on CIRR)
import os

os.environ["HYDRA_FULL_ERROR"] = '1'

!python test.py test=cirr data=cirr model=blip-large model/ckpt=cirr_ft-covr+gt trainer=gpu trainer.devices=1 machine=server machine.batch_size=256 trainer/logger=csv

In [21]:
# Run this cell to replicate the results for the second checkpoint listed above (BLIP 2 pretrained on WebVid-CoVR+CC-CoIR)
import os

os.environ["HYDRA_FULL_ERROR"] = '1'

!python test.py test=cirr data=cirr model=blip2-coco model/ckpt=blip2-l-coco_coir+covr trainer=gpu trainer.devices=1 machine=server machine.batch_size=128 trainer/logger=csv
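
The evaluation commands in this notebook differ only in the model, the checkpoint and the batch size, so they can be generated from a small table. The sketch below reuses exactly the overrides that appear in the cells above; `build_test_command` and `RUNS` are hypothetical names, and `shlex.join` requires Python 3.8 or later.

```python
import shlex


def build_test_command(model, ckpt, batch_size, devices=1):
    """Build the argument list for test.py with the overrides used in this notebook."""
    return [
        "python", "test.py",
        "test=cirr", "data=cirr",
        f"model={model}", f"model/ckpt={ckpt}",
        "trainer=gpu", f"trainer.devices={devices}",
        "machine=server", f"machine.batch_size={batch_size}",
        "trainer/logger=csv",
    ]


# Settings copied from the two cells above (the dictionary keys are hypothetical labels)
RUNS = {
    "blip_cirr": dict(model="blip-large", ckpt="cirr_ft-covr+gt", batch_size=256),
    "blip2": dict(model="blip2-coco", ckpt="blip2-l-coco_coir+covr", batch_size=128),
}

# Render one command as a single shell string
command = shlex.join(build_test_command(**RUNS["blip_cirr"]))
```

The resulting string can be run from a notebook cell with `!{command}`, or the argument list can be passed to `subprocess.run`.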

In [35]:
# save the test outputs in the drive
!cp -r outputs/test drive/MyDrive/CIRR/

In [13]:
# Run this cell to replicate the results for the third checkpoint listed above (BLIP 2 fine-tuned on CIRR)
# For now the checkpoint folders must be renamed by hand for this command to run properly
# (TODO: check whether the checkpoint can be selected directly through the command-line options)
import os

os.environ["HYDRA_FULL_ERROR"] = '1'

!python test.py test=cirr data=cirr model=blip2-coco model/ckpt=blip2-l-coco_coir+covr trainer=gpu trainer.devices=1 machine=server machine.batch_size=128 trainer/logger=csv

2024-12-26 21:38:52.835128: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-12-26 21:38:52.852099: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-12-26 21:38:52.874190: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-12-26 21:38:52.880934: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-12-26 21:38:52.896763: I tensorflow/core/platform/cpu_feature_guar

In [14]:
# save the test outputs in the drive
!cp -r outputs/test/ drive/MyDrive/CIRR/
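
Both copy cells in this notebook write to the same destination folder, so the outputs of different checkpoints can end up mixed together. One way to keep them apart is to copy each run into its own folder. This is a minimal sketch: `archive_outputs` and the `run_name` labels are hypothetical.

```python
import shutil
from pathlib import Path


def archive_outputs(src, dest_root, run_name):
    """Copy src into dest_root/<src name>_<run_name> and return the new path.

    Refuses to overwrite an existing folder so that earlier results are kept.
    """
    src = Path(src)
    dest = Path(dest_root) / f"{src.name}_{run_name}"
    if dest.exists():
        raise FileExistsError(f"{dest} already exists; choose another run_name")
    shutil.copytree(src, dest)
    return dest
```

For instance, `archive_outputs("outputs/test", "drive/MyDrive/CIRR", "blip2_cirr_ft")` would create `drive/MyDrive/CIRR/test_blip2_cirr_ft`.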