
XudongLinthu/upgradable-multimodal-intelligence


Upgradable Multimodal Intelligence

Code for the CVPR 2023 paper "Towards Fast Adaptation of Pretrained Contrastive Models for Multi-channel Video-Language Retrieval"

This sample implementation is largely based on the codebase of Just Ask, so please refer to it for setting up the environment. We will release the full codebase soon after CVPR.

After setup, you can reproduce the Table 1 results for the Text+Text variant using the files under How2QA and the following command:

```
python main_videoqa.py --checkpoint_dir=LOCATION_OF_EXPERIMENT --dataset=how2qa --lr=0.00005 --mlm_prob 0. --qmax_words 120 --baseline to --lm all-mpnet-base-v2 --epochs 30
```
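To illustrate the idea behind the Text+Text variant, here is a minimal sketch: the video's text channel (e.g., subtitles) is concatenated with the question, both sides are embedded with a text encoder, and answer candidates are ranked by cosine similarity. The toy bag-of-words `embed` function below is a hypothetical stand-in for a pretrained sentence encoder such as all-mpnet-base-v2; the function names and vocabulary construction are assumptions for illustration, not the repository's actual API.

```python
import numpy as np

def build_vocab(texts):
    # Collect a deterministic token-to-index mapping from all texts.
    tokens = sorted({tok for text in texts for tok in text.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}

def embed(text, vocab):
    # Toy bag-of-words embedding; a real setup would call a pretrained
    # contrastive text model (e.g., all-mpnet-base-v2) instead.
    vec = np.zeros(len(vocab))
    for tok in text.lower().split():
        if tok in vocab:
            vec[vocab[tok]] += 1.0
    return vec

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def rank_candidates(question, subtitles, candidates, vocab):
    # Text+Text: fuse the question with the video's subtitle channel,
    # then score each answer candidate by cosine similarity.
    query = embed(question + " " + subtitles, vocab)
    scores = [cosine(query, embed(c, vocab)) for c in candidates]
    return int(np.argmax(scores)), scores
```

In the actual pipeline, the strength of the pretrained contrastive model does the heavy lifting: because both channels are rendered as text, no new cross-modal alignment has to be learned, which is what makes the adaptation fast.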
