Skip to content

Research code for CVPR 2022 paper "SwinBERT: End-to-End Transformers with Sparse Attention for Video Captioning"

License

Notifications You must be signed in to change notification settings

coranholmes/SwinBERT

 
 

Repository files navigation

Dense caption generation using SwinBERT

The code provided here was used in the TEVAD paper to generate text features. Please note that these codes are not actively maintained and should be used at your own risk. For instructions on setting up the environment, please refer to the README_orig.md file.

Requirements

pre-requisites:

sudo apt-get install libopenmpi-dev
pip install -r requirements.txt

To install apex:

git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir ./

Download

Refer to the Download section in the original README_orig.md. For pretrained model, I am currently using VATEX only.

Dense caption generation

Take the ucf-crime dataset as an example, set the paths in the below command and run accordingly

python ./src/tasks/dense_caption_mass.py \
--resume_checkpoint path/to/SwinBERT/models/table1/vatex/best-checkpoint/model.bin \
--eval_model_dir path/to/SwinBERT/models/table1/vatex/best-checkpoint/ \
--dataset_path path/to/SwinBERT/datasets/Crime/data/ \
--caption_file path/to/SwinBERT/datasets/Crime/RTFM_train_caption/vatex_all_captions.txt \
--file_type video \
--file_format mp4 \
--do_lower_case \
--dense_caption \
--do_test

Generate sentence embeddings based on captions

Run

python src/tasks/generate_caption_se.py --dataset ucf \
--caption_path path/to/SwinBERT/datasets/Crime/RTFM_train_caption/vatex_all_captions.txt \
--output_path path/to/SwinBERT/datasets/Crime/RTFM_train_caption/sent_emb_n/

About

Research code for CVPR 2022 paper "SwinBERT: End-to-End Transformers with Sparse Attention for Video Captioning"

Resources

License

Code of conduct

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 99.8%
  • Shell 0.2%