R3-Transformer

This is the official code release for R3-Transformer, proposed in the paper "Neuro-Symbolic Representations for Video Captioning: A Case for Leveraging Inductive Biases for Vision and Language".

Installation

Option (I)

All dependencies are included in our Docker image. First install the latest version of Docker, then pull the image:

docker pull hassanhub/vid_cap:latest

Then run the container:

docker run --gpus all --name r3_container -it -v /home/

Note: This image already includes CUDA-related drivers and dependencies.
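
Once inside the container, a quick way to confirm that TensorFlow can actually see the GPUs is a check like the one below. This is a minimal sketch and not part of the repository: the helper name visible_gpus is hypothetical, and it assumes a TensorFlow 2.x install, where tf.config.list_physical_devices is available.

```python
from typing import List, Optional


def visible_gpus() -> Optional[List[str]]:
    """Return the names of GPUs visible to TensorFlow.

    Returns None when TensorFlow is not importable, so the check can also
    be run outside the container without crashing.
    """
    try:
        import tensorflow as tf  # imported lazily on purpose
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = visible_gpus()
    if gpus is None:
        print("TensorFlow is not installed in this environment.")
    elif not gpus:
        print("TensorFlow is installed, but no GPU is visible.")
    else:
        print("Visible GPUs:", ", ".join(gpus))
```

If the list comes back empty inside the container, the --gpus flag of docker run and the host driver are the first things to check.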

Option (II)

Alternatively, you can create your own environment and make sure the following dependencies are installed:

Python 3.7/3.8
TensorFlow 2.3
CUDA 10.1
NVIDIA driver v440.100
cuDNN 7.6.5
opencv-python
h5py
transformers
matplotlib
scikit-image
nvidia-ml-py3
decord
pandas
tensorpack.dataflow
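
When building your own environment, it can help to check which of the Python packages above are importable before launching training. The following is a minimal sketch under stated assumptions, not a script shipped with this repository: the names REQUIRED and missing_packages are hypothetical, the mapping covers only packages whose import names are well known (for example, opencv-python is imported as cv2, scikit-image as skimage, and nvidia-ml-py3 as pynvml), and it does not verify versions or the CUDA/cuDNN installation.

```python
import importlib.util
from typing import Dict, List

# pip package name -> import name (the two differ for a few packages).
REQUIRED: Dict[str, str] = {
    "tensorflow": "tensorflow",
    "opencv-python": "cv2",
    "h5py": "h5py",
    "transformers": "transformers",
    "matplotlib": "matplotlib",
    "scikit-image": "skimage",
    "nvidia-ml-py3": "pynvml",
    "decord": "decord",
    "pandas": "pandas",
}


def missing_packages(required: Dict[str, str]) -> List[str]:
    """Return the pip names of packages whose module cannot be found."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

The check only looks up each module without importing it, so it runs quickly even when heavy packages such as TensorFlow are installed.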

Data Preparation

To speed up data infeed, we use a multi-chunk hdf5 format. There are two options for preparing data for training and evaluation.

Option (I)

Download features pre-extracted with SlowFast-50-8x8 (pre-trained on Kinetics-400) from this link:

Parts 0-10 (coming soon...)

Option (II)

Alternatively, you can follow these steps to extract a customized version of features using your own visual backbone:

Download YouCook II
Download ActivityNet Captions
Pre-process raw video files using this script
Extract visual features with your own visual backbone, or with our pre-trained SlowFast-50-8x8, using this script
Store features and captions in a multi-chunk hdf5 format using this script
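
To illustrate the last step, here is a minimal sketch of how features and captions might be split across several hdf5 chunk files with h5py. It is not the repository's script, and the file layout is an assumption made purely for illustration: the helper names chunk_ranges and write_chunks, the "_part_N.h5" naming, the per-video group, the "features" dataset, and the "caption" attribute are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def chunk_ranges(num_items: int, chunk_size: int) -> List[Tuple[int, int]]:
    """Split num_items into consecutive [start, end) ranges of at most chunk_size."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (start, min(start + chunk_size, num_items))
        for start in range(0, num_items, chunk_size)
    ]


def write_chunks(
    features: Dict[str, Sequence],
    captions: Dict[str, str],
    chunk_size: int,
    prefix: str,
) -> List[str]:
    """Write one hdf5 file per chunk and return the file paths.

    Assumed (hypothetical) layout: one group per video id, holding a
    float32 "features" dataset and a "caption" string attribute.
    """
    import h5py  # imported lazily so chunk_ranges works without h5py
    import numpy as np

    video_ids = sorted(features)
    paths = []
    for part, (start, end) in enumerate(chunk_ranges(len(video_ids), chunk_size)):
        path = f"{prefix}_part_{part}.h5"
        with h5py.File(path, "w") as f:
            for video_id in video_ids[start:end]:
                group = f.create_group(video_id)
                group.create_dataset(
                    "features", data=np.asarray(features[video_id], dtype="float32")
                )
                group.attrs["caption"] = captions[video_id]
        paths.append(path)
    return paths
```

Splitting the data this way keeps each file small enough to be opened and read independently, which is what allows several chunks to be loaded in parallel during training.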

