CVPR-code-release

Environment Setup

All the code has been run and tested on:

Python 2.7.15 (coco-caption requires 2.7)
PyTorch 1.0.0
CUDA 9.0
TITAN X/Xp and GTX 1080Ti GPUs

First clone the repository:

git clone https://github.com/shenkev/Caption-Images-through-a-Lifetime-by-Asking-Questions.git

Go into the downloaded code directory and add the project to PYTHONPATH:

cd <path_to_downloaded_directory>
export PYTHONPATH=$PWD
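If imports fail later on, it can help to confirm that the export above took effect in the current shell. The following is a minimal sketch, not part of the repository; the helper name project_on_pythonpath is hypothetical, and it is written to run on both Python 2.7 and Python 3.

```python
import os


def project_on_pythonpath(project_root, pythonpath=None):
    """Return True if project_root is one of the PYTHONPATH entries.

    Hypothetical helper for illustration; it is not shipped with the code.
    """
    if pythonpath is None:
        pythonpath = os.environ.get("PYTHONPATH", "")
    root = os.path.realpath(project_root)
    # PYTHONPATH is a list of directories joined by os.pathsep (":" on Linux).
    entries = [p for p in pythonpath.split(os.pathsep) if p]
    return any(os.path.realpath(p) == root for p in entries)
```

Calling project_on_pythonpath(os.getcwd()) from the repository root should return True after the export.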

1. Python dependencies and Stanford NLP

chmod +x setup.sh
./setup.sh

This will:

Install python dependencies
Download Stanford NLP package for parsing part-of-speech
Download coco-caption
Download pyciderevalcap

2. Download images and preprocess them

Download the images from this link. Both the 2014 training images and the 2014 validation images are required.
Put train2014/ and val2014/ in a directory of your choice, denoted $IMAGE_ROOT.
Download the pretrained ResNet model from here and place it in Utils/preprocess/checkpoint.
Preprocess the images by running:

python Utils/preprocess/preprocess_imgs.py --input_json Data/annotation/dataset_coco.json --output_dir $IMAGE_ROOT/features --images_root $IMAGE_ROOT

Warning: the preprocessing script will fail with the default MSCOCO data because one of its images is corrupted. See this issue for the fix, which involves manually replacing one image in the dataset.
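Before starting the long preprocessing run, a quick scan of $IMAGE_ROOT can flag obviously broken files. The sketch below is an assumption-laden heuristic and is not the fix described in the linked issue: it only checks that each .jpg file begins with the JPEG start-of-image marker (bytes FF D8) and that the two expected subdirectories exist. The function names are hypothetical, and a file can pass this check and still fail to decode.

```python
import os


def looks_like_jpeg(path):
    """Heuristic only: a JPEG file starts with the SOI marker bytes FF D8."""
    with open(path, "rb") as f:
        return f.read(2) == b"\xff\xd8"


def find_suspect_images(image_root, subdirs=("train2014", "val2014")):
    """Return paths of missing subdirectories and of .jpg files lacking a JPEG header."""
    suspects = []
    for sub in subdirs:
        folder = os.path.join(image_root, sub)
        if not os.path.isdir(folder):
            suspects.append(folder)  # expected directory is missing
            continue
        for name in sorted(os.listdir(folder)):
            if name.lower().endswith((".jpg", ".jpeg")):
                path = os.path.join(folder, name)
                if not looks_like_jpeg(path):
                    suspects.append(path)
    return suspects
```

An empty list from find_suspect_images(image_root) means the directory layout is as expected and every file at least has a JPEG header.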

3. Download and preprocess training data

Download training data here
Unzip it into Data/annotation
Precompute indexes for CIDEr

python Utils/preprocess/preprocess_cider.py --data_file Data/annotation/cap_train.p --output_file Data/annotation/coco-words

Prepare lifelong learning data splits

python Utils/preprocess/preprocess_llsplits.py --data_file Data/annotation/cap_train.p --output_file Data/annotation/train3_split --warmup 3 --num_splits 4 --num_caps 2

You can adjust the chunk sizes and the number of chunks using the --warmup and --num_splits parameters.
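To give a feel for what a warmup chunk followed by several lifelong chunks looks like, here is an illustrative sketch. It does not reproduce preprocess_llsplits.py: the function make_splits, its warmup_fraction argument, and the near-equal chunking rule are all assumptions made for illustration, and how the script interprets its own --warmup value is not documented here.

```python
def make_splits(items, warmup_fraction, num_splits):
    """Split items into one warmup chunk plus num_splits chunks of near-equal size.

    Illustrative only; the repository's script may partition the data differently.
    """
    n_warm = int(round(len(items) * warmup_fraction))
    warmup, rest = items[:n_warm], items[n_warm:]
    # Spread any remainder over the first chunks so sizes differ by at most one.
    base, extra = divmod(len(rest), num_splits)
    chunks, start = [], 0
    for i in range(num_splits):
        size = base + (1 if i < extra else 0)
        chunks.append(rest[start:start + size])
        start += size
    return warmup, chunks
```

With this rule, a larger warmup fraction leaves less data for the lifelong phase, and a larger num_splits yields smaller chunks.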

Training

You can either download the trained caption, question generator, and VQA modules or train them yourself.

1. Download pretrained modules

Download the model checkpoints for the caption, question generator, and VQA modules here.
Place them in Data/model_checkpoints.
The captioning module was trained using 10% warmup data.

1. Training modules

Train caption module
In Experiments/caption3.json, change exp_dir to the working directory and img_dir to $IMAGE_ROOT.

python Scripts/train_caption3.py --experiment Experiments/caption3.json
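The config edits in this section (exp_dir, img_dir, and later the checkpoint paths) can be made by hand or scripted. The following is a minimal sketch that assumes these settings are top-level keys in the experiment JSON file, which this README does not confirm; update_experiment is a hypothetical helper, not part of the repository.

```python
import json
from collections import OrderedDict


def update_experiment(path, **overrides):
    """Load a JSON experiment file, overwrite the given top-level keys, and save it.

    Assumes the keys being changed live at the top level of the file.
    """
    with open(path) as f:
        # OrderedDict keeps the original key order when the file is rewritten.
        config = json.load(f, object_pairs_hook=OrderedDict)
    config.update(overrides)
    with open(path, "w") as f:
        json.dump(config, f, indent=4)
    return config
```

For example, update_experiment(path_to_experiment_json, exp_dir="/path/to/workdir", img_dir="/path/to/images") rewrites only those two entries, where path_to_experiment_json is the file passed to --experiment. If the keys turn out to be nested, edit the file manually instead.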

Train VQA module
In Experiments/vqa.json change exp_dir to the working directory, img_dir to $IMAGE_ROOT

python Scripts/train_vqa.py --experiment Experiments/vqa.json

Train question generator module
In Experiments/question3.json, change exp_dir to the working directory, img_dir to $IMAGE_ROOT, vqa_path to the VQA model checkpoint, and cap_path to the caption model checkpoint.

python Scripts/train_quegen.py --experiment Experiments/question3.json

2. Lifelong training

In Experiments/lifelong3.json, change exp_dir to the working directory, img_dir to $IMAGE_ROOT, vqa_path to the VQA model checkpoint, cap_path to the caption model checkpoint, and quegen_path to the question generator model checkpoint.
You can experiment with the parameters H, lamda, and k.

python Scripts/train_lifelong.py --experiment Experiments/lifelong3.json
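Lifelong training depends on three checkpoints and the image directory, so a wrong path is an easy way to lose time. The sketch below checks that the paths named in this section exist before launching the run. It assumes img_dir, vqa_path, cap_path, and quegen_path are top-level keys of the experiment JSON; missing_paths is a hypothetical helper, not part of the repository.

```python
import os


def missing_paths(config, keys=("img_dir", "vqa_path", "cap_path", "quegen_path")):
    """Return the keys whose value is unset or does not exist on disk.

    config is the experiment file loaded as a dict, e.g. with json.load.
    """
    missing = []
    for key in keys:
        value = config.get(key)
        if not value or not os.path.exists(value):
            missing.append(key)
    return missing
```

An empty list means every listed path resolves; any key returned should be corrected in Experiments/lifelong3.json before training.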

Track training

cd Results/lifelong
tensorboard --logdir tensorboard/

Visualize qualitative results

cd Results/lifelong/lifelong3
