Image caption with semantic attention

Note that this repository is largely borrowed from neuraltalk2; hats off to Karpathy for the great job he has done. The model implemented here is from "Image Captioning with Semantic Attention", Quanzeng You et al., CVPR 2016.

without regularization on attention weights

Current results by beam size:

beam_size Bleu_1 Bleu_2 Bleu_3 Bleu_4 METEOR CIDEr
2 0.884 0.726 0.58 0.46 0.308 1.214
3 0.891 0.739 0.597 0.479 0.311 1.239
4 0.891 0.742 0.601 0.484 0.312 1.244
5 0.892 0.743 0.603 0.488 0.313 1.249
7 0.893 0.744 0.605 0.489 0.313 1.25
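
A sweep like the one in the table can be scripted. The sketch below only prints the commands rather than running them. It assumes that eval.lua in this repository keeps the `-model` and `-beam_size` options of neuraltalk2's eval script, which is not confirmed here. The function name `beam_sweep_cmds` and the file name `checkpoint.t7` are hypothetical.

```shell
# Print one eval command per beam size used in the table; nothing is executed.
# ASSUMPTION: -model and -beam_size carry over from neuraltalk2's eval.lua.
beam_sweep_cmds() {
  model="$1"   # path to a trained checkpoint (hypothetical name below)
  for beam in 2 3 4 5 7; do
    echo "th eval.lua -model $model -beam_size $beam"
  done
}

beam_sweep_cmds checkpoint.t7
```

Once the printed commands look right for your setup, they can be piped to `sh` to run them one after another.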

with regularization on attention weights

Current results (to be updated). L1 loss on the output attention weights (does not seem to improve results much):

beam_size Bleu_1 Bleu_2 Bleu_3 Bleu_4 METEOR CIDEr
7 0.898 0.751 0.614 0.498 0.315 1.26

Attention Weights Criterion applied to the attention weights (note: without fine-tuning the CNN part):

Regularized attention model:

beam_size Bleu_1 Bleu_2 Bleu_3 Bleu_4 METEOR CIDEr
7 0.905 0.759 0.622 0.506 0.321 1.3
  • (I may add my own notes later. The text below comes from the neuraltalk2 README and should be removed in the near future.)


For evaluation only

This code is written in Lua and requires Torch. If you're on Ubuntu, installing Torch in your home directory may look something like:

$ curl -s | bash
$ git clone ~/torch --recursive
$ cd ~/torch; 
$ ./      # and enter "yes" at the end to modify your bashrc
$ source ~/.bashrc

See the Torch installation documentation for more details. After Torch is installed we need to get a few more packages using LuaRocks (which already came with the Torch install). In particular:

$ luarocks install nn
$ luarocks install nngraph 
$ luarocks install image 

We're also going to need the cjson library so that we can load/save json files. Follow their download link and then look under their section 2.4 for easy luarocks install.

If you'd like to run on an NVIDIA GPU using CUDA (which you really, really want to if you're training a model, since we're using a VGGNet), you'll of course need a GPU, and you will have to install the CUDA Toolkit. Then get the cutorch and cunn packages:

$ luarocks install cutorch
$ luarocks install cunn

If you'd like to use the cudnn backend (the pretrained checkpoint does), you also have to install cudnn. First follow the link to the NVIDIA website, register with them, and download the cudnn library. Then make sure you adjust your LD_LIBRARY_PATH to point to the lib64 folder that contains the library. Then git clone the cudnn.torch repo, cd inside, and run luarocks make cudnn-scm-1.rockspec to build the Torch bindings.
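
As a minimal sketch of the LD_LIBRARY_PATH adjustment: the directory below is a hypothetical placeholder, not a path taken from this README, so replace it with the lib64 folder where you actually unpacked cudnn.

```shell
# ASSUMPTION: hypothetical location; point this at your own cudnn lib64 folder.
CUDNN_LIB_DIR="${CUDNN_LIB_DIR:-$HOME/cuda/lib64}"

# Prepend the folder. The ${VAR:+...} form avoids a stray trailing colon
# when LD_LIBRARY_PATH was previously empty or unset.
export LD_LIBRARY_PATH="${CUDNN_LIB_DIR}${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
```

Adding the same two assignments to your ~/.bashrc makes the setting persist across shells.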

For training

If you'd like to train your models you will need loadcaffe, since we are using the VGGNet. First, make sure you follow their instructions to install protobuf and everything else (e.g. sudo apt-get install libprotobuf-dev protobuf-compiler), and then install via luarocks:

luarocks install loadcaffe

Finally, you will also need to install torch-hdf5, and h5py, since we will be using hdf5 files to store the preprocessed data.

Phew! Quite a few dependencies, sorry no easy way around it :\
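
With this many dependencies, a quick check of what is still missing can save time. The sketch below uses `luarocks show`, which fails for rocks that are not installed. It covers only the rock names given above; the rock names for cjson, cudnn, and torch-hdf5 are not stated in this README, so they are left out. The function name `check_rocks` is hypothetical.

```shell
# Print the rocks from the argument list that luarocks does not report as installed.
check_rocks() {
  missing=""
  for rock in "$@"; do
    # `luarocks show` exits non-zero when the rock is not installed
    # (or when luarocks itself is not on the PATH).
    if ! luarocks show "$rock" >/dev/null 2>&1; then
      missing="$missing $rock"
    fi
  done
  echo "${missing# }"
}

echo "missing rocks: $(check_rocks nn nngraph image cutorch cunn loadcaffe)"
```

An empty list after "missing rocks:" means every rock named in the call is installed.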

I'd like to distribute my GPU trained checkpoints for CPU

Use the script convert_checkpoint_gpu_to_cpu.lua to convert your GPU checkpoints to be usable on CPU. See inline documentation for why this separate script is needed. For example:

th convert_checkpoint_gpu_to_cpu.lua gpu_checkpoint.t7

This writes the file gpu_checkpoint.t7_cpu.t7, which you can now run with -gpuid -1 in the eval script.
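
The two steps can be chained, as in the rough sketch below. The `_cpu.t7` suffix follows the example above. The `-model` option is an assumption carried over from neuraltalk2's eval script, and the helper name `cpu_checkpoint_name` is hypothetical. The commands are printed rather than executed so the sketch is safe to run anywhere.

```shell
# Derive the name of the converted file: the script appends "_cpu.t7".
cpu_checkpoint_name() {
  echo "${1}_cpu.t7"
}

gpu_ckpt="gpu_checkpoint.t7"
cpu_ckpt="$(cpu_checkpoint_name "$gpu_ckpt")"

# Print the commands instead of running them.
# ASSUMPTION: eval.lua accepts -model, as neuraltalk2's eval script does.
echo "th convert_checkpoint_gpu_to_cpu.lua $gpu_ckpt"
echo "th eval.lua -model $cpu_ckpt -gpuid -1"
```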


BSD License.


Parts of this code were written in collaboration with my labmate Justin Johnson.

I'm very grateful for NVIDIA's support in providing GPUs that made this work possible.

I'm also very grateful to the maintainers of Torch for maintaining a wonderful deep learning library.
