Deprecated: I refactored ->
Latest commit cebe2ae by @apple2373 on Dec 19, 2016: out of data notification

I no longer maintain this repository. This implementation is not very clean, and it is hard to use if you want to train on your own data. I re-implemented it from scratch. The new one is much faster, more accurate, and cleaner. It can even generate Chinese captions. Please see the better implementation.

image caption generation by chainer

This code tries to reproduce the image captioning model that Google presented at CVPR 2015: Show and Tell: A Neural Image Caption Generator.

The training data is MSCOCO. I used GoogleNet to extract image features in advance (the images were preprocessed before training), and then trained a language model to generate captions from those features.

I made a pre-trained model available. The model achieves a CIDEr score of 0.66 on the MSCOCO validation dataset. To get a better score, using beam search is the first step (it was not implemented at the time of this evaluation). I also think the CNN has to be fine-tuned.
Update: I implemented beam search. Check the usage below.

More information, including some sample captions, is in my blog post.


Requirements: chainer 1.6 and some additional packages.
Warning: be sure to use chainer 1.6, not the latest version. Other versions are not guaranteed to work.
If you are new, I suggest installing Anaconda and then installing chainer. You can watch the video below.

I have a problem preparing the environment

I prepared a video that shows how to prepare the environment and generate captions on Ubuntu. I used a virtual machine with a fresh install of Ubuntu 14.04. If you follow the steps in the video, you can generate captions. The process is almost the same on Mac. Windows is not supported because I cannot use it (actually, chainer does not officially support Windows). Alternatively, these commands might help:

#get and install anaconda. you might want to check the latest link.
bash -b
echo 'export PATH=$HOME/anaconda/bin:$PATH' >> .bashrc
echo 'export PYTHONPATH=$HOME/anaconda/lib/python2.7/site-packages:$PYTHONPATH' >> .bashrc
source .bashrc
conda update conda -y
# install chainer 
pip install chainer==1.6
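
Because the code is only expected to work with chainer 1.6, it can help to confirm which version actually ended up in the environment. The following is a minimal sketch, not part of this repository: `is_supported_chainer` is a hypothetical helper, and it assumes the installed package exposes `chainer.__version__`.

```python
def is_supported_chainer(version, required="1.6"):
    """Return True if `version` is `required` or a patch release of it.

    Components are compared one by one, so "1.16.0" does not match "1.6".
    """
    parts = version.split(".")
    req = required.split(".")
    return parts[:len(req)] == req


try:
    import chainer  # only importable if the install step succeeded
    installed = chainer.__version__  # assumption: the attribute is available
except ImportError:
    installed = None

if installed is None:
    print("chainer is not installed")
elif is_supported_chainer(installed):
    print("chainer %s looks compatible" % installed)
else:
    print("chainer %s found, but this code expects 1.6.x" % installed)
```

Run it with the Anaconda Python that was put on `PATH` above, so that the check covers the interpreter the scripts will use.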

I just want to generate captions!

OK. First, you need to download the models and other preprocessed files. Then you can generate captions.

Google Drive suddenly shut down its hosting service, and the file download no longer works.

I don't have time to upload the files somewhere else, but all of them are here:

cd codes
python -i ../images/test_image.jpg

This generates a caption for ../images/test_image.jpg. If you want to use your own image, point the -i option at the image that you want to caption.

Once you have set up the environment, you can use the code as a module. Check the IPython notebooks, which include beam search. English:

Also, you can try beam search as:

cd codes
python -b 3 -i ../images/test_image.jpg

The -b option sets the beam size. The default is 3.
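
For readers unfamiliar with the technique, here is a generic sketch of beam search over a toy next-token model. It is not the code in this repository: `beam_search`, `toy_model`, and the `<s>`/`</s>` tokens are illustrative names, and a real caption model would compute the next-token probabilities from the image features and the LSTM state.

```python
import math


def beam_search(next_log_probs, start, end, beam_size=3, max_len=20):
    """Return the highest-scoring token sequence found by beam search.

    next_log_probs(seq) must return a dict {token: log_prob} for the
    token that follows the partial sequence `seq`.
    """
    beams = [(0.0, [start])]  # (cumulative log-probability, tokens so far)
    finished = []
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            for token, logp in next_log_probs(seq).items():
                candidates.append((score + logp, seq + [token]))
        # Keep only the `beam_size` best extensions.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, seq in candidates[:beam_size]:
            if seq[-1] == end:
                finished.append((score, seq))
            else:
                beams.append((score, seq))
        if not beams:
            break
    finished.extend(beams)  # fall back to unfinished beams at max_len
    return max(finished, key=lambda c: c[0])[1]


# Toy model: the next token depends only on the previous one.
TOY = {
    "<s>": {"a": math.log(0.6), "the": math.log(0.4)},
    "a": {"cat": math.log(0.5), "dog": math.log(0.5)},
    "the": {"cat": math.log(0.9), "dog": math.log(0.1)},
    "cat": {"</s>": 0.0},
    "dog": {"</s>": 0.0},
}


def toy_model(seq):
    return TOY[seq[-1]]


print(beam_search(toy_model, "<s>", "</s>", beam_size=3))
```

With a beam size of 1 the search is greedy and commits to "a", the most likely first word, which caps the sentence probability at 0.30. With a beam size of 3 it also keeps "the" alive and finds "the cat" at 0.36. This sketch does no length normalization, so it tends to favor shorter sequences.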

I want to train the model by myself.

I extracted the GoogleNet features and pickled them, so you can use them for training.

 cd codes
 python  -g 0 # to use gpu. change the number to gpu_id

The log and the trained model are saved to a directory (experiment1 by default).
If you want to change it, use the -d option.

 python -d ./yourdirectory
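
The training step above relies on image features that were extracted once and pickled, so the CNN does not have to run during training. The following sketch only illustrates that caching pattern. The layout (a dict from image id to feature vector), the file name, and the helper names are assumptions made for the example, not the actual format of the files in this repository.

```python
import os
import pickle
import tempfile


def save_features(features, path):
    """Write the feature dict to disk once, after the CNN pass."""
    with open(path, "wb") as f:
        # Protocol 2 can also be read from Python 2.7.
        pickle.dump(features, f, protocol=2)


def load_features(path):
    """Read the cached features back at training time."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical layout: image id -> feature vector (shortened toy vectors).
features = {
    "image_0001": [0.12, 0.00, 0.87],
    "image_0002": [0.45, 0.33, 0.02],
}

path = os.path.join(tempfile.mkdtemp(), "features_example.pkl")
save_features(features, path)
restored = load_features(path)
print(sorted(restored))
```

Caching like this trades disk space for training speed, and it is also the reason the CNN cannot be fine-tuned without changing the pipeline: the training loop never sees the images themselves.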

I want to train from other data.

Sorry, the current implementation does not support it. You would need to preprocess the data yourself. Maybe you can read and modify the code.

I want to fine-tune the CNN part.

Sorry, the current implementation does not support it. Maybe you can read and modify the code.

I want to generate Japanese captions.

I made a pre-trained Japanese caption model available. You can download the Japanese caption model with the following script.

cd codes
python -v ../work/index2token_jp.pkl -m ../models/caption_model_jp.chainer -i ../images/test_image.jpg
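
The -v option points at a pickled vocabulary file. Judging only from the file name index2token_jp.pkl, it presumably maps integer indices to tokens. The sketch below shows how such a mapping turns model output into text. The dict contents, the `<eos>` marker, and the `decode` helper are hypothetical and may not match the real file.

```python
def decode(token_ids, index2token, end_token="<eos>", sep=" "):
    """Map predicted indices to tokens, stopping at the end marker."""
    words = []
    for i in token_ids:
        token = index2token[i]
        if token == end_token:
            break
        words.append(token)
    return sep.join(words)


# Hypothetical vocabularies: one English, one Japanese.
index2token_en = {0: "<eos>", 1: "a", 2: "cat", 3: "sleeping"}
index2token_jp = {0: "<eos>", 1: "猫", 2: "が", 3: "寝ている"}

print(decode([1, 2, 3, 0, 2], index2token_en))     # tokens joined with spaces
print(decode([1, 2, 3, 0], index2token_jp, sep=""))  # Japanese needs no spaces
```

Swapping the vocabulary and the model file together, as the command above does, is enough to change the output language, because the model only ever predicts indices.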

Japanese Notebook:
Japanese Blogpost: