Skip to content
A simple implementation of neural image caption generator
Python Shell
Branch: master
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
code
data
scripts
.gitignore
LICENSE
README.md
requirements.txt

README.md

Image-Caption-Generator

A simple implementation of neural image caption generator

Please note that the code in this repo is for use in talks/workshops. There is a lot of room for improvement (in terms of both accuracy and efficiency) so that these aspects can be discussed during the sessions.

You may refer to Tensorflow's im2text Model for a stable and accurate implementation.

Setup

Create Directories

  • Run ./scripts/mkdir.sh

Downloading Datasets

  • Run ./scripts/download_images.sh
  • This downloads Flick8K dataset

Downloading Models

  • The VGG16 model would be downloaded automatically when the model is trained for the first time and would be cached on the disk.
  • Alternatively, run python3 vgg16.py. It would download the VGG16 model, produce the embeddings for a test image and compare with a pre-computed embedding.

Processing Images

  • Update data_dir in code/preprocess.py and set mode_list=["train", "test", "debug"]
  • Run python3 preprocess.py

Train

  • Run python3 train.py
You can’t perform that action at this time.