This repository contains a pretrained model for Show and Tell: A Neural Image Caption Generator, implemented in TensorFlow.

(tl;dr)
2M iterations finetuned checkpoint file | Released under MIT License

1M iterations checkpoint file | Released under MIT License

word_counts.txt (in this repository)

model.ckpt-2000000.index (in this repository; place it in the same folder as the downloaded checkpoint)

model.ckpt-1000000.index (in this repository; place it in the same folder as the downloaded checkpoint)
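
The index files and word_counts.txt must end up next to the downloaded checkpoint data. A minimal example layout (only the directory path is made up; the file names are the ones listed above):

# Illustrative directory; the repository files sit alongside the downloaded checkpoint.
/path/to/checkpoints/
    model.ckpt-2000000.data-00000-of-00001    # downloaded 2M-iteration checkpoint
    model.ckpt-2000000.index                  # from this repository
    word_counts.txt                           # from this repository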

Show and Tell: A Neural Image Caption Generator

Pretrained model for the TensorFlow implementation (found at tensorflow/models) of the image-to-text paper described in:

"Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge."

Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan.

Full text available at: http://arxiv.org/abs/1609.06647

Contact

Kranthi Kiran GV (KranthiGV | kranthi.gv@gmail.com)

Generating Captions

Steps

  1. Follow the steps at im2txt to clone the repository, install bazel, etc.

  2. Download the desired model checkpoint:
    2M iterations finetuned checkpoint file | Released under MIT License
    1M iterations checkpoint file | Released under MIT License

  3. Clone this repository and place its files next to the downloaded checkpoint (see the sketch below): git clone https://github.com/KranthiGV/Pretrained-Show-and-Tell-model.git
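
For example, assuming the 2M-iteration checkpoint was downloaded to ~/im2txt/checkpoints (an illustrative path), the cloned files can be copied next to it like this:

# Paths are illustrative; adjust them to wherever the checkpoint was downloaded.
cp Pretrained-Show-and-Tell-model/model.ckpt-2000000.index ~/im2txt/checkpoints/
cp Pretrained-Show-and-Tell-model/word_counts.txt ~/im2txt/checkpoints/

With the files in place, set the environment variables and run inference: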

# Path to checkpoint file.
# Notice there's no data-00000-of-00001 in the CHECKPOINT_PATH environment variable
# Also make sure you place model.ckpt-2000000.index (which is cloned from the repository)
# in the same location as model.ckpt-2000000.data-00000-of-00001
# You can use model.ckpt-1000000.data-00000-of-00001 similarly
CHECKPOINT_PATH="/path/to/model.ckpt-2000000"


# Vocabulary file generated by the preprocessing script.
# Since the tokenizer could be of a different version, use the word_counts.txt file supplied. 
VOCAB_FILE="/path/to/word_counts.txt"

# JPEG image file to caption.
IMAGE_FILE="/path/to/image.jpeg"

# Build the inference binary.
bazel build -c opt im2txt/run_inference

# Run inference to generate captions.
bazel-bin/im2txt/run_inference \
  --checkpoint_path=${CHECKPOINT_PATH} \
  --vocab_file=${VOCAB_FILE} \
  --input_files=${IMAGE_FILE}
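
Several images can be captioned in a single run; im2txt's run_inference has historically accepted a comma-separated list of file patterns for --input_files (check the flag's help text in your checkout to confirm):

# Example: caption two images in one invocation (paths are illustrative).
IMAGE_FILE="/path/to/image1.jpg,/path/to/image2.jpg"
bazel-bin/im2txt/run_inference \
  --checkpoint_path=${CHECKPOINT_PATH} \
  --vocab_file=${VOCAB_FILE} \
  --input_files=${IMAGE_FILE}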

Extras

  1. Graph.pbtxt is uploaded on request.
  2. Training stats are uploaded for use with TensorBoard:
    tensorboard --logdir="./extras/tensorboard/"
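
TensorBoard serves on port 6006 by default, so after running the command above from the repository root the training curves should be viewable at http://localhost:6006.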