Pretrained model for img2txt? #466

ludazhao opened this Issue Sep 28, 2016 · 54 comments



Please let us know which model this issue is about (specify the top-level directory)


Can someone release a pre-trained model for the img2txt model trained on COCO? It would be great for those of us who don't have the computational resources to do a full training run. Thanks!


@cshallue: could you comment on this? Thanks.




Sorry, we're not releasing a pre-trained version of this model at this time.

@cshallue cshallue closed this Sep 30, 2016
psycharo commented Oct 5, 2016 edited

here are links to a pre-trained model:


@psycharo thanks for sharing! Perhaps you could also share your word_counts.txt file. Different versions of the tokenizer can yield different results, so your model is specific to the word_counts.txt file that you used.


@psycharo my model is still training on our GPU instance. It seems it will take another two weeks to finish. I would appreciate it if you would also release the fine-tuned model.


@psycharo Thanks for sharing your checkpoint!

When I try to use it I'm getting the error: "ValueError: No checkpoint file found in: None".
I don't have any trouble running run_inference on my own checkpoint files, but I can't run it on yours. I've tried lots of things: adding a trailing "/", using absolute paths, relative paths, ..... Nothing seems to work.

Suggestions welcomed.
@cshallue - Any thoughts?

Thanks all.

user123@myhost:~$ ls -l /tmp/checkpoint_tmp/
total 175356
-rw-r--r-- 1 user123 user123  19629588 Oct 15 07:04 graph.pbtxt
-rw-r--r-- 1 user123 user123 149088120 Oct 15 07:04 model.ckpt-2000000
-rw-r--r-- 1 user123 user123  10675545 Oct 15 07:04 model.ckpt-2000000.meta
-rw-rw-r-- 1 user123 user123    156438 Oct 15 07:08 word_counts.txt
user123@myhost:~$  /data/home/user123/tensorflow_models/models/im2txt/bazel-bin/im2txt/run_inference   --checkpoint_path=/tmp/checkpoint_tmp   --vocab_file=/tmp/checkpoint_tmp/word_counts.txt   --input_files=${IMAGE_FILE}
I tensorflow/stream_executor/] successfully opened CUDA library locally
Traceback (most recent call last):
  File "/data/home/user123/tensorflow_models/models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/", line 83, in <module>
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/", line 30, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))
  File "/data/home/user123/tensorflow_models/models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/", line 49, in main
  File "/data/home/user123/tensorflow_models/models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/inference_utils/", line 118, in build_graph_from_config
    return self._create_restore_fn(checkpoint_path, saver)
  File "/data/home/user123/tensorflow_models/models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/inference_utils/", line 92, in _create_restore_fn
    raise ValueError("No checkpoint file found in: %s" % checkpoint_path)
ValueError: No checkpoint file found in: None
cshallue commented Oct 15, 2016 edited

@ProgramItUp Try the following: --checkpoint_path=/tmp/checkpoint_tmp/model.ckpt-2000000

When you pass a directory, it looks for a "checkpoint state" file in that directory, which is an index of all checkpoints in the directory. Your directory doesn't have a checkpoint state file, but you can just pass it the explicit filename.
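That lookup can be sketched in plain Python. This is only an illustrative helper (the name `resolve_checkpoint` and the state-file parsing are mine, not im2txt code; in TensorFlow the directory case is handled by `tf.train.latest_checkpoint`):

```python
import os

def resolve_checkpoint(checkpoint_path):
    """Mimic the restore logic: a directory must contain a 'checkpoint'
    state file; an explicit checkpoint filename is used as-is."""
    if os.path.isdir(checkpoint_path):
        state_file = os.path.join(checkpoint_path, "checkpoint")
        if not os.path.exists(state_file):
            # A directory without a state file is what produced
            # "No checkpoint file found in: None" above.
            return None
        # The state file's first line looks like:
        #   model_checkpoint_path: "model.ckpt-2000000"
        with open(state_file) as f:
            value = f.readline().split(":", 1)[1].strip().strip('"')
        return os.path.join(checkpoint_path, os.path.basename(value))
    return checkpoint_path
```

So passing the explicit file `--checkpoint_path=/tmp/checkpoint_tmp/model.ckpt-2000000` works because it takes the non-directory branch.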

PredragBoksic commented Oct 15, 2016 edited

Getting better, but...

Traceback (most recent call last):
  File "/home/gamma/bin/models-master/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/", line 83, in <module>
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/", line 30, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))
  File "/home/gamma/bin/models-master/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/", line 53, in main
    vocab = vocabulary.Vocabulary(FLAGS.vocab_file)
  File "/home/gamma/bin/models-master/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/inference_utils/", line 50, in __init__
    assert start_word in reverse_vocab

Looks like the word_counts.txt file above is not formatted as expected:

b'a' 969108
b'</S>' 586368
b'<S>' 586368
b'.' 440479
b'on' 213612
b'of' 202290
b'the' 196219
b'in' 182598
b'with' 152984
... whereas the vocabulary loader expects:

a 969108
</S> 586368
<S> 586368
. 440479
on 213612
of 202290
the 196219
in 182598
with 152984

A quick fix is to reformat word_counts.txt in that way. Or, you could replace line 49 with

reverse_vocab = [eval(line.split()[0]) for line in reverse_vocab]

In the long run, I'll come up with a way to make sure word_counts.txt is output the same for everyone.
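The reformatting itself only takes a few lines of Python. A sketch (the function name is mine; `ast.literal_eval` turns the `b'...'` repr back into bytes without the risks of bare `eval`):

```python
import ast

def clean_word_counts(lines):
    """Rewrite lines like "b'a' 969108" as "a 969108"; pass clean lines through."""
    cleaned = []
    for line in lines:
        token, count = line.split()
        if token.startswith("b'") or token.startswith('b"'):
            # The token is the repr of a bytes literal; recover and decode it.
            token = ast.literal_eval(token).decode("utf-8")
        cleaned.append("%s %s" % (token, count))
    return cleaned
```

Applied once to the shared word_counts.txt, this produces a file the unmodified vocabulary loader accepts.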

PredragBoksic commented Oct 15, 2016 edited

It works!

Captions for image cb340488986cc40f8ec610348b7f5a24.jpg:
  0) a woman is standing next to a horse . (p=0.000726)
  1) a woman is standing next to a horse (p=0.000638)
  2) a woman is standing next to a brown horse . (p=0.000373)

@PredragBoksic great!

@psycharo , what version of python did you use to generate the word_counts.txt file?

I expect the script to output lines of the form:

a 969108
</S> 586368
<S> 586368


but your file contains:

b'a' 969108
b'</S>' 586368
b'<S>' 586368

I didn't generate the word_counts.txt file. I changed line 49 as you suggested:

    """ WORKAROUND for vocabulary file """
    """reverse_vocab = [line.split()[0] for line in reverse_vocab]"""
    reverse_vocab = [eval(line.split()[0]) for line in reverse_vocab]

I have Python 2.7.12 on Kubuntu 16.04 with CUDA 8.0, cuDNN 5.1, and a GTX 970. I wouldn't know how to do it in Python, because I usually program in Java. Do you need some code to change that file?


@PredragBoksic I'm asking the creator of that file. You can just keep using the workaround :)


@cshallue python 3.5. I had to make a couple of dirty hacks to make it work on that version of Python, which is why word_counts.txt looks different.


@psycharo How many hours did this take to train? I think that people would appreciate what you shared more if you mentioned this.


Initial training took about 2-3 days; finetuning for 1M iterations took around 5-6 days. I used a single GPU, a Tesla P100.


@cshallue Thanks for the prompt replies. Your suggestions worked.

I was not able to follow the full execution path of the code:

Where would be the right place to put a bit of error checking to make sure that the files
--checkpoint_path, --vocab_file, --input_files exist and throw an error if they don't?

In the case of the checkpoint file it would be helpful to throw an error if "checkpoint state" is not found.
Where would this happen?



There are already error checks for all those things.

If no checkpoint state or no checkpoint file is found in --checkpoint_path, it will fail the check here.

If --vocab_file doesn't exist it will fail the check here.

If no files match --input_files then you will get the message "Running caption generation on 0 files matching..." and inference will exit: see here.


I did not notice any meaningful error messages, for example when the image file was missing. I suppose that this functionality will be completed in the future.


@cshallue: I am running the finetuning step of the optimization. What I noticed was that the loss function is not changing much for the initial 22000 steps. The loss is pretty much stuck at 2.40.

I have attached the log file by piping stderr to a text file. Is the loss going to go down significantly in the remaining iterations? Or am I missing some "gotcha"?

cshallue commented Oct 25, 2016 edited

@siavashk The loss reported by the training script is expected to be really noisy: it reports on single batches of only 32 examples.

Are you running the evaluation script on the validation files? We expect to see validation perplexity decreasing slowly. It decreases slowly because the model is already near optimal and because we use a smaller learning rate during finetuning.

siavashk commented Oct 25, 2016 edited

@cshallue Maybe I am overly anxious; 22000 steps is about 1% of the optimization. I am just worried that it has been three weeks since I started training this model, and it seems it will take another two weeks to converge.
I am not running the validation script, since training itself is taking too long (it has been three weeks now and I am at 1 million iterations). I thought running an additional validation step would make this even longer.


You won't be able to tell much from the training losses for a single batch any more. They will keep jumping around.

You could always just use the model in its current form. It will probably be sensible. There is not much improvement after 1M steps of fine tuning.

Or you could use the model shared in this thread above.

@mainyaa mainyaa added a commit to mainyaa/im2txt_api that referenced this issue Oct 29, 2016
@mainyaa mainyaa add: models from tensorflow/models#466 21d247c

@siavashk do we need to rerun the pre-training step if we use the word_counts.txt file from @psycharo or what is the correct workflow here?


@hholst80, I don't think you need to pre-train. Here is how I used @psycharo's pre-trained model:

  1. Download the finetuned model
  2. Download the word_counts.txt file
  3. Run the evaluation script as described here
  4. If you get an issue similar to this, do the patch as described by @cshallue in the same comment.

@psycharo Thanks a lot! It saves a lot of time!
Would you please share the latest checkpoint that you got?


I have almost finished the training (3,000,000 steps) if somebody is interested. I'll train something with Inception ResNet v2 later.


Could you please share the trained model after you finish it? I think a great number of people are interested and would appreciate it. Thanks!


Could you please also post your model trained with Inception ResNet v2? Thank you very much.

victoriastuart commented Dec 9, 2016 edited

You people (all) are fabulous!

I needed to edit the TensorFlow-provided im2txt scripts (and add to my $PYTHONPATH -- py27 venv -- via a *.pth file), as the paths in the scripts in the GitHub-cloned repo were not working for me. I did all of this without the use of Bazel -- just straight-up edits in an editor and runs in a terminal (Linux).

I downloaded psycharo's pretrained model (thank you very much!), edited the file as suggested by cshallue and -- presto! -- I'm successfully classifying images! :-)

Thank you to all involved. :-)


@victoriastuart, I believe that Bazel works well if you clone the entire repository, enter the appropriate folder, and run Bazel from that folder. It's counterintuitive.


@PredragBoksic: ahh, good to know -- thanks! I'm new to the Bazel ecosystem, and the instructions on the im2txt site are not as clear as they could be, in my opinion. Anyway, it's working and, even better, I learned a lot while sorting it out! ;-)


@siavashk hi: when running the evaluation script, what are the train and eval directories?


@ProgramItUp hi: before running the script with the pre-trained model, what do I need to do?

tae-jun commented Dec 28, 2016

Thanks @psycharo @cshallue !!!

I could successfully run the model thanks to you guys 😄

srome commented Jan 2, 2017

I'm on Python 3.5 and the fix:
reverse_vocab = [eval(line.split()[0]) for line in reverse_vocab]
did not work for me. The bytes were not being decoded into strings, so I was getting the same assertion error. The following does work for me:
reverse_vocab = [eval(line.split()[0]).decode() for line in reverse_vocab]

Thanks for the model @psycharo !



Did you ever post the model you trained anywhere online?

inbreaks commented Jan 4, 2017 edited

Thanks for sharing, @psycharo!


It would be great if this supported TensorFlow Serving, just like the Inception model.

I may open another issue to track this if anyone is interested.


@outcastrift I just uploaded it. Sorry for the delay, guys. The archive contains the model checkpoint at 3,000,000 steps (1,000,000 without and 2,000,000 with Inception training) and the word_counts file.


@TRGNN cool man


Hi, I've hit a problem while running my demo:
CRITICAL:tensorflow:Vocab file /home/ubuntu/nmodels-master/im2txt/im2txt/data/word_counts.txt not found.
Traceback (most recent call last):
File "/home/ubuntu/models-master/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/", line 83, in
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/", line 43, in run
sys.exit(main(sys.argv[:1] + flags_passthrough))
File "/home/ubuntu/models-master/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/", line 53, in main
vocab = vocabulary.Vocabulary(FLAGS.vocab_file)
File "/home/ubuntu/models-master/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/inference_utils/", line 48, in init
reverse_vocab = list(f.readlines())
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/lib/io/", line 128, in readlines
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/lib/io/", line 73, in _preread_check
compat.as_bytes(self.__name), 1024 * 512, status)
File "/usr/lib/python2.7/", line 24, in exit
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/", line 469, in raise_exception_on_not_ok_status
tensorflow.python.framework.errors_impl.NotFoundError: /home/ubuntu/nmodels-master/im2txt/im2txt/data/word_counts.txt

However, word_counts.txt is in that directory... I don't know how to deal with it. Please help me. Thank you!

ProgramItUp commented Jan 20, 2017 edited

@adaxidedakaonang Yesterday I encountered the same thing.
There seems to be a bug reading command-line arguments. This is a real hack, but one thing you can do for your demo is hard-code the FLAGS variables in im2txt/ with the full paths and file names. Note: there are multiple copies of the file after running Bazel; change the original file, do not try changing a copy in bazel-bin.
Make sense?


I have found the mistake:
tensorflow.python.framework.errors_impl.NotFoundError: /home/ubuntu/nmodels-master/im2txt/im2txt/data/word_counts.txt
It should be 'models' rather than 'nmodels'. I typed an extra 'n' by mistake! Anyway, I'm grateful to you! Thanks!


Now I have a question about running this code with os.popen. This is my code:

import os
import subprocess

def getMessage(img):
    jpg = img
    cmdline = '''
        CHECKPOINT_DIR="${HOME}/models-master/im2txt/im2txt/model/model.ckpt-3000000" && \
        VOCAB_FILE="${HOME}/models-master/im2txt/im2txt/data/word_counts.txt" && \
        IMAGE_FILE="%s" && \
        bazel build -c opt im2txt/run_inference && \
        export CUDA_VISIBLE_DEVICES="" && \
        bazel-bin/im2txt/run_inference \
            --checkpoint_path=${CHECKPOINT_DIR} \
            --vocab_file=${VOCAB_FILE} \
            --input_files=${IMAGE_FILE}
    ''' % jpg
    print os.popen(cmdline)


I think you can see what this code is meant to do: I am trying to run these commands in a shell. Note the last lines: when I run the command directly in a shell I can split it across lines, like this:
bazel-bin/im2txt/run_inference \

However, when I run it through os.popen it has to become a single line, like this:
bazel-bin/im2txt/run_inference --checkpoint_path=${HOME}/models-master/im2txt/im2txt/model/model.ckpt-3000000 --vocab_file=${VOCAB_FILE} --input_files=${IMAGE_FILE}.
Do you know how to write this in 'cmdline'? Thank you~
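One way to sidestep the shell-quoting trouble is to drop the multi-line shell string entirely and call the already-built binary with an argument list. A sketch, assuming `bazel build` has already been run and using the paths from the snippet above (the helper names are mine):

```python
import os
import subprocess

def build_inference_cmd(image_file, home=os.path.expanduser("~"),
                        binary="bazel-bin/im2txt/run_inference"):
    """Assemble the run_inference command as an argv list (no shell parsing)."""
    model_dir = os.path.join(home, "models-master/im2txt/im2txt")
    return [
        binary,
        "--checkpoint_path=" + os.path.join(model_dir, "model/model.ckpt-3000000"),
        "--vocab_file=" + os.path.join(model_dir, "data/word_counts.txt"),
        "--input_files=" + image_file,
    ]

def get_message(img):
    # Hide the GPU, as the original snippet did with CUDA_VISIBLE_DEVICES="".
    env = dict(os.environ, CUDA_VISIBLE_DEVICES="")
    return subprocess.check_output(build_inference_cmd(img), env=env)
```

With an argument list there is nothing for the shell to re-wrap, so the multi-line vs. single-line problem disappears.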

ylashin commented Jan 22, 2017

If anyone gets the error AttributeError: 'module' object has no attribute 'BasicLSTMCell', you can reset your git HEAD to the commit below. It seems the models repo has undergone lots of changes since December 2016.

$ git reset --hard 9997b250

@mathieuarbezhermoso vs @psycharo: whose model should I use?

Does anyone have a model trained with the latest Inception?

I don't have Teslas, so I can't train one myself.


I downloaded this model but could not find its word_counts file.
Please share the link when you get a chance.


tf.learn.latest_checkpoint returns None.
I downloaded the @psycharo model into im2txt/model/train/
What's wrong?

@FangMath FangMath added a commit to FangMath/models that referenced this issue Feb 4, 2017
@FangMath FangMath Update
Tips to avoid errors:
1. Use Tensorflow 0.12
2. Use export CUDA_VISIBLE_DEVICES="0" for inference on MacBook Pro GPU
3. Specify --checkpoint_path=/Users/fanfang/im2txt/model/train/model.ckpt-2000000 in inferencing if you're using a pretrained model from here tensorflow#466
@MoAbd MoAbd added a commit to MoAbd/models that referenced this issue Feb 4, 2017
@MoAbd MoAbd Sync w TF r0.12 issue(#466) c960219
MoAbd commented Feb 5, 2017

I am trying to convert @mathieuarbezhermoso's checkpoint into Const ops using
and it needs a .pb file as input_graph, so how can I generate it?
Or if anyone has it, I'll be thankful to you.


Did anybody get the following error when trying to run the checkpoint with TensorFlow 1.0?

NotFoundError (see above for traceback): Tensor name "lstm/basic_lstm_cell/weights" not found in checkpoint files im2txt/im2txt_pretrained2/model.ckpt-3000000
	 [[Node: save/RestoreV2_381 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_381/tensor_names, save/RestoreV2_381/shape_and_slices)]]



@tintelle Looks like the default variable names for the BasicLSTMCell were changed in TensorFlow 1.0, and they no longer match the checkpoint. See the following thread for a pointer on renaming variables in checkpoints:
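The fix amounts to reading each variable from the old checkpoint and saving it under the name the new graph expects. The TensorFlow plumbing (e.g. `tf.train.NewCheckpointReader` plus a fresh `Saver`) is omitted here; this sketch shows only the rename map, and the old-style names in it are my assumption about the pre-1.0 layout, so verify them against your checkpoint's variable listing first:

```python
# Assumed mapping from pre-1.0 checkpoint names to the TensorFlow 1.0 names
# in the error message; extend with any other variables your checkpoint has.
RENAMES = {
    "lstm/BasicLSTMCell/Linear/Matrix": "lstm/basic_lstm_cell/weights",
    "lstm/BasicLSTMCell/Linear/Bias": "lstm/basic_lstm_cell/biases",
}

def rename_variables(variables):
    """Return a copy of a {name: tensor} dict with old names rewritten."""
    return {RENAMES.get(name, name): value for name, value in variables.items()}
```

Variables not listed in the map (e.g. global_step or the Inception weights) pass through unchanged.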
