# Manual and model-assisted image captioning
This directory contains recipe scripts for collecting and reviewing image captioning data with Prodigy. The captioning model is implemented in PyTorch based on this tutorial. To use the pretrained model, download the files from here and place them all in this directory. For more details on custom recipes with Prodigy, check out the documentation.
📺 This project was created as part of a step-by-step video tutorial.
For more details on the recipes, check out `image_caption.py` or run a recipe with `prodigy image-caption -F image_caption.py --help`.
## `image-caption`: Collect image captions manually
Start the server, stream in images from a directory and allow annotating them with captions. Captions will be saved in the data as the field `"caption"`.

```
prodigy image-caption caption_data ./images -F image_caption.py
```
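To illustrate the idea, here is a minimal sketch of the kind of stream such a recipe works with. The function name `stream_image_tasks` and the exact task fields are assumptions for illustration; Prodigy's built-in `Images` loader similarly encodes image files as data URIs, and the text input writes its value into the `"caption"` field.

```python
import base64
import mimetypes
from pathlib import Path


def stream_image_tasks(image_dir):
    """Yield one annotation task per image file in a directory.

    Hypothetical sketch: each task carries the image as a base64 data URI
    (as Prodigy's Images loader does) plus an empty "caption" field for
    the annotator to fill in via a text input.
    """
    for path in sorted(Path(image_dir).iterdir()):
        mime = mimetypes.guess_type(path.name)[0]
        if mime is None or not mime.startswith("image/"):
            continue  # skip non-image files
        data = base64.b64encode(path.read_bytes()).decode("utf-8")
        yield {
            "image": f"data:{mime};base64,{data}",
            "meta": {"file": path.name},
            "caption": "",
        }
```

Streaming tasks lazily like this keeps memory use flat even for large image directories, since each file is only read when its task is requested.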
## `image-caption.correct`: Model-assisted image captioning
Start the server, stream in images from a directory and display the generated captions in the text field, allowing the annotator to change them if needed. Captions will be saved in the data as the field `"caption"` and the original unedited caption will be preserved as `"orig_caption"`. Prints the counts of changed vs. unchanged captions on exit.

```
prodigy image-caption.correct caption_data ./images -F image_caption.py
```
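The changed-vs-unchanged counts printed on exit can be computed by comparing each example's final `"caption"` to the preserved `"orig_caption"`. A minimal sketch of that bookkeeping, with the function name `count_caption_changes` assumed for illustration:

```python
from collections import Counter


def count_caption_changes(examples):
    """Count how many accepted captions were edited vs. kept as-is.

    Hypothetical sketch of the logic an on_exit callback could run:
    an example counts as "changed" if its final "caption" differs
    from the model's "orig_caption".
    """
    counts = Counter()
    for eg in examples:
        if eg.get("answer") != "accept":
            continue  # only count examples the annotator accepted
        key = "changed" if eg["caption"] != eg.get("orig_caption") else "unchanged"
        counts[key] += 1
    return counts
```

Tracking this ratio is a cheap proxy for model quality: the fewer captions annotators need to edit, the better the pretrained model fits the data.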
This recipe expects the pretrained model files, including `decoder-5-3000.pkl`, to be present in the same directory. You can download them via the link above. If needed, the recipe could be edited to allow the model path to be passed in as a recipe argument that's then passed to the code that loads the model.
## `image-caption.diff`: Review corrected image captions
Go through all edited captions in a dataset created with `image-caption.correct` and select why the caption was changed, based on multiple choice options. Prints the counts of options on exit.

```
prodigy image-caption.diff caption_data_diff caption_data -F image_caption.py
```
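A review stream like this only needs the examples where the caption was actually edited, with the reasons attached as choice options. A minimal sketch under those assumptions; the function name `make_diff_tasks` and the option texts are hypothetical:

```python
def make_diff_tasks(examples, options):
    """Build multiple-choice review tasks for edited captions.

    Hypothetical sketch: skip examples whose caption matches the
    original, and attach the given option texts as Prodigy-style
    choice options asking why the caption was changed.
    """
    choice_options = [{"id": i, "text": text} for i, text in enumerate(options)]
    for eg in examples:
        if eg.get("caption") == eg.get("orig_caption"):
            continue  # unchanged captions need no review
        yield {
            "image": eg.get("image"),
            "text": eg["caption"],
            "orig_caption": eg.get("orig_caption"),
            "options": choice_options,
        }
```

Counting the selected option IDs on exit then gives the per-reason breakdown the recipe prints.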