Manual and model-assisted image captioning

This directory contains recipe scripts for collecting and reviewing image captioning data with Prodigy. The captioning model is implemented in PyTorch based on this tutorial. To use the pretrained model, download the files from here and place them all in this directory. For more details on custom recipes with Prodigy, check out the documentation.

📺 This project was created as part of a step-by-step video tutorial.


For more details on the recipes, check out the recipe script or run a recipe with --help, for example: prodigy image-caption -F --help.

recipe image-caption: Collect image captions manually

Start the server, stream in images from a directory and allow annotating them with captions. Captions will be saved in the data as the field "caption".

prodigy image-caption caption_data ./images -F
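
To illustrate how the caption ends up under the "caption" key, here is a minimal sketch of the kind of components dictionary such a recipe could return. It assumes Prodigy's blocks interface with an image block and a text_input block. The helper name build_caption_components and the example stream are hypothetical, and the actual recipe in this directory may be structured differently.

```python
def build_caption_components(dataset, stream):
    """Return a components dict of the kind a Prodigy recipe function returns.

    Hypothetical helper: in a real recipe this would be the body of a function
    registered as a recipe, and `stream` would come from an image loader.
    """
    blocks = [
        # Displays the image of each incoming task.
        {"view_id": "image"},
        # Text typed into this field is stored on the task under "caption".
        {"view_id": "text_input", "field_id": "caption", "field_autofocus": True},
    ]
    return {
        "dataset": dataset,             # dataset to save annotations to
        "stream": stream,               # iterable of tasks with an "image" key
        "view_id": "blocks",            # combine several interfaces in one card
        "config": {"blocks": blocks},
    }


# Example stream with a single made-up image path.
components = build_caption_components("caption_data", [{"image": "images/example.jpg"}])
print(components["config"]["blocks"][1]["field_id"])
```

The field_id value is what determines the key the annotator's text is saved under, so changing it would change the field name in the saved data.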

recipe image-caption.correct: Model-assisted image captioning

Start the server, stream in images from a directory and display the generated captions in the text field, allowing the annotator to change them if needed. Captions will be saved in the data as the field "caption" and the original unedited caption will be preserved as "orig_caption". Prints the counts of changed vs. unchanged captions on exit.

prodigy image-caption.correct caption_data ./images -F
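
The changed vs. unchanged counts can be reproduced from the saved data by comparing the two fields. The following is a rough sketch, assuming each saved example is a dictionary containing both "caption" and "orig_caption". The function name count_caption_changes is hypothetical and not taken from the recipe.

```python
from collections import Counter


def count_caption_changes(examples):
    """Count how many captions were edited by the annotator.

    Assumes every example has both a "caption" and an "orig_caption" key.
    """
    counts = Counter({"changed": 0, "unchanged": 0})
    for eg in examples:
        if eg["caption"] != eg["orig_caption"]:
            counts["changed"] += 1
        else:
            counts["unchanged"] += 1
    return counts


# Made-up examples for illustration only.
examples = [
    {"caption": "a dog on a beach", "orig_caption": "a dog on a beach"},
    {"caption": "a cat on a sofa", "orig_caption": "a dog on a sofa"},
]
print(count_caption_changes(examples))
```

Note that this sketch compares the strings exactly, so a change in whitespace or capitalization would count as an edit.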

This recipe expects the files vocab.pkl, encoder-5-3000.pkl and decoder-5-3000.pkl to be present in the same directory. You can download a pretrained model from here. If needed, the recipe could be edited to allow the model path to be passed in as a recipe argument that's then passed to load_model.
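
One way to prepare for such a model path argument is to resolve the three expected files relative to a configurable directory and fail early if any are missing. This is only a sketch: the helper resolve_model_files and its model_dir parameter are hypothetical, and since the signature of load_model is not shown here, the sketch stops at producing the paths.

```python
from pathlib import Path

# File names the recipe expects, as listed above.
MODEL_FILES = ("vocab.pkl", "encoder-5-3000.pkl", "decoder-5-3000.pkl")


def resolve_model_files(model_dir="."):
    """Map each expected model file name to its path inside model_dir.

    Raises FileNotFoundError listing every file that is not present.
    """
    base = Path(model_dir)
    paths = {name: base / name for name in MODEL_FILES}
    missing = [name for name, path in paths.items() if not path.is_file()]
    if missing:
        raise FileNotFoundError(
            f"Missing model files in {base}: {', '.join(missing)}"
        )
    return paths
```

A recipe could then accept the directory as an extra argument, call resolve_model_files on it, and hand the resulting paths to its model loading code.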

recipe image-caption.diff: Review corrected image captions

Go through all edited captions in a dataset created with image-caption.correct and select why the caption was changed, based on multiple choice options. Prints the counts of options on exit.

prodigy image-caption.diff caption_data_diff caption_data -F
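
The two steps described above, selecting only the edited captions and tallying the chosen reasons, could look roughly like this. The sketch assumes that selected option IDs are stored as a list under the "accept" key, as in Prodigy's multiple choice interface. The function names and the option IDs in the example data are hypothetical.

```python
from collections import Counter


def get_edited(examples):
    """Keep only examples whose caption differs from the original caption."""
    return [eg for eg in examples if eg.get("caption") != eg.get("orig_caption")]


def count_selected_options(examples):
    """Tally the option IDs selected across all reviewed examples."""
    counts = Counter()
    for eg in examples:
        # "accept" is assumed to hold a list of selected option IDs.
        counts.update(eg.get("accept", []))
    return counts


# Made-up data for illustration only.
source = [
    {"caption": "a cat", "orig_caption": "a dog"},
    {"caption": "a tree", "orig_caption": "a tree"},
]
reviewed = [{"caption": "a cat", "orig_caption": "a dog", "accept": ["WRONG_OBJECT"]}]
print(len(get_edited(source)), count_selected_options(reviewed))
```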

The options are currently hard-coded in the recipe, but the recipe could be modified to take a JSON file of options instead via a recipe argument.
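
A possible shape for that modification is sketched below. It assumes the JSON file contains a list of objects with "id" and "text" keys, which is the format Prodigy's choice interface uses for options. The function names load_options and add_options are hypothetical and not part of the existing recipe.

```python
import json
from pathlib import Path


def load_options(path):
    """Read multiple choice options from a JSON file.

    Expects a JSON list of objects, each with an "id" and a "text" key.
    """
    data = json.loads(Path(path).read_text(encoding="utf8"))
    options = []
    for item in data:
        if "id" not in item or "text" not in item:
            raise ValueError(f"Option needs 'id' and 'text' keys: {item!r}")
        options.append({"id": item["id"], "text": item["text"]})
    return options


def add_options(stream, options):
    """Attach the same options to a copy of every example in the stream."""
    for eg in stream:
        eg = dict(eg)  # shallow copy so the input example is not modified
        eg["options"] = options
        yield eg
```

The recipe would then take the file path as an additional argument and call load_options once at startup instead of defining the list inline.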
