image_captioning_pytorch

Image Captioning pytorch

input

(Image from http://images.cocodataset.org/train2017/000000505539.jpg)

output

  • Estimating Caption
### Caption ###
a giraffe and a zebra standing in a field (fc model)
a group of zebras and a giraffe in a field (fc_rl model)
a group of zebras and a giraffe standing on a dirt road (fc_nsc model)

usage

The onnx and prototxt files are downloaded automatically on the first run, so an Internet connection is required at that time.
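The download-on-first-run behavior can be pictured roughly as follows. This is only a minimal sketch, not the script's actual code: the helper name `ensure_file`, its `fetch` parameter, and the example URL are all hypothetical.

```python
import os
import urllib.request


def ensure_file(path, url, fetch=urllib.request.urlretrieve):
    """Download `url` to `path` only if `path` does not exist yet.

    Returns True when a download was performed, False when the
    existing local copy was reused. `fetch` is a hypothetical hook
    (any callable taking url and path) that makes the helper easy
    to exercise without network access.
    """
    if os.path.exists(path):
        # A previous run already fetched the file, so no network is needed.
        return False
    fetch(url, path)
    return True


# Hypothetical usage; the URL below is a placeholder, not the real location:
# ensure_file("model.onnx", "https://example.invalid/model.onnx")
```

With a check like this, only the first run needs network access; later runs reuse the local files.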

To run on the sample image:

$ python3 image_captioning_pytorch.py

To use your own image, pass its path with the --input option.

$ python3 image_captioning_pytorch.py --input IMAGE_PATH

To caption a video, pass the video file path with the --video option.
If you pass 0 as VIDEO_PATH, the webcam is used as input instead of a video file.

$ python3 image_captioning_pytorch.py --video VIDEO_PATH
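One way a script could tell a webcam index from a file path is sketched below. The helper name `resolve_video_source` is hypothetical, and treating any digit-only string (not just 0) as a camera index is an assumption rather than documented behavior.

```python
def resolve_video_source(video_arg):
    """Map the --video argument to a capture source.

    Assumption: a digit-only string such as "0" names a camera index,
    and anything else is a path to a video file.
    """
    if video_arg.isdigit():
        return int(video_arg)  # e.g. "0" -> 0, the default webcam
    return video_arg           # left unchanged: a video file path


print(resolve_video_source("0"))
print(resolve_video_source("clip.mp4"))
```

The result could then be handed to a capture API such as OpenCV's `cv2.VideoCapture`, which accepts either an integer device index or a file name.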

The default captioning model is fc_nsc.
To use a different model, add --model fc_rl for the fc_rl model or --model fc for the fc model.
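The options described in this section could be declared with `argparse` along these lines. This is a hedged sketch based only on the text above; `build_parser` and `MODELS` are hypothetical names, and the real script may define its arguments differently.

```python
import argparse

# Model names taken from this README; fc_nsc is the stated default.
MODELS = ("fc", "fc_rl", "fc_nsc")


def build_parser():
    """Build a parser mirroring the options described in this README."""
    parser = argparse.ArgumentParser(description="Image captioning (sketch)")
    parser.add_argument("--input", metavar="IMAGE_PATH", default=None,
                        help="path to the input image")
    parser.add_argument("--video", metavar="VIDEO_PATH", default=None,
                        help="path to a video file, or 0 for the webcam")
    parser.add_argument("--model", choices=MODELS, default="fc_nsc",
                        help="captioning model to use")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--model", "fc_rl"])
print(args.model)
```

Restricting `--model` with `choices` makes an unsupported model name fail at argument parsing rather than later during inference.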

Reference

Framework

PyTorch

Model Format

ONNX opset = 11

Netron