# Image Captioner

Image captioner CLI using the BLIP and BLIP-2 models.
## Requirements

- Python 3.10 or higher
## Installation

```
pip install zz-image-caption
```

Depending on your system, you may need to install PyTorch separately to use CUDA (the tool defaults to the CPU when CUDA is not available).
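To confirm whether your PyTorch install can see a CUDA device (the tool falls back to the CPU when it cannot), a quick check like this works:

```python
# Report whether PyTorch can see a CUDA device; the captioner falls back
# to the CPU when it cannot.
try:
    import torch
    has_cuda = torch.cuda.is_available()
except ImportError:  # PyTorch not installed yet
    has_cuda = False
print("CUDA available:", has_cuda)
```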
## Usage

Print the caption for an image to the console:

```
caption image.jpg
```

Rename images in a directory with their captions:

```
caption images/ -o filename
```

Write captions into the metadata of images in a directory:

```
caption images/ -o metadata
```

Print the caption for an image to the console using the BLIP-2 model:

```
caption image.jpg --blip2
```

## Arguments

The following table lists all available command-line arguments, with descriptions and additional details:
| Argument | Type | Choices | Default | Description |
|---|---|---|---|---|
| `-v, --version` | flag | | | Display the version of the tool. |
| `input` | string | | | Path to the input image file or directory. |
| `-o, --output` | string | text, json, metadata, filename | | Specify the output type. |
| `-a, --append` | string | | | Append a string to the caption output. |
| `-t, --token` | integer | | 32 | Maximum token length for captioning. |
| `-b, --batch` | integer | | 1 | Batch size for captioning. |
| `-p, --prompt` | string | | | Prompt for captioning. |
| `--temp, --temperature` | float | | 1.0 | Temperature for captioning. |
| `--seed` | integer | | | Seed for reproducibility. |
| `--large` | flag | | | Use the large model for captioning. |
| `--cpu` | flag | | | Use the CPU instead of the GPU (not recommended). |
| `--blip2` | flag | | | Use the BLIP-2 model for captioning. |
| `--verbose` | flag | | | Print verbose output. |
| `--debug` | flag | | | Print debug output. |
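To drive the captioner from a Python script, one option is to shell out to the CLI. This is only a sketch: `build_cmd` is a hypothetical helper, but every flag it emits comes from the table above.

```python
import subprocess

def build_cmd(path, output="text", batch=1, seed=None, blip2=False):
    """Assemble a `caption` invocation from the documented flags."""
    cmd = ["caption", path, "-o", output, "-b", str(batch)]
    if seed is not None:
        cmd += ["--seed", str(seed)]  # fixed seed for reproducible captions
    if blip2:
        cmd.append("--blip2")  # use the BLIP-2 model instead of BLIP
    return cmd

# Caption a directory as JSON in batches of 4, reproducibly, with BLIP-2.
cmd = build_cmd("images/", output="json", batch=4, seed=42, blip2=True)
print(" ".join(cmd))
# → caption images/ -o json -b 4 --seed 42 --blip2
# subprocess.run(cmd, check=True)  # uncomment once zz-image-caption is installed
```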
You can also view this list in the terminal by running:

```
caption --help
```