Image to Prompt is a minimal local web app that turns an uploaded image into an editable Ideogram 4 JSON prompt.
Florence-2 drafts the scene, object boxes, region captions, and OCR regions. In the UI you can drag, resize, rename, duplicate, delete, hide, and add zones, then optionally add a structured style_description before copying or exporting the JSON.
You can queue multiple images at once: drop several files (or a selection) and they analyze sequentially while you edit. A thumbnail rail above the workspace shows each image's status (queued, analyzing, done, failed), and Export all downloads every finished prompt as a zip of JSON files named after the source images.
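The queue behavior described above can be pictured as a small state machine. What follows is a minimal Python sketch, not the app's actual code (the real queue presumably lives in the browser UI). `QueueItem`, `run_queue`, `retry`, and the `analyze` callback are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class QueueItem:
    name: str
    status: str = "queued"  # queued -> analyzing -> done | failed
    result: Optional[dict] = None
    error: Optional[str] = None


def run_queue(items: list[QueueItem], analyze: Callable[[str], dict]) -> None:
    """Process queued items one at a time, recording success or failure."""
    for item in items:
        if item.status != "queued":
            continue  # skip items that already finished or failed
        item.status = "analyzing"
        try:
            item.result = analyze(item.name)
            item.status = "done"
        except Exception as exc:  # one failed image must not stop the queue
            item.error = str(exc)
            item.status = "failed"


def retry(item: QueueItem) -> None:
    """Put a failed item back in the queue (hypothetical retry behavior)."""
    if item.status == "failed":
        item.status = "queued"
        item.error = None
```

Because a failure is recorded on the item rather than raised, later images still get analyzed, and a retried item is picked up by the next `run_queue` call.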
The easiest way to install and run Image to Prompt is with Pinokio.
To install manually instead, clone or download this repository, then install the Python dependencies from the app folder:
cd app
python -m venv env
source env/bin/activate
uv pip install -r requirements.txt
On Windows, activate the environment with:
.\env\Scripts\Activate.ps1
Install PyTorch for your platform. For most CPU and Apple Silicon setups:
uv pip install torch==2.7.0 torchvision==0.22.0 torchaudio==2.7.0 --index-url https://download.pytorch.org/whl/cpu --force-reinstall
For NVIDIA CUDA 12.8:
uv pip install torch==2.7.0 torchvision==0.22.0 torchaudio==2.7.0 --index-url https://download.pytorch.org/whl/cu128 --force-reinstall
Start the app:
PYTORCH_ENABLE_MPS_FALLBACK=1 TOKENIZERS_PARALLELISM=false FLORENCE_MODEL=microsoft/Florence-2-base-ft python app.py --host 127.0.0.1 --port 7860
On Windows PowerShell:
$env:PYTORCH_ENABLE_MPS_FALLBACK="1"; $env:TOKENIZERS_PARALLELISM="false"; $env:FLORENCE_MODEL="microsoft/Florence-2-base-ft"; python app.py --host 127.0.0.1 --port 7860
Then open http://127.0.0.1:7860.
To use a different Florence-2 model, replace microsoft/Florence-2-base-ft with another model id, such as microsoft/Florence-2-large-ft. The first run downloads model files from Hugging Face, so it may take longer than later launches.
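If you would rather not maintain separate shell and PowerShell commands, a small cross-platform launcher is one option. The sketch below only reuses the environment variables and flags shown in the start commands. `build_launch` and `launch` are hypothetical helper names, not part of the app.

```python
import os
import subprocess
import sys


def build_launch(model: str = "microsoft/Florence-2-base-ft",
                 host: str = "127.0.0.1",
                 port: int = 7860) -> tuple[dict, list]:
    """Return the environment and argument list for starting app.py."""
    env = dict(os.environ)  # copy so the current process is left untouched
    env["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"
    env["TOKENIZERS_PARALLELISM"] = "false"
    env["FLORENCE_MODEL"] = model
    # sys.executable keeps the launch inside the active virtual environment.
    cmd = [sys.executable, "app.py", "--host", host, "--port", str(port)]
    return env, cmd


def launch(model: str = "microsoft/Florence-2-base-ft") -> None:
    """Start the app and block until it exits; run this from the app folder."""
    env, cmd = build_launch(model)
    subprocess.run(cmd, env=env, check=True)
```

Calling `launch("microsoft/Florence-2-large-ft")` from the app folder would then be equivalent to the start command with the larger model id.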
- Drop one or more images into the center canvas or click Choose images (multi-select works). Pasting an image from the clipboard also works.
- Images analyze one at a time. The thumbnail rail at the top shows each image's status; you can keep editing while the rest of the queue processes in the background.
- Click a thumbnail (or use the left/right arrow keys) to switch between images. Each image keeps its own boxes, prompt fields, and JSON edits. A dot on a thumbnail marks unsaved edits, x removes an image, and ↻ retries a failed one.
- Drag or resize boxes directly on the image.
- Edit item labels, literal text, descriptions, and types in the left panel.
- Copy or export the JSON from the right panel. Export all in the rail header downloads a zip with one JSON file per analyzed image.
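The Export all behavior in the last step amounts to writing one JSON file per image into a zip archive. Here is a minimal Python sketch of that idea, not the app's own export code. `export_all` is a hypothetical name, and the `-2`, `-3` suffix for images that share a base name is an assumption about how collisions could be handled.

```python
import io
import json
import zipfile
from pathlib import Path


def export_all(prompts: dict[str, dict]) -> bytes:
    """Build a zip with one JSON file per image, named after the source image."""
    buffer = io.BytesIO()
    used: set[str] = set()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for image_name, prompt in prompts.items():
            stem = Path(image_name).stem  # "cat.png" -> "cat"
            name = f"{stem}.json"
            n = 2
            while name in used:  # assumed handling for cat.png vs cat.jpg
                name = f"{stem}-{n}.json"
                n += 1
            used.add(name)
            archive.writestr(name, json.dumps(prompt, indent=2, ensure_ascii=False))
    return buffer.getvalue()
```

The returned bytes can be written to disk with `Path("prompts.zip").write_bytes(...)`.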
By default the app omits style_description, because Ideogram 4 treats it as optional and Florence-2 does not reliably infer structured style fields. Select a style preset if you want the app to include a schema-valid style_description; those presets are app defaults, not official Ideogram presets.
The default model is microsoft/Florence-2-base-ft, chosen for low memory use. To use another Florence-2 model, set FLORENCE_MODEL to its model id when starting the app, for example microsoft/Florence-2-large-ft.
The app exposes a single analysis endpoint:
POST /api/analyze
Form field:
file: image file upload
Response:
image: uploaded image dimensions
model: model used
caption: generated scene caption
background: generated background description
palette: sampled color palette
elements: editable objects/text regions with normalized Ideogram bboxes
json: Ideogram 4 JSON prompt following the documented caption schema
JavaScript:
const form = new FormData()
form.append("file", fileInput.files[0])
const response = await fetch("http://127.0.0.1:7860/api/analyze", {
  method: "POST",
  body: form
})
const result = await response.json()
console.log(result.json)

Python:
import requests

with open("image.png", "rb") as image_file:
    response = requests.post(
        "http://127.0.0.1:7860/api/analyze",
        files={"file": image_file},
    )
print(response.json()["json"])

curl:
curl -X POST http://127.0.0.1:7860/api/analyze \
  -F "file=@image.png"
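For scripted use, a slightly more defensive client may be helpful. The sketch below checks the HTTP status and confirms that the documented response fields are present before returning. `analyze_image` and `missing_keys` are hypothetical helper names, the 300-second timeout is an arbitrary assumption, and whether the app signals failures with non-2xx status codes is not confirmed here.

```python
EXPECTED_KEYS = ("image", "model", "caption", "background",
                 "palette", "elements", "json")


def missing_keys(result: dict) -> list[str]:
    """Return the documented response fields that are absent from result."""
    return [key for key in EXPECTED_KEYS if key not in result]


def analyze_image(path: str,
                  base_url: str = "http://127.0.0.1:7860",
                  timeout: float = 300.0) -> dict:
    """Upload one image to /api/analyze and return the parsed response."""
    import requests  # third-party; imported here so missing_keys works without it

    with open(path, "rb") as image_file:
        response = requests.post(
            f"{base_url}/api/analyze",
            files={"file": image_file},
            timeout=timeout,  # assumed value; the first run may be slow
        )
    response.raise_for_status()  # raise on any 4xx/5xx status
    result = response.json()
    missing = missing_keys(result)
    if missing:
        raise ValueError(f"response is missing fields: {missing}")
    return result
```

With the app running, `analyze_image("image.png")["json"]` would return the same prompt that the shorter examples print.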