Image to Prompt

Image to Prompt is a minimal local web app that turns an uploaded image into editable Ideogram 4 JSON prompt.

Florence-2 drafts the scene, object boxes, region captions, and OCR regions. The UI lets you drag, resize, rename, duplicate, delete, hide, add zones, and optionally add a structured style_description before copying or exporting the JSON.

You can queue multiple images at once: drop several files (or a selection) and they analyze sequentially while you edit. A thumbnail rail above the workspace shows each image's status (queued, analyzing, done, failed), and Export all downloads every finished prompt as a zip of JSON files named after the source images.

Install

1. 1-click install with Pinokio

The easiest way to install and run Image to Prompt is with Pinokio.

2. Install manually

Clone or download this repository, then install the Python dependencies from the app folder:

cd app
python -m venv env
source env/bin/activate
uv pip install -r requirements.txt

On Windows, activate the environment with:

.\env\Scripts\Activate.ps1

Install PyTorch for your platform. For most CPU and Apple Silicon setups:

uv pip install torch==2.7.0 torchvision==0.22.0 torchaudio==2.7.0 --index-url https://download.pytorch.org/whl/cpu --force-reinstall

For NVIDIA CUDA 12.8:

uv pip install torch==2.7.0 torchvision==0.22.0 torchaudio==2.7.0 --index-url https://download.pytorch.org/whl/cu128 --force-reinstall

Start the app:

PYTORCH_ENABLE_MPS_FALLBACK=1 TOKENIZERS_PARALLELISM=false FLORENCE_MODEL=microsoft/Florence-2-base-ft python app.py --host 127.0.0.1 --port 7860

On Windows PowerShell:

$env:PYTORCH_ENABLE_MPS_FALLBACK="1"; $env:TOKENIZERS_PARALLELISM="false"; $env:FLORENCE_MODEL="microsoft/Florence-2-base-ft"; python app.py --host 127.0.0.1 --port 7860

Then open http://127.0.0.1:7860.

To use a different Florence-2 model, replace microsoft/Florence-2-base-ft with another model id, such as microsoft/Florence-2-large-ft. The first run downloads model files from Hugging Face, so it may take longer than later launches.

How to Use

Drop one or more images into the center canvas or click Choose images (multi-select works). Pasting an image from the clipboard also works.
Images analyze one at a time. The thumbnail rail at the top shows each image's status; you can keep editing while the rest of the queue processes in the background.
Click a thumbnail (or use the left/right arrow keys) to switch between images. Each image keeps its own boxes, prompt fields, and JSON edits. A dot on a thumbnail marks unsaved edits, x removes an image, and ↻ retries a failed one.
Drag or resize boxes directly on the image.
Edit item labels, literal text, descriptions, and types in the left panel.
Copy or export the JSON from the right panel. Export all in the rail header downloads a zip with one JSON file per analyzed image.

By default the app omits style_description, because Ideogram 4 treats it as optional and Florence-2 does not reliably infer structured style fields. Select a style preset if you want the app to include a schema-valid style_description; those presets are app defaults, not official Ideogram presets.

The default model is microsoft/Florence-2-base-ft for low memory use. To use another Florence-2 model, start with a model parameter, for example microsoft/Florence-2-large-ft.

API

The app exposes a single analysis endpoint:

POST /api/analyze

Form field:

file: image file upload

Response:

image: uploaded image dimensions
model: model used
caption: generated scene caption
background: generated background description
palette: sampled color palette
elements: editable objects/text regions with normalized Ideogram bboxes
json: Ideogram 4 JSON prompt following the documented caption schema

JavaScript

const form = new FormData()
form.append("file", fileInput.files[0])

const response = await fetch("http://127.0.0.1:7860/api/analyze", {
  method: "POST",
  body: form
})

const result = await response.json()
console.log(result.json)

Python

import requests

with open("image.png", "rb") as image_file:
    response = requests.post(
        "http://127.0.0.1:7860/api/analyze",
        files={"file": image_file},
    )

print(response.json()["json"])

Curl

curl -X POST http://127.0.0.1:7860/api/analyze \
  -F "file=@image.png"

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
app		app
.gitignore		.gitignore
README.md		README.md
icon.svg		icon.svg
install.js		install.js
pinokio.js		pinokio.js
pinokio.json		pinokio.json
poster.gif		poster.gif
preview.png		preview.png
reset.js		reset.js
start.js		start.js
torch.js		torch.js
update.js		update.js

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Image to Prompt

Install

1. 1-click install with Pinokio

2. Install manually

How to Use

API

JavaScript

Python

Curl

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Image to Prompt

Install

1. 1-click install with Pinokio

2. Install manually

How to Use

API

JavaScript

Python

Curl

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages