Local AI image canvas for prompt-to-image generation, reference-image generation, and multi-step Agent planning. It combines tldraw, Hono, SQLite, and GPT Image 2 into a local-first creative workspace.
- Create and arrange AI-generated images on a tldraw canvas.
- Generate from text prompts or use selected canvas images as references.
- Save project state, generation history, and generated assets locally.
- Configure image providers from .env, the in-app provider dialog, or Codex login.
- Plan multi-image work in the Agent tab, then execute DAG-based generation jobs around a plan node.
- Optionally back up new generated images to Tencent Cloud COS or Cloudflare R2 / S3-compatible storage.
- Browse local outputs in Gallery, including rerun, locate, download, and upload status.
- Node.js 24.15.0. The repo includes .nvmrc and .node-version.
- pnpm 9.14.2. The version is pinned in package.json.
- An OpenAI API key with access to gpt-image-2, an OpenAI-compatible image endpoint, or a Codex login completed inside the app.
- Docker Desktop or a compatible Docker Engine, only if you want the Docker workflow.
Activate the pinned package manager with Corepack if needed:
corepack prepare pnpm@9.14.2 --activate
Windows PowerShell:
pnpm install
Copy-Item .env.example .env
pnpm dev
macOS/Linux:
pnpm install
cp .env.example .env
pnpm dev
Open the web app at http://localhost:5173.
pnpm dev starts both local services:
- API: http://127.0.0.1:8787
- Web: http://localhost:5173, proxying /api to the API service
The app can start without credentials. Without a usable provider, / shows the credential-aware homepage and generation requests return missing_provider until you configure one.
The default provider order is:
- Environment OpenAI-compatible config from .env or runtime variables.
- Local OpenAI-compatible config saved in the app.
- Codex login fallback.
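The provider order above, together with the missing_provider result returned when nothing is configured, can be pictured as a simple precedence check. This is a minimal sketch only: the type names, field names, and the `resolveProvider` function are hypothetical and are not taken from the project's code.

```typescript
// Hypothetical shapes; the real project may model providers differently.
type ProviderSource = "env" | "local" | "codex";

interface ProviderCandidates {
  envApiKey?: string; // e.g. OPENAI_API_KEY from .env or runtime variables
  localApiKey?: string; // key saved through the in-app provider dialog
  codexLoggedIn: boolean; // whether a Codex login has been completed
}

type Resolution =
  | { ok: true; source: ProviderSource }
  | { ok: false; error: "missing_provider" };

// Walk the documented order: environment config, then local config, then Codex login.
function resolveProvider(c: ProviderCandidates): Resolution {
  const hasValue = (v: string | undefined): boolean => (v ?? "").trim() !== "";
  if (hasValue(c.envApiKey)) return { ok: true, source: "env" };
  if (hasValue(c.localApiKey)) return { ok: true, source: "local" };
  if (c.codexLoggedIn) return { ok: true, source: "codex" };
  return { ok: false, error: "missing_provider" };
}
```

With this shape, an environment key wins even when a local key and a Codex login also exist, which matches the read-only treatment of environment values in the provider dialog.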
For the simplest API-key setup, edit .env:
OPENAI_API_KEY=
OPENAI_BASE_URL=
OPENAI_IMAGE_MODEL=gpt-image-2
OPENAI_IMAGE_TIMEOUT_MS=1200000
CODEX_RESPONSES_MODEL=gpt-5.5
Leave OPENAI_BASE_URL empty for the official OpenAI API. Set it to an OpenAI-compatible /v1 endpoint when using another provider, and set OPENAI_IMAGE_MODEL if that endpoint expects a different image model name.
When using Codex login, CODEX_RESPONSES_MODEL controls the mainline Responses model for the ChatGPT OAuth bridge; OPENAI_IMAGE_MODEL remains the image-generation tool model.
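To illustrate how the .env values above might be interpreted, here is a small sketch that reads them from a plain object. The `readImageEnv` function and `ImageEnvConfig` type are hypothetical, and treating the .env.example values (gpt-image-2, 1200000) as fallbacks is an assumption; only the variable names and the "empty base URL means the official API" rule come from this README.

```typescript
interface ImageEnvConfig {
  apiKey: string | undefined;
  baseUrl: string;
  imageModel: string;
  timeoutMs: number;
}

// Official OpenAI API endpoint, used when OPENAI_BASE_URL is left empty.
const OFFICIAL_BASE_URL = "https://api.openai.com/v1";
// Assumed fallbacks, mirroring the values shown in the .env example.
const DEFAULT_IMAGE_MODEL = "gpt-image-2";
const DEFAULT_TIMEOUT_MS = 1200000;

function readImageEnv(env: Record<string, string | undefined>): ImageEnvConfig {
  // Treat unset and whitespace-only values the same way.
  const clean = (v: string | undefined): string | undefined => {
    const trimmed = (v ?? "").trim();
    return trimmed === "" ? undefined : trimmed;
  };
  const timeout = Number(clean(env["OPENAI_IMAGE_TIMEOUT_MS"]) ?? DEFAULT_TIMEOUT_MS);
  return {
    apiKey: clean(env["OPENAI_API_KEY"]),
    // Strip trailing slashes so request paths can be appended consistently.
    baseUrl: (clean(env["OPENAI_BASE_URL"]) ?? OFFICIAL_BASE_URL).replace(/\/+$/, ""),
    imageModel: clean(env["OPENAI_IMAGE_MODEL"]) ?? DEFAULT_IMAGE_MODEL,
    timeoutMs: Number.isFinite(timeout) && timeout > 0 ? timeout : DEFAULT_TIMEOUT_MS,
  };
}
```

Passing the environment in as an argument keeps the function easy to test; in a Node process the caller would typically hand it `process.env`.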
You can also open the top-right 配置 (Configuration) dialog and save one local OpenAI-compatible provider. Local keys are stored in SQLite under DATA_DIR, returned only as masked values, and preserved until you enter a replacement key.
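The masking and "keep until replaced" behavior for saved keys can be sketched as two small helpers. Both functions are hypothetical, and the exact mask format the app returns is not documented here, so the one below (first three and last four characters visible) is purely an assumption.

```typescript
// Hide the middle of a secret; very short secrets are fully masked.
// The visible prefix/suffix lengths are an assumed format, not the app's.
function maskSecret(secret: string): string {
  if (secret.length <= 8) return "*".repeat(secret.length);
  return `${secret.slice(0, 3)}${"*".repeat(secret.length - 7)}${secret.slice(-4)}`;
}

// Keep the stored key unless the dialog submits a non-empty replacement.
function nextStoredKey(stored: string, submitted: string | undefined): string {
  const trimmed = (submitted ?? "").trim();
  return trimmed === "" ? stored : trimmed;
}
```

For example, `maskSecret("sk-abcdef123456")` yields `sk-********3456`, and saving the dialog with an empty key field leaves the stored key untouched.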
- / is the credential-aware homepage. It offers Codex 登录 (Codex login) and 接入 API (Connect API) when no provider is available.
- /canvas is the working canvas. Without a provider, it redirects back to /.
- /pool is the bundled Prompt Pool for browsing, searching, favoriting, copying, and reusing curated prompts.
- /gallery remains available even without credentials, so local work can still be viewed.
Environment values are read-only in the provider dialog. If you change .env, restart the API or Docker container.
The right-side panel has two main flows:
- Manual: enter a prompt, choose size/quality/format, and generate. Selecting one image shape switches the flow into reference-image generation.
- Agent: describe a multi-image task, optionally select up to three canvas images, review the generated plan node, then execute it.
Agent planning uses a separate OpenAI-compatible chat configuration from the image provider. Save it in the Agent LLM settings with API key, Base URL, model, timeout, and supportsVision.
When supportsVision is enabled, selected images are attached to the planning request as multimodal inputs. When disabled, selected images are passed only as reference handles for later image generation. Agent messages are not persisted in this version; plan nodes already on the canvas are saved with the normal canvas snapshot.
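The effect of supportsVision on a planning request can be sketched as follows. The content-part shape follows the OpenAI-compatible chat format (`text` and `image_url` parts); everything else, including `SelectedImage`, `buildPlanningContent`, and the wording used to pass handles, is a hypothetical illustration rather than the app's actual request builder.

```typescript
interface SelectedImage {
  assetId: string; // reference handle reused later for image generation
  dataUrl: string; // image payload, e.g. a data: URL
}

type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

const MAX_SELECTED_IMAGES = 3; // the Agent flow accepts up to three canvas images

function buildPlanningContent(
  prompt: string,
  images: SelectedImage[],
  supportsVision: boolean,
): ContentPart[] {
  const selected = images.slice(0, MAX_SELECTED_IMAGES);
  if (supportsVision) {
    // Vision-capable model: attach the images themselves as multimodal inputs.
    return [
      { type: "text", text: prompt },
      ...selected.map((img) => ({ type: "image_url" as const, image_url: { url: img.dataUrl } })),
    ];
  }
  // Text-only model: mention only the handles so later jobs can reference them.
  if (selected.length === 0) return [{ type: "text", text: prompt }];
  const handles = selected.map((img) => img.assetId).join(", ");
  return [{ type: "text", text: `${prompt}\n\nReference image handles: ${handles}` }];
}
```

In the text-only branch the planner never sees pixel data, which is why reducing the number or size of selected images only matters when supportsVision is enabled.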
Plan execution is DAG-based. Independent jobs can run in parallel, jobs that reference generated outputs wait for their dependencies, and Retry failed reruns failed or blocked jobs while keeping successful upstream outputs. A single plan is capped at 16 generated images, including intermediate anchors.
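The scheduling rules described above can be sketched by grouping jobs into "waves": each wave contains jobs whose dependencies are all satisfied, so the jobs in one wave could run in parallel. This is a simplified illustration under stated assumptions, not the project's executor. `PlanJob`, `planWaves`, and `resetForRetry` are hypothetical names, and the sketch assumes one job produces one generated image.

```typescript
type JobStatus = "pending" | "succeeded" | "failed" | "blocked";

interface PlanJob {
  id: string;
  dependsOn: string[]; // ids of jobs whose generated outputs this job references
  status: JobStatus;
}

const MAX_PLAN_IMAGES = 16; // cap from the text, including intermediate anchors

// Group unfinished jobs into waves; jobs inside one wave are independent of each other.
function planWaves(jobs: PlanJob[]): string[][] {
  if (jobs.length > MAX_PLAN_IMAGES) {
    throw new Error(`A plan is capped at ${MAX_PLAN_IMAGES} generated images`);
  }
  // Already successful jobs count as satisfied dependencies and are not rerun.
  const done = new Set(jobs.filter((j) => j.status === "succeeded").map((j) => j.id));
  let remaining = jobs.filter((j) => j.status !== "succeeded");
  const waves: string[][] = [];
  while (remaining.length > 0) {
    const ready = remaining.filter((j) => j.dependsOn.every((d) => done.has(d)));
    if (ready.length === 0) throw new Error("Plan has a cycle or an unknown dependency");
    waves.push(ready.map((j) => j.id));
    for (const j of ready) done.add(j.id);
    remaining = remaining.filter((j) => !done.has(j.id));
  }
  return waves;
}

// "Retry failed": failed or blocked jobs become pending again; successes are kept.
function resetForRetry(jobs: PlanJob[]): PlanJob[] {
  return jobs.map((j): PlanJob =>
    j.status === "failed" || j.status === "blocked" ? { ...j, status: "pending" } : j,
  );
}
```

For a plan where `c` depends on `a` and `b`, and `d` depends on `c`, a fresh run is scheduled as `[["a", "b"], ["c"], ["d"]]`. If `c` then fails and `d` is blocked, calling `resetForRetry` followed by `planWaves` schedules only `[["c"], ["d"]]`.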
Generated images are always saved locally first. If Tencent Cloud COS or Cloudflare R2 / S3-compatible storage is enabled from the in-app cloud storage dialog, new images are also uploaded to:
<key-prefix>/YYYY/MM/<assetId>.<ext>
The COS fields are prefilled from:
- COS_DEFAULT_BUCKET
- COS_DEFAULT_REGION
- COS_DEFAULT_KEY_PREFIX
The S3/R2 fields are prefilled from:
- S3_DEFAULT_BUCKET
- S3_DEFAULT_REGION
- S3_DEFAULT_KEY_PREFIX
- R2_DEFAULT_ACCOUNT_ID
- S3_DEFAULT_ENDPOINT
Saving cloud storage settings performs a test upload and delete before the config is persisted. Provider secrets are stored in local SQLite and only returned as masked values. Cloud upload failures do not fail image generation; the image remains available locally and the history item shows the upload failure.
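The cloud object key layout for backed-up images (key prefix, then year, month, asset ID, and extension) can be sketched as a small helper. `buildObjectKey` is a hypothetical function, and using UTC for the year and month is an assumption; the README does not state which time zone the app uses.

```typescript
// Build "<key-prefix>/YYYY/MM/<assetId>.<ext>" for a generated asset.
function buildObjectKey(keyPrefix: string, assetId: string, ext: string, createdAt: Date): string {
  // Normalize the prefix so "canvas", "canvas/", and "/canvas/" behave the same.
  const prefix = keyPrefix.replace(/^\/+|\/+$/g, "");
  const yyyy = String(createdAt.getUTCFullYear());
  const mm = String(createdAt.getUTCMonth() + 1).padStart(2, "0"); // months are 0-based
  const cleanExt = ext.replace(/^\./, "");
  const key = `${yyyy}/${mm}/${assetId}.${cleanExt}`;
  return prefix === "" ? key : `${prefix}/${key}`;
}
```

For example, `buildObjectKey("canvas/", "abc123", "png", new Date(Date.UTC(2025, 0, 15)))` returns `canvas/2025/01/abc123.png`.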
apps/api Hono API, SQLite storage, provider selection, Agent planning/execution
apps/web Vite + React + tldraw web app
packages/shared Shared contracts and constants
docs Project docs and preview assets
data Local runtime data, ignored by Git
| Command | Description |
|---|---|
| pnpm dev | Start API and web dev servers. |
| pnpm api:dev | Start the API dev workflow. |
| pnpm web:dev | Start the Vite web dev workflow. |
| pnpm typecheck | Typecheck shared, web, and API packages. |
| pnpm build | Build shared, web, and API packages. |
| pnpm start | Start the built API package. |
| pnpm --filter @gpt-image-canvas/api smoke:planner | Check Agent plan validation fixtures. |
| pnpm --filter @gpt-image-canvas/api smoke:agent | Check Agent config and WebSocket basics. |
| pnpm --filter @gpt-image-canvas/api smoke:executor | Check Agent DAG execution with a fake image provider. |
Before completing code changes, run:
pnpm typecheck
pnpm build
For UI changes, run pnpm dev and verify the Vite app in a browser at http://localhost:5173.
If better-sqlite3 reports a NODE_MODULE_VERSION mismatch after switching Node versions, rebuild it:
pnpm --filter @gpt-image-canvas/api rebuild better-sqlite3 --stream
Docker Compose builds shared contracts, the web app, Prompt Pool JSON data, and the API into one image. The API serves both /api and the built web bundle from one localhost port. SQLite data and generated assets persist in host ./data.
Windows PowerShell:
Copy-Item .env.example .env
docker compose config --quiet --no-env-resolution
docker compose up --build
macOS/Linux:
cp .env.example .env
docker compose config --quiet --no-env-resolution
docker compose up --build
Open http://localhost:8787 by default. Set PORT in .env before starting Compose to use a different localhost port.
The /pool route reads bundled JSON from prompt-pool-data/prompts-all.json and optionally prompt-pool-data/summary.json. Images are not bundled, mounted, or copied; card images use GitHub raw URLs from the JSON. Advanced users can set PROMPT_POOL_DIR to another directory containing prompts-all.json.
Use docker compose config --quiet --no-env-resolution when real credentials exist. Plain docker compose config expands env files and can print secrets.
Compose defaults SQLITE_JOURNAL_MODE=DELETE and SQLITE_LOCKING_MODE=EXCLUSIVE to avoid SQLite shared-memory errors on Docker Desktop bind mounts. Avoid running pnpm dev and Docker against the same data/ directory at the same time.
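One way such settings could be turned into SQLite statements is sketched below. `sqlitePragmas` is a hypothetical helper, and how the API actually applies these variables is not described in this README. `PRAGMA journal_mode` and `PRAGMA locking_mode` are standard SQLite pragmas, and the value lists match the modes SQLite documents for them.

```typescript
const JOURNAL_MODES = ["DELETE", "TRUNCATE", "PERSIST", "MEMORY", "WAL", "OFF"] as const;
const LOCKING_MODES = ["NORMAL", "EXCLUSIVE"] as const;

// Turn the two environment variables into PRAGMA statements.
// Values outside the allow-lists are ignored rather than interpolated into SQL.
function sqlitePragmas(env: Record<string, string | undefined>): string[] {
  const journal = (env["SQLITE_JOURNAL_MODE"] ?? "").trim().toUpperCase();
  const locking = (env["SQLITE_LOCKING_MODE"] ?? "").trim().toUpperCase();
  const pragmas: string[] = [];
  // Assumption: locking mode goes first, so exclusive locking is already in
  // effect before the journal mode is touched.
  if ((LOCKING_MODES as readonly string[]).includes(locking)) {
    pragmas.push(`PRAGMA locking_mode = ${locking}`);
  }
  if ((JOURNAL_MODES as readonly string[]).includes(journal)) {
    pragmas.push(`PRAGMA journal_mode = ${journal}`);
  }
  return pragmas;
}
```

With the Compose defaults this produces `PRAGMA locking_mode = EXCLUSIVE` followed by `PRAGMA journal_mode = DELETE`. The statements could then be executed by whichever SQLite driver the API uses.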
Release publishing pushes a multi-platform image to GHCR, so upgrades can pull the repository image instead of rebuilding locally:
docker compose -f docker-compose.ghcr.yml pull
docker compose -f docker-compose.ghcr.yml up -d
The default image is ghcr.io/mrslimslim/gpt-image-canvas:latest. To pin a release, set IMAGE before running Compose, for example ghcr.io/mrslimslim/gpt-image-canvas:v0.4.0.
Release tags are published as vX.Y.Z, X.Y.Z, and X.Y; non-prerelease GitHub Releases also update latest. Public GHCR packages can be pulled anonymously. If GitHub shows the package as private, run docker login ghcr.io or make the package public in the repository package settings.
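As a worked example of the tag scheme, the helper below derives the image tags for one release. `releaseTags` is a hypothetical function written for illustration; the actual tags are produced by the repository's release publishing, which is not shown here.

```typescript
// Derive vX.Y.Z, X.Y.Z, and X.Y for a release; add "latest" for non-prereleases.
function releaseTags(version: string, prerelease: boolean): string[] {
  const match = /^v?(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (!match) throw new Error(`Not an X.Y.Z version: ${version}`);
  const [, major, minor, patch] = match;
  const tags = [`v${major}.${minor}.${patch}`, `${major}.${minor}.${patch}`, `${major}.${minor}`];
  if (!prerelease) tags.push("latest");
  return tags;
}
```

For a non-prerelease `v0.4.0` this yields `v0.4.0`, `0.4.0`, `0.4`, and `latest`, so pinning IMAGE to `:0.4` would follow later patch releases while `:v0.4.0` stays fixed.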
The Compose build accepts these network-related build args:
- NODE_IMAGE
- NPM_CONFIG_REGISTRY
- APT_MIRROR
- APT_SECURITY_MIRROR
The default NODE_IMAGE is node:24.15.0-bookworm-slim.
DATA_DIR defaults to ./data locally and /app/data in Docker. It contains:
- gpt-image-canvas.sqlite: project state, generation history, asset metadata, provider config, Agent LLM config, optional cloud storage config, and Codex OAuth token records.
- assets/: generated image files.
Do not commit .env, .ralph/, .codex-temp/, data/, generated images, SQLite databases, or build output.
Treat data/gpt-image-canvas.sqlite as sensitive after saving local provider keys, Agent LLM keys, cloud storage secrets, or Codex tokens. The app is designed for local workstation use; do not expose it publicly without adding your own authentication and network controls.
If a real API key was ever committed, rotate the key. Git ignore rules prevent future leaks, but they do not remove secrets from existing Git history.
- Missing provider: add OPENAI_API_KEY to .env and restart, save a local provider from 配置 (Configuration), or complete Codex 登录 (Codex login).
- Codex login fails: confirm the machine can reach https://auth.openai.com, keep the login dialog open, and restart the flow if the user code expires.
- Custom endpoint fails: confirm OPENAI_BASE_URL points to an OpenAI-compatible /v1 endpoint and supports the configured image model.
- Agent cannot plan: save the Agent LLM config separately from the image provider config. If supportsVision is enabled and the request fails, try fewer or smaller selected images.
- Agent plan cannot execute: confirm the normal image provider is configured; Agent planning and image generation use separate configs.
- Port conflict: set PORT for API/Docker. For web dev, stop the process on 5173 or run pnpm web:dev -- --port 5174.
- Docker cannot pull the base image: restore Docker Hub access or set NODE_IMAGE to an equivalent cached Node 24.15.0 image.
- Docker Prompt Pool is empty: rebuild the image so bundled prompt-pool-data/prompts-all.json is copied into the container; if overriding PROMPT_POOL_DIR, confirm it points to a directory containing prompts-all.json.
- SQLite SQLITE_IOERR_SHMOPEN in Docker: keep the Compose SQLite defaults, rebuild, and make sure no local API process is using the same database.
- SQLite SQLITE_CORRUPT: stop all app processes, back up data/, then restore from backup or remove the SQLite files to create a clean database. Files under data/assets/ can be kept.
- Stale local state: stop the app and remove files under data/. This deletes local project state, history, and generated assets.
Before upgrading an older local install, back up runtime data:
Windows PowerShell:
Copy-Item -Recurse data data-backup-before-upgrade
docker compose up --build
macOS/Linux:
cp -R data data-backup-before-upgrade
docker compose up --build
Rebuild the web app and API together after an upgrade.
Codex can work directly in this repository. Let it read AGENTS.md, then use the pinned package manager:
pnpm install
pnpm typecheck
pnpm build
Keep credentials out of prompts and logs. For Ralph-driven work, read docs/ralph-execution.md; keep PRDs under .agents/tasks/, runtime state under .ralph/, and scratch files under .codex-temp/.
MIT
