image-gen

Unified image generation skill that dispatches to the right Cloubic-hosted model — OpenAI gpt-image-2 or Gemini gemini-3-pro-image-preview / gemini-3.1-flash-image-preview — through one parameterised script. Handles text-to-image and image-to-image. Auto-resizes oversized reference images so Cloudflare doesn't 524 on you.

Built as the image-generation backend for skill ecosystems (e.g. lovart-skills) — callers produce a Brief, image-gen turns it into a file on disk.

Install

git clone https://github.com/motiful/image-gen.git
cd image-gen
bash scripts/setup.sh

setup.sh checks python3 + pip, installs requests / Pillow / playwright (+ chromium), and reminds you to set the API key. Idempotent.

API key

Pick one — CLOUBIC_API_KEY env var wins if both exist:

export CLOUBIC_API_KEY='sk-xxxx'
# or
echo 'sk-xxxx' > ~/.cloubic_api_key && chmod 600 ~/.cloubic_api_key

The key is never written to logs or stdout; it is sent only in the Authorization header.
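The lookup order above can be sketched as follows. This is a minimal illustration, not the code in scripts/generate.py: the function name `resolve_api_key` and its parameters are hypothetical, and only the precedence rule (env var first, then ~/.cloubic_api_key) comes from this README.

```python
import os
from pathlib import Path


def resolve_api_key(env=None, key_file=None):
    """Return the API key, preferring CLOUBIC_API_KEY over the key file.

    Hypothetical helper: `env` and `key_file` are injectable so the
    precedence rule can be exercised without touching the real environment.
    """
    env = os.environ if env is None else env
    key_file = Path.home() / ".cloubic_api_key" if key_file is None else Path(key_file)

    # 1. The environment variable wins when both sources exist.
    key = env.get("CLOUBIC_API_KEY", "").strip()
    if key:
        return key

    # 2. Fall back to the key file; strip the newline that `echo` appends.
    if key_file.is_file():
        key = key_file.read_text(encoding="utf-8").strip()
        if key:
            return key

    # Deliberately never include key material in the error message.
    raise RuntimeError("No API key: set CLOUBIC_API_KEY or create ~/.cloubic_api_key")
```

Stripping whitespace matters for the file variant, because `echo 'sk-xxxx' > file` leaves a trailing newline that would otherwise end up in the header value.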

Use it (CLI)

# text-to-image — defaults to gemini-3-pro-image-preview
python3 scripts/generate.py --prompt "a samurai cat in neon Tokyo" --aspect-ratio 16:9

# image-to-image with a reference image
python3 scripts/generate.py \
    --model gpt-image-2 \
    --prompt "put this product on a marble countertop, soft window light" \
    --reference-image ./bottle.png \
    --output ./hero.png

Output path is printed on success along with elapsed time, file size, and token usage. Defaults: gemini-3-pro-image-preview, auto-generated timestamped filename in the current directory.

Use it (as a skill)

Drop image-gen/ next to your other skills. When another skill needs an image, it invokes scripts/generate.py with the prompt it built. The contract is generate_image(prompt, model, [aspect_ratio], [reference_images], [output_path]) → file_path — see SKILL.md for the full Engagement Principles and Execution Procedure.
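A calling skill could wrap that contract roughly like this. It is a sketch under stated assumptions: `build_command` is a hypothetical helper, only the flags shown in the CLI examples above are used, and repeating `--reference-image` for several references is an assumption to check against references/parameter-spec.md. The caller passes an explicit output path so it does not have to parse stdout, whose exact format this README does not specify.

```python
import subprocess
import sys


def build_command(prompt, model=None, aspect_ratio=None, reference_images=(),
                  output_path=None, script="scripts/generate.py"):
    """Build the argv for scripts/generate.py from the contract's arguments."""
    cmd = [sys.executable, script, "--prompt", prompt]
    if model:
        cmd += ["--model", model]
    if aspect_ratio:
        cmd += ["--aspect-ratio", aspect_ratio]
    for ref in reference_images:
        # Assumption: the flag may be repeated, once per reference image.
        cmd += ["--reference-image", str(ref)]
    if output_path:
        cmd += ["--output", str(output_path)]
    return cmd


def generate_image(prompt, output_path, **kwargs):
    """Run the script and return the file path the caller asked for."""
    cmd = build_command(prompt, output_path=output_path, **kwargs)
    # check=True raises CalledProcessError on a non-zero exit,
    # leaving the retry decision with the caller.
    subprocess.run(cmd, check=True)
    return output_path
```

Omitting `model` leaves the choice to the script's own default rather than duplicating it in the caller.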

What's inside

SKILL.md                       # skill contract — read this first
references/
  model-selection.md           # model family routing + which one to pick
  parameter-spec.md            # argument schema + caller defaults
  reference-image-handling.md  # auto-resize rules (the 524 fix)
  error-handling.md            # HTTP/response error mapping
scripts/
  generate.py                  # the parameterised entry point
  setup.sh                     # dependency installer
  url_screenshot.py            # auxiliary: capture a URL → PNG for i2i input

Design notes

Auto-detect model family. Caller passes --model; skill resolves OpenAI vs Gemini endpoint shape. No "which API is this" questions.
Reference images get auto-resized to ≤1024px / ≤1MB. A 2.5MB PNG + base64 expansion + Gemini processing reliably triggers HTTP 524 at Cloudflare's ~100s edge.
Negative constraints are appended as "Avoid: …", not sent as a separate parameter. No model on Cloubic exposes a true negative_prompt.
Fail loud. A timeout, malformed response, or missing image data exits non-zero with a diagnostic. Caller decides retry strategy.
Cost is logged on every call — gpt-image-2 ≈ ¥0.35/image, gemini-3-pro-image-preview ≈ ¥1/image, gemini-3.1-flash-image-preview ≈ ¥0.20/image.
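Two of the notes above, family detection and the "Avoid: …" suffix, can be illustrated with a short sketch. The function names and the prefix-matching rule are assumptions made for illustration, as is the exact separator and punctuation of the suffix; references/model-selection.md and references/parameter-spec.md describe the actual behaviour.

```python
def model_family(model):
    """Map a model name to an API family by prefix (assumed rule)."""
    name = model.lower()
    if name.startswith("gpt-image"):
        return "openai"
    if name.startswith("gemini"):
        return "gemini"
    # Fail loud rather than guess an endpoint shape.
    raise ValueError(f"unknown model family for {model!r}")


def with_negative_constraints(prompt, avoid=()):
    """Fold negative constraints into the prompt text as an 'Avoid:' clause."""
    items = [a.strip() for a in avoid if a.strip()]
    if not items:
        return prompt
    return f"{prompt.rstrip()} Avoid: {', '.join(items)}."


print(model_family("gemini-3.1-flash-image-preview"))
print(with_negative_constraints("a samurai cat in neon Tokyo", ["text", "watermark"]))
```

Keeping the constraints in the prompt means the same request body works for both families, since neither needs a dedicated negative-prompt field.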

See SKILL.md §Engagement Principles for the complete list.
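The arithmetic behind the reference-image note is worth making concrete. The sketch below assumes the 1024px limit applies to the longest side and preserves aspect ratio. Both helper names are hypothetical, and the ≤1MB byte cap is not covered, since meeting it requires re-encoding with an image library such as Pillow.

```python
import base64


def fit_within(width, height, max_side=1024):
    """Scale (width, height) down so the longest side is at most max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # already small enough, never upscale

    def scale(value):
        # Integer rounding avoids float drift; clamp to at least 1 pixel.
        return max(1, (value * max_side + longest // 2) // longest)

    return scale(width), scale(height)


def base64_len(n_bytes):
    """Length of the padded base64 encoding of n_bytes of data."""
    return 4 * ((n_bytes + 2) // 3)


print(fit_within(4000, 3000))                # (1024, 768)
print(base64_len(2_500_000))                 # 3333336
print(len(base64.b64encode(bytes(10))))      # 16
```

A 2.5MB PNG therefore becomes about 3.3MB of request body after base64 encoding, a 4/3 expansion before any model processing time is spent.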

Provenance

Scaffolded and audited via skill-forge; it passes the 5 must-fix structure findings (including bidirectional EP contract, frontmatter on references, repo-root README + LICENSE).

License

MIT — see LICENSE.
