Fix/generate-image by makiroll1125 · Pull Request #285 · CraftOS-dev/CraftBot

makiroll1125 · 2026-05-26T08:33:02Z

What

Add OpenAI Image 2.0 image generation functionality
Change Gemini image gen model to Nano banana 2

Why

Updated CraftBot to use state-of-the-art image gen models
Closes #219

… generation

…e provided

ahmad-ajmal · 2026-05-27T00:42:51Z

Requesting changes - architectural concern + a few correctness issues

Main blocker: provider branching duplicates infrastructure we already have

This change hardcodes if provider == "openai" / elif "gemini" inside the action, with two SDK imports, two client inits, and two error-mapping blocks. We already have provider abstraction elsewhere in the repo:

MODEL_REGISTRY + InterfaceType enum (used by VLM/LLM)

LLMInterface / VLMInterface wrappers in app/llm_interface.py and app/vlm_interface.py that hide the provider-specific SDK calls

describe_image.py (lines 62–70) is the reference pattern: read the configured provider from MODEL_REGISTRY[provider][InterfaceType.VLM], then delegate

As-is this PR builds a third parallel provider system that future image providers (Stability, Replicate, xAI, OpenRouter image, etc.) will all have to extend by adding another elif branch here. It also introduces a new image_generation.preferred_provider setting that parallels the existing vlm_provider / llm_provider pattern instead of joining it.

Could we route this through MODEL_REGISTRY with a new InterfaceType.IMAGE_GEN and an ImageGenInterface in agent_core, mirroring how VLMInterface is set up, so generate_image.py ends up looking like describe_image.py? Reusing InterfaceType.VLM directly is tempting since some providers serve both through one endpoint, but the capability sets differ (Claude / ByteDance support VLM but not gen) and users will want to pick providers independently for each.

Other issues worth fixing while you're in here

OpenAI aspect-ratio map is wrong. "16:9": "1536x1024" is 3:2, "9:16": "1024x1536" is 2:3. The canvas constraint is real (gpt-image only has 3 sizes), but silently mismapping → at least append to warnings. (I skimmed real quick so please verify)

Silent 4K downgrade for OpenAI. "4K": "high" returns at most 1536×1024. Either reject 4K for OpenAI or warn.

quality dropped on the edit path. images.generate(..., quality=...) is passed, but images.edit(...) isn't - reference-image runs silently render at lower quality.

images.edit ≠ "style reference." The existing reference_images field is documented as style guidance (how Gemini uses them). OpenAI's images.edit treats inputs as compositional/mask inputs. Same input, very different output between providers.

Provider-selection UX doesn't match the PR description. The description says "asks the user" when both keys are present, but the code silently defaults to Gemini - there's no signal in the response telling the calling LLM that a choice is available. Once provider_preference is saved, there's also no way to clear it.

Happy to pair on the ImageGenInterface refactor if it'd help.

@zfoong What do you think about this change? Worth the effort?

ahmad-ajmal · 2026-05-27T00:45:18Z

-            error_message = f"Content blocked by safety filters: {error_message}. Try modifying your prompt."
-        elif "not found" in error_message.lower() or "404" in error_message:
-            error_message = f"Model not available: {error_message}. The gemini-3-pro-image-preview model may not be accessible with your API key. Try using Google AI Studio to verify access."
+        if provider == "gemini":


Follow-up: extract these error strings to a message catalog

Even setting aside this PR, both error-mapping blocks (Gemini + OpenAI) are the same five patterns repeated with provider-specific text spliced in:

Template key Triggers (substrings) Placeholders

provider_rate_limit quota, rate, insufficient_quota, billing {provider_label}, {error}

provider_invalid_key invalid + key, invalid_api_key {provider_label}, {error}

provider_access_denied permission, access {provider_label}, {model}, {error}

provider_safety_block safety, blocked, content_policy {provider_label}, {error}

provider_model_not_found not found, 404 {provider_label}, {model}, {help_url}, {error}

Suggested shape - anchors on the existing get_os_language() setting in app/config.py:341, which currently isn't wired to anything:

app/i18n/errors.en.json - flat catalog of templates with {placeholder} substitution

app/i18n/errors.<lang>.json - locale overrides; missing keys fall back to en

app/i18n/__init__.py exposing two helpers:

t(key, **kwargs) -> str - formatted lookup with locale + en fallback

classify_provider_error(exc, *, provider, model) -> str - does the substring matching against a triggers table in the catalog and returns the formatted message in one call

The error block in this PR collapses to roughly:

except Exception as e: return { "status": "error", ..., "message": classify_provider_error(e, provider=provider, model=model_id), }

Benefits:

one mapping table instead of two near-identical blocks per provider

localization-ready without touching call sites

adding a new provider = adding a row to the catalog, not another if/elif ladder

avoids writing a bespoke message for every case - templates cover ~90% of them

i think with the help of claude we can implement this pretty quick and easy

@zfoong What do you think about this too?
Happy to file this as a separate issue once we figure out which agent_core module owns it (since the same templates will be useful for the LLMInterface / VLMInterface paths too).

makiroll1125 added 2 commits May 26, 2026 14:45

fix: add support for OpenAI Images 2.0 and Gemini Nano Banana 2 image…

6d0bc89

… generation

fix: Gemini is default provider if both OpenAI and Gemini API keys ar…

d326f70

…e provided

makiroll1125 requested review from ahmad-ajmal and zfoong May 26, 2026 08:33

makiroll1125 self-assigned this May 26, 2026

ahmad-ajmal reviewed May 27, 2026

View reviewed changes

ahmad-ajmal requested changes May 27, 2026

View reviewed changes

zfoong merged commit f8f888d into V1.3.2 May 28, 2026

makiroll1125 deleted the fix-generate-image branch May 28, 2026 02:53

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix/generate-image#285

Fix/generate-image#285
zfoong merged 2 commits into
V1.3.2from
fix-generate-image

makiroll1125 commented May 26, 2026 •

edited

Loading

Uh oh!

ahmad-ajmal May 27, 2026

Uh oh!

ahmad-ajmal May 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Template key	Triggers (substrings)	Placeholders
`provider_rate_limit`	`quota`, `rate`, `insufficient_quota`, `billing`	`{provider_label}`, `{error}`
`provider_invalid_key`	`invalid` + `key`, `invalid_api_key`	`{provider_label}`, `{error}`
`provider_access_denied`	`permission`, `access`	`{provider_label}`, `{model}`, `{error}`
`provider_safety_block`	`safety`, `blocked`, `content_policy`	`{provider_label}`, `{error}`
`provider_model_not_found`	`not found`, `404`	`{provider_label}`, `{model}`, `{help_url}`, `{error}`

Conversation

makiroll1125 commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What

Why

Uh oh!

ahmad-ajmal May 27, 2026

Choose a reason for hiding this comment

Requesting changes - architectural concern + a few correctness issues

Main blocker: provider branching duplicates infrastructure we already have

Other issues worth fixing while you're in here

Uh oh!

ahmad-ajmal May 27, 2026

Choose a reason for hiding this comment

Follow-up: extract these error strings to a message catalog

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

makiroll1125 commented May 26, 2026 •

edited

Loading