Skip to content

Add Google Gemma workflow block via OpenRouter#2274

Open
Erol444 wants to merge 3 commits intomainfrom
gemma-4
Open

Add Google Gemma workflow block via OpenRouter#2274
Erol444 wants to merge 3 commits intomainfrom
gemma-4

Conversation

@Erol444
Copy link
Copy Markdown
Contributor

@Erol444 Erol444 commented Apr 25, 2026

Tested locally. ATM user needs to pass openrouter api key manually, in the future we could udpate this so it uses apiproxy.
image

Summary

  • Adds a new roboflow_core/google_gemma@v1 workflow block that exposes Google's Gemma vision-language models (Gemma 4 31B and Gemma 4 26B A4B) through OpenRouter, using the same OpenAI-compatible client pattern as the existing Llama Vision block.
  • Supports the full VLM task surface (matches Google Gemini v3): unconstrained, OCR, structured answering, single/multi-label classification, VQA, short/long captioning, and unprompted object detection.
  • Requires a user-supplied OpenRouter API key — there is no apiproxy/openrouter proxy on the platform side yet, so rf_key:account is intentionally not wired in.
  • Adds 24 unit tests mirroring test_llama_3_2_vision.py, including mocked happy/error paths for the choices=None failure mode that OpenRouter occasionally returns.

Test plan

  • pytest tests/workflows/unit_tests/core_steps/models/foundation/test_google_gemma.py — 24 passed
  • Block loads in the workflow registry (/workflows/blocks/describe shows roboflow_core/google_gemma@v1 with all 9 tasks)
  • End-to-end smoke test against OpenRouter once the Gemma 4 model IDs are live

🤖 Generated with Claude Code

Erol444 and others added 2 commits April 25, 2026 13:44
Exposes Google's Gemma vision-language models (Gemma 4 31B and Gemma 4 26B
A4B) as a workflow block, served via OpenRouter with a user-supplied API key.
Supports the standard VLM task set: unconstrained, OCR, structured answering,
classification, multi-label classification, VQA, captioning, and unprompted
object detection.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Mirrors the Llama Vision test suite: manifest validation across image,
prompt, model version, api_key, output_structure, temperature, and
task-specific required-field paths (prompt, classes, output_structure).
Adds object-detection to the classes-required parametrize list since
Gemma supports it. Includes mocked execute_gemma_request happy and
error paths to cover the OpenRouter choices-None failure mode.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3dfcc8de9f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

},
},
)
api_key: Union[Selector(kind=[STRING_KIND]), str] = Field(
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Accept secret selectors for API key input

The api_key field only allows Selector(kind=[STRING_KIND]), which makes this block incompatible with secret-provider outputs (for example environment_secrets_store@v1 emits SECRET_KIND). In practice, workflows that keep OpenRouter keys in the secrets pipeline cannot connect that output to google_gemma@v1, so users are forced to pass the key as a plain string selector instead of a secret-typed value.

Useful? React with 👍 / 👎.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant