fix(tools): use /images/generations endpoint for Gemini and OpenAI #9
Merged
viettranx merged 1 commit into nextlevelbuilder:main on Feb 28, 2026
…age gen

create_image exclusively used /chat/completions with modalities:["image","text"], which only works on OpenRouter. Gemini returns HTTP 400: "Image generation is not yet supported on the chat.completions endpoint". OpenAI's DALL-E models also require /images/generations, not /chat/completions.

Fix: route OpenRouter through /chat/completions (supports modalities), and route all other providers (Gemini, OpenAI, etc.) through the standard /images/generations endpoint with response_format:"b64_json".

Also update the default Gemini model from the deprecated gemini-2.0-flash-exp to gemini-2.5-flash-image.
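The routing rule in the commit message can be sketched as a small provider switch. This is a minimal illustration, not the code from the commit: the helper name `imageGenEndpoint` and the provider strings are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// imageGenEndpoint picks the API path for an image generation request.
// Hypothetical helper: the name and the provider identifiers are
// illustrative and are not taken from the actual change.
func imageGenEndpoint(provider string) string {
	// OpenRouter accepts image generation on chat completions via "modalities".
	if strings.EqualFold(provider, "openrouter") {
		return "/chat/completions"
	}
	// Every other provider (Gemini, OpenAI, etc.) goes to the standard endpoint.
	return "/images/generations"
}

func main() {
	for _, p := range []string{"openrouter", "gemini", "openai"} {
		fmt.Printf("%s -> %s\n", p, imageGenEndpoint(p))
	}
}
```

Defaulting to `/images/generations` and special-casing only OpenRouter mirrors the "all other providers" wording, so a newly added provider would land on the standard endpoint without further changes.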
5124857 to 6b0fadb
MiltonSilvaJr referenced this pull request in vellus-ai/argoclaw on Mar 22, 2026

Sprint 0: Security hardening before feature development.

HIGH fixes:
- #1: Whitelist table names in execMapUpdate() to prevent SQL injection via dynamic table name (store/pg/helpers.go)
- #2: Log invalid groupBy values in snapshot queries (store/pg/snapshot.go)
- #3: Validated shellEscape(): single-quote wrapping is correct; added PBT tests for shell injection (tools/dynamic_tool_security_test.go)

MEDIUM fixes:
- #4-5: Log security warnings for no-token and viewer-fallback auth (gateway/router.go)
- #6: Restrict CORS on OpenAPI endpoint: removed wildcard, allow only localhost origins (http/openapi.go)
- #7: Add CheckSSRFWithPinning() for DNS rebinding TOCTOU prevention (tools/web_shared.go)
- #8: Log warning when TLS verification is disabled (tracing/otelexport/exporter.go)
- #9: Pin all Python package versions in Dockerfile to prevent supply chain attacks via unpinned dependencies
- #10: Change HOME fallback from /tmp to /app to prevent temp dir abuse (tools/credentialed_exec.go)

Also fixes arargoclaw double-rename bug in 356 Go import paths.

Tests: PBT tests for table whitelist and shell escaping (testing/quick).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
blackbirdzzzz365-gif pushed a commit to blackbirdzzzz365-gif/goclaw that referenced this pull request on Apr 12, 2026

9 checkpoint documents covering the upgrade from 43% to ~85% pattern matching with Claude Code's architectural patterns.

Checkpoints:
- CP-00: Current state analysis
- CP-01: Context defense 5 layers (Pattern #9)
- CP-02: Concurrency-safe partitioning (Pattern #4)
- CP-03: Streaming tool execution (Pattern #5)
- CP-04: Escalating recovery (Pattern #3)
- CP-05: Context modifier chain + fork isolation (Patterns #6, #8)
- CP-06: Permission classification pipeline (Pattern #10)
- CP-07: Skill system upgrade (Patterns #11-13)
- CP-08: Plugin ecosystem (Patterns #14-16)

Based on analysis from "Giai phau mot Agentic Operating System" (18 patterns from 513K LOC Claude Code source).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
viettranx added a commit that referenced this pull request on Apr 20, 2026

Phase 4: final phase of the TTS params/layout/agent-override plan.

Adds a 3-key allow-list (`speed`, `emotion`, `style`) per agent stored in `agents.other_config.tts_params`. Backend resolves and merges into `opts.Params` PER ATTEMPT inside the fallback loop so each provider sees its own native shape, never the primary's keys when fallback runs (Finding #1 critical).

Backend:
- `AgentOverridable bool` on `audio.ParamSchema`. UI filter reads this flag from /v1/tts/capabilities; there is no separate TS literal mirror, so the capabilities API is the single source of truth (Finding #9).
- `audio.AdaptAgentParams(generic, provider)` maps the 3 generic keys to provider-native paths (e.g. `speed` → `voice_settings.speed` for ElevenLabs, flat `speed` for OpenAI/MiniMax, dropped for Edge/Gemini).
- `Manager.SynthesizeWithFallbackAdapted` adapts inside the loop so fallback providers receive correctly-shaped params.
- `manager_auto.go` and `tools/tts.go` Execute do per-attempt adaptation on the tenant + direct + fallback call sites.
- Drop log bumped to `slog.Info("tts.agent.params.dropped", ...)` for audit trail when a generic key isn't supported by the active provider.
- Cross-check test asserts every adapter switch case has at least one capability ParamSchema with `AgentOverridable: true`, and vice versa.

Security (red-team findings):
- Allow-list ENFORCED at write path: `validateAgentTTSParams` in HTTP `handleUpdate` AND WS `agents_update` rejects any `tts_params` key outside `{speed, emotion, style}` (Finding #5).
- 64KB body cap on agent PUT via `http.MaxBytesReader` (Finding #6).
- Explicit tenant-scope guard after `agents.GetByID` (Finding #12).
- Concurrent-tab clobber: handleSave merges `tts_params` into a fresh copy of `otherConfig` rather than reusing stale state (Finding #13).
- Rate-limit verified: RoleAdmin gate sufficient for v1 (Finding #15).

Frontend (web + desktop):
- `TtsOverrideBlock` rewritten: filters capability params to `agent_overridable === true`, renders via `DynamicParamForm`. Hides entirely for providers with no overridable params (Edge, Gemini).
- Bidirectional adapter (generic ↔ capability-native form state) so agent storage stays in generic keys while UI works in native paths. 25 round-trip tests cover all 5 providers.
- Desktop `AgentDetailPanel` gains an inline fine-tune section gated on `globalProvider`, reusing the desktop `DynamicParamForm`.

i18n: `tts.override.params.title` ("Fine-tune") added to web + desktop en/vi/zh.

Tests: all 9 backend suites green (race), web 214/214, desktop build clean, both Go build tags pass.
viettranx added a commit that referenced this pull request on Apr 20, 2026

Post-review cleanup of Phase 4. Closes Finding #9 properly and corrects the Finding #13 documentation lie surfaced in the code-review report.

Capability schema:
- Replace `AgentOverridable bool` with `AgentOverridableAs string` on ParamSchema. Empty string = not overridable; non-empty = the generic key alias (`"speed"`, `"emotion"`, `"style"`).
- Each provider declaration now carries the alias inline, so the generic↔native mapping has a single TS-readable source.

Frontend:
- Web `tts-override-block.tsx` drops the inline `GENERIC_TO_NATIVE` literal and derives the bidirectional adapter from the filtered capability params (each param self-describes its alias). Adapter tests rewritten around the new shape.
- Desktop `AgentDetailPanel.tsx` drops the 45-line inline IIFE in favour of a new `<TtsOverrideFineTune>` component that uses the same alias-based mapping.

Backend:
- Move `AgentTTSParamsAllowedKeys` + `ValidateAgentTTSParams` to `internal/audio/agent_params_adapter.go`. HTTP `validate.go` and WS `gateway/methods/agents_update.go` both delegate, eliminating the duplicated `{speed, emotion, style}` literal.

Cleanup:
- Delete orphan i18n keys `MsgTtsParamInvalidJSON` and `MsgTtsParamDependsOn` from `keys.go` + en/vi/zh catalogs (no in-code references; DependsOn is FE-only, JSON parse failures already surface via slog).

Documentation:
- `prompt-settings-section.tsx` Finding #13 comment rewritten to honestly describe the best-effort merge into a fresh local copy of the cached `otherConfig` prop. Concurrent-tab clobber remains possible; a server-side JSON-merge-patch endpoint is planned for v2.

Tests: 9 backend suites (race), web 217/217, desktop build clean, both Go build tags pass.
Summary

- `create_image` exclusively used `/chat/completions` with `modalities:["image","text"]`, which only works on OpenRouter
- Gemini returns HTTP 400: "Image generation is not yet supported on the chat.completions endpoint for this model"
- OpenAI's DALL-E models also require `/images/generations`, not `/chat/completions`
- Fix: OpenRouter → `/chat/completions` (supports modalities), all other providers → `/images/generations`
- Default Gemini model updated from the deprecated `gemini-2.0-flash-exp` to `gemini-2.5-flash-image`

Root Cause
The `callImageGenAPI` function was designed for OpenRouter's modalities-based chat completions format but was used for all providers. Only OpenRouter supports this format; Gemini and OpenAI both require the standard `/images/generations` endpoint.

Gemini error response:

    {
      "error": {
        "code": 400,
        "message": "Image generation is not yet supported on the chat.completions endpoint for this model. Please use the standard client.images.generate method for creation"
      }
    }

Fix
Added `callStandardImageGenAPI`, which uses the `/images/generations` endpoint with `response_format:"b64_json"`, the standard OpenAI-compatible image generation format supported by Gemini, OpenAI, and most other providers.

| Provider | Endpoint |
| --- | --- |
| OpenRouter | `/chat/completions` + modalities |
| Gemini | `/images/generations` + b64_json |
| OpenAI | `/images/generations` + b64_json |

Test plan
- `create_image` with `gemini-2.5-flash-image`
- `MEDIA:` path returned and image delivered to channel
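For readers unfamiliar with the `/images/generations` format named under Fix, the sketch below shows what a `b64_json` request body and response decode might look like in Go. It is an assumption-laden illustration rather than the PR's `callStandardImageGenAPI`: the type and function names (`imageGenRequest`, `imageGenResponse`, `buildImageGenBody`, `decodeFirstImage`) are hypothetical, and only the `model`, `prompt`, `response_format`, and `data[].b64_json` fields of the OpenAI-compatible format are modeled.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
)

// imageGenRequest models a minimal OpenAI-compatible request body.
// Hypothetical type: real requests may carry more fields (size, n, ...).
type imageGenRequest struct {
	Model          string `json:"model"`
	Prompt         string `json:"prompt"`
	ResponseFormat string `json:"response_format"`
}

// imageGenResponse models only the part of the response this sketch reads.
type imageGenResponse struct {
	Data []struct {
		B64JSON string `json:"b64_json"`
	} `json:"data"`
}

// buildImageGenBody serializes a request that asks for base64 image data.
func buildImageGenBody(model, prompt string) ([]byte, error) {
	return json.Marshal(imageGenRequest{
		Model:          model,
		Prompt:         prompt,
		ResponseFormat: "b64_json",
	})
}

// decodeFirstImage extracts and base64-decodes the first image in a response.
func decodeFirstImage(body []byte) ([]byte, error) {
	var resp imageGenResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("parse image response: %w", err)
	}
	if len(resp.Data) == 0 || resp.Data[0].B64JSON == "" {
		return nil, errors.New("no image data in response")
	}
	return base64.StdEncoding.DecodeString(resp.Data[0].B64JSON)
}

func main() {
	body, err := buildImageGenBody("gemini-2.5-flash-image", "a red square")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))

	// Stand-in response; a real one would come from the provider over HTTP.
	sample := []byte(`{"data":[{"b64_json":"aGVsbG8="}]}`)
	img, err := decodeFirstImage(sample)
	if err != nil {
		panic(err)
	}
	fmt.Printf("decoded %d bytes\n", len(img))
}
```

Requesting `b64_json` rather than a URL means the image bytes arrive in the same response, so the caller can write them to disk without a second download; that fits a tool that returns a local `MEDIA:` path, though whether the PR relies on this reasoning is not stated.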