fix: disable json_object response_format for gemma models#345

Merged
pancacake merged 1 commit into HKUDS:dev from octo-patch:fix/issue-344-gemma-response-format
Apr 20, 2026

Conversation

@octo-patch
Contributor

Fixes #344

Problem

When using LM Studio with gemma models (e.g. gemma-4-e2b), the
visualization and other JSON-structured features fail with:

openai.BadRequestError: Error code: 400 - {'error': "'response_format.type' must be 'json_schema' or 'text'"}

The root cause: the lm_studio binding has supports_response_format: True
in PROVIDER_CAPABILITIES, so the code sends response_format={"type": "json_object"}.
However, newer Gemma models only accept json_schema or text — they do not
support the legacy json_object type.

Solution

Add "supports_response_format": False to the existing "gemma" entry in
MODEL_OVERRIDES inside deeptutor/services/llm/capabilities.py.
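
A minimal sketch of how such an override could resolve, assuming a prefix/substring-match lookup over model names; the real structure of `capabilities.py` may differ, and the names here simply mirror the ones mentioned above:

```python
# Hypothetical sketch of the capability lookup described in this PR.
# The actual deeptutor/services/llm/capabilities.py may be organized differently.
PROVIDER_CAPABILITIES = {
    "lm_studio": {"supports_response_format": True},
}

MODEL_OVERRIDES = {
    # Gemma models reject response_format={"type": "json_object"} (this PR's change)
    "gemma": {"supports_response_format": False},
}

def supports_response_format(provider: str, model: str) -> bool:
    """Model-specific overrides win over the provider-level default."""
    for prefix, overrides in MODEL_OVERRIDES.items():
        if prefix in model.lower() and "supports_response_format" in overrides:
            return overrides["supports_response_format"]
    return PROVIDER_CAPABILITIES.get(provider, {}).get(
        "supports_response_format", False
    )
```

With a lookup like this, `gemma-4-e2b` under `lm_studio` resolves to `False`, while `mistral-7b` keeps the provider default of `True`.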

When supports_response_format returns False, the code omits the
response_format parameter entirely. The existing extract_json_object
utilities in the visualize and math-animator agents already parse
structured JSON from plain-text responses, so all callers work correctly
without response_format being set.

This is the same pattern already used for DeepSeek and Anthropic models.
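
For reference, here is a simplified stand-in for what an `extract_json_object`-style utility does — pulling a JSON object out of a plain-text reply, whether fenced in markdown or inline. The project's actual implementation is not shown in this PR and likely handles more edge cases:

```python
import json
import re

def extract_json_object(text: str) -> dict:
    """Extract the first JSON object from a plain-text model response.

    Simplified sketch of the extract_json_object utilities referenced above;
    not the project's real implementation.
    """
    # Prefer a ```json fenced block if the model emitted one.
    fenced = re.search(r"```(?:json)?\s*(\{.*?\})\s*```", text, re.DOTALL)
    if fenced:
        candidate = fenced.group(1)
    else:
        # Fall back to the outermost brace-delimited span.
        start, end = text.find("{"), text.rfind("}")
        if start == -1 or end <= start:
            raise ValueError("no JSON object found in response")
        candidate = text[start : end + 1]
    return json.loads(candidate)
```

Because utilities like this work on raw text, callers do not need the server to enforce `response_format` at all.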

Testing

  • Direct Python assertions confirm that gemma-4-e2b, gemma-3-4b, and gemma-2-9b all return False
  • Non-gemma LM Studio models (mistral-7b, llama-3) continue to return True
  • Added test_gemma_response_format_disabled to tests/services/llm/test_capabilities.py
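
The new test might look roughly like the following; the test name comes from this PR's description, but the inline stub and the parametrization are illustrative only (the merged test imports the real capability helper instead):

```python
import pytest

def supports_response_format(provider: str, model: str) -> bool:
    # Minimal stand-in for deeptutor.services.llm.capabilities,
    # used here only to make the sketch self-contained.
    return "gemma" not in model.lower()

@pytest.mark.parametrize("model", ["gemma-4-e2b", "gemma-3-4b", "gemma-2-9b"])
def test_gemma_response_format_disabled(model):
    assert supports_response_format("lm_studio", model) is False

@pytest.mark.parametrize("model", ["mistral-7b", "llama-3"])
def test_non_gemma_models_unaffected(model):
    assert supports_response_format("lm_studio", model) is True
```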

…S#344)

Gemma models served through LM Studio (and similar local inference
servers) reject response_format={"type": "json_object"}, returning a
400 error: "'response_format.type' must be 'json_schema' or 'text'".

Add supports_response_format: False to the existing "gemma" MODEL_OVERRIDES
entry so these models are excluded from the json_object path. The
existing extract_json_object utilities in the visualize and math-animator
agents already parse JSON from plain text responses, so all callers
continue to work without further changes.
@mr-Lime197

This problem persists with Qwen-like architectures.

@pancacake
Collaborator

Thanks for your contribution!

@pancacake pancacake merged commit b9c5f5d into HKUDS:dev Apr 20, 2026
2 of 4 checks passed
pancacake added a commit that referenced this pull request Apr 20, 2026
- New `assets/releases/ver1-2-1.md` covering #348 (per-stage chat token
  limits), #349 (Regenerate across CLI/WS/Web UI), the regenerate UI
  harmony polish, and bug fixes #347 / #345 / #352.
- README release-notes block updated to surface v1.2.1 above v1.2.0.

Made-with: Cursor
