fix: disable json_object response_format for gemma models#345

Merged
pancacake merged 1 commit into HKUDS:dev from octo-patch:fix/issue-344-gemma-response-format
Apr 20, 2026

Conversation

@octo-patch
Contributor

Fixes #344

Problem

When using LM Studio with gemma models (e.g. gemma-4-e2b), the
visualization and other JSON-structured features fail with:

openai.BadRequestError: Error code: 400 - {'error': "'response_format.type' must be 'json_schema' or 'text'"}

The root cause: the lm_studio binding has supports_response_format: True
in PROVIDER_CAPABILITIES, so the code sends response_format={"type": "json_object"}.
However, newer Gemma models only accept json_schema or text — they do not
support the legacy json_object type.

Solution

Add "supports_response_format": False to the existing "gemma" entry in
MODEL_OVERRIDES inside deeptutor/services/llm/capabilities.py.
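
A minimal sketch of how such an override could resolve, assuming a prefix/substring-match lookup over model names; the real structure of `capabilities.py` may differ, and the names here simply mirror the ones mentioned above:

```python
# Hypothetical sketch of the capability lookup described in this PR.
# The actual deeptutor/services/llm/capabilities.py may be organized differently.
PROVIDER_CAPABILITIES = {
    "lm_studio": {"supports_response_format": True},
}

MODEL_OVERRIDES = {
    # Gemma models reject response_format={"type": "json_object"} (this PR's change)
    "gemma": {"supports_response_format": False},
}

def supports_response_format(provider: str, model: str) -> bool:
    """Model-specific overrides win over the provider-level default."""
    for prefix, overrides in MODEL_OVERRIDES.items():
        if prefix in model.lower() and "supports_response_format" in overrides:
            return overrides["supports_response_format"]
    return PROVIDER_CAPABILITIES.get(provider, {}).get(
        "supports_response_format", False
    )
```

With a lookup like this, `gemma-4-e2b` under `lm_studio` resolves to `False`, while `mistral-7b` keeps the provider default of `True`.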

When supports_response_format returns False, the code omits the
response_format parameter entirely. The existing extract_json_object
utilities in the visualize and math-animator agents already parse
structured JSON from plain-text responses, so all callers work correctly
without response_format being set.

This is the same pattern already used for DeepSeek and Anthropic models.
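
For reference, here is a simplified stand-in for what an `extract_json_object`-style utility does — pulling a JSON object out of a plain-text reply, whether fenced in markdown or inline. The project's actual implementation is not shown in this PR and likely handles more edge cases:

```python
import json
import re

def extract_json_object(text: str) -> dict:
    """Extract the first JSON object from a plain-text model response.

    Simplified sketch of the extract_json_object utilities referenced above;
    not the project's real implementation.
    """
    # Prefer a ```json fenced block if the model emitted one.
    fenced = re.search(r"```(?:json)?\s*(\{.*?\})\s*```", text, re.DOTALL)
    if fenced:
        candidate = fenced.group(1)
    else:
        # Fall back to the outermost brace-delimited span.
        start, end = text.find("{"), text.rfind("}")
        if start == -1 or end <= start:
            raise ValueError("no JSON object found in response")
        candidate = text[start : end + 1]
    return json.loads(candidate)
```

Because utilities like this work on raw text, callers do not need the server to enforce `response_format` at all.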

Testing

  • Direct Python assertions confirm that gemma-4-e2b, gemma-3-4b, and gemma-2-9b all return False
  • Non-gemma LM Studio models (mistral-7b, llama-3) continue to return True
  • Added test_gemma_response_format_disabled to tests/services/llm/test_capabilities.py
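
The new test might look roughly like the following; the test name comes from this PR's description, but the inline stub and the parametrization are illustrative only (the merged test imports the real capability helper instead):

```python
import pytest

def supports_response_format(provider: str, model: str) -> bool:
    # Minimal stand-in for deeptutor.services.llm.capabilities,
    # used here only to make the sketch self-contained.
    return "gemma" not in model.lower()

@pytest.mark.parametrize("model", ["gemma-4-e2b", "gemma-3-4b", "gemma-2-9b"])
def test_gemma_response_format_disabled(model):
    assert supports_response_format("lm_studio", model) is False

@pytest.mark.parametrize("model", ["mistral-7b", "llama-3"])
def test_non_gemma_models_unaffected(model):
    assert supports_response_format("lm_studio", model) is True
```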

…S#344)

Gemma models served through LM Studio (and similar local inference
servers) reject response_format={"type": "json_object"}, returning a
400 error: "'response_format.type' must be 'json_schema' or 'text'".

Add supports_response_format: False to the existing "gemma" MODEL_OVERRIDES
entry so these models are excluded from the json_object path. The
existing extract_json_object utilities in the visualize and math-animator
agents already parse JSON from plain text responses, so all callers
continue to work without further changes.
@mr-Lime197

This problem persists with Qwen-like architectures.

@pancacake
Collaborator

Thanks for your contribution!

@pancacake pancacake merged commit b9c5f5d into HKUDS:dev Apr 20, 2026
2 of 4 checks passed
pancacake added a commit that referenced this pull request Apr 20, 2026
- New `assets/releases/ver1-2-1.md` covering #348 (per-stage chat token
  limits), #349 (Regenerate across CLI/WS/Web UI), the regenerate UI
  harmony polish, and bug fixes #347 / #345 / #352.
- README release-notes block updated to surface v1.2.1 above v1.2.0.

Made-with: Cursor
