Skip to content

fix(functions): validate auto-detected XML tool-call names — robust glm-4.5/Hermes guard (#9722, supersedes #9940)#10059

Merged
mudler merged 1 commit into
masterfrom
fix/9722-glm-xml-name-validation
May 29, 2026
Merged

fix(functions): validate auto-detected XML tool-call names — robust glm-4.5/Hermes guard (#9722, supersedes #9940)#10059
mudler merged 1 commit into
masterfrom
fix/9722-glm-xml-name-validation

Conversation

@localai-bot
Copy link
Copy Markdown
Collaborator

What

Robust fix for the XML tool-call auto-detect misparse behind #9722, and a supersession of #9940.

Problem

ParseXMLIterative (auto-detect, format == nil) tries every preset. glm-4.5's tool block is <tool_call>name…</tool_call>, so when a Hermes/NousResearch model emits:

<tool_call>
{"name": "bash", "arguments": {"script": "ls"}}
</tool_call>

glm-4.5 mis-claims the block and returns the entire JSON object as the function name. That misparse wins over the JSON parser, so streaming clients get a tool call whose name is a JSON blob (and the correct call can get lastEmittedCount-suppressed downstream).

Fix

Guard the auto-detect paths: a returned tool-call name must look like a real function name (^[A-Za-z0-9_.\-]+$). Results that don't are dropped, so auto-detection falls through to the next format and ultimately to JSON parsing, which handles Hermes correctly.

  • Applies to auto-detect only. An explicitly forced format (format != nil, e.g. xml_format_preset: glm-4.5) is trusted verbatim.
  • Covers both the streaming worker and the non-streaming ParseFunctionCall path (both call ParseXMLIterative with auto-detect).

Supersedes #9940

#9940 dropped only names with a leading {. That misses other misparse shapes. The name-shape check rejects them all while still accepting legitimate glm-4.5 calls:

<tool_call>…</tool_call> body mis-extracted name {-filter (#9940) name-shape (this PR)
{"name":"bash",…} {"name":…}
Sure: {…} (leading prose) Sure: {…}
[{…}] (parallel calls) [{…}]
name: bash, … (brace-less) name: bash, …

Tests (TDD)

pkg/functions/parse_glm_9722_test.go:

  • 4 misparse variants (table-driven) — RED before the fix, GREEN after.
  • No-regression: a legitimate <tool_call>get_weather<arg_key>…</arg_key><arg_value>…</arg_value></tool_call> still parses to get_weather.
  • An explicitly forced glm-4.5 format is not filtered.

pkg/functions/... and core/http/endpoints/openai suites pass; golangci-lint --new-from-merge-base=master → 0 issues.

Note on the broader #9722

The streaming double-emission originally reported in #9722 was against pre-2026-05-22 code; the streaming worker has since been refactored (autoparser tool calls are deferred-only and lastEmittedCount-guarded), and #10055 added the autoparser-active skip. This PR addresses the remaining non-autoparser brittleness (the glm-4.5 name misparse) that #9940 targeted.

🤖 Generated with Claude Code

The XML tool-call auto-detector tries every preset, including glm-4.5 whose
tool block is <tool_call>name...</tool_call>. When a Hermes/NousResearch model
emits <tool_call>{"name":"bash","arguments":{...}}</tool_call>, glm-4.5
mis-claims the block and returns the entire JSON object (or leading prose, or a
JSON array) as the function NAME. The misparse then wins over the JSON parser,
so streaming clients receive a tool call whose name is a JSON blob.

Guard the auto-detect paths in ParseXMLIterative: a returned tool name must look
like a real function name ([A-Za-z0-9_.-]+). Results that don't are dropped so
auto-detection falls through to the next format and ultimately to JSON parsing,
which handles Hermes correctly. An explicitly forced format (format != nil) is
left untouched and trusted verbatim.

This supersedes PR #9940, which dropped only names with a leading "{". That
narrower check misses leading prose ("Sure: {...}"), JSON arrays ("[{...}]")
and brace-less garbage ("name: bash, ..."); the name-shape check rejects all of
them while still accepting legitimate glm-4.5 calls. The fix applies to both the
streaming worker and the non-streaming ParseFunctionCall path, which both call
ParseXMLIterative with auto-detection.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]
@mudler mudler merged commit 3e22037 into master May 29, 2026
56 of 57 checks passed
@mudler mudler deleted the fix/9722-glm-xml-name-validation branch May 29, 2026 10:03
@mudler mudler added the bug Something isn't working label May 29, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants