Skip to content

docs: add tool-call parser troubleshooting for custom LLM backends#330

Open
mason5052 wants to merge 1 commit into
vxcontrol:mainfrom
mason5052:codex/issue-313-tool-call-parser-troubleshooting
Open

docs: add tool-call parser troubleshooting for custom LLM backends#330
mason5052 wants to merge 1 commit into
vxcontrol:mainfrom
mason5052:codex/issue-313-tool-call-parser-troubleshooting

Conversation

@mason5052
Copy link
Copy Markdown
Contributor

Summary

Add a troubleshooting subsection under Custom LLM Provider Configuration explaining why tool-call (function-call) parser problems on self-hosted OpenAI-compatible backends (llama.cpp / SGLang / vLLM, often behind LiteLLM) cause stalled flows, and how to diagnose them. Docs only.

Problem

Issue #313 reported that flows stop responding after a few steps when running a custom backend configured through LLM_SERVER_* (LiteLLM in front of llama.cpp serving qwen3.6-35b). The logs showed:

Failed to parse tool call arguments as JSON: [json.exception.parse_error.101]
parse error at line 1, column 131: syntax error while parsing value -
unexpected end of input

surfaced through LiteLLM as an HTTP 500, followed by cascading retries and a 429. The maintainer confirmed the stall was fixed in the latest build by sanitizing malformed function-call arguments, and that the root cause was the model side returning corrupted tool-call arguments.

There is currently no documentation that explains this class of failure, even though it is a common pitfall with self-hosted backends and is closely related to the image-chooser failure (a flow's first action is an LLM tool call to pick the container image).

Solution

Add a #### Troubleshooting: tool-call (function-call) parser errors subsection right after the Custom LLM Provider Configuration content, covering:

  • Custom OpenAI-compatible backends must return valid tool-call JSON; llama.cpp, SGLang, and vLLM usually require a specific tool-call parser and a matching chat template, and not every setup produces valid tool calls out of the box (compatibility depends on the backend, not PentAGI alone).
  • Symptoms: Failed to parse tool call arguments as JSON, a flow that stalls after a few steps, looping tool calls, the start-of-flow failed to select primary docker image via llm call error, and unexpected backend 5xx/4xx responses.
  • How to investigate: check both PentAGI and backend/proxy logs, validate the provider with ctester before a full flow, confirm the parser/chat template match the model, and update PentAGI (recent builds sanitize malformed function-call arguments).

The new content links only to the existing Testing LLM Agents section and references the image-chooser error in prose (no new anchor), so it stands on its own against main.

User Impact

  • Users on self-hosted/llama.cpp/SGLang/vLLM backends get a clear explanation of the tool-call parser failure mode and a concrete diagnosis path, instead of an opaque Failed to parse tool call arguments as JSON stall.
  • Points users at ctester for pre-flight validation and at the update that sanitizes malformed arguments.
  • No behavior change.

Test Plan

  • git diff --check clean.
  • Docs-only diff: README.md (+20 lines). No tool-call parser code, provider runtime, schema, migration, or config-default changes.
  • Verified the referenced error string failed to select primary docker image via llm call exists in backend/pkg/providers/providers.go on main.
  • Verified LLM_SERVER_URL / LLM_SERVER_KEY / LLM_SERVER_MODEL / LLM_SERVER_PROVIDER exist in .env.example.
  • Verified the ctester utility exists and tests tool-calling agent types, and that the #testing-llm-agents anchor resolves.
  • Placed away from the README regions touched by open PRs docs: clarify "primary docker image" error is an LLM backend failure #325 and docs: add embedding troubleshooting for stalled flows #327 to avoid conflicts.
  • No unrelated files included.

Refs #313

Issue vxcontrol#313 reported flows that stall after a few steps when running a
custom OpenAI-compatible backend (LiteLLM in front of llama.cpp serving
qwen3.6-35b via LLM_SERVER_*). The backend returned malformed tool-call
arguments, surfaced as 'Failed to parse tool call arguments as JSON'
HTTP 500s and cascading retries. The maintainer fixed the stall in the
latest build by sanitizing wrong function-call arguments.

Add a troubleshooting subsection under Custom LLM Provider Configuration
that explains the root cause and how to diagnose it:

- Custom OpenAI-compatible backends must return valid tool-call
  (function-call) JSON; llama.cpp, SGLang, and vLLM usually require a
  specific tool-call parser and matching chat template, and not every
  setup produces valid tool calls out of the box.
- Symptoms: 'Failed to parse tool call arguments as JSON', flow stalls,
  looping tool calls, the 'failed to select primary docker image via
  llm call' start-of-flow failure, and unexpected backend HTTP errors.
- Investigation: check PentAGI and backend/proxy logs, validate with the
  ctester utility before a full flow, confirm the parser/chat template
  match the model, and update PentAGI (recent builds sanitize malformed
  function-call arguments).

Docs only. No tool-call parser code, provider runtime, schema, migration,
or config-default changes. Wording frames compatibility as dependent on
the backend's OpenAI-compatible tool-call behavior rather than claiming
every llama.cpp backend is supported.
Copilot AI review requested due to automatic review settings June 4, 2026 02:55
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds documentation to help debug LLM backend/tool-call formatting issues that can stall agent flows when using OpenAI-compatible backends.

Changes:

  • Documented common tool-call (function-call) JSON parsing failure modes.
  • Added investigation steps and pointers to logs and the ctester utility.
  • Clarified that correct parser/chat-template configuration is required for self-hosted inference engines.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants