fix(llm): serialize Responses provider tools #1643
rosetta-livekit-bot[bot] wants to merge 3 commits
🦋 Changeset detected
Latest commit: 2f40db8
The changes in this PR will be included in the next version bump. This PR includes changesets to release 33 packages.
    const tool = toolCtx[toolCall.name]!;
    if (!isFunctionTool(tool)) {
🔴 Missing null guard before isFunctionTool causes uncaught TypeError when tool name is not in toolCtx
executeToolCall at agents/src/llm/utils.ts:263 retrieves the tool via toolCtx[toolCall.name]!. The non-null assertion only silences the compiler; at runtime the value is undefined when the tool name is not in toolCtx. The next line calls isFunctionTool(tool) without a null check, and inside isFunctionTool (agents/src/llm/tool_context.ts:347) the expression tool[FUNCTION_TOOL_SYMBOL] reads a property on undefined, throwing an uncaught TypeError. Before this PR, the undefined tool would crash later, inside a try-catch block, and return a graceful FunctionCallOutput with isError: true. The correct pattern is already used in the same PR at agents/src/llm/utils.ts:177: if (!tool || !isFunctionTool(tool)).
Suggested change
    - const tool = toolCtx[toolCall.name]!;
    - if (!isFunctionTool(tool)) {
    + const tool = toolCtx[toolCall.name];
    + if (!tool || !isFunctionTool(tool)) {
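To make the failure mode concrete, here is a minimal, self-contained sketch of the guarded lookup. The types below are hypothetical stand-ins, not the real definitions in agents/src/llm, and the executor is simplified to be synchronous; only the `!tool || !isFunctionTool(tool)` guard mirrors the suggestion above.

```typescript
// Hypothetical stand-ins for the real tool types; only the guard pattern matters here.
const FUNCTION_TOOL_SYMBOL = Symbol('function_tool');

interface FunctionTool {
  [FUNCTION_TOOL_SYMBOL]: true;
  execute: (args: unknown) => string;
}

interface ProviderTool {
  id: string;
}

type Tool = FunctionTool | ProviderTool;
type ToolContext = Record<string, Tool>;

interface FunctionCallOutput {
  name: string;
  output: string;
  isError: boolean;
}

function isFunctionTool(tool: Tool): tool is FunctionTool {
  // Reading a property here throws a TypeError if `tool` is undefined at runtime.
  return (tool as FunctionTool)[FUNCTION_TOOL_SYMBOL] === true;
}

function executeToolCall(
  toolCall: { name: string; args: unknown },
  toolCtx: ToolContext,
): FunctionCallOutput {
  // No `!` assertion: an unknown name yields undefined, which the guard handles.
  const tool: Tool | undefined = toolCtx[toolCall.name];
  if (!tool || !isFunctionTool(tool)) {
    return {
      name: toolCall.name,
      output: `unknown function tool: ${toolCall.name}`,
      isError: true,
    };
  }
  try {
    return { name: toolCall.name, output: tool.execute(toolCall.args), isError: false };
  } catch (err) {
    return { name: toolCall.name, output: String(err), isError: true };
  }
}
```

With this shape, a call naming an unregistered tool and a call naming a provider tool both come back as an error output instead of throwing out of the function.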
Summary
While wiring the basic voice agent up to xAI (Grok) with its WebSearch/XSearch provider tools, the tools never reached the API, so Grok answered from memory instead of searching. This PR fixes the root cause and adds visibility into server-side tool execution.
Changes
fix(llm): let the caller specify the provider tool type for responses
- to_responses_fnc_ctx hardcoded openai.tools.OpenAITool, so the xAI LLM (which reuses openai's Responses serializer) had its WebSearch/XSearch tools silently dropped: they subclass XAITool, not OpenAITool, and never reached the API.
- The serializer now takes the type from the caller as provider_tool_type, removing the core→plugin import of livekit.plugins.openai. The openai Responses LLM defaults it to OpenAITool; the xAI LLM overrides the _provider_tool_type class attribute to XAITool.
- Adds a DictProviderTool base (a ProviderTool declaring to_dict()). The dict-based plugin tool bases (OpenAITool, XAITool, AnthropicTool, MistralTool) now extend it, so the to_dict contract is declared once instead of in every plugin, and the serializer is typed without a Protocol or type: ignore. (Gemini keeps ProviderTool + to_tool_config, since its schema is a typed types.Tool.)
feat(openai): log server-side provider tool execution in responses LLM
Emits an info log when the Responses API runs a tool server-side, e.g. xAI web_search and x_search (which decomposes into custom_tool_call subcalls like x_keyword_search), including the issued query and result. Detection is grounded in openai's ResponseOutputItem discriminated union (the authoritative parser): only message, reasoning, function_call, and function_call_output are produced/consumed by the agent itself. Every other union member is a server-side tool, so the denylist is provably complete.
Test plan
- xai.responses.LLM: provider tools now reach the API, searches execute, and provider tool executed logs appear on completion.
- make lint passes on touched files; runtime check confirms all four dict-based plugin tools subclass DictProviderTool and serialize correctly.
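The root cause in the first change can be sketched as follows. This is a hypothetical TypeScript analogue, not the code in this PR: the class names follow the description, while `toResponsesTools`, `toDict`, and the constructor shapes are assumptions made for illustration. It assumes the serializer keeps only tools that are instances of one provider tool class.

```typescript
// Hypothetical analogue of the classes named in the description.
abstract class ProviderTool {}

abstract class DictProviderTool extends ProviderTool {
  // Declared once on the shared base instead of in every plugin.
  abstract toDict(): Record<string, unknown>;
}

class OpenAITool extends DictProviderTool {
  readonly type: string;
  constructor(type: string) {
    super();
    this.type = type;
  }
  toDict(): Record<string, unknown> {
    return { type: this.type };
  }
}

class XAITool extends DictProviderTool {
  readonly type: string;
  constructor(type: string) {
    super();
    this.type = type;
  }
  toDict(): Record<string, unknown> {
    return { type: this.type };
  }
}

type DictProviderToolClass = new (...args: any[]) => DictProviderTool;

// Assumed serializer shape: the caller chooses which provider tool class is kept.
function toResponsesTools(
  tools: ProviderTool[],
  providerToolType: DictProviderToolClass,
): Record<string, unknown>[] {
  return tools
    .filter((tool): tool is DictProviderTool => tool instanceof providerToolType)
    .map((tool) => tool.toDict());
}
```

Passing `OpenAITool` for a list of `XAITool` instances reproduces the silent drop (an empty result), while passing `XAITool` serializes both tools. That is the reason the type is a parameter rather than a constant inside the serializer.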
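The detection rule behind the logging change can be illustrated with a small sketch. It assumes output items carry a `type` discriminator, as the description of the ResponseOutputItem union implies. `OutputItem`, `isServerSideToolItem`, and `describeServerSideTools` are hypothetical names, and the real log also includes the issued query and result, which are omitted here.

```typescript
// Item types the agent itself produces or consumes, per the description above.
const AGENT_ITEM_TYPES = new Set<string>([
  'message',
  'reasoning',
  'function_call',
  'function_call_output',
]);

// Simplified, hypothetical item shape: only the discriminator is modeled.
interface OutputItem {
  type: string;
  [key: string]: unknown;
}

// Anything not handled by the agent is treated as a server-side tool item.
function isServerSideToolItem(item: OutputItem): boolean {
  return !AGENT_ITEM_TYPES.has(item.type);
}

// Builds one log line per server-side tool item, in output order.
function describeServerSideTools(output: OutputItem[]): string[] {
  return output
    .filter(isServerSideToolItem)
    .map((item) => `provider tool executed: ${item.type}`);
}
```

Because the rule excludes the four agent-handled types rather than enumerating tool types, a union member added later is logged without any change to the list.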