fix(llm): serialize all provider tools for the Responses API + log server-side tool execution#5865

Merged: u9g merged 4 commits into main from fix/xai-provider-tools on May 29, 2026

Conversation

Contributor

@u9g u9g commented May 27, 2026

Summary

While wiring the basic voice agent up to xAI (Grok) with its WebSearch / XSearch provider tools, the tools never reached the API — Grok answered from memory instead of searching. This PR fixes the root cause and adds visibility into server-side tool execution.

Changes

fix(llm): let the caller specify the provider tool type for responses

to_responses_fnc_ctx hardcoded openai.tools.OpenAITool, so the xAI LLM (which reuses openai's Responses serializer) had its WebSearch / XSearch tools silently dropped — they subclass XAITool, not OpenAITool, and never reached the API.

  • The caller now passes its plugin's provider-tool type via provider_tool_type, removing the core→plugin import of livekit.plugins.openai. The openai Responses LLM defaults it to OpenAITool; the xAI LLM overrides the _provider_tool_type class attribute to XAITool.
  • Adds a shared DictProviderTool base (a ProviderTool declaring to_dict()). The dict-based plugin tool bases (OpenAITool, XAITool, AnthropicTool, MistralTool) now extend it, so the to_dict contract is declared once instead of in every plugin, and the serializer is typed without a Protocol or type: ignore. (Gemini keeps ProviderTool + to_tool_config, since its schema is a typed types.Tool.)
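The failure mode and the fix described above can be sketched with stand-in classes. This is not the real livekit code: `serialize_hardcoded` and `serialize_with_type` are hypothetical names, and the stand-in `ProviderTool` declares `to_dict()` only to keep the example short. Only the `isinstance` pattern and the `provider_tool_type` parameter come from this PR.

```python
from abc import ABC, abstractmethod
from typing import Any


class ProviderTool(ABC):
    """Stand-in for the core provider-tool base class (simplified)."""

    @abstractmethod
    def to_dict(self) -> dict[str, Any]: ...


class OpenAITool(ProviderTool):
    def to_dict(self) -> dict[str, Any]:
        return {"type": "web_search"}


class XAITool(ProviderTool):
    def to_dict(self) -> dict[str, Any]:
        return {"type": "x_search"}


def serialize_hardcoded(tools: list[ProviderTool]) -> list[dict[str, Any]]:
    # Old behavior: only OpenAITool instances pass the check,
    # so XAITool instances are dropped without any error.
    return [t.to_dict() for t in tools if isinstance(t, OpenAITool)]


def serialize_with_type(
    tools: list[ProviderTool], *, provider_tool_type: type[ProviderTool]
) -> list[dict[str, Any]]:
    # New behavior: the caller names the tool base class it owns.
    return [t.to_dict() for t in tools if isinstance(t, provider_tool_type)]


xai_tools: list[ProviderTool] = [XAITool()]
dropped = serialize_hardcoded(xai_tools)
kept = serialize_with_type(xai_tools, provider_tool_type=XAITool)
print(dropped, kept)
```

Passing the type in as an argument keeps the helper free of any plugin import, which is the design goal stated in the first bullet.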

feat(openai): log server-side provider tool execution in responses LLM

Emits an info log when the Responses API runs a tool server-side, e.g. xAI web_search and x_search (which decomposes into custom_tool_call subcalls like x_keyword_search), including the issued query and result. Detection is grounded in openai's ResponseOutputItem discriminated union (the authoritative parser): only message, reasoning, function_call, and function_call_output are produced/consumed by the agent itself — every other union member is a server-side tool, so the denylist is provably complete.
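A rough sketch of the detection rule described above, assuming output items are plain dicts with a `type` key (the real code works on openai's typed `ResponseOutputItem` objects). The helper names and the `tool_type` log field are hypothetical; the four agent-handled types and the log message come from this PR description.

```python
import logging
from typing import Any

logger = logging.getLogger("responses_llm_sketch")

# Item types the agent itself produces or consumes, per the PR description.
AGENT_HANDLED_TYPES = frozenset(
    {"message", "reasoning", "function_call", "function_call_output"}
)


def is_server_side_tool(item_type: str) -> bool:
    """Treat every non-empty type outside the agent-handled set as a provider tool."""
    return bool(item_type) and item_type not in AGENT_HANDLED_TYPES


def log_provider_tools(items: list[dict[str, Any]]) -> list[str]:
    """Log each server-side tool item and return the types that were logged."""
    logged: list[str] = []
    for item in items:
        item_type = item.get("type", "")
        if is_server_side_tool(item_type):
            logger.info("provider tool executed", extra={"tool_type": item_type})
            logged.append(item_type)
    return logged


sample = [
    {"type": "reasoning"},
    {"type": "web_search_call"},
    {"type": "custom_tool_call"},
    {"type": "message"},
]
logged_types = log_provider_tools(sample)
```

Because the rule is "everything except the agent-handled types", a new server-side tool type added to the union would be logged without any change to the set.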

Test plan

  • Verified with the basic voice agent on xai.responses.LLM: provider tools now reach the API, searches execute, and provider tool executed logs appear on completion.
  • make lint passes on touched files; runtime check confirms all four dict-based plugin tools subclass DictProviderTool and serialize correctly.

@chenghao-mou chenghao-mou requested a review from a team May 27, 2026 13:33
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 4 additional findings.


@u9g u9g force-pushed the fix/xai-provider-tools branch 7 times, most recently from a565f20 to f18cc82 Compare May 27, 2026 17:37
u9g added 2 commits May 27, 2026 13:38
to_responses_fnc_ctx hardcoded openai.tools.OpenAITool, so the xAI LLM
(which reuses openai's responses serializer) had its WebSearch / XSearch
tools silently dropped: they subclass XAITool, not OpenAITool, and never
reached the API.

Require the caller to pass its plugin's ProviderTool subclass via
`provider_tool_type` instead, which also removes the core->plugin import
of livekit.plugins.openai. The openai responses LLM defaults it to
OpenAITool; the xAI LLM overrides it to XAITool.

Emit an info log when a provider-executed tool (e.g. xAI web_search /
x_search, file_search) completes, including the issued query and the
provider result carried on the completed output item.
@u9g u9g force-pushed the fix/xai-provider-tools branch from f18cc82 to 289d880 Compare May 27, 2026 17:39
Comment on lines +51 to +55
class DictProviderTool(ProviderTool):
    """A provider tool whose schema serializes to a plain dict (e.g. openai, xAI)."""

    @abstractmethod
    def to_dict(self) -> dict[str, Any]: ...
Member

It's tricky to have an abstract class for that, because it really depends on the provider.

Is there a way to avoid it?

Contributor Author

done.

          schema = llm.utils.build_legacy_openai_schema(tool, internally_tagged=True)
          schemas.append(schema)
-     elif isinstance(tool, openai.tools.OpenAITool):
+     elif isinstance(tool, provider_tool_type):
Member

Maybe we should use hasattr?

I'm not sure, but the reasoning is that some providers only require the tool's API name to enable it. It's not even a dict; it's literally just "web_search".

This comment was marked as outdated.

Contributor Author

@u9g u9g May 28, 2026

I removed DictProviderTool and replaced it with hasattr(tool, "to_dict")

Provider tool classes now subclass ProviderTool directly and declare their
own abstract to_dict(). The responses serializer recognizes server-side
provider tools via isinstance(ProviderTool) + hasattr("to_dict") instead of
a caller-supplied provider_tool_type, removing the per-plugin override.
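The duck-typed check from this intermediate commit might look roughly like the sketch below. The classes are stand-ins, and `XAIWebSearch`, `GeminiStyleTool`, and `is_dict_provider_tool` are hypothetical names used only for illustration. Note that a check of this shape accepts any provider's tool that exposes `to_dict()`, so it cannot tell one plugin's tools from another's.

```python
from typing import Any


class ProviderTool:
    """Stand-in for the core provider-tool base class."""


class XAIWebSearch(ProviderTool):  # hypothetical dict-based tool
    def to_dict(self) -> dict[str, Any]:
        return {"type": "web_search"}


class GeminiStyleTool(ProviderTool):  # hypothetical tool without to_dict()
    def to_tool_config(self) -> object:
        return object()


def is_dict_provider_tool(tool: object) -> bool:
    # Duck-typed check: a ProviderTool that also exposes to_dict().
    return isinstance(tool, ProviderTool) and hasattr(tool, "to_dict")


tools: list[Any] = [XAIWebSearch(), GeminiStyleTool()]
schemas = [t.to_dict() for t in tools if is_dict_provider_tool(t)]
```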
@u9g u9g force-pushed the fix/xai-provider-tools branch from 01356cc to 8ec259b Compare May 28, 2026 01:30
…isinstance check

The responses serializer again takes a provider_tool_type so each plugin
specifies which ProviderTool subclass to recognize as a server-side tool,
keeping the core helper free of plugin imports.
@u9g u9g force-pushed the fix/xai-provider-tools branch from cae62eb to d34f2e5 Compare May 28, 2026 01:35
format: Literal["openai.responses"],
*,
strict: bool = True,
provider_tool_type: type[ProviderTool],
Member

This is unused?

Contributor Author

No, it is used. This adds the provider_tool_type argument so the serializer can run its isinstance check against the calling plugin's tool base (OpenAITool or XAITool) and skip provider tools that belong to other providers. Before this change the isinstance check hardcoded the OpenAI tool type, which meant xAI tools were never sent.

@u9g u9g merged commit 6d50c5d into main May 29, 2026
27 checks passed
@u9g u9g deleted the fix/xai-provider-tools branch May 29, 2026 17:41
toubatbrian added a commit that referenced this pull request May 29, 2026
#5865 added a *required* keyword-only `provider_tool_type` to
to_responses_fnc_ctx, but the dispatcher in `ToolContext.parse_function_tools`
just forwards `**kwargs` blindly:

    elif format == "openai.responses":
        return _provider_format.openai.to_responses_fnc_ctx(self, **kwargs)

So any caller that goes through `parse_function_tools("openai.responses")`
without explicitly threading provider_tool_type hits a TypeError. #5884
landed exactly such a caller (the new `test_serialized_tool_order_is_sorted`),
and main CI has been red on `tests/test_tools.py` since both PRs landed.

Make `provider_tool_type` optional with a `None` default and gate the
provider-tool isinstance branch on its presence. Behavior is unchanged
for legitimate callers (the openai.responses plugin always passes
`provider_tool_type=self._llm._provider_tool_type`); the `None` path
just emits function-tool schemas, which is the right thing for generic
serialization where no provider-tool subtype is in scope.

Verified:
- `uv run pytest tests/test_tools.py` -> 49 passed (was 1 failed)
- `uv run ruff check` / `ruff format --check` -> clean
- `uv run mypy -p livekit.agents.llm` -> clean

Co-authored-by: Cursor <cursoragent@cursor.com>
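
The regression and its fix, as described in the commit message above, can be reproduced in miniature. This is a sketch with stand-in code, not the real `to_responses_fnc_ctx`: the function names, the `dispatch` helper, and the use of plain dicts as function-tool schemas are assumptions made for illustration.

```python
from typing import Any, Callable, Optional


class ProviderTool:
    """Stand-in for the core provider-tool base class."""

    def to_dict(self) -> dict[str, Any]:
        return {"type": "provider_tool"}


def to_responses_required(
    tools: list[Any], *, provider_tool_type: type[ProviderTool]
) -> list[dict[str, Any]]:
    # Shape introduced by #5865: the keyword-only argument has no default.
    return [t.to_dict() for t in tools if isinstance(t, provider_tool_type)]


def to_responses_optional(
    tools: list[Any], *, provider_tool_type: Optional[type[ProviderTool]] = None
) -> list[dict[str, Any]]:
    schemas: list[dict[str, Any]] = []
    for tool in tools:
        if isinstance(tool, dict):  # stand-in for a function-tool schema
            schemas.append(tool)
        elif provider_tool_type is not None and isinstance(tool, provider_tool_type):
            # Provider tools are emitted only when the caller names a type.
            schemas.append(tool.to_dict())
    return schemas


def dispatch(serializer: Callable[..., Any], tools: list[Any], **kwargs: Any) -> Any:
    # Mirrors the blind **kwargs forwarding in the dispatcher.
    return serializer(tools, **kwargs)


tools = [{"type": "function", "name": "lookup"}, ProviderTool()]

try:
    dispatch(to_responses_required, tools)
    raised = False
except TypeError:
    raised = True  # missing required keyword-only argument

generic = dispatch(to_responses_optional, tools)
with_type = dispatch(to_responses_optional, tools, provider_tool_type=ProviderTool)
```

With the `None` default, a generic caller gets only function-tool schemas, while a plugin that passes its own type still gets its provider tools serialized.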