fix(completions): pass unknown chat tool types through to upstream by Evrard-Nil · Pull Request #676 · nearai/cloud-api

Evrard-Nil · 2026-05-26T17:37:38Z

Problem

extract_tools_from_extra in crates/services/src/completions/mod.rs had two interacting bugs:

let tools = extra.remove("tools").and_then(|v| {
    serde_json::from_value::<Vec<inference_providers::ToolDefinition>>(v).ok()
});

extra.remove(\"tools\") happens unconditionally, before parsing.
.ok() swallows parse errors silently.

Net effect: any chat-completions request whose tools array doesn't fit our typed ToolDefinition shape (i.e. anything other than {type:\"function\", function:{name,description,parameters}}) is silently dropped — removed from extra and parsed to None. The upstream never sees the tool, and the model never knows it could call it.

This blocked the web_context_search agent loop (nearai/inference-proxy#144) from running end-to-end through cloud-api. The inference-proxy expands {\"type\":\"web_context_search\"} into a proper function tool inside the CVM, but cloud-api was stripping it before the request ever got there.

Empirical evidence: POST /v1/chat/completions with tools:[{\"type\":\"web_context_search\"}] against GLM-5.1 returns prompt_tokens=19 (just the user message). The tool schema alone would add ~100 tokens — so cloud-api forwarded the request without it.

Fix

Only consume from extra on successful parse. If parsing fails, the raw value stays in extra and flows through to the upstream request body verbatim via the existing #[serde(flatten)] on ChatCompletionParams.extra. Same treatment for tool_choice for symmetry.

This is a permissive-passthrough change at the cloud-api layer: deciding whether a tool type is supported is the upstream's job (vLLM/SGLang via inference-proxy for our models, the provider for Gemini/Anthropic). External providers already only consume typed tools, so they're unaffected — non-function tools simply don't reach them.

Tests

cargo test -p services --lib extract_tools (4 new tests, all pass):

extract_tools_consumes_function_shaped_tools — regression cover for the existing path: standard function tools parse, get consumed from extra (so we don't end up with both the typed tools field and a flattened extra.tools, which would be a duplicate key in the upstream JSON).
extract_tools_passes_through_unknown_tool_types — the web_context_search regression: array remains in extra.
extract_tools_passes_through_mixed_unknown_in_array — an array mixing function and unknown stays fully opaque. Conservative: don't strip some entries; cloud-api isn't the right layer to half-parse arrays.
extract_tools_passes_through_unknown_tool_choice — symmetric guard for tool_choice.

Full suite: cargo test -p services --lib → 315 existing + 4 new pass. cargo clippy --all-targets -- -D warnings clean. cargo fmt --check clean.

Test plan after deploy

Plain (no-E2EE) POST /v1/chat/completions to cloud-api.near.ai with tools:[{\"type\":\"web_context_search\"}] against a GLM-5.1 deployment that has WEB_CONTEXT_SEARCH_* env set (gpu23/04/03/26 — done in cvm-conf v0.0.191). Expect: nearai_tool_result chunk + data: [DONE].
Same request with E2EE headers (X-Signing-Algo, X-Client-Pub-Key). Expect: encrypted nearai_tool_result.output on the wire, decryptable with the client key.
Existing chat completion + standard function-typed tool — should be byte-identical to today (the regression test covers this in code).

extract_tools_from_extra parses request.extra["tools"] as the typed Vec<ToolDefinition>; on parse failure it called .ok() which silently dropped the value, AND it removed the key from `extra` unconditionally before parsing. The combination meant any chat-completions request carrying a non-function-shaped tool (e.g. NEAR's namespaced `{"type":"web_context_search"}` that inference-proxy expands into a function tool inside the CVM) lost the entire tools array — cloud-api forwarded the request without it, and the model never saw the tool. Fix: only consume from `extra` on successful parse. If parsing fails, leave the raw value in place so it flows through to the upstream request body verbatim via the existing #[serde(flatten)] on ChatCompletionParams.extra. Same treatment for tool_choice for symmetry. This is a permissive-passthrough change at the cloud-api layer; the upstream (vLLM/SGLang via inference-proxy) is the right place to decide whether a tool type is supported. External providers (Gemini, Anthropic) already only see typed tools today, so they're unaffected. Tests: - extract_tools_consumes_function_shaped_tools — regression cover for the existing path: function tools parse, get consumed from extra (no duplicate keys when ChatCompletionParams serializes). - extract_tools_passes_through_unknown_tool_types — the web_context_search regression: array remains in extra. - extract_tools_passes_through_mixed_unknown_in_array — array with both function and unknown stays opaque (conservative: don't strip some entries from a mixed array). - extract_tools_passes_through_unknown_tool_choice — symmetric for tool_choice. Verified: cargo test -p services --lib (315 + 4 new pass), cargo clippy --all-targets, cargo fmt --check all clean.

gemini-code-assist

Code Review

This pull request refactors extract_tools_from_extra to only remove tools and tool_choice from the extra map if they successfully parse into the typed ToolDefinition and ToolChoice shapes. If parsing fails, they are left in extra to flow through to the upstream request body verbatim. It also adds comprehensive unit tests to verify this behavior. The reviewer suggested simplifying the nested match statements in extract_tools_from_extra using idiomatic Rust combinators like and_then and map to reduce boilerplate and improve readability.

gemini-code-assist · 2026-05-26T17:38:20Z

+        let tools = match extra.get("tools").cloned() {
+            Some(raw) => {
+                match serde_json::from_value::<Vec<inference_providers::ToolDefinition>>(raw) {
+                    Ok(parsed) => {
+                        extra.remove("tools");
+                        Some(parsed)
+                    }
+                    Err(_) => None,
+                }
+            }
+            None => None,
+        };

-        let tool_choice = extra
-            .remove("tool_choice")
-            .and_then(|v| serde_json::from_value::<inference_providers::ToolChoice>(v).ok());
+        let tool_choice = match extra.get("tool_choice").cloned() {
+            Some(raw) => match serde_json::from_value::<inference_providers::ToolChoice>(raw) {
+                Ok(parsed) => {
+                    extra.remove("tool_choice");
+                    Some(parsed)
+                }
+                Err(_) => None,
+            },
+            None => None,
+        };


The nested match statements can be simplified and made more idiomatic by using and_then and map combinators. This improves readability and reduces boilerplate.

let tools = extra.get("tools").and_then(|raw| { serde_json::from_value::<Vec<inference_providers::ToolDefinition>>(raw.clone()) .ok() .map(|parsed| { extra.remove("tools"); parsed }) }); let tool_choice = extra.get("tool_choice").and_then(|raw| { serde_json::from_value::<inference_providers::ToolChoice>(raw.clone()) .ok() .map(|parsed| { extra.remove("tool_choice"); parsed }) });

Copilot

Pull request overview

This PR fixes tool forwarding in the completions service by ensuring tools / tool_choice are only removed from extra when they successfully parse into the typed inference_providers shapes, preventing non-standard tool definitions from being silently dropped before reaching upstream.

Changes:

Update extract_tools_from_extra to only consume extra.tools / extra.tool_choice on successful typed parsing (otherwise leave raw JSON intact).
Add unit tests covering: standard function tools are consumed; unknown/mixed tool arrays and unknown tool_choice are preserved in extra.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+    /// function-tool shape — the original value is left in `extra` so it
+    /// flows through to the upstream request body verbatim via the
+    /// `#[serde(flatten)]` on `ChatCompletionParams.extra`. This means
+    /// cloud-api never silently drops tool definitions it doesn't
+    /// recognise; the upstream gets to decide.


+        // Symmetric to tools: a non-standard tool_choice string survives
+        // for upstream interpretation.


claude · 2026-05-26T17:41:37Z

Code Review

Scope: extract_tools_from_extra in crates/services/src/completions/mod.rs - fix for silent drop of unknown chat tool types, plus 4 unit tests.

Correctness

The fix is right and the bug is real. tools on ChatCompletionParams has skip_serializing_if = Option::is_none (crates/inference_providers/src/models.rs:217) and extra is serde flatten (models.rs:243). So when parse fails and the raw JSON stays in extra, it flattens into the outbound body without a duplicate tools key. Successful parse removes the key from extra, avoiding a duplicate-key collision on serialization - the first test explicitly guards this. tools and tool_choice are handled independently, so partial success behaves correctly.

Behavior change worth flagging

For OpenAI-compatible external backends (crates/inference_providers/src/external/openai_compatible.rs:151) the entire ChatCompletionParams is serialized as JSON, so the flattened extra does reach upstream. The PR description says external providers only consume typed tools and are unaffected - that holds for Gemini (gemini/mod.rs:142 builds its own request struct) and Anthropic, but not for OpenAI-compatible providers (OpenAI, Together, Groq, Fireworks).

Practical effect: before this PR a client sending a web_context_search tool through an OpenAI-compatible route got the tool silently dropped and a successful response. After this PR that request will surface an upstream 400. Arguably the right behavior (upstream owns the decision) but a user-visible change the PR description understates. Worth a one-line note; no code change required.

Style nit (already raised by gemini-code-assist)

The nested match blocks could collapse to an and_then + ok chain that removes only on success. Equivalent, half the lines. Not blocking - the current version is also readable.

Minor: redundant clone

extra.get(tools).cloned() clones the JSON value in both success and failure paths; the success path then calls remove, throwing away the clone. A remove-first / insert-back-on-failure pattern would avoid it, but the perf delta for a small tools array is negligible.

Tests

Four tests cover: function-shaped happy path, unknown type passthrough, mixed-array conservatism, and tool_choice symmetry. The mixed-array case is the right call - half-parsing arrays in this layer would be worse than passing through opaque.

Privacy / logging

No new log statements. Compliant with CLAUDE.md.

Approved. One ask: add a sentence to the PR description noting that OpenAI-compatible upstreams will now see unknown tools that were previously absorbed silently. Everything else is optional.

PierreLeGuen

Reviewed the live /v1/chat/completions service path plus the relevant outbound provider paths.

I do not see a blocking issue. The helper now only consumes tools / tool_choice when they parse into the typed shape; otherwise the raw values stay in extra, which is what the direct pass-through providers serialize. Provider-native adapters still rebuild their own request bodies from typed fields, so this does not unexpectedly enable unsupported raw tool shapes there.

Verified locally:

cargo fmt --check
git diff --check origin/main...HEAD
cargo test -p services --lib extract_tools
cargo test -p services --lib
cargo test -p inference_providers --lib strip_reasoning_effort
cargo test -p inference_providers --lib flatten

GitHub checks are green. Non-blocking: the PR text should be explicit that pass-through providers may now surface upstream validation errors for tool shapes that were previously silently dropped.

`ChatDelta` is a typed struct with no `#[serde(flatten)]` catch-all, so any field the upstream emits that's not in our explicit field list is silently dropped on deserialize and never reaches the client. This bites the inference-proxy's server-side agent loop (nearai/inference-proxy#144): the proxy emits a synthetic `delta.nearai_tool_result` chunk between iterations carrying the tool's grounded output. Empirically, against a freshly-deployed staging running #676, the model successfully called `web_context_search` (we saw the tool_calls chunks on the wire), but the proxy's `nearai_tool_result` chunks never made it through to the client — cloud-api stripped them on chunk re-serialization. Same root cause as #676 on the request side: typed structs with no catch-all. Fix: add a flattened `extra: HashMap<String, serde_json::Value>` to `ChatDelta`. Unknown delta fields are now preserved verbatim on the deserialize-then-re-serialize round trip. Derived `Default` on the struct so the empty catch-all is `HashMap::new()` by default, then updated the handful of explicit `ChatDelta { ... }` literals across the workspace to set `extra: Default::default()`. Regression test: `chat_delta_preserves_unknown_fields_round_trip` constructs a chunk with `delta.nearai_tool_result`, deserializes it, asserts the catch-all contains the expected payload, and round-trips the re-serialized JSON to confirm the synthetic field survives. `cargo test --workspace --lib` clean (315 + 174 + 106 + 16 + 11 passing, including the new test). `cargo clippy --all-targets -- -D warnings` clean.

Copilot AI review requested due to automatic review settings May 26, 2026 17:37

Evrard-Nil temporarily deployed to Cloud API test env May 26, 2026 17:37 — with GitHub Actions Inactive

Copilot started reviewing on behalf of Evrard-Nil May 26, 2026 17:37 View session

gemini-code-assist Bot reviewed May 26, 2026

View reviewed changes

Copilot AI reviewed May 26, 2026

View reviewed changes

PierreLeGuen approved these changes May 26, 2026

View reviewed changes

Evrard-Nil merged commit f017653 into main May 26, 2026
6 checks passed

This was referenced May 26, 2026

add encrypted_agent_loop_verifier: E2EE web_context_search demo nearai/nearai-cloud-verifier#26

Open

fix(completions): preserve unknown chat delta fields end-to-end #677

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(completions): pass unknown chat tool types through to upstream#676

fix(completions): pass unknown chat tool types through to upstream#676
Evrard-Nil merged 1 commit into
mainfrom
feat/passthrough-unknown-tool-types

Evrard-Nil commented May 26, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot May 26, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

claude Bot commented May 26, 2026

Uh oh!

PierreLeGuen left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

		// Symmetric to tools: a non-standard tool_choice string survives
		// for upstream interpretation.

Conversation

Evrard-Nil commented May 26, 2026

Problem

Fix

Tests

Test plan after deploy

Related

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 26, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

claude Bot commented May 26, 2026

Uh oh!

PierreLeGuen left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants