feat: support local refs and defs in tool input schemas #23357
celia-oai wants to merge 4 commits into
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4a6f28279e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f9bf071758
    #[serde(rename = "$defs", skip_serializing_if = "Option::is_none")]
    pub defs: Option<BTreeMap<String, JsonSchema>>,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub definitions: Option<BTreeMap<String, JsonSchema>>,
Cap preserved schema definitions before exposing them
For MCP/dynamic tools, these newly serialized $defs/definitions become part of the model-visible tool parameters on every request. A connector can return a reachable definition containing a huge enum, description, or nested schema; prune_unreachable_definitions only checks reachability and does not impose any byte/token cap, so this can inject >1k or >10k tokens into context. The context-review rules require hard caps for injected items, so please truncate, reject, or otherwise bound preserved definitions before serialization.
Maybe we can address the token cap issue separately as a follow-up; I don't think it's blocking this PR.
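For reference, the kind of bound requested in this thread could look roughly like the sketch below. It is only an illustration under stated assumptions: `check_definitions_budget`, `DefinitionsTooLarge`, and the idea of measuring already-serialized definitions in bytes are hypothetical and are not part of this PR. A real cap might count tokens, or truncate instead of rejecting.

```rust
use std::collections::BTreeMap;

/// Hypothetical error type: the preserved definitions exceed the byte budget.
#[derive(Debug, PartialEq, Eq)]
struct DefinitionsTooLarge {
    total_bytes: usize,
    limit: usize,
}

/// Sum the size of every definition name plus its serialized body and
/// reject the table when the total exceeds `limit` bytes.
/// Assumption: each definition has already been serialized to a JSON string.
fn check_definitions_budget(
    serialized_defs: &BTreeMap<String, String>,
    limit: usize,
) -> Result<usize, DefinitionsTooLarge> {
    let total_bytes: usize = serialized_defs
        .iter()
        .map(|(name, body)| name.len() + body.len())
        .sum();
    if total_bytes > limit {
        Err(DefinitionsTooLarge { total_bytes, limit })
    } else {
        Ok(total_bytes)
    }
}
```

Rejecting keeps the sketch small. Truncating a definition would also need every `$ref` that points at it to be rewritten, which is a larger design question.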
    }

    #[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
    enum DefinitionTable {
Do we need this? Can we pass the string around like above in sanitize_schema_table(map, "$defs")?
pakrym-oai left a comment
Let's have a limit on tool size before we merge this one.

Why
Some connector tool input schemas use local JSON Schema references and definition tables to avoid duplicating large nested shapes. Codex previously lowered these schemas into the supported subset in a way that could discard $ref-only schema objects and lose the corresponding definitions, which made non-strict tool registration less faithful than the original connector schema.
This keeps the existing minimal-lowering policy: Codex still does not raw-pass through arbitrary JSON Schema, but it now preserves local reference structure that fits the Responses-compatible subset and prunes definition entries that cannot be reached by following $refs from the root schema after sanitization, including refs found transitively inside other reachable definitions. The pruning matters because Responses parses definition tables even when entries are unused, so keeping dead definitions wastes prompt tokens.

What changed
- $ref, $defs, and legacy definitions fields to the tool JsonSchema representation.
- parse_tool_input_schema lowering so $ref-only schema objects survive sanitization instead of becoming {}.
- #/$defs/Foo~1Bar.

Verification
Ran local golden-schema probes against representative connector schemas to validate behavior on real generated schemas:
[Table: $defs before -> after and $ref before -> after for google_calendar/create_space, figma/apply_file_variable_changes, snowflake/list_catalog_integrations, and dropbox/create_shared_link, with the note "oneOf shapes lower away"; values not available.]
Token increase across golden schema due to this change:
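Returning to the pruning rule described under Why, the reachability walk can be sketched as follows. This is not the PR's implementation. It assumes a deliberately simplified model in which each definition is reduced to the raw $ref strings it contains, and the names `defs_key` and `prune_unreachable_sketch` are hypothetical. It handles only the $defs table and decodes JSON Pointer escapes (`~1` to `/`, then `~0` to `~`), so a ref such as #/$defs/Foo~1Bar resolves to the key Foo/Bar.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Decode a local ref like `#/$defs/Foo~1Bar` into the `$defs` key it names.
/// Returns None for non-local refs and for pointers that go deeper than a
/// direct table entry (out of scope for this sketch).
fn defs_key(reference: &str) -> Option<String> {
    let raw = reference.strip_prefix("#/$defs/")?;
    if raw.contains('/') {
        return None;
    }
    // JSON Pointer unescaping: `~1` becomes `/` first, then `~0` becomes `~`.
    Some(raw.replace("~1", "/").replace("~0", "~"))
}

/// Keep only definitions reachable from the root's refs, following refs
/// transitively through other reachable definitions.
/// Simplified model: the map value is the list of `$ref` strings found
/// inside that definition.
fn prune_unreachable_sketch(root_refs: &[&str], defs: &mut BTreeMap<String, Vec<String>>) {
    let mut reachable = BTreeSet::new();
    let mut pending: Vec<String> = root_refs.iter().filter_map(|r| defs_key(r)).collect();

    while let Some(name) = pending.pop() {
        // Dangling refs name no definition, so there is nothing to keep.
        let Some(inner_refs) = defs.get(&name) else { continue };
        // `insert` returns false for names already visited, which also
        // stops the walk on reference cycles.
        if reachable.insert(name) {
            pending.extend(inner_refs.iter().filter_map(|r| defs_key(r)));
        }
    }

    defs.retain(|name, _| reachable.contains(name));
}
```

With a root that references #/$defs/Foo~1Bar, a definition Foo/Bar that references Leaf, and an unreferenced definition Dead, the walk keeps Foo/Bar and Leaf and drops Dead, even if Dead itself points at a reachable entry.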
