fix(copilot): tolerate SDK metadata parsing errors when listing models#193
Merged
Conversation
Copilot SDK >=0.3.0 eagerly parses every model in its `models.list`
response with dataclass `from_dict` helpers. When any required field is
missing the parser raises `ValueError` and the entire listing fails.
`claude-opus-4.7-1m-internal` ships a `billing` object without the
required `multiplier` field, which makes `ModelBilling.from_dict`
raise `ValueError("Missing required field 'multiplier' in
ModelBilling")`.
Both `get_max_prompt_tokens` and `_validate_reasoning_effort_for_model`
caught only `(TimeoutError, ProviderError, OSError, RuntimeError)` at
the SDK boundary, so the `ValueError` leaked through and surfaced as
`Dialog turn failed: ...` after exhausting the retry loop. This was
particularly disruptive for workflows that configure
`reasoning.effort` on the Copilot provider, since the validation step
runs before `create_session` and blocks every agent invocation.
Broaden both catches to `Exception` (still excludes
`asyncio.CancelledError`/`KeyboardInterrupt`/`SystemExit`) and
document the rationale at both call sites. The behavior matches the
existing docstring contract: "capability metadata must never block a
workflow that the SDK might otherwise accept."
Updated `test_unexpected_exception_propagates` (which asserted the
opposite contract) into `test_value_error_from_sdk_parser_returns_none`
and added `test_value_error_from_sdk_parser_skips_validation` covering
the reasoning-effort path with the exact upstream error message.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
aae8854 to
9b11908
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Workflows that configure
reasoning.effort(or workflow-wideruntime.default_reasoning_effort) on the Copilot provider were failing with:after exhausting the retry loop. The bug affects every named Copilot model
(see "Blast radius" below) — it just wasn't caught earlier because few
workflows exercise the reasoning-effort path on Copilot.
Root cause
github-copilot-sdk0.3.0 eagerly parses every entry in itsmodels.listresponse with dataclassfrom_dicthelpers. TheModelBilling.from_dictparser(client.py:496-507)
treats
multiplieras required and raisesValueError("Missing required field 'multiplier' in ModelBilling")if it'sabsent — and none of the currently-served models include it. Because the
parsing is done in a list-comprehension, a single malformed entry kills the
whole
list_models()call, and the SDK's response cache is only populatedafter a successful parse, so every call from a fresh session fails the same way.
Both
_validate_reasoning_effort_for_model(called before everycreate_sessionwhenreasoning.effortis set) andget_max_prompt_tokenscaught only
(TimeoutError, ProviderError, OSError, RuntimeError)at theSDK boundary, so the
ValueErrorleaked through. In the reasoning-effortpath it then poisoned the retry loop and surfaced as the user-visible
Dialog turn failed: ...error. (get_max_prompt_tokenswas rescued bythe engine's outer
except Exceptionatengine/workflow.py:1071,
so context-window metadata was silently unavailable rather than fatal.)
Blast radius
Inspected the live response on this machine — 18 of 19 returned models ship
a
billingobject withoutmultiplier:The only "clean" entry is
auto, which hasbilling: Noneso the parsershort-circuits.
Fix
Broaden both catches to
Exception(still excludesasyncio.CancelledError/KeyboardInterrupt/SystemExit, all ofwhich are
BaseExceptionsubclasses) and document the SDK schema-failurecase at both call sites. This matches the existing docstring contract:
"capability metadata must never block a workflow that the SDK might
otherwise accept."
The narrow tuple was written before anyone realized the SDK would raise
non-OSError exceptions in normal operation.
Tests
test_unexpected_exception_propagates(which asserted theopposite of the docstring contract) with
test_value_error_from_sdk_parser_returns_none, using the exactupstream error message for clarity.
test_value_error_from_sdk_parser_skips_validationcovering thereasoning-effort path: confirms the workflow proceeds normally and
that
reasoning_effortis still forwarded tocreate_session.2579 passed, 15 skipped, 29 deselectedin 7m36s.Note
This is an internal-model regression on the Copilot SDK side, not worth
filing upstream — the missing field is a temporary backend quirk and the
SDK will likely make
multiplieroptional in a future release. Our fixhardens the code path regardless.
Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com