fix(pricing): prevent cross-family fuzzy match + warn on fallback by jrob5756 · Pull Request #143 · microsoft/conductor

jrob5756 · 2026-05-04T14:05:21Z

Summary

Closes #137. Three related changes to get_pricing():

1. Boundary check on prefix matching

The longest-prefix fallback now requires a - delimiter after the matched key (model.startswith(known_model + "-")). Without it, names that share a textual prefix with a known key but belong to a different model family silently inherited the wrong context_window and pricing. The four repro names from #137 now correctly return None and degrade gracefully (dashboard hides the bar; cost is null) rather than reporting confidently wrong data:

| Requested | Before | After |
| --- | --- | --- |
| `claude-opus-4.7` | matched `claude-opus-4` (200K) ❌ | `None` ✓ |
| `claude-opus-4.7-high` | matched `claude-opus-4` (200K) ❌ | `None` ✓ |
| `claude-opus-4.7-xhigh` | matched `claude-opus-4` (200K) ❌ | `None` ✓ |
| `claude-opus-4.7-1m-internal` | matched `claude-opus-4` (200K) ❌ | `None` ✓ |
| `claude-sonnet-4-20250514` | matched `claude-sonnet-4` ✓ | matched `claude-sonnet-4` ✓ |
| `claude-3-5-sonnet-latest` | matched `claude-3-5-sonnet` ✓ | matched `claude-3-5-sonnet` ✓ |
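The boundary rule can be sketched roughly as follows. This is a minimal illustration, not the code in src/conductor/engine/pricing.py: the `KNOWN_KEYS` tuple and the `match_known_model` helper are hypothetical stand-ins, and the real function returns a pricing entry rather than a key.

```python
from __future__ import annotations

# Hypothetical stand-in for the keys of the real pricing table.
KNOWN_KEYS: tuple[str, ...] = (
    "claude-opus-4",
    "claude-sonnet-4",
    "claude-3-5-sonnet",
)


def match_known_model(model: str) -> str | None:
    """Return the known key that `model` resolves to, or None."""
    if model in KNOWN_KEYS:
        return model  # exact match
    # Try longer keys first so the most specific key wins.
    for known in sorted(KNOWN_KEYS, key=len, reverse=True):
        # Require a "-" directly after the key, so "claude-opus-4.7"
        # is not treated as a versioned form of "claude-opus-4".
        if model.startswith(known + "-"):
            return known
    return None
```

Under this sketch a name such as `claude-opus-4-1m-internal` would still resolve to `claude-opus-4`, because the key is followed by `-`; the one-time warning is what makes that kind of fallback visible.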

2. One-time warning on non-exact match

When a versioned-suffix match still happens (legitimate dated/-latest names), get_pricing() now emits a one-time logging.warning per requested name, naming the requested model, the matched key, and the strategy. Exact matches, overrides, and unknown models (returning None) do not warn. De-duped via a module-level set so hot-loop callers don't spam logs.

3. Dead code removal

The suffix-strip branches in get_pricing() were unreachable — longest-prefix runs first and catches anything they would have simplified. Removed for clarity.

Behavior change note

This is technically a behavior change: any caller relying on the loose prefix matching (e.g. a hypothetical gpt-5.4-mini silently inheriting gpt-5 pricing) will now get None instead. That's the asymmetric "unknown name degrades gracefully" path the issue reporter explicitly prefers — and matches the spirit of one of the alternatives proposed in #137 ("drop the longest-prefix step"; this is a softer version that preserves the dated-version use case).
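For callers, the practical consequence is that a `None` result has to be handled rather than assumed away. A minimal sketch of that pattern, assuming a hypothetical `Pricing` dataclass with per-million-token rates (the real `ModelPricing` fields are not shown in this PR):

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical pricing record; field names are illustrative only."""

    input_per_mtok: float
    output_per_mtok: float


def estimate_cost(
    pricing: Pricing | None, input_tokens: int, output_tokens: int
) -> float | None:
    """Return the cost, or None when the model has no pricing entry."""
    if pricing is None:
        # Unknown model: report "no data" instead of a confident wrong number.
        return None
    total = (
        input_tokens * pricing.input_per_mtok
        + output_tokens * pricing.output_per_mtok
    )
    return total / 1_000_000
```

Propagating `None` to the display layer is what lets a dashboard hide a value instead of showing one derived from the wrong model family.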

All existing tests pass unchanged, including tests/test_providers/test_context_window.py::TestPrefixMatch.

Changes

src/conductor/engine/pricing.py
- Added module-level logger, _FUZZY_MATCH_WARNED: set[str] for de-dupe.
- New _warn_fuzzy_match() helper.
- Tightened prefix match with + "-" delimiter check.
- Removed unreachable suffix-strip branches.
- Updated the docstring to describe the new, narrower matching semantics.
tests/test_engine/test_pricing.py — new TestFuzzyMatchWarnings class (7 tests):
- exact match / override / unknown model → no warn
- cross-family names from the issue → return None, no warn
- versioned-suffix match → warns once with model name, matched key, strategy
- same model called repeatedly → warns once
- different fuzzy-matched models → each warns once
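The warning cases listed above can be checked by capturing log records. The sketch below is self-contained and does not reproduce the PR's tests: `lookup()` is a hypothetical stand-in for get_pricing() with a single known key, and the capture handler is plain standard-library logging rather than whatever fixture the real suite uses.

```python
from __future__ import annotations

import logging

logger = logging.getLogger("pricing_sketch")
_warned: set[str] = set()


def lookup(model: str) -> str | None:
    """Hypothetical stand-in for get_pricing(); returns the matched key."""
    known = "claude-sonnet-4"
    if model == known:
        return known  # exact match: no warning
    if model.startswith(known + "-"):
        if model not in _warned:
            _warned.add(model)
            logger.warning("fuzzy match: %s -> %s (versioned-suffix)", model, known)
        return known
    return None  # unknown model: no warning


class _Capture(logging.Handler):
    """Collects formatted messages so a test can count them."""

    def __init__(self) -> None:
        super().__init__(level=logging.WARNING)
        self.messages: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


def warnings_for(models: list[str]) -> list[str]:
    """Run lookup() over `models` and return the warnings it emitted."""
    _warned.clear()  # reset de-dupe state so each run is independent
    handler = _Capture()
    logger.addHandler(handler)
    try:
        for name in models:
            lookup(name)
    finally:
        logger.removeHandler(handler)
    return handler.messages
```

Resetting the de-dupe set between runs matters in a real test suite too: module-level state otherwise leaks across tests and makes "warns once" assertions order-dependent.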

Testing

$ uv run pytest tests/test_engine/test_pricing.py -q
27 passed in 3.41s

$ uv run pytest tests/test_engine/ -q
466 passed in 26.90s

$ uv run pytest tests/test_providers/test_context_window.py -q
18 passed in 0.58s

$ uv run ruff check src/conductor/engine/pricing.py tests/test_engine/test_pricing.py
All checks passed!

When get_pricing() resolved a model name via the longest-prefix or suffix-strip fallback paths, it silently returned a sibling model's ModelPricing, including its context_window, with no log line. Names like "claude-opus-4-1m-internal" inherited claude-opus-4's 200K window even though the suffix suggests 1M, and the dashboard and cost calculation treated those numbers as authoritative.

This change emits a one-time logging.warning per requested model name when get_pricing() returns a non-exact entry, naming both the requested model and the matched key. Exact matches, overrides, and unknown models (None) do not warn. De-duped via a module-level set so hot-loop callers don't spam logs.

Behavior is otherwise unchanged: this is the smallest viable change suggested in #137 and is fully backward-compatible.

Closes #137

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…ix-strip code

Tightens the longest-prefix fallback in get_pricing() to require a '-' delimiter after the matched key (`model.startswith(known_model + "-")`). Without the delimiter, names that share a textual prefix with a known key but belong to a different model family (e.g. claude-opus-4.7-high matching claude-opus-4) silently inherited the wrong context_window and pricing. The four repro names from #137 now correctly return None and degrade gracefully (dashboard hides the bar; cost is null) rather than reporting confidently wrong data. Real versioned names like claude-sonnet-4-20250514 still match claude-sonnet-4 because the date suffix is preceded by '-'.

Also removes the suffix-strip and suffix-strip+longest-prefix branches: they were unreachable because longest-prefix runs first and catches every name they would have simplified.

Updates the strategy label in fuzzy-match warnings from "longest-prefix" to "versioned-suffix" to reflect the new, narrower matching semantics.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…gelog Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…ion (#129)

* fix(copilot): pass streaming=True to SDK to prevent tool-call truncation

The Copilot SDK's create_session accepts a 'streaming' parameter that defaults to false. In non-streaming mode the model must emit its entire turn (text + tool_use blocks + arguments) under a single per-turn output budget. For agents that issue large tool-call arguments (most commonly 'create' with multi-KB 'file_text') that budget is exhausted mid-JSON and the CLI silently executes the partial tool call (path only, no file_text). The model sees the tool succeed with no content, retries the same broken call, and loops indefinitely until the wall-clock session limit fires (default 1800s).

The interactive 'copilot' CLI defaults to streaming, which is why the same model + tool combination works there but not via the SDK without this flag.

Empirically verified red→green on the same workflow + model (claude-opus-4.7-1m-internal, single ~50 KB create tool call):
- Without streaming=True: 9m08s wall-clock failure, 0 bytes written (ProviderError: tool 'create' was executing).
- With streaming=True: 4m57s success, 62 KB written in a single create call.

Tests:
- tests/test_providers/test_copilot_streaming.py: unit test that verifies create_session is called with streaming=True (and that the existing required kwargs are preserved).
- tests/test_integration/test_copilot_large_write.py: opt-in (real_api marker) regression test that builds a workflow inline, asks the writer agent to produce a single large create call, and asserts the file is at least 30 KB. Skips automatically when no copilot CLI is available.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: add changelog entry for streaming fix (#129)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: add #107 and #109 to unreleased changelog

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: add #100, #110, #111, #139, #142, #143, #144 to unreleased changelog

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

jrob5756 and others added 2 commits May 4, 2026 10:04

jrob5756 force-pushed the fix/pricing-fuzzy-match-warning branch from fd6cd61 to c4d7b52 Compare May 4, 2026 16:24

jrob5756 changed the title ~~fix(pricing): warn once when get_pricing falls back to fuzzy match~~ fix(pricing): prevent cross-family fuzzy match + warn on fallback May 4, 2026

jrob5756 merged commit a28c8ab into main May 4, 2026
7 checks passed

jrob5756 deleted the fix/pricing-fuzzy-match-warning branch May 4, 2026 16:40

jrob5756 added a commit that referenced this pull request May 4, 2026

docs: add #100, #110, #111, #139, #142, #143, #144 to unreleased chan…

74682ea

…gelog Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

jrob5756 mentioned this pull request May 5, 2026

docs: changelog + doc updates for unreleased PRs #147

Merged

3 tasks

jrob5756 merged 2 commits into main from fix/pricing-fuzzy-match-warning
