fix(pricing): prevent cross-family fuzzy match + warn on fallback#143

Merged
jrob5756 merged 2 commits into main from fix/pricing-fuzzy-match-warning
May 4, 2026

@jrob5756 jrob5756 commented May 4, 2026

Summary

Closes #137. Three related changes to get_pricing():

1. Boundary check on prefix matching

The longest-prefix fallback now requires a - delimiter after the matched key (model.startswith(known_model + "-")). Without it, names that share a textual prefix with a known key but belong to a different model family silently inherited the wrong context_window and pricing. The four repro names from #137 now correctly return None and degrade gracefully (dashboard hides the bar; cost is null) rather than reporting confidently wrong data:

| Requested | Before | After |
| --- | --- | --- |
| claude-opus-4.7 | matched claude-opus-4 (200K) ❌ | None |
| claude-opus-4.7-high | matched claude-opus-4 (200K) ❌ | None |
| claude-opus-4.7-xhigh | matched claude-opus-4 (200K) ❌ | None |
| claude-opus-4.7-1m-internal | matched claude-opus-4 (200K) ❌ | None |
| claude-sonnet-4-20250514 | matched claude-sonnet-4 | matched claude-sonnet-4 |
| claude-3-5-sonnet-latest | matched claude-3-5-sonnet | matched claude-3-5-sonnet |
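
A minimal sketch of the delimiter-bounded prefix check, for illustration only and not the actual code in pricing.py. `KNOWN_MODELS` and `match_versioned_prefix` are hypothetical names, and the sketch assumes exact matches and overrides are resolved before this step runs.

```python
from typing import Optional, Sequence

# Hypothetical stand-in for the keys of the real pricing table.
KNOWN_MODELS = ["claude-opus-4", "claude-sonnet-4", "claude-3-5-sonnet", "gpt-5"]


def match_versioned_prefix(
    model: str, known: Sequence[str] = KNOWN_MODELS
) -> Optional[str]:
    """Return the longest known key that `model` extends with a '-' suffix.

    Requiring the '-' delimiter means "claude-opus-4.7" no longer matches
    "claude-opus-4", while "claude-sonnet-4-20250514" still matches
    "claude-sonnet-4".
    """
    candidates = [key for key in known if model.startswith(key + "-")]
    return max(candidates, key=len) if candidates else None


for name in ("claude-opus-4.7-high", "claude-sonnet-4-20250514", "gpt-5.4-mini"):
    print(name, "->", match_versioned_prefix(name))
```

Under these assumptions the cross-family names fall through to None, and only names with a `-`-separated suffix resolve to a known key.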

2. One-time warning on non-exact match

When a versioned-suffix match still happens (legitimate dated/-latest names), get_pricing() now emits a one-time logging.warning per requested name, naming the requested model, the matched key, and the strategy. Exact matches, overrides, and unknown models (returning None) do not warn. De-duped via a module-level set so hot-loop callers don't spam logs.

3. Dead code removal

The suffix-strip branches in get_pricing() were unreachable — longest-prefix runs first and catches anything they would have simplified. Removed for clarity.

Behavior change note

This is technically a behavior change: any caller relying on the loose prefix matching (e.g. a hypothetical gpt-5.4-mini silently inheriting gpt-5 pricing) will now get None instead. That's the asymmetric "unknown name degrades gracefully" path the issue reporter explicitly prefers — and matches the spirit of one of the alternatives proposed in #137 ("drop the longest-prefix step"; this is a softer version that preserves the dated-version use case).
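
For callers, the practical consequence is that a None result has to be handled. A hedged sketch of what "degrades gracefully" could look like on the consuming side: `ModelPricing` and `context_window` are named in this PR, but the `input_per_mtok` field and both helper functions are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelPricing:
    input_per_mtok: float  # hypothetical field: USD per million input tokens
    context_window: int


def context_usage(pricing: Optional[ModelPricing], tokens_used: int) -> Optional[float]:
    """Fraction of the context window in use, or None when the model is unknown."""
    if pricing is None:
        return None  # caller hides the usage bar instead of guessing
    return tokens_used / pricing.context_window


def input_cost(pricing: Optional[ModelPricing], tokens: int) -> Optional[float]:
    """Input cost in USD, or None when no pricing is available."""
    if pricing is None:
        return None  # reported as null rather than a wrong number
    return pricing.input_per_mtok * tokens / 1_000_000
```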

All existing tests pass unchanged, including tests/test_providers/test_context_window.py::TestPrefixMatch.

Changes

  • src/conductor/engine/pricing.py
    • Added module-level logger, _FUZZY_MATCH_WARNED: set[str] for de-dupe.
    • New _warn_fuzzy_match() helper.
    • Tightened prefix match with + "-" delimiter check.
    • Removed unreachable suffix-strip branches.
    • Updated the docstring to describe the new, narrower matching semantics.
  • tests/test_engine/test_pricing.py — new TestFuzzyMatchWarnings class (7 tests):
    • exact match / override / unknown model → no warn
    • cross-family names from the issue → return None, no warn
    • versioned-suffix match → warns once with model name, matched key, strategy
    • same model called repeatedly → warns once
    • different fuzzy-matched models → each warns once
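
The warn-once and no-warn cases listed above could be exercised roughly like this. It is a self-contained sketch using the standard library's unittest rather than the project's pytest suite; `get_context_window`, `_TABLE`, `_WARNED`, and the logger name are hypothetical stand-ins for get_pricing() and its module state.

```python
import logging
import unittest

logger = logging.getLogger("pricing_sketch")
_WARNED: set[str] = set()
# Hypothetical table mapping model key -> context window.
_TABLE = {"claude-opus-4": 200_000, "claude-sonnet-4": 200_000}


def get_context_window(model: str):
    """Toy stand-in for get_pricing(): exact match, then '-'-bounded prefix."""
    if model in _TABLE:
        return _TABLE[model]
    keys = [k for k in _TABLE if model.startswith(k + "-")]
    if not keys:
        return None
    key = max(keys, key=len)
    if model not in _WARNED:
        _WARNED.add(model)
        logger.warning("fuzzy match: %s -> %s (versioned-suffix)", model, key)
    return _TABLE[key]


class TestFuzzyMatchWarnings(unittest.TestCase):
    def setUp(self):
        _WARNED.clear()  # reset module-level de-dupe state between tests

    def test_warns_once_per_model(self):
        with self.assertLogs(logger, level="WARNING") as cm:
            get_context_window("claude-sonnet-4-20250514")
            get_context_window("claude-sonnet-4-20250514")
        self.assertEqual(len(cm.output), 1)
        self.assertIn("claude-sonnet-4-20250514", cm.output[0])

    def test_cross_family_and_exact_do_not_warn(self):
        with self.assertLogs(logger, level="WARNING") as cm:
            logger.warning("sentinel")  # assertLogs fails if nothing is logged
            self.assertIsNone(get_context_window("claude-opus-4.7-high"))
            self.assertEqual(get_context_window("claude-opus-4"), 200_000)
        self.assertEqual(cm.output, ["WARNING:pricing_sketch:sentinel"])
```

Saved to a file, the sketch can be run with `python -m unittest <file>`. Clearing the module-level set in `setUp` matters because the de-dupe state would otherwise leak between tests.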

Testing

$ uv run pytest tests/test_engine/test_pricing.py -q
27 passed in 3.41s

$ uv run pytest tests/test_engine/ -q
466 passed in 26.90s

$ uv run pytest tests/test_providers/test_context_window.py -q
18 passed in 0.58s

$ uv run ruff check src/conductor/engine/pricing.py tests/test_engine/test_pricing.py
All checks passed!

jrob5756 and others added 2 commits May 4, 2026 10:04
When get_pricing() resolves a model name via the longest-prefix or
suffix-strip fallback paths, it silently returned a sibling model's
ModelPricing — including its context_window — with no log line. Names
like "claude-opus-4-1m-internal" inherited claude-opus-4's 200K window
even though the suffix suggests 1M, and the dashboard / cost calc
treated those numbers as authoritative.

This change emits a one-time logging.warning per requested model name
when get_pricing() returns a non-exact entry, naming both the
requested model and the matched key. Exact matches, overrides, and
unknown models (None) do not warn. De-duped via a module-level set so
hot-loop callers don't spam logs.

Behavior is otherwise unchanged — this is the smallest viable change
suggested in #137 and is fully backward-compatible.

Closes #137

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ix-strip code

Tightens the longest-prefix fallback in get_pricing() to require a '-'
delimiter after the matched key (`model.startswith(known_model + "-")`).
Without the delimiter, names that share a textual prefix with a known
key but belong to a different model family — e.g. claude-opus-4.7-high
matching claude-opus-4 — silently inherited the wrong context_window
and pricing. The four repro names from #137 now correctly return None
and degrade gracefully (dashboard hides the bar; cost is null) rather
than reporting confidently wrong data.

Real versioned names like claude-sonnet-4-20250514 still match
claude-sonnet-4 because the date suffix is preceded by '-'.

Also removes the suffix-strip and suffix-strip+longest-prefix branches:
they were unreachable because longest-prefix runs first and catches
every name they would have simplified.

Updates the strategy label in fuzzy-match warnings from "longest-prefix"
to "versioned-suffix" to reflect the new, narrower matching semantics.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@jrob5756 jrob5756 force-pushed the fix/pricing-fuzzy-match-warning branch from fd6cd61 to c4d7b52 Compare May 4, 2026 16:24
@jrob5756 jrob5756 changed the title fix(pricing): warn once when get_pricing falls back to fuzzy match fix(pricing): prevent cross-family fuzzy match + warn on fallback May 4, 2026
@jrob5756 jrob5756 merged commit a28c8ab into main May 4, 2026
7 checks passed
@jrob5756 jrob5756 deleted the fix/pricing-fuzzy-match-warning branch May 4, 2026 16:40
jrob5756 added a commit that referenced this pull request May 4, 2026
…gelog

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
jrob5756 added a commit that referenced this pull request May 4, 2026
…ion (#129)

* fix(copilot): pass streaming=True to SDK to prevent tool-call truncation

The Copilot SDK's create_session accepts a 'streaming' parameter that
defaults to false. In non-streaming mode the model must emit its entire
turn (text + tool_use blocks + arguments) under a single per-turn output
budget. For agents that issue large tool-call arguments — most commonly
'create' with multi-KB 'file_text' — that budget is exhausted mid-JSON
and the CLI silently executes the partial tool call (path only, no
file_text). The model sees the tool succeed with no content, retries the
same broken call, and loops indefinitely until the wall-clock session
limit fires (default 1800s). The interactive 'copilot' CLI defaults to
streaming, which is why the same model + tool combination works there
but not via the SDK without this flag.

Empirically verified red→green on the same workflow + model
(claude-opus-4.7-1m-internal, single ~50 KB create tool call):
- Without streaming=True: 9m08s wall-clock failure, 0 bytes written
  (ProviderError: tool 'create' was executing).
- With streaming=True: 4m57s success, 62 KB written in a single
  create call.

Tests:
- tests/test_providers/test_copilot_streaming.py — unit test that
  verifies create_session is called with streaming=True (and that the
  existing required kwargs are preserved).
- tests/test_integration/test_copilot_large_write.py — opt-in
  (real_api marker) regression test that builds a workflow inline,
  asks the writer agent to produce a single large create call, and
  asserts the file is at least 30 KB. Skips automatically when no
  copilot CLI is available.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: add changelog entry for streaming fix (#129)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: add #107 and #109 to unreleased changelog

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: add #100, #110, #111, #139, #142, #143, #144 to unreleased changelog

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

Discussion: silent fuzzy-match in get_pricing() can apply wrong context-window metadata