fix: 0.8.3 — close silent zero-billing across langgraph + init-ordering #42
Merged
Three coordinated defenses against the same class of bug the 0.8.2
audit closed on the httpx path: llm_call events reaching the backend
without a model field were silently recorded as ≈$0 (backend
unwrap_or('default') + DEFAULT_RATE). 0.8.3 closes the langgraph
callback path and the init-ordering hazard, and promotes the
missing-model wire failure from WARN to fail-LOUD.
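To make the failure mode concrete, here is a minimal Python sketch of the fallback described above. It is an illustration only, not the backend's code: the `RATES` table, the `tokens` field, and every number in it are hypothetical placeholders, and only the names `default` and `DEFAULT_RATE` come from the description.

```python
from typing import Any, Dict

# Placeholder numbers for illustration only; these are not real prices.
DEFAULT_RATE = 0.000001  # hypothetical USD per million tokens for unknown models
RATES: Dict[str, float] = {
    "gpt-4.1-mini-2025-04-14": 0.40,  # hypothetical USD per million tokens
}


def cost_usd(event: Dict[str, Any]) -> float:
    """Price an event, falling back to 'default' when no model is present."""
    # Mirrors the unwrap_or('default') step: a missing model is not an error.
    model = event.get("model") or "default"
    # 'default' has no entry in RATES, so the lookup lands on DEFAULT_RATE.
    rate = RATES.get(model, DEFAULT_RATE)
    return event["tokens"] / 1_000_000 * rate
```

Under these assumptions an event without `model` is priced without any error or warning, which is why the three defenses below focus on making the missing field visible.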
1. langgraph callback path (instrumentation/langgraph.py)
_extract_model_from_response now consults response.llm_output
FIRST — that's where langchain-openai 1.x puts the
date-suffixed model id (e.g. 'gpt-4.1-mini-2025-04-14'). The
previous chain led with response_metadata, which langchain 1.x
leaves empty on the AIMessage inside generations[0][0].message.
Without this promotion every OpenAI-via-LangChain 1.x call
silently zero-billed. Also adds an 'any key containing model'
sweep inside llm_output so non-OpenAI wrappers (proxies,
custom chat models) still get attributed.
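The lookup order described above can be sketched roughly as follows. This is a simplified stand-in, not the actual `_extract_model_from_response`: the function name `extract_model`, the helper `_non_empty_str`, and the key names `model_name` and `model` are assumptions made for illustration, and the two fallbacks are collapsed into one.

```python
from types import SimpleNamespace
from typing import Any, Optional


def _non_empty_str(value: Any) -> Optional[str]:
    """Return value if it is a non-empty string, else None."""
    return value if isinstance(value, str) and value.strip() else None


def extract_model(response: Any) -> Optional[str]:
    """Hypothetical stand-in for the llm_output-first lookup chain."""
    llm_output = getattr(response, "llm_output", None)
    if not isinstance(llm_output, dict):
        llm_output = {}

    # 1. llm_output first: well-known keys (key names are assumptions).
    for key in ("model_name", "model"):
        model = _non_empty_str(llm_output.get(key))
        if model:
            return model

    # 2. Sweep: any llm_output key containing "model" (proxies, custom models).
    for key, value in llm_output.items():
        if "model" in str(key).lower():
            model = _non_empty_str(value)
            if model:
                return model

    # 3. Fallback: response_metadata on the first generation's message.
    generations = getattr(response, "generations", None) or []
    try:
        message = generations[0][0].message
    except (IndexError, AttributeError, TypeError):
        return None
    metadata = getattr(message, "response_metadata", None) or {}
    for key in ("model_name", "model"):
        model = _non_empty_str(metadata.get(key))
        if model:
            return model
    return None
```

Empty strings fall through to the next source instead of being returned, which matches the empty-string fallthrough case the tests pin. A `SimpleNamespace` with `llm_output` and `generations` attributes is enough to exercise the sketch without installing langchain.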
2. Init-ordering hazard (instrumentation/auto.py)
patch_httpx's class-level __init__ wrap only catches Clients
created AFTER it is installed. Users who build
ChatOpenAI(...) before nullrun.init(api_key=...) get a
pre-existing httpx.Client that the patch never sees — those
clients keep the unpatched transport and emit nothing. We now
sweep gc.get_objects() once at patch install time and wrap
any pre-existing Client/AsyncClient whose transport isn't
already a NullRun*Transport. Idempotent via the existing
class-level marker.
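A minimal sketch of the eager sweep idea, using stand-in classes so it runs without httpx installed. `Client`, `Transport`, `RecordingTransport`, the `_transport` attribute, and `wrap_existing_clients` are all hypothetical names for illustration. Idempotency here comes from an `isinstance` check on the transport, which is a simplification of the class-level marker mentioned above.

```python
import gc


class Transport:
    """Stand-in for an unpatched transport."""


class RecordingTransport:
    """Stand-in for a NullRun*Transport; wraps the original transport."""

    def __init__(self, inner: object) -> None:
        self.inner = inner


class Client:
    """Stand-in for httpx.Client; the _transport attribute is an assumption."""

    def __init__(self) -> None:
        self._transport = Transport()


def wrap_existing_clients() -> int:
    """Wrap every live Client whose transport is not already wrapped."""
    wrapped = 0
    for obj in gc.get_objects():
        try:
            # isinstance can raise on exotic objects (e.g. dead weakref proxies).
            if not isinstance(obj, Client):
                continue
        except Exception:
            continue
        if isinstance(obj._transport, RecordingTransport):
            continue  # already wrapped: keeps the sweep idempotent
        obj._transport = RecordingTransport(obj._transport)
        wrapped += 1
    return wrapped
```

Calling `wrap_existing_clients()` after a `Client()` was created wraps it once; a second call finds nothing left to wrap. Walking `gc.get_objects()` touches every tracked object in the process, which is presumably why the sweep runs only once at patch install time.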
3. Fail-LOUD wire tag (runtime.py)
runtime.track() now escalates the missing-model warning from
logger.warning to logger.error, bumps a runtime counter
(dropped_llm_call_no_model) for dashboards, and tags the wire
event with __missing_model: True so the backend's into_track_request
gate can reject with HTTP 422 instead of silently recording a
zero-cost call. The event is still sent (not fail-CLOSED) so the
backend can audit the rejection; the flag is wire-private and
stripped before persisting.
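The tagging step might look something like the sketch below. It is not the code in runtime.py: the function name `tag_missing_model`, the module-level `counters` object, the logger name, and the `type` key on the event are assumptions; only the counter name `dropped_llm_call_no_model` and the `__missing_model` flag are taken from the description.

```python
import logging
from collections import Counter
from typing import Any, Dict

logger = logging.getLogger("nullrun.sketch")  # hypothetical logger name
counters: Counter = Counter()  # hypothetical stand-in for the runtime counters


def tag_missing_model(event: Dict[str, Any]) -> Dict[str, Any]:
    """Return the event to send, flagged if it is an llm_call without a model."""
    # Only llm_call events are billed by model; everything else passes through.
    if event.get("type") != "llm_call":
        return event
    model = event.get("model")
    if isinstance(model, str) and model.strip():
        return event
    # Fail loud: ERROR log, counter bump, and a wire flag the backend can reject.
    logger.error("llm_call event has no model; tagging it for backend rejection")
    counters["dropped_llm_call_no_model"] += 1
    # The event is still returned for sending, so the backend can audit it.
    return {**event, "__missing_model": True}
```

Returning a new dict rather than mutating the input keeps the caller's event untouched, which is a design choice of this sketch rather than something the description specifies.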
tests/contract/test_llm_call_model_wire.py pins all three
invariants: 7 unit tests for _extract_model_from_response
(every known langchain shape + non-OpenAI wrappers + empty-string
fallthrough), 3 tests for track()'s missing-model wire tagging
(ERROR + counter + __missing_model flag + non-llm_call silence),
and 2 tests for the eager-wrap sweep (pre-existing Client gets
wrapped, idempotent on re-patch).
`import gc` was inserted between `hashlib` and `json`; ruff's isort rule wants `gc` before `hashlib` alphabetically.
maltsev-dev added a commit that referenced this pull request on Jun 29, 2026:
…l-LOUD wire tag (#43)
Version bump after PR #42: the wire-format fixes from 0.8.2 closed the
silent zero-billing bug on the httpx path. 0.8.3 closes it on the
langgraph callback path (langchain-openai 1.x puts the date-suffixed
model on LLMResult.llm_output, not on AIMessage response_metadata), the
init-ordering hazard (ChatOpenAI(...) built before nullrun.init() never
sees the class-level __init__ wrap), and promotes the missing-model wire
failure from WARN to fail-LOUD (ERROR + dropped_llm_call_no_model
counter + __missing_model wire flag the backend can reject with HTTP
422). CHANGELOG entry added above 0.8.2. No public-API break.
Three coordinated defenses against the same class of bug 0.8.2 closed on the httpx path: `llm_call` events reaching the backend without a `model` field were silently recorded as ≈$0 (backend `unwrap_or('default')` + `DEFAULT_RATE`). 0.8.3 closes the langgraph callback path and the init-ordering hazard, and promotes the missing-model wire failure from WARN to fail-LOUD.

1. langgraph callback path (`src/nullrun/instrumentation/langgraph.py`)

`_extract_model_from_response` now consults `response.llm_output` FIRST, because that is where langchain-openai 1.x puts the date-suffixed model id (`gpt-4.1-mini-2025-04-14`). The previous chain led with `response_metadata`, which langchain 1.x leaves empty on the AIMessage inside `generations[0][0].message`. Without this promotion every OpenAI-via-LangChain 1.x call silently zero-billed. Also adds an "any key containing model" sweep inside `llm_output` for non-OpenAI wrappers (proxies, custom chat models).

2. Init-ordering hazard (`src/nullrun/instrumentation/auto.py`)

`patch_httpx`'s class-level `__init__` wrap only catches Clients created AFTER it is installed. Users who build `ChatOpenAI(...)` before `nullrun.init(api_key=...)` get a pre-existing `httpx.Client` that the patch never sees. We now sweep `gc.get_objects()` once at patch install and wrap any pre-existing `Client`/`AsyncClient` whose transport isn't already a `NullRun*Transport`. Idempotent via the existing class-level marker.

3. Fail-LOUD wire tag (`src/nullrun/runtime.py`)

`runtime.track()` now escalates the missing-model warning from `logger.warning` to `logger.error`, bumps `dropped_llm_call_no_model` for dashboards, and tags the wire event with `__missing_model: True` so the backend's `into_track_request` gate can reject with HTTP 422 instead of silently recording a zero-cost call. The event is still sent (not fail-CLOSED) so the backend can audit; the flag is wire-private and stripped before persisting.

Files

- `src/nullrun/instrumentation/langgraph.py`: llm_output-first chain + non-OpenAI sweep
- `src/nullrun/instrumentation/auto.py`: eager gc sweep for pre-existing Clients
- `src/nullrun/runtime.py`: ERROR + counter + `__missing_model` flag
- `tests/contract/__init__.py` (new package)
- `tests/contract/test_llm_call_model_wire.py`: 12 tests pinning all three invariants

Test surface

- 7 unit tests for `_extract_model_from_response` (langchain-openai 1.x, `model` key, "any key containing model" sweep, llm_output-before-response_metadata ordering, response_metadata fallback, generations-message fallback, all-empty None + DEBUG log, empty-string fallthrough).
- 3 tests for `track()` missing-model handling (ERROR + counter + tag; tag silent when model set; tag silent for non-llm_call types).
- 2 tests for the eager gc sweep (pre-existing Client gets wrapped; idempotent on re-patch).

Branch base: `master@d4884a7` (after the 0.8.2 version bump from PR #41).