release: 0.7.7 — forward real model + tools to /gate pre-flight (T4)#36
Merged
Additive patch on top of the 0.7.0 thin-client refactor. No
breaking changes.
Added
-----
* nullrun.integrations.fastapi: one-line FastAPI integration
  that turns every NullRunDecision / NullRunInfrastructureError
  raised by @nullrun.protect endpoints into a clean JSON
  response with the right HTTP status code. No per-endpoint
  except blocks required.

  Response shape:

    {"error_code": "NR-B004",
     "user_message": "You've reached the usage limit...",
     "category": "decision"}

  HTTP status mapping:

    * NR-B004 (budget), NR-L001 (loop), NR-R001 (rate) -> 429
      with optional Retry-After
    * NR-T001 (tool blocked), NR-X001 (generic block) -> 403
    * NR-W003 (paused) -> 503 with Retry-After
    * NR-W002 (killed) -> 503; WorkflowKilledInterrupt is a
      BaseException subclass, so Starlette's
      add_exception_handler refuses it. It is handled via ASGI
      middleware instead (hybrid pattern, documented in the
      module docstring).
    * NullRunInfrastructureError subclasses -> 503 (the fault
      is on our side, not the user's).

* nullrun.messages: default user-facing message catalog.
  Every NR-* error code has an English default message owned
  by NULLRUN, not customer code. Customer support bots that hit
  a budget cap show the same wording across every NullRun-backed
  application.

    * format_user_message(exc): render an exception as a
      user-facing string
    * set_user_message(code, text): per-process override for
      branded variants
    * get_user_message(code): raw lookup
    * reset_overrides(): clear all overrides (for tests)
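
The catalog helpers listed above can be pictured with a small stand-in. This is a minimal sketch, not the real nullrun.messages module: the function names come from the changelog, but the bodies, the fallback text, the NR-T001 wording, the DemoDecision class, and the assumption that overrides take precedence in get_user_message are all illustrative.

```python
"""Stand-in sketch of a message catalog with per-process overrides."""

# Hypothetical defaults. Only the (truncated) NR-B004 wording appears in
# the changelog; the NR-T001 text and the fallback are invented here.
_DEFAULTS = {
    "NR-B004": "You've reached the usage limit...",
    "NR-T001": "That tool is not available for this workflow.",
}
_FALLBACK = "The request could not be completed."
_overrides = {}


def get_user_message(code):
    """Look up a message by code; overrides win over defaults (assumption)."""
    if code in _overrides:
        return _overrides[code]
    return _DEFAULTS.get(code, _FALLBACK)


def set_user_message(code, text):
    """Register a per-process override, e.g. a branded variant."""
    _overrides[code] = text


def reset_overrides():
    """Drop every override so tests start from the defaults."""
    _overrides.clear()


class DemoDecision(Exception):
    """Hypothetical exception carrying an error_code attribute."""

    def __init__(self, error_code):
        super().__init__(error_code)
        self.error_code = error_code


def format_user_message(exc):
    """Render an exception as a user-facing string via its error code."""
    return get_user_message(getattr(exc, "error_code", ""))
```

A caller would typically set overrides once at startup and call reset_overrides() in a test fixture so that one test's branding does not leak into the next.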
Changed
-------
* Transport._send_batch canonical JSON serialization — route the
/track/batch body through _signed_request_body for consistent
compact-separator serialization. HMAC itself is unaffected,
but consistent serialization removes a special-case from the
wire-format contract tests.
* Transport._send_batch actions response handling — backend
renamed BatchTrackResponse.actions_taken (debug names) ->
BatchTrackResponse.actions (ActionTaken structs). Read both
for forward-compat; per-element try/except so one malformed
entry doesn't abort the whole loop.
* pyproject.toml metadata — long-form description with search
keywords, Maintainer: populated via maintainers=[...],
expanded classifiers (Linux / Windows / macOS, Python 3.13,
CPython, Security / AI / WWW/HTTP topics), project URL
expander.
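
The two transport changes above can be sketched as follows. This is an illustration under stated assumptions, not the SDK's code: canonical_body and read_actions are hypothetical names, sort_keys=True is an assumption (the changelog only mentions compact separators), and the "name" field of an ActionTaken struct is invented for the example.

```python
import json


def canonical_body(payload):
    """Serialize with compact separators so every endpoint emits the same bytes.

    sort_keys=True is an assumption added here for determinism.
    """
    return json.dumps(payload, separators=(",", ":"), sort_keys=True).encode("utf-8")


def read_actions(response):
    """Read the new `actions` field, falling back to legacy `actions_taken`."""
    raw = response.get("actions")
    if raw is None:
        raw = response.get("actions_taken")
    if not isinstance(raw, list):
        return []
    parsed = []
    for entry in raw:
        try:
            if isinstance(entry, str):
                # Legacy shape: a bare debug name.
                parsed.append({"name": entry})
            else:
                # New shape: a struct; "name" is a hypothetical field.
                parsed.append({"name": entry["name"]})
        except (KeyError, TypeError):
            # One malformed entry must not abort the whole loop.
            continue
    return parsed
```

Catching per element rather than around the loop is what keeps a single bad entry from discarding the valid ones that follow it.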
Tests
-----
* tests/test_messages.py (new, 282 lines) — catalog
completeness (every NR-* code has a default message),
override / reset behavior, render path.
* tests/test_integrations_fastapi.py (new, 289 lines) — HTTP
status mapping per error code, response shape, ASGI
middleware path for WorkflowKilledInterrupt, hybrid
composition.
* tests/test_decision_split.py (new, 199 lines) — pins the
decision / infrastructure error split.
* Updates to tests/test_runtime.py, tests/test_extractors.py
reflecting transport canonical-JSON + actions-renamed
changes.
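
The status mapping and the ASGI middleware path that these tests pin can be sketched without FastAPI at all. The following is a minimal sketch, assuming a stand-in WorkflowKilledInterrupt class and a hypothetical KilledMiddleware; the status codes and response keys come from the changelog, while the message wording and the "decision" category for a killed workflow are assumptions.

```python
import asyncio
import json

# Error code -> HTTP status, as listed in the changelog.
STATUS_BY_CODE = {
    "NR-B004": 429, "NR-L001": 429, "NR-R001": 429,
    "NR-T001": 403, "NR-X001": 403,
    "NR-W003": 503, "NR-W002": 503,
}


class WorkflowKilledInterrupt(BaseException):
    """Stand-in for the SDK class; a BaseException subclass as described."""

    error_code = "NR-W002"


class KilledMiddleware:
    """Pure ASGI middleware that catches what an exception handler cannot."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return
        try:
            await self.app(scope, receive, send)
        except WorkflowKilledInterrupt as exc:
            # Assumes the response has not started yet; a real middleware
            # would need to track that before sending.
            body = json.dumps({
                "error_code": exc.error_code,
                "user_message": "This workflow has been stopped.",  # invented
                "category": "decision",  # assumption
            }).encode("utf-8")
            await send({
                "type": "http.response.start",
                "status": STATUS_BY_CODE[exc.error_code],
                "headers": [(b"content-type", b"application/json")],
            })
            await send({"type": "http.response.body", "body": body})


async def _killed_app(scope, receive, send):
    """Toy downstream app that always raises the interrupt."""
    raise WorkflowKilledInterrupt()


def run_demo():
    """Drive the middleware once and return the ASGI messages it sent."""
    sent = []

    async def send(message):
        sent.append(message)

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    asyncio.run(KilledMiddleware(_killed_app)({"type": "http"}, receive, send))
    return sent
```

In a real application the middleware would wrap the FastAPI app while ordinary Exception subclasses keep going through exception handlers, which is the hybrid composition the tests refer to.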
Release plumbing
----------------
* pyproject.toml: version bumped 0.7.0 -> 0.7.6
* src/nullrun/__version__.py: __version__ = "0.7.6"
* CHANGELOG.md: full 0.7.6 entry covering additions,
transport changes, metadata improvements
Tests pass locally (per session log) — pytest on Windows /
Python 3.14.2 is green.
…padding

PR #35 (release/0.7.6) failed all four CI jobs (test 3.10/3.11/3.12,
coverage, codecov/patch) on the same root cause + one latent bug
masked by it. This commit lands the fixes plus the last-mile tests
that bring coverage above the 82% threshold.

CI failure root
---------------
* tests/test_integrations_fastapi.py does from fastapi import ... at
  module top-level. CI installs only pip install -e '.[dev]', and
  fastapi was declared as an *optional* [fastapi] extra, NOT in [dev].
  Pytest collection aborted with ModuleNotFoundError: No module named
  'fastapi' → all 4 jobs red.
* Fix: add fastapi>=0.100,<1.0 to [dev]. Same precedent as
  langchain-core (already in [dev] for the same import-time contract:
  nullrun.instrumentation.langgraph is eager-imported from
  nullrun.decorators at collection time, so the test extras must cover
  the import chain).

Latent bug surfaced by the first fix
------------------------------------
The same PR refactored Transport._send_batch_with_retry_info to route
the /track/batch body through _signed_request_body for canonical-JSON
serialization (matching /gate and /execute). The two sibling call
sites use the module-level helper _signed_request_body (no self.);
this one used self._signed_request_body by typo. Result:
AttributeError on every batch flush, breaking 15 existing tests across
test_transport.py / test_track_batch_retry.py /
test_integration_contract.py / test_signal_safety.py. As long as the
fastapi collection error aborted pytest, this was hidden. Fixed to
_signed_request_body(...) with a docstring noting why it is
module-level and what the bug looked like.

Coverage padding (codecov/patch was failing on this too)
--------------------------------------------------------
Total coverage on the failing CI run was 81.98%, 0.02pp under the
fail-under=82 gate. After the two fixes above it would have recovered
to ~82.0% on the dot, so I added minimal tests for the
cheapest-to-cover gaps:

* tests/test_breaker_main.py (new): covers the 5 statements in
  nullrun.breaker.__main__.main() (0% → 100%). The module exists so
  python -m nullrun.breaker exits cleanly instead of failing with
  No module named nullrun.breaker.__main__; the previous fix-mechanism
  was return 0 after a print, but no test was exercising it.
* tests/test_status.py: extends TestSummary with seven scenarios
  covering each conditional branch of NullRunStatus.summary()
  (organization_id, workflow_id, workflow_state != Normal,
  backend_reachable=False, ws_connected=False, recent_errors).
  status.py jumps 84.52% → 98.81%.
* tests/test_integrations_fastapi.py: four tests on _build_headers
  covering non-numeric, zero, negative, and resume_after (the
  WorkflowPausedException code path). integrations/fastapi.py jumps
  90.22% → 94.57%.

After all three: TOTAL 81.98% → 82.46%, comfortably above the gate.

Verification
------------
* Local pytest: 997 passed, 13 skipped, 0 failed (Windows /
  Python 3.14.2, 8m47s; same env the original commit was validated
  in).
* python -m coverage report: 82.46%, no fail-under complaint.
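
The latent bug described above is easy to reproduce in miniature. This is a minimal sketch with stand-in names: the helper body, the Transport methods, and the "events" key are all hypothetical, and only the module-level versus self. distinction reflects the commit message.

```python
import json


def _signed_request_body(payload):
    """Module-level helper (stand-in): compact JSON bytes for the wire."""
    return json.dumps(payload, separators=(",", ":")).encode("utf-8")


class Transport:
    """Toy transport showing the typo and the fix side by side."""

    def send_batch_buggy(self, events):
        # Typo: the helper lives at module level, not on the instance,
        # so this attribute lookup raises AttributeError on every call.
        return self._signed_request_body({"events": events})

    def send_batch_fixed(self, events):
        # Fix: call the module-level helper directly, like the sibling
        # call sites do.
        return _signed_request_body({"events": events})
```

Because the failure is an attribute lookup at call time, nothing flags it at import; only a test that actually flushes a batch can catch it, which is why the collection error kept it hidden.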
…ng/tools

Patch coverage on PR #35 was 62.38% against a 65% threshold (codecov
target 70% / threshold 5pp). The two biggest delta-holders against
master were auto.py (+286) and langgraph.py (+221), both dominated by
Phase 4.1 additions:

* auto._normalize_finish_reason + _FINISH_REASON_MAP
* auto._openai_extractor second-tier fields (cache_read_tokens,
  cache_write_tokens, reasoning_tokens, finish_reason, tool_names)
* auto._anthropic_extractor cache_read / cache_write
* langgraph._safe_get_gen_message
* langgraph._get_finish_reason (5-source fallback chain)
* langgraph.extract_usage_from_response second-tier fields

These are pure / near-pure functions with no network or vendor SDK
calls. Coverage padding is cheap: pin the canonical wire shapes once
and the backend ingest contract gets a free live spec.

Local numbers:

* auto.py 63.44% -> 64.01% (file-level, +57 statements)
* langgraph.py 78.50% -> 86.01% (file-level, +32 statements)
* TOTAL 82.46% -> 83.13% (already above 82% gate)

41 tests, all green. Existing test_extractors.py and
test_langgraph_callback.py left untouched; these tests deliberately
target the Phase 4.1 fields (cache_read / cache_write / reasoning /
finish_reason / tool_names) that the older tests didn't pin.
Pre-0.7.7 every SDK /gate call for any workflow with a budget was
hard-blocked because the runtime hard-coded the literal string
"budget-precheck" as the model. The backend's PolicyEvaluationGraph
treated any synthetic cost_limit rule with score > 0.8 as Block, so
the pricing lookup never landed on a real model and the rule fired
with the wrong score.

This commit:

* Adds nullrun.set_call_context(model=..., tools=[...]) plus
  get_call_model / get_call_tools helpers (and the underlying
  _call_model_var / _call_tools_var contextvars in nullrun.context).
* Wires the call context into check_workflow_budget: the /gate payload
  now carries the real model name (or None when unset) and the
  user-supplied tool list. tools=[] vs missing-None are distinguished
  on the wire per gate/internal.rs::check_tool_block.
* Transport.check forwards the tools key when set (it was silently
  dropped pre-fix).
* tests/conftest.py reset_runtime clears the new contextvars so a
  test's set_call_context(...) doesn't leak into the next test's wire
  payload.
* New tests/test_gate_real_path.py pins down the regression: default
  request allows a clean workflow, real block still honored, no
  policy-N residue on the wire, set_call_context flows into the body,
  no-context means no tools key, and the helpers are reachable from
  nullrun.*.

Bumps version to 0.7.7. No breaking changes - new helpers default to
None / empty so existing call sites keep working.
Conflict resolution between release/0.7.7 (T4 per-call context for
/gate) and origin/master (Release/0.7.6 #35, which bumped the SDK to
0.7.6):

* pyproject.toml: keep 0.7.7 (the HEAD side). 0.7.6 on master is
  superseded by 0.7.7 once this merges.
* CHANGELOG.md: keep BOTH the new 0.7.7 block (from HEAD) and the
  0.7.6 block (from master). They document different releases and are
  listed in chronological order with the older 0.7.6 block below.
* src/nullrun/{__init__.py, runtime.py, transport.py}: auto-merged
  cleanly - master doesn't touch the T4 hunks.

Auto-merge result equals HEAD, but the merge commit is still needed to
record the parent relationship and clear the conflict state on the PR.
Codecov Report: ✅ All modified and coverable lines are covered by tests.
maltsev-dev added a commit that referenced this pull request on Jun 28, 2026
Conflict resolution between release/0.7.8 (fail-loud on deprecated
surface) and origin/master (release: 0.7.7 #36, the squash-merge of
PR #36 which bumped the SDK to 0.7.7):

* pyproject.toml: keep 0.7.8 (the HEAD side). 0.7.7 on master is
  superseded by 0.7.8 once this merges.
* src/nullrun/__version__.py: keep 0.7.8 (same reasoning).
* CHANGELOG.md: keep BOTH the new 0.7.8 block (from HEAD) and the
  0.7.7 block (from master). They document different releases and are
  listed in chronological order with the older 0.7.7 block below.
* src/nullrun/runtime.py and src/nullrun/transport.py: auto-merged
  cleanly - master doesn't touch the 0.7.8 hunks.
* Test files: auto-merged cleanly - master doesn't touch the 0.7.8
  test changes either.

Auto-merge result equals HEAD, but the merge commit is still needed to
record the parent relationship and clear the conflict state on the PR.
0.7.7 — forward real model + tools to /gate pre-flight (T4)
Bug: Pre-0.7.7 every SDK /gate call for any workflow with a budget was
hard-blocked because the runtime hard-coded model="budget-precheck" as
a sentinel placeholder. The backend's PolicyEvaluationGraph.evaluate()
stub treated any synthetic cost_limit rule with score > 0.8 as Block
(backend/src/policy/graph.rs:448-462,
backend/src/proxy/http/gate/internal.rs:619-628), so the pricing
lookup never landed on a real model and the rule fired with the wrong
score.

Fix: Forward the real model name and tool list to /gate via a new
per-call context API.

Added

* nullrun.set_call_context(model=..., tools=[...]): per-call context
  the SDK forwards to /gate so the backend can enforce budget tiers
  and tool-block on real values.
    * model (optional): backend uses it to look up the per-model rate
      from tool_pricing (Postgres) so projected_cost matches what
      /track will compute from real token counts. None → backend falls
      back to the claude-sonnet-4 default rate.
    * tools (optional): backend matches each against the workflow's
      effective blocked_tools aggregate and returns block on any
      match. None leaves whatever was previously set; [] clears.
* nullrun.get_call_model() / nullrun.get_call_tools() are the
  read-side helpers (also reachable via nullrun.context.*).

Fixed

* /gate pre-flight no longer sends model="budget-precheck".
  check_workflow_budget now reads get_call_model() (or None when
  unset) instead of the placeholder. Default workflows with a budget
  now return allow instead of blanket-block.
* /gate pre-flight now forwards the per-call tools list.
  Transport.check previously dropped the tools key from the wire
  payload, so even set_call_context(tools=[...]) had no effect on
  /gate. The transport now propagates tools when set; [] vs
  missing-None are distinguished on the wire per
  gate/internal.rs::check_tool_block ("no tools will be called" is
  different from "I did not tell you what tools").

Tests

* tests/test_gate_real_path.py (226 lines) pins the fix:
    * TestGateRealPathRegression: default request allows a clean
      workflow (not the old blanket block), wire payload has no
      policy-N residue from the old graph plumbing, real
      decision="block" still raises WorkflowKilledInterrupt (so the
      fix didn't accidentally remove the real-block path).
    * TestSetCallContext: set_call_context(model=...) flows into the
      body, set_call_context(tools=[...]) flows into the body,
      no-context means no tools key at all (not []), and
      set_call_context(tools=[]) clears a previously-set tool list.
    * TestPackageExports: the new helpers are reachable from
      nullrun.*.
* tests/conftest.py: reset_runtime fixture now also clears
  _call_model_var and _call_tools_var so a test's
  set_call_context(...) doesn't leak into the next test's wire
  payload.

Migration

No breaking changes. New helpers default to None / empty, so existing
call sites (and every test in the suite) keep working without
modification.
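
The per-call context and the None versus [] distinction described above can be sketched with the standard contextvars module. This is a minimal sketch, not the SDK's code: the variable and helper names follow the description, but build_gate_payload, reset_call_context, the "workflow_id" key, and the reading that model=None also leaves the previous model in place are assumptions made for illustration.

```python
from contextvars import ContextVar

# Both default to None, meaning "the caller did not say".
_call_model_var = ContextVar("call_model", default=None)
_call_tools_var = ContextVar("call_tools", default=None)


def set_call_context(model=None, tools=None):
    """Record the model and tools for the next /gate call.

    Passing None leaves the previous value alone; passing tools=[]
    replaces a previously set list with an explicit empty one.
    """
    if model is not None:
        _call_model_var.set(model)
    if tools is not None:
        _call_tools_var.set(list(tools))


def get_call_model():
    return _call_model_var.get()


def get_call_tools():
    return _call_tools_var.get()


def reset_call_context():
    """Hypothetical helper mirroring what a test fixture would clear."""
    _call_model_var.set(None)
    _call_tools_var.set(None)


def build_gate_payload(workflow_id):
    """Hypothetical payload builder: omit `tools` entirely when unset."""
    payload = {"workflow_id": workflow_id, "model": get_call_model()}
    tools = get_call_tools()
    if tools is not None:
        # [] is sent as-is: "no tools will be called" differs from
        # "I did not tell you what tools".
        payload["tools"] = tools
    return payload
```

A typical call site would run set_call_context(model="claude-sonnet-4", tools=["search"]) just before the protected call, and a test fixture would call the reset helper so one test's context does not leak into the next payload.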