Potential settlement gap: missing usage/cost block can leave delivered LLM output uncharged #93

@chenshj73

Hi, I noticed a possible payment-flow issue in the current repository state. This is a conservative report based on the current code path, and I may be missing deployment-specific guards outside this repository.

Reviewed HEAD: afa7966

What I observed

The x402 charging path depends on an opengradient cost block derived from usage. Local tests/comments state that if this block is absent, x402 close swallows the cost-calculator error and the client is never charged after output delivery.

Source scan notes:

  • scan_uncovered1000_20260604_batch24: True x402 TEE gateway. Paid routes use x402 upto sessions, but actual charge is computed from an opengradient response block after the controller returns output. If provider usage/cost is absent, responses can omit opengradient; comments/tests state x402 close swallows the cost-calculator error and the client is not charged after paid LLM output has been delivered.

Relevant code excerpts:

tee_gateway/controllers/chat_controller.py:455-466

        # TODO: If no usage is returned, we should compute it here.
        usage = extract_usage(response)
        if usage:
            openai_response["usage"] = usage
            web_search_count = (
                extract_web_search_count(response) if chat_request.web_search else 0
            )
            cost = compute_session_cost(
                chat_request.model, usage, web_search_count=web_search_count
            )
            if cost is not None:
                openai_response["opengradient"] = cost.model_dump(mode="json")

tee_gateway/controllers/chat_controller.py:847-868

                # TODO: If no usage is returned, we should compute it here.
                if final_usage:
                    final_data["usage"] = {
                        "prompt_tokens": final_usage.get("input_tokens", 0),
                        "completion_tokens": final_usage.get("output_tokens", 0),
                        "total_tokens": final_usage.get("total_tokens", 0),
                    }
                    web_search_count = (
                        extract_web_search_count(merged_chunk)
                        if chat_request.web_search
                        else 0
                    )
                    cost = compute_session_cost(
                        chat_request.model,
                        final_data["usage"],
                        web_search_count=web_search_count,
                    )
                    if cost is not None:
                        # final_data is hand-serialized to SSE via json.dumps below,
                        # which doesn't go through Flask's JSONEncoder — so do the
                        # serialization ourselves here.
                        final_data["opengradient"] = cost.model_dump(mode="json")
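
One way to address the TODO in both excerpts would be to fall back to an estimate when the provider returns no usage, so that a cost block can still be computed. The following is a minimal sketch, not the repository's code: `estimate_tokens`, `usage_or_estimate`, and the 4-characters-per-token ratio are assumptions made for illustration, and a real fix would presumably use the model's own tokenizer. Only the `prompt_tokens` / `completion_tokens` / `total_tokens` keys are taken from the excerpt above.

```python
import math
from typing import Optional

# Assumption: a crude character-based heuristic, not a real tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Roughly estimate the token count of a string, rounding up."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def usage_or_estimate(
    usage: Optional[dict], prompt_text: str, completion_text: str
) -> dict:
    """Return provider usage when present, otherwise an estimated usage dict."""
    if usage:
        return usage
    prompt_tokens = estimate_tokens(prompt_text)
    completion_tokens = estimate_tokens(completion_text)
    return {
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "total_tokens": prompt_tokens + completion_tokens,
    }
```

The returned dict could then be passed to `compute_session_cost` in place of the missing usage. Whether an estimate is acceptable for billing, or whether the request should be rejected instead, is a policy decision for the maintainers.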

tests/test_opengradient_field.py:1-8

"""Verify the `opengradient` cost block is embedded on responses.

These tests are the only thing keeping `compute_session_cost`'s result from
silently going missing on a controller response — if that block is absent,
x402's `_session_cost_calculator` swallows the error and the client is never
charged. The runtime CRITICAL log is the safety net; this is the unit-test
catch.
"""
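
To make the failure mode described in that docstring concrete, here is a toy model of it. I have not reproduced x402's actual `_session_cost_calculator` or its close logic; `session_cost_calculator`, `close_session_swallowing`, `MissingCostBlock`, and the `total_cost` field are hypothetical stand-ins chosen only to illustrate the shape of the problem.

```python
from typing import Any


class MissingCostBlock(Exception):
    """Raised when a response carries no `opengradient` cost block."""


def session_cost_calculator(response: dict[str, Any]) -> int:
    """Hypothetical calculator: read the amount to charge from the response."""
    block = response.get("opengradient")
    if block is None:
        raise MissingCostBlock("response has no opengradient block")
    # `total_cost` is an assumed field name, not the real schema.
    return int(block["total_cost"])


def close_session_swallowing(response: dict[str, Any]) -> int:
    """Model of the reported behavior: the error is swallowed at close."""
    try:
        return session_cost_calculator(response)
    except MissingCostBlock:
        # The output has already been delivered by this point, so
        # swallowing the error means the session settles for nothing.
        return 0


# A response with a cost block is charged as expected.
charged = close_session_swallowing({"opengradient": {"total_cost": 1200}})

# A response without usage never got a cost block, so nothing is charged.
uncharged = close_session_swallowing({"choices": ["delivered output"]})
```

In this model the second call returns `0` even though output was delivered, which is the gap the docstring warns about.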

Why this may matter

For paid API, agent, MCP, x402, AP2/UCP, or subscription-gated flows, the payment proof needs to dominate the protected release path. If verification is only structural, settlement is best-effort after release, payment fields are not rebound to server requirements, or the protected resource is reachable outside the paid route, a caller may receive the paid result without a completed payment for the intended resource, amount, asset, or recipient.

Suggested check

Consider making the paid-resource release depend on a payment state that is both verified and settled, or otherwise cryptographically bound to the current server-side requirements (resource, amount, asset, payTo, payer, nonce/idempotency key, and route/session). For intentionally asynchronous settlement, it may be worth failing closed on settlement errors or recording a durable pending state with explicit recovery/reconciliation semantics before returning the protected result.
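
As a rough illustration of the fail-closed option, the sketch below records a durable settlement state before the response is returned and withholds the response when the cost is unknown or settlement fails. Everything here is hypothetical: `SettlementLedger`, `release_after_settlement`, the `settle` callback, the state names, and the `total_cost` field are assumptions, and SQLite merely stands in for whatever durable store the deployment actually uses.

```python
import sqlite3
from typing import Any, Callable, Optional


class SettlementError(Exception):
    """Raised when a response must be withheld because payment is not settled."""


class SettlementLedger:
    """Minimal durable record of per-session settlement state (hypothetical)."""

    def __init__(self, path: str = ":memory:") -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS settlements ("
            "session_id TEXT PRIMARY KEY, amount INTEGER, state TEXT NOT NULL)"
        )
        self._db.commit()

    def mark(self, session_id: str, amount: Optional[int], state: str) -> None:
        self._db.execute(
            "INSERT OR REPLACE INTO settlements VALUES (?, ?, ?)",
            (session_id, amount, state),
        )
        self._db.commit()

    def state(self, session_id: str) -> Optional[str]:
        row = self._db.execute(
            "SELECT state FROM settlements WHERE session_id = ?", (session_id,)
        ).fetchone()
        return row[0] if row else None


def release_after_settlement(
    ledger: SettlementLedger,
    session_id: str,
    response: dict[str, Any],
    settle: Callable[[str, int], None],
) -> dict[str, Any]:
    """Return the response only once settlement has succeeded."""
    block = response.get("opengradient")
    if block is None:
        # Cost unknown: leave a durable trail instead of charging nothing.
        ledger.mark(session_id, None, "needs_reconciliation")
        raise SettlementError("no cost block; response withheld")
    amount = int(block["total_cost"])  # assumed field name
    ledger.mark(session_id, amount, "pending")
    try:
        settle(session_id, amount)
    except Exception as exc:
        # The row stays "pending" so a reconciliation job can retry it.
        raise SettlementError("settlement failed; response withheld") from exc
    ledger.mark(session_id, amount, "settled")
    return response
```

For streaming responses, where output reaches the client before the final cost is known, withholding is not possible; in that case the same ledger could at least ensure that every delivered session ends in either `settled` or an explicit state that a reconciliation job picks up.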

Conservative caveat

This report is based on the repository code at the reviewed HEAD. If production uses an external gateway, webhook, facilitator, deployment setting, or middleware not present here that enforces the missing binding/settlement step before release, the practical impact may be lower.
