
[aw][test audit] _estimate_premium_cost and render_live_sessions lack tests for the 6× ultra-premium multiplier (`claude-opu [Content truncated due to length] #1186

@microsasa

Description


Root Cause

`src/copilot_usage/report.py` calls `_estimate_premium_cost(s.model, stats.model_calls)` to produce the Est. Cost column in `render_live_sessions` and in `render_cost_view`. That function multiplies calls × `pricing.multiplier`, where the multiplier comes from `lookup_model_pricing`.

`claude-opus-4.6-1m` has a 6.0× multiplier — distinct from the standard 3.0× assigned to `claude-opus-4.6`. Both models are in the PREMIUM tier (≥ 3.0), so they share the same `PricingTier.PREMIUM` tag, but their raw multipliers differ. `_estimate_premium_cost` uses the raw multiplier, not the tier enum, so 6.0× and 3.0× produce different outputs.

Coverage gap

| Model | Multiplier | `_estimate_premium_cost` direct test | `render_live_sessions` integration test |
| --- | --- | --- | --- |
| claude-opus-4.6 | 3.0 | `test_known_model_returns_estimate` | `test_est_cost_premium_model` |
| claude-opus-4.6-1m | 6.0 | missing | missing |
| gpt-5-mini | 0.0 | ❌ (only in `TestEstimatePremiumCost`) | `test_est_cost_free_model` |

The 6× case is not exercised anywhere in `TestEstimatePremiumCost` or in `TestRenderLiveSessions`. A regression that accidentally caps the multiplier at 3× (e.g., returning `pricing.tier.value` instead of `pricing.multiplier`) would be invisible to the current test suite.

Note: `test_pricing.py` does verify that `lookup_model_pricing("claude-opus-4.6-1m").multiplier == 6.0`, but there is no test that threads that value through `_estimate_premium_cost` to verify the final displayed output.

Expected Behavior to Assert

`TestEstimatePremiumCost` in `tests/copilot_usage/test_report.py`:

```python
def test_ultra_premium_tier_model_uses_6x_multiplier(self) -> None:
    """claude-opus-4.6-1m has a 6× multiplier — must not be capped at 3×."""
    assert _estimate_premium_cost("claude-opus-4.6-1m", 5) == "~30"   # round(5 * 6.0)
    assert _estimate_premium_cost("claude-opus-4.6-1m", 1) == "~6"    # round(1 * 6.0)
    assert _estimate_premium_cost("claude-opus-4.6-1m", 0) == "~0"    # round(0 * 6.0)
```

`TestRenderLiveSessions` in `tests/copilot_usage/test_report.py`:

```python
def test_est_cost_ultra_premium_model(self) -> None:
    """Live session with claude-opus-4.6-1m (6× multiplier) shows correct estimate."""
    now = datetime.now(tz=UTC)
    session = SessionSummary(
        session_id="live-ultraprem-1234",
        name="Ultra Premium",
        model="claude-opus-4.6-1m",
        is_active=True,
        start_time=now - timedelta(minutes=10),
        user_messages=5,
        model_calls=5,
        model_metrics={
            "claude-opus-4.6-1m": ModelMetrics(
                usage=TokenUsage(outputTokens=2000),
            )
        },
    )
    output = _capture_output([session])
    assert "~30" in output      # 5 calls × 6.0 = ~30
    assert "~15" not in output  # must not be capped at 3×
```

Regression Scenario

  1. `_estimate_premium_cost` is refactored to use `pricing.tier` instead of `pricing.multiplier` for the cost calculation.
  2. `claude-opus-4.6-1m` sessions display `~15` instead of `~30`.
  3. No test fails — the 3× test passes because `claude-opus-4.6` still shows `~9` for 3 calls.
  4. Users with the 1M-context opus model silently see an underestimated premium cost.

Generated by Test Suite Analysis

Metadata

Labels

aw (Created by agentic workflow) · aw-dispatched (Issue has been dispatched to implementer) · test-audit (Test coverage gaps)
