fix: deflake test_budget_cap_policy timezone-boundary test #372

Merged
anilmurty merged 1 commit into main from fix/budget-cap-policy-flaky-tz-test
Jul 2, 2026

Conversation

@anilmurty

Contributor

Extracts the standalone test fix from #363 so it can land in main immediately, independent of the async-hooks work (which has open review items).

Why

tests/unit/test_budget_cap_policy.py::test_cycle_spend_read_from_duckdb (and test_old_spend_outside_cycle_is_excluded) seed spans relative to utcnow() and assert on cycle-scoped spend. When CI runs just after a billing-cycle/month boundary in UTC, the "this cycle" spans fall into the previous cycle, the ceiling check flips from would_block to noop, and the test fails. This is a genuinely flaky test on main: it just failed CI on the unrelated PR #360 (a Docker-workflow-only change).
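The boundary failure can be illustrated with a small standalone sketch. It does not use the real test or engine code: `cycle_start`, `in_current_cycle`, and `seeded_span_time` are hypothetical helpers, the cycle is assumed to be a UTC calendar month, and the one-hour seeding offset is an assumed value chosen only for illustration.

```python
from datetime import datetime, timedelta, timezone


def cycle_start(now: datetime) -> datetime:
    """Start of the cycle containing `now` (assumed: UTC calendar month)."""
    return now.replace(day=1, hour=0, minute=0, second=0, microsecond=0)


def in_current_cycle(span_time: datetime, now: datetime) -> bool:
    """True if a span timestamp counts toward the cycle that contains `now`."""
    return cycle_start(now) <= span_time <= now


def seeded_span_time(now: datetime) -> datetime:
    """A span seeded relative to the clock (assumed offset: one hour ago)."""
    return now - timedelta(hours=1)


# Mid-month: the seeded span stays inside the current cycle.
mid_month = datetime(2026, 6, 15, 12, 0, tzinfo=timezone.utc)
mid_month_ok = in_current_cycle(seeded_span_time(mid_month), mid_month)

# Half an hour into a new month: the same offset lands in the previous cycle.
boundary = datetime(2026, 7, 1, 0, 30, tzinfo=timezone.utc)
boundary_ok = in_current_cycle(seeded_span_time(boundary), boundary)
```

With a wall-clock `now`, which of the two cases a CI run hits depends only on when it starts, which would explain why the failure appears on unrelated PRs.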

What

  • Seed a fixed mid-month datetime (2026-06-15 12:00 UTC) instead of utcnow() in the two cycle-spend tests.
  • Pass a mocked now_fn=lambda: now into the policy engine so the evaluator's clock is pinned to the same instant. (PolicyContext already supports now_fn on main, at tokenjam/proxy/engine.py:111, so this is test-only; no production change.)
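The pattern in the two bullets above can be sketched as follows. This is a minimal, self-contained illustration, not the project's code: `Span`, `cycle_spend`, and `evaluate` are hypothetical stand-ins for the DuckDB-backed spend query and the policy engine, and the calendar-month cycle and dollar amounts are assumptions. Only the `now_fn` idea and the fixed 2026-06-15 12:00 UTC instant come from this PR.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, List


@dataclass
class Span:
    started_at: datetime
    cost_usd: float


def cycle_spend(spans: List[Span], now_fn: Callable[[], datetime]) -> float:
    """Sum spend inside the cycle containing now_fn() (assumed: UTC month)."""
    now = now_fn()
    start = now.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    return sum(s.cost_usd for s in spans if start <= s.started_at <= now)


def evaluate(spans: List[Span], cap_usd: float,
             now_fn: Callable[[], datetime]) -> str:
    """Hypothetical ceiling check: block once cycle spend reaches the cap."""
    return "would_block" if cycle_spend(spans, now_fn) >= cap_usd else "noop"


def test_cycle_spend_is_deterministic() -> None:
    # Fixed mid-month instant: seeding and evaluation share one clock.
    now = datetime(2026, 6, 15, 12, 0, tzinfo=timezone.utc)
    spans = [
        Span(now - timedelta(days=2), 6.0),     # current cycle
        Span(now - timedelta(hours=1), 5.0),    # current cycle
        Span(now - timedelta(days=40), 100.0),  # previous cycle, excluded
    ]
    assert evaluate(spans, cap_usd=10.0, now_fn=lambda: now) == "would_block"
    # Only the old span: nothing counts toward this cycle.
    assert evaluate(spans[2:], cap_usd=10.0, now_fn=lambda: now) == "noop"


test_cycle_spend_is_deterministic()
```

Because both the seeded timestamps and the evaluator read the same injected instant, the result no longer depends on when the test runs.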

Tests / Verification

  • ruff check tests/unit/test_budget_cap_policy.py — clean
  • pytest tests/unit/test_budget_cap_policy.py — 12 passed
  • The tests are now deterministic regardless of wall-clock time.

Scope

Test-only, one file. Credit to @tarun73, who wrote this fix as part of #363 — this PR just cherry-picks it out so we can deflake main now.

🤖 Generated with Claude Code

Co-Authored-By: tarun73 <noreply@github.com>

Seed a fixed mid-month datetime (2026-06-15) and pass a mocked now_fn
to the policy engine evaluation in cycle-spend unit tests. This prevents
test failures caused by relative times crossing the billing cycle/month
boundary when run at the start of a month in UTC.

Co-Authored-By: Gemini 3.5 Flash (High) <noreply@google.com>
@anilmurty

Contributor Author

Pulling this out of #363 on its own so we can deflake main right away. The async-hooks work in #363 still has open review items, but this test fix is independent and doesn't need to wait on them.

Context on why it's worth landing now: this exact test just failed CI on #360, a PR that only adds a Docker workflow and touches zero Python. The flaky boundary condition in main is causing spurious failures on unrelated PRs, so landing this fix stops the noise.

Full credit to @tarun73 — this is their fix, cherry-picked verbatim (authorship preserved). Verified locally: ruff clean, 12/12 tests pass, and it's now deterministic instead of clock-dependent.

@anilmurty anilmurty merged commit 563a155 into main Jul 2, 2026
4 checks passed
@anilmurty anilmurty deleted the fix/budget-cap-policy-flaky-tz-test branch July 2, 2026 15:15