fix: deflake test_budget_cap_policy timezone-boundary test #372
Merged
Seed a fixed mid-month datetime (2026-06-15) and pass a mocked now_fn to the policy engine evaluation in cycle-spend unit tests. This prevents test failures caused by relative times crossing the billing cycle/month boundary when run at the start of a month in UTC. Co-Authored-By: Gemini 3.5 Flash (High) <noreply@google.com>
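The boundary failure described in this commit message can be reproduced in isolation. The sketch below is illustrative only: it assumes the billing cycle is the calendar month in UTC, and `cycle_start`, `in_current_cycle`, and the one-hour `SEED_OFFSET` are hypothetical names and values, not code from the repository.

```python
from datetime import datetime, timedelta, timezone


def cycle_start(now: datetime) -> datetime:
    """Start of the calendar month containing `now` (assumed cycle boundary)."""
    return now.replace(day=1, hour=0, minute=0, second=0, microsecond=0)


def in_current_cycle(span_time: datetime, now: datetime) -> bool:
    """True if the span falls inside the cycle that contains `now`."""
    return cycle_start(now) <= span_time <= now


# Hypothetical seeding offset; the real tests may use a different one.
SEED_OFFSET = timedelta(hours=1)

# Wall clock just after a month boundary: the seeded span lands in May.
boundary_now = datetime(2026, 6, 1, 0, 30, tzinfo=timezone.utc)
boundary_ok = in_current_cycle(boundary_now - SEED_OFFSET, boundary_now)

# Pinned mid-month clock (the instant used in this PR): the span stays in June.
pinned_now = datetime(2026, 6, 15, 12, 0, tzinfo=timezone.utc)
pinned_ok = in_current_cycle(pinned_now - SEED_OFFSET, pinned_now)

print(boundary_ok, pinned_ok)  # False True
```

With a relative clock the result depends on when CI happens to run; with the pinned instant the seeded span always sits well inside the cycle.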
Pulling this out of #363 on its own so we can deflake `main`. Context on why it's worth landing now: this exact test just failed CI on #360, a PR that only adds a Docker workflow and touches zero Python. The flaky boundary condition lives in `main`, not in that PR. Full credit to @tarun73: this is their fix, cherry-picked verbatim (authorship preserved). Verified locally: ruff clean, 12/12 tests pass, and the test is now deterministic instead of clock-dependent.
Extracts the standalone test fix from #363 so it can land in `main` immediately, independent of the async-hooks work (which has open review items).

Why

`tests/unit/test_budget_cap_policy.py::test_cycle_spend_read_from_duckdb` (and `test_old_spend_outside_cycle_is_excluded`) seed spans relative to `utcnow()` and assert on cycle-scoped spend. When CI runs near a billing-cycle/month boundary in UTC, the "this cycle" spans fall into the previous cycle, the ceiling check flips `would_block` → `noop`, and the test fails. It's a real flaky test living in `main`: it just red-flagged the CI on unrelated PR #360 (a Docker-workflow-only change).

What

- Seed a fixed mid-month datetime (2026-06-15 12:00 UTC) instead of `utcnow()` in the two cycle-spend tests.
- Pass `now_fn=lambda: now` into the policy engine so the evaluator's clock is pinned to the same instant. (`PolicyContext` already supports `now_fn` on `main`, at `tokenjam/proxy/engine.py:111`, so this is test-only; no production change.)

Tests / Verification

- `ruff check tests/unit/test_budget_cap_policy.py`: clean
- `pytest tests/unit/test_budget_cap_policy.py`: 12 passed

Scope

Test-only, one file. Credit to @tarun73, who wrote this fix as part of #363; this PR just cherry-picks it out so we can deflake `main` now.

🤖 Generated with Claude Code
Co-Authored-By: tarun73 <noreply@github.com>
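For readers unfamiliar with the clock-injection pattern this PR relies on, here is a minimal sketch of a test that pins the evaluator's clock. It is not the project's code: `evaluate_budget`, the `(timestamp, cost)` span shape, and the calendar-month cycle are assumptions made for illustration. Only the `now_fn` parameter name, the `would_block` / `noop` outcomes, and the pinned instant come from the PR.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Iterable, Tuple

Span = Tuple[datetime, float]  # (timestamp, cost); hypothetical shape


def utcnow() -> datetime:
    """Default clock: the real wall clock in UTC."""
    return datetime.now(timezone.utc)


def evaluate_budget(
    spans: Iterable[Span],
    ceiling: float,
    now_fn: Callable[[], datetime] = utcnow,
) -> str:
    """Hypothetical evaluator: sums spend in the current calendar month."""
    now = now_fn()
    start = now.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    spend = sum(cost for ts, cost in spans if start <= ts <= now)
    return "would_block" if spend >= ceiling else "noop"


def test_cycle_spend_is_deterministic() -> None:
    # Pin the clock mid-month so relative offsets never cross a boundary.
    now = datetime(2026, 6, 15, 12, 0, tzinfo=timezone.utc)
    spans = [
        (now - timedelta(days=2), 6.0),
        (now - timedelta(hours=3), 5.0),
        (now - timedelta(days=40), 100.0),  # previous cycle, must be excluded
    ]
    # Seed data and evaluator share the same instant via now_fn.
    assert evaluate_budget(spans, ceiling=10.0, now_fn=lambda: now) == "would_block"
    assert evaluate_budget(spans, ceiling=20.0, now_fn=lambda: now) == "noop"
```

Because the seed data and the evaluator read the same pinned `now`, the outcome no longer depends on the date the suite runs, and production callers that omit `now_fn` keep using the real clock.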