fix: stop quota scheduler cooldown drift by ndycode · Pull Request #371 · ndycode/codex-multi-auth

ndycode · 2026-04-06T02:01:17Z

Problem

PreemptiveQuotaScheduler.getDeferral() was computing cooldowns from the nearest reset window.

That causes cooldown drift when a later primary or secondary quota window is still active: the scheduler can resume traffic early even though another active quota window has not reset yet.

Fix

Use the longest relevant reset window instead of the nearest one.

For 429 snapshots, defer until the latest active reset across primary and secondary windows.
For quota-near-exhaustion snapshots, defer until the latest reset among the windows that are actually near exhaustion.

Changes

lib/preemptive-quota-scheduler.ts
- compute per-window waits once
- use the longest active wait for 429 deferrals
- use the longest near-exhausted wait for quota deferrals
test/preemptive-quota-scheduler.test.ts
- added regression coverage for 429 cooldowns using the longer reset window
- added regression coverage for near-exhaustion cooldowns using the longer reset window

Validation

npx vitest run test/preemptive-quota-scheduler.test.ts
Test Files  1 passed (1)
Tests      16 passed (16)

npx vitest run
Test Files  222 passed (222)
Tests      3315 passed (3315)

note: greptile review for oc-chatgpt-multi-auth. cite files like `lib/foo.ts:123`. confirm regression tests + windows concurrency/token redaction coverage.

Greptile Summary

replaces nearest-window cooldown selection with longest-window in getDeferral(), fixing premature quota resumption when a longer secondary reset window is still active. both the 429 and near-exhaustion paths now correctly defer until all relevant windows have reset, backed by two new regression tests.

Confidence Score: 5/5

safe to merge — the fix is correct, all 16 tests pass, and the change is well-scoped

both findings are p2 style/coverage; the core logic correctly uses Math.max across windows for both the 429 and near-exhaustion paths, which is exactly the stated intent. no concurrency issues — all scheduler methods are synchronous with no async gaps in the map mutations. no filesystem or token handling touched.

no files require special attention; redundancy in getDeferral is cosmetic only

Important Files Changed

Filename	Overview
lib/preemptive-quota-scheduler.ts	fixes cooldown drift by using Math.max across windows for both 429 and near-exhaustion paths; minor redundancy between waitCandidates and primaryWait/secondaryWait is cosmetic only
test/preemptive-quota-scheduler.test.ts	adds two targeted regression tests for the new longest-window logic; missing edge case for expired 429 snapshots before prune fires

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["getDeferral(key, now)"] --> B{scheduler enabled?}
    B -- no --> Z[defer: false, waitMs: 0]
    B -- yes --> C{snapshot exists?}
    C -- no --> Z
    C -- yes --> D["compute primaryWait = max(0, primary.resetAtMs - now)\ncompute secondaryWait = max(0, secondary.resetAtMs - now)\nlongestWait = max(primaryWait, secondaryWait)"]
    D --> E{status === 429\nAND longestWait > 0?}
    E -- yes --> F["bounded = min(longestWait, maxDeferralMs)\ndefer: true, reason: rate-limit"]
    E -- no --> G{primaryNearExhausted\nOR secondaryNearExhausted?}
    G -- yes --> H["nearExhaustedWait = max(\n  primaryNearExhausted ? primaryWait : 0,\n  secondaryNearExhausted ? secondaryWait : 0\n)"]
    H --> I{nearExhaustedWait > 0?}
    I -- yes --> J["bounded = min(nearExhaustedWait, maxDeferralMs)\ndefer: true, reason: quota-near-exhaustion"]
    I -- no --> Z
    G -- no --> Z

Prompt To Fix All With AI

This is a comment left during a code review.
Path: lib/preemptive-quota-scheduler.ts
Line: 254-258

Comment:
**redundant `waitCandidates` array**

`longestWait` is equivalent to `Math.max(primaryWait, secondaryWait)` — both values are already computed with identical semantics above. the `waitCandidates` intermediate array and the second `.filter(v => v > 0)` are redundant (the first filter already guarantees `v > now`, so `v - now > 0` is always true).

```suggestion
		const longestWait = Math.max(primaryWait, secondaryWait);
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: test/preemptive-quota-scheduler.test.ts
Line: 63-71

Comment:
**missing vitest regression: expired 429 cooldown**

there's no test asserting `getDeferral` returns `{ defer: false }` once both reset windows on a 429 snapshot have elapsed but before `prune()` runs. a future regression dropping the `> now` guard would silently defer on a stale snapshot and this would go undetected.

```ts
it("does not defer after 429 reset windows have elapsed", () => {
  const scheduler = new PreemptiveQuotaScheduler();
  scheduler.update("acc:model", {
    status: 429,
    primary: { resetAtMs: 5_000 },
    secondary: { resetAtMs: 8_000 },
    updatedAt: 1_000,
  });
  // both windows are now in the past
  expect(scheduler.getDeferral("acc:model", 10_000).defer).toBe(false);
});
```

How can I resolve this? If you propose a fix, please make it concise.

_{Reviews (1): Last reviewed commit: "fix: stop quota scheduler cooldown drift" | Re-trigger Greptile}

The preemptive quota scheduler calculated deferrals from the nearest reset window. When a later primary or secondary reset was still active, that shortened the effective cooldown and let requests resume too early. Use the longest relevant reset window for 429 deferrals and for near-exhaustion quota deferrals so cooldowns stay aligned with the latest active quota window.

chatgpt-codex-connector · 2026-04-06T02:01:23Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

coderabbitai · 2026-04-06T02:01:24Z

Warning

Rate limit exceeded

@ndycode has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 25 minutes and 8 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 25 minutes and 8 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 075b9277-3083-4202-a9a5-663bb080218a

📥 Commits

Reviewing files that changed from the base of the PR and between 478f44c and e70a2ed.

📒 Files selected for processing (2)

lib/preemptive-quota-scheduler.ts
test/preemptive-quota-scheduler.test.ts

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/preemptive-quota-cooldown-drift

✨ Simplify code

Create PR with simplified code
Commit simplified code in branch fix/preemptive-quota-cooldown-drift

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

ndycode · 2026-04-06T04:21:04Z

Superseded by #387, which rebuilds the full open PR stack onto one reviewed integration branch.

ndycode · 2026-04-06T04:21:06Z

Closing in favor of #387.

ndycode closed this Apr 6, 2026

ndycode deleted the fix/preemptive-quota-cooldown-drift branch April 12, 2026 06:00

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: stop quota scheduler cooldown drift#371

fix: stop quota scheduler cooldown drift#371
ndycode wants to merge 1 commit intomainfrom
fix/preemptive-quota-cooldown-drift

ndycode commented Apr 6, 2026 •

edited by greptile-apps Bot

Loading

Uh oh!

chatgpt-codex-connector Bot commented Apr 6, 2026

Uh oh!

coderabbitai Bot commented Apr 6, 2026

Rate limit exceeded

Uh oh!

ndycode commented Apr 6, 2026

Uh oh!

ndycode commented Apr 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

ndycode commented Apr 6, 2026 • edited by greptile-apps Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem

Fix

Changes

Validation

note: greptile review for oc-chatgpt-multi-auth. cite files like lib/foo.ts:123. confirm regression tests + windows concurrency/token redaction coverage.

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Flowchart

Uh oh!

chatgpt-codex-connector Bot commented Apr 6, 2026

Uh oh!

coderabbitai Bot commented Apr 6, 2026

Rate limit exceeded

Uh oh!

ndycode commented Apr 6, 2026

Uh oh!

ndycode commented Apr 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

ndycode commented Apr 6, 2026 •

edited by greptile-apps Bot

Loading

note: greptile review for oc-chatgpt-multi-auth. cite files like `lib/foo.ts:123`. confirm regression tests + windows concurrency/token redaction coverage.