Rate limits going out faster than usual #14373

JonathanSerafinCM26 · 2026-03-11T18:48:56Z

Has anybody experienced the same issue? I ran one request with GPT 5.4 medium. My weekly limit was at 90%, and 15 minutes later it was suddenly at 50%. I'm on the Business plan.

e6o · 2026-06-06T13:45:19Z

Had the same experience on Business — the new GPT 5.4 models burn through quota noticeably faster than the previous generation. A few things that helped me stretch the weekly budget:

1. Task-tier your model usage. Not every request needs the medium/large model. Code formatting, lint fixes, simple refactors — those run fine on lighter models. Reserve the heavy models for architecture decisions and complex debugging. I've been using a routing layer that automatically picks the cheapest model that can handle each task, and it cut my API spend roughly in half without any quality loss I can notice.

2. Check your /fast setting. Fast mode is default now and it uses more quota per task. If you're not time-sensitive on every request, toggling it off for non-urgent tasks helps.

3. Monitor actual per-task token usage. The weekly percentage is too coarse — one runaway agent task can eat 40% of your quota in a single session. The dashboard doesn't show per-task breakdowns, which makes it hard to diagnose where the burn is actually happening.
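Since the dashboard has no per-task breakdown, one option is to keep your own tally. This is a minimal sketch, assuming your client reports input and output token counts for each request; `UsageLedger` and the task labels are hypothetical names chosen for illustration, and the numbers are placeholders rather than measurements.

```python
from collections import defaultdict


class UsageLedger:
    """Accumulates token counts per task label (labels are caller-defined)."""

    def __init__(self):
        self._totals = defaultdict(int)

    def record(self, task, input_tokens, output_tokens):
        # Token counts come from whatever usage figures your client reports.
        self._totals[task] += input_tokens + output_tokens

    def shares(self):
        """Return (task, tokens, share_of_total), largest consumer first."""
        grand_total = sum(self._totals.values())
        if grand_total == 0:
            return []
        ranked = sorted(self._totals.items(), key=lambda kv: kv[1], reverse=True)
        return [(task, tokens, tokens / grand_total) for task, tokens in ranked]


# Illustrative numbers only, not measurements.
ledger = UsageLedger()
ledger.record("lint-fix", 2_000, 500)
ledger.record("agent-refactor", 900_000, 100_000)
ledger.record("lint-fix", 1_500, 1_000)

for task, tokens, share in ledger.shares():
    print(f"{task:15s} {tokens:>10,d} tokens  {share:6.1%}")
```

Sorting by total makes a single runaway task stand out immediately, which the weekly percentage alone cannot show.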

For the routing/cost optimization piece, I ended up building InferCut — it sits between your code and the API, routes simple tasks to cheaper models, and passes through unchanged when cost reduction isn't possible. Zero risk, works with existing OpenAI keys. Mainly aimed at teams spending $500+/mo on LLM APIs but the free tier handles individual use too.
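To make the routing idea concrete, here is a generic sketch of tiered model selection with pass-through. It is not InferCut's implementation; `pick_model`, the task kinds, and both model names are hypothetical placeholders to be replaced with whatever your account offers.

```python
# Hypothetical tier names; substitute the model identifiers your account offers.
LIGHT_MODEL = "light-model"
HEAVY_MODEL = "heavy-model"

# Task kinds that, per the list above, tend to run fine on lighter models.
LIGHT_TASKS = {"format", "lint", "simple_refactor"}


def pick_model(task_kind, requested_model=HEAVY_MODEL):
    """Return the light model for known-simple tasks, else pass the request through."""
    if task_kind in LIGHT_TASKS:
        return LIGHT_MODEL
    # Unknown or complex work: leave the caller's choice unchanged.
    return requested_model


print(pick_model("lint"))
print(pick_model("architecture"))
```

Defaulting to pass-through for anything unrecognized keeps the router from downgrading work it cannot classify.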

The quota burn rate issue is real though — hope OpenAI gives us better per-task visibility soon.

2 replies

jaylex32 Jun 7, 2026

It shouldn't be like this. There's clearly an issue that they haven't fixed. I've been using the same model for a long time, and I use it the same way, and it's going through tokens like Pac-Man. They need to acknowledge this, fix it, and credit customers, because I've already gone through 3 accounts and now I need to wait 5 days for all of them, after using them up in one day with just a few prompts. It was never like that before.

SuccessMoneySparkle Jun 8, 2026

Same here. Before the June 4 reset, 150M tokens was 14% of my weekly allowance; after the reset, 150M is 40%. Other users on Reddit and Discord said the same, but not everyone is affected. Affected accounts have 300-500 million tokens, while others have the normal 1.2-1.5 billion, the same as I had before the reset. For perspective, the $20 plan had 245M/week before I upgraded in May.
I'm using the same model, 5.5 xhigh, with only 1 thread, and since the promo ended I've used slow mode. I checked multiple times.

soyluisdemiguel · 2026-06-08T08:35:33Z

One thing that helped me diagnose this was auditing my Codex session files across comparable time periods.

I picked 4 sessions from a day a few months ago and 4 sessions from a recent day, then asked ChatGPT to audit and compare them: token usage, repeated loops, unnecessary retries, inefficient tool calls, excessive context loading, failed assumptions, etc.

In my case, it helped me discover that some of the burn was not only the model itself, but also changes I had introduced into my workflow over time: new processes, extra instructions, skills, validation steps, and repo context that Codex was repeatedly consuming.

So it may be worth checking whether the newer sessions are doing more hidden work than the old ones, even if the prompts feel similar from the outside. The weekly quota percentage is too coarse to understand what actually happened.
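If you would rather compare the numbers yourself, a before/after comparison can be sketched as below. This assumes you have already reduced each session to a few counters; the field names (`total_tokens`, `tool_calls`, `retries`) and all values are hypothetical placeholders, not the actual Codex session file format.

```python
from statistics import mean


def compare_sessions(old, new, metrics=("total_tokens", "tool_calls", "retries")):
    """Return {metric: (old_mean, new_mean, ratio)}; ratio is None when old_mean is 0."""
    report = {}
    for metric in metrics:
        old_mean = mean(s[metric] for s in old)
        new_mean = mean(s[metric] for s in new)
        ratio = new_mean / old_mean if old_mean else None
        report[metric] = (old_mean, new_mean, ratio)
    return report


# Made-up placeholder summaries; fill these in from your own sessions.
old_sessions = [
    {"total_tokens": 100_000, "tool_calls": 10, "retries": 0},
    {"total_tokens": 140_000, "tool_calls": 14, "retries": 2},
]
new_sessions = [
    {"total_tokens": 300_000, "tool_calls": 30, "retries": 4},
    {"total_tokens": 420_000, "tool_calls": 42, "retries": 2},
]

for metric, (old_mean, new_mean, ratio) in compare_sessions(old_sessions, new_sessions).items():
    print(f"{metric}: {old_mean} -> {new_mean} (ratio {ratio})")
```

A ratio well above 1 on tool calls or retries, with similar prompts, would point at workflow changes rather than the model alone.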

0 replies

SuccessMoneySparkle · 2026-06-08T09:24:35Z

Before the June 4 reset, 100-150M tokens were 10-14% of my $100 plan's weekly allowance. After the reset, a 150M-token day was 40% of my weekly allowance. I asked on Reddit and Discord, and multiple people noticed the same issue. I used the same mode and the same settings; the limit got cut overnight. One user showed his token usage on the same plan: 280 million tokens used, costing him 48% of his daily usage. That puts our $100 allowance at about 300-500 million, while some people have 1.5 billion tokens on the exact same plan. I used to get 2.5 billion with the 2x promo and 1.2 billion after the promo ended, both on the $100 plan. Before that I used the $20 plan, which had 245 million per week, using the same 5.5 xhigh.
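The implied weekly allowance follows from dividing tokens used by the share of quota they consumed. A rough sketch of that arithmetic, using only the figures quoted in this comment; `implied_weekly_allowance` is a hypothetical helper, and treating the other user's 48% as a share of the weekly allowance is an assumption.

```python
def implied_weekly_allowance(tokens_used, percent_of_week):
    """Full weekly allowance in tokens, if tokens_used consumed percent_of_week percent."""
    return tokens_used * 100 / percent_of_week


# (label, tokens used, percent of weekly allowance consumed), as quoted in the thread.
reports = [
    ("before reset, 100M = 10%", 100_000_000, 10),
    ("before reset, 150M = 14%", 150_000_000, 14),
    ("after reset, 150M = 40%", 150_000_000, 40),
    # Assumption: the 48% is read here as a share of the weekly allowance.
    ("other user, 280M = 48%", 280_000_000, 48),
]

for label, tokens, pct in reports:
    millions = implied_weekly_allowance(tokens, pct) / 1e6
    print(f"{label}: ~{millions:,.0f}M tokens/week")
```

On these inputs the pre-reset figures work out to roughly 1.0-1.07 billion tokens per week, and the post-reset figures to roughly 375-583 million.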

0 replies
