Has the Plus Quota Really Been Cut in Half? #22126

JavaLittleBoy-Lzx · 2026-05-11T09:30:41Z

I’ve noticed that my Plus quota seems to run out way faster than before, and it honestly feels like it’s been cut in half. I used to use it pretty comfortably, but now I’m hitting the limit much more quickly and the overall experience feels noticeably worse. Is anyone else experiencing the same thing, or is it just me?

Danny-BW · 2026-05-12T10:10:40Z

I'm curious about this and I hope you get some sort of answer, but also fast mode is the default now. Check to make sure you don't have /fast enabled.

e6o · 2026-06-06T13:45:53Z

Adding to @Danny-BW's point about /fast — that's definitely the first thing to check. Fast mode trades quota for speed and it's now enabled by default, which explains the "cut in half" feeling for a lot of people.

Beyond that, there's another factor: the newer models (GPT 5.x series) consume significantly more tokens per request than the previous generation, even for identical prompts. So you're getting fewer interactions per quota unit whether you notice it or not.

If you want to stretch your budget further and you're also using the API directly, one approach that's worked well for us: route tasks to the cheapest model that can handle them. Simple formatting, lint fixes, boilerplate generation — those don't need the latest model. Architecture decisions and complex debugging do. We built InferCut to automate this routing — it's a drop-in proxy that picks the right model tier per task. Free tier available, zero risk (if it can't save cost on a request, it passes through unchanged).
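As a rough illustration of the routing idea only (this is not InferCut's actual API or logic, and the task categories and tier labels are made up for the example), a minimal sketch might look like:

```python
# Hypothetical task categories and tier labels, for illustration only.
CHEAP_TASKS = {"formatting", "lint_fix", "boilerplate"}

def pick_tier(task_kind: str) -> str:
    """Return a model tier for a task.

    Anything not explicitly known to be cheap falls through to the
    larger tier, so misclassified work is never under-served.
    """
    if task_kind in CHEAP_TASKS:
        return "small"
    return "large"

for kind in ("lint_fix", "architecture", "something_unknown"):
    print(kind, "->", pick_tier(kind))
```

Defaulting unknown work to the larger tier trades some cost for safety; a real router would need a better way to classify tasks than a fixed set of labels.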

But yeah, check /fast first — that's the most likely culprit for the sudden quota acceleration.

1 reply

SuccessMoneySparkle Jun 8, 2026

I checked, and it was off. Even so, my weekly limit since the June 4 reset works out to 300-400 million, and multiple people have reported the same thing on Reddit and Discord. One other person checked: 280 million was 48% for him. For me, since the reset, 150 million is 40%. Before the reset I used to get 2.5 billion with 2x and 1.2 billion without 2x, on the same $100 plan, using 5.5 xhigh, and during the promo even using fast mode. For perspective, my limit on the $20 plan, before I switched in early May, was 245 million per week. I could never exhaust the $100 plan because it was overkill, but now it won't last a week. Same programming, same project, same model, same usage.
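For anyone who wants to sanity-check figures like these, here is a rough sketch of the arithmetic: if some amount of usage shows up as a given percentage of the weekly limit, the implied full allowance is the usage divided by that fraction. It assumes the percentage meter scales linearly with usage, which nobody in this thread has confirmed, and the function name is made up for the example.

```python
def implied_weekly_allowance(used: float, percent_used: float) -> float:
    """Estimate the full weekly allowance from usage and the percent meter.

    Assumes the meter is linear in usage (an assumption, not a documented fact).
    """
    if not 0 < percent_used <= 100:
        raise ValueError("percent_used must be in (0, 100]")
    return used / (percent_used / 100)

# User-reported figures from the post above, in millions.
report_a = implied_weekly_allowance(280, 48)   # roughly 583
report_b = implied_weekly_allowance(150, 40)   # roughly 375
earlier_without_2x = 1200                      # 1.2 billion, as reported

print(f"report A implies ~{report_a:.0f}M per week")
print(f"report B implies ~{report_b:.0f}M per week, "
      f"{report_b / earlier_without_2x:.0%} of the earlier figure")
```

The two reports do not imply the same total, so either the accounts differ or the meter is not strictly linear; both estimates still sit well below the 1.2 billion reported before the reset.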

hoodrichpirobo · 2026-06-13T19:39:33Z

There is no public doc I can find that confirms “Plus was cut in half” as an announced change. The public docs describe Plus qualitatively, and Pro only relative to Plus: Pro $100 is 5x Plus usage and Pro $200 is 20x Plus usage. They do not publish a fixed raw token quota for Plus that users can independently compare against.
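Since the docs only give relative multipliers, the most anyone can do is scale a self-measured Plus baseline. A minimal sketch of that, using the user-reported numbers from earlier in this thread (245 million per week on Plus, and 150 million showing as 40% on Pro $100); these are not official figures, and the dictionary keys and function names are just labels I picked:

```python
# Multipliers relative to Plus, as described in the public docs quoted above.
PLAN_MULTIPLIER = {"plus": 1, "pro_100": 5, "pro_200": 20}

def expected_allowance(plus_baseline: float, plan: str) -> float:
    """Scale a Plus baseline by the plan's published multiplier."""
    return plus_baseline * PLAN_MULTIPLIER[plan]

def effective_multiplier(observed: float, plus_baseline: float) -> float:
    """How many 'Plus units' an observed allowance corresponds to."""
    return observed / plus_baseline

# User-reported values, in millions per week (assumptions, not official numbers).
plus_baseline = 245
observed_pro_100 = 150 / 0.40   # 150M shown as 40% used, assuming a linear meter

print(expected_allowance(plus_baseline, "pro_100"))
print(round(effective_multiplier(observed_pro_100, plus_baseline), 2))
```

On those reported numbers, the 5x expectation comes to about 1.2 billion, which matches what the same user says they had before the reset, while the post-reset figure is closer to 1.5x the old Plus baseline. That comparison only means something if the old Plus baseline still applies, which the docs do not let us check.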

What I would check before assuming the quota itself changed:

Run:

/fast status

Fast mode consumes quota faster. Current docs say Fast mode uses credits at:

GPT-5.5: 2.5x standard
GPT-5.4: 2x standard

So if Fast is on, it can feel like the quota was cut heavily even if the nominal limit did not change.
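To put numbers on that: a credit multiplier of m means the same allowance covers 1/m as much work. A small sketch using the multipliers quoted above (the model keys are just labels for this example):

```python
# Fast-mode credit multipliers as quoted from the docs above.
FAST_MULTIPLIER = {"gpt-5.5": 2.5, "gpt-5.4": 2.0}

def effective_share(model: str, fast: bool) -> float:
    """Fraction of standard-mode work that fits in the same quota."""
    return 1 / FAST_MULTIPLIER[model] if fast else 1.0

for model in FAST_MULTIPLIER:
    share = effective_share(model, fast=True)
    print(f"{model}: Fast mode leaves {share:.0%} of standard-mode usage")
```

So with Fast on, GPT-5.4 would feel exactly like a halved quota, and GPT-5.5 would feel worse than halved.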

Run:

/status

Check the active model, rate-limit window, and remaining usage. If possible, capture /status right after reset and again near exhaustion.
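Two such captures are enough for a rough burn-rate estimate. The sketch below assumes you copy the percent-remaining value by hand (it does not parse /status output, since I am not assuming anything about its format) and that usage is roughly linear over time; the example values are invented.

```python
from datetime import datetime, timedelta
from typing import Optional

def projected_exhaustion(t0: datetime, remaining0: float,
                         t1: datetime, remaining1: float) -> Optional[datetime]:
    """Linearly extrapolate when remaining usage reaches 0%.

    remaining0 and remaining1 are percent-remaining values noted by hand.
    Returns None if nothing was consumed between the two captures.
    """
    elapsed = (t1 - t0).total_seconds()
    consumed = remaining0 - remaining1
    if elapsed <= 0 or consumed <= 0:
        return None
    rate = consumed / elapsed                  # percent per second
    return t1 + timedelta(seconds=remaining1 / rate)

# Hypothetical example values, not real measurements.
after_reset = datetime(2026, 6, 4, 9, 0)
one_day_in = datetime(2026, 6, 5, 9, 0)
print(projected_exhaustion(after_reset, 100.0, one_day_in, 60.0))
```

If the projected date lands well before the next reset under the same workload that used to last the whole window, that is the kind of evidence worth attaching to a report.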

Check the model/reasoning level. Higher-capability models and higher reasoning effort can burn through the same quota much faster. For routine local work, the current pricing docs specifically call out GPT-5.4 mini as the higher-usage-limit option.

Compare only like-for-like:

- same plan
- same model
- same reasoning effort
- same fast mode setting
- same surface: CLI/app/web
- same reset window
- similar project/context size
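One way to keep the like-for-like comparison honest is to write the conditions down with each measurement and diff them. A minimal sketch, with field names and example values I made up for illustration:

```python
from dataclasses import dataclass, fields

@dataclass(frozen=True)
class UsageConditions:
    plan: str
    model: str
    reasoning_effort: str
    fast_mode: bool
    surface: str        # "cli", "app", or "web"
    reset_window: str   # e.g. "weekly"

def differences(a: UsageConditions, b: UsageConditions) -> list[str]:
    """Names of the conditions that differ between two measurements."""
    return [f.name for f in fields(a) if getattr(a, f.name) != getattr(b, f.name)]

# Hypothetical example values.
before = UsageConditions("pro_100", "gpt-5.5", "xhigh", True, "cli", "weekly")
after = UsageConditions("pro_100", "gpt-5.5", "xhigh", False, "cli", "weekly")
print(differences(before, after))  # ['fast_mode']
```

Project/context size is left out because it is a judgment call rather than an exact match; an empty result only says the recorded settings agree.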

If all of those are the same and /status shows a much lower effective allowance than before, then it is probably not something the community can definitively answer. It could be a rollout, account-specific limit behavior, a pricing/usage policy change, or a bug in how usage is being counted/displayed.

The best actionable next step is to send OpenAI support:

- plan tier: Plus / Pro $100 / Pro $200
- model used
- /fast status output
- /status output after reset
- /status output near limit
- approximate reset date/time
- whether usage was CLI, app, web, IDE, or cloud

So my short answer is: I would not call it confirmed from the public docs, but if Fast is off and the same model/workload is exhausting a fresh window much earlier, it is worth reporting with /status evidence because the published docs do not give enough quota detail for users to verify the change themselves.
