fix(layers/throttle): precheck bounded read quota #7453
TennyZhuang
left a comment
Cross-review from @clara-claude-pyreview-719124 (Python bindings / staging regression).
Change: extracts wait_for_quota helper shared by read and write, then applies it to reads (previously only writes were throttled).
Read throttle semantics: quota is checked after the read completes (inner.read().await? then wait_for_quota). This is correct — you can't know the byte count before reading. The tradeoff is that data is already in memory if the quota is exceeded, but that's unavoidable for streaming reads.
Refactor: wait_for_quota cleanly deduplicates the len == 0, len > u32::MAX, and until_n_ready logic. Write path is unchanged in behavior.
Tests: read_allows_chunks_within_burst and read_rejects_chunks_larger_than_burst cover both paths. tokio dev-dep addition is correct.
No concerns. LGTM.
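The helper described in the review above can be sketched roughly as follows. This is a minimal, standard-library-only model, not the layer's actual code: `Bucket`, `QuotaError`, and `check_quota` are hypothetical names, the bucket never refills, and where the review mentions `until_n_ready` (which would wait for capacity) this sketch simply returns `WouldBlock`.

```rust
use std::num::NonZeroU32;

/// Hypothetical error type for this sketch; not the layer's real error.
#[derive(Debug, PartialEq)]
enum QuotaError {
    /// The request exceeds u32::MAX bytes and cannot be counted by the limiter.
    TooLarge,
    /// The request is larger than the burst capacity and could never be admitted.
    ExceedsBurst,
    /// Not enough tokens right now; a real limiter would wait here instead.
    WouldBlock,
}

/// Toy token bucket without refill (assumption: stands in for the real limiter).
struct Bucket {
    burst: u32,
    available: u32,
}

impl Bucket {
    fn new(burst: u32) -> Self {
        Bucket { burst, available: burst }
    }

    /// Take `n` tokens, or explain why that is not possible.
    fn try_take(&mut self, n: NonZeroU32) -> Result<(), QuotaError> {
        let n = n.get();
        if n > self.burst {
            return Err(QuotaError::ExceedsBurst);
        }
        if n > self.available {
            return Err(QuotaError::WouldBlock);
        }
        self.available -= n;
        Ok(())
    }
}

/// Shared quota check for both read and write paths in this sketch.
fn check_quota(bucket: &mut Bucket, len: usize) -> Result<(), QuotaError> {
    // Zero-length buffers consume no quota.
    if len == 0 {
        return Ok(());
    }
    // Lengths above u32::MAX cannot be expressed as a token count.
    if len > u32::MAX as usize {
        return Err(QuotaError::TooLarge);
    }
    let n = NonZeroU32::new(len as u32).expect("len is non-zero here");
    bucket.try_take(n)
}
```

With a burst of 10, `check_quota(&mut bucket, 4)` succeeds and leaves 6 tokens, while a length of 11 is rejected outright because no amount of waiting could admit it.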
Staging cross-review note from @clara-claude-pyreview-719124: LGTM. The shared quota helper is clean, and applying quota after each read buffer is the correct throttle-layer behavior for this internal Local validation reported by author:
CI classification: remaining failures are unrelated infra. OCaml jobs fail with the known opam release constraint error. The S3 anonymous behavior job fails before tests because
e59eb51 to 7ea86d6
02fb453 to 103cdca
    .read(path, args)
    .await
    .map(|(rp, r)| (rp, ThrottleWrapper::new(r, limiter)))
    self.inner.read(path, args).await
Curious: do you think it is worth a stat call before IO?
I think the whole purpose of this layer is to prevent excessive client / server resource usage, but the current implementation could let a large IO request slip through, which defeats the purpose.
Which issue does this PR close?
Closes #7328.
Rationale for this change
The throttle layer applies byte quota accounting to writes, but read operations were not covered. A maintainer review clarified that post-read accounting is not useful for production workloads because the request has already been sent and bursty range reads are still possible.
This PR therefore narrows the read-side behavior to bounded/range reads where OpRead::range().size() is known before sending the request. Those reads can reserve quota before delegating to the backend, matching the existing write-side "check before send" semantics.
Unbounded/full-object reads remain out of scope because there is no requested byte count available before calling the inner accessor. The PR intentionally does not pretend to throttle those reads after the response has already arrived.
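As an illustration of the "check before send" decision described above, here is a small sketch under stated assumptions. `ReadRequest`, `Quota`, `Decision`, and `precheck` are hypothetical names invented for this example; `size` only mirrors the idea that OpRead::range().size() yields a byte count for bounded reads and nothing for unbounded ones. The quota has no refill, and the case where the real layer would wait for capacity is modeled as `Deferred`.

```rust
/// Hypothetical stand-in for a read request: `Some(n)` for a bounded/range
/// read of `n` bytes, `None` for an unbounded/full-object read.
struct ReadRequest {
    size: Option<u64>,
}

/// Toy byte quota without refill (assumption: stands in for the real limiter).
struct Quota {
    burst: u64,
    available: u64,
}

#[derive(Debug, PartialEq)]
enum Decision {
    /// Bounded read: quota was reserved before the request is sent.
    Reserved(u64),
    /// Unbounded read: no byte count is known, so it passes through unthrottled.
    PassThrough,
    /// Larger than the burst capacity: rejected before any request is sent.
    Rejected,
    /// Fits the burst but not the current balance; a real limiter would wait.
    Deferred,
}

/// Decide what to do with a read before delegating to the backend.
fn precheck(quota: &mut Quota, req: &ReadRequest) -> Decision {
    match req.size {
        // No requested size is available up front, so nothing can be reserved.
        None => Decision::PassThrough,
        // Zero-length reads consume no quota.
        Some(0) => Decision::Reserved(0),
        Some(n) if n > quota.burst => Decision::Rejected,
        Some(n) if n > quota.available => Decision::Deferred,
        Some(n) => {
            quota.available -= n;
            Decision::Reserved(n)
        }
    }
}
```

The point of the ordering is that every branch is decided from the requested size alone, so an oversized range read never reaches the backend.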
What changes are included in this PR?
Validation:
Are there any user-facing changes?
No API change. Bounded/range reads now participate in throttle quota checks before requests are sent.
AI Usage Statement
This PR was prepared with AI assistance from OpenAI Codex (GPT-5), supervised by @TennyZhuang in the staging regression workflow.