router: add request_body_buffer_limit for large request buffering by agrawroh · Pull Request #40254 · envoyproxy/envoy

agrawroh · 2025-07-16T19:32:37Z

Description

ML/inference requests often require buffering the entire request body, both to choose a routing destination based on content rather than headers and to support retries of failed requests. The existing per_request_buffer_limit_bytes field is a 32-bit value, which is insufficient for large ML payloads that can exceed 4GB.
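To make the 32-bit ceiling concrete, here is a minimal arithmetic sketch. The helper `fitsInUint32` and the 5 GiB example size are illustrative assumptions, not names or values from this PR.

```cpp
#include <cstdint>
#include <limits>

// Largest limit, in bytes, that a 32-bit field can express.
constexpr std::uint64_t kMaxUint32Limit = std::numeric_limits<std::uint32_t>::max();

// Hypothetical payload size used only for illustration: 5 GiB.
constexpr std::uint64_t kExamplePayloadBytes = 5ULL * 1024 * 1024 * 1024;

// Returns true when `bytes` can be stored in a 32-bit limit field without truncation.
constexpr bool fitsInUint32(std::uint64_t bytes) { return bytes <= kMaxUint32Limit; }

static_assert(kMaxUint32Limit == 4294967295ULL, "a 32-bit limit tops out just under 4 GiB");
static_assert(!fitsInUint32(kExamplePayloadBytes), "a 5 GiB limit needs a 64-bit field");
```

A 64-bit field such as the proposed `google.protobuf.UInt64Value` removes this ceiling.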

This PR adds request_body_buffer_limit configuration to VirtualHost and Route for buffering large request bodies beyond connection buffer limits. This enables support for ML/inference workloads that require buffering entire request bodies for processing and retries.

When request_body_buffer_limit is not configured, the existing per_request_buffer_limit_bytes behavior is preserved. Routes inherit from virtual hosts when not explicitly configured.
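The inheritance rule described above can be sketched as follows. The structs and `resolveRequestBodyBufferLimit` are hypothetical stand-ins written for illustration; they are not Envoy's actual classes.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-ins for the virtual host and route configuration.
struct VirtualHostConfig {
  std::optional<std::uint64_t> request_body_buffer_limit;
};
struct RouteConfig {
  std::optional<std::uint64_t> request_body_buffer_limit;
};

// A route-level value wins; otherwise the virtual host value is inherited.
// An empty result means neither is configured, so the existing
// per_request_buffer_limit_bytes behavior would apply.
constexpr std::optional<std::uint64_t>
resolveRequestBodyBufferLimit(const RouteConfig& route, const VirtualHostConfig& vhost) {
  return route.request_body_buffer_limit.has_value() ? route.request_body_buffer_limit
                                                     : vhost.request_body_buffer_limit;
}
```

Returning an empty optional instead of a sentinel keeps "not configured" distinct from an explicit limit of zero.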

See #40028

Commit Message: router: add request_body_buffer_limit for large request buffering
Additional Description: Added request_body_buffer_limit configuration to VirtualHost and Route for buffering large request bodies beyond connection buffer limits.
Risk Level: Low
Testing: Added Unit + Integration Tests
Docs Changes: Added
Release Notes: Added

repokitteh-read-only · 2025-07-16T19:32:43Z

As a reminder, PRs marked as draft will not be automatically assigned reviewers,
or be handled by maintainer-oncall triage.

Please mark your PR as ready when you want it to be reviewed!


repokitteh-read-only · 2025-07-16T19:32:48Z

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to (api/envoy/|docs/root/api-docs/).
envoyproxy/api-shepherds assignee is @abeyad
CC @envoyproxy/api-watchers: FYI only for changes made to (api/envoy/|docs/root/api-docs/).



abeyad · 2025-07-16T21:15:15Z

+  // The maximum bytes which will be buffered for request bodies to support large request body
+  // buffering beyond the ``per_connection_buffer_limit_bytes``.
+  //
+  // This limit is specifically for request body buffering and allows buffering larger inference


what is meant by inference payloads here?

It refers to the Model Serving/Inference workloads which are typically large. See the Background section here. If it's confusing then I'm happy to change it to what you prefer.

Since this is a general API (not specific to ML), perhaps just saying "larger payloads while maintaining..." would be better (i.e. remove "inference")? In any case, I'm fine either way. Thanks!

abeyad

/lgtm api

wbpcode · 2025-07-17T08:48:30Z

+
+  // The maximum bytes which will be buffered for request bodies to support large request body
+  // buffering beyond the ``per_connection_buffer_limit_bytes``.
+  //
+  // This limit is specifically for request body buffering and allows buffering larger inference
+  // payloads while maintaining the flow control.
+  //
+  // If not set, defaults to ``per_request_buffer_limit_bytes`` behavior.
+  //
+  // When set, this limit supersedes ``per_connection_buffer_limit_bytes`` for request body buffering
+  // but ``per_request_buffer_limit_bytes`` still controls flow control chunk sizes.
+  google.protobuf.UInt64Value request_body_buffer_limit = 20;


What's the difference between the per_request_buffer_limit_bytes and request_body_buffer_limit?

/wait-any

I get the context. IMO, this new field and the previous per_request_buffer_limit_bytes are pretty confusing together. @yanavlasov may want per_request_buffer_limit_bytes to help with flow control, but IMO that actually changes the semantics of per_request_buffer_limit_bytes and makes the API hard to understand and use.

So my suggestion here is to deprecate per_request_buffer_limit_bytes and ensure that only one of the new field and the old field can be set.

If flow control is necessary, we could use another number, such as the connection buffer size. This ensures that both fields (per_request_buffer_limit_bytes and request_buffer_limit_bytes) share the same semantics, except that the new field extends the numeric range.

/wait

I agree we can deprecate per_request_buffer_limit_bytes.
If neither per_request_buffer_limit_bytes nor request_body_buffer_limit is set, then the buffering limit is the listener's per_connection_buffer_limit_bytes.

If per_request_buffer_limit_bytes is set but request_body_buffer_limit is not set, the body buffer limit is min(per_request_buffer_limit_bytes, per_connection_buffer_limit_bytes) - the behavior does not change.

If per_request_buffer_limit_bytes is NOT set but request_body_buffer_limit is set, then the buffer limit is request_body_buffer_limit.

If both per_request_buffer_limit_bytes AND request_body_buffer_limit are set, then the buffer limit is request_body_buffer_limit.

For flow control we can use min(per_connection_buffer_limit_bytes, 16 KB) chunk sizes. 16 KB is somewhat arbitrary; we can make it configurable later on.
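The four cases and the flow-control chunk size proposed above can be sketched as a pair of small functions. This is an illustration of the proposal only, not the merged implementation: `BufferLimits`, `effectiveBodyBufferLimit`, and `flowControlChunkBytes` are hypothetical names, and the 16 KB figure is assumed to mean 16 * 1024 bytes.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>

// Hypothetical holder for the three settings discussed in the comment above.
struct BufferLimits {
  std::uint32_t per_connection_buffer_limit_bytes;              // listener-level limit
  std::optional<std::uint32_t> per_request_buffer_limit_bytes;  // legacy 32-bit field
  std::optional<std::uint64_t> request_body_buffer_limit;       // new 64-bit field
};

// Request body buffer limit under the proposed precedence rules.
constexpr std::uint64_t effectiveBodyBufferLimit(const BufferLimits& limits) {
  // Cases 3 and 4: the new field wins whether or not the legacy field is set.
  if (limits.request_body_buffer_limit.has_value()) {
    return *limits.request_body_buffer_limit;
  }
  // Case 2: only the legacy field is set, so it is capped by the connection limit.
  if (limits.per_request_buffer_limit_bytes.has_value()) {
    return std::min(*limits.per_request_buffer_limit_bytes,
                    limits.per_connection_buffer_limit_bytes);
  }
  // Case 1: neither field is set, so the listener's connection limit applies.
  return limits.per_connection_buffer_limit_bytes;
}

// Proposed flow-control chunk size: min(per_connection_buffer_limit_bytes, 16 KB).
constexpr std::uint32_t flowControlChunkBytes(const BufferLimits& limits) {
  return std::min<std::uint32_t>(limits.per_connection_buffer_limit_bytes, 16 * 1024);
}
```

For example, with a 1 MiB connection limit and request_body_buffer_limit set to 8 GiB, the body may be buffered up to 8 GiB while flow control would still operate in 16 KiB chunks.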

yanavlasov

/wait-any


yanavlasov · 2025-07-17T13:54:13Z

@@ -686,6 +686,15 @@ class VirtualHost {
   */
  virtual uint32_t retryShadowBufferLimit() const PURE;


Is this the per_request_buffer_limit_bytes value? If so, can we just rename this method to reduce confusion?


wbpcode · 2025-07-18T08:41:35Z

+  virtual uint32_t perRequestBufferLimit() const PURE;
+
+  /**
+   * @return uint64_t the maximum bytes which should be buffered for request bodies. This enables
+   *         buffering larger request bodies beyond the connection buffer limit for use cases
+   *         with large payloads. If not set, falls back to perRequestBufferLimit() behavior.
+   *         When set, this limit supersedes per_connection_buffer_limit_bytes for request body
+   *         buffering but perRequestBufferLimit() still controls flow control.
+   */
+  virtual uint64_t requestBodyBufferLimit() const PURE;


What's difference between the perRequestBufferLimit and the requestBodyBufferLimit?

I renamed retryShadowBufferLimit() and retry_shadow_buffer_limit, which were tracking per_request_buffer_limit_bytes from the route, to perRequestBufferLimit() and per_request_buffer_limit, per the suggestion from @yanavlasov, as the existing naming is a bit confusing.

I mean, should we keep only a single requestBodyBufferLimit()? In the API, the new request_body_buffer_limit field is meant to replace the old per_request_buffer_limit_bytes.

That is to say, we would have only one buffer limit on the route. It's unnecessary to keep both perRequestBufferLimit() and requestBodyBufferLimit().

/wait

Thanks, I tried consolidating it. Please take another look.


agrawroh · 2025-07-18T22:16:04Z

/retest

agrawroh · 2025-07-18T22:34:51Z

/retest


abeyad · 2025-08-11T21:30:11Z

/lgtm api

agrawroh · 2025-08-11T22:12:44Z

/retest

…2611) We introduced a bug at #40254 where the legacy vhost buffer limit will take precedence over the legacy route buffer limit. Risk Level: low. Testing: unit. Docs Changes: n/a. Release Notes: added. Platform Specific Features: n/a. --------- Signed-off-by: wbpcode/wangbaiping <wbphub@gmail.com>

…voyproxy#42611) We introduced a bug at envoyproxy#40254 where the legacy vhost buffer limit will take precedence over the legacy route buffer limit. Risk Level: low. Testing: unit. Docs Changes: n/a. Release Notes: added. Platform Specific Features: n/a. --------- Signed-off-by: wbpcode/wangbaiping <wbphub@gmail.com> Signed-off-by: Gustavo <grnmeira@gmail.com>

repokitteh-read-only Bot added the api label Jul 16, 2025

repokitteh-read-only Bot assigned abeyad Jul 16, 2025

router: add request_body_buffer_limit for large request buffering

46fc331

Signed-off-by: Rohit Agrawal <rohit.agrawal@databricks.com>

agrawroh force-pushed the large-buffering-1 branch from cd62d01 to 46fc331 Compare July 16, 2025 21:05

abeyad reviewed Jul 16, 2025


agrawroh assigned yanavlasov Jul 16, 2025

agrawroh requested a review from yanavlasov July 16, 2025 21:22

agrawroh marked this pull request as ready for review July 16, 2025 21:23

abeyad reviewed Jul 17, 2025


repokitteh-read-only Bot removed the api label Jul 17, 2025

wbpcode reviewed Jul 17, 2025


repokitteh-read-only Bot added the waiting:any label Jul 17, 2025

wbpcode self-assigned this Jul 17, 2025

repokitteh-read-only Bot added waiting and removed waiting:any labels Jul 17, 2025

yanavlasov reviewed Jul 17, 2025


repokitteh-read-only Bot added waiting:any and removed waiting labels Jul 17, 2025

Merge branch 'main' of github.com:envoyproxy/envoy into large-bufferi…

3e5eca1

…ng-1

repokitteh-read-only Bot added api and removed waiting:any labels Jul 17, 2025

addressed comments from @wbpcode and @yanavlasov

01e98af

Signed-off-by: Rohit Agrawal <rohit.agrawal@databricks.com>

agrawroh force-pushed the large-buffering-1 branch from ca31e21 to fecf789 Compare July 17, 2025 19:12

agrawroh added 3 commits July 17, 2025 13:06

Merge branch 'main' of github.com:envoyproxy/envoy into large-bufferi…

4fb393b

…ng-1 Signed-off-by: Rohit Agrawal <rohit.agrawal@databricks.com>

fixes

f3c7838

Signed-off-by: Rohit Agrawal <rohit.agrawal@databricks.com>

Merge branch 'main' of github.com:envoyproxy/envoy into large-bufferi…

0a099ea

…ng-1

agrawroh force-pushed the large-buffering-1 branch from fecf789 to 0a099ea Compare July 17, 2025 21:30

fixes

d652c1f

Signed-off-by: Rohit Agrawal <rohit.agrawal@databricks.com>

wbpcode reviewed Jul 18, 2025


agrawroh added 4 commits July 18, 2025 01:44

fixes

fcba055

Signed-off-by: Rohit Agrawal <rohit.agrawal@databricks.com>

fixes

e881676

Signed-off-by: Rohit Agrawal <rohit.agrawal@databricks.com>

fixes

f5498c2

Signed-off-by: Rohit Agrawal <rohit.agrawal@databricks.com>

Merge branch 'main' of github.com:envoyproxy/envoy into large-bufferi…

14317e1

…ng-1

agrawroh added 4 commits July 19, 2025 10:04

Merge branch 'main' of github.com:envoyproxy/envoy into large-bufferi…

02ae2df

…ng-1

fixes

89a7e9f

Signed-off-by: Rohit Agrawal <rohit.agrawal@databricks.com>

fixes

8b37196

Signed-off-by: Rohit Agrawal <rohit.agrawal@databricks.com>

fixes

0fc6caf

Signed-off-by: Rohit Agrawal <rohit.agrawal@databricks.com>

agrawroh requested a review from RyanTheOptimist as a code owner July 19, 2025 19:21

repokitteh-read-only Bot added the waiting label Jul 22, 2025

agrawroh added 2 commits August 8, 2025 01:19

Merge branch 'main' of github.com:envoyproxy/envoy into large-bufferi…

abde97b

…ng-1

addressed comments from @wbpcode

5f3ef6e

Signed-off-by: Rohit Agrawal <rohit.agrawal@databricks.com>

repokitteh-read-only Bot removed the waiting label Aug 8, 2025

Merge branch 'main' into large-buffering-1

dbc017a

Signed-off-by: yanavlasov <yavlasov@google.com>

yanavlasov approved these changes Aug 11, 2025


yanavlasov enabled auto-merge (squash) August 11, 2025 20:02

repokitteh-read-only Bot removed the api label Aug 11, 2025

abeyad approved these changes Aug 11, 2025


yanavlasov merged commit 369ace2 into envoyproxy:main Aug 11, 2025
26 checks passed

wbpcode mentioned this pull request Dec 14, 2025

router: fix a bug where the buffer limit has incorrect precedence #42611

Merged

wbpcode mentioned this pull request Mar 28, 2026

router: fix a bug where internal redirect will hang up request or unexpected redirect #44154

Merged

jukie mentioned this pull request Apr 2, 2026

Strange internal_redirect / buffer interactions #44128

Closed
