router: fix a bug where internal redirect will hang up request or unexpected redirect #44154

Merged
wbpcode merged 4 commits into envoyproxy:main from wbpcode:dev-fix-router
Apr 1, 2026

Conversation

@wbpcode
Member

@wbpcode wbpcode commented Mar 28, 2026

Commit Message: router: fix a bug where internal redirect will hang up request or unexpected redirect
Additional Description:

To close #44114

This PR fixes bugs introduced in #40254, as well as an older bug that has existed for a while.

In the previous implementation:

  1. The first chunk of the request body can be larger than the buffer limit. If retry and internal redirect are both enabled (for versions <= 1.35, internal redirect alone is enough), no data is ever added to the filter chain buffer and decodingBuffer() always returns nullptr. This eventually causes an unexpected redirect, because decodingBuffer() == nullptr is used as a flag that the request can be redirected.
  2. For versions > 1.35 with only internal redirect enabled, the request body is always added to the filter chain buffer (the buffering flag is not set correctly). If the buffer overflows before the request completes, the request hangs (because of readDisable(true) and TCP flow control).
  3. For versions > 1.35 with both internal redirect and a retry policy enabled, the chunk X that triggers the buffer overflow is skipped, but all other chunks (those before and after chunk X) are added to the filter chain buffer, which may eventually hang the request. This case is a little more complex:
    • When chunk X triggers the overflow, the retry state is reset and the buffering flag is false, so chunk X is skipped correctly.
    • When new chunks arrive after chunk X, the retry state has already been reset, so the request falls into the internal-redirect-only logic, where the buffering flag is not set correctly, which causes the problem.
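
For readers who want the intended behavior in miniature, here is a rough sketch of a sticky overflow flag. It is a simplified model, not the Envoy router code: `BufferModel`, `onChunk`, `canReplay`, and `firstChunkOverflowExample` are hypothetical names, and the model assumes that once any chunk would exceed the limit, no later chunk is buffered and replay (retry or redirect) is ruled out.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified model of the buffering decision described above.
// It is NOT the actual Envoy implementation.
class BufferModel {
public:
  BufferModel(std::size_t limit, bool replay_enabled)
      : limit_(limit), replay_enabled_(replay_enabled) {}

  // Returns true when the chunk is kept for a possible retry or redirect.
  bool onChunk(std::size_t chunk_size) {
    // Buffer only while a replay is still possible. The overflow flag is
    // sticky, so chunks arriving after the overflow are never buffered.
    const bool buffering = replay_enabled_ && !overflowed_;
    if (!buffering) {
      return false;
    }
    if (buffered_ + chunk_size > limit_) {
      overflowed_ = true; // Give up on replay for the rest of the request.
      return false;
    }
    buffered_ += chunk_size;
    return true;
  }

  // A retry or redirect may replay the body only if no chunk was skipped.
  bool canReplay() const { return replay_enabled_ && !overflowed_; }

private:
  std::size_t limit_;
  bool replay_enabled_;
  std::size_t buffered_ = 0;
  bool overflowed_ = false;
};

// Case 1 from the list: the first chunk alone exceeds the limit. Nothing is
// buffered, yet the model still knows that a replay is not allowed.
inline void firstChunkOverflowExample() {
  BufferModel model(10, true);
  assert(!model.onChunk(11));
  assert(!model.canReplay());
}
```

In this model the decision to redirect depends on the explicit overflow flag rather than on whether a buffer happens to exist, which is the distinction cases 1 and 3 turn on.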

Risk Level: mid, core logic change.
Testing: unit.
Docs Changes: n/a.
Release Notes: added.
Platform Specific Features: n/a.

wbpcode added 3 commits March 28, 2026 13:16
Signed-off-by: wbpcode/wangbaiping <wbphub@gmail.com>
Signed-off-by: wbpcode/wangbaiping <wbphub@gmail.com>
Signed-off-by: wbpcode/wangbaiping <wbphub@gmail.com>
@wbpcode wbpcode added the backport/review Request to backport to stable releases label Mar 28, 2026
@wbpcode
Member Author

wbpcode commented Mar 28, 2026

This needs to be backported to at least 1.36 and 1.37.

Comment on lines -2031 to +2007
- if (downstream_end_stream_ && (!request_buffer_overflowed_ || !callbacks_->decodingBuffer()) &&
-     location != nullptr &&
+ if (downstream_end_stream_ && (!request_buffer_overflowed_) && location != nullptr &&
Member Author

@wbpcode wbpcode Mar 28, 2026

The condition !request_buffer_overflowed_ || !callbacks_->decodingBuffer() results in an unexpected redirect, because the request may have a body whose first chunk overflows the buffer, in which case decodingBuffer() is still nullptr.
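
As an illustration only, the body-related part of the two conditions can be compared in isolation. `RedirectState`, `oldBodyCheck`, and `newBodyCheck` are hypothetical names, `decoding_buffer_is_null` stands in for `callbacks_->decodingBuffer() == nullptr`, and the remaining parts of the real condition (such as the `location != nullptr` check) are left out.

```cpp
// Hypothetical stand-in for the router state that the condition reads.
// Field names mirror the diff, but this is not the Envoy implementation.
struct RedirectState {
  bool downstream_end_stream;
  bool request_buffer_overflowed;
  bool decoding_buffer_is_null; // Models callbacks_->decodingBuffer() == nullptr.
};

// Body-related part of the condition before this PR.
inline bool oldBodyCheck(const RedirectState& s) {
  return s.downstream_end_stream &&
         (!s.request_buffer_overflowed || s.decoding_buffer_is_null);
}

// Body-related part of the condition after this PR.
inline bool newBodyCheck(const RedirectState& s) {
  return s.downstream_end_stream && !s.request_buffer_overflowed;
}
```

With a state where the first chunk overflowed and nothing was buffered (`{true, true, true}`), the old check passes and the new one does not. For a request that never overflowed, both agree.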

Comment on lines -1452 to -1457
// Check if buffer overflow occurred and override error details accordingly
if (request_buffer_overflowed_) {
code = Http::Code::InsufficientStorage;
body = "exceeded request buffer limit while retrying upstream";
details = StreamInfo::ResponseCodeDetails::get().RequestPayloadExceededRetryBufferLimit;
}
Member Author

A buffer overflow affects retries and redirects. But if the upstream aborts (reset, connection failure), we should expose the actual upstream error to the client. This override was also introduced in #40254.

Comment on lines -3338 to -3347
EXPECT_CALL(callbacks_.route_->route_entry_, requestBodyBufferLimit()).WillOnce(Return(10));

NiceMock<Http::MockRequestEncoder> encoder1;
Http::ResponseDecoder* response_decoder = nullptr;
expectNewStreamWithImmediateEncoder(encoder1, &response_decoder, Http::Protocol::Http10);

Http::TestRequestHeaderMapImpl headers{
{"x-envoy-retry-on", "5xx"}, {"x-envoy-internal", "true"}, {"myheader", "present"}};
HttpTestUtility::addDefaultHeaders(headers);
router_->decodeHeaders(headers, false);
Member Author

This PR also removes some duplicated tests for buffer/retry.

@wbpcode
Member Author

wbpcode commented Mar 30, 2026

/retest

agrawroh
agrawroh previously approved these changes Mar 30, 2026
Member

@agrawroh agrawroh left a comment

LGTM, Thank You for fixing it.


// In production, the HCM recreateStream would have called this.
router_->onDestroy();
EXPECT_FALSE(callbacks_.streamInfo().filterState()->hasDataWithName("num_internal_redirects"));
Member

Should we also verify that retry_or_shadow_abandoned gets incremented as we would increment that now per the new flow?

EXPECT_EQ(1U, cm_.thread_local_cluster_.cluster_.info_->stats_store_.counter("retry_or_shadow_abandoned").value());
EXPECT_EQ(1U, cm_.thread_local_cluster_.cluster_.info_->stats_store_.counter("upstream_internal_redirect_failed_total").value());

Comment thread source/common/router/router.cc Outdated
// request.
if (would_exceed_buffer && retry_enabled && !is_redirect_only && !request_buffer_overflowed_) {
// Handle buffer overflow.
if (buffering && would_exceed_buffer && !request_buffer_overflowed_) {
Member

I think we could just remove !request_buffer_overflowed_ from here as it's already covered as part of buffering?

Member Author

good point. 👍

Comment thread test/common/router/router_test.cc Outdated
Http::ResponseDecoder* response_decoder = nullptr;
expectNewStreamWithImmediateEncoder(encoder1, &response_decoder, Http::Protocol::Http10);

// Enable redirects also. This feature will not be used but will affects buffer logic.
Member

Suggested change
// Enable redirects also. This feature will not be used but will affects buffer logic.
// Enable redirects also. This feature will not be used but will affect buffer logic.

Comment thread test/common/router/router_test.cc Outdated
Buffer::OwnedImpl body("t");
router_->decodeData(body, false);
Buffer::OwnedImpl body2("t");
// Ensure the second chunk also isn't buffered and triggers the retry logic again.
Member

Suggested change
// Ensure the second chunk also isn't buffered and triggers the retry logic again.
// Ensure the second chunk also isn't buffered and triggers the retry logic again.
// Ensure subsequent chunks after overflow are also not buffered.

Signed-off-by: wbpcode/wangbaiping <wbphub@gmail.com>
@wbpcode
Member Author

wbpcode commented Apr 1, 2026

Gentle ping @yanavlasov

@wbpcode
Member Author

wbpcode commented Apr 1, 2026

Hah, I see @yanavlasov has had no recent activity on GitHub. Let me ping @ggreenway.

@ggreenway ggreenway removed their assignment Apr 1, 2026
@wbpcode wbpcode merged commit 4f56daf into envoyproxy:main Apr 1, 2026
29 checks passed
@wbpcode wbpcode deleted the dev-fix-router branch April 2, 2026 10:12
@phlax
Member

phlax commented Apr 2, 2026

/backport

wbpcode added a commit to wbpcode/envoy that referenced this pull request Apr 7, 2026
wbpcode added a commit to wbpcode/envoy that referenced this pull request Apr 7, 2026
wbpcode added a commit to wbpcode/envoy that referenced this pull request Apr 7, 2026
…xpected redirect (envoyproxy#44154)

Signed-off-by: wbpcode <wbphub@gmail.com>
wbpcode added a commit to wbpcode/envoy that referenced this pull request Apr 7, 2026
…xpected redirect (envoyproxy#44154)

Signed-off-by: wbpcode <wbphub@gmail.com>
jwendell pushed a commit that referenced this pull request Apr 7, 2026
…xpected redirect (#44154) (#44304)

Commit Message: router: fix a bug where internal redirect will hang up
request or unexpected redirect (#44154)
Additional Description:
Risk Level:
Testing:
Docs Changes:
Release Notes:
Platform Specific Features:
[Optional Runtime guard:]
[Optional Fixes #Issue]
[Optional Fixes commit #PR or SHA]
[Optional Deprecated:]
[Optional [API
Considerations](https://github.com/envoyproxy/envoy/blob/main/api/review_checklist.md):]

Signed-off-by: wbpcode <wbphub@gmail.com>
jwendell pushed a commit that referenced this pull request Apr 7, 2026
…xpected redirect (#44154) (#44303)

Commit Message: router: fix a bug where internal redirect will hang up
request or unexpected redirect (#44154)
Additional Description:
Risk Level:
Testing:
Docs Changes:
Release Notes:
Platform Specific Features:
[Optional Runtime guard:]
[Optional Fixes #Issue]
[Optional Fixes commit #PR or SHA]
[Optional Deprecated:]
[Optional [API
Considerations](https://github.com/envoyproxy/envoy/blob/main/api/review_checklist.md):]

Signed-off-by: wbpcode <wbphub@gmail.com>
@phlax phlax removed the backport/review Request to backport to stable releases label Apr 9, 2026
nshipilov pushed a commit to nshipilov/envoy that referenced this pull request Apr 13, 2026
…xpected redirect (envoyproxy#44154)

Signed-off-by: Nick Shipilov <nick.shipilov.n@gmail.com>
krinkinmu pushed a commit to grnmeira/envoy that referenced this pull request Apr 20, 2026

Development

Successfully merging this pull request may close these issues.

envoy-1.36+: request_body_buffer_limit with internal_redirect_policy regression

6 participants