Skip to content

reverse_tunnels: intercept RPING on upstream reverse connections#43666

Merged
agrawroh merged 1 commit intoenvoyproxy:mainfrom
aakugan:fix/upstream-rc-races
Mar 6, 2026
Merged

reverse_tunnels: intercept RPING on upstream reverse connections#43666
agrawroh merged 1 commit intoenvoyproxy:mainfrom
aakugan:fix/upstream-rc-races

Conversation

@aakugan
Copy link
Contributor

@aakugan aakugan commented Feb 26, 2026

Commit Message

Intercept RPING on upstream reverse connections

Additional Description

If a RPING is sent and then the connection is upgraded to http2 via the conn pool the downstream reverse connection io handle's rping response causes a 502 protocol error.

More specifically: Remote peer returned unexpected data while we expected SETTINGS frame

Additional logs:

2026-02-26 17:01:22.937 trace connection[thread=41 func=onReadReady file=source/common/network/connection_impl.cc:721] [Tags: "ConnectionId":"10614"] read ready. dispatch_buffered_data=0
2026-02-26 17:01:22.937 info misc[thread=41 func=read file=./source/extensions/bootstrap/reverse_tunnel/upstream_socket_interface/reverse_connection_io_handle.h:89] ALERT GOT SOME RPING Bytes: RPING
2026-02-26 17:01:22.937 trace connection[thread=41 func=doRead file=source/common/network/raw_buffer_socket.cc:25] [Tags: "ConnectionId":"10614"] read returns: 20745
2026-02-26 17:01:22.937 info misc[thread=41 func=read file=./source/extensions/bootstrap/reverse_tunnel/upstream_socket_interface/reverse_connection_io_handle.h:89] ALERT GOT SOME RPING Bytes: RPING
2026-02-26 17:01:22.937 trace connection[thread=41 func=doRead file=source/common/network/raw_buffer_socket.cc:39] [Tags: "ConnectionId":"10614"] read error: Resource temporarily unavailable, code: 0
2026-02-26 17:01:22.937 trace http2[thread=41 func=dispatch file=source/common/http/http2/codec_impl.cc:1077] [Tags: "ConnectionId":"10614"] dispatching 20745 bytes
2026-02-26 17:01:22.937 debug http2[thread=41 func=onError file=source/common/http/http2/codec_impl.cc:1395] [Tags: "ConnectionId":"10614"] invalid http2: Remote peer returned unexpected data while we expected SETTINGS frame.  Perhaps, peer does not support HTTP/2 properly.

The Alert was a custom log line added by me for easier debugging.
Its better for both the downstream and upstream rc io handles to share a common read implementation.

Testing

Unit tests.
Additionally, manually validated the example in the docs which also works after the change.

Signed-off-by: aakugan <aakashganapathy2@gmail.com>
@aakugan aakugan force-pushed the fix/upstream-rc-races branch from e5dd925 to f248e1e Compare February 26, 2026 21:16
@basundhara-c
Copy link
Contributor

@aakugan can you detail a little more on the steps leading to this issue? Is this a race between upstream receiving a RPONG in plaintext from downstream, after having sent a data request on a HTTP/2 stream?

@aakugan
Copy link
Contributor Author

aakugan commented Feb 27, 2026

Yes @basundhara-c that is correct under heavy load / latency we observed that the connection may get upgraded ie PRI... was sent and the codec is waiting for the settings frame before the downstream RPONG is received which is then directly passed to the client_codec_ causing this issue.

@RyanTheOptimist
Copy link
Contributor

/assign @botengyao

@botengyao botengyao assigned agrawroh and unassigned botengyao Mar 5, 2026
@agrawroh
Copy link
Member

agrawroh commented Mar 6, 2026

/gemini review

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the RPING handling logic into a new RpingInterceptor class, centralizing the logic and fixing an issue where RPING messages could interfere with HTTP/2 connection upgrades on upstream reverse connections. However, the RpingInterceptor::read() implementation introduces critical flaws that lead to protocol desynchronization and data corruption. Specifically, it leaks partial internal protocol bytes to the caller and returns incorrect byte counts after draining messages, which can cause connection failures and potentially be exploited to bypass protocol-level security checks. Additionally, a pre-existing correctness issue regarding partial RPING message handling has been carried over, which could lead to further protocol errors in downstream filters.

Comment on lines +32 to +58
if (buffer.length() == 0) {
return Api::IoCallUint64Result{expected, Api::IoError::none()};
}

// RPING followed by application data. Disable echo and return the remaining data.
ENVOY_LOG(trace,
"RpingInterceptor: received application data after RPING, "
"disabling RPING echo for FD: {}",
fd_);
ping_echo_active_ = false;
// The adjusted return value is the number of bytes excluding the drained RPING. It should be
// transparent to upper layers that the RPING was processed.
const uint64_t adjusted =
(result.return_value_ >= expected) ? (result.return_value_ - expected) : 0;
return Api::IoCallUint64Result{adjusted, Api::IoError::none()};
}

// If partial data could be the start of RPING (only when fewer than expected bytes).
if (len < expected) {
const absl::string_view rping_prefix =
ReverseConnectionUtility::PING_MESSAGE.substr(0, static_cast<size_t>(len));
if (peek_sv == rping_prefix) {
ENVOY_LOG(trace,
"RpingInterceptor: partial RPING received ({} bytes), waiting "
"for more.",
len);
return result; // Wait for more data.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The RpingInterceptor::read() implementation contains critical flaws that lead to protocol desynchronization and data corruption.

Specifically, at line 58, when a partial RPING is read, the interceptor incorrectly returns the result to the caller. This exposes internal protocol bytes ("RPI...") to upper-layer protocols (like HTTP/2 or TLS), causing protocol errors and connection termination. The interceptor should completely hide RPING messages, including partial ones, by returning an EAGAIN error to signal to the caller that it should not process the buffer and should wait for more data. The partial data will remain in the buffer for the next read operation to complete the message.

Additionally, there are issues with incorrect return values on drain (lines 32-34) where the function incorrectly informs the caller that data was added when the buffer is empty, and data desynchronization on adjusted return values (lines 44-46) where the adjusted return value calculation does not account for bytes read in previous partial reads. These issues collectively cause the Envoy network stack to lose synchronization with the data stream, leading to connection failures and potential exploitation of protocol-level vulnerabilities.

Applying the fix for partial RPING handling will require updating tests that check for partial RPING handling, such as ChoppedRpingCompletesAndDrainsInSingleBuffer in rping_interceptor_test.cc and ReadPartialDataAndStateTransitions in downstream_reverse_connection_io_handle_test.cc.

Suggested change
if (buffer.length() == 0) {
return Api::IoCallUint64Result{expected, Api::IoError::none()};
}
// RPING followed by application data. Disable echo and return the remaining data.
ENVOY_LOG(trace,
"RpingInterceptor: received application data after RPING, "
"disabling RPING echo for FD: {}",
fd_);
ping_echo_active_ = false;
// The adjusted return value is the number of bytes excluding the drained RPING. It should be
// transparent to upper layers that the RPING was processed.
const uint64_t adjusted =
(result.return_value_ >= expected) ? (result.return_value_ - expected) : 0;
return Api::IoCallUint64Result{adjusted, Api::IoError::none()};
}
// If partial data could be the start of RPING (only when fewer than expected bytes).
if (len < expected) {
const absl::string_view rping_prefix =
ReverseConnectionUtility::PING_MESSAGE.substr(0, static_cast<size_t>(len));
if (peek_sv == rping_prefix) {
ENVOY_LOG(trace,
"RpingInterceptor: partial RPING received ({} bytes), waiting "
"for more.",
len);
return result; // Wait for more data.
// Return EAGAIN to prevent the caller from processing the partial RPING data.
// The data remains in the buffer and will be processed with more data on the next read.
return Api::IoCallUint64Result(0, IoSocketError::create(SOCKET_ERROR_AGAIN));

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RPING is only 5 bytes and is usually sent in a single TCP segment, so partial reads are unlikely in practice. Returning EAGAIN would need e2e testing and possibly upstream Envoy changes, so I’d prefer to handle that in a follow-up enhancement and don’t think we should block this PR on it.

Copy link
Member

@agrawroh agrawroh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@basundhara-c We should do a follow-up to remove the partial case if we think it can't happen in practice and see if we could return EAGAIN as suggested.

@agrawroh agrawroh merged commit 50c47e9 into envoyproxy:main Mar 6, 2026
27 of 28 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants