listener: remove the peek from the listener filters #17395

soulxu · 2021-07-18T11:40:03Z

Commit Message: listener: remove the peek from the listener filters
Additional Description:
To avoid the listener filters to peek the data from the socket directly, allow the ActiveTcpSocket to peek the data
into the buffer. The ActiveTcpSocket will get the max data expected from the filters through the
filter interface size_t maxReadBytes(). Then ActiveTcpSocket will listen on the file event on the socket.
And peek the data into the buffer. And calling the listener filter callback onData(Buffer::Instance&).

If the filter doesn't need any data from the connection, it should return 0 on the interface 'maxReadBytes()'.
The ActiveTcpSocket won't listen on the file event if there is no filter expecting any data.

The buffer will be shared across the listener filter. no duplicated peek need anymore.

The buffer is implemented by ListenerFilterBufferImpl. It will store the data peeked from the socket, and provide
the const pointer to the listener filter to access the data. It also provides the interface for listener filter to drain the data,
and the data will be drained on the socket in the end.

Risk Level: high
Testing: unit tests and integration tests
Docs Changes: add document for the new stats
Release Notes: add release note for new stats
Fixes: #17229

Signed-off-by: He Jie Xu hejie.xu@intel.com

soulxu · 2021-07-18T11:41:38Z

cc @mattklein123 @ggreenway @antoniovicente @lambdai

I created this PoC for showing the interface design, so I didn't work out the unittest/functional test since I thought there will be some discussion on the interface design. But if you guys think the test should be work out, I will work on it.

soulxu · 2021-07-18T11:49:12Z

/wait

ggreenway · 2021-07-20T00:04:58Z

I like the direction this is going. I'd like to take it even further, and do a read, not a peek. Then the listener filter can drain data from the buffer if it wants to consume the data (proxy proto listener filter needs to do this), or just look at the contents of the buffer if it only wants to inspect. At the end of the listener filters, pass the buffer (probably non-empty) along with already-read data.

If we went that route, then callback name just becomes onData() like in a network filter.

soulxu · 2021-07-20T07:02:46Z

I like the direction this is going. I'd like to take it even further, and do a read, not a peek. Then the listener filter can drain data from the buffer if it wants to consume the data (proxy proto listener filter needs to do this), or just look at the contents of the buffer if it only wants to inspect. At the end of the listener filters, pass the buffer (probably non-empty) along with already-read data.

If we went that route, then callback name just becomes onData() like in a network filter.

thanks for your review! l will deep into the way of read instead of peek

lambdai · 2021-07-21T16:37:13Z

I like the direction this is going. I'd like to take it even further, and do a read, not a peek. Then the listener filter can drain data from the buffer if it wants to consume the data (proxy proto listener filter needs to do this), or just look at the contents of the buffer if it only wants to inspect. At the end of the listener filters, pass the buffer (probably non-empty) along with already-read data.

If we went that route, then callback name just becomes onData() like in a network filter.

+1

The current duplicate peek includes independent peek among listener filters which you are addressing.
Another potential huge part is the duplicated peek attempt within each listener filter.
The latter one even introduce the risk of 1 extra byte per peek. Not an vulnerability yet but it's can be mitigated with the buffer shared by listener filter attempt.

soulxu · 2021-08-06T00:32:11Z

@ggreenway @lambdai I update the PoC, using read instead of peek. But it is the same as previous the buffer kept in the ActiveTcpSocket. In the end, the buffer will transfer to the ConnectionImpl.

ggreenway

I like this approach a lot! This will make the listener filter interface so much nicer.

The biggest thing left to figure out is getting the data that is read for listener filters to go as input to the Transport Socket, not directly into the ConnectionImpl read buffer.

/wait

envoy/event/dispatcher.h

envoy/network/filter.h

source/common/event/dispatcher_impl.h

source/common/network/connection_impl.cc

source/server/active_tcp_socket.cc

soulxu · 2021-08-06T22:31:39Z

I like this approach a lot! This will make the listener filter interface so much nicer.

The biggest thing left to figure out is getting the data that is read for listener filters to go as input to the Transport Socket, not directly into the ConnectionImpl read buffer.

/wait

Thanks for the feedback! I will dig into the transport socket.

soulxu · 2021-08-13T04:11:56Z

I like this approach a lot! This will make the listener filter interface so much nicer.

The biggest thing left to figure out is getting the data that is read for listener filters to go as input to the Transport Socket, not directly into the ConnectionImpl read buffer.

I update the PoC. I found we have a custom BIO for the SSL socket. And I think all the transport sockets are using ioHandle to read the data.

envoy/source/extensions/transport_sockets/tls/ssl_socket.cc

Lines 75 to 76 in 95038fe

    
           BIO* bio = BIO_new_io_handle(&callbacks_->ioHandle()); 
        
           SSL_set_bio(rawSsl(), bio, bio);

So I inject the data into the ioHandle. When read the data from ioHandle, it will return the injected data first, then ready the data from the real socket.

Then the data read by the listener filter will be as an input to all the transport socket.

source/common/network/io_socket_handle_impl.cc

soulxu · 2021-08-17T00:58:55Z

@mattklein123 @ggreenway I find out a way to move the inspect data into transport, looking for your feedback when you have time, thanks in advance!

ggreenway

Sorry for the delay on this; I've been thinking through these changes.

I think putting the data into the IoHandle makes sense, because then none of the transport sockets need to change. But I'm not thrilled with any IoHandle being able to have data injected to it at any time. Maybe just documenting the usage and adding ASSERTs is good enough. Another idea (not sure if this would work or not) is to have an IoHandle that wraps the real IoHandle, stores the listener-filter data, returns that data from read operations, and passes all other interface calls to the real/wrapped IoHandle (including reads when the buffer is empty). I'm not sure if that will actually be an improvement or not, but it would at least mean that all of these changes aren't present for outgoing/upstream connections.

@mattklein123 what are your thoughts on these interfaces?

There are some other comments I'll have regarding the implementation, but I want to settle the interfaces somewhat before spending time on that.

/wait-any

envoy/network/connection.h

mattklein123 · 2021-08-19T19:51:00Z

Thanks for working on this. I think this will be a great improvement. Here are my high level thoughts, riffing a bit on some of @ggreenway ideas. I think the way I would think about this at a high level would look something like:

I think I would modify the virtual ConnectionSocket& socket() PURE; ListenerFilterCallbacks function to return something other than ConnectionSocket, which is really the root of the problem (we are giving listener filters way too much power). I think I would create a new class called something like ListenerFilterSocket with a very limited set of functions to access the underlying socket/connection.
I like your idea around having each listener filter say how much data it needs to look at, but I would go a step further and have each listener filter say whether it only needs to peek the data, or if it needs to modify it. Most listener filters only need to peek the data. In these cases, you don't actually have to read the data out of the socket and do the injection dance. You can just do the peeking outside the listener filter iteration and provide a const buffer of the peeked data up to the limit, much like each filter is doing today.
In the cases where the listener filter asks for data to modify up to a limit, we would read out into a buffer, but I think we can do this entirely outside of the IoHandle() interface, and then provide this data to listener filters in their read functions (potentially both a const and non-const version depending on what they ask for).

It's possible that we can greatly simplify this if we really only have 2 cases which are:

Peeking only.
Removal only (proxy proto)

Do we have any cases of straight up modification? If not I think we can punt this and at that point we don't need to inject at all. We either peek only read and remove only.

If we do have to modify I think we could then have the logic to send the modified data + potential peeked data to further filters, and then finally inject any extra data directly into the connection when it is created.

^ is not fully formed but this is how I would think about it personally. @ggreenway wdyt?

ggreenway · 2021-08-19T20:17:59Z

I'm not sure we can make removing data work without passing some read data to the next filters, because we need to read the data to know exactly how many bytes we need to consume.

But I suppose a hybrid of these approaches is to always pass data to the listener filter via peek, but give the listener filter a callback to drain some number of bytes from the connection. The implementation of that would be a read() call, but discarding the data (since it has already been inspected).

I've never really liked using PEEK because if the data arrives in multiple segments, we have to re-read data multiple times. For something like a large TLS clienthello, sent pathologically (1 byte at a time), this can add up to a substantial amount of wasted effort. But it does avoid having to muck with IoHandle at all, which is nice. I think the most important thing at this point is to get the interface to the ListenerFilter correct; the interface shouldn't depend on our underlying implementation of how we read from sockets.

soulxu · 2021-08-20T01:13:09Z

I think putting the data into the IoHandle makes sense, because then none of the transport sockets need to change. But I'm not thrilled with any IoHandle being able to have data injected to it at any time. Maybe just documenting the usage and adding ASSERTs is good enough. Another idea (not sure if this would work or not) is to have an IoHandle that wraps the real IoHandle, stores the listener-filter data, returns that data from read operations, and passes all other interface calls to the real/wrapped IoHandle (including reads when the buffer is empty). I'm not sure if that will actually be an improvement or not, but it would at least mean that all of these changes aren't present for outgoing/upstream connections.

Yes, totally agree with you. It is bad we have injection but we don't use it for some place. It is a great idea to have wrapped IoHandle for listener. If we are going to the road to use read, I will try the way you suggested.

soulxu · 2021-08-20T01:23:32Z

I think I would modify the virtual ConnectionSocket& socket() PURE; ListenerFilterCallbacks function to return something other than ConnectionSocket, which is really the root of the problem (we are giving listener filters way too much power). I think I would create a new class called something like ListenerFilterSocket with a very limited set of functions to access the underlying socket/connection.

+1, totally agree that we shouldn't expose the underlayer to the listner filter

I like your idea around having each listener filter say how much data it needs to look at, but I would go a step further and have each listener filter say whether it only needs to peek the data, or if it needs to modify it. Most listener filters only need to peek the data. In these cases, you don't actually have to read the data out of the socket and do the injection dance. You can just do the peeking outside the listener filter iteration and provide a const buffer of the peeked data up to the limit, much like each filter is doing today.

In the cases where the listener filter asks for data to modify up to a limit, we would read out into a buffer, but I think we can do this entirely outside of the IoHandle() interface, and then provide this data to listener filters in their read functions (potentially both a const and non-const version depending on what they ask for).

If the listener filter ask for data modifying, then we still need the data injection dance, right?

It's possible that we can greatly simplify this if we really only have 2 cases which are:

Peeking only.

Removal only (proxy proto)

Do we have any cases of straight up modification? If not I think we can punt this and at that point we don't need to inject at all. We either peek only read and remove only.

@lambdai mentioned the data modification in the issue, maybe he has any usecase.

If no data modification usecase, I guess you mean it back to the initial version of this PoC, peek the data, and an interface for listener to tell how much data they want to remove from the socket. Then using the read to drain the data from the socket.

I kind of like the current interface onData(Buffer::Instance), since it feels simple and clear for the listener filter writer. We needn't another interface for how much the data we want to remove. The filter only operates on the Buffer::Instance, then that is what will get for network filters, I feel it is straight forward for the listener filter developer.

If we do have to modify I think we could then have the logic to send the modified data + potential peeked data to further filters, and then finally inject any extra data directly into the connection when it is created.

emm...so we still need the inject data for modifying. I'm not sure about the benefit of hybrid the peek and read here, it feels like a more complex interface and implementation. maybe I missed some points here.

soulxu · 2022-03-25T04:03:29Z

@soulxu, @ggreenway is out this week. We will get this merged next week. Thank you for your patience.

got it, thanks!

adisuissa · 2022-03-29T13:41:07Z

@ggreenway @yanavlasov gentle ping :)

@soulxu DCO seems to have some issue, can you PTAL?

soulxu · 2022-03-29T13:57:31Z

Seems stuck at somewhere

/retest

repokitteh-read-only · 2022-03-29T13:57:34Z

Retrying Azure Pipelines:
Check envoy-presubmit didn't fail.

🐱

Caused by: a #17395 (comment) was created by @soulxu.

see: more, trace.

ggreenway · 2022-03-29T20:15:23Z

I don't think you can re-trigger DCO-bot. Can you push an empty commit, or merge main and push, to re-trigger DCO-bot?

ggreenway

This looks great! Thanks for the all the work you've put into this!

mattklein123 · 2022-03-29T22:52:31Z

I can just force merge as all the commits have sign off. Thank you!

soulxu · 2022-03-30T00:06:02Z

This looks great! Thanks for the all the work you've put into this!

@ggreenway @mattklein123 thanks for the guidance again!

moderation · 2022-03-30T04:49:04Z

I thought it might be due to my upgrade to Bazel 5.1.0 but I'm now getting a MacOS M1 aarch compilation error on a file changed by this PR

ERROR: /Users/moderation/Applications/envoyproxy/envoy/source/extensions/filters/listener/tls_inspector/BUILD:15:17: Compiling source/extensions/filters/listener/tls_inspector/tls_inspector.cc failed: (Aborted): wrapped_clang_pp failed: error executing command external/local_config_cc/wrapped_clang_pp '-D_FORTIFY_SOURCE=1' -fstack-protector -fcolor-diagnostics -Wall -Wthread-safety -Wself-assign -fno-omit-frame-pointer -g0 -O2 -DNDEBUG ... (remaining 141 arguments skipped)

Use --sandbox_debug to see verbose messages from the sandbox
In file included from source/extensions/filters/listener/tls_inspector/tls_inspector.cc:1:
./source/extensions/filters/listener/tls_inspector/tls_inspector.h:83:12: error: virtual function 'maxReadBytes' has a different return type ('uint64_t' (aka 'unsigned long long')) than the function it overrides (which has return type 'size_t' (aka 'unsigned long'))
  uint64_t maxReadBytes() const override { return config_->maxClientHelloSize(); }
  ~~~~~~~~ ^
./envoy/network/filter.h:347:18: note: overridden virtual function is here
  virtual size_t maxReadBytes() const PURE;
          ~~~~~~ ^
1 error generated.
Error in child process '/usr/bin/xcrun'. 1
Target //source/exe:envoy-static.stripped failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 407.055s, Critical Path: 50.81s
INFO: 4027 processes: 1165 internal, 2862 darwin-sandbox.
FAILED: Build did NOT complete successfully

soulxu · 2022-03-30T04:53:07Z

source/extensions/filters/listener/tls_inspector/tls_inspector.cc

Yea, it should be the size_t, let me fix it. Thanks for pointing this out!

kyessenov · 2022-04-07T16:31:21Z

I suspect this is causing issues with Istio import. The following minimal bootstrap is now not performing TCP proxying:

node:
  id: test
static_resources:
  listeners:
    name: client
    traffic_direction: OUTBOUND
    address:
      socket_address:
        address: 127.0.0.1
        port_value: 9090
    listener_filters:
    - name: "envoy.filters.listener.tls_inspector"
    - name: "envoy.filters.listener.http_inspector"
    filter_chains:
    - filters:
      - name: tcp_proxy
        typed_config:
          "@type": type.googleapis.com/udpa.type.v1.TypedStruct
          type_url: envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
          value:
            stat_prefix: outbound
            cluster: outbound
  clusters:
  - name: outbound
    type: STATIC
    load_assignment:
      cluster_name: outbound
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 127.0.0.1
                port_value: 9091

Every connection is terminated. Seems like some bug got introduced into HTTP inspector.

mattklein123 · 2022-04-07T16:37:32Z

@kyessenov not surprised there would be a regression here. Feel free to revert until we sort it out. Thank you. cc @soulxu

kyessenov · 2022-04-07T16:41:00Z

I looked at the PR, and it didn't change cb.socket().close() statement after unsuccessful parse so I'm puzzled how this worked before. Will try to compare more.

ggreenway · 2022-04-07T17:00:46Z

Given this is something that istio depends on, can you create an integration test that captures the failure? Both to help diagnose this issue, and to prevent future regressions. It may be a failure when there are two listener filters; i don't think we have much coverage for that situation at all.

kyessenov · 2022-04-07T17:03:02Z

I can confirm it's this PR for the test case above. It's just one filter I think - before the socket close wasn't doing anything somehow.

kyessenov · 2022-04-07T17:18:09Z

I think I understand what happened - prior version was signaling ParseError on read errors. The refactor conflated the actual parse error and continued closing the socket.

mattklein123 · 2022-04-07T17:20:03Z

OK if you want to fix forward we can do that or revert for now. Up to you.

kyessenov · 2022-04-07T18:09:24Z

Although this looks like a one liner fix to remove socket.close() and return Continue, let's revert, and an integration test for HTTP (which is missing) and then validate again.

soulxu · 2022-04-07T23:42:44Z

@kyessenov Thanks for diagnosing this, let me fix it.

Signed-off-by: He Jie Xu <hejie.xu@intel.com>

soulxu · 2022-04-08T05:15:25Z

source/extensions/filters/listener/http_inspector/http_inspector.cc

    // As per discussion in https://github.com/envoyproxy/envoy/issues/7864
    // we don't add new enum in FilterStatus so we have to signal the caller
    // the new condition.
-    cb.socket().close();
+    cb_->socket().close();
    return Network::FilterStatus::StopIteration;


This should be continue, and not close the socket.

soulxu · 2022-04-08T05:18:13Z

test/extensions/filters/listener/http_inspector/http_inspector_test.cc

+    EXPECT_CALL(socket_, close());
+    file_event_callback_(Event::FileReadyType::Read);
+    auto status = filter_->onData(*buffer_);
+    EXPECT_EQ(status, Network::FilterStatus::StopIteration);


This is wrong, I changed the original test behavior.

Signed-off-by: He Jie Xu <hejie.xu@intel.com>

soulxu requested review from asraa, ggreenway, lizan and snowp as code owners July 18, 2021 11:40

repokitteh-read-only bot added the waiting label Jul 18, 2021

mattklein123 self-assigned this Jul 19, 2021

ggreenway self-assigned this Jul 20, 2021

repokitteh-read-only bot removed the waiting label Aug 5, 2021

ggreenway requested changes Aug 6, 2021

View reviewed changes

repokitteh-read-only bot added the waiting label Aug 6, 2021

repokitteh-read-only bot removed the waiting label Aug 13, 2021

soulxu commented Aug 13, 2021

View reviewed changes

source/common/network/io_socket_handle_impl.cc Outdated Show resolved Hide resolved

soulxu commented Aug 14, 2021

View reviewed changes

source/common/network/io_socket_handle_impl.cc Outdated Show resolved Hide resolved

soulxu commented Aug 14, 2021

View reviewed changes

source/common/network/io_socket_handle_impl.cc Outdated Show resolved Hide resolved

ggreenway reviewed Aug 19, 2021

View reviewed changes

envoy/network/connection.h Outdated Show resolved Hide resolved

repokitteh-read-only bot added the waiting:any label Aug 19, 2021

repokitteh-read-only bot removed the waiting:any label Aug 19, 2021

soulxu mentioned this pull request Mar 28, 2022

docs: updating Codeowners #20512

Merged

ggreenway approved these changes Mar 29, 2022

View reviewed changes

mattklein123 approved these changes Mar 29, 2022

View reviewed changes

mattklein123 merged commit 93b0a4e into envoyproxy:main Mar 29, 2022

soulxu mentioned this pull request Mar 30, 2022

listener: fix tls inspector's maxReadBytes() return type #20579

Merged

kyessenov mentioned this pull request Apr 7, 2022

build fixes istio/proxy#3801

Merged

This was referenced Apr 7, 2022

Revert: listener: remove the peek from the listener filters #20718

Merged

http_inspector: add basic integration test #20720

Merged

soulxu added a commit to soulxu/envoy that referenced this pull request Apr 8, 2022

listener: remove the peek from the listener filters (envoyproxy#17395)

c143c69

Signed-off-by: He Jie Xu <hejie.xu@intel.com>

soulxu commented Apr 8, 2022

View reviewed changes

ravenblackx pushed a commit to ravenblackx/envoy that referenced this pull request Jun 8, 2022

listener: remove the peek from the listener filters (envoyproxy#17395)

3d35a28

Signed-off-by: He Jie Xu <hejie.xu@intel.com>

listener: remove the peek from the listener filters #17395

listener: remove the peek from the listener filters #17395

Conversation

soulxu commented Jul 18, 2021 • edited

soulxu commented Jul 18, 2021

soulxu commented Jul 18, 2021

ggreenway commented Jul 20, 2021

soulxu commented Jul 20, 2021

lambdai commented Jul 21, 2021

soulxu commented Aug 6, 2021

ggreenway left a comment

Choose a reason for hiding this comment

soulxu commented Aug 6, 2021

soulxu commented Aug 13, 2021 • edited

soulxu commented Aug 17, 2021

ggreenway left a comment

Choose a reason for hiding this comment

mattklein123 commented Aug 19, 2021 • edited

ggreenway commented Aug 19, 2021

soulxu commented Aug 20, 2021 • edited

soulxu commented Aug 20, 2021 • edited

soulxu commented Mar 25, 2022

adisuissa commented Mar 29, 2022

soulxu commented Mar 29, 2022

repokitteh-read-only bot commented Mar 29, 2022

ggreenway commented Mar 29, 2022

ggreenway left a comment

Choose a reason for hiding this comment

mattklein123 commented Mar 29, 2022

soulxu commented Mar 30, 2022

moderation commented Mar 30, 2022

soulxu commented Mar 30, 2022

kyessenov commented Apr 7, 2022

mattklein123 commented Apr 7, 2022

kyessenov commented Apr 7, 2022

ggreenway commented Apr 7, 2022

kyessenov commented Apr 7, 2022

kyessenov commented Apr 7, 2022

mattklein123 commented Apr 7, 2022

kyessenov commented Apr 7, 2022

soulxu commented Apr 7, 2022

soulxu Apr 8, 2022

Choose a reason for hiding this comment

soulxu Apr 8, 2022

Choose a reason for hiding this comment

soulxu commented Jul 18, 2021 •

edited

soulxu commented Aug 13, 2021 •

edited

mattklein123 commented Aug 19, 2021 •

edited

soulxu commented Aug 20, 2021 •

edited

soulxu commented Aug 20, 2021 •

edited