Integrate mux parser into stirling #367
Can one of the admins verify this patch?
28389ae to b719d55
Mostly looks good. Let's talk on Slack about how to test the Mux protocol inference rules.
@@ -487,6 +487,66 @@ static __inline bool is_redis_message(const char* buf, size_t count) {
  return true;
}

static __inline enum message_type_t infer_mux_message(const char* buf, size_t count) {
  static const int8_t kTreq = 1;
Checking for a mux_type of 1 or -1 is probably not robust enough, because another protocol is likely to have a 1 at that location. We have an internal network traffic dataset that we use to test how protocol inference rules affect the accuracy of classifying different protocols. I suggest we get some Mux data from a real application, add it to the dataset, and test its effect on the other protocols before merging the protocol inference code. See experimental/stirling/protocol_inference.
I attempted to strengthen the mux protocol inference by reducing the number of recognized types (R/Tinit, R/Tdispatch, Rerr(Old)) and by parsing the 'special' strings that are present in those frame types. Ideally we would support all mux types, but some of them aren't as high value.
If I remember correctly, protocol inference runs on a connection until it's classified, and it is only given the data from the syscall, correct? If so, I was hoping that stronger verification on Rerr, Tinit and Rinit would help reduce false positives, since those frames appear earlier in the connection lifecycle. Let me know what you think of my latest update.
I am definitely interested in trying the protocol inference against the network traffic dataset. I shared the mux pcap file that I generated myself with you on Slack. Unfortunately I don't know whether I'll be able to provide real application traffic, but let me know how I can help with this testing.
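To make the idea in this comment concrete, here is a minimal user-space sketch of a stricter mux check. It is not the code from this PR: the names `MuxGuess`, `InferMux` and `ReadBE32` are hypothetical, and it assumes the mux frame layout of a 4-byte big-endian length, a 1-byte signed type and a 3-byte tag, with the type values taken from the Finagle mux specification. It also assumes that the buffer holds exactly one whole frame and that valid tags lie in 1..2^23-1. The 'special' string checks mentioned above are left out because the thread does not name the strings.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical result type for this sketch (not Stirling's message_type_t).
enum class MuxGuess { kUnknown, kRequest, kResponse };

// The reduced set of mux types discussed above (values assumed from the mux spec).
constexpr int8_t kTdispatch = 2;
constexpr int8_t kRdispatch = -2;
constexpr int8_t kTinit = 68;
constexpr int8_t kRinit = -68;
constexpr int8_t kRerr = -128;
constexpr int8_t kRerrOld = 127;

// 4-byte length + 1-byte type + 3-byte tag.
constexpr size_t kMuxHeaderSize = 8;

// Reads a big-endian 32-bit integer without relying on host byte order.
inline uint32_t ReadBE32(const unsigned char* p) {
  return (static_cast<uint32_t>(p[0]) << 24) | (static_cast<uint32_t>(p[1]) << 16) |
         (static_cast<uint32_t>(p[2]) << 8) | static_cast<uint32_t>(p[3]);
}

inline MuxGuess InferMux(const char* buf, size_t count) {
  if (buf == nullptr || count < kMuxHeaderSize) return MuxGuess::kUnknown;
  const auto* p = reinterpret_cast<const unsigned char*>(buf);

  // The length field counts everything after itself: type + tag + body.
  const uint32_t length = ReadBE32(p);
  if (length < 4) return MuxGuess::kUnknown;
  // Assumption: the syscall handed us exactly one complete frame.
  if (static_cast<uint64_t>(length) + 4 != count) return MuxGuess::kUnknown;

  const int8_t type = static_cast<int8_t>(p[4]);

  // 24-bit tag; assumption: 0 and values with the top bit set are rejected.
  const uint32_t tag = (static_cast<uint32_t>(p[5]) << 16) |
                       (static_cast<uint32_t>(p[6]) << 8) | static_cast<uint32_t>(p[7]);
  if (tag < 1 || tag > (1u << 23) - 1) return MuxGuess::kUnknown;

  switch (type) {
    case kTdispatch:
    case kTinit:
      return MuxGuess::kRequest;
    case kRdispatch:
    case kRinit:
    case kRerr:
    case kRerrOld:  // Positive value, but still a response.
      return MuxGuess::kResponse;
    default:
      // Includes Treq (1) and Rreq (-1), which the review flagged as too weak a signal.
      return MuxGuess::kUnknown;
  }
}
```

Even with the length and tag checks, this is still a heuristic, so the suggestion above to measure it against the traffic dataset applies to a check like this one as well.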
namespace stirling {

// clang-format off
static constexpr DataElement kMuxElements[] = {
Do you plan to add a body field in the future? If so, let's add a TODO.
I think it depends on how we handle the nested thrift protocol data in thriftmux. I was picturing that the "body" field would be stored in a different table populated by a thrift protocol parser. That would allow non-thriftmux users to get thrift protocol parsing.
However, it's not clear to me how we would join the mux and thrift protocol data in that situation. In an ideal world, the mux and thrift data would be easily joined. Has there been any discussion on how to handle protocols that are nested like this?
We haven't had nested protocols like this before, so we'll be treading new ground. The problem is that we have only one ConnTracker per connection, and there's an assumption that the ConnTracker will output to one table.
I suppose we could detect the more specific inner protocol and choose the more specific one when possible. Needs more discussion.
I expected this would require some discussion :) and good point regarding the ConnTracker 1:1 mapping. For now I've added a TODO comment explaining the future work.
e3bceb9 to 79b228a
…in a bug fix), remove unused client image and add class to use image Signed-off-by: Dom Del Nano <ddelnano@gmail.com>
…er image Signed-off-by: Dom Del Nano <ddelnano@gmail.com>
…es removes the buffer data correctly Signed-off-by: Dom Del Nano <ddelnano@gmail.com>
Signed-off-by: Dom Del Nano <ddelnano@gmail.com>
…ng for mux's 24 bit tag field Signed-off-by: Dom Del Nano <ddelnano@gmail.com>
…ire frame is provided Signed-off-by: Dom Del Nano <ddelnano@gmail.com>
…_test Signed-off-by: Dom Del Nano <ddelnano@gmail.com>
…d inside the mux protocol) Signed-off-by: Dom Del Nano <ddelnano@gmail.com>
…g postgres traffic Signed-off-by: Dom Del Nano <ddelnano@gmail.com>
5e2c268 to 84767f8
@chengruizhe I removed the part of the mux protocol inference that was misclassifying Postgres traffic.
Codecov Report
@@ Coverage Diff @@
## main #367 +/- ##
==========================================
- Coverage 63.44% 63.42% -0.02%
==========================================
Files 1044 1048 +4
Lines 65051 65254 +203
==========================================
+ Hits 41271 41390 +119
- Misses 22553 22631 +78
- Partials 1227 1233 +6
Marking this as closed since the majority has been pulled in. The last piece (eBPF mux protocol inference) will be submitted as a separate PR.
This is my second attempt at integrating the mux parser into stirling. I'm opening this preemptively to show that the testing for #366 was successful, but we need to merge that PR first and then rebase to pull out the Docker image related changes.
Todo