-
Notifications
You must be signed in to change notification settings - Fork 843
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RDMA sniffing support for pcap #585
Conversation
45f98f4
to
7473a0d
Compare
The Travis CI automated builds show that my changes don't regress - but because the Linux build environments are old (Ubuntu 12.04 includes libibverbs 1.1.5), the configure script does not enable my new module, and so the C code isn't compiled. |
configure.ac
Outdated
|
||
if test "xxx_only" = yes; then | ||
# User requested something-else-only pcap, so they don't | ||
# want D-Bus support. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Presumably you mean "RDMA capture support" rather than "D-Bus support" here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, you can guess where I copy and pasted that code snipped from :)
Fixed in the updated commit.
pcap-rdmasniff.c
Outdated
if (strlen(dev_list[i]->name) == namelen && | ||
!strncmp(device, dev_list[i]->name, namelen)) { | ||
p = pcap_create_common(ebuf, sizeof (struct pcap_rdmasniff)); | ||
if (p) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If p is NULL, you should just return NULL, with *is_ours set to 1 - pcap_create_common() failing isn't "this isn't an RDMA capture device", it's "this is an RDMA capture device, but something went wrong trying to create the pcap_t
for it".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, thanks for clarifying the pcap interface for me.
} | ||
|
||
for (i = 0; i < numdev; ++i) { | ||
if (!add_dev(devlistp, dev_list[i]->name, 0, "RDMA sniffer", err_str)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are all the RDMA sniffer devices the same, or do different devices have different purposes? If they have different purposes, the description string should probably indicate the purpose of the device in question.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The difference in devices is analogous to the differences between network interfaces ("ethX" and "ethY"). For example, on my system I see:
# tcpdump --list-interfaces | grep RDMA
11.mlx5_3 (RDMA sniffer)
12.mlx5_2 (RDMA sniffer)
13.mlx5_1 (RDMA sniffer)
14.mlx5_0 (RDMA sniffer)
because I have two 2-port adapters installed - each device corresponds to capturing packets on one port of an adapter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the names should probably be more than just "RDMA sniffer", they should include some indication of the port on which they'll capture traffic.
Other than the questions/comments I added, it looks OK. |
A couple of questions:
|
dc4ac72
to
b08dff8
Compare
With regards to the questions:
|
Roland Dreier <notifications@github.com> wrote:
The Travis CI automated builds show that my changes don't regress -
but because the Linux build environments are old (Ubuntu 12.04
includes libibverbs 1.1.5), the configure script does not enable my
new module, and so the C code isn't compiled.
Yes, it's a bit of a pain in the ass.
Supposedly, travis is working on updates to 14.x or 16.x, but last time I
looked it wasn't done. They also used to make EC2/etc. VMs configured
identically to theirs in which one could debug on, but that isn't the case
anymore.
If your dependancies are available as dpkg, you could attempt to add them to
the packages; if not then you could configure them with a build script.
See what I do here, where I want the latest libpcap in unstrung:
https://github.com/mcr/unstrung/blob/master/.travis.yml
https://github.com/mcr/unstrung/blob/master/build-setup-travis.sh
I build things, and then ask Travis to cache them.
…--
] Never tell me the odds! | ipv6 mesh networks [
] Michael Richardson, Sandelman Software Works | network architect [
] mcr@sandelman.ca http://www.sandelman.ca/ | ruby on rails [
|
I looked on the code, seems crazily simple and cool +static const int RDMASNIFF_RECEIVE_SIZE = 10000; I haven't use jumbo frames with mlx5, but I guess the max is 9k, so you're fine We (Mellanox) made some runs with your patch, it turns out that it even works for ConnectX3!! |
As far as I know, this code is ready to go. @guyharris is there anything further you need? How does this pull request actually get landed? Thanks! |
As per my comment, should the descriptions for the "RDMA sniffing" devices include an indication of the port on which the device in question will sniff? The description, if available, is supposed to say more than what the device does, e.g. USB sniffing devices indicate on which bus they'll sniff. |
Implement capture support for offloaded RDMA traffic. This uses the RDMA verbs "flow steering" interface, which is available in the Linux kernel since version 3.12. The userspace interface is ibv_create_flow() - so building this support in pcap adds a new dependency on libibverbs. I added a new "rdmasniff" pcap module, which exposes RDMA devices under an interface name equal to their libibverbs name. The module uses the RDMA verbs interface to create a receive queue with a flow steering rule that gets a copy of all packets, even offloaded packets generated by or consumed by the hardware. The autoconf test for a usable version of libibverbs is a bit complicated because ibv_create_flow() is defined as an inline function in the header file, so we need to find the library and header and then try to link a program to check if the API is usable (it appeared in libibverbs 1.1.8).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any updates on merging this request?
Any updates on my question about device names? |
Hi Guy, If device has multiple ports, sniffing capability is generally on both all the ports. From the code it appears that to sniff on particular port, the format is something like below. Default it sniff on port 0. |
OK, so:
|
|
So, for ethX and ethY devices, different devices are normally on different networks; if you want to see the traffic on network X, you sniff on ethX, and if you want to see the traffic on network Y, you sniff on ethY. Do the RDMA devices differ in a similar fashion, so somebody would know that they'd want to sniff on |
Your description is correct. User would know whether he/she wants to sniff which device. Similar to network device which are identified based on device name, RDMA devices are also identified based on "device name" such as mlx5_0, mlx4_0, rxe0 etc. At this point sniffing traffic on particular network_port or connection are not supported. Its limited to sniff only at device level. But possibly it can extend in future for such extra sniffing parameters depending on users need. |
So it looks as if there's no human-readable description that could be given for a particular device identifying not only that it's an RDMA device but also that it's a device corresponding to whatever a particular RDMA device would correspond. It also looks as if you wouldn't necessarily want to open, for example, |
I think Wireshark GUI should let user choose device and its port, so pcap_findalldevs() should create a object that represents mlx5_0:0, mlx5_0:1 and sniffing should happen on it. I haven't reviewed the code frankly. But pcap_findalldevs() should provide information of device and port both, so that Wireshark (and human user) can pick appropriate port of it. Not sure if this can be done incrementally without breaking compatibility between pcap and Wireshark. |
I think the Wireshark GUI should let you do more than just select devices from the list it gets from
for netmap devices, so we have to change the way the Wireshark GUI works to allow arbitrary strings to be specified as interfaces, or to allow a particular pcap module to indicate module-specific parameters that can be specified by the program. I have a project to support those module-specific parameters; they would be specified with a command-line flag for command-line programs such as tcpdump or TShark, and GUI programs would have to ask libpcap what parameters a particular device supports and offer them in the GUI. In the case of RDMA sniffing devices, the port number would be one such parameter. But that doesn't yet exist, so you would either have to
or
|
I would go with (2) for now given the need of this feature. In future rdmasniff_findalldevs() can be enhanced to add per port sniffing device. |
Implement capture support for offloaded RDMA traffic. This uses the RDMA
verbs "flow steering" interface, which is available in the Linux kernel
since version 3.12. The userspace interface is ibv_create_flow() - so
building this support in pcap adds a new dependency on libibverbs.
I added a new "rdmasniff" pcap module, which exposes RDMA devices under an
interface name equal to their libibverbs name. The module uses the RDMA
verbs interface to create a receive queue with a flow steering rule that
gets a copy of all packets, even offloaded packets generated by or consumed
by the hardware.
I'm definitely not an expert on pcap, so please point out all the places
where I misused APIs or something could be improved.