-
Notifications
You must be signed in to change notification settings - Fork 238
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Netlink sockets #3198
Netlink sockets #3198
Conversation
Hi @ppopth! Wow, thanks for submitting the PR! It might take us a bit to review this, but before we do I think we will want you to rebase against the main branch so that (1) you can take care of fixing the conflicts, and (2) then we can run the shadow CI tests against it to help us during review. Once you rebase, fix the conflicts, and have the tests passing locally, go ahead and force-push the update onto the same branch which should auto-update this PR. Then we'll find some time to take a look more closely at it. Thanks again! |
I rebased it to the main branch already. |
After seeing the lint errors, I decided to change many things. It's better to close this and I will open it again soon. |
Reopened this PR, since I have already changed what I think I need to. I think the code is now far better. It's now ready for review. Thanks for your patience. The commits before 26927f6 will be untouched. |
f0da39e
to
26927f6
Compare
Hi @ppopth ! Thanks for the updates. I see that you added tests which is awesome! We will do a more thorough review, but for now two quick points about the tests:
This PR is already big enough, so I wouldn't change anything here. But would you be willing to work on the above two points in follow-up PRs to really help solidify NetLink as a more roundly supported and better-tested socket type? |
Not actually true. What I thought is that it's good to write the test in C because all of the libc and POSIX syscall APIs are documented and fully supported in C, and all the examples in the man pages are in C as well. If the tests are implemented in Rust, it's quite hard to see how the Rust code actually calls the syscall. In addition, sometimes we will probably find it hard to find a proper crate to do the syscall. While, in C, you can just use the POSIX syscall APIs provided with Linux. I think it makes sense to implement everything that is supposed to be in Shadow and the shim in Rust, but I think that's not necessary for the tests for the reason described above. Do you disagree with this? I'm quite curious why the tests should be all in Rust instead.
Yeah, it's a good point. I agree with that. I didn't know and didn't notice that previously. I will look at the code in |
In places where we're specifically trying to invoke a specific syscall or libc function under test, we usually do it through the libc crate for this reason. We still take advantage of Rust's higher-level APIs for other things, like setup and teardown of state, threads, etc |
Just building a bit on what @sporksmith wrote: Rust's libc crate provides the sort of "direct syscall access" that is available to C code. So if you are writing a test that is designed to test say But there are many other places in a test where you just want some functionality and you don't necessarily care which syscall is being invoked, because that specific test case isn't trying to test the ancillary functionality. For that we found it much nicer (easier to write the code and reason about its correctness) when using the higher level Rust APIs. We also found it much easier to write the more rigorous tests that I described above (let's call them fuzz tests) using Rust simply because the language has more ready-to-use constructs and crates that help make it easier. We already have some nice helper functions in rust that can make the fuzz-type testing easier. I wouldn't want us to have to write and maintain those helpers for every language. |
Oh also, thanks for starting this conversation about testing! ❤️ You are helping me realize that we don't have enough documentation about our expectations around testing, where to look for examples of good fuzz testing practices, etc. I will work on updating our documentation to be clear about our expectations to help clear this up for future work too :) |
I'll also add that we usually only test the libc syscall wrappers (using the libc library as mentioned above), but sometimes it can be useful to test the actual kernel syscalls themselves. For example Go doesn't typically use the libc wrappers, so sometimes will make the syscalls in a different way than C code does with the libc wrappers. In this case you can use the linux-api library to test the actual syscall without any modifications that glibc makes. The linux-api library only implements a handful of syscalls, and we've been adding more as we need them. We don't test the actual kernel syscalls very often (although maybe we should), so I would just test the libc wrapper unless the libc API is significantly different from the kernel API. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this looks great!
I'm going to mark as "request changes" for now and then I'll re-review the updates.
I did not repeat my prior testing-related comments here, because I assume we'll handle Rust testing and fuzz testing in separate PRs.
I'm hoping we can get @sporksmith's thoughts to help clear up some of my uncertainty around my linux-api
-related comments to make sure we are doing the right thing by preferring to add parts to linux-api
rather than adding a dependency on the linux-raw-sys
crate.
Does this mean that I should remove the tests for capset/capget? since they are only invoked by In addition, it also seems hard for me to find a library in Rust or Go that invokes those syscalls directly. |
@robgjansen @sporksmith @stevenengler All the comments have been addressed already. I haven't resolved the conflicts yet because I don't want to lose the history. I think I will do it after you review the changes. Thanks. |
This commit allows us to peek the packets stored inside the buffer without removing them from the buffer.
Since we implemented peek_packet in SharedBuf already, we can now support MSG_PEEK in the recvmsg syscall for Netlink sockets.
Previously we handle only RTM_GETADDR, this commit now enables us to handle RTM_GETLINK. This commit implements the RTM_GETLINK handling part in a method called handle_ifinfomsg. In handle_ifinfomsg, it works as follows: - Check that ifi_family of the message is either AF_UNSPEC or AF_INET because we support only IPv4 addresses - Check that the rest of the fields (except ifi_change) are zeroes because they are unsupported - Send the message containing all the interfaces with the following attributes: IFLA_IFNAME, IFLA_TXQLEN, and IFLA_MTU We can also send the attributes IFLA_ADDRESS and IFLA_BROADCAST for the MAC address, but we mark it as a todo instead. The ifi_flags of the loopback interface is IFF_UP, IFF_LOOPBACK, and IFF_RUNNING. The ifi_flags of the ethernet interface is IFF_UP, IFF_BROADCAST, IFF_RUNNING, and IFF_MULTICAST. All the flags listed above are from what I observe in the `ip addr` command.
Previously we have only the is_bound field in InitialState which is used to indicate whether the socket is already bound or not. Now we want to implement getsockname which requires us to return the bound address, so we need to change from the boolean is_bound to the concrete bound_addr field.
In capget, we will always return the empty capability bitfield because we don't allow any capability in the plugin. In capset, we will allow to set with the empty capability bitfield.
This commits adds both the linux tests and the shadow tests for both capget and capset. We assign them as extra tests because they require the local machine to install libcap-dev.
Currently we handle only (SOL_SOCKET, SO_SNDBUF) and (SOL_SOCKET, SO_RCVBUF). In SO_SNDBUF, we set the send buffer limit, while, in SO_RCVBUF, we do nothing.
Since `ip addr` uses a netlink socket to get the information of the interfaces and addresses, we use it to test our Netlink implementation. We check that the output of the `ip addr` command matches the ip-addr.stdout file. We mark the test as an extra because it requirs the iproute2 package.
The user can set the send buffer limit by setting SO_SNDBUF using setsockopt. We test this for Netlink sockets by setting this option and send a lot of messages to the socket.
This commit installs libcap and ip to the CI because capget,capset require libcap and ip is used to test the Netlink sockets. In Debian/Ubuntu, the corresponding packages are libcap-dev and iproute2. In Fedora, the corresponding packages are libcap-devel and iproute.
Replace some log::warn! with warn_once_then_debug!
Define everything that is neeeded from linux-raw-sys by Netlink sockets in linux-api instead.
Don't return an error if the socket is bound to the non-zero groups, since it's not fatal and it's better allow it to do so.
If it tries to bind to Netlink groups other than RTMGRP_IPV4_IFADDR or RTMGRP_IPV6_IFADDR, return an error.
It turned out that it's not as hard as I thought. I squashed all the fix-up commits already. See this branch https://github.com/ppopth/shadow/tree/netlink2 Please let me know when you are ready to see it in this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the hard work and for sticking with us during the long review process!
I think you can go ahead and rebase to squash the fixups now. If you want to go further and squash into just 3 commits as suggested by @stevenengler, I think that would make sense. The code from some of the earlier commits was removed in later commits, so squashing down to 3 commits would keep things cleaner. After that, force push the cleaned branch and we'll work on merging. Thanks! |
Following #2980
This PR implements Netlink sockets. Please look at netlink(7) and rtnetlink(7) before reviewing the PR so that you can understand it better.
I use the
ip addr
command as a baseline. This PR implements anything that is required by that command:capget
,catpset
,setsockopt
, andgetsockname
.How it works
When the user calls
sendmsg
we check only the metadata, not the content of the message, like the destination address, the supported sendmsg flags. If it passes all the checks, we simply put the message into the buffer.When the user calls
recvmsg
we get the message from the buffer, process the message, and return the response. Currently we support onlyRTM_GETLINK
andRTM_GETADDR
. For each message, we return a bunch ofRTM_NEWLINK
s andRTM_NEWADDR
s respectively.Commit messages
Please also look at the commit messages. I think it explains things in detail.
commit 26927f6
Author: Suphanat Chunhapanya haxx.pop@gmail.com
Date: Wed Oct 4 21:22:45 2023 +0700
commit b79a26c
Author: Suphanat Chunhapanya haxx.pop@gmail.com
Date: Mon Oct 2 14:41:30 2023 +0700
commit f4c33f7
Author: Suphanat Chunhapanya haxx.pop@gmail.com
Date: Sun Oct 1 23:11:19 2023 +0700
commit 9778bd3
Author: Suphanat Chunhapanya haxx.pop@gmail.com
Date: Fri Sep 22 13:23:04 2023 +0700
commit 0cd5678
Author: Suphanat Chunhapanya haxx.pop@gmail.com
Date: Sat Sep 30 17:38:50 2023 +0700
commit ec4d843
Author: Suphanat Chunhapanya haxx.pop@gmail.com
Date: Wed Sep 20 14:59:19 2023 +0700
commit 25223ea
Author: Suphanat Chunhapanya haxx.pop@gmail.com
Date: Wed Sep 20 16:59:05 2023 +0700
commit 65ea91a
Author: Suphanat Chunhapanya haxx.pop@gmail.com
Date: Fri Sep 22 12:55:43 2023 +0700
commit 15a2073
Author: Suphanat Chunhapanya haxx.pop@gmail.com
Date: Tue Sep 26 00:01:25 2023 +0700
commit 7c8b20a
Author: Suphanat Chunhapanya haxx.pop@gmail.com
Date: Mon Sep 25 22:23:09 2023 +0700
commit 9d84b69
Author: Suphanat Chunhapanya haxx.pop@gmail.com
Date: Fri Sep 22 20:03:38 2023 +0700
commit 8db9d96
Author: Suphanat Chunhapanya haxx.pop@gmail.com
Date: Tue Sep 19 11:11:16 2023 +0700
commit 01385f3
Author: Suphanat Chunhapanya haxx.pop@gmail.com
Date: Sat Sep 16 23:43:10 2023 +0700
commit 70068ae
Author: Suphanat Chunhapanya haxx.pop@gmail.com
Date: Thu Sep 14 14:11:31 2023 +0700
commit 5a40f1c
Author: Suphanat Chunhapanya haxx.pop@gmail.com
Date: Mon Aug 14 20:03:57 2023 +0700
commit 7464ab5
Author: Suphanat Chunhapanya haxx.pop@gmail.com
Date: Sat Aug 12 04:22:35 2023 +0700
commit 7de08d3
Author: Suphanat Chunhapanya haxx.pop@gmail.com
Date: Fri Aug 11 13:24:02 2023 +0700
commit 4825e9a
Author: Suphanat Chunhapanya haxx.pop@gmail.com
Date: Thu Aug 10 14:37:51 2023 +0700