-
Notifications
You must be signed in to change notification settings - Fork 356
cgroup, systemd: use BPFProgram=device if supported #1795
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This is a proof of concept to check and/or play with. It appears to work for me (runc's dev.bats tests work). CI may fail. TODO: - support device updates; - add tests; - add documentation. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Reviewer's GuideThis PR refactors device cgroup handling to build and load BPF programs when the BPF filesystem is available, and integrates BPFProgram support into the systemd path by pinning and appending a BPFProgram property in lieu of explicit device rules. Sequence diagram for systemd cgroup device BPF program integrationsequenceDiagram
participant systemd_integration as Systemd Integration
participant cgroup_resources as cgroup-resources
participant ebpf as ebpf
participant bpf_fs as BPF Filesystem
systemd_integration->>cgroup_resources: create_dev_bpf(devs, devs_len, err)
cgroup_resources->>ebpf: bpf_program_new/init/append/complete
cgroup_resources-->>systemd_integration: bpf_program
systemd_integration->>ebpf: libcrun_ebpf_load(program, -1, path, err)
ebpf->>bpf_fs: Pin BPF program to /sys/fs/bpf/crun/<id>
ebpf-->>systemd_integration: Success/Failure
systemd_integration->>systemd_integration: sd_bus_message_append("BPFProgram", "device", path)
Note right of systemd_integration: If BPFProgram is set, skip explicit device rules
Class diagram for BPF program integration and device cgroup handlingclassDiagram
class bpf_program {
+bpf_program_new(size_t size)
+bpf_program_init_dev(program, err)
+bpf_program_append_dev(program, access, type, major, minor, allow, err)
+bpf_program_complete_dev(program, err)
}
class cgroup_resources {
+create_dev_bpf(devs, devs_len, err)
+write_devices_resources_v2_internal(dirfd, devs, devs_len, err)
+write_devices_resources_v2(dirfd, devs, devs_len, err)
+can_skip_devices(can_skip, devs, devs_len, err)
}
class ebpf {
+libcrun_ebpf_load(program, dirfd, pin, err)
+has_bpf_fs()
}
bpf_program <.. cgroup_resources : uses
ebpf <.. cgroup_resources : uses
ebpf <.. cgroup_resources : uses
cgroup_resources <.. systemd_integration : used by
class systemd_integration {
+append_resources(m, is_update, state_dir, resources, cgroup_mode, id, err)
+enter_systemd_cgroup_scope(resources, cgroup_mode, annotations, state_root, scope, slice, pid, id, can_retry, err)
+libcrun_cgroup_enter_systemd(args, ...)
+libcrun_update_resources_systemd(cgroup_status, ...)
}
Class diagram for new and modified functions in ebpf and cgroup-resourcesclassDiagram
class ebpf {
+libcrun_ebpf_load(program, dirfd, pin, err)
+has_bpf_fs()
}
class cgroup_resources {
+create_dev_bpf(devs, devs_len, err)
+can_skip_devices(can_skip, devs, devs_len, err)
}
File-Level Changes
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
Ephemeral COPR build failed. @containers/packit-build please check. |
TMT tests failed. @containers/packit-build please check. |
|
||
mkdir(CRUN_BPF_DIR, 0700); // Best effort. | ||
|
||
ret = append_paths (&path, err, CRUN_BPF_DIR, id, NULL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it really stinks that systemd requires us to install a program instead of just reading a file, but definitely an improvement compared to the current status.
What we have in place is enough to make sure it is deleted when the container is deleted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it really stinks that systemd requires us to install a program instead of just reading a file
Yep.
What we have in place is enough to make sure it is deleted when the container is deleted?
We don't have anything yet, but this is easy to add, just haven't done it yet. In general this is still very raw code.
I was also thinking about using a shared program for a common case of default rules, but maybe it's too much hassle without any sizable benefit.
Another concern is using container ID for the program file name. Container ID is only unique in a given state ( WDYT @giuseppe ? |
what do you think if we use some hashing of the runroot? Could be some basic stuff, like adding all the chars to a uint64_t, so we don't have to worry for paths that are too long? @keszybz do you think there is any hope systemd would accept a path for |
This is a proof of concept to take a look at and/or play with.
It appears to work for me (runc's dev.bats tests work) but was not tested much yet. CI may fail.
TODO:
Related to #1765.
Summary by Sourcery
Use eBPF programs to apply device cgroup v2 policies and inject them into systemd via the BPFProgram property when supported, while preserving compatibility with legacy device permissions.
New Features:
Enhancements: