CI: verifier-test.sh always skips cgroup programs #17125

pchaigno · 2021-08-10T15:11:34Z

Commit 696012f changed the section names for the cgroup BPF programs (e.g., from from-sock4 to connect4), but didn't change the corresponding code in verifier-test.sh:

cilium/test/bpf/verifier-test.sh

Lines 125 to 144 in c6d25fc

    
           if check_macro CGROUP_INET4_CONNECT; then 
        
           	prog_types["from-sock4"]="sockaddr" 
        
           	attach_types["from-sock4"]="connect4" 
        
           fi 
        
           if check_macro CGROUP_INET6_CONNECT; then 
        
           	prog_types["from-sock6"]="sockaddr" 
        
           	attach_types["from-sock6"]="connect6" 
        
           fi 
        
           if check_macro UDP4_RECVMSG; then 
        
           	prog_types["snd-sock4"]="sockaddr" 
        
           	prog_types["rcv-sock4"]="sockaddr" 
        
           	attach_types["snd-sock4"]="sendmsg4" 
        
           	attach_types["rcv-sock4"]="recvmsg4" 
        
           fi 
        
           if check_macro UDP6_RECVMSG; then 
        
           	prog_types["snd-sock6"]="sockaddr" 
        
           	prog_types["rcv-sock6"]="sockaddr" 
        
           	attach_types["snd-sock6"]="sendmsg6" 
        
           	attach_types["rcv-sock6"]="recvmsg6" 
        
           fi

As a result we always skip those programs in the K8sVerifier test. Maybe we should also find a way to error out when skipping something that should be tested.

/cc @borkmann

The text was updated successfully, but these errors were encountered:

borkmann · 2021-08-10T15:17:15Z

Sigh, good catch, I was not aware we had these; we should definitely find a way to error.

joestringer · 2021-08-10T19:36:19Z

I think the reason we don't error out right now is that depending on kernel support, it's expected that we skip the CG tests (why error out if you can't run those tests?). That's the first case in the load_cg() conditional checks:

cilium/test/bpf/verifier-test.sh

Lines 193 to 195 in c6d25fc

    
           if [ "${prog_types[$section]}" == "" ]; then 
        
           	echo "=> Skipping ${p}.c:$section" 
        
           	continue

pchaigno · 2021-08-10T20:40:44Z

Is there a way to error out other than passing the KERNEL version in the script? I was hoping to avoid that, but I don't really see another solution :-(

joestringer · 2021-08-10T21:06:03Z

Maybe bpftool feature probe | grep ...CGROUP... for the most recent cgroup BPF feature we rely on?

pchaigno · 2021-08-11T10:56:29Z

What do you mean? We already rely on bpftool feature probe:

cilium/test/bpf/verifier-test.sh

Line 125 in c6d25fc

if check_macro CGROUP_INET4_CONNECT; then

joestringer · 2021-08-11T17:44:44Z

Assuming that UDP6_RECVMSG is the most recent cgroups bpf hook we use in the script, I was thinking something like this:

$ git di
diff --git a/test/bpf/verifier-test.sh b/test/bpf/verifier-test.sh
index 582e2229c9d5..ae07e846ecbd 100755
--- a/test/bpf/verifier-test.sh
+++ b/test/bpf/verifier-test.sh
@@ -191,6 +191,11 @@ function load_cg {
                ELF_SECTIONS="$(readelf -S $DIR/${p}.o)"
                for section in $(get_section ${p}.c); do
                        if [ "${prog_types[$section]}" == "" ]; then
+                               if check_macro UDP6_RECVMSG; then
+                                       echo "=> CGroups BPF programs should be supported, but this script does not understand how to load ${p} (section $section)." 2>&1
+                                       echo "   Please fix me 🥺." 2>&1
+                                       exit 1
+                               fi
                                echo "=> Skipping ${p}.c:$section"
                                continue
                        elif [ "${prog_types[$section]}" == "SKIP" ]; then

pchaigno · 2021-08-11T19:20:59Z

Ah, I see. That's a bit specific to our current issue, but it's probably enough 👍

joestringer · 2021-08-11T21:28:10Z

Yeah true. Maybe something more reliable would be to verify that there are no "Skipping" lines in the output when running this test on net-next...?

pchaigno · 2021-08-11T21:29:30Z

Ah, of course! 🤦 We can easily do that from K8sVerifier, where there's already per-version code.

aditighag · 2021-08-11T21:43:56Z

I'm just curious - why don't we run bpftool cgroup show ... to verify that all the expected programs were loaded on net-next?

joestringer · 2021-09-08T18:06:55Z

@aditighag I think the question is how we determine the "all the expected programs" from your question above. If we need to manually maintain a list of programs somewhere (eg to check whether they are present in bpftool cgroup show), then that list can still get out of sync compared to other parts of the codebase.

From that perspective, the right approach to me is one step more abstract: Assemble a list of all possible programs from the tree programmatically, and ensure that every one of them is loaded. If any of the programs in the tree are not loaded, then the script needs fixing and it should quit out. The script should be doing 90% of this already, it just prints "Skipping ..." if it cannot load a program. The missing piece is that we don't force the test to fail in that case.

pchaigno added kind/bug/CI This is a bug in the testing code. area/CI Continuous Integration testing issue or flake ci/flake This is a known failure that occurs in the tree. Please investigate me! labels Aug 10, 2021

pchaigno mentioned this issue Aug 10, 2021

datapath: Use iproute2+libbpf #16727

Merged

joestringer removed the ci/flake This is a known failure that occurs in the tree. Please investigate me! label Sep 8, 2021

This comment was marked as outdated.

Sign in to view

github-actions bot added the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label Jul 9, 2022

pchaigno added pinned These issues are not marked stale by our issue bot. and removed stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. labels Jul 9, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CI: verifier-test.sh always skips cgroup programs #17125

CI: verifier-test.sh always skips cgroup programs #17125

pchaigno commented Aug 10, 2021

borkmann commented Aug 10, 2021

joestringer commented Aug 10, 2021 •

edited

pchaigno commented Aug 10, 2021

joestringer commented Aug 10, 2021

pchaigno commented Aug 11, 2021

joestringer commented Aug 11, 2021

pchaigno commented Aug 11, 2021

joestringer commented Aug 11, 2021

pchaigno commented Aug 11, 2021

aditighag commented Aug 11, 2021

joestringer commented Sep 8, 2021

This comment was marked as outdated.

CI: verifier-test.sh always skips cgroup programs #17125

CI: verifier-test.sh always skips cgroup programs #17125

Comments

pchaigno commented Aug 10, 2021

borkmann commented Aug 10, 2021

joestringer commented Aug 10, 2021 • edited

pchaigno commented Aug 10, 2021

joestringer commented Aug 10, 2021

pchaigno commented Aug 11, 2021

joestringer commented Aug 11, 2021

pchaigno commented Aug 11, 2021

joestringer commented Aug 11, 2021

pchaigno commented Aug 11, 2021

aditighag commented Aug 11, 2021

joestringer commented Sep 8, 2021

This comment was marked as outdated.

joestringer commented Aug 10, 2021 •

edited