-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test/verifier: Temporarily reenable complexity tests on K8sDatapathVerifier #24335
Closed
gentoo-root
wants to merge
1
commit into
cilium:master
from
gentoo-root:verify-bpf-next-by-k8sverifier
Closed
test/verifier: Temporarily reenable complexity tests on K8sDatapathVerifier #24335
gentoo-root
wants to merge
1
commit into
cilium:master
from
gentoo-root:verify-bpf-next-by-k8sverifier
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
…rifier While the GitHub Actions workflow is broken for bpf-next kernels, temporarily reenable the old test. Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
pchaigno
approved these changes
Mar 13, 2023
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Maxim!
/test-1.26-net-next |
Unfortunately the K8sVerifier test failed for net-next with the same error we saw in ci-verifier, so we're facing a regression in our datapath and not in the kernel's behavior 😞 |
nbusseneau
approved these changes
Mar 14, 2023
borkmann
added a commit
that referenced
this pull request
Mar 16, 2023
Paul reported that verifier complexity tests were broken on bpf-next for a while due to the GH action testing the wrong datapath configurations for some months. This slipped in via #22754. Paul says: As part of the verifier tests, we first compile the datapath then attempt to load it. We do that for a set of kernels and datapath configurations. We compile e.g. bpf_lxc with: make -C bpf KERNEL=[kernel] MAX_LXC_OPTIONS=[config] That method of passing the two environment variables however doesn't seem to work in the CI for some reason. The datapath config used for the compilation ends up empty and the default one is used instead. That means the verifier test has been covering a subset of the expected datapath configurations. The 4.19 kernel is not affected because it was kept in the previous K8sVerifier ginkgo test. Specifically, bpf-next fails with an invalid size of register fill error from the verifier: [...] 578: R0=scalar() 578: (79) r6 = *(u64 *)(r10 -288) ; R6_w=ctx(off=0,imm=0) R10=fp0 579: (b4) w9 = -141 ; R9_w=4294967155 580: (c6) if w0 s< 0x0 goto pc+2320 ; R0=scalar(smax=9223372034707292159,umax=18446744071562067967,var_off=(0x0; 0xffffffff7fffffff),s32_min=0,u32_max=2147483647) ; if (no_neigh) { 581: (61) r1 = *(u32 *)(r10 -292) ; R1_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0 582: (16) if w1 == 0x0 goto pc+2292 ; R1_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) ; dmac = nh_params.nh_family == AF_INET ? 583: (18) r1 = 0xffff8c030542d400 ; R1_w=map_ptr(off=0,ks=4,vs=8,imm=0) 585: (61) r2 = *(u32 *)(r10 -304) invalid size of register fill processed 7447 insns (limit 1000000) max_states_per_insn 5 total_states 426 peak_states 420 mark_read 117 Maxim built a small reproducer: 1) Setup LVH: # mkdir -v /tmp/lvh # docker run -v /tmp/lvh:/mnt/images quay.io/lvh-images/complexity-test:bpf-next-main cp /data/images/complexity-test_bpf-next.qcow2.zst /mnt/images/ # zstd -d /tmp/lvh/complexity-test_bpf-next.qcow2.zst -o /tmp/datapath-bpf-complexity.qcow2 # rm -vf /tmp/lvh/complexity-test_bpf-next.qcow2.zst # rmdir -v /tmp/lvh # ./lvh run --image /tmp/datapath-bpf-complexity.qcow2 --host-mount "$YOUR_CILIUM_REPOSITORY_DIR" 2) Inside the VM: # cd /host # make -C bpf KERNEL=netnext MAX_HOST_OPTIONS='-DSKIP_DEBUG=1 -DENABLE_IPV4=1 -DENABLE_IPV6=1 -DENABLE_ROUTING=1 -DPOLICY_VERDICT_NOTIFY=1 -DALLOW_ICMP_FRAG_NEEDED=1 -DENABLE_IDENTITY_MARK=1 -DMONITOR_AGGREGATION=3 -DCT_REPORT_FLAGS=0x0002 -DENABLE_HOST_FIREWALL=1 -DENABLE_ICMP_RULE=1 -DENABLE_CUSTOM_CALLS=1 -DENABLE_SRV6=1 -DHAVE_LPM_TRIE_MAP_TYPE=1 -DHAVE_LRU_HASH_MAP_TYPE=1 -DENABLE_MASQUERADE=1 -DENABLE_SRC_RANGE_CHECK=1 -DENABLE_NODEPORT=1 -DENABLE_NODEPORT_ACCELERATION=1 -DENABLE_SESSION_AFFINITY=1 -DENABLE_DSR_ICMP_ERRORS=1 -DENABLE_DSR=1 -DENABLE_DSR_HYBRID=1 -DENABLE_IPV4_FRAGMENTS=1 -DENABLE_TPROXY=1 -DENABLE_HOST_ROUTING=1 -DENABLE_BANDWIDTH_MANAGER=1 -DETH_HLEN=0 -DENCAP_IFINDEX=1 -DTUNNEL_MODE=1 -DENABLE_EGRESS_GATEWAY=1 -DENABLE_VTEP=1 -DENABLE_SCTP=1' bpf_host.o # env TC_PROGS="" XDP_PROGS="" CG_PROGS="" TC_PROGS="bpf_host" ./test/bpf/verifier-test.sh The above verifier log with nh_params.nh_family == AF_INET test seems to make little sense. It looks like the *(u32 *)(r10 -304) is about fetching the nh_family, but the verifier thinks that the data on stack was a spilled pointer. We can actually fix this by using fib_params->l.family directly which would also simplify the code a bit. For the next hop, the fib lookup helper updates the field depending on whether the next hop is a v4 or v6 address. Related: #24335 Reported-by: Paul Chaignon <paul@cilium.io> Co-developed-by: Maxim Mikityanskiy <maxim@isovalent.com> Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
7 tasks
borkmann
added a commit
that referenced
this pull request
Mar 16, 2023
Paul reported that verifier complexity tests were broken on bpf-next for a while due to the GH action testing the wrong datapath configurations for some months. This slipped in via #22754. Paul says: As part of the verifier tests, we first compile the datapath then attempt to load it. We do that for a set of kernels and datapath configurations. We compile e.g. bpf_lxc with: make -C bpf KERNEL=[kernel] MAX_LXC_OPTIONS=[config] That method of passing the two environment variables however doesn't seem to work in the CI for some reason. The datapath config used for the compilation ends up empty and the default one is used instead. That means the verifier test has been covering a subset of the expected datapath configurations. The 4.19 kernel is not affected because it was kept in the previous K8sVerifier ginkgo test. Specifically, bpf-next fails with an invalid size of register fill error from the verifier: [...] 578: R0=scalar() 578: (79) r6 = *(u64 *)(r10 -288) ; R6_w=ctx(off=0,imm=0) R10=fp0 579: (b4) w9 = -141 ; R9_w=4294967155 580: (c6) if w0 s< 0x0 goto pc+2320 ; R0=scalar(smax=9223372034707292159,umax=18446744071562067967,var_off=(0x0; 0xffffffff7fffffff),s32_min=0,u32_max=2147483647) ; if (no_neigh) { 581: (61) r1 = *(u32 *)(r10 -292) ; R1_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0 582: (16) if w1 == 0x0 goto pc+2292 ; R1_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) ; dmac = nh_params.nh_family == AF_INET ? 583: (18) r1 = 0xffff8c030542d400 ; R1_w=map_ptr(off=0,ks=4,vs=8,imm=0) 585: (61) r2 = *(u32 *)(r10 -304) invalid size of register fill processed 7447 insns (limit 1000000) max_states_per_insn 5 total_states 426 peak_states 420 mark_read 117 Maxim built a small reproducer: 1) Setup LVH: # mkdir -v /tmp/lvh # docker run -v /tmp/lvh:/mnt/images quay.io/lvh-images/complexity-test:bpf-next-main cp /data/images/complexity-test_bpf-next.qcow2.zst /mnt/images/ # zstd -d /tmp/lvh/complexity-test_bpf-next.qcow2.zst -o /tmp/datapath-bpf-complexity.qcow2 # rm -vf /tmp/lvh/complexity-test_bpf-next.qcow2.zst # rmdir -v /tmp/lvh # ./lvh run --image /tmp/datapath-bpf-complexity.qcow2 --host-mount "$YOUR_CILIUM_REPOSITORY_DIR" 2) Inside the VM: # cd /host # make -C bpf KERNEL=netnext MAX_HOST_OPTIONS='-DSKIP_DEBUG=1 -DENABLE_IPV4=1 -DENABLE_IPV6=1 -DENABLE_ROUTING=1 -DPOLICY_VERDICT_NOTIFY=1 -DALLOW_ICMP_FRAG_NEEDED=1 -DENABLE_IDENTITY_MARK=1 -DMONITOR_AGGREGATION=3 -DCT_REPORT_FLAGS=0x0002 -DENABLE_HOST_FIREWALL=1 -DENABLE_ICMP_RULE=1 -DENABLE_CUSTOM_CALLS=1 -DENABLE_SRV6=1 -DHAVE_LPM_TRIE_MAP_TYPE=1 -DHAVE_LRU_HASH_MAP_TYPE=1 -DENABLE_MASQUERADE=1 -DENABLE_SRC_RANGE_CHECK=1 -DENABLE_NODEPORT=1 -DENABLE_NODEPORT_ACCELERATION=1 -DENABLE_SESSION_AFFINITY=1 -DENABLE_DSR_ICMP_ERRORS=1 -DENABLE_DSR=1 -DENABLE_DSR_HYBRID=1 -DENABLE_IPV4_FRAGMENTS=1 -DENABLE_TPROXY=1 -DENABLE_HOST_ROUTING=1 -DENABLE_BANDWIDTH_MANAGER=1 -DETH_HLEN=0 -DENCAP_IFINDEX=1 -DTUNNEL_MODE=1 -DENABLE_EGRESS_GATEWAY=1 -DENABLE_VTEP=1 -DENABLE_SCTP=1' bpf_host.o # env TC_PROGS="" XDP_PROGS="" CG_PROGS="" TC_PROGS="bpf_host" ./test/bpf/verifier-test.sh The above verifier log with nh_params.nh_family == AF_INET test seems to make little sense. It looks like the *(u32 *)(r10 -304) is about fetching the nh_family, but the verifier thinks that the data on stack was a spilled pointer. We can actually fix this by using fib_params->l.family directly which would also simplify the code a bit. For the next hop, the fib lookup helper updates the field depending on whether the next hop is a v4 or v6 address. Related: #24335 Reported-by: Paul Chaignon <paul@cilium.io> Co-developed-by: Maxim Mikityanskiy <maxim@isovalent.com> Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
While the GitHub Actions workflow is broken for bpf-next kernels, temporarily reenable the old test.