Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ci: move 4.19 complexity tests to tests-datapath-verifier GHA workflow #24517

Merged
merged 4 commits into from Apr 4, 2023

Conversation

tklauser
Copy link
Member

@tklauser tklauser commented Mar 22, 2023

It seems the verifier issues with 4.19 are resolved when running complexity tests using LVH on that version. Let's cover it in the the GitHub actions workflow instead of Ginkgo tests.

Successful run with test commit: https://github.com/cilium/cilium/actions/runs/4489169184/jobs/7894672779

@tklauser tklauser added sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. release-note/ci This PR makes changes to the CI. ci/hyperjump labels Mar 22, 2023
@tklauser tklauser force-pushed the pr/tklauser/complexity-4.19-gha branch 2 times, most recently from fff56b8 to 221957b Compare March 22, 2023 12:24
@tklauser tklauser marked this pull request as ready for review March 22, 2023 12:25
@tklauser tklauser requested review from a team as code owners March 22, 2023 12:25
aanm added a commit that referenced this pull request Mar 22, 2023
It's being removed in #24517

Signed-off-by: André Martins <andre@cilium.io>
@tklauser tklauser force-pushed the pr/tklauser/complexity-4.19-gha branch from 221957b to 9265243 Compare March 23, 2023 05:59
Copy link
Member

@qmonnet qmonnet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

@ti-mo ti-mo added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Mar 24, 2023
@ti-mo
Copy link
Contributor

ti-mo commented Mar 24, 2023

/test

@pchaigno pchaigno removed the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Mar 24, 2023
Copy link
Member

@pchaigno pchaigno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to be a bit of a pain, but this is highly suspicious to me. AFAIK, the complexity issue we were hitting was #17647 (comment), which never got fixed on our side and was happening for more than a year.

Did it get fixed in the kernel without us knowing about it? Is there something else that fixed it very recently on our side? Or is this test actually broken again and not testing what we think?

I'll try to check if something recent on the kernel side could explain this.

@ti-mo
Copy link
Contributor

ti-mo commented Mar 24, 2023

Sorry to be a bit of a pain, but this is highly suspicious to me. AFAIK, the complexity issue we were hitting was #17647 (comment), which never got fixed on our side and was happening for more than a year.

Did it get fixed in the kernel without us knowing about it? Is there something else that fixed it very recently on our side? Or is this test actually broken again and not testing what we think?

I'll try to check if something recent on the kernel side could explain this.

No problem at all, we need to know this stuff for sure. Take a look at this: https://github.com/cilium/cilium/actions/runs/4509588303/jobs/7939538892?pr=24538. This has much more detailed output and verifier stats for each tail call. It's based on the changes in this PR (adding 4.19 coverage) and loading all ELFs using Go.

I'll add a uname at the start of the test run so we know exactly what we're dealing with.

@ti-mo ti-mo requested a review from a team as a code owner April 3, 2023 13:46
Copy link
Member

@qmonnet qmonnet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Putting my docs-structure hat on, and approving again

@tklauser tklauser force-pushed the pr/tklauser/complexity-4.19-gha branch from 0502cce to 359d398 Compare April 4, 2023 08:04
@lmb lmb self-assigned this Apr 4, 2023
@lmb lmb force-pushed the pr/tklauser/complexity-4.19-gha branch 5 times, most recently from 637b583 to 989f6e2 Compare April 4, 2023 14:02
The verifier is prone to produce several megabytes of log output
when things break. Writing this into the test output makes it difficult
to discern which tests actually fail.

Only dump the last 10 lines of verifier output to stderr and write
the full log to a file instead.

Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
@lmb lmb force-pushed the pr/tklauser/complexity-4.19-gha branch 3 times, most recently from 289be27 to cbac0c4 Compare April 4, 2023 15:34
@lmb
Copy link
Contributor

lmb commented Apr 4, 2023

/test-ci-verifier

@lmb lmb force-pushed the pr/tklauser/complexity-4.19-gha branch from cbac0c4 to ef7e456 Compare April 4, 2023 16:03
lmb and others added 3 commits April 4, 2023 17:16
The bpf_lxc complexity test fails on current 4.19 kernels due to a
backported kernel fix [0] that increases complexity across the board.
Our CI has so far not picked this up since the Jenkins VMs run a kernel
without the backport.

Reduce the tested feature set on affected 4.19 configurations so that
we can use the new LVH / GHA based CI infrastructure.

0: #17647 (comment)

Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
It seems the verifier issues with 4.19 are resolved when running
complexity tests using LVH on that version. Let's cover it in the GitHub
actions workflow instead of Ginkgo tests.

Signed-off-by: Tobias Klauser <tobias@cilium.io>
With all verifier-test.sh logic ported to a Go-driven test in package
test/verifier, this script is no longer being used.

The functionality of check-complexity.sh is provided out-of-the-box
by the Go verifier tests. Running `go test -v` displays verifier stats
for each tail call in Cilium.

Signed-off-by: Timo Beckers <timo@isovalent.com>
@lmb lmb force-pushed the pr/tklauser/complexity-4.19-gha branch from ef7e456 to 4d747e2 Compare April 4, 2023 16:17
@lmb
Copy link
Contributor

lmb commented Apr 4, 2023

This is ✔️ on 4.19 now, with only cluster aware addressing disabled: https://github.com/cilium/cilium/actions/runs/4609988154/jobs/8147889395

@lmb lmb requested a review from pchaigno April 4, 2023 16:25
Copy link
Member

@pchaigno pchaigno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (assuming the test commit is removed of course).

@pchaigno pchaigno added the kind/complexity-issue Relates to BPF complexity or program size issues label Apr 4, 2023
@lmb lmb force-pushed the pr/tklauser/complexity-4.19-gha branch from 4d747e2 to 9c7c3b2 Compare April 4, 2023 16:37
@lmb lmb removed the dont-merge/blocked Another PR must be merged before this one. label Apr 4, 2023
@tklauser tklauser added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Apr 4, 2023
Copy link
Contributor

@zacharysarah zacharysarah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM from a docs perspective. 👍🏻

@borkmann borkmann merged commit 60dfd1a into master Apr 4, 2023
43 checks passed
@borkmann borkmann deleted the pr/tklauser/complexity-4.19-gha branch April 4, 2023 21:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci/hyperjump kind/complexity-issue Relates to BPF complexity or program size issues ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/ci This PR makes changes to the CI. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet