CI: Verifier tests: Keep generated object files and logs on test failure #25862

qmonnet · 2023-06-02T13:09:26Z

The CI workflow for testing Cilium's programs against the verifier is supposed to upload its log files and the generated files on failures. However, we lose some of this data, because:

- Multiple jobs (for different kernel versions) upload their log and object files to the same artifact, overwriting previous uploads in the process, so that we only keep the files uploaded by the last job to complete. We can address that by using different artifact names for the different jobs.
- The test itself runs `make -C bpf/ clean` before testing the programs of each source file, thus removing the object files previously generated. Instead, we can clean up the directory just once before running the tests, not between two tests.
- Finally, each source file is compiled multiple times with different sets of options, resulting in only the last version of this object file being kept. The solution implemented here consists in renaming these object files (on test failure) to make sure we keep them around.

qmonnet · 2023-06-02T13:14:37Z

Tested on another PR, for example on this run.

tklauser

Nice, thanks! One nit inline in case a respin is needed.

test/verifier/verifier_test.go

ti-mo · 2023-06-02T15:07:03Z

Let's give all .o's a unique name instead and include it as part of the name of the subtest. I think renaming is confusing, and it'd be nice to have all objs in the zip for debugging.

qmonnet · 2023-06-02T15:30:51Z

@ti-mo: I don't mind adding all .o's to the artifacts. But that still means renaming them, right? Only we rename them all, and earlier, instead of doing it only on failure.

We could try to generate them with the right name from the beginning, but given we're executing a call to make to compile them, it will be quite involved if we have to update the Makefile to support that?

test/verifier/verifier_test.go

viktor-kurchenko

LGTM.

julianwiedmann

lgtm

qmonnet · 2023-06-05T14:49:23Z

Updated to rename all object files regardless of exit status. I checked (locally) that all object files are preserved. @ti-mo does this correspond to what you had in mind?

ti-mo · 2023-06-06T09:15:12Z

> Updated to rename all object files regardless of exit status. I checked (locally) that all object files are preserved. @ti-mo does this correspond to what you had in mind?

Yes, thanks! The make invocation is indeed what deterred me from doing the renaming at the time. We can start using Cilium's Go clang pipeline here at some point for more flexibility. In the meantime, this will do!

ti-mo · 2023-06-06T09:16:17Z

Not sure why ci-verifier is not starting since the last push was 18 hours ago.

ti-mo · 2023-06-06T09:18:19Z

/test

qmonnet · 2023-06-06T13:14:51Z

/test-runtime

qmonnet · 2023-06-06T13:18:58Z

/test

Job 'Cilium-PR-K8s-1.26-kernel-net-next' failed:

Test Name

K8sDatapathConfig Iptables Skip conntrack for pod traffic

Failure Output

FAIL: Found 2 k8s-app=cilium logs matching list of errors that must be investigated:

Jenkins URL: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.26-kernel-net-next/505/

If it is a flake and a GitHub issue doesn't already exist to track it, comment /mlh new-flake Cilium-PR-K8s-1.26-kernel-net-next so I can create one.

Then please upload the Jenkins artifacts to that issue.

The verifier tests include a run of "make -C bpf/ clean" prior to testing the programs in each BPF source file. While this sounds like a sane practice, this means that at the end of the run, we're left with only one object file under bpf/, the one used by the last tests (namely: bpf_sock.o). While the object files are trivial to rebuild locally, this prevents the CI workflow from finding and uploading the relevant object files for instances where the verifier's logs are not enough to debug. Let's instead run a single "make -C bpf/ clean" on startup.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
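The control flow this commit describes can be sketched in Go as follows. This is a minimal illustration under stated assumptions, not the code from test/verifier/verifier_test.go: `runAll` and its injected `clean` and `test` callbacks are hypothetical names, and the source names other than bpf_sock are placeholders. In the real test, the clean step shells out to "make -C bpf/ clean".

```go
package main

import "fmt"

// runAll calls clean exactly once, then calls test for every source file.
// Both callbacks are injected so the sketch runs without a bpf/ tree; in the
// real test, clean would shell out to "make -C bpf/ clean" (assumption: this
// helper and its signature are hypothetical).
func runAll(sources []string, clean func() error, test func(src string) error) error {
	if err := clean(); err != nil {
		return fmt.Errorf("initial clean: %w", err)
	}
	for _, src := range sources {
		// No clean inside the loop: objects built for earlier sources
		// stay on disk until the end of the run.
		if err := test(src); err != nil {
			return fmt.Errorf("testing %s: %w", src, err)
		}
	}
	return nil
}

func main() {
	cleans := 0
	var built []string
	err := runAll(
		[]string{"bpf_lxc", "bpf_host", "bpf_sock"}, // placeholder source names
		func() error { cleans++; return nil },
		func(src string) error { built = append(built, src+".o"); return nil },
	)
	fmt.Println(cleans, built, err)
}
```

Injecting the callbacks keeps the sketch runnable anywhere; the point is only that the clean step sits outside the per-source loop.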

For a given source file, we build the object multiple times, each time with a different set of build options. Because each build overwrites the previous one, only the last object file is left when a test fails. To address this, we rename the object files to include the iteration number, so that they're still present when we upload to the CI artifact.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>

In the tests-datapath-verifier workflow, all jobs generated from the matrix (for running the tests with different kernel versions) upload files to the same artifact on failure. This means that a job may overwrite the upload from a previous job, and in the end we only get the uploads from the last job to complete. If several jobs fail with different errors, then we lose useful debugging information. Let's have the jobs upload their files to separate artifacts, named after the kernel version in use.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
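The overwrite problem can be illustrated with a toy model in Go. This is not the GitHub Actions API and not the workflow's configuration: `upload`, `artifactName`, the "datapath-verifier" prefix, and the kernel version strings are all hypothetical, chosen only to show why a shared name loses data and per-kernel names do not.

```go
package main

import "fmt"

// upload models an artifact store keyed by name: a second upload under the
// same name replaces the first. Toy model only, not the GitHub Actions API.
func upload(store map[string]string, name, content string) {
	store[name] = content
}

// artifactName builds a per-job name from the kernel version. The prefix is
// a hypothetical choice for this sketch.
func artifactName(kernel string) string {
	return "datapath-verifier_" + kernel
}

func main() {
	kernels := []string{"4.19", "5.4", "bpf-next"} // placeholder matrix values

	shared := map[string]string{}
	separate := map[string]string{}
	for _, k := range kernels {
		upload(shared, "datapath-verifier", "logs from "+k) // same name for every job
		upload(separate, artifactName(k), "logs from "+k)   // one name per job
	}

	// The shared store holds a single entry; the separate store holds one per kernel.
	fmt.Println(len(shared), len(separate))
}
```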

qmonnet · 2023-06-07T08:32:46Z

/test

Job 'Cilium-PR-K8s-1.26-kernel-net-next' hit: #25958 (89.87% similarity)

qmonnet · 2023-06-07T12:14:11Z

Runtime tests on Jenkins and GitHub are currently broken due to #25968, and net-next on Jenkins hit #25958 - both unrelated to the current PR.

None of these workflows are affected by the change in this PR. I'm marking as ready to merge.

qmonnet added the area/CI, sig/datapath, and release-note/ci labels Jun 2, 2023

qmonnet requested review from a team as code owners June 2, 2023 13:09

qmonnet requested review from ti-mo, tklauser, julianwiedmann and viktor-kurchenko June 2, 2023 13:09

tklauser approved these changes Jun 2, 2023

test/verifier/verifier_test.go (outdated, resolved)

qmonnet force-pushed the pr/ci-verifier-keep-objfiles branch from 6068539 to 8178f17 on June 2, 2023 14:43

viktor-kurchenko reviewed Jun 5, 2023

test/verifier/verifier_test.go (outdated, resolved)

viktor-kurchenko approved these changes Jun 5, 2023

julianwiedmann approved these changes Jun 5, 2023

qmonnet force-pushed the pr/ci-verifier-keep-objfiles branch from 8178f17 to cb94261 on June 5, 2023 14:43

ti-mo approved these changes Jun 6, 2023

qmonnet mentioned this pull request Jun 6, 2023

CI: Runtime: TestCompileAndLoadDefaultEndpoint: Context deadline exceeded during BPF compilation #25939

Closed

qmonnet force-pushed the pr/ci-verifier-keep-objfiles branch from cb94261 to 2ef102c on June 6, 2023 13:18

qmonnet added 2 commits June 7, 2023 09:31

qmonnet force-pushed the pr/ci-verifier-keep-objfiles branch from 2ef102c to bda8d4d on June 7, 2023 08:32

maintainer-s-little-helper bot mentioned this pull request Jun 7, 2023

CI: Cilium K8s Client connection reset by peer #25958

Closed

qmonnet added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jun 7, 2023

dylandreimerink merged commit e3580a0 into cilium:main Jun 7, 2023
59 of 62 checks passed

qmonnet deleted the pr/ci-verifier-keep-objfiles branch June 7, 2023 12:22
