
docker: update cilium-{runtime,builder} images #11734

Merged
merged 1 commit into master from pr/cilium-images-update on May 28, 2020

Conversation

borkmann
Member

@borkmann borkmann commented May 27, 2020

Pull in updated LLVM into cilium-{runtime,builder} images. On top of the
base 10.0.0, they include the cherry-picked commits from John's work:

  - llvm/llvm-project@29bc5dd
  - llvm/llvm-project@13f6c81

Given we build https://github.com/cilium/image-tools via GH actions, move
all the tools from there to the docker.io auto-built repos.

Also adds ss tool to the runtime image.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

@borkmann borkmann added the pending-review, area/docker (Impacts the integration with Docker.) and release-note/misc (This PR makes changes that have no direct user impact.) labels on May 27, 2020
@borkmann borkmann requested review from a team as code owners May 27, 2020 23:59
@maintainer-s-little-helper maintainer-s-little-helper bot added this to In progress in 1.8.0 May 27, 2020
@coveralls

coveralls commented May 28, 2020

Coverage Status

Coverage decreased (-0.007%) to 36.922% when pulling ab93f0d595dea3dbf183ef7a06baf40287b419bf on pr/cilium-images-update into 87f0905 on master.

Member

@pchaigno pchaigno left a comment


If I understand correctly, this is expected to reduce BPF complexity a bit? Would be nice to have some report of complexity in GitHub, as we have for test coverage.

Also adds ss tool to the runtime image.

Fixes #11648, right?

@borkmann
Member Author

Also adds ss tool to the runtime image.

Fixes #11648, right?

It adds it to the runtime image, but the actual bugtool change is something @soumynathan is working on, along with removing all the other non-working cmd invocations.

@borkmann
Member Author

test-me-please

@borkmann
Member Author

borkmann commented May 28, 2020

If I understand correctly, this is expected to reduce BPF complexity a bit? Would be nice to have some report of complexity in GitHub, as we have for test coverage.

Yes, it's expected to reduce it slightly as well. Agree, some sort of GH action coverage would be really useful to find bad offenders which cause a significant increase in verifier complexity. Maybe there could be one action per tested kernel under a number of pre-set test configs for the BPF datapath. We have this BUILD_PERMUTATION right now in the Makefile, but perhaps we can come up with something better which would then rework the former and be usable for the GH actions.

@borkmann
Member Author

borkmann commented May 28, 2020

(Noticed that I need to add a copy of ss to the COPY --from=tools line as well, sigh.)
EDIT: Done now and rebuilt all the images.
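
As a quick sanity check that the rebuilt runtime image actually ships ss, something like the following works (the tag matches the one in the CI logs below; the exact invocation is only an illustration):

  docker run --rm quay.io/cilium/cilium-runtime:2020-05-29 ss -V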

@borkmann
Member Author

test-me-please

@borkmann
Member Author

retest-4.19

@borkmann
Member Author

quay issue on 4.19:

11:19:21  + cd /home/jenkins/workspace/Cilium-PR-Ginkgo-Tests-Kernel/src/github.com/cilium/cilium/
11:19:21  + make jenkins-precheck
11:19:24  docker-compose -f test/docker-compose.yml -p Cilium-PR-Ginkgo-Tests-Kernel-$BUILD_NUMBER run --rm precheck
11:19:26  Creating network "cilium-pr-ginkgo-tests-kernel-1773_default" with the default driver
11:19:26  Pulling precheck (quay.io/cilium/cilium-builder:2020-05-29)...
11:19:26  Get https://quay.io/v2/: EOF
11:19:26  Makefile:142: recipe for target 'jenkins-precheck' failed
11:19:26  make: *** [jenkins-precheck] Error 1

@borkmann
Member Author

retest-4.19

@borkmann
Member Author

On the 4.19 CI from quay:

11:38:19  Step 15/27 : FROM quay.io/cilium/cilium-runtime:2020-05-29
11:38:19  Get https://quay.io/v2/cilium/cilium-runtime/manifests/2020-05-29: received unexpected HTTP status: 500 Internal Server Error
11:38:19  Makefile.docker:21: recipe for target 'docker-image-no-clean' failed
11:38:19  make: *** [docker-image-no-clean] Error 1

@borkmann
Member Author

retest-4.19

@borkmann
Member Author

retest-net-next

@jrfastab
Contributor

From K8s-1.11-Kernel-netnext, is this expected?

2020-05-28T10:33:04.010364682Z level=warning msg="+ set -e" subsys=datapath-loader
2020-05-28T10:33:04.010367047Z level=warning msg="+ cilium-map-migrate -e bpf_sock.o -r 0" subsys=datapath-loader
2020-05-28T10:33:04.010369139Z level=warning msg="+ '[' 0 -eq 0 ']'" subsys=datapath-loader
2020-05-28T10:33:04.010371125Z level=warning msg="+ set +e" subsys=datapath-loader
2020-05-28T10:33:04.010373136Z level=warning msg="+ bpftool cgroup attach /var/run/cilium/cgroupv2 getpeername6 pinned /sys/fs/bpf/tc/globals/cilium_cgroups_getpeername6" subsys=datapath-loader
2020-05-28T10:33:04.010375253Z level=warning msg="Error: invalid attach type" subsys=datapath-loader
2020-05-28T10:33:04.01037725Z level=warning msg="+ RETCODE=255" subsys=datapath-loader

@borkmann
Member Author

From K8s-1.11-Kernel-netnext, is this expected?

2020-05-28T10:33:04.010364682Z level=warning msg="+ set -e" subsys=datapath-loader
2020-05-28T10:33:04.010367047Z level=warning msg="+ cilium-map-migrate -e bpf_sock.o -r 0" subsys=datapath-loader
2020-05-28T10:33:04.010369139Z level=warning msg="+ '[' 0 -eq 0 ']'" subsys=datapath-loader
2020-05-28T10:33:04.010371125Z level=warning msg="+ set +e" subsys=datapath-loader
2020-05-28T10:33:04.010373136Z level=warning msg="+ bpftool cgroup attach /var/run/cilium/cgroupv2 getpeername6 pinned /sys/fs/bpf/tc/globals/cilium_cgroups_getpeername6" subsys=datapath-loader
2020-05-28T10:33:04.010375253Z level=warning msg="Error: invalid attach type" subsys=datapath-loader
2020-05-28T10:33:04.01037725Z level=warning msg="+ RETCODE=255" subsys=datapath-loader

Heh, no. Looks like https://github.com/cilium/image-tools/blob/master/images/bpftool/checkout-linux.sh is pinned to too old a revision here.
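
For what it's worth, a quick way to tell whether a given bpftool build knows the newer socket attach types is to check its own help text (a hypothetical one-liner; new enough builds list getpeername6 among the ATTACH_TYPE values):

  bpftool cgroup help 2>&1 | grep -c getpeername6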

@errordeveloper
Contributor

errordeveloper commented May 28, 2020

@borkmann ah, looks like your commit went in (cilium/image-tools@7877cb0) while my PR (cilium/image-tools#3) was in flight... apologies for that...

@borkmann
Member Author

cilium/image-tools#22 is building right now. Will update the rest once done.

Pull in updated LLVM into cilium-{runtime,builder} images. On top of the
base 10.0.0, they include the cherry-picked commits from John's work:

  - llvm/llvm-project@29bc5dd
  - llvm/llvm-project@13f6c81

Given we build https://github.com/cilium/image-tools via GH actions, move
all the tools from there to the docker.io auto-built repos.

Also add ss tool to the runtime image for diagnostics via bugtool (e.g.
for the L7 proxy).

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
@joestringer
Member

Yes, expected to slightly reduce as well. Agree, some sort of GH action coverage would be really useful to find bad offenders which have a significant increase in verifier complexity. Maybe there could be one action per tested kernel under a number of pre-set test configs for the BPF datapath. We have this BUILD_PERMUTATION right now in the Makefile, but perhaps we can come up with something better which would then rework the former and be useable for the GH actions.

*whispers* ./test/bpf/check-complexity.sh

This only outputs complexity numbers for the binaries currently compiled in the bpf/ directory, and the output is not particularly machine-friendly right now, but this was exactly the goal of this script.

As a side note, it'd be nice if the kernel actually presented complexity in a more structured manner, so we don't have to feed it a log and then parse the arbitrary log output to figure these numbers out.
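
For illustration, the log parsing in question boils down to something like this (a rough sketch, not literally what check-complexity.sh does; assumes a kernel whose verifier log ends with a "processed N insns" line, and the object/pin paths are illustrative):

  # Load with -d so the verifier log is printed, then extract the final
  # instruction count.
  bpftool -d prog load bpf/bpf_sock.o /sys/fs/bpf/tmp-prog 2>&1 |
    grep -oE 'processed [0-9]+ insns' | tail -1
  rm -f /sys/fs/bpf/tmp-prog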

What I don't understand is how GitHub Actions can help present these over time. I had imagined previously that we could improve this script to output a text file during CI runs and use that with either Ginkgo Measure() and/or Jenkins XML output to push somewhere with longer-term storage. I guess if we could get something like what Coveralls does, where it measures, stores, and posts a comment with a summary of the complexity diff, then that'd be neat.

Also worth evaluating whether it makes sense for this to actually run per-PR, or automatically for specific PRs (all PRs that change bpf/?), or manually triggered, or on a regular cadence with output to somewhere we can regularly check.
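
One cheap gate along the lines of "all PRs that change bpf/" could look like this (hypothetical; the script path is the one mentioned above):

  # Only measure complexity when the PR actually touches the datapath.
  if git diff --name-only origin/master...HEAD | grep -q '^bpf/'; then
    ./test/bpf/check-complexity.sh
  fi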

@pchaigno
Member

pchaigno commented May 28, 2020

whispers ./test/bpf/check-complexity.sh

This only outputs complexity numbers for the binaries currently compiled in the bpf/ directory, and the output is not particularly machine-friendly right now, but this was exactly the goal of this script.

I think Daniel had this script in mind; at least I did. The issue is that it only tests one specific datapath config right now, the one that achieves the maximum complexity on 4.9. We would need to test different configs for different kernels. We already have #10798 opened for that though.

What I don't understand is how GitHub Actions can help present these over time. I had imagined previously that we could improve this script to output a text file during CI runs and use that with either Ginkgo Measure() and/or Jenkins XML output to push somewhere with longer-term storage. I guess if we could get something like what Coveralls does, where it measures, stores, and posts a comment with a summary of the complexity diff, then that'd be neat.

I think it's just that the integration with GitHub Actions is much easier. We could even annotate program sections with their complexity changes. GitHub Actions offers several Ubuntu images, so that may be one way to get some of the kernel versions we need. We would probably need some Jenkins test to get all the kernels we need though :-/
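
Annotations like that can come straight out of a shell step via workflow commands, e.g. (the file name and the surrounding threshold logic are made up):

  # Emit a GitHub Actions warning annotation on a complexity regression.
  if [ "$new_insns" -gt "$old_insns" ]; then
    echo "::warning file=bpf/bpf_lxc.c::verifier complexity grew from $old_insns to $new_insns insns"
  fi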

(all PRs that change bpf/?)

One more thing that'd be super easy with GitHub Actions.

@joestringer
Member

On the complexity measurement topic, ref. #4837, now reopened.

@borkmann
Member Author

retest-net-next

@borkmann
Member Author

test-me-please

@borkmann
Member Author

whispers ./test/bpf/check-complexity.sh
This only outputs complexity numbers for the binaries currently compiled in the bpf/ directory, and the output is not particularly machine-friendly right now, but this was exactly the goal of this script.

I think Daniel had this script in mind; at least I did. The issue is that it only tests one specific datapath config right now, the one that achieves the maximum complexity on 4.9. We would need to test different configs for different kernels. We already have #10798 opened for that though.

Yeah, I think by now we have an explosion of ifdefs in our BPF datapath, so it is hard to say which config exactly causes the highest complexity. Would be nice to categorise which config items are available for the three kernels we test and then, maybe, generate a bunch of random combinations (say ~250) to compile for them (as a GH action), try to load them, and take the diff of complexity. The seed could be fixed so it's reproducible.
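
Something along these lines, as a rough sketch (the -D flags are illustrative, not an exhaustive list of the datapath's config items; assigning to RANDOM seeds bash's generator, which keeps the run reproducible):

  RANDOM=42  # fixed seed for reproducibility
  flags=(-DENABLE_IPV4 -DENABLE_IPV6 -DENABLE_NODEPORT -DENABLE_HOST_FIREWALL)
  for i in $(seq 1 250); do
    combo=()
    for f in "${flags[@]}"; do
      [ $((RANDOM % 2)) -eq 1 ] && combo+=("$f")
    done
    clang -O2 -target bpf "${combo[@]}" -c bpf/bpf_lxc.c -o /tmp/probe.o
    # ...then attempt to load /tmp/probe.o and record the verifier's insn count.
  done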

What I don't understand is how GitHub Actions can help present these over time. I had imagined previously that we could improve this script to output a text file during CI runs and use that with either Ginkgo Measure() and/or Jenkins XML output to push somewhere with longer-term storage. I guess if we could get something like what Coveralls does, where it measures, stores, and posts a comment with a summary of the complexity diff, then that'd be neat.

I think it's just that the integration with GitHub Actions is much easier. We could even annotate program sections with their complexity changes. GitHub Actions offers several Ubuntu images, so that may be one way to get some of the kernel versions we need. We would probably need some Jenkins test to get all the kernels we need though :-/

That would also be cool!

(all PRs that change bpf/?)

One more thing that'd be super easy with GitHub Actions.

@borkmann
Member Author

retest-runtime

@borkmann borkmann merged commit b87a855 into master May 28, 2020
1.8.0 automation moved this from In progress to Merged May 28, 2020
@borkmann borkmann deleted the pr/cilium-images-update branch May 28, 2020 22:16