cilium, bpf: select v3 BPF CPU in llc for 5.7+ kernels #10804

Closed
borkmann wants to merge 1 commit into master from pr/v3-cpu


Member

@borkmann borkmann commented Apr 1, 2020

Given that 5.7+ kernels have full 32-bit subregister tracking, enable llc to
emit 32-bit subregister (alu32/jmp32) instructions for them, which also helps
reduce complexity for the verifier.

See also complexity comparison in recent commit 4baaa2c ("docker,
runtime: upgrade to recent clang/llvm image in runtime").

On the CI side we are covered as well, since test-me-please runs for:

  • 4.9 kernel compiles with v1 CPU
  • 4.19 kernel compiles with v2 CPU
  • bpf-next kernel compiles with v3 CPU

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

@borkmann borkmann added pending-review sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. release-note/minor This PR changes functionality that users may find relevant to operating Cilium. labels Apr 1, 2020
@borkmann borkmann requested a review from a team April 1, 2020 12:46
@maintainer-s-little-helper maintainer-s-little-helper bot added this to In progress in 1.8.0 Apr 1, 2020
@borkmann
Member Author

borkmann commented Apr 1, 2020

test-me-please

@borkmann borkmann requested a review from jrfastab April 1, 2020 12:52
@borkmann
Member Author

borkmann commented Apr 1, 2020

test-me-please

@borkmann
Member Author

borkmann commented Apr 1, 2020

Unrelated but rather odd one from go build:

13:00:54  CGO_ENABLED=0 go build -mod=vendor -ldflags '-X "github.com/cilium/cilium/pkg/version.Version=1.7.90 72307b0bf 2020-04-01T12:46:16+00:00 go version go1.14.1 linux/amd64" -s -w -X "github.com/cilium/cilium/pkg/envoy.RequiredEnvoyVersionSHA=a3385205ad620550b35d3b0b651e40898386e6e3" -X "github.com/cilium/cilium/pkg/datapath/loader.DatapathSHA=d29006039db53eed1518dd9e483fc74f14917bab"' -a -installsuffix cgo -tags operator_aws -o cilium-operator
13:01:24  # github.com/cilium/cilium/operator
[2020-04-01T13:01:24.415Z] unexpected fault address 0x7f466d0f1000
[2020-04-01T13:01:24.415Z] fatal error: fault
[2020-04-01T13:01:24.415Z] [signal SIGBUS: bus error code=0x2 addr=0x7f466d0f1000 pc=0x46457f]
[2020-04-01T13:01:24.415Z] 
[2020-04-01T13:01:24.415Z] goroutine 1 [running]:
[2020-04-01T13:01:24.415Z] runtime.throw(0x6b8260, 0x5)
[2020-04-01T13:01:24.415Z] 	/usr/local/go/src/runtime/panic.go:1114 +0x72 fp=0xc00004ae30 sp=0xc00004ae00 pc=0x4332b2
[2020-04-01T13:01:24.415Z] runtime.sigpanic()
[2020-04-01T13:01:24.415Z] 	/usr/local/go/src/runtime/signal_unix.go:692 +0x443 fp=0xc00004ae60 sp=0xc00004ae30 pc=0x4498f3
[2020-04-01T13:01:24.415Z] runtime.memmove(0x7f466c4b0800, 0xc02f600000, 0x12ba08f)
[2020-04-01T13:01:24.415Z] 	/usr/local/go/src/runtime/memmove_amd64.s:425 +0x50f fp=0xc00004ae68 sp=0xc00004ae60 pc=0x46457f
[2020-04-01T13:01:24.415Z] cmd/link/internal/ld.(*OutBuf).Write(0xc000718000, 0xc02f600000, 0x12ba08f, 0x1874076, 0x0, 0x0, 0x0)
[2020-04-01T13:01:24.415Z] 	/usr/local/go/src/cmd/link/internal/ld/outbuf.go:65 +0xa1 fp=0xc00004aeb8 sp=0xc00004ae68 pc=0x5c88f1
[2020-04-01T13:01:24.415Z] cmd/link/internal/ld.(*OutBuf).WriteSym(0xc000718000, 0xc0make: *** [Makefile:8: cilium-operator] Error 2
[2020-04-01T13:01:25.370Z] The command '/bin/sh -c make CGO_ENABLED=0 GOOS=linux LOCKDEBUG=$LOCKDEBUG PKG_BUILD=1 EXTRA_GOBUILD_FLAGS="-a -installsuffix cgo -tags operator_aws"' returned a non-zero code: 2
13:01:25  Makefile:245: recipe for target 'docker-operator-image' failed
13:01:25  make: *** [docker-operator-image] Error 2
Post stage

@tklauser
Member

tklauser commented Apr 1, 2020

Unrelated but rather odd one from go build:


Could this be related to the operator split (#9920) that @errordeveloper has been working on? Specifically #10689?

@coveralls

Coverage Status

Coverage increased (+0.01%) to 45.501% when pulling d43085f on pr/v3-cpu into 09eebce on master.

Contributor

@jrfastab jrfastab left a comment


Awesome!

@borkmann
Member Author

borkmann commented Apr 1, 2020

test-me-please

@borkmann
Member Author

borkmann commented Apr 2, 2020

test-me-please

@ciliumbot

Build finished.

Member

@qmonnet qmonnet left a comment


Looks good.

Probing indirectly seems a bit fragile with regard to potential helper backports. Should we try to add direct probes to e.g. bpftool in the future?

[Nit: the BPF guide mentions v1 and v2 processors, should it be updated to mention v3 too?]

@errordeveloper
Contributor

> Could this be related to the operator split (#9920) that @errordeveloper has been working on? Specifically #10689?

Does that issue still persist?

@tklauser
Member

> > Could this be related to the operator split (#9920) that @errordeveloper has been working on? Specifically #10689?
>
> Does that issue still persist?

Didn't see it happen in the last few days. Looks like it was fixed in the meantime.

@tklauser
Member

restart-ginkgo

@tklauser
Member

test-with-kernel

@tklauser
Member

test-gke

@aanm
Member

aanm commented Apr 20, 2020

test-me-please jenkins history was gone and ginkgo-tests have failed.

Member

@joestringer joestringer left a comment


LGTM, one question about whether we'll misdetect the presence of the subregister backtracking changes.

We'll also need to keep this in mind as we think about slimming down the images: #9411

// but the jmp/alu32 handling is not suited for pre 5.7 due to
// lack of 32 bit subreg tracking.
if h := probes.NewProbeManager().GetHelpers("cgroup_sock_addr"); h != nil {
	if _, ok := h["bpf_get_netns_cookie"]; ok {
Member


Is there any chance that the subregister backtracking won't be backported but the netns_cookie changes will be? I think the netns_cookie helper is quite powerful with a relatively minimal set of changes.
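To make the misdetection concern concrete, here is a minimal sketch that keeps the two signals separate. It assumes a direct subregister-tracking probe that, per this thread, does not exist yet; the `probeResult` type, its fields and `selectCPU` are hypothetical, and the helper map type is only a guess at what the probe manager in the diff above returns.

```go
package main

import "fmt"

// probeResult bundles hypothetical probe outputs.
type probeResult struct {
	// helpers holds the helper names available to one program type,
	// modelled after the map checked in the diff above (assumed type).
	helpers map[string]struct{}
	// subregTracking is the result of a hypothetical direct probe;
	// nil means no direct probe is available.
	subregTracking *bool
}

// selectCPU prefers the direct probe and only falls back to the
// indirect helper check when no direct answer exists.
func selectCPU(p probeResult, fallback string) string {
	if p.subregTracking != nil {
		if *p.subregTracking {
			return "v3"
		}
		return fallback
	}
	if _, ok := p.helpers["bpf_get_netns_cookie"]; ok {
		return "v3"
	}
	return fallback
}

func main() {
	// A kernel with the helper backported but without subreg tracking.
	backported := probeResult{
		helpers: map[string]struct{}{"bpf_get_netns_cookie": {}},
	}
	fmt.Println(selectCPU(backported, "v2")) // indirect check only: v3

	no := false
	backported.subregTracking = &no
	fmt.Println(selectCPU(backported, "v2")) // direct probe wins: v2
}
```

The first call shows the case raised here: with only the helper check, a backported `bpf_get_netns_cookie` selects v3 even though the verifier lacks 32-bit subregister tracking.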

@joestringer joestringer mentioned this pull request Apr 20, 2020
4 tasks
@borkmann
Member Author

(pr currently on hold given v3 complexity)

@aanm aanm marked this pull request as draft April 23, 2020 10:09
@aanm aanm removed the wip label Apr 23, 2020
@aanm
Member

aanm commented Apr 23, 2020

this PR has been marked as a draft PR since it had a WIP label. Please click in "Ready for review" [below vvv ] once the PR is ready to be reviewed. CI will still run for draft PRs.

@stale

stale bot commented May 23, 2020

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label May 23, 2020
@borkmann
Member Author

Closing for now. We'll revisit mcpu=v3 at a later point in time.

@borkmann borkmann closed this May 29, 2020
@nbusseneau nbusseneau deleted the pr/v3-cpu branch May 21, 2021 21:35
@nbusseneau nbusseneau restored the pr/v3-cpu branch May 21, 2021 21:35