
This issue was moved to a discussion.

[HELP] Running Cilium in k3d (bpf-mount) #363

Closed
arjantop-cai opened this issue Sep 30, 2020 · 15 comments
Labels
bug Something isn't working not a bug Luckily this is not a bug with k3d after all ¯\_(ツ)_/¯ question Further information is requested

Comments

@arjantop-cai

What did you do

  • How was the cluster created?

    • k3d cluster create test --k3s-server-arg="--flannel-backend=none"
  • What did you do afterwards?

From here: https://docs.cilium.io/en/v1.8/gettingstarted/kind/

helm install cilium cilium/cilium --version 1.8.3 \
   --namespace kube-system \
   --set global.nodeinit.enabled=true \
   --set global.kubeProxyReplacement=partial \
   --set global.hostServices.enabled=false \
   --set global.externalIPs.enabled=true \
   --set global.nodePort.enabled=true \
   --set global.hostPort.enabled=true \
   --set config.bpfMasquerade=false \
   --set global.pullPolicy=IfNotPresent \
   --set config.ipam=kubernetes

Or from here: https://docs.cilium.io/en/v1.8/gettingstarted/k3s/

kubectl create -f https://raw.githubusercontent.com/cilium/cilium/v1.8/install/kubernetes/quick-install.yaml

Error:
kubectl describe pod -nkube-system cilium-5hq8g

kubelet, k3d-test-server-0  Error: failed to generate container "90be1c0c5cf1ed67b8529b5112e1411c22b0a4c86e087442f0873c173a2deb46" spec: path "/sys/fs/bpf" is mounted on "/sys" but it is not a shared or slave mount
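
The error above is about mount propagation: the mount that contains /sys/fs/bpf inside the node is neither shared nor slave. A minimal sketch for checking this, assuming the standard Linux mountinfo format (the optional "shared:N" / "master:N" fields before the "-" separator); the function name mount_propagation is made up for illustration:

```shell
# Print the propagation mode ("shared", "slave" or "private") of a mount point.
# $1: mount point to look up, e.g. /sys or /sys/fs/bpf
# $2: mountinfo file to read (defaults to the live /proc/self/mountinfo)
# Returns non-zero if the mount point is not listed at all.
mount_propagation() {
  awk -v mp="$1" '
    $5 == mp {
      mode = "private"
      # Optional fields start at column 7 and end at the "-" separator.
      for (i = 7; i <= NF && $i != "-"; i++) {
        if ($i ~ /^shared:/) mode = "shared"
        else if ($i ~ /^master:/ && mode != "shared") mode = "slave"
      }
      found = mode   # the last matching entry is the topmost mount
    }
    END { if (found == "") exit 1; print found }
  ' "${2:-/proc/self/mountinfo}"
}
```

To inspect a k3d node rather than the local machine, one option would be to dump the file first with docker exec k3d-test-server-0 cat /proc/self/mountinfo > /tmp/mountinfo and then run mount_propagation /sys /tmp/mountinfo.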

What did you expect to happen

Cilium install should become healthy (it works in k3s: https://docs.cilium.io/en/v1.8/gettingstarted/k3s/)

Which OS & Architecture

  • MacOS 10.15.6

Which version of k3d

$ k3d version
k3d version v3.0.0
k3s version latest (default)

Which version of docker

$ docker version
Client: Docker Engine - Community
 Azure integration  0.1.15
 Version:           19.03.13-beta2
 API version:       1.40
 Go version:        go1.13.14
 Git commit:        ff3fbc9d55
 Built:             Mon Aug  3 14:58:48 2020
 OS/Arch:           darwin/amd64
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          19.03.13-beta2
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.13.14
  Git commit:       ff3fbc9d55
  Built:            Mon Aug  3 15:06:50 2020
  OS/Arch:          linux/amd64
  Experimental:     true
 containerd:
  Version:          v1.2.13
  GitCommit:        7ad184331fa3e55e52b890ea95e65ba581ae3429
 runc:
  Version:          1.0.0-rc10
  GitCommit:        dc9208a3303feef5b3839f4323d9beb36df0a9dd
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683
$ docker info
Client:
 Debug Mode: false
 Plugins:
  app: Docker Application (Docker Inc., v0.8.0)
  buildx: Build with BuildKit (Docker Inc., v0.3.1-tp-docker)
  scan: Docker Scan (Docker Inc., v0.3.3)

Server:
 Containers: 4
  Running: 3
  Paused: 0
  Stopped: 1
 Images: 25
 Server Version: 19.03.13-beta2
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7ad184331fa3e55e52b890ea95e65ba581ae3429
 runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
 init version: fec3683
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 4.19.76-linuxkit
 Operating System: Docker Desktop
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 5.811GiB
 Name: docker-desktop
 ID: V4NV:VJJA:NWL6:JSNI:4XLR:5R2M:CCCY:MMV6:476A:TCM5:4BOO:2YYT
 Docker Root Dir: /var/lib/docker
 Debug Mode: true
  File Descriptors: 69
  Goroutines: 80
  System Time: 2020-09-30T22:07:53.2873526Z
  EventsListeners: 4
 HTTP Proxy: gateway.docker.internal:3128
 HTTPS Proxy: gateway.docker.internal:3129
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: true
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
 Product License: Community Engine
@arjantop-cai arjantop-cai added the bug Something isn't working label Sep 30, 2020
@iwilltry42 iwilltry42 self-assigned this Oct 1, 2020
@iwilltry42
Member

Hi, thanks for opening this issue 👍
What you're experiencing here is not actually a bug in k3d. It is caused by the fact that you're running k3s inside a container, which is not exactly the same as running it directly on the host.
I guess the step that's affecting you is this one: https://docs.cilium.io/en/v1.8/gettingstarted/k3s/#mount-the-bpf-filesystem
To solve it, you could try creating your cluster with a bind-mount of /sys or just /sys/fs/bpf, e.g. k3d cluster create test --k3s-server-arg="--flannel-backend=none" --volume /sys/fs/bpf:/sys/fs/bpf (Note: this may require you to run k3d as root).

@iwilltry42 iwilltry42 added not a bug Luckily this is not a bug with k3d after all ¯\_(ツ)_/¯ question Further information is requested labels Oct 2, 2020
@iwilltry42 iwilltry42 changed the title [BUG] Cilium install never becomes healthy [HELP] Running Cilium in k3d (bpf-mount) Oct 2, 2020
@arjantop-cai
Author

@iwilltry42 Thanks for looking into this. The install should work without mounting (for non-production use cases); the BPF filesystem gets auto-mounted when the Cilium pod is started.

I am also on a Mac, so I don't know if that could work (I tried it just in case, and the error is the same).

The Cilium install works out of the box with kind, so I expected k3s to work as well.

@arjantop-cai
Author

I diffed docker inspect of both clusters, could this difference be the cause?

"seccomp=unconfined",
"apparmor=unconfined",

@iwilltry42
Member

iwilltry42 commented Oct 2, 2020

> I diffed docker inspect of both clusters, could this difference be the cause?
>
> "seccomp=unconfined",
> "apparmor=unconfined",

Nice finding!
I'm not 100% familiar with those settings, but from what I see in kind's code, it could very well be... Let me create a test release that you can test with 👍

UPDATE: I linked a Google Drive folder with test release files in the PR and will drop it here as well: https://drive.google.com/drive/u/0/folders/1dAvLKlqs5hgXmUnrVs2pn0VzzaMsdKus

@arjantop-cai
Author

Sorry @iwilltry42, this is not it. It might be needed, but it does not solve the current issue.

I've started investigating now; it might be an issue with the base image (k3s). I will share the results.

@iwilltry42
Member

@arjantop-cai , damn.. but it was worth a try 🤔
Thanks for investigating 👍

@blaggacao

blaggacao commented Oct 5, 2020

Shouldn't it be: --volume /sys/fs/bpf:/sys/fs/bpf:shared ? like in the csi case?

Or also --volume /sys/fs/bpf:/sys/fs/bpf:slave according to the error message.

@arjantop-cai
Author

@blaggacao I don't think it's that simple. I am running on a Mac, where there is no /sys/fs/bpf. If I mount anything (with Docker for Mac), it will try to mount from the host machine, not from the Linux VM that Docker is running in.

If I run the command, it (as expected) does not work, but it now fails already during k3d startup rather than when installing Cilium:

WARN[0000] Failed to stat file/directory/named volume that you're trying to mount: '/sys/fs/bpf' in '/sys/fs/bpf:/sys/fs/bpf:slave' -> Please make sure it exists
ERRO[0001] Error response from daemon: path /sys/fs/bpf is mounted on /sys but it is not a shared or slave mount

@blaggacao

Oh, sorry, I overlooked that you were working on a Mac. Some time from now I might be conducting experiments with Cilium on Linux as well, so I might come back with more insight.

@iwilltry42
Member

iwilltry42 commented Oct 6, 2020

After doing some quick research, I'm afraid we won't be able to resolve this properly on k3d side.
Please check out cilium/cilium#10516 and consequently docker/for-mac#4454
Update: On the other hand, this makes it even more unclear to me why it works in kind 🤔
Update2: Maybe this is onto something: https://github.com/kubernetes-sigs/kind/blob/c13c54b9564aed8bc4f28b90af20a1100da66963/images/base/files/usr/local/bin/entrypoint#L53-L62
Update3: It seems that trying to mount /sys in DfD will indeed try to mount from the Docker VM and not from the Mac host (it's just that the directories you'd usually mount are shared into the Docker VM by default): https://docs.docker.com/docker-for-mac/ (section "FILE SHARING")

@skurfuerst

Hey,

I've been able to fix the k3d issue like this:

k3d cluster create foo --agents 1 \
    --k3s-server-arg "--disable=servicelb" --k3s-server-arg "--disable=traefik"  --no-lb \
    --k3s-server-arg "--disable-network-policy" --k3s-server-arg "--flannel-backend=none" \
     --k3s-agent-arg "--no-flannel"

docker ps

docker exec -it k3d-foo-agent-0 mount bpffs /sys/fs/bpf -t bpf
docker exec -it k3d-foo-agent-0 mount --make-shared /sys/fs/bpf

docker exec -it k3d-foo-server-0 mount bpffs /sys/fs/bpf -t bpf
docker exec -it k3d-foo-server-0 mount --make-shared /sys/fs/bpf

With that, it no longer fails on accessing the BPF filesystem. It does not run all the way through yet, but maybe it helps somebody else :)
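
The per-node commands above could be wrapped in a loop so they do not have to be repeated for every node. This is only a sketch: the function name is made up, and it assumes the node containers follow the k3d-<cluster>-server-N / k3d-<cluster>-agent-N naming seen above:

```shell
# Mount the BPF filesystem and mark it shared on every server/agent node
# of a k3d cluster. $1: cluster name (e.g. foo).
mount_bpf_on_nodes() {
  cluster="$1"
  docker ps --format '{{.Names}}' --filter "name=k3d-${cluster}-" |
  while read -r node; do
    # Skip anything that is not a server or agent node (e.g. a load balancer).
    case "$node" in
      k3d-"$cluster"-server-*|k3d-"$cluster"-agent-*) ;;
      *) continue ;;
    esac
    docker exec "$node" mount bpffs /sys/fs/bpf -t bpf || exit 1
    docker exec "$node" mount --make-shared /sys/fs/bpf || exit 1
  done
}
```

Calling mount_bpf_on_nodes foo should then be equivalent to the four docker exec lines above. The mounts are not persistent, so they would presumably need to be redone whenever the node containers are recreated.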

All the best,
Sebastian

@skurfuerst

Hey @iwilltry42 @arjantop-cai @blaggacao ,

I figured out how you can run Cilium with k3s / k3d on docker for mac. See https://sandstorm.de/de/blog/post/running-cilium-in-k3s-and-k3d-lightweight-kubernetes-on-mac-os-for-development.html for the full explanation :-) Feel free to re-use the content for your documentation or so :-)

All the best,
and keep up the great work,
Sebastian

@arjantop-cai
Author

@skurfuerst So the only reason it is not working is that the image does not have bash installed?

Should the k3d base image have it installed for compatibility?

@skurfuerst

@arjantop-cai IMHO that would help, but AFAIK this is the k3s image, not the k3d image. And this is based on Busybox; so it might be more difficult to use Bash here.

IMHO it would actually be better to rewrite the script on the Cilium side to only use /bin/sh. The script is quite simple, and AFAICS no bash features are used there.
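
A quick way to confirm whether a given node container has bash at all might look like the sketch below. The function name is made up, and the node name in the usage note is only an example taken from earlier in this thread:

```shell
# Succeeds (exit code 0) if bash can be found inside the given container.
# Uses the container's own /bin/sh, so it also works on Busybox-based images.
node_has_bash() {
  docker exec "$1" sh -c 'command -v bash >/dev/null 2>&1'
}
```

For example: node_has_bash k3d-foo-server-0 && echo "bash present" || echo "no bash, only sh".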

All the best,
Sebastian

@iwilltry42
Member

Hey @skurfuerst great work! Thanks for sharing your insights and the content!
I'd love to accept a PR for the docs on k3d.io, if you want to share something there :)

@k3d-io k3d-io locked and limited conversation to collaborators Feb 5, 2021
