
[BUG] No space left due to spurious entries in /proc/*/mountinfo wrt k3d #594

Closed
leelavg opened this issue May 10, 2021 · 5 comments
Labels: bug (Something isn't working)
Milestone: Backlog

leelavg (Contributor) commented May 10, 2021

Pardon me for not using the bug template. First of all, thanks for creating k3d; I've been using it since version 3.x and have done my part spreading the word. Details of the actual issue are below.

Versions:

# k3d version
k3d version v4.4.3
k3s version v1.20.6-k3s1 (default)

# docker version
Client: Docker Engine - Community
 Version:           20.10.6
 API version:       1.41
 Go version:        go1.13.15
 Git commit:        370c289
 Built:             Fri Apr  9 22:47:32 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.6
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.13.15
  Git commit:       8728dd2
  Built:            Fri Apr  9 22:45:12 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.4
  GitCommit:        05f951a3781f4f2c1911b05e61c160e9c30eaa8e
 runc:
  Version:          1.0.0-rc93
  GitCommit:        12644e614e25b05da6fd08a38ffa0cfe1903fdec
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

# docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Build with BuildKit (Docker Inc., v0.5.1-docker)
  scan: Docker Scan (Docker Inc.)

Server:
 Containers: 1
  Running: 1
  Paused: 0
  Stopped: 0
 Images: 56
 Server Version: 20.10.6
 Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 05f951a3781f4f2c1911b05e61c160e9c30eaa8e
 runc version: 12644e614e25b05da6fd08a38ffa0cfe1903fdec
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 5.11.14-100.fc32.x86_64
 Operating System: Fedora 32 (Server Edition)
 OSType: linux
 Architecture: x86_64
 CPUs: 6
 Total Memory: 7.67GiB
 Name: XXX
 ID: ZQ6O:XXX:VH33
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Username: leelavg
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Issue and additional info

  • When a cluster with the same name is created with shared volume mounts after a previous one was deleted, stale mount entries are left behind in /proc/*/mountinfo. After several cluster creations and deletions this resulted in an error like no space left on device: unknown, and I guess it might be related to [COLLECTION] Pods evicted due to NodeHasDiskPressure #133?
  • I have been using a workaround that was posted here (search for diff <); since that article has been getting some hits recently, I want to make sure I'm not doing something wrong that ends up creating these entries.
  • I don't remember the exact error message, but the issue can be reproduced by following the steps below.
# Before cluster creation and any operations
# grep k3d /proc/*/mountinfo | wc -l
0

# df -h | grep -v root
Filesystem                          Size  Used Avail Use% Mounted on
devtmpfs                            3.9G     0  3.9G   0% /dev
tmpfs                               3.9G     0  3.9G   0% /dev/shm
tmpfs                               3.9G  1.2M  3.9G   1% /run
tmpfs                               3.9G     0  3.9G   0% /sys/fs/cgroup
tmpfs                               3.9G  611M  3.3G  16% /tmp
/dev/sda1                          1014M  284M  731M  28% /boot
/dev/mapper/vg1-lv1                 100G  5.3G   95G   6% /var
tmpfs                               786M     0  786M   0% /run/user/0
overlay                             100G  5.3G   95G   6% /var/lib/docker/overlay2/f6c39ac88367d9a0322e715bf3d57c1756d7b97704790d9449eab0739c8ed81e/merged

# df -ha | grep -Pv 'root|cgroup'
Filesystem                          Size  Used Avail Use% Mounted on
sysfs                                  0     0     0    - /sys
proc                                   0     0     0    - /proc
devtmpfs                            3.9G     0  3.9G   0% /dev
securityfs                             0     0     0    - /sys/kernel/security
tmpfs                               3.9G     0  3.9G   0% /dev/shm
devpts                                 0     0     0    - /dev/pts
tmpfs                               3.9G  1.2M  3.9G   1% /run
tmpfs                               3.9G     0  3.9G   0% /sys/fs/cgroup
pstore                                 0     0     0    - /sys/fs/pstore
none                                   0     0     0    - /sys/fs/bpf
selinuxfs                              0     0     0    - /sys/fs/selinux
systemd-1                              -     -     -    - /proc/sys/fs/binfmt_misc
mqueue                                 0     0     0    - /dev/mqueue
hugetlbfs                              0     0     0    - /dev/hugepages
debugfs                                0     0     0    - /sys/kernel/debug
tracefs                                0     0     0    - /sys/kernel/tracing
configfs                               0     0     0    - /sys/kernel/config
fusectl                                0     0     0    - /sys/fs/fuse/connections
tmpfs                               3.9G  611M  3.3G  16% /tmp
/dev/sda1                          1014M  284M  731M  28% /boot
/dev/mapper/vg1-lv1                 100G  5.3G   95G   6% /var
sunrpc                                 0     0     0    - /var/lib/nfs/rpc_pipefs
tmpfs                               786M     0  786M   0% /run/user/0
binfmt_misc                            0     0     0    - /proc/sys/fs/binfmt_misc
overlay                             100G  5.3G   95G   6% /var/lib/docker/overlay2/f6c39ac88367d9a0322e715bf3d57c1756d7b97704790d9449eab0739c8ed81e/merged
nsfs                                   0     0     0    - /run/docker/netns/de309cbe7973

# for i in {1..10}; do k3d cluster create test -v /tmp/k3d/kubelet/pods:/var/lib/kubelet/pods:shared 2>&1 >/dev/null; sleep 1; k3d cluster list; k3d cluster delete test 2>&1 >/dev/null; grep k3d /proc/*/mountinfo | wc -l; done;
WARN[0000] No node filter specified                     
NAME   SERVERS   AGENTS   LOADBALANCER
test   1/1       0/0      true
270
WARN[0000] No node filter specified                     
NAME   SERVERS   AGENTS   LOADBALANCER
test   1/1       0/0      true
544
WARN[0000] No node filter specified                     
NAME   SERVERS   AGENTS   LOADBALANCER
test   1/1       0/0      true
885
WARN[0000] No node filter specified                     
NAME   SERVERS   AGENTS   LOADBALANCER
test   1/1       0/0      true
1216
WARN[0000] No node filter specified                     
NAME   SERVERS   AGENTS   LOADBALANCER
test   1/1       0/0      true
1530
WARN[0000] No node filter specified                     
NAME   SERVERS   AGENTS   LOADBALANCER
test   1/1       0/0      true
1956
WARN[0000] No node filter specified                     
NAME   SERVERS   AGENTS   LOADBALANCER
test   1/1       0/0      true
2275
WARN[0000] No node filter specified                     
NAME   SERVERS   AGENTS   LOADBALANCER
test   1/1       0/0      true
2592
WARN[0000] No node filter specified                     
NAME   SERVERS   AGENTS   LOADBALANCER
test   1/1       0/0      true
grep: /proc/380201/mountinfo: No such file or directory
2988
WARN[0000] No node filter specified                     
NAME   SERVERS   AGENTS   LOADBALANCER
test   1/1       0/0      true
3300

# df -h | grep -v root
Filesystem                          Size  Used Avail Use% Mounted on
devtmpfs                            3.9G     0  3.9G   0% /dev
tmpfs                               3.9G     0  3.9G   0% /dev/shm
tmpfs                               3.9G  1.2M  3.9G   1% /run
tmpfs                               3.9G     0  3.9G   0% /sys/fs/cgroup
tmpfs                               3.9G  611M  3.3G  16% /tmp
/dev/sda1                          1014M  284M  731M  28% /boot
/dev/mapper/vg1-lv1                 100G  5.3G   95G   6% /var
tmpfs                               786M     0  786M   0% /run/user/0
overlay                             100G  5.3G   95G   6% /var/lib/docker/overlay2/f6c39ac88367d9a0322e715bf3d57c1756d7b97704790d9449eab0739c8ed81e/merged

# df -ha | grep -Pv 'root|cgroup'
Filesystem                          Size  Used Avail Use% Mounted on
sysfs                                  0     0     0    - /sys
proc                                   0     0     0    - /proc
devtmpfs                            3.9G     0  3.9G   0% /dev
securityfs                             0     0     0    - /sys/kernel/security
tmpfs                               3.9G     0  3.9G   0% /dev/shm
devpts                                 0     0     0    - /dev/pts
tmpfs                               3.9G  1.2M  3.9G   1% /run
pstore                                 0     0     0    - /sys/fs/pstore
none                                   0     0     0    - /sys/fs/bpf
selinuxfs                              0     0     0    - /sys/fs/selinux
systemd-1                              -     -     -    - /proc/sys/fs/binfmt_misc
mqueue                                 0     0     0    - /dev/mqueue
hugetlbfs                              0     0     0    - /dev/hugepages
debugfs                                0     0     0    - /sys/kernel/debug
tracefs                                0     0     0    - /sys/kernel/tracing
configfs                               0     0     0    - /sys/kernel/config
fusectl                                0     0     0    - /sys/fs/fuse/connections
tmpfs                               3.9G  611M  3.3G  16% /tmp
/dev/sda1                          1014M  284M  731M  28% /boot
/dev/mapper/vg1-lv1                 100G  5.3G   95G   6% /var
sunrpc                                 0     0     0    - /var/lib/nfs/rpc_pipefs
tmpfs                               786M     0  786M   0% /run/user/0
binfmt_misc                            0     0     0    - /proc/sys/fs/binfmt_misc
overlay                             100G  5.3G   95G   6% /var/lib/docker/overlay2/f6c39ac88367d9a0322e715bf3d57c1756d7b97704790d9449eab0739c8ed81e/merged
nsfs                                   0     0     0    - /run/docker/netns/de309cbe7973
tmpfs                               3.9G  611M  3.3G  16% /var/lib/docker/volumes/6eecef4ae7b3a5c04e0a76832f0ce3ba5062e1837a476230b706e6e3b615f7bb/_data/pods
tmpfs                               3.9G  611M  3.3G  16% /var/lib/docker/volumes/0604535ecf15325738042094f71af3e01e54423335c1c60a105719efa5cf4630/_data/pods
tmpfs                               3.9G  611M  3.3G  16% /var/lib/docker/volumes/fb7d0b6a6a6c354b278a316d3bf3ac98067c3de488eab17c91c0ec0f1bd21cae/_data/pods
tmpfs                               3.9G  611M  3.3G  16% /var/lib/docker/volumes/9b6ab99f52d007da847a2331d77d158340bf3adcd26836dcc7e01cdb875f3235/_data/pods
tmpfs                               3.9G  611M  3.3G  16% /var/lib/docker/volumes/8f364dc440e2683527f8aff0c35e81a125d78aa91576f8adf4795586d895b374/_data/pods
tmpfs                               3.9G  611M  3.3G  16% /var/lib/docker/volumes/9bba99f433206592b2f3c6f8e62e10712b66ac332d1ecac6c338398cb2ca9f56/_data/pods
tmpfs                               3.9G  611M  3.3G  16% /var/lib/docker/volumes/d040a8bdc374292b35b0e42332a141abe9fc3d9c70301d58f5603c737318c13f/_data/pods
tmpfs                               3.9G  611M  3.3G  16% /var/lib/docker/volumes/d3c21276357e26cf1a3c65d9848c90569909afc90b7fac9c814e4c2aa7d83fa4/_data/pods
tmpfs                               3.9G  611M  3.3G  16% /var/lib/docker/volumes/0a340b08c0a98ed80b194c55a061884e54187d308b5a5dbbb6fd8f4dac84fb83/_data/pods
tmpfs                               3.9G  611M  3.3G  16% /var/lib/docker/volumes/4b237dd6a4bec5dd61b03ccec5c47961c339e9aafc9b8679052ebc8493269106/_data/pods
For reference, the cluster create command and the unmount workaround from the article:

  k3d cluster create test -a $agents \
      -v /tmp/k3d/kubelet/pods:/var/lib/kubelet/pods:shared \
      -v /mnt/sdc:/mnt/sdc -v /mnt/sdd:/mnt/sdd \
      -v /mnt/sde:/mnt/sde \
      -v ~/.k3d/registries.yaml:/etc/rancher/k3s/registries.yaml \
      --k3s-server-arg "--kube-apiserver-arg=feature-gates=EphemeralContainers=true" \
      --k3s-server-arg --disable=local-storage
  # Delete k3d cluster
  k3d cluster delete test

  # Unmount any left overs of k3d
  diff <(df -ha | grep pods | awk '{print $NF}') <(df -h | grep pods | awk '{print $NF}') | awk '{print $2}' | xargs umount -l
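
The same cleanup can also be done by reading /proc/self/mountinfo directly instead of diffing two df runs; a minimal sketch, assuming the leftovers are exactly the tmpfs mounts ending in /_data/pods as shown above:

  # Field 5 of mountinfo is the mount point; lazily unmount any leftover
  # kubelet pod mounts still visible in the host mount namespace.
  awk '$5 ~ /\/_data\/pods$/ {print $5}' /proc/self/mountinfo | xargs -r -n1 umount -l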

Please let me know if any additional info is required. I hope I'm doing something wrong here.

Thanks.

leelavg added the bug label May 10, 2021
iwilltry42 added this to the Backlog milestone May 10, 2021
iwilltry42 (Member) commented:

Hi @leelavg, thanks for opening this issue!
Interesting finding, and thanks for the feedback and for sharing your post!
However, I'm not sure where you see a need for action on k3d's side here 🤔
When k3d deletes the node containers, it also wipes the container volumes (RemoveVolumes: true), so everything that k3d manages is gone.
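
A rough CLI equivalent of that cleanup (a sketch, not k3d's actual code path; the container name follows k3d's k3d-<cluster>-server-0 naming convention):

# Force-remove the node container together with its anonymous volumes,
# which is roughly what the API call with RemoveVolumes: true does
# docker rm --force --volumes k3d-test-server-0
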
What would you expect to happen in your use case?

leelavg (Author) commented May 10, 2021

@iwilltry42, I expect the output of df -ha before cluster creation and after cluster deletion to match, in the sense that no entries related to k3d should remain.

This can also be seen in /proc/*/mountinfo: many processes still hold k3d-related mount entries even after cluster deletion.
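
To see which processes still reference the leaked mounts in their mount namespaces, something like this works (a sketch; assumes root and GNU userland):

# grep -l '_data/pods' /proc/[0-9]*/mountinfo | cut -d/ -f3 | paste -sd, - | xargs -r ps -o pid,comm -p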

iwilltry42 (Member) commented:

Well yeah, I understood the end goal, but not how k3d could do anything here.
I guess that would all be up to the runtime (docker), right?
K3d already cleans up everything it creates along the way. How docker takes care of the rest is unfortunately not up to us.
If you have any idea how k3d could do anything here, that'd be cool 👍

leelavg (Author) commented May 11, 2021

I guess that would all be up to the runtime (docker), right?

Ack, now I get that it's not under k3d's purview.

If you have any idea how k3d could do anything here, that'd be cool 👍

I can take your word for it 😀. However, I'm looking at Go-based projects, and if my time permits I'd definitely want to contribute.

Maybe document the leftover mount entries Docker leaves behind when shared mounts are used, and close the issue?

iwilltry42 (Member) commented:

if my time permits I'd definitely want to contribute

That would be great :)

document the leftover mount entries Docker leaves behind when shared mounts are used

Go for it 😉
It seems to be unrelated to k3d, but we could still put something in the FAQ section (even though it's the first time I've seen such an issue) 🤔
