kubelet cannot find device for dir /var/lib/kubelet in cached partitions map #38337

dmrub · 2016-12-08T00:05:09Z

Is this a request for help? (If yes, you should use our troubleshooting guide and community support channels, see http://kubernetes.io/docs/troubleshooting/.):

What keywords did you search in Kubernetes issues before filing this one? (If you have found any duplicates, you should instead reply there.): "btrfs kubelet"

Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT

Kubernetes version (use kubectl version): v1.4.6

Environment:

Cloud provider or hardware configuration: bare metal
OS (e.g. from /etc/os-release): CentOS Linux 7 (Core)
Kernel (e.g. uname -a): Linux vilnus 3.10.0-327.36.3.el7.x86_64 #1 SMP Mon Oct 24 16:09:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
Install tools: Ansible script from https://github.com/kubernetes/contrib.git with custom changes.
Others: Cluster with 4 bare metal nodes, same OS on all nodes

What happened:
kubelet continuously reports the following error message:

Dec 06 01:27:26 vilnus kubelet[2447]: E1206 01:27:26.798004    2447 kubelet.go:2132] Failed to check if disk space is available on the root partition: failed to get fs info for "root": error trying to get filesystem Device for dir /var/lib/kubelet: err: could not find device with major: 0, minor: 37 in cached partitions map

As (I assume) a side effect, the heapster pod fails with the following error message:

2016-12-07T22:48:05.074258000Z E1207 22:48:05.073677       1 summary.go:114] error while getting metrics summary from Kubelet antego(192.168.81.104:10255): request failed - "500 Internal Server Error", response: "Internal Error: failed RootFsInfo: error trying to get filesystem Device for dir /var/lib/kubelet: err: could not find device with major: 0, minor: 36 in cached partitions map"

When kubelet starts there is no device with major:minor ID '0:37':

Dec 06 01:27:21 vilnus kubelet[2447]: I1206 01:27:21.304451    2447 fs.go:117] Filesystem partitions: map[/dev/sda4:{mountpoint:/ major:0 minor:35 fsType:btrfs blockSize:0} /dev/sda2:{mountpoint:/boot major:8 minor:2 fsType:xfs blockSize:0}]

The stat tool reports device 0:37 (i.e. 25h/37d) for /, not 0:35:

[root@vilnus rubinstein]# stat /
  File: ‘/’
  Size: 152       	Blocks: 0          IO Block: 4096   directory
Device: 25h/37d	Inode: 256         Links: 1
Access: (0555/dr-xr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Context: system_u:object_r:root_t:s0
Access: 2016-12-07 23:49:42.368681876 +0100
Modify: 2016-12-06 01:27:50.993617031 +0100
Change: 2016-12-06 01:27:50.993617031 +0100
 Birth: -
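To make the `25h/37d` value easier to compare with the `major:minor` pairs in the kubelet log, here is a minimal Python sketch that splits a device number into its major and minor parts. The helper name `describe_device` is made up for illustration; it only wraps the standard `os.major` and `os.minor` calls and says nothing about how kubelet or cAdvisor compute the value internally.

```python
import os


def describe_device(st_dev: int) -> str:
    """Format a raw device number (as found in st_dev) as 'major:minor'."""
    return f"{os.major(st_dev)}:{os.minor(st_dev)}"


if __name__ == "__main__":
    # 0x25 is the "Device: 25h/37d" value from the stat output above.
    print(describe_device(0x25))
    # Device number of whatever filesystem backs / on the current machine.
    print(describe_device(os.stat("/").st_dev))
```

On the node from this report the first call yields `0:37`, which is the pair kubelet fails to find in its partitions map.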

Here is also /etc/fstab

#
# /etc/fstab
# Created by anaconda on Tue Nov  8 15:25:05 2016
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
UUID=4706dd7f-81aa-4bb6-bcb6-e742aae08456 /                       btrfs   subvol=root     0 0
UUID=72a8c6a4-b442-48ef-a6ad-a8da2146fab6 /boot                   xfs     defaults        0 0
UUID=039B-CD7E          /boot/efi               vfat    umask=0077,shortname=winnt 0 0
UUID=56f454ff-1d8e-4900-b034-1f777df92c30 swap                    swap    defaults        0 0
UUID=4706dd7f-81aa-4bb6-bcb6-e742aae08456 /mnt/disk-4706dd7f-81aa-4bb6-bcb6-e742aae08456 btrfs   subvol=/ 0 0

The same errors appear on all nodes, and all nodes have / mounted on a btrfs partition.

What you expected to happen:

No errors. Kubernetes 1.2.0 installed from the CentOS 7 package had no such errors.

How to reproduce it (as minimally and precisely as possible):

Kubernetes binaries were installed from here :
https://storage.googleapis.com/kubernetes-release/release/v1.4.6/bin/linux/amd64

Run kubelet on btrfs root partition.

Anything else do we need to know:

I would at least like some recommendations on how to figure out what exactly the issue is.

dmrub · 2016-12-08T16:27:14Z

Looking into the /proc/PID/mountinfo file of the kubelet process, I see device 0:35 but no 0:37:

# grep '0:35' /proc/24371/mountinfo
62 1 0:35 /root / rw,relatime shared:1 - btrfs /dev/sda4 rw,seclabel,ssd,space_cache
74 62 0:35 / /mnt/disk-4706dd7f-81aa-4bb6-bcb6-e742aae08456 rw,relatime shared:28 - btrfs /dev/sda4 rw,seclabel,ssd,space_cache

Looks like an issue with btrfs subvolumes related to this:
https://www.spinics.net/lists/linux-btrfs/msg58908.html
https://www.spinics.net/lists/linux-btrfs/msg59039.html
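The mismatch can be reproduced offline with a small Python sketch that parses mountinfo lines and looks a mount up by device number. All names here (`Mount`, `parse_mountinfo`, `find_by_device`, `find_by_path`) are hypothetical, and the path-based lookup is only one possible way around the mismatch; it is not necessarily what any kubelet or cAdvisor patch does.

```python
from typing import List, NamedTuple, Optional


class Mount(NamedTuple):
    major: int
    minor: int
    root: str
    mount_point: str
    fs_type: str
    source: str


def parse_mountinfo(text: str) -> List[Mount]:
    """Parse /proc/<pid>/mountinfo content into Mount records."""
    mounts = []
    for line in text.splitlines():
        if " - " not in line:
            continue  # not a well-formed mountinfo line
        left, right = line.split(" - ", 1)
        pre, post = left.split(), right.split()
        if len(pre) < 5 or len(post) < 2:
            continue
        major, minor = (int(n) for n in pre[2].split(":"))
        mounts.append(Mount(major, minor, pre[3], pre[4], post[0], post[1]))
    return mounts


def find_by_device(mounts: List[Mount], major: int, minor: int) -> List[Mount]:
    """Return every mount whose device number equals major:minor."""
    return [m for m in mounts if (m.major, m.minor) == (major, minor)]


def find_by_path(mounts: List[Mount], path: str) -> Optional[Mount]:
    """Return the mount with the longest mount point that contains path."""
    best, best_len = None, -1
    for m in mounts:
        mp = m.mount_point.rstrip("/") or "/"
        inside = mp == "/" or path == mp or path.startswith(mp + "/")
        if inside and len(mp) > best_len:
            best, best_len = m, len(mp)
    return best


# The two mountinfo lines quoted above.
SAMPLE = """\
62 1 0:35 /root / rw,relatime shared:1 - btrfs /dev/sda4 rw,seclabel,ssd,space_cache
74 62 0:35 / /mnt/disk-4706dd7f-81aa-4bb6-bcb6-e742aae08456 rw,relatime shared:28 - btrfs /dev/sda4 rw,seclabel,ssd,space_cache
"""

if __name__ == "__main__":
    mounts = parse_mountinfo(SAMPLE)
    print(find_by_device(mounts, 0, 37))  # what stat reports for /
    print(find_by_path(mounts, "/var/lib/kubelet"))
```

With the sample lines, a lookup for 0:37 finds nothing, while resolving `/var/lib/kubelet` by mount point still lands on the `/dev/sda4` root mount.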

ligc · 2017-01-13T08:27:37Z

Hi, we ran into this issue on SLES 12 nodes, and the patch dc8b6cc does fix it. Thanks! Is there any plan to integrate the patch into Kubernetes?

dmrub · 2017-01-13T09:41:41Z

I created a pull request, but I still need to sign the CLA.

dmrub · 2017-01-15T14:15:54Z

I created a pull request for cadvisor: google/cadvisor#1574

dashpole · 2017-03-15T15:27:07Z

This will hopefully make it into one of the first few patch releases of 1.6.

mcluseau · 2017-04-07T05:31:01Z

The same issue happens with tmpfs roots:

E0407 05:28:05.147388       1 summary.go:97] error while getting metrics summary from Kubelet 10.109.1.4(10.109.1.4:10255): request failed - "500 Internal Server Error", response: "Internal Error: failed RootFsInfo: error trying to get filesystem Device for dir /var/lib/kubelet: err: could not find device with major: 0, minor: 35 in cached partitions map"

# grep 0:35.*kubelet /proc/self/mountinfo 
462 199 0:35 /var/lib/kubelet /var/lib/kubelet rw,relatime shared:1 - tmpfs tmpfs rw,seclabel

In my case, I use CoreOS's kubelet-wrapper (rkt with the "fly" stage0) and /var/lib/kubelet is bind-mounted rshared.

dashpole · 2017-04-07T15:58:31Z

@MikaelCluseau, your bug looks more similar to #44059, since the major and minor numbers all match up. @dmrub, I would prefer closing this issue, as it was related to btrfs, and moving discussion on @MikaelCluseau's bug to #44059. For anyone experiencing the original problem, @dmrub's solution is included in the v1.5.6 patch release.

mcluseau · 2017-04-08T00:01:54Z

Thanks @dashpole, I'll move there.

This commit fixes the warning messages reported by kubelet when checking for the disk space on a btrfs `/` which has `/var/lib/kubelet` inside of a btrfs sub-volume. This fix follows the same principle adopted to fix issue kubernetes#38337 with commit dc8b6cc. This commit fixes issue 47046. Signed-off-by: Flavio Castelli <fcastelli@suse.com>

naevtamarkus · 2018-11-15T11:00:44Z

Does anybody know if this is still an issue?

zdzichu · 2020-12-16T09:28:33Z

Yes, still happening with v1.19.4+k3s-fadc5a80 on Fedora 33 with btrfs /.

adambkaplan · 2021-01-19T16:46:09Z

It appears there was a regression from v1.18 to v1.19 on systems with btrfs. It appears a fix to cAdvisor was required and will be in k8s 1.21; backports to 1.19 are in progress.
