
Failed to mount sysroot on reboot for nodes with a 'large' disk #2485

Closed

basvdlei opened this issue Jul 31, 2018 · 14 comments

Comments

@basvdlei basvdlei commented Jul 31, 2018

Issue Report

Bug

Container Linux Version

NAME="Container Linux by CoreOS"
ID=coreos
VERSION=1800.5.0
VERSION_ID=1800.5.0
BUILD_ID=2018-07-28-2250
PRETTY_NAME="Container Linux by CoreOS 1800.5.0 (Rhyolite)"
ANSI_COLOR="38;5;75"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://issues.coreos.com"
COREOS_BOARD="amd64-usr"

Environment

VMWare ESXi

Expected Behavior

When rebooting a node with a "large" disk, it should still be able to mount /sysroot.

Actual Behavior

A node with a "large" disk fails to mount sysroot when it's rebooted:

systemd[1]: Mounting /sysroot...
EXT4-fs (sda9): ext4_check_descriptors: Block bitmap for group 0 overlaps block group descriptors
EXT4-fs (sda9): group descriptors corrupted!
mount[419]: mount: /sysroot: mount(2) system call failed: Structure needs cleaning.
systemd[1]: sysroot.mount: Mount process exited, code=exited status=32
Failed to mount /sysroot.

[screenshot: sysroot-mount-failed]

Reproduction Steps

  1. Create a CoreOS node with a 3.91TB disk (have not been able to test other sizes yet)
  2. On the first boot, the root filesystem is resized and mounted correctly
  3. Reboot the machine
  4. The mount of /sysroot fails during boot

Other Information

We first observed this issue when a machine with a 3.91TB disk failed to update from 1745.7.0 to 1800.4.0. Version 1745.7.0 was still able to mount the filesystem, while 1800.4.0 gave the error described above.

It looks like a regression was introduced in kernel 4.14.55 with the ext4 changes (https://lwn.net/Articles/759535/), and from what we could gather this may even be the offending patch: https://patchwork.ozlabs.org/patch/950668/

All of our machines with smaller disks (<500GB) still boot and reboot correctly.

@adarshaj adarshaj commented Jul 31, 2018

We are affected by this too. Our platform is bare metal with a 4TB disk.
I suspected it had something to do with using Rook's CephFS, as we use one of the directories on the root partition for an OSD. We tried running fsck (with the latest e2fsprogs release, built on Jul 10 2018), but fsck reports the disk as okay. I am attaching the rdsosreport.txt in case it's helpful for triaging the issue.

FWIW, I did a fresh installation after running sgdisk -z /dev/sda (zap the GPT), and it failed with the exact same issue, so I'm fairly sure it's something in the new kernel and not the hard disk (also, smartctl reports the disk as healthy). However, after mkfs.ext4 -S /dev/sda9 (WARNING: lost all data), the above error stopped occurring, but the root partition was effectively erased: after running fsck it was left with lots and lots of inodes with invalid metadata.

I tried with 1800.5.0 too, but the same issue persists.
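
For reference, a sketch of the checks described above, with device names taken from this report (whole disk /dev/sda, root partition /dev/sda9). The exact fsck invocation was not given, so the read-only e2fsck variant shown here is an assumption, and the mkfs.ext4 -S step is the last-resort, destructive one that wiped the root partition here:

e2fsck -fn /dev/sda9    # read-only filesystem check; reported the filesystem as clean
smartctl -a /dev/sda    # disk health report; reported the disk as healthy
sgdisk -z /dev/sda      # zap the GPT before a fresh installation
mkfs.ext4 -S /dev/sda9  # rewrite superblock and group descriptors only; last resort, caused the data loss above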

@adarshaj adarshaj commented Jul 31, 2018

This seems to be the fix: torvalds/linux@5012284 (read the commit message for details; the culprit commit it links, torvalds/linux@8844618, matches exactly the behavior reported above in the logs). How can we test this kernel?

@dm0- dm0- commented Jul 31, 2018

I've cherry-picked the upcoming ext4 fixes (including the commit you linked) onto the current stable and produced a test image here: http://builds.developer.core-os.net/boards/amd64-usr/1800.5.0%2Bjenkins2-build-1800%2Blocal-1683/coreos_production_image.bin.bz2

Can you confirm that resolves the issue?

@adarshaj adarshaj commented Jul 31, 2018

Is there a way to test this without nuking the ROOT-labelled partition? (For context, I'm running a node in a Tectonic cluster with CLUO managing the upgrades.)

@dm0- dm0- commented Jul 31, 2018

If the failure is reproducible by just mounting the root partition, you could try booting the ISO or PXE version and mounting the disk manually. That way, nothing will be overwritten on persistent storage.
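
A minimal sketch of that manual test, assuming a live ISO/PXE environment and that the root filesystem is the ROOT-labelled partition (typically /dev/sda9):

sudo mount -o ro /dev/disk/by-label/ROOT /mnt   # on an affected kernel this should fail with the same ext4_check_descriptors error
sudo umount /mnt                                # only needed if the mount unexpectedly succeeds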

@basvdlei basvdlei commented Aug 1, 2018

Even with the test image, I'm still able to reproduce this issue.

I took both a 1800.5.0 image (https://stable.release.core-os.net/amd64-usr/1800.5.0/coreos_production_image.bin.bz2) and the test image from @dm0- above and ran through the following scenario.

  • Convert and resize the raw image to a 4TB qcow2 image:
qemu-img convert -p -O qcow2 coreos_production_image.bin coreos_production_image.qcow2
qemu-img resize coreos_production_image.qcow2 4T
  • Create and boot a KVM VM using this image (one possible invocation is sketched after this list).
  • Let it boot to the login prompt.
  • Trigger a reboot.
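
One possible invocation for the KVM step; the exact command was not given in the report, so the machine sizing and device options here are assumptions:

qemu-system-x86_64 -enable-kvm -m 2048 -smp 2 \
    -drive file=coreos_production_image.qcow2,format=qcow2,if=virtio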

[screenshot: kvm-test]

@basvdlei basvdlei commented Aug 1, 2018

Did a couple more tests with different disk sizes (1TB -> 2TB -> 3TB). The 1TB and 2TB cases worked fine.

The 3TB drive case failed. I also noticed that it displayed an additional log line during the resizing:

EXT4-fs (sda9): Converting file system to meta_bg
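
For reference, one way to check whether the root filesystem carries the meta_bg feature after such a resize (a sketch; assumes the root partition is /dev/sda9 as in the logs above):

sudo dumpe2fs -h /dev/sda9 | grep -i 'filesystem features'   # the feature list includes meta_bg if the conversion happened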

Just to make sure, was this commit included in the test image? torvalds/linux@44de022

@bgilbert bgilbert commented Aug 2, 2018

torvalds/linux@44de022 is not currently in the stable queue for kernels older than 4.17 because of a trivial patch conflict (see e.g. 4.14). I've reproduced the issue on 1800.5.0, and confirmed that the combination of torvalds/linux@44de022 and the other ext4 changes queued for 4.14 fixes the problem.

@bgilbert bgilbert commented Aug 2, 2018

Backport posted to stable@.

@adarshaj adarshaj commented Aug 2, 2018

So we should wait until a new point release reaches the stable channel (https://coreos.com/releases/) before upgrading, right?

@bgilbert bgilbert commented Aug 2, 2018

@adarshaj Correct. We'll backport the fix to the existing release branches.
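
For nodes that shouldn't wait for the regular rollout once the release lands, something like the following should trigger an immediate update check (assuming the standard update_engine_client tool on Container Linux):

update_engine_client -check_for_update   # ask update_engine to look for the new release now
update_engine_client -status             # watch the download/apply progress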

@basvdlei basvdlei commented Aug 8, 2018

Thanks! Release 1800.6.0 solves this issue for us. We successfully updated our 4TB nodes.

@adarshaj adarshaj commented Aug 9, 2018

I can confirm too: all the bare-metal instances with 4TB disks have successfully upgraded, and the hard disks are getting mounted without any issues. Thanks!

I guess this issue can be closed now.

@bgilbert bgilbert commented Aug 13, 2018

Fixed in alpha 1855.1.0, beta 1828.3.0, and stable 1800.6.0, and upstream in kernel 4.14.62. Thanks for reporting.

@bgilbert bgilbert closed this Aug 13, 2018