Thrown into BusyBox on bootup when ZFS is root fs (0.6.5.6-1 Debian Jessie release) #200

Closed
azeemism opened this issue Apr 6, 2016 · 28 comments



azeemism commented Apr 6, 2016

This was not the case with the 0.6.5.2-2 Debian Jessie release.

@FransUrbo, could this be related to the commit for openzfs/zfs#4474? Note that under 0.6.5.2-2 I also had no trouble loading /usr, /var, /var/log, etc. from separate ZFS datasets (openzfs/zfs#4474 (comment)).

On every startup/reboot I am dropped into BusyBox and see the following:

Prompt 1:

/# mount -o zfsutil -t zfs rpool/ROOT/debian-8 /root
/# exit

Prompt 2:

/# mount -o zfsutil -t zfs /root

Prompt 3:

/# exit

Jessie appears to load normally after this point:

root@vbox1:~# dmesg | grep ZFS
[    0.000000] Command line: BOOT_IMAGE=/vmlinuz-3.16.0-4-amd64 root=ZFS=rpool/ROOT/debian-8 ro boot=zfs boot=zfs rpool=rpool bootfs=rpool/ROOT/debian-8 quiet
[    0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-3.16.0-4-amd64 root=ZFS=rpool/ROOT/debian-8 ro boot=zfs boot=zfs rpool=rpool bootfs=rpool/ROOT/debian-8 quiet
[    4.272120] ZFS: Loaded module v0.6.5.6-1, ZFS pool version 5000, ZFS filesystem version 5
root@vbox1:~#
root@vbox1:~# zpool status
  pool: rpool
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
        still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(5) for details.
  scan: none requested
config:

        NAME                                       STATE     READ WRITE CKSUM
        rpool                                      ONLINE       0     0     0
          mirror-0                                 ONLINE       0     0     0
            ata-VBOX_HARDDISK_VBbbb2e13d-f1007fb3  ONLINE       0     0     0
            ata-VBOX_HARDDISK_VB44950065-9dadc8f2  ONLINE       0     0     0

errors: No known data errors
root@vbox1:~# zfs mount
rpool/ROOT/debian-8             /
root@vbox1:~# mount
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
udev on /dev type devtmpfs (rw,relatime,size=10240k,nr_inodes=504864,mode=755)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,relatime,size=811672k,mode=755)
rpool/ROOT/debian-8 on / type zfs (rw,relatime,xattr,noacl)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/lib/systemd/systemd-cgroups-agent,name=systemd)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=22,pgrp=1,timeout=300,minproto=5,maxproto=5,direct)
mqueue on /dev/mqueue type mqueue (rw,relatime)
debugfs on /sys/kernel/debug type debugfs (rw,relatime)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime)
/dev/sda4 on /boot type ext4 (rw,noatime,data=ordered)
/dev/sda1 on /boot/efi type vfat (rw,relatime,fmask=0077,dmask=0077,codepage=437,iocharset=utf8,shortname=mixed,errors=remount-ro)
rpc_pipefs on /run/rpc_pipefs type rpc_pipefs (rw,relatime)
none on /media/sf_deb8 type vboxsf (rw,nodev,relatime)
root@vbox1:~#

Note: /boot is on ext4 and both /boot and /boot/efi are on a different disk

root@vbox1:~# zdb
rpool:
    version: 5000
    name: 'rpool'
    state: 0
    txg: 115
    pool_guid: 2570617493699306475
    errata: 0
    hostid: 8323329
    hostname: 'vbox1'
    vdev_children: 1
    vdev_tree:
        type: 'root'
        id: 0
        guid: 2570617493699306475
        children[0]:
            type: 'mirror'
            id: 0
            guid: 6773440328371405558
            metaslab_array: 34
            metaslab_shift: 34
            ashift: 13
            asize: 2199007002624
            is_log: 0
            create_txg: 4
            children[0]:
                type: 'disk'
                id: 0
                guid: 1345902431162178517
                path: '/dev/disk/by-id/ata-VBOX_HARDDISK_VBbbb2e13d-f1007fb3-part1'
                whole_disk: 1
                create_txg: 4
            children[1]:
                type: 'disk'
                id: 1
                guid: 7771114504148432262
                path: '/dev/disk/by-id/ata-VBOX_HARDDISK_VB44950065-9dadc8f2-part1'
                whole_disk: 1
                create_txg: 4
    features_for_read:

STEPS to reproduce the issue:

see: http://zfsonlinux.org/debian.html

Install dependencies:

# apt-get install build-essential gawk alien fakeroot linux-headers-$(uname -r)
# apt-get install zlib1g-dev uuid-dev libblkid-dev parted lsscsi wget
# apt-get install git patch automake autoconf libtool init-system-helpers
# apt-get install libselinux1-dev

Then:

# apt-get install libnvpair1 libuutil1 libzfs2 libzpool2 spl zfsutils dkms spl-dkms zfs-dkms zfs-initramfs zfsonlinux
# apt-get update
# apt-get upgrade
# apt-get install debian-zfs

Prevent disks from being imported as sdX; keep imports by-id:

# nano /etc/default/zfs
ZPOOL_IMPORT_PATH="/dev/disk/by-id"

Then remake initramfs:

# update-initramfs -c -k all
# update-grub
# reboot

Create rpool and other pools:

# zpool create -f -m none \
-o ashift=13 \
-o autoexpand=on \
-o autoreplace=on \
-o feature@lz4_compress=enabled \
-O compression=lz4 \
-O sync=disabled \
-O atime=off \
-O xattr=sa \
-O com.sun:auto-snapshot=true \
-O canmount=off \
-O overlay=on \
rpool \
mirror /dev/disk/by-id/ata-VBOX_HARDDISK_VBbbb2e13d-f1007fb3 /dev/disk/by-id/ata-VBOX_HARDDISK_VB44950065-9dadc8f2
# zpool export rpool
# zpool import -d /dev/disk/by-id -N rpool
# zfs create rpool/ROOT
# zfs set canmount=off rpool/ROOT
# zfs create rpool/ROOT/debian-8
# zfs set canmount=on rpool/ROOT/debian-8

Move files to rpool:

# zpool export rpool
# zpool import -f -d /dev/disk/by-id -o altroot=/sysroot -N rpool
# zfs set mountpoint=/ rpool/ROOT/debian-8
# modprobe efivars
# grub-probe /sysroot
# rsync -axv / /sysroot/

Recreate the zpool.cache file, which goes missing when altroot is used:

# zpool set cachefile=/sysroot/etc/zfs/zpool.cache rpool
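
To sanity-check that the cache file was actually created under the new root (a trivial verification, using the path from the command above):

# ls -l /sysroot/etc/zfs/zpool.cache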

Set up /sysroot:

# mount --bind /dev /sysroot/dev
# chroot /sysroot /bin/bash
# mount -t proc proc /proc
# mount -t sysfs sysfs /sys
# mount /boot
# mount /boot/efi
# grub-probe / 

Comment out the / entry in /etc/fstab.

Under 0.6.5.2-2 there is no need to add rpool/ROOT/debian-8 / zfs default 0 0.
Whether or not the line above is added to /etc/fstab, the same issue occurs, even when using zfs set mountpoint=legacy rpool/ROOT/debian-8.
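
For reference, the commented-out root entry in /etc/fstab is simply the line above with a leading # (shown as written in this report; note that fstab normally spells the mount option defaults):

# rpool/ROOT/debian-8 / zfs default 0 0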

Update initramfs:

# cd /boot
# mv initrd.img-3.16.0-4-amd64 initrd.img-3.16.0-4-amd64.old-pre.zfs
# update-initramfs -c -k all

Update grub and exit chroot:

# nano /etc/default/grub
GRUB_CMDLINE_LINUX="boot=zfs rpool=rpool bootfs=rpool/ROOT/debian-8"
# zpool set bootfs=rpool/ROOT/debian-8 rpool
# update-grub
# cd /
# umount /boot/efi
# umount /boot
# umount /sys
# umount /proc
# umount /dev
# exit
# zpool export rpool
# reboot

lnxsrt commented Apr 6, 2016

Also seeing this with the new version.

@FransUrbo (Contributor) commented:

I'm currently looking at this.

Try booting with zfsdebug=1 and let me know what you see.

Also, in the first shell you get, please run zpool status to verify that the pool is actually imported correctly.

Btw, openzfs/zfs#4474 is NOT related. That happens way, way later in the whole startup procedure. You're getting a problem in /usr/share/initramfs-tools/scripts/zfs, which is copied onto the initrd and then used during boot (finding and importing the pool, the root fs, etc.).
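
A quick way to confirm which copy of that script ended up in the initrd (hedged: lsinitramfs ships with initramfs-tools on Debian, and the initrd filename depends on the installed kernel):

# lsinitramfs /boot/initrd.img-$(uname -r) | grep zfs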


arcenik commented Apr 6, 2016

Issue 4474 is about systemd, but this problem occurs before systemd, in the initramfs.

It appears that ZFS_BOOTFS is cleared after the pool import in the script /usr/share/initramfs-tools/scripts/zfs (line 278):

(attachment: zfs-initramfs-import-pool)

Fix: comment out the line and regenerate the initramfs:

# update-initramfs -u
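
For reference, based on the code FransUrbo quotes further down in this thread (a sketch, not the verbatim script), the commented-out assignment around line 278 would look roughly like this:

        # Import the pool (if not already done so in the AUTO check above).
        if [ -n "${ZFS_RPOOL}" -a -z "${POOL_IMPORTED}" ]
        then
                import_pool "${ZFS_RPOOL}"
                # ZFS_BOOTFS="$(find_rootfs "${pool}")"  # commented out so an empty result can no longer clobber ZFS_BOOTFS from the kernel command line
        fi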

@FransUrbo (Contributor) commented:

Better yet, change the "${pool}" to "${ZFS_RPOOL}".

@arcenik Does that work?


arcenik commented Apr 6, 2016

@FransUrbo: that does not work; the find_rootfs function still returns nothing.


arcenik commented Apr 6, 2016

@FransUrbo: the bootfs value for my zfs pool is "-" (which is the default value). Therefore find_rootfs prints nothing and returns 1.

So the initramfs script should change ZFS_BOOTFS only if find_rootfs returns 0:

find_rootfs ${ZFS_RPOOL} && ZFS_BOOTFS="$(find_rootfs ${ZFS_RPOOL})"
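
For context, the property can be checked and set with standard zpool commands (pool and dataset names as in the reproduction steps above; on a pool where bootfs was never set, the reported value is "-"):

# zpool get bootfs rpool
# zpool set bootfs=rpool/ROOT/debian-8 rpool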

@FransUrbo (Contributor) commented:

@arcenik Ok, thanx. That makes sense.

@FransUrbo (Contributor) commented:

How's this:

        # Import the pool (if not already done so in the AUTO check above).
        if [ -n "${ZFS_RPOOL}" -a -z "${POOL_IMPORTED}" ]
        then
                if import_pool "${ZFS_RPOOL}"; then
                        root_fs="$(find_rootfs "${ZFS_RPOOL}")"
                        [ -n "${root_fs}" ] && ZFS_BOOTFS="${root_fs}"
                fi
        fi

@Fabian-Gruenbichler commented:

@FransUrbo this last variant fixes this issue for me!

@FransUrbo (Contributor) commented:

@Fabian-Gruenbichler, perfect, thanx! I'll include that in the next update. I'll hold off a little to make sure there aren't any more issues lingering, and then I'll build new packages for both Wheezy and Jessie.

Expect them in five to six hours (with any luck).


arcenik commented Apr 6, 2016

@FransUrbo instead of checking the content printed by find_rootfs, you should check its return code:

root_fs="$(find_rootfs "${ZFS_RPOOL}")"
$? && ZFS_BOOTFS="${root_fs}"

@FransUrbo (Contributor) commented:

@arcenik Even better, thanx!

@FransUrbo (Contributor) commented:

@arcenik Actually, that doesn't seem to work:

[celia.pts/6]$ var="$(file /etc/passwd > /dev/null 2>&1)"
[celia.pts/6]$ [ "$?" ] && echo whatever
whatever
[celia.pts/6]$ var="$(file /etc/passwdX > /dev/null 2>&1)"
[celia.pts/6]$ [ "$?" ] && echo whatever
whatever
[celia.pts/6]$ var="$(file /etc/passwdX > /dev/null 2>&1)"
[celia.pts/6]$ "$?" && echo whatever
bash: 1: command not found

However, this works:

[celia.pts/6]$ if var="$(file /etc/passwdX > /dev/null 2>&1)"; then echo whatever; fi
[celia.pts/6]$ if var="$(file /etc/passwd > /dev/null 2>&1)"; then echo whatever; fi
whatever

@FransUrbo (Contributor) commented:

So how's this:

        # Import the pool (if not already done so in the AUTO check above).
        if [ -n "${ZFS_RPOOL}" -a -z "${POOL_IMPORTED}" ]
        then
                if import_pool "${ZFS_RPOOL}"; then
                        if root_fs="$(find_rootfs "${ZFS_RPOOL}")"; then
                                ZFS_BOOTFS="${root_fs}"
                        fi
                fi
        fi

@FransUrbo (Contributor) commented:

Or possibly even better:

        # Import the pool (if not already done so in the AUTO check above).
        if [ -n "${ZFS_RPOOL}" -a -z "${POOL_IMPORTED}" ]
        then
                import_pool "${ZFS_RPOOL}" && \
                        root_fs="$(find_rootfs "${ZFS_RPOOL}")" && \
                                ZFS_BOOTFS="${root_fs}"
        fi

@FransUrbo (Contributor) commented:

This problem also exists a few lines above:

                OLD_IFS="${IFS}" ; IFS=";"
                for pool in ${POOLS}
                do
                        [ -z "${pool}" ] && continue

                        import_pool "${pool}"
                        ZFS_BOOTFS="$(find_rootfs "${pool}")"
                done
                IFS="${OLD_IFS}"

I'm suggesting:

                OLD_IFS="${IFS}" ; IFS=";"
                for pool in ${POOLS}
                do
                        [ -z "${pool}" ] && continue

                        import_pool "${pool}" && \
                                root_fs="$(find_rootfs "${pool}")" && \
                                        ZFS_BOOTFS="${root_fs}"
                        [ -n "${ZFS_BOOTFS} ] && break
                done
                IFS="${OLD_IFS}"

FransUrbo added a commit that referenced this issue Apr 6, 2016

arcenik commented Apr 6, 2016

@FransUrbo

Here is a working example for bash:

$  a=$(true); [ $? -ne 0 ] && echo aaaaaaaaaaaa
$  a=$(false); [ $? -ne 0 ] && echo aaaaaaaaaaaa
aaaaaaaaaaaa

It also works with busybox sh, but not in the initramfs.

And the working script for initramfs:

(attachment: zfs-initramfs-import-pool2)

@FransUrbo (Contributor) commented:

It still shouldn't do it if the import isn't successful, which my last examples fix.


arcenik commented Apr 6, 2016

@FransUrbo: I'm trying your last proposal, but ${POOLS} is empty on my test system.

FransUrbo added a commit that referenced this issue Apr 6, 2016
@FransUrbo (Contributor) commented:

It usually is... It's only in a few rare cases, when people have a root pool and a data pool (etc.), that it is used. And only when you specify boot=zfs zfs:AUTO and NOTHING else!
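
Purely for illustration (inferred from the wording above and from the GRUB_CMDLINE_LINUX example in the reproduction steps, so treat the exact syntax as an assumption, not a confirmed format), the kind of kernel command line that exercises the ${POOLS} loop would be:

GRUB_CMDLINE_LINUX="boot=zfs zfs:AUTO"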

@FransUrbo (Contributor) commented:

Do note that there are TWO different places where almost the same code is used, and I showed solutions for both of them. Be sure to apply the correct code to the correct place.

@FransUrbo (Contributor) commented:

Closing this as fixed in 0.6.5.6-2, which is on its way up to the repo now.


arcenik commented Apr 6, 2016

Did you really test it??

(attachment: zfs-boot-failed)

@FransUrbo (Contributor) commented:

Did you really test it??

No. I don't have any machine with root on zfs available at the moment.

I'll issue new debs right away.


arcenik commented Apr 6, 2016

You could try this: https://github.com/arcenik/debian-zfs-root ;-)

@FransUrbo (Contributor) commented:

You could try this: https://github.com/arcenik/debian-zfs-root ;-)

About a year or so ago, I created a native Debian GNU/Linux ISO image with native ZFS support.

I usually use that, but I don't have time to set anything up now. I might even have the original VMs around, but I don't have time to find them either :).

Sorry about the mess-up.

@FransUrbo (Contributor) commented:

0.6.5.6-3 just pushed to the repo.


arcenik commented Apr 6, 2016

Version 0.6.5.6-3 is working with my install script. Debian with a ZFS root boots without problems.
