Booting into root filesystem on NFS fails with recent kernels #860

Closed
gber opened this issue Aug 17, 2023 · 15 comments

@gber
Contributor

gber commented Aug 17, 2023

The boot process fails with the following error:

Running: mount -t overlay -o upperdir=/run/initramfs/ltsp/0/up,lowerdir=/root,workdir=/run/initramfs/ltsp/0/work /run/initramfs/ltsp /root/
[   12.587974] overlayfs: failed to retrieve lower fileattr (sbin/init, err=-6)
mv: can't rename '/root/sbin/init': No such device or address
LTSP command failed: mv /root/sbin/init /root/sbin/init.ltsp
Aborting ltsp
LTSP boot error! Enable DEBUG_SHELL to troubleshoot!

This happens in 55-initrd-bottom.sh:76 when trying to temporarily rename /sbin/init.

The underlying cause is a kernel issue: rename(2) of a symlink on an overlayfs whose lower filesystem is NFS always fails with ENXIO.

This issue has been reproduced on kernels 5.15.0 and 6.1.0; on 5.10.0 and earlier it does not happen.
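For illustration, one possible way to avoid depending on rename(2) is to fall back to copying the symlink and removing the original when `mv` fails. This is only a sketch: `rename_or_copy` is a hypothetical helper, not a function from LTSP, and it is not necessarily what the referenced commit does. It also assumes that creating and removing entries on the overlay still works when the rename does not.

```shell
#!/bin/sh
# Hypothetical helper (not part of LTSP): rename SRC to DST and, if the
# rename fails (e.g. ENXIO on overlayfs over NFS), fall back to
# copy + delete.
rename_or_copy() {
    src=$1
    dst=$2
    # First try a plain rename.
    if mv "$src" "$dst" 2>/dev/null; then
        return 0
    fi
    # Fallback: cp -a copies a symlink as a symlink (it does not follow it),
    # then the original name is removed.
    cp -a "$src" "$dst" && rm -f "$src"
}

# Demonstration in a scratch directory. This is an ordinary filesystem,
# so the plain rename is expected to succeed here.
demo_dir=$(mktemp -d)
ln -s /lib/systemd/systemd "$demo_dir/init"
rename_or_copy "$demo_dir/init" "$demo_dir/init.ltsp"
ls -l "$demo_dir"
```

Unlike a rename, the fallback is not atomic, which should be acceptable in an initramfs where nothing else is touching the file at that point.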

See also:

gber pushed a commit to gber/ltsp that referenced this issue Aug 17, 2023
…ue (ltsp#860)

This affects at least Debian bookworm with Linux 6.1.0.
Closes #1049397
@alkisg
Member

alkisg commented Aug 17, 2023

Thank you gber!
Closing the issue as "merged upstream", but let's continue chatting about whatever else is needed.

@vagrantc, @sunweaver, how can we get it to the next Bookworm point release?

@alkisg alkisg closed this as completed Aug 17, 2023
@sunweaver
Contributor

@alkisg, @vagrantc : If you are ok with it, I can do an upload to Debian unstable as 23.02-2 (simply with this patch applied) and then do a follow-up upload to bookworm as 23.02-1+deb12u1 also only with this single patch applied.

I can push changes to a new debian/sid branch and the already existing debian/bookworm branch (of this repository) if wanted/agreed/needed.

@alkisg
Member

alkisg commented Aug 17, 2023

@sunweaver let's wait for a day to give vagrantc a chance to reply if he wants to do the upload.
Otherwise go ahead, I'm fine with it, and thank you.

@sunweaver
Contributor

@alkisg ack. Will continue on this tomorrow.

@alkisg
Member

alkisg commented Aug 19, 2023

@sunweaver OK thanks, go ahead now, no point in waiting more. Cheers!

@sunweaver
Contributor

sunweaver commented Aug 19, 2023

@alkisg @vagrantc @gber I have now uploaded ltsp 23.02-2 to Debian unstable and 23.02-1+deb12u1 to Debian bookworm (this will need some time until it appears in bookworm-proposed-updates).

I have a recent copy of the ltsp Git repo under https://github.com/sunweaver/ltsp/. From there, please pull over the debian/sid branch, the debian/bookworm branch and the tags I added for the uploads.

The bookworm-pu acceptance request has also been sent to Debian BTS:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1050090

Once the Debian release team has accepted the upload, we can consider this case closed. Thanks for the quick response and cooperation.

@sunweaver
Contributor

sunweaver commented Aug 23, 2023

@alkisg @vagrantc @gber I realized yesterday that booting into a bookworm rootfs via NFS still fails if the LTSP host runs Debian 11. As the ltsp initrd is provided by the LTSP host and is not part of the rootfs, we need to provide @gber's fix in Debian 11 as well. I did an upload of 22.01-1+deb11u1 for this yesterday.

Please pull over my debian/bullseye branch and the related tag from https://github.com/sunweaver/ltsp so that this repo has that change as well.

Thanks+Greets!
Mike

@sunweaver
Contributor

sunweaver commented Aug 23, 2023

@alkisg also, I see that you haven't pulled over debian/sid and debian/bookworm (+ tags) from my repo, yet. Please make sure you pull those changes over into this repo, so everything here is up-to-date.

@alkisg
Member

alkisg commented Aug 23, 2023

@sunweaver I only work on the main branch, while the debian/bullseye and debian/bookworm branches are managed by vagrantc, I don't touch them at all!

Let's leave it up to him to act on them when he's available; they don't block/affect the Debian upload process, right?

@sunweaver
Contributor

sunweaver commented Aug 23, 2023

> @sunweaver I only work on the main branch, while the debian/bullseye and debian/bookworm branches are managed by vagrantc, I don't touch them at all!

My worry here is that @vagrantc might forget to sync those branches and continue working on them with other stuff. As those are the packages that will land in Debian bullseye / bookworm (likely; the release team still needs to do its review), the changes should end up here at the latest when the uploads have been accepted by the RT.

> Let's leave it up to him to act on them when he's available; they don't block/affect the Debian upload process, right?

The upload process is not affected. However, I'd like to avoid what is in Debian getting out of sync with what is on those debian/* branches in this Git repo.

@vagrantc
Collaborator

I've been watching but playing catchup after a few days offline ... will take a look at the branches now...

@vagrantc
Collaborator

Thanks for addressing the issue!

I am somewhat surprised it was not noticed for the entirety of the bullseye and bookworm release cycles ... ? Was it triggered by a backported linux kernel patch, maybe?

I've pushed a debian/trixie branch (based on @sunweaver debian/sid branch), a debian/bullseye branch and a debian/bookworm branch. Typically, sid uploads come right out of upstream, but if we need anything in debian/patches, I prefer to use a branch name matching the targeted release (e.g. trixie in this case, even though the upload goes through unstable).

Somewhat minor, but on all three branches, I removed the debian/patches/README file, which was not really appropriate for a Non-Maintainer Upload (NMU), as it does not describe the maintainer workflow with patches...

I had at some point entirely given up on using NFS directly, as it seemed to introduce many quirky problems like this in the past...

@sunweaver
Contributor

@vagrantc In bullseye the issue only pops up on bullseye LTSP hosts/servers serving NFS rootfs based on bookworm.

On a bookworm-only setup, this should indeed have popped up. The cause lies in the Linux kernel and was introduced with 5.15.
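As a rough aid for checking which clients fall on the affected side of that boundary, a kernel release string can be compared against 5.15 in plain shell. This is a minimal sketch: `kernel_at_least_5_15` is a hypothetical helper, and the 5.15 threshold is taken from the statement above rather than from a bisected kernel commit.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the kernel release given in $1
# (e.g. "6.1.0-11-amd64") is version 5.15 or later.
kernel_at_least_5_15() {
    release=${1%%-*}      # drop a distro suffix such as "-11-amd64"
    major=${release%%.*}  # text before the first dot
    rest=${release#*.}    # text after the first dot
    minor=${rest%%.*}     # second version component
    [ "$major" -gt 5 ] || { [ "$major" -eq 5 ] && [ "$minor" -ge 15 ]; }
}

# Report on the running kernel.
if kernel_at_least_5_15 "$(uname -r)"; then
    echo "kernel $(uname -r): 5.15 or later, may be affected"
else
    echo "kernel $(uname -r): older than 5.15"
fi
```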

Thanks for updating the branch and sorry for d/p/README.

Personally, I use rootfs on NFS for developing and testing the LTSP image; once that has been consolidated, I serve squashfs images via NFS.

@alkisg
Member

alkisg commented Aug 24, 2023

@sunweaver ltsp.img only updates the target files if they're newer.
So bullseye server + bookworm chroot shouldn't be affected unless the server has local modifications, right?

$ grep 'cp -au' 55-initrd-bottom.sh 
    rsr cp -au /usr/share/ltsp "$rootmnt/usr/share/" ||
    rsr cp -au /etc/ltsp "$rootmnt/etc/" ||
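The "only if newer" behaviour comes from the `-u` flag of `cp`. A small sketch in a scratch directory shows it; the `server` and `chroot` directories and `example.conf` are hypothetical stand-ins, not real LTSP paths, and GNU coreutils `cp` is assumed.

```shell
#!/bin/sh
# Scratch directories standing in for the initrd copy ("server") and the
# target root filesystem ("chroot"); both names are illustrative only.
work=$(mktemp -d)
mkdir "$work/server" "$work/chroot"
echo "server version" > "$work/server/example.conf"
echo "chroot version" > "$work/chroot/example.conf"

# Case 1: the chroot file is newer, so cp -u leaves it alone.
touch -t 202101010000 "$work/server/example.conf"
touch -t 202301010000 "$work/chroot/example.conf"
cp -au "$work/server/example.conf" "$work/chroot/example.conf"
first=$(cat "$work/chroot/example.conf")
echo "chroot newer: $first"

# Case 2: the server file is newer, so cp -u overwrites the chroot file.
touch -t 202401010000 "$work/server/example.conf"
cp -au "$work/server/example.conf" "$work/chroot/example.conf"
second=$(cat "$work/chroot/example.conf")
echo "server newer: $second"
```

With GNU `cp`, the first case should keep "chroot version" and the second should replace it with "server version".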

@sunweaver
Contributor

sunweaver commented Aug 24, 2023

> @sunweaver ltsp.img only updates the target files if they're newer. So bullseye server + bookworm chroot shouldn't be affected unless the server has local modifications, right?
>
> $ grep 'cp -au' 55-initrd-bottom.sh
>     rsr cp -au /usr/share/ltsp "$rootmnt/usr/share/" ||
>     rsr cp -au /etc/ltsp "$rootmnt/etc/" ||

Hmmm... Interesting... In practice, the boot of a bookworm chroot from a bullseye LTSP server failed. I applied @gber's patch on top of 21.01-1 on the LTSP server, updated the ltsp initrd, and then booting the bookworm chroot worked.

(side note:) We are booting bullseye chroots from bullseye LTSP server all the time, so that combination works anyway.
