OverlayFS broken on NixOS Kernel 4.19 and 4.20 #54509
Comments
Hmm, this is a blocker for 19.03, can't have either side of the current situation while keeping the latest LTS kernel. Thanks for the (wip) test, I bet it'll be useful to check for the same kind of regression. cc @lheckemann to keep in mind. @aszlig was the issue we were working around reported upstream? I guess it hasn't been fixed since the patch (a partial revert) still applies. |
I'm trying to debug the issue but I'm confused right now. I tried to run the tests without the patches in #52942 but I then get the error:
I'm not sure this is actually the issue #52942 is working around? I was first suspecting it has something to do with the interaction between
It runs without issue using the same Any ideas what is different between running a test VM and my second directly built VM? |
IIRC, yes, |
Ok but I still don't understand why it works in the non test case VM. |
It does for me, from your commit, reverted the patch. (Also added a workaround for the gdk_pixbuf mismatch between 18.09 and unstable #54278.)
|
For what it's worth, I can confirm the issue. Pretty much all my docker containers are broken after the recent channel update to 19.03pre166987.bc41317e243 a.k.a. nixos-unstable. |
Hmm, looks like we're between a rock and a hard place, something changed in the kernel breaking our tests (not sure if it's userland that broke) and the revert to make the tests pass breaks overlayfs. The simple solution, and what probably will happen for 19.03 unless something is figured out, is to revert back to the previous LTS as default, but this is not an ideal solution. |
Should we revert #52942? Maybe there's a better way to fix the test failures. |
The report has been posted to |
Yes, please. As much as it sucks, a situation where |
I think reverting #52942 and going back to 4.14 as the default would be the most reasonable thing to do until the issue is resolved. |
Whatever we do, we should do it soon, because the current situation is really bad and IMHO we shouldn't have kept |
I agree with the suggestion to revert #52942 and go back to 4.14. Then we can see about alternative solutions to the test failures with 4.19. |
Hm, it seems my mail (
(This was on Jan 29 20:52:53 CET) @szmi: Is the edit: Just sent it again without attachments, let's hope this was the culprit. |
This reverts commit de86af4. (Manual revert due to conflicts.) See NixOS#54509 The patch is causing overlayfs to misbehave.
This reverts commit b861ebb. The current issues (See NixOS#54509 and NixOS#48828) are causing headaches to users of the unstable branches.
Okay, worked without attachments: https://www.spinics.net/lists/linux-unionfs/msg06627.html |
There are a bunch of overlay fixes in 4.20.7, can you try the latest kernels? |
Same fixes with 4.19 (which is desired), looking this evening. |
:( |
WTF |
In case you didn't realise, this is the same initial issue which caused us to add a patch reverting part of the changes to make the tests pass. This is the subject of the thread. So it looks like there were other changes in passing, unrelated to our problem. |
Is somebody able to recreate, in a simplified version, the setup as it is during the startup of the NixOS test VM? One that would allow access to the shell right before the |
For a new nixos-install of 18.09 I’m seeing something similar, with the main difference that after switch_root it says “No such file or directory”. Is this the same issue? |
@tbenst this particular issue shouldn't affect 18.09, and what you are experiencing shouldn't be this issue considering (1) the faulty revert patch was reverted (2) the current issue affects the testing infra. (To be fair, your comment could be about the testing infra, it's unspecified if it's your system boot or a VM.) Though, I'm not downplaying whatever issue you're facing! In fact, let's try and figure out what's going on. Ideally, open an issue with the following information (and mention me and this issue in the body).
|
Thanks, @samueldr for the very kind comment! I figured out the issue thankfully--as soon as I enabled systemd-boot everything worked. |
Just to keep everyone outside of #nixos-dev up to date, we do have a fix for this. Here is the upstream post and patch: https://www.spinics.net/lists/linux-unionfs/msg06733.html |
In Linux 4.19 there has been a major rework of the overlayfs implementation and it now opens files in lowerdir with O_NOATIME, which in turn caused issues in our VM tests because the process owner of QEMU doesn't match the file owner of the lowerdir. The crux here is that 9p propagates the O_NOATIME flag to the host and the guest kernel has no way of verifying whether that flag will lead to any problems beforehand. There is ongoing work to possibly fix this in the kernel, but it will take a while until there is a working patch and consensus. So in order to bring our default kernel back to 4.19 and of course make it possible to run newer kernels in VM tests, I'm merging a small QEMU patch as an interim solution, which we can drop once we have a working fix in the next round of stable kernels. Now we already had Linux 4.19 set as the default kernel, but that was subsequently reverted in 048c36c because the patch we have used was the revert of the commit I bisected a while ago. This patch broke overlayfs in other ways, so I'm also merging in a VM test by @bachp, which only tests whether overlayfs is working, just to be on the safe side that something like this won't happen in the future. Even though this change could be considered a moderate mass-rebuild at least for GNU/Linux, I'm merging this to master, mainly to give us some time to get it into the current 19.03 release branch (and subsequent testing window) once we got no new breaking builds from Hydra. Cc: @samueldr, @lheckemann Fixes: #54509 Fixes: #48828 Merges: #57641 Merges: #54508
In Linux 4.19 there has been a major rework of the overlayfs implementation and it now opens files in lowerdir with O_NOATIME, which in turn caused issues in our VM tests because the process owner of QEMU doesn't match the file owner of the lowerdir. The crux here is that 9p propagates the O_NOATIME flag to the host and the guest kernel has no way of verifying whether that flag will lead to any problems beforehand. There is ongoing work to possibly fix this in the kernel, but it will take a while until there is a working patch and consensus. So in order to bring our default kernel back to 4.19 and of course make it possible to run newer kernels in VM tests, I'm merging a small QEMU patch as an interim solution, which we can drop once we have a working fix in the next round of stable kernels. Now we already had Linux 4.19 set as the default kernel, but that was subsequently reverted in 048c36c because the patch we have used was the revert of the commit I bisected a while ago. This patch broke overlayfs in other ways, so I'm also merging in a VM test by @bachp, which only tests whether overlayfs is working, just to be on the safe side that something like this won't happen in the future. Even though this change could be considered a moderate mass-rebuild at least for GNU/Linux, I'm merging this to master, mainly to give us some time to get it into the current 19.03 release branch (and subsequent testing window) once we got no new breaking builds from Hydra. Cc: @samueldr, @lheckemann Fixes: #54509 Fixes: #48828 Merges: #57641 Merges: #54508 (cherry picked from commit 12efcc2)
QEMU's local 9pfs server passes through O_NOATIME from the client. If the QEMU process doesn't have permissions to use O_NOATIME (namely, it does not own the file nor have the CAP_FOWNER capability), the open will fail. This causes issues when from the client's point of view, it believes it has permissions to use O_NOATIME (e.g., a process running as root in the virtual machine). Additionally, overlayfs on Linux opens files on the lower layer using O_NOATIME, so in this case a 9pfs mount can't be used as a lower layer for overlayfs (cf. https://github.com/osandov/drgn/blob/dabfe1971951701da13863dbe6d8a1d172ad9650/vmtest/onoatimehack.c and NixOS/nixpkgs#54509). Luckily, O_NOATIME is effectively a hint, and is often ignored by, e.g., network filesystems. open(2) notes that O_NOATIME "may not be effective on all filesystems. One example is NFS, where the server maintains the access time." This means that we can honor it when possible but fall back to ignoring it. Acked-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Message-Id: <e9bee604e8df528584693a4ec474ded6295ce8ad.1587149256.git.osandov@fb.com> Signed-off-by: Greg Kurz <groug@kaod.org>
QEMU's local 9pfs server passes through O_NOATIME from the client. If the QEMU process doesn't have permissions to use O_NOATIME (namely, it does not own the file nor have the CAP_FOWNER capability), the open will fail. This causes issues when from the client's point of view, it believes it has permissions to use O_NOATIME (e.g., a process running as root in the virtual machine). Additionally, overlayfs on Linux opens files on the lower layer using O_NOATIME, so in this case a 9pfs mount can't be used as a lower layer for overlayfs (cf. https://github.com/osandov/drgn/blob/dabfe1971951701da13863dbe6d8a1d172ad9650/vmtest/onoatimehack.c and NixOS/nixpkgs#54509). Luckily, O_NOATIME is effectively a hint, and is often ignored by, e.g., network filesystems. open(2) notes that O_NOATIME "may not be effective on all filesystems. One example is NFS, where the server maintains the access time." This means that we can honor it when possible but fall back to ignoring it. Acked-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Message-Id: <e9bee604e8df528584693a4ec474ded6295ce8ad.1587149256.git.osandov@fb.com> Signed-off-by: Greg Kurz <groug@kaod.org> (cherry picked from commit a5804fc) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
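The fallback described in the commit message above (honor O_NOATIME when possible, otherwise ignore it) can be illustrated with a minimal Python sketch. This is not the QEMU code, which lives in the C 9pfs local backend; the function name `open_noatime_best_effort` and its `opener` parameter are hypothetical and exist only for illustration. It assumes that a refused O_NOATIME shows up as EPERM, as the commit message and open(2) describe.

```python
import errno
import os

# O_NOATIME is Linux-specific; treat it as 0 where the constant is missing.
O_NOATIME = getattr(os, "O_NOATIME", 0)


def open_noatime_best_effort(path, flags, opener=os.open):
    """Open path with the given flags, dropping O_NOATIME if it is refused.

    `opener` is a hypothetical hook (same call shape as os.open) so the
    fallback can be exercised without a file owned by another user.
    """
    try:
        return opener(path, flags)
    except PermissionError as exc:
        # Only retry when the failure is EPERM and O_NOATIME was requested;
        # anything else (for example EACCES) is a real error.
        if exc.errno != errno.EPERM or not (flags & O_NOATIME):
            raise
        # O_NOATIME is only a hint, so retry the same open without it.
        return opener(path, flags & ~O_NOATIME)
```

A caller would use it like `os.open`, for example `fd = open_noatime_best_effort(path, os.O_RDONLY | O_NOATIME)`, and close the returned descriptor with `os.close(fd)`.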
Issue description
When using overlayfs with NixOS and Kernel 4.19 it is not possible to overwrite a file that already exists in the lower directory with new content. Instead, if this is attempted, the file appears empty (see Steps to reproduce for details).
It seems like the revert of an upstream commit in 4.19 (#52942) breaks something fundamental in overlayfs.
EDIT: Reverting #52942 fixes the behavior. But I'm unable to run the tests as the VM kernel panics :(
This also affects docker if used with one of the overlay drivers.
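The symptom described above (a file that exists in the lower directory reads back empty after being overwritten) can be checked with a small script once an overlay is mounted. This is only a sketch and not the test from #54508: the helper name `overwrite_reads_back` is hypothetical, and it assumes an overlayfs mount already exists at a directory you pass in, with the named file coming from the lower directory.

```python
import os


def overwrite_reads_back(merged_dir, name, new_content):
    """Overwrite an existing file under merged_dir and report whether
    the new content can be read back afterwards."""
    path = os.path.join(merged_dir, name)
    # The file has to exist already, i.e. be provided by the lower directory.
    if not os.path.exists(path):
        raise FileNotFoundError(path)
    with open(path, "wb") as f:
        f.write(new_content)
    with open(path, "rb") as f:
        return f.read() == new_content
```

For example, with an overlay mounted at a hypothetical `/mnt/merged`, `overwrite_reads_back("/mnt/merged", "hello.txt", b"new content\n")` should return `True` on a working kernel. Going by the description in this issue, an affected kernel would be expected to return `False` because the file appears empty.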
Steps to reproduce
Via the test in #54508
or
Manually:
Technical details
Please run
nix-shell -p nix-info --run "nix-info -m"
and paste the results.