Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fail on unmount/rmdir in overlay.Driver.Put #1607

Merged
merged 1 commit into from May 25, 2023

Conversation

mtrmac
Copy link
Collaborator

@mtrmac mtrmac commented May 19, 2023

… instead of only logging a debug-level failure.

We are seeing layers being mounted while our metadata claims that they are unmounted, and this seems to be a way how that could happen without leaving a visible trace.

It's unclear to me why this silently succeeds; this behavior is very long-standing.

Motivated by containers/podman#17216 , where we seem to be seeing mounted layers with zero count in mount points.json.

⚠️ I don’t really know what I’m doing, there might be a good reason for the current behavior. Alternatively, I’d like the Debug logs to be elevated to at least a Warning, possibly outright Error. @nalind @giuseppe PTAL.

@mtrmac
Copy link
Collaborator Author

mtrmac commented May 19, 2023

I have filed containers/podman#18634 to trigger a Podman test run at least, to see if there is something very obvious I’m missing.

mtrmac added a commit to mtrmac/libpod that referenced this pull request May 19, 2023
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
mtrmac added a commit to mtrmac/libpod that referenced this pull request May 19, 2023
Nerf the bloat check, this PR is testing-only

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
edsantiago pushed a commit to edsantiago/libpod that referenced this pull request May 19, 2023
Nerf the bloat check, this PR is testing-only

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
mtrmac added a commit to mtrmac/libpod that referenced this pull request May 19, 2023
Nerf the bloat check, this PR is testing-only

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
edsantiago pushed a commit to edsantiago/libpod that referenced this pull request May 19, 2023
Nerf the bloat check, this PR is testing-only

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
edsantiago pushed a commit to edsantiago/libpod that referenced this pull request May 19, 2023
Nerf the bloat check, this PR is testing-only

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
edsantiago pushed a commit to edsantiago/libpod that referenced this pull request May 19, 2023
Nerf the bloat check, this PR is testing-only

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
edsantiago pushed a commit to edsantiago/libpod that referenced this pull request May 20, 2023
Nerf the bloat check, this PR is testing-only

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
edsantiago pushed a commit to edsantiago/libpod that referenced this pull request May 20, 2023
Nerf the bloat check, this PR is testing-only

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
edsantiago pushed a commit to edsantiago/libpod that referenced this pull request May 20, 2023
Nerf the bloat check, this PR is testing-only

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
edsantiago pushed a commit to edsantiago/libpod that referenced this pull request May 20, 2023
Nerf the bloat check, this PR is testing-only

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
edsantiago pushed a commit to edsantiago/libpod that referenced this pull request May 21, 2023
Nerf the bloat check, this PR is testing-only

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
edsantiago pushed a commit to edsantiago/libpod that referenced this pull request May 22, 2023
Nerf the bloat check, this PR is testing-only

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
edsantiago pushed a commit to edsantiago/libpod that referenced this pull request May 22, 2023
Nerf the bloat check, this PR is testing-only

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
edsantiago pushed a commit to edsantiago/libpod that referenced this pull request May 22, 2023
Nerf the bloat check, this PR is testing-only

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
edsantiago pushed a commit to edsantiago/libpod that referenced this pull request May 22, 2023
Nerf the bloat check, this PR is testing-only

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
edsantiago pushed a commit to edsantiago/libpod that referenced this pull request May 22, 2023
Nerf the bloat check, this PR is testing-only

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
instead of only logging a debug-level failure.

We are seeing layers being mounted while our metadata
claims that they are unmounted, and this seems to be a way
how that could happen without leaving a visible trace.

It's unclear to me why this silently succeeds; this behavior is
very long-standing.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
edsantiago pushed a commit to edsantiago/libpod that referenced this pull request May 22, 2023
Nerf the bloat check, this PR is testing-only

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
mtrmac added a commit to mtrmac/libpod that referenced this pull request May 22, 2023
Nerf the bloat check, this PR is testing-only

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
mtrmac added a commit to mtrmac/libpod that referenced this pull request May 22, 2023
Nerf the bloat check, this PR is testing-only

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
@rhatdan
Copy link
Member

rhatdan commented May 23, 2023

@giuseppe PTAL

Copy link
Member

@giuseppe giuseppe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how these failures could happen, since it is performing a lazy umount, but I agree it is better to know if something goes wrong so we know what to look at.

LGTM

@mtrmac
Copy link
Collaborator Author

mtrmac commented May 23, 2023

I don't know how these failures could happen, since it is performing a lazy umount

I don’t know that either, but I am seeing EINVAL on unmount in containers/podman#18634 . (At this point I have no idea whether that’s a long-standing bug to be fixed, a failure that should be ignored, or a short-term fallout of some unrelated bug.)

mtrmac added a commit to mtrmac/libpod that referenced this pull request May 23, 2023
Nerf the bloat check, this PR is testing-only

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
@rhatdan rhatdan changed the title RFC: Fail on unmount/rmdir in overlay.Driver.Put Fail on unmount/rmdir in overlay.Driver.Put May 25, 2023
@rhatdan
Copy link
Member

rhatdan commented May 25, 2023

LGTM

@rhatdan rhatdan merged commit ff9ef3e into containers:main May 25, 2023
18 checks passed
@mtrmac mtrmac deleted the overlay-put-failures branch May 25, 2023 16:08
Luap99 added a commit to Luap99/libpod that referenced this pull request Jun 9, 2023
A c/storage PR[1] chnage the behavior to correctly report umount errors.
This is causing problem in the updgrade tests. The problem is that a
cotnainer is mounted inside another container and then unmounted on the
host. Therefore both operations happen in different mount namespaces.
this is expcted but we want to share the mounts between them. This is
the default but c/stroage make the root private by default thus the
mounts were not shared. To fix this use the `skip_mount_home` storage
option so the mount is kept shared.

[1] containers/storage#1607

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Luap99 added a commit to Luap99/libpod that referenced this pull request Jun 9, 2023
A c/storage PR[1] chnage the behavior to correctly report umount errors.
This is causing problem in the updgrade tests. The problem is that a
cotnainer is mounted inside another container and then unmounted on the
host. Therefore both operations happen in different mount namespaces.
this is expcted but we want to share the mounts between them. This is
the default but c/stroage make the root private by default thus the
mounts were not shared. To fix this use the `skip_mount_home` storage
option so the mount is kept shared.

[1] containers/storage#1607

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Luap99 added a commit to Luap99/libpod that referenced this pull request Jun 12, 2023
A c/storage PR[1] chnage the behavior to correctly report umount errors.
This is causing problem in the updgrade tests. The problem is that a
cotnainer is mounted inside another container and then unmounted on the
host. Therefore both operations happen in different mount namespaces.
this is expcted but we want to share the mounts between them. This is
the default but c/stroage make the root private by default thus the
mounts were not shared. To fix this use the `skip_mount_home` storage
option so the mount is kept shared.

[1] containers/storage#1607

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
cgiradkar pushed a commit to cgiradkar/podman that referenced this pull request Jul 12, 2023
A c/storage PR[1] chnage the behavior to correctly report umount errors.
This is causing problem in the updgrade tests. The problem is that a
cotnainer is mounted inside another container and then unmounted on the
host. Therefore both operations happen in different mount namespaces.
this is expcted but we want to share the mounts between them. This is
the default but c/stroage make the root private by default thus the
mounts were not shared. To fix this use the `skip_mount_home` storage
option so the mount is kept shared.

[1] containers/storage#1607

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
cgiradkar pushed a commit to cgiradkar/podman that referenced this pull request Jul 13, 2023
A c/storage PR[1] chnage the behavior to correctly report umount errors.
This is causing problem in the updgrade tests. The problem is that a
cotnainer is mounted inside another container and then unmounted on the
host. Therefore both operations happen in different mount namespaces.
this is expcted but we want to share the mounts between them. This is
the default but c/stroage make the root private by default thus the
mounts were not shared. To fix this use the `skip_mount_home` storage
option so the mount is kept shared.

[1] containers/storage#1607

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
cgiradkar pushed a commit to cgiradkar/podman that referenced this pull request Jul 17, 2023
A c/storage PR[1] chnage the behavior to correctly report umount errors.
This is causing problem in the updgrade tests. The problem is that a
cotnainer is mounted inside another container and then unmounted on the
host. Therefore both operations happen in different mount namespaces.
this is expcted but we want to share the mounts between them. This is
the default but c/stroage make the root private by default thus the
mounts were not shared. To fix this use the `skip_mount_home` storage
option so the mount is kept shared.

[1] containers/storage#1607

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
cgiradkar pushed a commit to cgiradkar/podman that referenced this pull request Jul 17, 2023
A c/storage PR[1] chnage the behavior to correctly report umount errors.
This is causing problem in the updgrade tests. The problem is that a
cotnainer is mounted inside another container and then unmounted on the
host. Therefore both operations happen in different mount namespaces.
this is expcted but we want to share the mounts between them. This is
the default but c/stroage make the root private by default thus the
mounts were not shared. To fix this use the `skip_mount_home` storage
option so the mount is kept shared.

[1] containers/storage#1607

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
flouthoc added a commit to flouthoc/storage that referenced this pull request Aug 14, 2023
It seems there are cases where underlying layers can be already
unmounted in such case the layer directory is not removed and put fails
early with `EINVAL` causing dangling directory for these layers.

Following case may be a hidden underlying issue with `unmounting` of
`c/stoage` which needs more investigation however prior discussion shows
that this is a long standing issue in `storage` and which was never
caught till containers#1607

Signed-off-by: Aditya R <arajan@redhat.com>
flouthoc added a commit to flouthoc/storage that referenced this pull request Aug 14, 2023
It seems there are cases where underlying layers can be already
unmounted in such case the layer directory is not removed and put fails
early with `EINVAL` causing dangling directory for these layers.

Following case may be a hidden underlying issue with `unmounting` of
`c/stoage` which needs more investigation however prior discussion shows
that this is a long standing issue in `storage` and which was never
caught till containers#1607

Signed-off-by: Aditya R <arajan@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants