
groupadd fails inside container when run rootless #183

Closed
castedo opened this issue Feb 13, 2020 · 18 comments

Comments


castedo commented Feb 13, 2020

/kind bug

Description

Executing "groupadd foo" within a Debian or Ubuntu container fails when run by rootless podman but works when run by root/sudo podman (podman run on RHEL 8). I'm not sure whether this is a bug, or an indication that https://github.com/containers/libpod/blob/master/rootless.md should be updated and perhaps an enhancement made. Either way, I found this behavior surprising and confusing.

The impact is that many Debian/Ubuntu packages will fail to install with apt-get inside a rootless container, because many packages have some dependency that runs groupadd in an install script (e.g. the dbus package).

In addition to "debian:10.2", I can also reproduce this with the more recent "debian:bullseye-20200130", "ubuntu:19.10", and "ubuntu:focal-20200115".

Steps to reproduce the issue:

  1. as a non-root user, run podman run -it --rm library/debian:10.2

  2. inside the container, run groupadd foo
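
For convenience, the same failure can be reproduced non-interactively; a one-liner sketch of the two steps above:

$ podman run --rm library/debian:10.2 groupadd foo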

Describe the results you received:

groupadd: /etc/group.7: lock file already used
groupadd: cannot lock /etc/group; try again later.

Describe the results you expected:

Expected it to succeed with no output, which is what happens when running the same podman command as root/sudo.

Output of podman version:

podman version 1.6.4

Output of podman info --debug:

debug:
  compiler: gc
  git commit: ""
  go version: go1.12.12
  podman version: 1.6.4
host:
  BuildahVersion: 1.12.0-dev
  CgroupVersion: v1
  Conmon:
    package: conmon-2.0.6-1.module+el8.1.1+5259+bcdd613a.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.6, commit: 6ffbb2ec70dbe5ba56e4bfde946fb04f19dd8bbf'
  Distribution:
    distribution: '"rhel"'
    version: "8.1"
  IDMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  MemFree: 472834048
  MemTotal: 7902752768
  OCIRuntime:
    name: runc
    package: runc-1.0.0-64.rc9.module+el8.1.1+5259+bcdd613a.x86_64
    path: /usr/bin/runc
    version: 'runc version spec: 1.0.1-dev'
  SwapFree: 4294176768
  SwapTotal: 4294963200
  arch: amd64
  cpus: 4
  eventlogger: journald
  hostname: ham.home
  kernel: 4.18.0-147.5.1.el8_1.x86_64
  os: linux
  rootless: true
  slirp4netns:
    Executable: /usr/bin/slirp4netns
    Package: slirp4netns-0.4.2-2.git21fdece.module+el8.1.1+5460+3ac089c3.x86_64
    Version: |-
      slirp4netns version 0.4.2+dev
      commit: 21fdece2737dc24ffa3f01a341b8a6854f8b13b4
  uptime: 15h 53m 12.89s (Approximately 0.62 days)
registries:
  blocked: null
  insecure: null
  search:
  - private.dkr.ecr.us-east-1.amazonaws.com
  - registry.redhat.io
  - registry.access.redhat.com
  - quay.io
  - docker.io
store:
  ConfigFile: /home/castedo/.config/containers/storage.conf
  ContainerStore:
    number: 4
  GraphDriverName: overlay
  GraphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-0.7.2-1.module+el8.1.1+5259+bcdd613a.x86_64
      Version: |-
        fuse-overlayfs: version 0.7.2
        FUSE library version 3.2.1
        using FUSE kernel interface version 7.26
  GraphRoot: /home/castedo/.local/share/containers/storage
  GraphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  ImageStore:
    number: 44
  RunRoot: /run/user/1000
  VolumePath: /home/castedo/.local/share/containers/storage/volumes

Package info (e.g. output of rpm -q podman or apt list podman):

podman-1.6.4-2.module+el8.1.1+5363+bf8ff1af.x86_64
castedo changed the title from "debian/ubuntu groupadd failed inside container when run rootless" to "debian/ubuntu groupadd fails inside container when run rootless" on Feb 13, 2020
mheon (Member) commented Feb 13, 2020

@giuseppe Potentially fuse-overlayfs? Is this a bug we know about already?

giuseppe (Member) commented

It works for me both with fuse-overlayfs 0.7.2 and with the version from git master, so it could be something different.

@castedo could you try the following:

$ podman run --name foo -it --rm library/debian:10.2

from another terminal:

$ strace -o /tmp/strace.log -f -Z -p $(PID of the shell inside the container, you can get it from runc list)

then run groupadd again inside the container. After it fails, kill the strace process and please attach the /tmp/strace.log file here.
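
For example, a sketch of getting that PID (the runc state root path assumes a rootless setup for UID 1000, and the container name foo comes from the podman run above):

$ runc --root /run/user/1000/runc list
$ strace -o /tmp/strace.log -f -Z -p $(podman inspect --format '{{.State.Pid}}' foo)  # alternative: ask podman for the container's PID directly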

castedo (Author) commented Feb 13, 2020

Thanks for looking at it!

I'm attaching two logs because the first one (strace.log) was captured after a fresh reboot and showed a new error (below). The second one (strace2.log) was a repeat soon afterwards, without the error.
strace.log
strace2.log

Error that results from first run after fresh reboot (immediately upon running podman run):
ERRO[0000] Error refreshing volume castedo.conf: error acquiring lock 0 for volume castedo.conf: file exists
Let me know if you want more details on where this castedo.conf comes from and why it's showing up. At the moment I don't know why that error is happening on the first run.

giuseppe transferred this issue from containers/podman Feb 14, 2020
giuseppe (Member) commented

could you try with fuse-overlayfs 0.7.6?

castedo (Author) commented Feb 14, 2020

I downloaded fuse-overlayfs-0.7.6-2.0.dev.gitce61c7e.fc32.x86_64.rpm from the F32 repository and tried to install it locally on my RHEL 8.1 desktop. However, I get the following dependency problem:

Error: transaction check vs depsolve:
rpmlib(PayloadIsZstd) <= 5.4.18-1 is needed by fuse-overlayfs-0.7.6-2.0.dev.gitce61c7e.fc32.x86_64

yum whatprovides 'rpmlib(PayloadIsZstd)' does not suggest anything to install. I believe I'm being hit by https://bugzilla.redhat.com/show_bug.cgi?id=1715799, a new rpm capability not yet backported to RHEL 8. It sounds like maybe 8.2 will support it, and it's in CentOS Stream.

Any suggestions?

giuseppe (Member) commented Feb 14, 2020

the easiest would be to build from sources on the target system
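
A rough sketch of such a build, following the autotools flow in the fuse-overlayfs README (the RHEL package names for the build dependencies are assumptions):

$ sudo yum install -y automake autoconf gcc make fuse3-devel  # build dependencies (assumed names)
$ git clone https://github.com/containers/fuse-overlayfs
$ cd fuse-overlayfs
$ ./autogen.sh && ./configure && make
$ sudo make install  # installs under /usr/local by default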

castedo (Author) commented Feb 14, 2020

A new data point: I get the same error running groupadd in a RHEL 8 UBI container. This was definitely not happening a week or two ago, so some change on my machine seems to be causing groupadd to fail in rootless containers when it did not a week or two ago. I did a yum update of podman about a week ago. I also did a reboot about a week ago, which killed some podman-related process and caused a virtual IP address to not get cleaned up (there is another bug somewhere tracking that IP cleanup issue).

castedo (Author) commented Feb 14, 2020

More data points:
This appears to be a regression introduced by my upgrade of podman and/or fuse-overlayfs two days ago, and not due to the state of my machine.

Now that I've downgraded from fuse-overlayfs 0.7.2 back to 0.4.1 and podman from 1.6.2 back to 1.4.2, this problem is no longer happening; I can run groupadd fine within containers.

Also, BEFORE I downgraded, I did the reproduction steps on a different local user account that has not been using podman, and the same problem occurred. I'm quite certain this second user account was in a clean state with respect to whatever rootless files podman or fuse-overlayfs might have left behind.
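
A sketch of the downgrade itself, assuming the older builds are still available in an enabled repository (the exact path through RHEL 8 module streams may differ):

$ sudo yum downgrade podman fuse-overlayfs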

castedo changed the title from "debian/ubuntu groupadd fails inside container when run rootless" to "groupadd fails inside container when run rootless" on Feb 14, 2020
giuseppe added a commit to giuseppe/fuse-overlayfs that referenced this issue Feb 15, 2020
There is an issue on RHEL 8.1 where the nlink counter is always
incremented by one, no matter what is specified in e.attr.st_nlink.

Always set timeout to 0 to force a new stat on the inode.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1802907
Closes: containers#183

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
castedo (Author) commented Feb 27, 2020

Sorry I did not test with a new build from sources. I was traveling and this testing would be on my laptop ... too scary.

If I understand correctly, the PR with the workaround is not getting merged into master but a new package fuse-overlayfs-0.7.2-2 is getting pushed out to the container-tools:rhel8 module stream for RHEL 8.1. I'm guessing the workaround is in this patch.

So I will upgrade to fuse-overlayfs 0.7.2 when I see package fuse-overlayfs-0.7.2-2 become available to install from the container-tools:rhel8 stream for RHEL 8.1. Currently I only see fuse-overlayfs-0.7.2-1, and I hit this bug when I upgrade to that.

Let me know if I should be doing something different before closing this bug.

Thank you!

giuseppe (Member) commented

> If I understand correctly, the PR with the workaround is not getting merged into master but a new package fuse-overlayfs-0.7.2-2 is getting pushed out to the container-tools:rhel8 module stream for RHEL 8.1. I'm guessing the workaround is in this patch.

Yes, exactly. The issue is in the kernel available on RHEL 8.1: a FUSE file system reports the wrong number of links after a link(2). The issue is going to be addressed in the kernel, but that takes more time; the patch provided here is just a workaround that is used in the latest build of fuse-overlayfs.
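
A minimal sketch of that misbehavior, and of why groupadd trips over it (run inside an affected rootless container; the buggy value follows the commit message above):

$ touch /tmp/lockfile
$ ln /tmp/lockfile /tmp/lockfile.lock
$ stat -c %h /tmp/lockfile  # link count: 2 is correct; an affected FUSE mount reports 3

groupadd's lock acquisition does essentially this link(2) dance and treats any link count other than 2 as "lock file already used", hence the original error.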

rhatdan (Member) commented Feb 27, 2020

Since this is fixed in master and in the RHEL 8.2 kernel, can we close this issue?

castedo (Author) commented Feb 27, 2020

OK, will close.

But in my brain's bug tracking system it's not closed until that fuse-overlayfs-0.7.2-2 package is working on my laptop. ;-)

@castedo castedo closed this as completed Feb 27, 2020
rhatdan (Member) commented Feb 27, 2020

That is what a Bugzilla is for: to track packaging via your favourite distribution. Issues are for upstream, and are closed as soon as the issue is fixed in the matching branch, in this case the master branch.

rhatdan (Member) commented Mar 2, 2020

Sorry about the confusion on "matching"; I was talking about GitHub branching. If someone opened an issue against a particular branch of the github.com fuse-overlayfs repo, the issue would stay open until the PR against that particular branch was merged, i.e. the PR matching the branch.
In most cases the PR is against the master branch, so this is not an issue.

castedo (Author) commented Mar 3, 2020

Thanks. Now I think I get it. Upstream issues are more for tracking whether there are remaining code changes to make, not necessarily whether the exact bug reproduced by the issue reporter has been confirmed to no longer happen after a fixing upgrade or workaround.

rjbell4 commented Jul 29, 2020

I believe I am experiencing the same core issue. That is, I'm hitting behavior only in a rootless buildah container that I've been able to trace to st_nlink being reported as 2, even though there are subdirectories, etc.

I am on Red Hat (or CentOS) 8.1, but I am also running fuse-overlayfs-0.7.2-5.module_el8.1.0+298+41f9343a.x86_64. Could someone point me to the specific kernel change that fixes this issue, and/or the fuse-overlayfs packaging change that contains the workaround?
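
For what it's worth, a quick sketch of checking that directory link count inside a rootless container (paths are arbitrary):

$ mkdir -p /tmp/d/sub1 /tmp/d/sub2
$ stat -c %h /tmp/d  # expected 4 (2 plus one per subdirectory); an affected system reports 2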

rhatdan (Member) commented Jul 29, 2020

@giuseppe ^^

giuseppe (Member) commented

I think both fixes are available from 8.2.

For more details, see the Bugzilla tracking the issue: https://bugzilla.redhat.com/show_bug.cgi?id=1802907
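
For reference, a quick sketch of checking an installed system against the versions discussed in this thread:

$ rpm -q fuse-overlayfs  # 0.7.2-2 or later carries the workaround
$ uname -r               # the RHEL 8.2 kernel carries the kernel-side fix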
