[Bug]: chown() fails for rootful podman with --userns=auto on bind-mounts #17120

Closed
dilyanpalauzov opened this issue Jan 15, 2023 · 10 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.

Comments

@dilyanpalauzov
Contributor

Issue Description

CRI-O can run rootful containers in a user namespace. That is, it switches from root to the user containers and, based on that user's entries in /etc/subuid and /etc/subgid, allocates new mappings user-id-within-the-container ⇔ user-id-on-the-host-system.

Rootless podman can do the same. (N.B. I am testing the working rootless podman and the failing rootful podman on different systems.)

But I cannot get this in rootful-podman version 4.3.2-dev. With
/etc/passwd containing containers:x:1066:1066::/home/containers:/bin/sh
/etc/group containing containers:x:1066:
/etc/subuid containing containers:3638944:65536
/etc/subgid containing containers:3638944:65536

I call

/usr/local/bin/podman run --cap-add=IPC_LOCK --log-driver=none --net=host --read-only --read-only-tmpfs=false --mount type=bind,src=/etc/k,dst=/conf --rm=true -a=stderr -a=stdout -a=stdin --userns=auto --name=k-run --tty localhost/k:2023-01-14

Under strace it executes:

[pid 2839590] getpid()                  = 1
[pid 2839590] getpid()                  = 1
[pid 2839590] unlink("/conf/k_ctl") = -1 ENOENT (No such file or directory)
[pid 2839590] socket(AF_UNIX, SOCK_STREAM, 0) = 4
[pid 2839590] setsockopt(4, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
[pid 2839590] fcntl(4, F_GETFL)         = 0x2 (flags O_RDWR)
[pid 2839590] fcntl(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0
[pid 2839590] bind(4, {sa_family=AF_UNIX, sun_path="/conf/k_ctl"}, 110) = 0
[pid 2839590] chmod("/conf/k_ctl", 0600) = 0
[pid 2839590] chown("/conf/k_ctl", 1065, 1065) = -1 EINVAL (Invalid argument)
[pid 2839590] getpid()                  = 1
[pid 2839590] write(2, " 0(1) ERROR: ctl [init_socks.c:130]: init_unix_sock(): ERROR: init_unix_sock: failed to change the owner/group for /conf/k_ctl to 1065.1065: Invalid argument[22]\n", 169) = 169
[pid 2839590] close(4 <unfinished ...>

and

# ls /etc/k/k_ctl -l   
srw------- 1 3638944 3638944 0 Jan 15 11:33 /etc/k/k_ctl=

Without --userns=auto above, this does run properly: /etc/k/k= belongs to user 1065, group 1065.

Steps to reproduce the issue

See above.

Describe the results you received

See above.

Describe the results you expected

chown("/conf/k_ctl", 1065, 1065) should work under --userns=auto.

podman info output

host:                                                               
  arch: amd64           
  buildahVersion: 1.28.2  
  cgroupControllers:     
  - cpuset                                                          
  - cpu                                                             
  - io                                                              
  - memory            
  - hugetlb          
  - pids                       
  cgroupManager: systemd                                            
  cgroupVersion: v2
  conmon:
    package: Unknown
    path: /usr/local/libexec/podman/conmon
    version: 'conmon version 2.1.4, commit: '
  cpuUtilization:
    idlePercent: 99.18
    systemPercent: 0.37
    userPercent: 0.45
  cpus: 8
  distribution:
    distribution: unknown
    version: unknown
  eventLogger: journald
  hostname: mail
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 5.10.162
  linkmode: dynamic
  logDriver: journald
  memFree: 444940288
  memTotal: 8350863360
  networkBackend: cni
  ociRuntime:
    name: crun
    package: Unknown
    path: /usr/local/bin/crun
    version: |-
      crun version 1.7.2.0.0.0.28-da8f
      commit: 61805dffaf112bf7d883555ea91efac8bcf177a2
      rundir: /run/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_AUDIT_WRITE,CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_MKNOD,CAP_NET_BIND_SERVICE,CAP_NET_RAW,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: ""
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: Unknown
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.4
  swapFree: 0
  swapTotal: 0
  uptime: 126h 34m 18.00s (Approximately 5.25 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries: {}
store:
  configFile: /usr/share/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /var/lib/containers/storage
  graphRootAllocated: 519784554496 
  graphRootUsed: 410149601280
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 8
  runRoot: /run/containers/storage 
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 4.3.2-dev
  Built: 1673643388
  BuiltTime: Fri Jan 13 20:56:28 2023
  GitCommit: 588c6ecc760558780ec2df4a78efdf02e476a0b2
  GoVersion: go1.19.3
  Os: linux
  OsArch: linux/amd64
  Version: 4.3.2-dev

Podman in a container

No

Privileged Or Rootless

Privileged

Upstream Latest Release

No

Additional environment details

No response

Additional information

No response

@dilyanpalauzov dilyanpalauzov added the kind/bug Categorizes issue or PR as related to a bug. label Jan 15, 2023
@giuseppe
Member

podman uses by default a smaller size for the user namespace: 1024 IDs, while CRI-O uses 65536.

Could you try forcing a bigger size for the user namespace?

/usr/local/bin/podman run --cap-add=IPC_LOCK --log-driver=none --net=host --read-only --read-only-tmpfs=false --mount type=bind,src=/etc/k,dst=/conf --rm=true -a=stderr -a=stdout -a=stdin --userns=auto:size=2048 --name=k-run --tty localhost/k:2023-01-14
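As a rough illustration of why the original chown() returned EINVAL: a user namespace maps only a contiguous range of IDs, and an ID outside that range has no host counterpart. The sketch below assumes container ID 0 maps to host ID 3638944 (the owner shown in the ls output above). The function name and the single-range layout are illustrative assumptions, not podman's code.

```python
from typing import Optional


def to_host_id(container_id: int, container_start: int,
               host_start: int, size: int) -> Optional[int]:
    """Translate a container ID to a host ID, or return None if unmapped."""
    if container_start <= container_id < container_start + size:
        return host_start + (container_id - container_start)
    return None


HOST_START = 3638944  # assumed: host ID that container ID 0 maps to

# With 1024 IDs, container IDs 0..1023 are mapped, so 1065 has no host ID.
print(to_host_id(1065, 0, HOST_START, 1024))  # None

# With --userns=auto:size=2048, 1065 falls inside the mapped range.
print(to_host_id(1065, 0, HOST_START, 2048))  # 3640009
```

Under these assumptions, any ID at or above the namespace size is unmapped, which matches the failure for 1065 with a 1024-ID namespace.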

@dilyanpalauzov
Contributor Author

dilyanpalauzov commented Jan 16, 2023

--userns=auto:size=2048

Thanks, that works. Why is this not necessary for rootless podman (APIVersion: 4.4.0-dev, GitCommit: 9311846), but is necessary for rootful podman (APIVersion: 4.3.2-dev, GitCommit: 588c6ec)?

Moreover, when I have in /etc/subuid the line containers:3638944:65536 I expect that podman uses 65536 as the implicit size. Why does this not happen?

@dilyanpalauzov
Contributor Author

https://docs.podman.io/en/latest/markdown/podman-run.1.html does not say that the default is size=1024.

@giuseppe
Member

Thanks, that works. Why is this not necessary for rootless podman (APIVersion: 4.4.0-dev, GitCommit: 9311846), but is necessary for rootful podman (APIVersion: 4.3.2-dev, GitCommit: 588c6ec)?

are you using --userns=auto for rootless as well?

Moreover, when I have in /etc/subuid the line containers:3638944:65536 I expect that podman uses 65536 as the implicit size. Why does this not happen?

That is the entire range of IDs that are allocated for auto user namespaces. If the entire range is allocated to a single container then you can run just one container. Instead these IDs are split among different containers.
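To make the trade-off concrete, here is a minimal sketch of carving non-overlapping ranges out of the subordinate range from the /etc/subuid line in this issue (containers:3638944:65536). The first-fit allocator and its names are hypothetical; podman's real allocation logic may differ.

```python
from typing import List, Tuple


def allocate_ranges(pool_start: int, pool_size: int,
                    sizes: List[int]) -> List[Tuple[int, int]]:
    """Hand out consecutive, non-overlapping (host_start, size) ranges.

    Raises ValueError when the pool cannot satisfy a request.
    """
    ranges = []
    next_free = pool_start
    for size in sizes:
        if next_free + size > pool_start + pool_size:
            raise ValueError(f"pool exhausted: cannot allocate {size} IDs")
        ranges.append((next_free, size))
        next_free += size
    return ranges


POOL_START, POOL_SIZE = 3638944, 65536  # from the /etc/subuid line above

# Three containers with 1024 IDs each receive disjoint host ranges.
print(allocate_ranges(POOL_START, POOL_SIZE, [1024, 1024, 1024]))

# One container taking all 65536 IDs leaves nothing for a second one.
try:
    allocate_ranges(POOL_START, POOL_SIZE, [65536, 1024])
except ValueError as err:
    print(err)
```

In this sketch a 65536-ID pool holds 64 containers of 1024 IDs each, but only one container of 65536 IDs.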

@dilyanpalauzov
Contributor Author

are you using --userns=auto for rootless as well?

No. If I use it, rootless and rootful behave more similarly to each other.

My understanding about https://docs.podman.io/en/latest/markdown/podman-run.1.html#userns-mode is:

«It defaults to the PODMAN_USERNS environment variable. An empty value (“”) means user namespaces are disabled unless an explicit mapping is set with the --uidmap and --gidmap options.…
host: run in the user namespace of the caller. The processes running in the container will have the same privileges on the host as any other process launched by the calling user (default).»

  • when I have no PODMAN_USERNS environment variable, its value is essentially the empty string and should be the default, but the default is host

Rootless user --userns=Key mappings:
Key=auto, Host User=$UID, Container User=nil (Host User UID is not mapped into container.)

  • the behaviour for --userns=auto is documented for rootless podman. For rootful podman the behaviour is not specified. In fact the same applies for all the other --userns options.

Please adjust https://docs.podman.io/en/latest/markdown/podman-run.1.html#userns-mode to specify what is common and different when using rootful/rootless podman.

@dilyanpalauzov
Contributor Author

Please also specify that in the default userns for rootful podman ("" or host - whatever it is), the size is 65536 and for auto the estimation is 1000, even if /etc/passwd contains a single line username:x:1065:1065.

That is the entire range of IDs that are allocated for auto user namespaces. If the entire range is allocated to a single container then you can run just one container. Instead these IDs are split among different containers.

I would expect that the entire space is allocated and the different containers running under userns=auto for the user «containers» share the UIDs: if two distinct containers use uid=5 in --userns=auto, then both users are mapped on the host to the same sub-uid. This might not be entirely secure, but this is my expectation and the documentation of podman-run does not hint at a different behaviour.

@dilyanpalauzov
Contributor Author

After rereading https://docs.podman.io/en/latest/markdown/podman-run.1.html#userns-mode:

auto[:OPTIONS,…]: automatically create a unique user namespace.… Podman allocates unique ranges of UIDs and GIDs from the containers subordinate user ids.… auto will estimate a size for the user namespace.

I agree that the created namespace is documented as distinct for different containers. That leaves it unclear how size= is guessed.

@giuseppe
Member

It is documented "auto will estimate a size for the user namespace."

We use a heuristic to find out the initial size: it looks into the container image and sees what users are defined there. If you want a specific size you need to specify it.

If you have suggestions on how to improve the documentation, please open a PR.

@dilyanpalauzov
Contributor Author

I filed #17156.

For me it is unclear how the auto:size= estimation works. Is it based on the content of the included /etc/passwd? Is it based on the USER directive in Containerfile? Are there other criteria?

@rhatdan
Member

rhatdan commented Jan 18, 2023

It also finds the greatest UID defined in the container image.
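A hypothetical sketch of what such a heuristic could look like, assuming it scans the image's /etc/passwd for the highest UID and never goes below the 1024 default mentioned earlier in this thread. The function name, the minimum, and the "highest UID + 1" rule are all assumptions for illustration. The real implementation may weigh other inputs, and this sketch is not meant to explain the size podman chose in this report.

```python
def estimate_userns_size(passwd_text: str, minimum: int = 1024) -> int:
    """Guess a namespace size from passwd-style text (illustrative only)."""
    max_uid = -1
    for line in passwd_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 3:
            continue  # not a well-formed passwd entry
        try:
            uid = int(fields[2])
        except ValueError:
            continue  # non-numeric UID field, skip it
        max_uid = max(max_uid, uid)
    # The namespace must contain IDs 0..max_uid, hence max_uid + 1.
    return max(minimum, max_uid + 1)


# Hypothetical image contents, not taken from the image in this issue.
only_root = "root:x:0:0:root:/root:/bin/sh\n"
with_app_user = only_root + "app:x:2000:2000::/home/app:/bin/sh\n"

print(estimate_userns_size(only_root))      # 1024
print(estimate_userns_size(with_app_user))  # 2001
```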

@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 4, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 4, 2023