
XDG_RUNTIME_DIR directory "/run/user/1000" is not owned by the current user #13338

Closed
djarbz opened this issue Feb 24, 2022 · 28 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.

Comments

@djarbz

djarbz commented Feb 24, 2022

/kind bug

Description

I am attempting to run podman in rootless mode with lingering.
When I attempt to start the systemd --user service, podman fails with the error in the title.
In my case I am user 1002 and the error reports UID 1002, but I replaced it with 1000 in the title because I expect that to be the most common value, which should help other users searching for the same error.
The directory is in fact owned by, and writable by, the current user.

EDIT: Whenever I reboot and need to test from scratch, I create a new user, so the UIDs don't necessarily match; where applicable, they do match the user currently being used for testing.

Steps to reproduce the issue:

As a test, I created a new user.
sudo su - test -c 'podman info' works

So I enable linger, and it fails. There is a note about this on the troubleshooting page, so let's log in on the console.

sudo loginctl enable-linger test
sudo su - test -c 'podman info'
ERRO[0000] XDG_RUNTIME_DIR directory "/run/user/1006" is not owned by the current user

sudo ls -lah /run/user/1006
total 0
drwx------ 7 test test 180 Feb 24 08:13 .
drwxr-xr-x 5 root root 100 Feb 24 08:13 ..
srw-rw-rw- 1 test test   0 Feb 24 08:13 bus
drwx------ 2 test test  40 Feb 24 08:13 containers
drwx------ 2 test test 140 Feb 24 08:13 gnupg
drwx-----T 2 test test  40 Feb 24 08:13 libpod
srw-rw-rw- 1 test test   0 Feb 24 08:13 pk-debconf-socket
drwxr-xr-x 2 test test  60 Feb 24 08:13 podman
drwxr-xr-x 4 test test 120 Feb 24 08:13 systemd

The troubleshooting page states that I need to create a login session, so I log in from the console.
I still get the error.

test@2006-ct:~$ podman info
ERRO[0000] XDG_RUNTIME_DIR directory "/run/user/1006" is not owned by the current user

OK, let's test with the machinectl method.

sudo machinectl shell test@
Connected to the local host. Press ^] three times within 1s to exit session.
podman info
ERRO[0000] XDG_RUNTIME_DIR directory "/run/user/1006" is not owned by the current user

Let's reboot for good measure...
sudo su - test -c 'podman info' works!

So in summary, after enabling lingering we need to reboot the server for podman to operate as that user.

Describe the results you received:

Podman does not work as a lingering user until the host is rebooted as shown above.
Command: podman info
Error: ERRO[0000] XDG_RUNTIME_DIR directory "/run/user/1006" is not owned by the current user

Describe the results you expected:

Running podman commands should work as expected after a user has been granted lingering without rebooting the host.

Additional information you deem important (e.g. issue happens only occasionally):

I am driving this with Ansible, and it is reproducible on every run and every time the host is recreated.
I am running a Debian CT on Proxmox.
The Proxmox filesystem is ZFS, so I am using the VFS driver in the CT.

Proxmox:

pveversion
pve-manager/7.1-8/5b267f33 (running kernel: 5.13.19-2-pve)

cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
NAME="Debian GNU/Linux"
VERSION_ID="11"
VERSION="11 (bullseye)"
VERSION_CODENAME=bullseye

Container:

cat /etc/os-release 
PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
NAME="Debian GNU/Linux"
VERSION_ID="11"
VERSION="11 (bullseye)"
VERSION_CODENAME=bullseye

Ansible:

- name: "{{ systemd_user }} | User prerequisites"
  block: 

  - name: "{{ systemd_user }} | Get home directory"
    ansible.builtin.user:
      name: "{{ systemd_user }}"
      state: present
    register: user_info

  - name: "{{ systemd_user }} | Set Systemd directory"
    ansible.builtin.set_fact:
      user_systemd: "{{ user_info.home }}/.config/systemd/user"

  - name: "{{ systemd_user }} | Create Systemd directory"
    ansible.builtin.file:
      path: "{{ user_systemd }}"
      state: directory
      mode: '0750'
      # recurse: yes

  - name: "{{ systemd_user }} | Fix Systemd connection"
    ansible.builtin.lineinfile:
      dest: "~/.bashrc"
      regexp: "{{ item.regexp }}"
      line: "{{ item.line }}"
    loop:
      - regexp: "^#?export XDG_RUNTIME_DIR"
        line: "export XDG_RUNTIME_DIR=\"${XDG_RUNTIME_DIR:-/run/user/$UID}\""
      - regexp: "^#?export DBUS_SESSION_BUS_ADDRESS"
        line: "export DBUS_SESSION_BUS_ADDRESS=\"${DBUS_SESSION_BUS_ADDRESS:-unix:path=${XDG_RUNTIME_DIR}/bus}\""
  
  become_user: "{{ systemd_user }}"
  become: true
  when: systemd_user != "root"

- name: "{{ systemd_user }} | Enable lingering"
  block:

  - name: "{{ systemd_user }} | Check if lingering is enabled"
    ansible.builtin.stat: 
      path: "/var/lib/systemd/linger/{{ systemd_user }}"
    register: linger

  - name: "{{ systemd_user }} | Enable lingering"
    ansible.builtin.command: "loginctl enable-linger {{ systemd_user }}"
    when: 
      - not linger.stat.exists
      - systemd_config.enable_linger | default('yes')
  
  # - name: "{{ systemd_user }} | Get user info"
  #   getent:
  #     database: passwd
  #     key: "{{ systemd_user }}"
  #   when: not linger.stat.exists

  # - name: Debug user info
  #   debug:
  #     var: ansible_facts.getent_passwd[{{ systemd_user }}]
  #   when: not linger.stat.exists

  # - name: Restart user Systemd service # Doesn't work...
  #   ansible.builtin.systemd:
  #     name: "user@{{ ansible_facts.getent_passwd[systemd_user].1 }}"
  #     state: restarted
  #   when: not linger.stat.exists

  # - name: "{{ systemd_user }} | Restart Systemd to apply lingering" # Doesn't work...
  #   ansible.builtin.systemd:
  #     daemon_reexec: yes
  #   when:
  #     - not linger.stat.exists | default('no')
  #     - systemd_config.enable_linger | default('yes')

  - name: "{{ systemd_user }} | Reboot to apply linger (There has got to be a better way!)"
    ansible.builtin.reboot:
    when: linger.changed
  
  become: true
  when: 
    - systemd_user != "root"
    - systemd_config.enable_linger | default('yes')

Output of podman version:

podman version
Version:      3.0.1
API Version:  3.0.0
Go Version:   go1.15.9
Built:        Wed Dec 31 18:00:00 1969
OS/Arch:      linux/amd64

Output of podman info --debug:

podman info --debug
host:
  arch: amd64
  buildahVersion: 1.19.6
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: 'conmon: /usr/bin/conmon'
    path: /usr/bin/conmon
    version: 'conmon version 2.0.25, commit: unknown'
  cpus: 4
  distribution:
    distribution: debian
    version: "11"
  eventLogger: journald
  hostname: 2006-ct
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1006
      size: 1
    - container_id: 1
      host_id: 493216
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1006
      size: 1
    - container_id: 1
      host_id: 493216
      size: 65536
  kernel: 5.13.19-2-pve
  linkmode: dynamic
  memFree: 8145469440
  memTotal: 8589934592
  ociRuntime:
    name: crun
    package: 'crun: /usr/bin/crun'
    path: /usr/bin/crun
    version: |-
      crun version 0.17
      commit: 0e9229ae34caaebcb86f1fde18de3acaf18c6d9a
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/user/1006/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    selinuxEnabled: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: 'slirp4netns: /usr/bin/slirp4netns'
    version: |-
      slirp4netns version 1.0.1
      commit: 6a7b16babc95b6a3056b33fb45b74a6f62262dd4
      libslirp: 4.4.0
  swapFree: 0
  swapTotal: 0
  uptime: 10m 18.12s
registries:
  docker.io:
    Blocked: false
    Insecure: false
    Location: docker.io
    MirrorByDigestOnly: false
    Mirrors:
    - Insecure: false
      Location: [REDACTED]
    - Insecure: false
      Location: mirror.gcr.io
    Prefix: docker.io
  search:
  - docker.io
store:
  configFile: /home/test/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: vfs
  graphOptions: {}
  graphRoot: /home/test/.local/share/containers/storage
  graphStatus: {}
  imageStore:
    number: 0
  runRoot: /tmp/containers-user-1006/containers
  volumePath: /home/test/.local/share/containers/storage/volumes
version:
  APIVersion: 3.0.0
  Built: 0
  BuiltTime: Wed Dec 31 18:00:00 1969
  GitCommit: ""
  GoVersion: go1.15.9
  OsArch: linux/amd64
  Version: 3.0.1

Package info (e.g. output of rpm -q podman or apt list podman):

apt list podman
Listing... Done
podman/stable,now 3.0.1+dfsg1-3+b2 amd64 [installed]

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/main/troubleshooting.md)

I am using the latest version of Podman available for Debian 11, I have referenced the Podman Troubleshooting Guide.

Additional environment details (AWS, VirtualBox, physical, etc.):

Proxmox CT.

@openshift-ci openshift-ci bot added the kind/bug Categorizes issue or PR as related to a bug. label Feb 24, 2022
@mheon
Member

mheon commented Feb 24, 2022

I strongly suspect that su is not setting environment variables appropriately, so we don't know the user's actual runtime directory. I think su --login may help?
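The suspected leak can be sketched as follows; the UID 1000 path is illustrative, and `env -i` stands in for the environment scrubbing that `su --login` performs:

```shell
# A plain child shell inherits the caller's XDG_RUNTIME_DIR, while a shell
# started with a scrubbed environment (as `su --login` arranges) does not.
export XDG_RUNTIME_DIR=/run/user/1000              # illustrative value
bash -c 'echo "plain: ${XDG_RUNTIME_DIR:-unset}"'  # prints plain: /run/user/1000
env -i bash -c 'echo "login: ${XDG_RUNTIME_DIR:-unset}"'  # prints login: unset
```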

@djarbz
Author

djarbz commented Feb 24, 2022

I would normally agree with you, but it fails when logging in on the console as well and only works after a full reboot of the OS.

Here is my environment while logged into the console with podman info working after a reboot.

test@2006-ct:~$ env
SHELL=/bin/bash
PWD=/home/test
LOGNAME=test
MOTD_SHOWN=pam
HOME=/home/test
LANG=C
INVOCATION_ID=4c1509bd97034cbe9e518cccbef308f0
TERM=linux
USER=test
SHLVL=1
JOURNAL_STREAM=8:439666638
HUSHLOGIN=FALSE
PATH=/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
MAIL=/var/mail/test
_=/usr/bin/env

@Luap99
Member

Luap99 commented Feb 25, 2022

Whether loginctl enable-linger requires a reboot is not up to us; that is systemd behavior.

What I do not understand here is why podman is failing with the XDG_RUNTIME_DIR problem. In the issue description you say UID 1002, but the error shows 1006.

Can you try this with podman 3.4 or podman 4.0 and see if you can reproduce there? Maybe this is already fixed.

@djarbz
Author

djarbz commented Feb 25, 2022

You can pretty much ignore the actual UID; every time I reboot and need to start from scratch, I create a new user to test with. Where it matters, the UID does match the appropriate user.

What is the recommended way to try those versions? I see the source code is available, but is there a location where I can download prebuilt binaries?

@djarbz
Author

djarbz commented Mar 1, 2022

I am running the v4 release as provided here.

Created a new user test (UID 1006) and connected via su -; I get the same results when I log in from the console.

test@2006-ct:~$ id -a
uid=1006(test) gid=1006(test) groups=1006(test)

Ran podman info; this used to work on v3.0.1.

test@2006-ct:~$ podman info
WARN[0000] "/" is not a shared mount, this could cause issues or missing mounts with rootless containers 
Error: error opening "/etc/cni/net.d/cni.lock": permission denied
WARN[0001] Failed to add pause process to systemd sandbox cgroup: exec: "dbus-launch": executable file not found in $PATH

Enabled lingering and added the following lines to test's .bashrc.

export XDG_RUNTIME_DIR="${XDG_RUNTIME_DIR:-/run/user/$UID}"
export DBUS_SESSION_BUS_ADDRESS="${DBUS_SESSION_BUS_ADDRESS:-unix:path=${XDG_RUNTIME_DIR}/bus}"

podman info still doesn't work; the shared-mount warning is new.

test@2006-ct:~$ podman info
WARN[0000] "/" is not a shared mount, this could cause issues or missing mounts with rootless containers 
ERRO[0000] XDG_RUNTIME_DIR directory "/run/user/1006" is not owned by the current user

After a reboot I get the following; the XDG error went away just as before, but this error is new.

test@2006-ct:~$ podman info
WARN[0000] "/" is not a shared mount, this could cause issues or missing mounts with rootless containers 
Error: error opening "/etc/cni/net.d/cni.lock": permission denied

This only works as root now.

root@2006-ct:~# podman version
Client:       Podman Engine
Version:      4.0.1
API Version:  4.0.1
Go Version:   go1.17.7

Built:      Wed Dec 31 18:00:00 1969
OS/Arch:    linux/amd64

Same with this...

root@2006-ct:~# podman info --debug
host:
  arch: amd64
  buildahVersion: 1.24.1
  cgroupControllers:
  - cpuset
  - cpu
  - io
  - memory
  - hugetlb
  - pids
  - rdma
  - misc
  cgroupManager: cgroupfs
  cgroupVersion: v2
  conmon:
    package: 'conmon: /usr/bin/conmon'
    path: /usr/bin/conmon
    version: 'conmon version 2.1.0, commit: bdb4f6e56cd193d40b75ffc9725d4b74a18cb33c'
  cpus: 4
  distribution:
    codename: bullseye
    distribution: debian
    version: "11"
  eventLogger: file
  hostname: 2006-ct
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 5.13.19-2-pve
  linkmode: dynamic
  logDriver: k8s-file
  memFree: 8228069376
  memTotal: 8589934592
  networkBackend: cni
  ociRuntime:
    name: crun
    package: 'crun: /usr/bin/crun'
    path: /usr/bin/crun
    version: |-
      crun version 1.4
      commit: 3daded072ef008ef0840e8eccb0b52a7efbd165d
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_AUDIT_WRITE,CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_MKNOD,CAP_NET_BIND_SERVICE,CAP_NET_RAW,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: 'slirp4netns: /usr/bin/slirp4netns'
    version: |-
      slirp4netns version 1.0.1
      commit: 6a7b16babc95b6a3056b33fb45b74a6f62262dd4
      libslirp: 4.4.0
  swapFree: 0
  swapTotal: 0
  uptime: 2m 17.65s
plugins:
  log:
  - k8s-file
  - none
  - passthrough
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  docker.io:
    Blocked: false
    Insecure: false
    Location: docker.io
    MirrorByDigestOnly: false
    Mirrors:
    - Insecure: false
      Location: [REDACTED]
    - Insecure: false
      Location: mirror.gcr.io
    Prefix: docker.io
  search:
  - docker.io
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 12
    paused: 0
    running: 11
    stopped: 1
  graphDriverName: vfs
  graphOptions: {}
  graphRoot: /var/lib/containers/storage
  graphStatus: {}
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 4
  runRoot: /run/containers/storage
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 4.0.1
  Built: 0
  BuiltTime: Wed Dec 31 18:00:00 1969
  GitCommit: ""
  GoVersion: go1.17.7
  OsArch: linux/amd64
  Version: 4.0.1

@TomSweeneyRedHat
Member

@djarbz I think this is fixed in 4.0.2. We found a late-breaking bug due to permissions on our call that gets the RunTimeDir. Any chance you can try 4.0.2? https://github.com/containers/podman/releases/tag/v4.0.2

@djarbz
Author

djarbz commented Mar 3, 2022

I just updated.

root@2006-ct:~# podman version
Client:       Podman Engine
Version:      4.0.2
API Version:  4.0.2
Go Version:   go1.17.7

Built:      Wed Dec 31 18:00:00 1969
OS/Arch:    linux/amd64

I am still getting this error; however, I did not see the XDG error!

test@2006-ct:~$ podman info
WARN[0000] "/" is not a shared mount, this could cause issues or missing mounts with rootless containers 
Error: error opening "/etc/cni/net.d/cni.lock": permission denied

@Luap99
Member

Luap99 commented Mar 6, 2022

@djarbz see #13402 (comment) for the permission denied problem.

Since the XDG problem is fixed, I am closing this issue.

@Luap99 Luap99 closed this as completed Mar 6, 2022
@djarbz
Author

djarbz commented Mar 6, 2022

Hi @Luap99, I apologize, but it looks like this issue is not actually resolved.

I was able to fix my issue with cni.lock (#13402 (comment)).
Now the XDG error appears after enabling linger for a user, but prior to rebooting the host.

ansible@2006-ct:~$ apt list podman -a
Listing... Done
podman/unknown,now 100:4.0.2-1 amd64 [installed]

@Luap99 Luap99 reopened this Mar 6, 2022
@djarbz
Author

djarbz commented Mar 6, 2022

I just spun up a Rocky Linux LXC to test and I am experiencing the same XDG error.

OS Release

[podmanone@2008-ct ~]$ cat /etc/os-release 
NAME="Rocky Linux"
VERSION="8.4 (Green Obsidian)"
ID="rocky"
ID_LIKE="rhel centos fedora"
VERSION_ID="8.4"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Rocky Linux 8.4 (Green Obsidian)"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:rocky:rocky:8.4:GA"
HOME_URL="https://rockylinux.org/"
BUG_REPORT_URL="https://bugs.rockylinux.org/"
ROCKY_SUPPORT_PRODUCT="Rocky Linux"
ROCKY_SUPPORT_PRODUCT_VERSION="8"

Podman Version

[ansible@2008-ct ~]$ sudo podman version
Version:      3.4.2
API Version:  3.4.2
Go Version:   go1.16.12
Built:        Tue Feb  1 17:59:28 2022
OS/Arch:      linux/amd64

Podman Info

[ansible@2008-ct ~]$ sudo podman info --debug
host:
  arch: amd64
  buildahVersion: 1.23.1
  cgroupControllers:
  - cpuset
  - cpu
  - io
  - memory
  - hugetlb
  - pids
  - rdma
  - misc
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.0.32-1.module+el8.5.0+735+2f243138.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.32, commit: a9e4d8e2ed7fdb9aff1201ce3c15cb3909665586'
  cpus: 4
  distribution:
    distribution: '"rocky"'
    version: "8.4"
  eventLogger: file
  hostname: 2008-ct.[REDACTED]
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 5.13.19-2-pve
  linkmode: dynamic
  logDriver: k8s-file
  memFree: 7923744768
  memTotal: 8589934592
  ociRuntime:
    name: runc
    package: runc-1.0.3-1.module+el8.5.0+735+2f243138.x86_64
    path: /usr/bin/runc
    version: |-
      runc version 1.0.3
      spec: 1.0.2-dev
      go: go1.16.12
      libseccomp: 2.5.1
  os: linux
  remoteSocket:
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_NET_RAW,CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /bin/slirp4netns
    package: slirp4netns-1.1.8-1.module+el8.5.0+710+4c471e88.x86_64
    version: |-
      slirp4netns version 1.1.8
      commit: d361001f495417b880f20329121e3aa431a8f90f
      libslirp: 4.4.0
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.1
  swapFree: 0
  swapTotal: 0
  uptime: 26m 2.75s
plugins:
  log:
  - k8s-file
  - none
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  docker.io:
    Blocked: false
    Insecure: false
    Location: docker.io
    MirrorByDigestOnly: false
    Mirrors:
    - Insecure: false
      Location: [REDACTED]
    - Insecure: false
      Location: mirror.gcr.io
    Prefix: docker.io
  search:
  - docker.io
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 11
    paused: 0
    running: 11
    stopped: 0
  graphDriverName: vfs
  graphOptions: {}
  graphRoot: /var/lib/containers/storage
  graphStatus: {}
  imageStore:
    number: 2
  runRoot: /run/containers/storage
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 3.4.2
  Built: 1643759968
  BuiltTime: Tue Feb  1 17:59:28 2022
  GitCommit: ""
  GoVersion: go1.16.12
  OsArch: linux/amd64
  Version: 3.4.2

@Luap99
Member

Luap99 commented Mar 6, 2022

@djarbz Can you change the permissions of /run/user/$UID to 0700? That should work for now.

@djarbz
Author

djarbz commented Mar 6, 2022

@Luap99

No change unfortunately, for either Debian or Rocky.

podmanone@2006-ct:~$ stat /run/user/$UID
  File: /run/user/1002
  Size: 160             Blocks: 0          IO Block: 4096   directory
Device: 100083h/1048707d        Inode: 1           Links: 6
Access: (0700/drwx------)  Uid: ( 1002/podmanone)   Gid: ( 1002/podmanone)
Access: 2022-03-06 12:18:42.993011790 -0600
Modify: 2022-03-06 12:18:47.041040045 -0600
Change: 2022-03-06 13:48:03.648524307 -0600
 Birth: -

@djarbz djarbz closed this as completed Mar 6, 2022
@djarbz djarbz reopened this Mar 6, 2022
@giuseppe
Member

this is fixed with containers/common#947

@djarbz
Author

djarbz commented Mar 22, 2022

Thanks @giuseppe
Is there a way to confirm that my version of containers-common has the fix?
I am using the Alvistack repo to get the latest packages for Debian and I see it was last compiled today, but I am still experiencing the error.

@giuseppe
Member

the change must be vendored in Podman, so you'll need a newer Podman

@sarim

sarim commented Apr 20, 2022

Umm @djarbz, did your problem get fixed with the new version? I don't think the referenced commit will actually fix the problem: the directory permission is already 700, so I don't see how the proposed fix would solve it.

The check st.Mode().Perm() == 0700 actually returns true on my machine, so the problem is probably elsewhere.
I wrote this snippet to test it:

	fmt.Println("hello world")
	runtimeDir := os.Getenv("XDG_RUNTIME_DIR")
	if runtimeDir != "" {
		st, err := os.Stat(runtimeDir)
		if err != nil {
			panic(err)
		}
		fmt.Println(st.Mode().Perm() == 0700)
	}

Running it from ~/golangUidTest prints:

hello world
true

This returns true.

Below is my output from bash, (forgive my custom prompt), my uid is 1000 and /run/user/1000 is actually owned by me and permission is 700.

~ ➤ stat /run/user/1000/
  File: /run/user/1000/
  Size: 200             Blocks: 0          IO Block: 4096   directory
Device: 3bh/59d Inode: 1           Links: 8
Access: (0700/drwx------)  Uid: ( 1000/   gittu)   Gid: ( 1000/   gittu)
Access: 2022-04-20 21:48:48.774587922 +0600
Modify: 2022-04-20 21:44:25.954585972 +0600
Change: 2022-04-20 21:44:25.954585972 +0600
 Birth: -
↪ ~ ➤ podman version
ERRO[0000] XDG_RUNTIME_DIR directory "/run/user/1000" is not owned by the current user
↪ ~ ➤ whoami
gittu
↪ ~ ➤ id -u
1000

@djarbz
Author

djarbz commented Apr 20, 2022

Hi @sarim,

You are correct; I do not believe this corrected the issue.
I asked the maintainer of the repo I mentioned previously to incorporate this fix into his repo, and I still have the issue.

@Luap99 Luap99 reopened this Apr 21, 2022
@Luap99
Member

Luap99 commented Apr 21, 2022

Can you run with --log-level debug?
The error is happening here, but I do not understand how this condition triggers, since the permissions match in your case: https://github.com/containers/common/blob/112a47964ddbe816349ea412c0073902e805f943/pkg/util/util_supported.go#L41-L42

@djarbz
Author

djarbz commented Apr 22, 2022

Ok, I just rebuilt my test machine.

podmanone@2006-ct:~$ ls -lah /run/user/
total 0
drwxr-xr-x  4 root      root       80 Apr 22 14:33 .
drwxr-xr-x 22 root      root      840 Apr 22 14:32 ..
drwx------  4 ansible   ansible   120 Apr 22 14:31 1000
drwx------  6 podmanone podmanone 160 Apr 22 14:33 1002
podmanone@2006-ct:~$ podman info --log-level debug
INFO[0000] podman filtering at log level debug          
DEBU[0000] Called info.PersistentPreRunE(podman info --log-level debug) 
DEBU[0000] Merged system config "/usr/share/containers/containers.conf" 
DEBU[0000] Merged system config "/etc/containers/containers.conf" 
DEBU[0000] environment variable PATH is already defined, skip the settings from containers.conf 
DEBU[0000] environment variable TERM is already defined, skip the settings from containers.conf 
DEBU[0000] Using conmon: "/usr/bin/conmon"              
DEBU[0000] Initializing boltdb state at /home/podmanone/.local/share/containers/storage/libpod/bolt_state.db 
DEBU[0000] Overriding run root "/run/user/1002/containers" with "/tmp/containers-user-1002/containers" from database 
DEBU[0000] Overriding tmp dir "/run/user/1002/libpod/tmp" with "/tmp/podman-run-1002/libpod/tmp" from database 
DEBU[0000] systemd-logind: Unknown object '/'.          
DEBU[0000] Using graph driver vfs                       
DEBU[0000] Using graph root /home/podmanone/.local/share/containers/storage 
DEBU[0000] Using run root /tmp/containers-user-1002/containers 
DEBU[0000] Using static dir /home/podmanone/.local/share/containers/storage/libpod 
DEBU[0000] Using tmp dir /tmp/podman-run-1002/libpod/tmp 
DEBU[0000] Using volume path /home/podmanone/.local/share/containers/storage/volumes 
DEBU[0000] Set libpod namespace to ""                   
DEBU[0000] Not configuring container store              
DEBU[0000] Initializing event backend file              
DEBU[0000] Configured OCI runtime kata initialization failed: no valid executable found for OCI runtime kata: invalid argument 
DEBU[0000] Configured OCI runtime runsc initialization failed: no valid executable found for OCI runtime runsc: invalid argument 
DEBU[0000] Configured OCI runtime krun initialization failed: no valid executable found for OCI runtime krun: invalid argument 
DEBU[0000] Using OCI runtime "/usr/bin/crun"            
INFO[0000] Setting parallel job count to 13             
ERRO[0000] XDG_RUNTIME_DIR directory "/run/user/1002" is not owned by the current user

@djarbz
Author

djarbz commented Apr 24, 2022

Here is an interesting turn of events.

I modified the function linked in util_supported.go and ran it as an isolated test.
As you can see from the output below, my function does not produce the error, but Podman does.

podmanone@2006-ct:~/gotest$ go run main.go
podmanone@2006-ct:~/gotest$ podman info
ERRO[0000] XDG_RUNTIME_DIR directory "/run/user/1002" is not owned by the current user

main.go

package main

import (
  "fmt"
  "os"
  "syscall"
)

func main() {
  runtimeDir := os.Getenv("XDG_RUNTIME_DIR")
  if runtimeDir != "" {
    st, err := os.Stat(runtimeDir)
    if err != nil {
      fmt.Printf("Could not stat (%s): %v", runtimeDir, err)
      os.Exit(1)
    }
    if int(st.Sys().(*syscall.Stat_t).Uid) != os.Geteuid() {
      fmt.Printf("XDG_RUNTIME_DIR directory %q is not owned by the current user", runtimeDir)
      os.Exit(2)
    }
  }
}

@sarim

sarim commented Apr 24, 2022

As you can see from the output below, my function does not produce the error, but Podman does

Yes, the same behavior is seen in my snippet in the previous post too.

(Un?)fortunately, I couldn't reproduce the error on my system after booting the PC up the next day. I messed with podman versions, upgraded to 4.0.3 and then back down to 3.4.2 a couple of times. Not sure what fixed the problem :/

I guess the next step would be to debug-run a podman compiled from source, with breakpoints...

@djarbz
Author

djarbz commented Apr 24, 2022

I have found that after a reboot everything works fine, but I don't know what changes.
IMHO, it is unacceptable to require a reboot to add a new rootless user, as that would affect other users on the server.

Running a debug build of Podman is a bit beyond my expertise.

@rhatdan
Member

rhatdan commented Apr 25, 2022

The problem is that you are leaking the XDG_RUNTIME_DIR environment variable from one user to another.

When you su from one user to another, the environment follows you, and this confuses Podman.

This is fixed in podman 4.0.4, so I am closing. Please create a new issue if you find one that is easily reproduced.

@rhatdan rhatdan closed this as completed Apr 25, 2022
@djarbz
Author

djarbz commented Apr 26, 2022

Hi @rhatdan, I log into the console as the lingered user and still have this issue. I do not believe it is related to environment leaking.

Code to test:

package main

import (
  "fmt"
  "os"
  "syscall"
)

func main() {
  runtimeDir := os.Getenv("XDG_RUNTIME_DIR")
  fmt.Printf("XDG_RUNTIME_DIR: %s\n", runtimeDir)
  if runtimeDir != "" {
    st, err := os.Stat(runtimeDir)
    if err != nil {
      fmt.Printf("Could not stat (%s): %v", runtimeDir, err)
      os.Exit(1)
    }
    if int(st.Sys().(*syscall.Stat_t).Uid) != os.Geteuid() {
      fmt.Printf("XDG_RUNTIME_DIR directory %q is not owned by the current user", runtimeDir)
      os.Exit(2)
    }
    stat := st.Sys().(*syscall.Stat_t)
    fmt.Printf("Dir: (%s) Owner: (%d) Group: (%d)\n", st.Name(), stat.Uid, stat.Gid)
  }
}

Output via console direct login:

podmanone@2006-ct:~/gotest$ id && go run main.go
uid=1002(podmanone) gid=1002(podmanone) groups=1002(podmanone)
XDG_RUNTIME_DIR: /run/user/1002
Dir: (1002) Owner: (1002) Group: (1002)

Output via sudo su - podmanone:

podmanone@2006-ct:~/gotest$ id && go run main.go 
uid=1002(podmanone) gid=1002(podmanone) groups=1002(podmanone)
XDG_RUNTIME_DIR: /run/user/1002
Dir: (1002) Owner: (1002) Group: (1002)

@rhatdan
Member

rhatdan commented Apr 26, 2022

Please open a brand new issue then.

@manojchander6

I created the folder manually and exported XDG_RUNTIME_DIR, and it worked. Not sure if it's the right approach.

mkdir -p /tmp/$USER-runtime
export XDG_RUNTIME_DIR=/tmp/$USER-runtime

podman info should now work without any exceptions.
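One caveat worth noting: the checks discussed above require the runtime dir to be owned by the user and, in newer versions, to have mode 0700, so it is safer to set the mode explicitly. A sketch of the same workaround:

```shell
# Same workaround, with the mode tightened to 0700, since podman expects
# the runtime dir to be private to (and owned by) the current user.
mkdir -p "/tmp/$USER-runtime"
chmod 0700 "/tmp/$USER-runtime"
export XDG_RUNTIME_DIR="/tmp/$USER-runtime"
```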

@istiak101

https://access.redhat.com/solutions/4661741

@Frontesque

I created the folder manually and exported XDG_RUNTIME_DIR, and it worked. Not sure if it's the right approach.

mkdir -p /tmp/$USER-runtime
export XDG_RUNTIME_DIR=/tmp/$USER-runtime

podman info should now work without any exceptions.

This worked for me perfectly, but another quirk I noticed is that podman commands have to be run from the user's home directory. Not sure of the cause, but that's fine with me.

@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 1, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 1, 2023