[241] "No such process" when starting user instance with hidepid=2 and cgroupv2 #12955

Closed
yadutaf opened this issue Jul 4, 2019 · 19 comments

Comments

@yadutaf
Contributor

yadutaf commented Jul 4, 2019

systemd version the issue has been seen with

241

Used distribution

Yocto, with an additional "-Ddefault-hierarchy=unified" build flag

Expected behaviour you didn't see

user@xxxx.service should start

Unexpected behaviour you saw

user@xxxx.service is in state failed with the following logs

root@machine:~# systemctl status user@xxxx.service
● user@xxxx.service - User Manager for UID xxxx
   Loaded: loaded (/lib/systemd/system/user@.service; static; vendor preset: enabled)
   Active: failed (Result: protocol) since Thu 2019-07-04 09:00:23 UTC; 2min 28s ago
     Docs: man:user@.service(5)
  Process: 1433 ExecStart=/lib/systemd/systemd --user (code=exited, status=1/FAILURE)
 Main PID: 1433 (code=exited, status=1/FAILURE)
      CPU: 5ms

Jul 04 09:00:23 machine systemd[1]: Starting User Manager for UID xxxx...
Jul 04 09:00:23 machine systemd[1433]: Failed to determine supported controllers: No such process
Jul 04 09:00:23 machine systemd[1433]: Failed to allocate manager object: No such process
Jul 04 09:00:23 machine systemd[1]: user@xxxx.service: Failed with result 'protocol'.
Jul 04 09:00:23 machine systemd[1]: Failed to start User Manager for UID xxxx.
Jul 04 09:00:23 machine systemd[1]: user@xxxx.service: Consumed 5ms CPU time.

Steps to reproduce the problem

  1. mount -oremount,gid=4,hidepid=2 /proc
  2. /bin/loginctl enable-linger user-xxxx

We are using the "unified" hierarchy, not sure whether it makes any difference
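Before reproducing, it may help to confirm how /proc is actually mounted. The following is a minimal sketch, assuming a POSIX-like shell and a readable /proc/mounts; the helper name `hidepid_of` is made up for illustration. Note that newer kernels may print the value symbolically (for example `invisible` instead of `2`).

```shell
# Print the hidepid= value found in a comma-separated mount option string.
# Prints "0" when the option is absent (hidepid=0 is the kernel default).
hidepid_of() {
    local opts="$1" opt
    local IFS=','
    for opt in $opts; do
        case "$opt" in
            hidepid=*) printf '%s\n' "${opt#hidepid=}"; return 0 ;;
        esac
    done
    printf '0\n'
}

# The mount options are the 4th field of the /proc line in /proc/mounts.
proc_opts=$(awk '$2 == "/proc" && $3 == "proc" { print $4; exit }' /proc/mounts 2>/dev/null)
echo "hidepid on /proc: $(hidepid_of "$proc_opts")"
```

After step 1 of the reproduction, this would be expected to report a non-zero value.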

Workaround

This can be worked-around by adding SupplementaryGroups=4 to

(Original hint from https://github.com/pld-linux/systemd/blob/master/proc-hidepid.patch)

But in this case, all user processes created by the systemd user instance have supplementary group 4, which reduces the usefulness of the 'hidepid=2' feature. It does not, however, affect sessions opened via SSH, so there is still some benefit :)

Is there any way to avoid propagating the supplementary group to the child processes, or is there a way to avoid needing the patch at least on user@.service (which, I believe, would 'fix' the 'leak')? I'd naively imagine that the user instance could query the system instance for the info it needs :)

Thanks,

@poettering
Member

Sorry, but hidepid= is simply not supported on systemd instances. We require the ability to identify remote processes, and most of our services run at minimal privileges, which means this cannot work. Sorry.

There has been work on making hidepid= a true mount option (i.e. an option of the mount itself instead of the pidns), but it stalled. If we ever get that we could have a per-service hidepid= which would make things a lot more useful.

But anyway, sorry. hidepid= the way it is is not compatible with systemd, and that's not going to change.

@poettering
Member

The patch set I was talking of is this:

https://lwn.net/Articles/738597/

If you are interested in this, please work with the kernel folks, resurrect the patch set, and try to make it work.

@yadutaf
Contributor Author

yadutaf commented Jul 5, 2019

Thanks for your quick reply !

I understand that systemd needs to see all PIDs on the system, and that's fine. Maybe there could be a way to start the systemd user instance with CAP_SETGID and instruct it to drop the additional group (4 in my case), either when it no longer needs it (if such a point exists) or before exec-ing a non-systemd binary.

Would that make sense?

@poettering
Member

It's not just systemd, it's a lot of other stuff we require. Polkit/dbus for example and so on...

I think the clean fix would be to fix the kernel to make hidepid not a system-wide setting, see above. Instead of patching around in systemd, just fix this properly by patching around in the kernel.

@yadutaf
Contributor Author

yadutaf commented Jul 18, 2019

I was able to hack something which works in my use case with:

  1. A hackish patch on the kernel (4.9)
diff --git a/fs/proc/base.c b/fs/proc/base.c
index ae6d4aa65c8f..11163b7bbe4d 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -675,11 +675,14 @@ int proc_setattr(struct dentry *dentry, struct iattr *attr)
 /*
  * May current process learn task's sched/cmdline info (for hide_pid_min=1)
  * or euid/egid (for hide_pid_min=2)?
+ * jean-tiare: Hack for systemd user instance: Always allow PID 1.
  */
 static bool has_pid_permissions(struct pid_namespace *pid,
 				 struct task_struct *task,
 				 int hide_pid_min)
 {
+	if (task->pid == 1)
+		return true;
 	if (pid->hide_pid < hide_pid_min)
 		return true;
 	if (in_group_p(pid->pid_gid))
  2. A drop-in for systemd-logind.service
# Grant full access to all PIDs to systemd-logind
[Service]
SupplementaryGroups=4

With both patches, the full test suite passes on my side. Systemd user instances work fine, and the started processes only have access to their own PIDs and PID 1. This may be missing a lot of (corner?) cases, but it suggests that 'hidepid' support in systemd is not very far off and might not require a rework of /proc. I'm not sure whether proposing a new mount flag like "showpid1" for procfs would be acceptable on the kernel side, though.

komachi added a commit to komachi/ansible-decent-desktop that referenced this issue Jan 31, 2021
fix broken privoxy config
fix gsimplecal locale
install gantsign.keyboard from galaxy, remove submodule
disable hidepid=2 as it breaks systemd user units systemd/systemd#12955
rename decent_keyring to remote_keyring
remove empty logrotate role
remove tempdirs after using
update README
@Zocker1999NET
Contributor

Zocker1999NET commented Mar 17, 2021

I fixed it on Linux 5.10 without patching the kernel, just by creating a special group, proc, whose members can access all process information, using the following line in /etc/fstab with gid=proc:

proc /proc proc nosuid,nodev,noexec,hidepid=2,gid=proc 0 0

Then, to allow systemd-logind to access all processes, add a drop-in config for systemd-logind.service that adds the supplementary group proc to systemd-logind:

[Service]
SupplementaryGroups=proc

After restarting the computer (or restarting systemd-logind), everything worked as it did before enabling hidepid=2, as far as I could observe.

@4oo4

4oo4 commented Aug 30, 2021

@Zocker1999NET Thanks for that workaround, could you elaborate a bit on how you did the permissions for the proc group? I tried something like this in addition to the gid=proc mount option, and wasn't able to create my user session afterward:

groupadd proc
usermod -a -G root proc
usermod -a -G proc Debian-gdm

Cheers

@Zocker1999NET
Contributor

@4oo4 I just discovered that I also added the supplementary group to user@.service, which might also be important … As I remember it, while using SDDM and KDE Plasma, I could get into a working X11 user session, but services depending on dbus and other user services were broken. Maybe this solves your problem? Otherwise, could you post some relevant logs from journalctl?
Processes launched as root should still be allowed to access all processes without adding this additional group to them.

To clarify what I did exactly: I only created the group itself, changed/added the proc mount line, and added the supplementary group to both services. I never touched a group/user called Debian-gdm, but I don't use GDM as display manager; I use the KDE Plasma suite with SDDM. I implemented this in Ansible, see here. This should roughly translate to executing the following commands in bash/zsh:

groupadd --system proc # --system primarily influences the range for choosing a GID, this might be however important, idk
echo "proc /proc proc nosuid,nodev,noexec,hidepid=2,gid=proc 0 0" >> /etc/fstab # assuming proc mount line is missing from fstab
for d in /etc/systemd/system/{systemd-logind,user@}.service.d; do
  mkdir -p "$d"
  cat > "$d"/proc_hidepid_whitelist.conf <<EOF
[Service]
SupplementaryGroups=proc
EOF
done
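
To check that the drop-ins took effect, one option is to look at the "Groups:" line in /proc/<pid>/status of the running service. This is only a rough sketch: the helper name `status_has_gid` is hypothetical, and it assumes the group is called proc as in the commands above. Run it as root, since hidepid=2 hides the process from other users.

```shell
# Succeed if the numeric GID $1 appears on the "Groups:" line of a
# /proc/<pid>/status document read from stdin.
status_has_gid() {
    awk -v gid="$1" '
        $1 == "Groups:" { for (i = 2; i <= NF; i++) if ($i == gid) found = 1 }
        END { exit !found }'
}

# Example: inspect the running systemd-logind, if there is one.
proc_gid=$(getent group proc 2>/dev/null | cut -d: -f3)
logind_pid=$(systemctl show -p MainPID --value systemd-logind.service 2>/dev/null)
if [ -n "$proc_gid" ] && [ "${logind_pid:-0}" -gt 0 ] 2>/dev/null; then
    if status_has_gid "$proc_gid" < "/proc/$logind_pid/status"; then
        echo "systemd-logind runs with supplementary group proc"
    else
        echo "systemd-logind does NOT have group proc; restart it or reboot"
    fi
fi
```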

@poettering
Member

Note that recent kernels allow per-namespace hidepid, and we expose this through ProtectProc= and turned it on for all our long-running services.

This is a much nicer solution since turning this on is opt-in per-service and making use of it can be done generically for all systems instead of just doing local hacks. And turning this on per-service won't negatively impact the rest of the system.
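
For readers who want to try the per-service option, here is a minimal sketch of writing such a drop-in for a system service. The function name `write_protectproc_dropin`, the file name protect-proc.conf, and the unit example.service are all made up for illustration. As far as I know, ProtectProc= needs systemd 247 or later on a kernel with per-mount hidepid (5.8 or later), and systemd.exec(5) documents the values noaccess, invisible, ptraceable and default.

```shell
# Write a drop-in that sets ProtectProc= for one system unit.
# $1: unit name, $2: ProtectProc= mode (default: invisible),
# $3: configuration root (default: /etc/systemd/system)
write_protectproc_dropin() {
    unit="$1"; mode="${2:-invisible}"; root="${3:-/etc/systemd/system}"
    dir="$root/$unit.d"
    mkdir -p "$dir" || return 1
    cat > "$dir/protect-proc.conf" <<EOF
[Service]
ProtectProc=$mode
EOF
}

# Usage (as root), followed by a reload and a restart of the unit:
#   write_protectproc_dropin example.service invisible
#   systemctl daemon-reload && systemctl restart example.service
```

The optional third argument only exists so the function can be pointed at a scratch directory while experimenting.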

@4oo4

4oo4 commented Sep 1, 2021

@Zocker1999NET Even though we're using different desktop environments, that sounds extremely similar to my issue. I couldn't properly start a wayland desktop session because of issues with dbus and the Debian-gdm service not starting properly, so it would instead start a partially-working X11 session.

@poettering Good to know, so that means that the patch set you were referring to finally got merged?

Thanks

@poettering
Member

@poettering Good to know, so that means that the patch set you were referring to finally got merged?

Yes.

@ghost

ghost commented Sep 7, 2021

Note that recent kernels allow per-namespace hidepid, and we expose this through ProtectProc= and turned it on for all our long-running services.

This is a much nicer solution since turning this on is opt-in per-service and making use of it can be done generically for all systems instead of just doing local hacks. And turning this on per-service won't negatively impact the rest of the system.

This is not a complete solution. Users may run one-off commands that contain credentials in the command line or environment, which would be left unprotected.
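
To illustrate the exposure in question: without hidepid, any local user can read another process's arguments from /proc/<pid>/cmdline. A small sketch (the helper name `cmdline_of` is hypothetical) that prints a command line the way other processes would see it:

```shell
# Print the command line of PID $1 as exposed through /proc.
# Arguments are NUL-separated in the file; show them space-separated.
cmdline_of() {
    tr '\0' ' ' < "/proc/$1/cmdline" | sed 's/ $//'
}

# Demo on the current shell itself, which is always readable.
echo "this shell as seen through /proc: $(cmdline_of "$$")"
```

Any secret passed as an argument to a one-off command would show up in the same way for as long as that command runs.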

@MichaelHierweck

@poettering I'm unsure whether

  • to mount proc with hidepid=invisible using Linux 5.10+ and a recent systemd version (which?) or
  • to avoid the hidepid mount option and to use Linux 5.10+ and a recent systemd version (which?) and use ProtectProc=. How could ProtectProc= be applied to all "normal user sessions" (cron, ssh, ...) then?

@jplitza

jplitza commented Oct 29, 2021

@poettering How does having this as a service option help? To avoid users seeing each other's processes, I have to set ProtectProc on user@.service, too. Otherwise, the user could elevate its privileges by starting custom systemd user units.

The only real workaround I've found so far is disabling the unified cgroup hierarchy by setting the boot option systemd.unified_cgroup_hierarchy=0, which starts the user instance successfully but causes this message in the journal:

-.slice: Failed to migrate controller cgroups from /user.slice/user-1000.slice/user@1000.service, ignoring: Permission denied
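
To see which hierarchy a machine actually booted with, the filesystem type of /sys/fs/cgroup can be checked. A small sketch, assuming a `stat` that supports `-f -c %T` (GNU coreutils does); the function name `cgroup_mode` is made up for illustration:

```shell
# Map the filesystem type of /sys/fs/cgroup to a hierarchy name.
cgroup_mode() {
    case "$1" in
        cgroup2fs) echo "unified" ;;
        tmpfs)     echo "legacy or hybrid" ;;
        *)         echo "unknown" ;;
    esac
}

fstype=$(stat -fc %T /sys/fs/cgroup 2>/dev/null)
echo "cgroup hierarchy: $(cgroup_mode "$fstype")"
```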

@randomstuff

randomstuff commented Jan 27, 2023

Users may run one-off commands that contain credentials in the command line or environment, which would be left unprotected.

Actually, as far as I understand, credential exposure might happen under the hood in a normal desktop (no command-line) usage:

  • Clicking on a password reset link from a mail client will end up calling "$BROWSER $URI" and expose the password reset link which might be used to take control of the user account.
  • An application using an external browser to handle an OAuth authorization flow will expose the OAuth authentication request URI. This might be used to trigger a CSRF attack in order to inject an unexpected authentication code / access token / ID token into the user's session.

@AkechiShiro

AkechiShiro commented Sep 8, 2023

Is there still no real consensus or official solution for this issue? Just asking, as I hit it recently.

@AkechiShiro

@Zocker1999NET I believe I ran all your workaround steps and added my user to the proc group, but I haven't been able to solve the issue. Perhaps I should try the ProtectProc= option that @poettering mentioned; that could work better in my case:
[image]

@AkechiShiro

Actually, a reboot fixed it, it seems, but reloading systemd-logind and logging back in again did not. Thanks for the workaround, @Zocker1999NET, and sorry for the ping.

@mamekoro

mamekoro commented Dec 2, 2024

Sorry to wake up an old issue, but I'd like to provide new information.

I encountered an issue where Wayland and GNOME Shell boot into a black screen when procfs is mounted with hidepid=2 on Ubuntu 24.04.

I ran journalctl in recovery mode and found PID-related errors, so I created a drop-in file for org.gnome.Shell@wayland.service and tried using [Service] ProtectProc=default and [Service] SupplementaryGroups=proc as mentioned in previous posts. However, it didn't work because this is a user service, not a system service.

Instead, I tried the following command:

sudo setcap cap_sys_ptrace=ep /usr/bin/gnome-shell

After rebooting, Wayland and GNOME Shell successfully started. echo $XDG_SESSION_TYPE displays wayland (i.e., no fallback to X11), and ps aux displays only the processes owned by the current user. These are the expected behaviors.
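
One way to double-check that the file capability is actually in effect is to read the CapEff: mask of the running process. This is a rough sketch: the helper name `has_cap_sys_ptrace` is hypothetical, and it relies on CAP_SYS_PTRACE being capability number 19 in linux/capability.h. Keep in mind that CAP_SYS_PTRACE is a broad capability, so granting it to a large program is a trade-off.

```shell
# Succeed if bit 19 (CAP_SYS_PTRACE) is set in the hexadecimal
# capability mask $1, as printed on the CapEff: line of /proc/<pid>/status.
has_cap_sys_ptrace() {
    [ $(( (0x$1 >> 19) & 1 )) -eq 1 ]
}

# Inspect a running gnome-shell, if any (pgrep comes from procps).
pid=$(pgrep -o -x gnome-shell 2>/dev/null)
if [ -n "$pid" ]; then
    mask=$(awk '$1 == "CapEff:" { print $2 }' "/proc/$pid/status")
    if has_cap_sys_ptrace "$mask"; then
        echo "gnome-shell (PID $pid) holds CAP_SYS_PTRACE"
    else
        echo "gnome-shell (PID $pid) does not hold CAP_SYS_PTRACE"
    fi
fi
```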

I'm not familiar with setcap and Linux capabilities, and I'm not sure if this method is correct. But I'm posting this in the hope that it may help someone.

Please let me know if you have any concerns about using this method.
