Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add --ready-fd to signal readiness #7904

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

anarcat
Copy link

@anarcat anarcat commented Jan 9, 2024

This adds a new commandline option, --ready-fd, to send a notification when sway is ready to run.

This allows a process supervisor to notice sway's startup is complete which allows, for example, the session manager to start Wayland programs in the right order. Without this signal, users have to go through horrible hacks to order services.

For example, I've been using NotifyAccess=all in the sway.service file and exec systemd-notify --ready in my sway config to emulate this. Problem is it's racy and error-prone.

A particularly nasty bug triggered by NotifyAccess=all is when podman starts and then terminates a container. In that context, conmon(8) ends up notifying systemd it's the session master and takes over thee MainPID field in systemd. When it dies, systemd believes the session is over and proceeds to kill the entire session. This is explained in more details in:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1039857

This might not actually be the right place to do this in the sway startup sequence, that said. We call server_run right after, and maybe somewhere below that would be a better place. But server_run only calls wl_display_run and that's part of the core wayland library, so that seems a little too far down. I'm not sure Wayland itself is a place to do this, so for now I'm scratching my own itch and doing this in Sway itself.

Note that this approach was taken instead of using the proper sd_notify library call, as that approach was refused in #7659. The --ready-fd approach was accepted in swaywm/swaylock#281 so it is hoped it will be seen as acceptable here.

An alternative implementation would be to instead check the NOTIFY_SOCKET environment variable and use that, if present. That variable is used by systemd and at least the s6 supervisor to receive readiness notifications, so it might be less disruptive.

This adds a new commandline option, --ready-fd, to send a notification
when sway is ready to run.

This allows a process supervisor to notice sway's startup is complete
which allows, for example, the session manager to start Wayland
programs in the right order. Without this signal, users have to go
through horrible hacks to order services.

For example, I've been using `NotifyAccess=all` in the `sway.service`
file and `exec systemd-notify --ready` in my sway config to emulate
this. Problem is it's racy and error-prone.

A particularly nasty bug triggered by `NotifyAccess=all` is when
`podman` starts and then terminates a container. In that context,
`conmon(8)` ends up notifying systemd it's the session master and
takes over thee `MainPID` field in systemd. When it dies, systemd
believes the session is over and proceeds to kill the entire
session. This is explained in more details in:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1039857

This might not actually be the right place to do this in the sway
startup sequence, that said. We call `server_run` right after, and
maybe somewhere below that would be a better place. But `server_run`
only calls `wl_display_run` and that's part of the core wayland
library, so that seems a little too far down. I'm not sure Wayland
itself is a place to do this, so for now I'm scratching my own itch
and doing this in Sway itself.

Note that this approach was taken instead of using the proper
`sd_notify` library call, as that approach was refused in swaywm#7659. The
`--ready-fd` approach was accepted in swaywm/swaylock#281 so it is
hoped it will be seen as acceptable here.

An alternative implementation would be to instead check the
`NOTIFY_SOCKET` environment variable and use that, if present. That
variable is used by systemd and at least the s6 supervisor to receive
readiness notifications, so it might be less disruptive.
@danieldg
Copy link
Contributor

If I were integrating this into my setup scripts, I would want to have run systemctl import-environment to pull in SWAYSOCK and WAYLAND_DISPLAY before marking sway as ready, which is best done via a script that sway execs from the config; this location is still a bit early for that to work (and due to the fact that exec in sway is asynchronous, there's actually no way to do this from within sway).

If other things are messing with the notify socket that sway hands them, then you should remove that environment variable from things you exec, perhaps by running them with systemd-run so they are in their own control groups and not sway's. That also has advantages if you want to adjust process priorities or do other cgroup things.

Another solution would be to add an unset-env command to sway so you can remove NOTIFY_SOCKET after using it, but there are other ways to solve that: for example, have the systemd unit run a wrapper that unsets NOTIFY_SOCKET and exports it as SWAY_STARTUP_NOTIFY_SOCKET, then the post-startup exec script can just read that to send the notify instead (and other processes like conmon won't look at that variable).

@YHNdnzj
Copy link

YHNdnzj commented Mar 3, 2024

A particularly nasty bug triggered by NotifyAccess=all is when podman starts and then terminates a container. In that context, conmon(8) ends up notifying systemd it's the session master and takes over thee MainPID field in systemd. When it dies, systemd believes the session is over and proceeds to kill the entire session. This is explained in more details in:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1039857

You can close down the notify access after signaling readiness, no? (systemd/systemd#26214)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants