-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
signal readiness to sd-daemon #7659
Conversation
This allows systemd to notice sway's startup is complete which allows, for example, the session manager to start Wayland programs in the right order. Without this signal, users have to go through rather horrible hacks to tell systemd that it can start further units. I've been using `NotifyAccess=all` in the sway.service file and `exec systemd-notify --ready` in my sway config to emulate this, but it's racy and error-prone. A particularly nasty bug triggered by `NotifyAccess=all` in particular is when podman starts and then terminates a container. In that context, conmon(8) ends up notifying systemd it's the session master and takes over thee "Main PID" field in systemd. When it dies, systemd believes the session is over and proceeds to kill the entire session. This is explained in more details in: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1039857 This might not actually be the right place to do this. We call `server_run` right after, and maybe there would be a better place. But `server_run` only calls `wl_display_run` and that's part of the core `wayland` library. I'm not sure Wayland itself is a place to do this, so for now I'm scratching my own itch and doing this in Sway itself.
|
NACK, sorry. We will not add init-specific logic. You can achieve the same result with an |
Not quite. To have this work with an I haven't proposed this in jest: there's already systemd-specific logic in sway, specifically in swaybar, which can already link to systemd. I don't want to bug you with this at all, I just wanted to let you know I did try to propose this very carefully and it seemed like a rather modest (one-line) change... |
Then you need to wrap Sway with logic to pass the ready FD differently to the
It's not systemd-specific. swaybar uses only the sd-bus library which can be provided by e.g. basu. |
I'm not sure I understand what you mean here. systemd just does not allow other processes to do
I understand what you mean here, but it seems to me basu could easily be patched to support But I understand there's probably no other service manager out there that supports receiving those messages. I don't see this as a permanent limitation though... Anyway, I don't want to drag you down this annoying systemd lane if you don't want to. :) |
|
systemd uses graphical-session.target to order units after the display server. If you want to run sway as a user service you will need to start that target anyway. |
|
On 2023-07-03 11:57:16, Ronan Pigott wrote:
systemd uses graphical-session.target to order units after the display server. If you want to run sway as a user service you will need to start that target anyway.
Not necessarily. `graphical-session.target` is a system-level systemd
unit. I'm talking about running a `systemd --user` unit here.
Interestingly, this makes `systemd` explicitly *not* an init system in
this context, as we're not talking about having systemd run as PID 1
here...
|
|
You're thinking of graphical.target. graphical-session.target is indeed a user service for exactly this purpose. |
|
well i guess there's many ways to do this. the way i set it up here is that sway.service but this is getting wildly out of topic here. if the sway devs don't want this, i'll just go play elsewhere. :) |
That's incorrect, and your suggested patch doesn't even fix it. A valid DISPLAY or WAYLAND_DISPLAY in the user manager startup environment is an implicit assumption of graphical-session.target, which again, is the special target to "start Wayland programs in the right order" as you say. You need to manually import these vars from sway with If you wanted sway to handle this for you, you would need to do the SetEnvironment dbus calls within sway. Which also will not be accepted. |
|
On 2023-07-03 12:30:46, Ronan Pigott wrote:
[...]
If you wanted sway to handle this for you, you would need to do the SetEnvironment dbus calls within sway. Which also will not be accepted.
Yes. I am aware of that as well. That's something that can actually be
done with an `exec`, and safely, so I figured I could keep this a local,
configuration-specific change.
I have purposefully set this patch to be as minimal as possible in order
for it to be accepted, precisely because I knew more extensive changes
wouldn't be accepted.
If you're interested, I've actually tried to document all the tweaks I
had to do to make this work completely in:
https://anarc.at/software/desktop/wayland/#systemd-integration
the sd_notify stuff is only a rather tiny part of it, but everything
else can be done without patching Sway.
I find that this patch is an elegant one because it only keeps itself to
do the notification on the socket, which can be useful even if sway
isn't the actual session, as other things can depend on it safely again.
|
Not with your stated setup it can't. If graphical-session starts before the display variables are imported it is incorrect from an ordering standpoint, which is what this (incorrect) patch is trying to address. As is, it is possible to get the ordering correct with a target unit. With your patch, the order is always incorrect. |
|
I might very well be incorrect, but as far as I can tell, this works. I'm actually using this patch in my session now. I also believe the ordering is correct. I am not sure when the But again, I don't want to drag you fine folks too much into the systemd ordering world. There's a clear desire to have nothing to do at all with init systems in general and, it seems to me, with As you have stated, nothing here regarding systemd will get merged, even if it would be correctly implemented, so it seems rather futile for me to have to justify my reasoning when it's going to be shot down ad hominem anyway... |
|
Sway guarantees that WAYLAND_DISPLAY/DISPLAY are present in the environment for exec, however services ordered after this ready notify cannot be guaranteed the presence of DISPLAY/WAYLAND_DISPLAY in the systemd user manager activation environment. If either
sway does not wait on exec'd processes and systemd parallelizes unit transactions aggressively, so this is still possible. To guarantee correctness, you need a target unit that is ordered after the completion of import-environment, and systemd already includes a special target you can bind to, graphical-session.target, which is used for this purpose in many unit files. That is why this patch does not improve upon the status quo. Fortunately, I don't think there is any problem with using graphical-session.target, so it is not actually necessary to fix anything here.
There's no ad hominem, I'm just trying to explain what's wrong. Believe it or not I am plenty active on systemd too, and were we discussing this change on systemd's tracker I would have pointed out the same issue. |
I meant wrap Sway with a small script which copies the
That's out-of-scope. basu is purely a D-Bus library, nothing more. basu is not a loose collection of systemd-related libraries.
Other service managers do support receiving readiness notifications. For example s6 does. Even if a service manager doesn't, it's just one Something that I would consider would be a init-independant way of doing readiness notifications. Something which wouldn't be tied to systemd, e.g. similar to swaywm/swaylock#281. However, Sway being a server (for both Wayland and Sway IPC), I'm not sure that's the right approach, maybe something like socket activation would be more appropriate, I don't know. |
|
On 2023-07-03 20:31:27, Ronan Pigott wrote:
Sway guarantees that WAYLAND_DISPLAY/DISPLAY are present in the environment for exec, however services ordered after this ready notify cannot be guaranteed the presence of DISPLAY/WAYLAND_DISPLAY in the systemd user manager activation environment. If either
1. The user does not exec systemctl import-environment [...] (totally valid usage of sway)
2. The user _does_ exec systemctl import-environment [...], but it doesn't complete before the notify call.
sway does not wait on exec'd processes and systemd parallelizes unit transactions aggressively, so this is still possible. To guarantee correctness, you need a target unit that is ordered after the completion of import-environment, and systemd already includes a special target you can bind to, graphical-session.target, which is used for this purpose in many unit files.
That is why this patch does not improve upon the status quo. Fortunately, I don't think there is any problem with using graphical-session.target, so it is not actually necessary to fix anything here.
I did try graphical-session.target. I got lost in the weeds, there were
many problems, and I barely converged over the current configuration.
But I see your point, there's definitely a race condition there.
I think part of the problem here is there is "no good way" of doing
things. I have tried to follow the wiki instructions and they didn't
work for me. I must admit I haven't documented why in details, but it
was quite a struggle to get a working config at all...
It feels like Sway is designed to be a session manager, with everything
started as an `exec`. That's not something I find very satisfying, and I
wanted to try something else. I have pushed as far ahead as I can, and
ultimately came up with this patch...
On 2023-07-04 02:21:02, Simon Ser wrote:
> I'm not sure I understand what you mean here. systemd just does not allow other processes to do sd_notify, I fail to see what "pass the ready FD differently" refers to here.
I meant wrap Sway with a small script which copies the `NOTIFY_SOCKET` env var to another env var (e.g. `SWAY_NOTIFY_SOCKET`), then unsets `NOTIFY_SOCKET`. Then a Sway `exec` can use that env var instead of `NOTIFY_SOCKET`. That way child processes don't see `NOTIFY_SOCKET`.
I see, that way podman can't (ab)use the `NOTIFY_SOCKET`? That makes
sense...
[...]
> But I understand there's probably no other service manager out there that supports receiving those messages. I don't see this as a permanent limitation though...
Other service managers do support receiving readiness notifications. For example s6 does. Even if a service manager doesn't, it's just one `read(1)` away.
* * *
Something that I would consider would be a init-independant way of doing readiness notifications. Something which wouldn't be tied to systemd, e.g. similar to swaywm/swaylock#281. However, Sway being a server (for both Wayland and Sway IPC), I'm not sure that's the right approach, maybe something like socket activation would be more appropriate, I don't know.
That sounds like rewriting `sd_notify` but sure, if that's what's needed
to get this merged, that would make sense. :)
I guess the next step for me here would be to try to rewrite this PR to
follow swaylock's approach, or try rebuilding my entire systemd
configuration to use the graphical-session.target *again*.
Given that things generally work here, I'm not really motivated to do
either at this stage, maybe later.
Sorry for the noise.
|
This adds a new commandline option, --ready-fd, to send a notification when sway is ready to run. This allows a process supervisor to notice sway's startup is complete which allows, for example, the session manager to start Wayland programs in the right order. Without this signal, users have to go through horrible hacks to order services. For example, I've been using `NotifyAccess=all` in the `sway.service` file and `exec systemd-notify --ready` in my sway config to emulate this. Problem is it's racy and error-prone. A particularly nasty bug triggered by `NotifyAccess=all` is when `podman` starts and then terminates a container. In that context, `conmon(8)` ends up notifying systemd it's the session master and takes over thee `MainPID` field in systemd. When it dies, systemd believes the session is over and proceeds to kill the entire session. This is explained in more details in: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1039857 This might not actually be the right place to do this in the sway startup sequence, that said. We call `server_run` right after, and maybe somewhere below that would be a better place. But `server_run` only calls `wl_display_run` and that's part of the core wayland library, so that seems a little too far down. I'm not sure Wayland itself is a place to do this, so for now I'm scratching my own itch and doing this in Sway itself. Note that this approach was taken instead of using the proper `sd_notify` library call, as that approach was refused in swaywm#7659. The `--ready-fd` approach was accepted in swaywm/swaylock#281 so it is hoped it will be seen as acceptable here. An alternative implementation would be to instead check the `NOTIFY_SOCKET` environment variable and use that, if present. That variable is used by systemd and at least the s6 supervisor to receive readiness notifications, so it might be less disruptive.
So I've done #7904 because it's similar to swaywm/swaylock#281 but now i see that the latter was actually dropped later, and only sends a newline, which won't work for systemd (but works for s6? go figure). I've done this only because you offered this as an exit strategy here but i'm not particularly enthusiastic about the patch. It necessarily requires a wrapper script to parse the NOTIFY_SOCKET variable and pass it down to the --ready-fd argument which seems rather silly: fundamentally, Also, with regards to:
from what i understand, Anyway, thanks for the feedback. |
It was dropped later because, as you discovered too, we found in other discussions that there was no convenient way to pass an fd for systemd's notify socket. We returned to the plain \n to not claim to support something we didn't. I would say that this is not s6-specific. It's a generic system generally agreed by other init systems and one trivially used by wrapper scripts and other parent programs, requiring just a pipe.
sd_notify is not "just" a readiness protocol, it's a systemd general service IPC layer, including things like controlling PID 1's stored fd polling behavior. The general counter-argument to implementing systemd's protocols is that they are not well suited for other init system designs where there isn't a central service management daemon, and that these systemd APIs change with the wind as they just do what systemd needs at any given time - they're not formally designed and specified, so not something you really want to clone. On the other hand, s6-to-systemd wrappers are trivial and exist. The friction is a annoying to all of us (why couldn't systemd just play ball, ugh) - sorry. |
There's a document called "Interface Portability and Stability Promise": https://systemd.io/PORTABILITY_AND_STABILITY/ that mentions the notify protocol, linking to its definition. It's been there for a long time. You can reimplement it at will, and it's going to always work. |
|
Feel free to write a service manager cloning systemd's APIs if you want. Until then, it remains a highly systemd-specific API with features unrelated to readiness-notification in the best systemd-style (one of which I added not too long ago), which other authors have found unsuitable and some outright unimplementable, stable or not. Also nothing stopping systemd from playing adding compatibility for the much simpler s6/ready-fd-style protocol. Who knows, maybe we'll change our stance on having systemd-specific code at some point, but "other things can clone systemd" definitely isn't going to be the winning argument. |
This allows systemd to notice sway's startup is complete which allows, for example, the session manager to start Wayland programs in the right order. Without this signal, users have to go through rather horrible hacks to tell systemd that it can start further units.
I've been using
NotifyAccess=allin the sway.service file andexec systemd-notify --readyin my sway config to emulate this, but it's racy and error-prone.A particularly nasty bug triggered by
NotifyAccess=allin particular is when podman starts and then terminates a container. In that context, conmon(8) ends up notifying systemd it's the session master and takes over thee "Main PID" field in systemd. When it dies, systemd believes the session is over and proceeds to kill the entire session. This is explained in more details in:https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1039857
This might not actually be the right place to do this. We call
server_runright after, and maybe there would be a better place. Butserver_runonly callswl_display_runand that's part of the corewaylandlibrary. I'm not sure Wayland itself is a place to do this, so for now I'm scratching my own itch and doing this in Sway itself.