Support XDG activation for applications started from commands #5812

bbb651 · 2025-02-16T23:36:48Z

bbb651
Feb 16, 2025

On Wayland, windows are generally not allowed to focus themselves (focus stealing prevention), including when they are first created. Instead, focus can be transferred from the currently focused window to a different window through the xdg-activation protocol, using a token (a null-terminated string) that is passed between them. The means of transport is not defined by the protocol, but in practice it has settled on the XDG_ACTIVATION_TOKEN environment variable.

In practice, no compositor actually enforces full focus stealing prevention, due to missing client support (relevant Mutter blog). The single biggest blocker is terminal support; this VTE issue goes into a lot of detail.

Since it's impossible to know ahead of time whether a command will launch an application, and activation tokens need to be fresh (they usually expire within a matter of seconds, and they optionally take the serial of the event that caused the activation), this has to be done preemptively for all commands. By "command" I mean a command line that might include multiple commands, which works well with environment variables; if multiple applications are spawned from one command line, they'll race for focus. That sounds bad, but afaik it should be cheap. Afaik there's no existing sequence for this, so one will need to be created, and afaik APC is the most appropriate here.

I've made a hack that calls out to the compositor directly to create the token, and I've been daily driving it with niri's strict-new-window-focus-policy and it confirms the shell integration can work for fish.

~~The biggest problem is shell support~~ That's what I originally thought because of comments on the VTE issue:

> The biggest drawback of that is that I think it's very likely bash isn't going to be interested in any of this, and AFAIK it doesn't have the necessary hooks (maybe DEBUG hook could do it, but I'd be very reluctant to rely on that) to implement this via vte.sh.

> In bash, which is the shell I'm by far the most familiar with, there are two ways to do something right before executing a command line: the debug trap (which @chpe already mentioned) and since v4.4 $PS0 (which can execute commands if contains $(...)). However, I don't think either of these can modify the execution environment of the new process. If we only needed to notify Wayland that a program is starting up then this could be enough. But since we clearly need to pass a fresh token to that new app, I'm afraid bash would need to be patched.

But I went ahead and actually tested it with my setup with bash and it works! And since ghostty already uses trap DEBUG in the shell integration it shouldn't be a problem. Here's the complete setup:

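#!/bin/sh
# ~/scripts/xdg_activation.sh (sourced by the preexec hook below)
# Create a FIFO, have niri spawn a helper that writes its fresh token into it,
# then export that token for the command that is about to run.
filename=$(mktemp -u /tmp/.xdg-activation.XXXXXXXXXX)
command mkfifo "$filename"
command niri msg action spawn -- sh -c "echo \$XDG_ACTIVATION_TOKEN > $filename" &
export XDG_ACTIVATION_TOKEN="$(cat "$filename")"
command rm "$filename"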

# bash preexec emulation via the DEBUG trap
# https://superuser.com/a/175802
preexec () { source ~/scripts/xdg_activation.sh 2>/dev/null; }
preexec_invoke_exec () {
    [ -n "$COMP_LINE" ] && return  # do nothing if completing
    [ "$BASH_COMMAND" = "$PROMPT_COMMAND" ] && return # don't cause a preexec for $PROMPT_COMMAND
    # Declare and assign separately so the pipeline's exit status is not masked by `local`
    local this_command
    this_command=$(HISTTIMEFORMAT= history 1 | sed -e "s/^[ ]*[0-9]*[ ]*//")
    preexec "$this_command"
}
trap 'preexec_invoke_exec' DEBUG

This still needs testing on zsh and elvish, but the shell integration already has preexec hooks for them, so I'm optimistic they'll work.

Personally, in addition to better focus stealing prevention, I'm really interested in this because of the flexibility it gives compositors, in particular implementing xdg-activation-based window grouping in niri. That enables really cool things like viewing man --html directly "in the terminal", launching mpv instances without ending up with many "useless" terminals in the tiling/scrolling layout, and viewing images (which can also be done with kitty gfx, but less interactively). It's approaching things like awrit from the complete opposite direction (though I still have "in terminal single client wayland compositor abusing kitty gfx" in my project ideas list :P)

pluiedev · 2025-02-17T00:24:00Z

pluiedev
Feb 17, 2025
Collaborator

Maybe this is 1:30am brain talking, but I'm not sure what you're asking for in this discussion... is there something concrete that Ghostty should implement?

4 replies

bbb651 Feb 17, 2025
Author

My bad, I skimmed over the actual request. From ghostty's side, every time an interactive command is run:

1. The preexec hook in the shell integration asks ghostty for an activation token through an APC(?) sequence.
2. Ghostty calls xdg_activation_v1.get_activation_token to get an activation token and returns it to the shell integration.
3. The shell integration sets XDG_ACTIVATION_TOKEN to the token for the command being run (probably globally for the shell due to how the hooks work; it's possible to unset it in a postexec, but that's not so important since it'll be overridden by the next command).

This makes focus transfer properly to applications started from the terminal without compositor hacks.

pluiedev Feb 17, 2025
Collaborator

To me that sounds simultaneously awfully complicated and underspecified. How would the shell integration know that the program wants an activation token? What would the sequence look like? Do we need to invent a proprietary protocol or is there precedent/prior art? How would it be potentially standardized in other terminals?

bbb651 Feb 17, 2025
Author

> How would the shell integration know that the program wants an activation token? What would the sequence look like?

You cannot know ahead of time whether running a command will create a window, so it has to be done preemptively for every command. Trying to make the launched program communicate back with the terminal isn't viable: communicating through stdin doesn't work with pipes, redirects, etc., so you'd need a different mechanism, and applications would need support for it that is a lot more complex than the existing "if XDG_ACTIVATION_TOKEN is set, activate with it and unset it".
An alternative could be a new Wayland protocol with looser requirements, namely a single token that could be reused for an entire tab/split, set once and inherited through an environment variable. There are discussions about it in the VTE issue. I personally don't like this idea because it creates a bunch of complexity on the compositor side and is strictly worse: it can easily lead to focus stealing from the terminal if an application forgets to unset XDG_ACTIVATION_TOKEN and new windows inherit it, and it gives compositors less flexibility because they don't know exactly where the activation came from, which can break my window swallowing use case. Since we are already using trap DEBUG, it can be used for this, and that was the biggest issue from VTE's perspective.
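
For reference, the existing client-side convention mentioned above ("if XDG_ACTIVATION_TOKEN is set, activate with it and unset it") amounts to something like the sketch below. `activate_window_with` is a hypothetical stand-in for the toolkit call that hands the token to the compositor; real applications get this from their toolkit rather than from a shell function.

```shell
# Hypothetical stand-in for the toolkit's call that presents the token
# to the compositor (xdg_activation_v1.activate under the hood).
activate_window_with() {
    printf 'activating with token %s\n' "$1"
}

# Consume the token once at startup so child processes do not inherit it.
consume_activation_token() {
    [ -n "${XDG_ACTIVATION_TOKEN:-}" ] || return 0
    token=$XDG_ACTIVATION_TOKEN
    unset XDG_ACTIVATION_TOKEN
    activate_window_with "$token"
}
```

The unset step is what keeps later child processes from presenting the same token again.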

> What would the sequence look like? Do we need to invent a proprietary protocol or is there precedent/prior art? How would it be potentially standardized in other terminals?

Like I said, afaik there's no sequence for this, so we'll need to invent a proprietary one. It's only used to communicate between ghostty's shell integration scripts and ghostty itself, so standardization doesn't matter as much (a different terminal could use a different sequence). From my understanding, APC is the most appropriate for things that are internal or specific to a terminal application.
It needs a single request from the shell integration side, which ghostty responds to with the token (a null-terminated string). That's it.

pluiedev Feb 17, 2025
Collaborator

So essentially we create a new token for every subprocess launched. I have my doubts for that approach — since we have to block and wait for the compositor to respond with the activation token, that could be awfully slow for each and every single process. Plus, (and I might be wrong here), shell integration merely notifies us that something has changed (e.g. working directory has changed, currently running program has changed, etc.) and it doesn't really support bidirectional communication, let alone blocking bidirectional communication.

At least we should have an allowlist for the processes that are allowed to receive a token. That way the fast path stays fast for CLI programs. As for the protocol, I'd really prefer if we can at least talk to other terminals that are interested in auxiliary features (e.g. Kitty, WezTerm, iTerm2, ...) and see how they feel about the issue. We haven't found a need to invent new proprietary protocols for our own purposes yet and I don't think that would change soon.

rockorager · 2025-02-17T21:35:17Z

rockorager
Feb 17, 2025
Collaborator

Is there a realistic use case for this? I saw the sleep 60 && firefox example, but I'm wondering where this actually becomes a problem.

I don't think that solving this via shell integration is the right way. Ideally, each application would ask the terminal for the token when it needs it (this can be done even if all IO is redirected by opening /dev/tty). But - I understand how this can be an issue because firefox shouldn't need to know escape sequences.

If I understand the sequencing right, when an application launches it will check for XDG_ACTIVATION_TOKEN and if it is set, it activates with it and then unsets the value...this gives it focus but does not allow subprocesses to steal focus?

2 replies

bbb651 Feb 18, 2025
Author

> Is there a realistic use case for this? I saw the sleep 60 && firefox example, but I'm wondering where this actually becomes a problem.

This is an edge case that isn't expected to work: the token will be invalidated by the time it launches, or focus might have switched away from the terminal. The use case is making firefox work (without the current compositor hacks). Making this per command line allows pipes, scripts, etc. to work, and avoids needing to care about or parse shell syntax.

> I don't think that solving this via shell integration is the right way. Ideally, each application would ask the terminal for the token when it needs it (this can be done even if all IO is redirected by opening /dev/tty). But - I understand how this can be an issue because firefox shouldn't need to know escape sequences.

Yeah, I agree in theory, but it's very unrealistic to make clients support that due to the complexity.

> If I understand the sequencing right, when an application launches it will check for XDG_ACTIVATION_TOKEN and if it is set, it activates with it and then unsets the value...this gives it focus but does not allow subprocesses to steal focus?

After a token is activated it's invalidated, so if a subprocess tries to use it the compositor will deny the request; unsetting it is mostly good practice. In the proxy token protocol proposed in the VTE issue it would become a problem, since you can no longer determine whether the token was activated by a subprocess (potentially after a long time has passed) or directly by the terminal, which allows a subprocess to steal focus from the terminal when it shouldn't be able to.
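
The single-use behaviour can be illustrated with a toy model. This is not how any compositor is implemented; `issue_token`, `redeem_token`, and the in-memory list are invented purely to show why an inherited, already-used token is harmless.

```shell
# Toy model (hypothetical): live tokens kept in a space-delimited list.
# Tokens are assumed not to contain spaces.
LIVE_TOKENS=" "

# Record a token as valid.
issue_token() {
    LIVE_TOKENS="${LIVE_TOKENS}$1 "
}

# Succeeds only the first time a token is presented, then forgets it.
redeem_token() {
    case $LIVE_TOKENS in
        *" $1 "*)
            before=${LIVE_TOKENS%%" $1 "*}
            after=${LIVE_TOKENS#*" $1 "}
            LIVE_TOKENS="$before $after"
            return 0
            ;;
        *) return 1 ;;
    esac
}
```

In this model a child that inherits a token its parent already redeemed simply gets a failure from `redeem_token`, which mirrors the "compositor will deny" behaviour described above.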

roke-julian-lockhart Jun 27, 2026

> This is an edge case that isn't expected to work.

@bbb651, shouldn't such “edge cases” be incorporated? This is already a fair bit of work, so doing it correctly sounds like a more sensible approach than half-heartedly implementing it.

pluiedev · 2025-02-17T21:58:38Z

pluiedev
Feb 17, 2025
Collaborator

Additionally, is there any other compositor that actually does the "correct thing" and prevents focus stealing per the spec? From what I can tell none of Mutter, KWin, Sway, etc. do this, and niri only recently (as of 25.01!) introduced this opt-in option. As far as I'm concerned this would only affect a tiny sliver of our userbase (this makes absolutely zero sense on anything except Linux Wayland on a very few select compositors) at the cost of an extensive change to our shell integration and Wayland integration logic.

If the situation changes and more compositors decide to support this or make this the default behavior, then I think this would be much more justifiable.

1 reply

bbb651 Feb 18, 2025
Author

Mutter also has an experimental strict focus stealing prevention setting (gsettings set org.gnome.desktop.wm.preferences focus-new-windows 'strict'). Their current behavior is arguably still too strict (if you're familiar with their "app is ready" notifications), and they are interested in improving the situation. It's very much a chicken and egg problem: it is opt-in precisely because there's no support for it.

I agree that it's not really ghostty's problem, especially when GNOME is interested in this and has both a compositor and a terminal emulator. I'll comment on the VTE issue with my findings about trap DEBUG being viable and my concerns with the proposed alternative protocol. I might try to make a draft PR; I think it'll be helpful to have an actual benchmark of how much this delays command startup, though I need to learn some more zig first.

By the way, I thought about it and an allowlist wouldn't really work, because the invocation might not be direct (e.g. if the application is opened through a shell script or alias, or things like matplotlib started from python), and it wouldn't work with any shell constructs such as pipes without complex parsing. I think a global on/off setting would make more sense. I did a very unscientific test to simulate a roundtrip (hyperfine --shell=none wlprobe), which takes ~1ms. That's not great but not horrible; of course this is not measuring exactly the same thing, and it can also vary wildly between compositors.

jcollie · 2025-02-18T02:09:13Z

jcollie
Feb 18, 2025
Collaborator

I think that we've got an XY problem here. I don't think that we understand what problem this is intended to solve. Is this somehow trying to hack around a limitation in niri? CLI commands needing an XDG activation token is rather novel. I think that we first need to understand the problem that you're trying to solve, rather than trying to bikeshed/debug the solution that you're attempting here.

0 replies

rockorager · 2025-02-18T02:23:29Z

rockorager
Feb 18, 2025
Collaborator

> > If I understand the sequencing right, when an application launches it will check for `XDG_ACTIVATION_TOKEN` and if it is set, it activates with it and then unsets the value...this gives it focus but does not allow subprocesses to steal focus?
>
> After a token is activated it's invalidated, which if a subprocess tries to use the compositor would deny, so it's mostly good practice. In the proxy token protocol proposed in the VTE issue it would become a problem, since now you cannot determine if it was activated by a subprocess (potentially after a long time has passed) or directly by the terminal, which allows it to steal focus from the terminal when it shouldn't be able to.

I think I understand what this is trying to fix now, thanks. But...I don't think this is the right way to do it. Shell integration will only inject at the beginning of a command. If these tokens have timeouts, no one can be sure the token will still be valid by the time the command in the pipeline that spawns a window execs. Also, a command like `firefox && firefox` would cause one to activate and not the other: it seems like the *last* command should be activated, but this would activate the *first*.

I think a better solution is to design a protocol to get an activation token from a terminal; OSC would be fine for this. Then write a small program that obtains a token and spawns a child process to be activated with said token. Then nothing needs to be aware of what is happening *except* the user, who has fine control over what gets activated. This also lets non-shells which spawn windows perform activation (the GitHub CLI when authenticating, for example).
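
The "small program" idea could look roughly like the wrapper below. `request_activation_token` is a placeholder for whatever terminal protocol might eventually exist (none is defined today), so here it simply fails unless overridden; `run_activated` is likewise a made-up name.

```shell
# Placeholder (hypothetical): would ask the terminal for a fresh token over
# a yet-to-be-designed escape sequence. Fails by default.
request_activation_token() {
    return 1
}

# Run "$@" with a fresh token in its environment only; the calling shell's
# environment is left untouched. Falls back to a plain launch with no token.
run_activated() {
    if token=$(request_activation_token) && [ -n "$token" ]; then
        (
            XDG_ACTIVATION_TOKEN=$token
            export XDG_ACTIVATION_TOKEN
            "$@"
        )
    else
        (
            unset XDG_ACTIVATION_TOKEN
            "$@"
        )
    fi
}
```

With something like this, `firefox && run_activated firefox` would let the user pick which of the two launches is activated, instead of the shell hook deciding for the whole command line.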

0 replies

pluiedev · 2025-11-09T11:53:42Z

pluiedev
Nov 9, 2025
Collaborator

I've actually thought a bit more of this and realized that the whole reason the proposal is so convoluted in the first place is because we are relying on the XDG_ACTIVATION_TOKEN env var being set in the preexec, despite that not being a requirement at all for the Wayland protocol. It is perfectly legal, in fact, to redeem the activation token after the window has been created, which should then allow the window to grab focus in a consensual way.

Given this fact, I think it would be reasonable to implement a bidirectional protocol where the CLI client can ask for an activation token and we respond asynchronously. Then the terminal would only have to deal with token requests on an individual basis and would not need to delay executing a program, while the program can go on with its initialization while waiting for the terminal (and compositor) to respond.

2 replies

bbb651 Nov 9, 2025
Author

Yeah, I agree, and I've somewhat changed my stance on this: it's pretty much the only way to implement it cleanly and have it work in all edge cases, e.g. a script that only opens a window late into its execution. I still think applications won't implement it if it's too complex, though.
I don't think it even needs to be bidirectional; passing a pipe fd will work. Maybe mount_namespaces can be used to mount a pipe per spawned process from the shell? It's a bit complex on the terminal side, but on the application side it's just reading the token from a predetermined file path.
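
On the application side, the pipe idea would reduce to a blocking read from an agreed-upon path. A minimal sketch, assuming the terminal writes the token as a single line; the path is whatever both sides would agree on (nothing is specified anywhere yet) and `read_activation_token` is a made-up helper.

```shell
# Read one line (the token) from a FIFO or file at the given path.
# Blocks until a writer (the terminal, in this idea) provides the token.
# Prints the token and returns 0, or returns 1 if nothing usable was read.
read_activation_token() {
    [ -e "$1" ] || return 1
    token=
    IFS= read -r token < "$1" || true
    [ -n "$token" ] || return 1
    printf '%s' "$token"
}
```

This keeps the client logic close to the existing environment-variable check, at the price of the terminal having to set up one pipe per spawned process.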

pluiedev Nov 9, 2025
Collaborator

Honestly I don't think parsing a DCS or APC sequence would be too much work (compared to reading from an fd anyway, which also has to be passed along to the app in some way). mount_namespaces is also Linux-specific, so that wouldn't work on macOS or the BSDs. We can offer parsing functionality via libghostty-vt if it makes it easier.

kovidgoyal · 2025-11-09T13:27:06Z

kovidgoyal
Nov 9, 2025

XDG_ACTIVATION is just a terrible design. It needs to be fixed in Wayland, not have poor terminals (and really any other applications that can potentially launch unrelated applications) jumping through unnecessarily complex hoops to make up for it. For example, a simple, robust fix would be for Wayland to allow an application to register a token (like a uuid4). Then if any application requests focus with that token, the compositor can grant it focus if the application that registered the token currently has focus. Or, if you want more control, the compositor can first check whether the application registering the token has focus, and if it does, ask it whether to allow focus stealing.

0 replies

ross96D · 2026-03-19T20:31:38Z

ross96D
Mar 19, 2026

At the very least, XDG_ACTIVATION_TOKEN should be used when openUrl is called, so the browser (or the corresponding app) gets focused when clicking on a link.

The code to get the token in GTK4 is not complicated:

const token: ?[]const u8 = blk: {
    const display = gdk.Display.getDefault() orelse break :blk null;
    const context = gdk.Display.getAppLaunchContext(display);
    // Null means startup notification is unsupported. Per the GIO docs the
    // caller owns the returned string and is responsible for freeing it.
    const tokenPtr = gio.AppLaunchContext.getStartupNotifyId(context.as(gio.AppLaunchContext), null, null) orelse break :blk null;
    break :blk std.mem.span(tokenPtr);
};

0 replies
