Push to talk does not work in Wayland / Gnome3 #3243
I recently upgraded my OS to GNOME 3.24.2, which now uses Wayland instead of X. GNOME has defaulted to Wayland since version 3.22, released in 2016.
Push to talk works as expected when Mumble has focus and also works when certain applications like Firefox or my text editor are in focus.
However, push to talk does not work when a Wayland native application is in focus, like gnome-terminal or gnome-files.
If I start pressing push to talk with Mumble focused and switch tabs to a Wayland native application, Mumble does not detect when I stop pressing the push to talk key and it stays open until I switch back.
I've tried using both keyboard and mouse hotkeys, but the behavior is the same.
I couldn't find any other issues related to this bug on GitHub, but I did find this similar bug report in the Red Hat bug tracker: https://bugzilla.redhat.com/show_bug.cgi?id=1417576
Any further information about what could be causing this bug would be appreciated, thank you!
Not allowing clients to sniff input when they don't have focus is a feature of Wayland, and it will be kept that way. The user can then trust that their input goes where it is expected.
There are four cases of global bindings that I am aware of:
Under Wayland, the first case will work as usual, since the compositor has control over the input. The second case would ideally be split between the first one (for launching stuff) and a non-problem, see below. Also, the average user doesn’t need such tweaks in the first place.
So the ideal protocol would:
To have both, I went with an action-based protocol. I sent a proposal a few years ago and more recently, I made a cleaner one to allow for global action bindings.
The actions are namespaced, and you are expected to use fallbacks. For Mumble, that means you would ask for an action such as "mumble/push-to-talk".
(I should probably write all that to the mailing list, for the record if nothing else.)
If there is any interest from Mumble developers in this solution, I am willing to implement the compositor side for Weston (and all libweston-based compositors) as well as WLC-based compositors.
Sorry for taking so long to respond.
@itsrachelfish Mumble currently defaults to using XInput2 to "sniff" keypresses and mouse events. However, it can also use raw evdev.
Usually, OSes' default device node permissions allow you to read mouse clicks, but keyboards are off-limits (for obvious reasons).
However, if you adjust the device node permissions, Mumble can happily use your raw evdev device nodes to read key events, which should work on Wayland, or anywhere else, really.
The setting is "shortcut/linux/evdev/enable". To configure it on Linux, you'd add that setting to Mumble's configuration file.
However, there is a bit of a misbehavior right now, where Mumble will fall back to XInput2 when no keyboards can be opened via evdev. This behavior is from back when evdev was our default.
That's the workaround until we figure something proper out for Wayland.
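As a concrete sketch of that workaround (the group name, config path, and INI key syntax below are assumptions for a typical distro; check your own setup before pasting):

```shell
# Hypothetical setup sketch -- adjust device group, path, and key names.
# 1. Give your user read access to the raw input device nodes; on many
#    distros, members of the "input" group can read /dev/input/event*.
#    (A udev rule is the more fine-grained alternative.)
sudo usermod -aG input "$USER"   # log out and back in afterwards

# 2. Enable Mumble's raw evdev backend. Mumble stores settings via
#    QSettings, which maps "shortcut/linux/evdev/enable" to the INI
#    section/key pair below (path assumed).
cat >> ~/.config/Mumble/Mumble.conf <<'EOF'
[shortcut]
linux\evdev\enable=true
EOF
```

Note that giving a user account read access to keyboard devices also allows any process running as that user to log keystrokes, which is exactly the exposure Wayland is designed to avoid.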
...But I'm not sure it's enough in its current incarnation. At least it doesn't map fully onto the way global shortcuts work in Mumble currently. (And obviously: it doesn't need to. We're willing to use different UI if we need to, for different platforms.)
It seems like, if we were to use the current API, Mumble would simply bind to "mumble/push-to-talk", "mumble/volume-up", "mumble/volume-down", etc. -- and we wouldn't be able to show the actual bound key to the user, because that part is handled by the compositor. That means the UI for shortcuts would be less than ideal for users.
Perhaps we need a way to query which keys/events are bound to an action, so we can show that to the user?
How would the flow work from a user perspective? Do you configure the actions outside the app itself?
Kind of ties into my previous comment, but I suppose the current API requires us to bind to the actions on startup, correct? If we don't, we won't receive notifications when the action is triggered?
Not allowing sniffing on evdev directly is also a goal (one that most OSes now get right, because device nodes are root-owned, as you noticed).
For now, there is no code behind my proposal, because nobody actually had (code-backed) interest in it. If Mumble is willing to implement it, I can make a Weston implementation, but I think at least a GNOME or KDE implementation would be needed to really push that protocol forward.
The client (Mumble) binds actions at startup, as you guessed.
As for the UI/UX, it would be compositor-dependent. Each DE/compositor would have its own UI (for Weston, it would just be the configuration file, at first, but writing a GUI tool is not really hard to do either). I can imagine GNOME and KDE having a new thing in their control panel, with a list of action strings and the corresponding binding(s).
Mumble would bind actions in the "mumble" namespace (for example "mumble/push-to-talk").
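To make the flow concrete, here is a rough client's-eye sketch of the bind-at-startup model described above. Every name in it is made up for illustration; the real proposal is a Wayland protocol, not a Python API:

```python
# Hypothetical sketch: a client registers named actions at startup and
# reacts when the compositor reports them triggered. Which physical key
# (if any) fires an action is entirely the compositor's business.

class ActionBindings:
    """Client-side table of namespaced actions and their handlers."""

    def __init__(self):
        self._handlers = {}

    def bind(self, action, on_begin, on_end):
        # At startup, the client registers each action it understands.
        self._handlers[action] = (on_begin, on_end)

    def dispatch(self, action, pressed):
        # Called when the compositor notifies us the action fired.
        if action in self._handlers:
            begin, end = self._handlers[action]
            (begin if pressed else end)()


state = {"talking": False}
bindings = ActionBindings()
bindings.bind("mumble/push-to-talk",
              on_begin=lambda: state.update(talking=True),
              on_end=lambda: state.update(talking=False))

# In reality the compositor delivers these events; simulated here:
bindings.dispatch("mumble/push-to-talk", pressed=True)   # user holds key
bindings.dispatch("mumble/push-to-talk", pressed=False)  # user releases
```

Since press and release both arrive as compositor events, the stuck-open push-to-talk from the original report cannot happen: the release notification does not depend on which window has focus.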
I was impacted by this and came up with a possible solution.
I extended Mumble's D-Bus API to include startTalk and stopTalk methods.
Then I wrote a small program, run with root permissions, that watches for the mouse button event I use for push to talk and sends startTalk on button down and stopTalk on button up.
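A minimal sketch of that bridge might look like the following. It assumes the third-party python-evdev package, that `/dev/input/event5` is your mouse, that `BTN_SIDE` is the push-to-talk button, and that Mumble exposes startTalk/stopTalk under the service and interface names guessed below; all of those are assumptions to adjust:

```python
#!/usr/bin/env python3
"""Sketch of an evdev -> D-Bus push-to-talk bridge (names assumed)."""
import subprocess

MUMBLE_DEST = "net.sourceforge.mumble.mumble"   # assumed service name
MUMBLE_IFACE = "net.sourceforge.mumble.Mumble"  # assumed interface name
DEVICE = "/dev/input/event5"                    # your mouse's event node


def method_for(value):
    """Map an EV_KEY value (1=press, 0=release) to a D-Bus method name."""
    if value == 1:
        return "startTalk"
    if value == 0:
        return "stopTalk"
    return None  # value 2 is key auto-repeat; ignore it


def call_mumble(method):
    # Shell out to dbus-send rather than depending on a D-Bus binding.
    subprocess.run([
        "dbus-send", "--session", "--type=method_call",
        f"--dest={MUMBLE_DEST}", "/", f"{MUMBLE_IFACE}.{method}",
    ], check=False)


def main():
    from evdev import InputDevice, ecodes  # third-party: python-evdev
    dev = InputDevice(DEVICE)
    for ev in dev.read_loop():
        if ev.type == ecodes.EV_KEY and ev.code == ecodes.BTN_SIDE:
            method = method_for(ev.value)
            if method:
                call_mumble(method)

# To use it, call main() from a process with read access to DEVICE
# (root, or a user in the group that owns /dev/input/event*).
```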
In the long term, the desktops need to define some Wayland accessibility system where users can bind global hotkeys, and having such a system send D-Bus messages seems reasonable.
But for the moment, my little hard-coded program will get me through my next gaming session.
It seems weird to disallow broadcast (or multicast) in the spec. I can imagine cases where a user would want to broadcast push-to-talk to multiple apps. Imagine a gamer with their friends on Mumble, but also other randomly matched teammates using the VoIP built into the game. They'd have one key bound to mumble/push-to-talk for their friends, but when they want to talk to their whole team, they'd want one button to activate push-to-talk in both apps.
Of course, usually in this case, the game would be in focus and could get the key presses that way, but that seems like a weird and unnecessary limitation. As that user, I would expect my global push-to-talk key to keep working even if I alt-tab out to my browser for a second.
Could we leave it entirely up to the compositor to decide where to send the actions?
From my proposed protocol:
It is up to the compositor. With that protocol, anyway.
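Since the compositor owns the key-to-action table, fanning one binding out to several clients is just a routing decision on its side. A toy sketch (all names hypothetical) of how that covers the broadcast case:

```python
# Hypothetical compositor-side routing: one physical binding may map to
# any number of (client, action) pairs, so "broadcast" needs no special
# support in the protocol itself.
from collections import defaultdict


class Compositor:
    def __init__(self):
        self.routes = defaultdict(list)  # key -> [(client, action), ...]
        self.delivered = []              # record of events sent out

    def bind(self, key, client, action):
        self.routes[key].append((client, action))

    def on_key(self, key, pressed):
        for client, action in self.routes[key]:
            # In the real protocol this would be a Wayland event sent to
            # the client; here we just record the delivery.
            self.delivered.append((client, action, pressed))


comp = Compositor()
comp.bind("MOUSE4", "mumble", "mumble/push-to-talk")
comp.bind("MOUSE4", "game", "game/push-to-talk")
comp.on_key("MOUSE4", pressed=True)
# Both clients receive the press event from the single binding.
```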
Would it be an idea to allow push-to-talk to be triggered through the RPC (to both start and stop)? That way, under Wayland, one can simply configure their desktop environment to run the corresponding RPC commands on key press and release.
Edit: upon closer inspection, the dbus PR already covers my usecase.