Support Wayland #27

Open
joshgoebel opened this issue Jun 14, 2022 · 56 comments
Labels
enhancement (New feature or request), help welcome (Help/contrib is esp welcome)

Comments

@joshgoebel
Owner

Basically we just need a way for Wayland to provide us:

If this is possible, hooking it up should be largely trivial, as should adding a way for a user to tell the keymapper whether they are using X or Wayland (perhaps we could even auto-detect based on ENV).

@joshgoebel joshgoebel added help welcome Help/contrib is esp welcome enhancement New feature or request labels Jun 14, 2022
@rbreaves

rbreaves commented Jun 15, 2022

This would be a really huge addition to see & a big motivator for me to get Kinto over to keyszer yesterday if this gets added lol. I am not sure if I would daily drive Wayland or not still, but as it stands I can't even consider it realistically till this occurs but will likely have to occur on the DE level, aka Budgie, Gnome, KDE, Mate, XFCE, etc.

And just to link this back to an earlier thread on this topic over at xkeysnail. mooz/xkeysnail#108

@joshgoebel
Owner Author

but will likely have to occur on the DE level, aka Budgie, Gnome, KDE, Mate, XFCE, etc.

Yeah I saw that longer thread the other day... my current understanding is we're waiting on the Wayland people/DEs to make this all possible... if the APIs and libraries existed it should be trivial... our X11 integration code is only like 20 lines.

We'll need some type of abstraction layer to a generic idea of a "window manager", but it makes no sense to think about that very hard until we know what the Wayland interface will be... obviously anything that gets us "app name" and "window name" would be great, and if that was pushed to us rather than us having to poll it, even better.

@rbreaves

Yea imo the best possible solution is they add the API to Wayland or XDG I think it is/was. Regardless some layer that will be accessible to all DEs & apps therein.

@RedBearAK
Contributor

@joshgoebel

It seems like it's going to be years before something like XDG or wlroots will add the ability to get the window class/name in "Wayland" in general, but there are existing ways to access the information in specific environments like GNOME (DBus calls) or sway. Meanwhile, even though Wayland is still a bit buggy and incomplete, a lot of users are seeing benefits like better multi-monitor support or high refresh rates, so I really can't fault them for wanting to use Wayland over X11 already.

I ran into a keymapper project that claims to provide per-application remapping abilities in both X11 and Wayland, although they appear to use the techniques specific to limited environments like GNOME, sway, and hyprland that already provide this information in some way.

Unfortunately this project is written in Rust rather than Python, so it would not exactly be a simple copy-paste job to try and bring the methods into keyszer to get some limited usefulness in Wayland. But, on the other hand, I don't think the techniques are particularly complicated, so it should be feasible to just look at the methods they are tapping into and do something similar in keyszer.

Question is, how open are you to starting to integrate some Wayland solutions for per-application mappings that currently will only work in a few different Desktop Environments? I get the feeling that the X11 user base is going to start shrinking pretty quickly now, between all the distros that are starting to use Wayland as the default, and users that are actually moving to and liking Wayland on their own for one reason or another. Feels like the balance is really starting to shift lately.

A large chunk of Linux users are on popular distros like Ubuntu and Fedora, which use GNOME by default, so it seems like a lot of users would be served by at least supporting GNOME's DBus method to get the window info in Wayland.

The project is here:

https://github.com/k0kubun/xremap

Doesn't look like they've implemented matching on the window "name" (title) as opposed to the application "class" just yet, but I think that's more of them needing to implement the logic rather than not knowing how to get the window title. With keyszer already having the working logic in place to do matching on the window "name" property, I'm hoping it will be possible to just feed that existing logic the window title from Wayland windows just as easily as getting the Wayland window class info.

Thoughts?

@joshgoebel
Owner Author

I feel like we'd just go with modules/classes for the WM... so in your config you'd specify which module to use and that module would be responsible for providing the window context into the KeyContext. So anyone can use whatever WM they want, just so long as they can provide a module that (from Python) is able to learn a few key details about the windows (name and class hopefully?). I'm not sure the best way to structure that in Python off the top of my head, but it should be fairly trivial. Just a matter of how you inject/connect that module with KeyContext.

Assuming they both have concepts like "name" and "class" you could seamlessly switch between them [window managers] by just changing one line of your config file (or perhaps even auto-detecting, though I'm not sure I'm interested in building that into the key mapper itself)...

Right now you could start just by hacking the existing _query_window_context function to use Wayland/Gnome instead of X... you'd probably add a wm/wayland_gnome.py for the actual wrapper that talks to Wayland... (and move xorg there also, etc)...

Once you get that working we could circle back to how to make which WM module to use a configurable choice.

Assuming they both have concepts like "name" and "class"

If it turns out they are VERY different, then that would be a larger discussion.
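The "one module per WM, chosen by a line in the config" idea could be sketched roughly as below. This is only an illustration: the registry, the `register_provider` decorator, and the provider names are hypothetical, and the dictionary keys (`wm_class`, `wm_name`, `x_error`) are an assumption about what KeyContext expects. The providers return fixed placeholder values instead of talking to a real display server.

```python
from typing import Callable, Dict

# Hypothetical registry: maps a module name from the user's config to a
# function that returns the focused window's context as a dict.
_PROVIDERS: Dict[str, Callable[[], dict]] = {}


def register_provider(name):
    """Decorator that records a context provider under the given name."""
    def decorator(func):
        _PROVIDERS[name] = func
        return func
    return decorator


@register_provider("xorg")
def xorg_context():
    # Placeholder: a real module would query X11 here.
    return {"wm_class": "Gnome-terminal", "wm_name": "Terminal", "x_error": False}


@register_provider("wayland_gnome")
def wayland_gnome_context():
    # Placeholder: a real module would ask GNOME Shell (e.g. over D-Bus).
    return {"wm_class": "gnome-terminal-server", "wm_name": "Terminal", "x_error": False}


def get_window_context(provider_name):
    """Look up the configured provider and return its window context."""
    try:
        provider = _PROVIDERS[provider_name]
    except KeyError:
        raise ValueError(f"no window context provider named {provider_name!r}") from None
    return provider()
```

With something like this, switching between window managers would be a matter of changing the provider name passed to `get_window_context`, as long as every provider returns the same keys.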

@RedBearAK
Contributor

Apps need to be able to display unique window or tab titles, so I have little doubt that info is in Wayland the same way it is in X11. Not too worried about that.

(or perhaps even auto-detecting, though I'm not sure I'm interested in building that into the key mapper itself)

I mildly disagree with this.

Since users do sometimes have good reason (at this early juncture in Wayland's life cycle) to need to switch back and forth between X11 and Wayland sessions, I feel like we should at least make some attempt to do auto-detection. It really shouldn't be that hard. There was at least one environment variable I found that seemed to reliably hold info on whether the session is Wayland or X11. Something that would have fixed a problem the Kinto installer sometimes has with failing to detect Wayland, but it was never merged.

A manual config setting as a backup, in case the auto-detect isn't working for some reason, shouldn't be too hard.

Right now you could start just by hacking the existing _query_window_context function to use Wayland/Gnome instead of X... you'd probably add a wm/wayland_gnome.py for the actual wrapper that talks to Wayland... (and move xorg there also, etc)...

Yeah, I was going to look at that, study how it gets into KeyContext, etc. But also looking at the methods and seeing if I can get the info and just have it show up in the log, to start with. Bit by bit.

As long as you're cool with a growing collection of patchwork solutions rather than waiting for a general "Wayland" solution to appear. Once the framework for adaptation is in place, it should allow a more general solution to just drop in place and replace the patchwork stuff, eventually. Seems like a good thing to waste some time on.

@joshgoebel
Owner Author

As long as you're cool with a growing collection of patchwork solutions rather than waiting for a general "Wayland" solution to appear.

When things finally get organized (in the Wayland ecosystem) it would just be a matter of writing another short module to handle the "official" API - or upgrading the existing one... should be pretty simple. Since it's easy to add/remove/upgrade I'm not sure why I should oppose.

@RedBearAK
Contributor

My new AI overlord says:

The XDG_SESSION_TYPE environment variable in Linux is used to specify the type of desktop session. The value of this variable is set by the desktop environment, and its possible values depend on the implementation. However, some common values for XDG_SESSION_TYPE include:

    x11: for X11-based desktop sessions
    wayland: for Wayland-based desktop sessions
    mir: for Mir-based desktop sessions

Note that the exact values may vary depending on the Linux distribution and desktop environment. It's also possible that some desktop environments use custom values for XDG_SESSION_TYPE.

In my testing, the values have always been either x11 or wayland, whereas XDG_SESSION_DESKTOP or XDG_CURRENT_DESKTOP often have customized values that mix the session type with the DE, like "gnome-xorg".
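A minimal sketch of the auto-detect-with-manual-backup idea, assuming `XDG_SESSION_TYPE` is set and only `x11` and `wayland` need to be recognized. The function name and the `override` parameter (standing in for a manual config setting) are hypothetical.

```python
import os

KNOWN_SESSION_TYPES = ("x11", "wayland")


def detect_session_type(env=None, override=None):
    """Return 'x11' or 'wayland', or raise ValueError if it cannot be determined.

    `override` stands in for a manual config setting and wins over the
    environment when given.
    """
    if override is not None:
        candidate = override
    else:
        env = os.environ if env is None else env
        candidate = env.get("XDG_SESSION_TYPE", "")
    candidate = candidate.strip().lower()
    if candidate in KNOWN_SESSION_TYPES:
        return candidate
    raise ValueError(f"unsupported or unknown session type: {candidate!r}")
```

Raising on anything unrecognized (such as `mir`, or an unset variable) keeps the failure visible, so the user knows to fall back to the manual setting.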

Since it's easy to add/remove/upgrade I'm not sure why I should oppose.

My thoughts exactly. Alright, I'll start chipping away at what I can, when I have the time. Probably will need some pointers now and then, as usual, if my own research fails me.

@RedBearAK
Contributor

RedBearAK commented Feb 6, 2023

@joshgoebel

Looks like xremap is relying on a GNOME extension (also called xremap), specific to their project, to allow it to monitor focus changes and access the window attributes. But, and this is very interesting to me at first glance, there is a PyWayland module that appears to provide similar functionality. Both monitoring the changes to the focused Wayland "surface", and getting the window attributes (which look to have the same names as the X11 attributes).

import pywayland.server

def focus_handler(surface, event):
    # Get the WM_NAME and WM_CLASS properties of the focused surface
    wm_name = surface.get_label("WM_NAME")
    wm_class = surface.get_label("WM_CLASS")

    # Handle the focus event here, using the wm_name and wm_class variables
    pass

# Connect to the Wayland compositor
display = pywayland.server.Display()
display.add_global_listener(pywayland.server.Seat.FOCUS, focus_handler)
display.run()

Doesn't seem to be all that complicated to set it up, but it would need to be another installed Python module since it's not a built-in module at this time.

Any thoughts on which road to go down? I didn't write any of the code in the example, and haven't started testing anything yet, I'm still just looking into how it's supposed to work. Looks like either way there would need to be something "external" added into the project.

I feel like it can't possibly be as simple as it seems at first glance with PyWayland, because my source keeps saying "depends on the compositor", but it might at least work with GNOME shell right off the bat.

https://pywayland.readthedocs.io/en/latest/

Relevant sections might be (if they still exist):

pywayland.server.wl_surface.WlSurface
pywayland.server.wl_shell_surface.WlShellSurface

@RedBearAK
Contributor

@joshgoebel

I have the oddest thing happening. I've got the window class and name in Wayland with dbus and the help of a GNOME extension that exposes the focused window attributes. But...

What seems to be happening is... The mapped combo will go through, but the input combo will also go through. I can set up a keymap for a specific application like GNOME Terminal, and that shortcut mapping will only work in that application, so the window matching is obviously working. But the keys from the input side end up not being suppressed, although the output keys also come out, as they should.

So this remap of CapsLock you see below definitely types the output string, and only does it in GNOME Terminal, yet it also toggles CapsLock at the same time. Still in GNOME Terminal. And the tab nav shortcuts (Shift-Cmd-Braces) end up doing tab movement (Shift-Ctrl-PgUp/PgDn) instead of tab navigation, because the Shift key press is leaking through to mingle with the Ctrl-PgUp/PgDn of the output combo.

keymap("What is wrong with gnome-terminal-server", {
    C("Shift-RC-Left_Brace"):    C("C-Page_Up"),
    C("Shift-RC-Right_Brace"):   C("C-Page_Down"),
    C("CapsLock"):              ST("What the heck"),
}, when = matchProps(cls="^gnome-terminal.*$"))

Perhaps this is happening in all circumstances, but in other cases it doesn't seem to cause an issue.

I'm just not sure how it's possible the output combo can get fired off without the input being properly suppressed. It should be doing either one or the other, not both. Is the suppression of a matching input combo decided in input.py, or in transform.py? I made a couple of minor modifications to transform, but I reverted even those minor changes and still get this odd behavior. I haven't touched input that I know of.

The new context module passes the exact same dictionary of information back to KeyContext when the "get context" function is called. It works just like xorg, but just pulling the class and name from the DBus connection to the shell extension. And, the window matching seems to work.

If I can get past this weird side effect I should be able to clean this up without too much more trouble. The only thing missing for now is the ability to walk up the window tree to the "parent" of a window with no WM_CLASS and WM_NAME, which means it wouldn't work with JetBrains quite yet. I don't know of any other app that requires that particular workaround.

The extension is called "Window Calls Extended".
https://extensions.gnome.org/extension/4974/window-calls-extended/

I tried to work with PyWayland, and pydbus to do this. Things did not go well in either case. Mainly I had the strangest problem with Python seeming to be unable to find most of the methods that I was trying to use from the modules, giving me constant AttributeError issues no matter how I did the installs or imports. Eventually I gave up, and managed to convert[*] some working gdbus terminal commands into what works with the regular dbus module. The gdbus commands could also be used directly via subprocess.run, but that seemed to be pretty slow and not very usable.

[*] (With a LOT of help from a certain, suddenly very popular, AI language model.)
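For reference, the D-Bus approach might look roughly like the sketch below, using the regular `dbus` (dbus-python) module. The object path, interface name, and the `FocusClass`/`FocusTitle` method names are my understanding of what the Window Calls Extended extension exposes and should be checked against the extension itself. The function names and the returned dictionary keys are assumptions for illustration, not keyszer's actual code.

```python
def connect_to_extension():
    """Return a D-Bus interface proxy for the shell extension (needs dbus-python)."""
    import dbus  # imported lazily so the rest of this sketch loads without it

    bus = dbus.SessionBus()
    # Assumed names for the Window Calls Extended extension; verify before use.
    proxy = bus.get_object("org.gnome.Shell", "/org/gnome/Shell/Extensions/WindowsExt")
    return dbus.Interface(proxy, "org.gnome.Shell.Extensions.WindowsExt")


def get_gnome_wayland_context(iface):
    """Ask the extension for the focused window's class and title.

    `iface` is any object with FocusClass() and FocusTitle() methods, which
    keeps this function testable without a running GNOME session.
    """
    try:
        wm_class = str(iface.FocusClass())
        wm_name = str(iface.FocusTitle())
    except Exception:
        # Extension missing or disabled, no focused window, bus error, etc.
        return {"wm_class": "", "wm_name": "", "x_error": True}
    return {"wm_class": wm_class, "wm_name": wm_name, "x_error": False}
```

In a live session this would be called as `get_gnome_wayland_context(connect_to_extension())`, ideally creating the interface once and reusing it, since reconnecting on every key press would likely be slow.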

@joshgoebel
Owner Author

Are you running keyszer at boot BEFORE the window manager? Perhaps Wayland tries to grab the keyboard directly itself or something. Otherwise no idea. We do not pass the raw input (other than some of the non key events as you already know).

@joshgoebel
Owner Author

Almost all the heavy lifting is in transform.

@RedBearAK
Contributor

Are you running keyszer at boot BEFORE the window manager? Perhaps Wayland tries to grab the keyboard directly itself or something. Otherwise no idea. We do not pass the raw input (other than some of the non key events as you already know).

No, the usual venv setup for testing the constant changes. I can see all the keystrokes in the log. It’s definitely all going through keyszer. It shows the correct combo in the right keymap being triggered, but the original keystrokes also show in the log as if there is no remapping happening. More keystrokes than when I use the same combo in X11. I don’t recall ever seeing anything like it before.

It’s my understanding that the grabbing of the keyboard device should be independent of the display server, with the only interaction with the display server being the context queries to see what the attributes of the focused window are at the moment you press the keys. The dict is sending x_error: False to KeyContext just like the xorg module. So it’s pretty strange. The logic should be stopping the unmapped keys from coming out.

I’ll have to litter transform with debugging output and see if I can catch it going somewhere it shouldn’t. Then figure out why.

@joshgoebel
Owner Author

joshgoebel commented Feb 11, 2023

but the original keystrokes also show in the log as if there is no remapping happening.

I'm not sure what this means... the log always shows input and output... so it's hard to imagine what you're seeing... a small log of a single combo might be nice to glance at. (X11 vs Wayland)

that the grabbing of the keyboard device should be independent of the display server

That would also have been my assumption. But Wayland is a whole other thing, in many ways not like X11 at all - perhaps it's ALSO grabbing the keyboard - which is why I was asking about load order.

The logic should be stopping the unmapped keys from coming out.

Well there is no explicit logic to do that - we just only output what we choose... grabbing the input means that nothing else can hear it ... unless we proxy it - like we do in the case of non-key events, etc.

@RedBearAK
Contributor

(--) Autodetecting all keyboards (--device not specified)
(+K) Grabbing AT Translated Set 2 keyboard (/dev/input/event1)

It shouldn't be possible for anything but evdev(?) to see the input from the real keyboard device after it's been grabbed, correct?

Maybe I'm misinterpreting what the logs are showing. There are "(OO)" lines for X11 for the input keys too. But in X11, the input keys don't actually end up doing anything.

Actually, now that I think about it, it may be more like the app is ONLY seeing the real input, and NOT seeing the mapped output. Even though the correct keymap seems to be getting triggered and supposedly the output keys should be going out. In the other apps like Firefox, shortcuts like this are probably working OK because the shortcut works after just modmapping the modifiers, without needing to be transformed further. That would explain why it also isn't working correctly in GNOME Text Editor.

Probably need to take a closer look at the actual window classes... If they're all slightly different from the X11 names... 😫

But wait, like I said, the string output I set up to output in the terminal absolutely works. Although, it should output "What the heck", but outputs "what the heck", with a lowercase "w".

These log examples show different keymaps being triggered, but that's just because I set up a special keymap for the terminal in Wayland to try and figure out what's going on. Before that the log showed it triggering on the same shortcut from "General GUI" just like it should.

This is one press of the physical keys Alt-Shift-Left_Brace (logical after modmap: RC-Shift-Left_Brace). Wayland. Has some added debugging lines.

(II) in LEFT_ALT (press)
(DD) ######### ## ###  ctx_gnome_dbus_test.py:
	wm_class = 'gnome-terminal-server'
	wm_name = 'clear && ./bin/keyszer --flush -w -v -c ~/.config/kinto-keyszer/kinto.py'

(DD) modmap: LEFT_ALT => RIGHT_CTRL [Conditional modmap - General GUI - not in remotes or terminals]
(DD) on_key RIGHT_CTRL press
(DD) suspending keys [RCtrl<Key.RIGHT_CTRL>]

(II) in LEFT_SHIFT (press)
(DD) ######### ## ###  ctx_gnome_dbus_test.py:
	wm_class = 'gnome-terminal-server'
	wm_name = 'clear && ./bin/keyszer --flush -w -v -c ~/.config/kinto-keyszer/kinto.py'

(DD) on_key LEFT_SHIFT press
(DD) resuspending keys
(DD) suspending keys [RCtrl<Key.RIGHT_CTRL>, LShift<Key.LEFT_SHIFT>]
(DD) resuming keys: [<Key.RIGHT_CTRL: 97>, <Key.LEFT_SHIFT: 42>]
(OO) press RIGHT_CTRL 1676153137.7347488
(OO) press LEFT_SHIFT 1676153137.7348669

(II) in RIGHT_BRACE (press)
(DD) ######### ## ###  ctx_gnome_dbus_test.py:
	wm_class = 'gnome-terminal-server'
	wm_name = 'clear && ./bin/keyszer --flush -w -v -c ~/.config/kinto-keyszer/kinto.py'

(DD) on_key RIGHT_BRACE press

(DD) WM_CLS: 'gnome-terminal-server' | WM_NME: 'clear && ./bin/keyszer --flush -w -v -c ~/.config/kinto-keyszer/kinto.py'
(DD) DVN: 'AT Translated Set 2 keyboard' | CLK: 'False' | NLK: 'False'
(DD) KMAPS: ['User hardware keys', 'Wordwise - not vscode',
(DD)         'What is wrong with gnome-terminal-server', 'General Terminals',
(DD)         'General GUI']
(DD) COMBO: RCtrl-LShift-RIGHT_BRACE => Ctrl-PAGE_DOWN in KMAP: ['What is wrong with gnome-terminal-server']
(DD) spent modifiers []
(OO) release LEFT_SHIFT 1676153137.767943
(OO) press PAGE_DOWN 1676153137.768032
(OO) release PAGE_DOWN 1676153137.7680573
(OO) press LEFT_SHIFT 1676153137.768091

(II) in RIGHT_BRACE (release)
(DD) ######### ## ###  ctx_gnome_dbus_test.py:
	wm_class = 'gnome-terminal-server'
	wm_name = 'clear && ./bin/keyszer --flush -w -v -c ~/.config/kinto-keyszer/kinto.py'

(DD) on_key RIGHT_BRACE release

(II) in LEFT_SHIFT (release)
(DD) ######### ## ###  ctx_gnome_dbus_test.py:
	wm_class = 'gnome-terminal-server'
	wm_name = 'clear && ./bin/keyszer --flush -w -v -c ~/.config/kinto-keyszer/kinto.py'

(DD) on_key LEFT_SHIFT release
(DD) resume because of mod release
(OO) release LEFT_SHIFT 1676153137.893625

(II) in LEFT_ALT (release)
(DD) ######### ## ###  ctx_gnome_dbus_test.py:
	wm_class = 'gnome-terminal-server'
	wm_name = 'clear && ./bin/keyszer --flush -w -v -c ~/.config/kinto-keyszer/kinto.py'

(DD) on_key RIGHT_CTRL release
(DD) resume because of mod release
(OO) release RIGHT_CTRL 1676153137.9142215

This is in X11:

(II) in LEFT_ALT (press)
(DD) modmap: LEFT_ALT => RIGHT_CTRL [Conditional modmap - Terminals]
(DD) on_key RIGHT_CTRL press
(DD) suspending keys [RCtrl<Key.RIGHT_CTRL>]
(DD) resuming keys: [<Key.RIGHT_CTRL: 97>]
(OO) press RIGHT_CTRL 1676153295.8804917

(II) in LEFT_SHIFT (press)
(DD) on_key LEFT_SHIFT press
(OO) press LEFT_SHIFT 1676153295.888439

(II) in LEFT_BRACE (press)
(DD) on_key LEFT_BRACE press

(DD) WM_CLS: 'Gnome-terminal' | WM_NME: 'clear && ./bin/keyszer --flush -w -v -c ~/.config/kinto/kinto.py'
(DD) DVN: 'AT Translated Set 2 keyboard' | CLK: 'False' | NLK: 'False'
(DD) KMAPS: ['User hardware keys', 'Wordwise - not vscode',
(DD)         'General Terminals', 'General GUI']
(DD) COMBO: RCtrl-LShift-LEFT_BRACE => Ctrl-PAGE_UP in KMAP: ['General GUI']
(DD) spent modifiers []
(OO) release LEFT_SHIFT 1676153295.9391887
(OO) press PAGE_UP 1676153295.9392915
(OO) release PAGE_UP 1676153295.939376
(OO) press LEFT_SHIFT 1676153295.9395306

(II) in LEFT_BRACE (release)
(DD) on_key LEFT_BRACE release

(II) in LEFT_SHIFT (release)
(DD) on_key LEFT_SHIFT release
(DD) resume because of mod release
(OO) release LEFT_SHIFT 1676153296.0417325

(II) in LEFT_ALT (release)
(DD) on_key RIGHT_CTRL release
(DD) resume because of mod release
(OO) release RIGHT_CTRL 1676153296.0687592

@RedBearAK
Contributor

RedBearAK commented Feb 11, 2023

There is definitely a trend of the same app using a different class/name in Wayland vs X11. Annoying. Or maybe it's more of a difference between Fedora and Ubuntu? But no, I have an app that was installed from Flathub in either case, and GNOME Terminal has always shown something different in X11 on either distro, so I think it's more of some apps choosing to show a different class and "title" for Wayland.

But that should just be a matter of adding additional patterns to match those apps. Which was already a thing that often needed to be done for different distros somehow using a different WM_CLASS for the same app, and native packages vs Flatpak or other sources having a different WM_CLASS. Shouldn't be a big deal.

And again, I can activate a special keymap for some app like GNOME Text Editor, with the class it's showing me in Wayland (org.gnome.TextEditor vs gnome-text-editor) and those mappings will only work in that app window. So the matching on the window attributes (in Wayland) is absolutely working. But, things just don't come out quite right.

The string output when I press Grave in Text Editor is set up to be (after it clears isDoubleTap):

You have double-tapped the Grave key!

But instead what I get is:

you have double-tapped the grave key1

All lower case. It's very consistent. But there isn't even a Shift key involved in the input combo. So why are the shifted characters consistently losing their shifted state, while the unshifted characters remain unshifted? And the result is the same whether I have CapsLock enabled or disabled. (This is a new pull of keyszer, without the unmerged fixes for the string and Unicode processors to adjust behavior for CapsLock LED status). The lack of the CapsLock fix doesn't explain what's happening, since toggling CapsLock has no effect on the output.

I'm at a total loss to understand this so far.

@RedBearAK
Contributor

Straining my brain here... It's as if the modifier keys (and only the modifier keys) in the output side of the mapping are not actually making it to output. Meanwhile, "normal" keys (letters, numbers, punctuation keys) that aren't modifier keys are making it to output and being "seen".

But, like those Shift key presses and releases that are supposed to be part of the string output, they are all there in the log. Just somehow they don't seem to be getting emitted by the virtual keyboard. So what the app sees is just lowercase letters, the number "1" instead of "Shift-Key_1", and so on.
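To make the symptom concrete, here is a toy model (not keyszer's string processor) that turns a string into press/release events and then "renders" what an app would see. Everything in it is a simplifying assumption: key names are just characters, and only uppercase letters and "!" are treated as shifted. Dropping the Shift events from the stream reproduces the all-lowercase output with "1" in place of "!".

```python
# Only the shifted punctuation needed for this example (US layout assumed).
SHIFTED = {"!": "1"}


def string_to_events(text):
    """Expand text into (key, action) events, wrapping shifted keys in Shift."""
    events = []
    for ch in text:
        if ch.isupper():
            key, shift = ch.lower(), True
        elif ch in SHIFTED:
            key, shift = SHIFTED[ch], True
        else:
            key, shift = ch, False
        if shift:
            events.append(("LEFT_SHIFT", "press"))
        events.append((key, "press"))
        events.append((key, "release"))
        if shift:
            events.append(("LEFT_SHIFT", "release"))
    return events


def render(events):
    """Return the text an app would produce from the events it receives."""
    unshift = {plain: shifted for shifted, plain in SHIFTED.items()}
    shift_held = False
    out = []
    for key, action in events:
        if key == "LEFT_SHIFT":
            shift_held = action == "press"
        elif action == "press":
            out.append(unshift.get(key, key.upper()) if shift_held else key)
    return "".join(out)


def drop_modifiers(events):
    """Simulate a stream where the Shift events never reach the app."""
    return [event for event in events if event[0] != "LEFT_SHIFT"]
```

Rendering the full event stream for the double-tap message gives the original string back, while rendering `drop_modifiers(...)` of the same stream gives the lowercase variant, which matches what the text editor actually received.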

@RedBearAK
Contributor

This is really making no sense to me. It's like something is "catching" the output keystrokes (and input keystrokes?), but only when they are modifier keys. The events are happening just the way they are supposed to, AFAICT.

Press LEFT_SHIFT, press "Y", release "Y", release LEFT_SHIFT, etc.

(DD) DVN: 'AT Translated Set 2 keyboard' | CLK: 'False' | NLK: 'False'
(DD) KMAPS: ['OptSpecialChars toggles', 'OptSpecialChars - US',
(DD)         'User hardware keys', 'GNOME Text Editor',
(DD)         'Wordwise - not vscode', 'Cmd+Dot not in terminals',
(DD)         'General GUI']
(DD) COMBO: GRAVE => <function isDoubleTap.<locals>._isDoubleTap at 0x7f1213e7dfc0> in KMAP: ['GNOME Text Editor']
(DD) spent modifiers []
(DD) ## isDoubleTap: 
	Time diff (just right): 
	_tapTime - tapTime1=0.18475580215454102
(DD) ### ### ### output.py:
	self, key, action = (<keyszer.output.Output object at 0x7f1213d4f940>, <Key.LEFT_SHIFT: 42>, <Action.PRESS: 1>)
(OO) press LEFT_SHIFT 1676158231.0707545
(DD) ### ### ### output.py:
	self, key, action = (<keyszer.output.Output object at 0x7f1213d4f940>, <Key.Y: 21>, <Action.PRESS: 1>)
(OO) press Y 1676158231.0709589
(DD) ### ### ### output.py:
	self, key, action = (<keyszer.output.Output object at 0x7f1213d4f940>, <Key.Y: 21>, <Action.RELEASE: 0>)
(OO) release Y 1676158231.0710757
(DD) ### ### ### output.py:
	self, key, action = (<keyszer.output.Output object at 0x7f1213d4f940>, <Key.LEFT_SHIFT: 42>, <Action.RELEASE: 0>)
(OO) release LEFT_SHIFT 1676158231.0711682
(DD) ### ### ### output.py:
	self, key, action = (<keyszer.output.Output object at 0x7f1213d4f940>, <Key.O: 24>, <Action.PRESS: 1>)
(OO) press O 1676158231.0713558
(DD) ### ### ### output.py:
	self, key, action = (<keyszer.output.Output object at 0x7f1213d4f940>, <Key.O: 24>, <Action.RELEASE: 0>)
(OO) release O 1676158231.0714197

Maybe I should ask the evdev folks.

@joshgoebel
Owner Author

Re: Shifting for "faking typing"... output looks like it's hitting shift, if that's not registering in Wayland, no idea.

Re: X11 vs Wayland, both outputs look like "what I'd expect" IF you have suspend turned down super low... the original keys DO get sent thru right away - then they are lifted before the real combos... turning suspend up is what prevents this and is what I always preferred. I don't see anything unexpected happening at a glance in either log.

@RedBearAK
Contributor

RedBearAK commented Feb 12, 2023

IF you have suspend turned down super low... the original keys DO get sent thru right away - then they are lifted before the real combos... turning suspend up is what prevents this

Working in a Boxes VM on the same laptop with the touchpad that requires it be turned down for any kind of Mod+click to work. Thought maybe the device inside the VM would act more like a regular mouse, but doesn't seem to.

So... if the original keys are always getting sent through with a low suspend timeout, why don't they ever seem to actually do anything, at least when things are working normally? I know there are combos I'm using where the input combo would make something else happen, if it was getting through to the app. Seems like I still don't precisely understand what happens to the input combo. Or is it just that the original modifier key presses get through, then released, before the full "combo" with the regular key and transformed modifiers are pressed?

output looks like it's hitting shift, if that's not registering in Wayland, no idea

That does seem to be the case, unfortunately. Like a selective filter.

@joshgoebel
Owner Author

joshgoebel commented Feb 12, 2023

Or is it just that the original modifier key presses get through, then released, before the full "combo" with the regular key and transformed modifiers are pressed?

Yes. Most software doesn't seem to care about modifiers until a non-modifier is hit - and only the modifiers held down at that moment seem to matter.
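A rough model of that rule, assuming an app only evaluates a shortcut when a non-modifier key is pressed and only looks at the modifiers held at that instant. The function and the modifier set are illustrative, not taken from any toolkit. `WAYLAND_LOG_EVENTS` is the `(OO)` output sequence from the Wayland log posted earlier.

```python
MODIFIERS = {
    "LEFT_CTRL", "RIGHT_CTRL", "LEFT_SHIFT", "RIGHT_SHIFT",
    "LEFT_ALT", "RIGHT_ALT", "LEFT_META", "RIGHT_META",
}

# The (OO) lines from the Wayland log above, in order.
WAYLAND_LOG_EVENTS = [
    ("RIGHT_CTRL", "press"),
    ("LEFT_SHIFT", "press"),
    ("LEFT_SHIFT", "release"),
    ("PAGE_DOWN", "press"),
    ("PAGE_DOWN", "release"),
    ("LEFT_SHIFT", "press"),
    ("LEFT_SHIFT", "release"),
    ("RIGHT_CTRL", "release"),
]


def combos_seen_by_app(events):
    """Return (held_modifiers, key) for every non-modifier key press."""
    held = set()
    combos = []
    for key, action in events:
        if key in MODIFIERS:
            if action == "press":
                held.add(key)
            else:
                held.discard(key)
        elif action == "press":
            combos.append((frozenset(held), key))
    return combos
```

Under this model the logged sequence yields a single Ctrl+PAGE_DOWN combo, even though Shift was briefly pressed first. If the first Shift release were lost or arrived late, the same model would report Ctrl+Shift+PAGE_DOWN, which is the "tab movement instead of tab navigation" symptom described above.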

@RedBearAK
Contributor

Yes. Most software doesn't seem to care about modifiers until a non-modifier is hit - and only the modifiers held down at that moment seem to matter.

I see, better now than before.

But the "normal" non-modifier key that completes the input combo, that doesn't end up going through before the input modifiers are released, right? (Or at all, if it's not in the transformed combo.) Or at least it's not supposed to? Because that would actually make the app do whatever the input combo would do. But that normally doesn't happen. So there is ultimately one key press & release for each combo that should be getting suppressed, right?

@RedBearAK
Contributor

Cross-posting from evdev issue thread:

Wait, adding a 0.05s delay to a part of the output module does have a significant effect. As in, it makes the intended transformed set of keystrokes "work" as they should.

    def __send_sync(self):
        time.sleep(0.05)
        _uinput.syn()

But there is still the very strange problem where the input combo, that the app window is never supposed to "see", is still being "seen" in addition to the transformed keystrokes. I've been working with this keymapper for a couple of years in X11 environments and have never seen this phenomenon. Which is why I came over here to see if there is a chance the input is not being successfully isolated from Wayland when the keymapper "grabs" the device, which has always worked fine in X11.
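For experimenting with different delay values without editing the output module each time, the sleep-before-sync idea could be wrapped in something like the following. This is a hypothetical helper, not part of keyszer: `syn` is whatever callable flushes the virtual device, and `sleep` is injectable so the behavior can be checked without actually waiting.

```python
import time


class ThrottledSync:
    """Call `syn` after an optional delay, mirroring the sleep-then-sync hack."""

    def __init__(self, syn, delay=0.0, sleep=time.sleep):
        self._syn = syn        # callable that sends the sync event
        self._delay = delay    # seconds to wait before each sync
        self._sleep = sleep    # injectable for testing

    def send_sync(self):
        if self._delay > 0:
            self._sleep(self._delay)
        self._syn()
```

A delay of `0.0` leaves the sync behavior unchanged, so the same code path can be used on X11 while a nonzero value is tried on Wayland.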

Alright, I slowed things WAAAY down with a 0.5s delay [in output.py sync function], and got this:

WWhhaatt  tthhee  hheecckk

The macro string is supposed to be:

What the heck

The CapsLock key that triggers the macro is "seen" by the app immediately (I see an on-screen notification for CapsLock and NumLock keys) and then this macro string is slowly typed out, with each character typed twice. Maybe because the long delay is before sending a sync, and something is interpreting that as the key being tapped again? But there are two capital "W", which would require Shift-W in both cases.

I'm not completely familiar with how the event code sequences work, and why this long delay causes this doubling up of the characters.

@RedBearAK
Contributor

RedBearAK commented Feb 13, 2023

Well, good news, in a way. The issue with CapsLock was actually a red herring. For some reason, when Boxes has the keyboard focus inside the VM window, it isn't really isolating the keyboard input within the VM. I guess it's doing the "shared keyboard" kind of thing, even though I have to move the mouse up to the window's top bar to Cmd+Tab away from the Boxes window. The CapsLock notifications were from the HOST outside the VM. Not inside. Which is why the output of the strings inside the VM never changed. Duh.

Adding the delay in the sync function, if the delay is long enough, completely fixes the behavior of transformed shortcuts not doing what they are supposed to do. But it has to be pretty long. Like 0.01 to 0.05 or so [Edit: After removing an additional 0.1s delay from the Unicode function, the minimum delay really seems to be at least 0.05s for any kind of reliability]. So once again we're talking about a pretty significant delay if you want to spit out a macro string or something. And it's still not 100% reliable. It stops partway through macro strings sometimes, even with the delay.

But, in a very technical sense, you could say that I have actually succeeded in bringing support for app-specific mappings using keyszer on Wayland+GNOME.

Yay me. I'm amazing.

It just really seems to have a major reliability problem, like when I was trying to get my Option-key special characters to work correctly in Kubuntu, and the only thing that seemed to help was a similar delay (before the Enter keystroke in the sequence that comes back from the Unicode processor) or disabling the sync entirely.

😞

I'm kind of expecting this to work better on a bare metal install, but have no direct evidence to support that yet. Will have to try that next.

@RedBearAK
Contributor

😡 😠 👿
Of course, the sleep delay is so long that it actually gets in the way of normal typing, even when not dealing with transformed shortcuts or macros. Without finding an explanation as to why everything is so unreliable without the delay in certain situations, this is an unusable solution.

@RedBearAK
Contributor

RedBearAK commented Feb 13, 2023

How is this even possible, when the keystrokes that should produce the "!" character should be intrinsically connected to each other as a single combo, and the Shift key shouldn't even be pressed until after the "y" key is released?

You have double-tapped the Grave keY1

Seriously:

You have double-tapped THE Grave key1

I feel like this is a powerful clue.

@joshgoebel
Owner Author

joshgoebel commented Feb 14, 2023

"You have double-tapped THE Grave key1"
should be intrinsically connected to each other as a single combo [emphasis mine]

No such thing at the low-level, it's just a sequence of keyboard events - combos aren't really separated by anything... that said it should still be sequential, so that is quite confusing to me also - but what does the output log show for that?

I imagine that could happen if you ripped out the SYNC events... so I'm not sure if you are still hacking such things... Because using sync is how you signal multiple events happened at the same time... so if you weren't SYNCing you could wind up where it's entirely ambiguous when shift was pressed if it was just part of a huge set of characters...
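For anyone following along, here is a rough sketch of the point about sync events, using plain tuples rather than a real uinput device. The numeric constants are the standard Linux input event codes; the `split_into_frames` helper is hypothetical and only models how a reader might group events, not how any particular compositor actually does it.

```python
# Standard Linux input-event constants (from linux/input-event-codes.h).
EV_SYN, EV_KEY = 0x00, 0x01
SYN_REPORT = 0
KEY_1, KEY_LEFTSHIFT = 2, 42
PRESS, RELEASE = 1, 0


def split_into_frames(events):
    """Group (type, code, value) events into frames ended by SYN_REPORT.

    Hypothetical helper: everything between two SYN_REPORT events is
    treated as one report of changes that happened "at the same time".
    """
    frames, current = [], []
    for ev_type, code, value in events:
        if ev_type == EV_SYN and code == SYN_REPORT:
            frames.append(current)
            current = []
        else:
            current.append((ev_type, code, value))
    if current:  # trailing events that were never synced
        frames.append(current)
    return frames


# Shift+1 ("!") with a sync after every key action: four separate frames.
synced = [
    (EV_KEY, KEY_LEFTSHIFT, PRESS), (EV_SYN, SYN_REPORT, 0),
    (EV_KEY, KEY_1, PRESS), (EV_SYN, SYN_REPORT, 0),
    (EV_KEY, KEY_1, RELEASE), (EV_SYN, SYN_REPORT, 0),
    (EV_KEY, KEY_LEFTSHIFT, RELEASE), (EV_SYN, SYN_REPORT, 0),
]

# The same key actions with one sync at the very end: a single big frame.
unsynced = [e for e in synced if e[0] != EV_SYN] + [(EV_SYN, SYN_REPORT, 0)]

print(len(split_into_frames(synced)))    # 4
print(len(split_into_frames(unsynced)))  # 1
```

In the second case the Shift press and the "1" press land in the same frame, so a consumer has no timing information to decide which came first beyond the order inside the frame.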

@RedBearAK
Contributor

RedBearAK commented Feb 14, 2023

Logs are fine. Both the keyszer log, and the log showing what evtest sees coming out of the virtual keyboard device. Perfect order. But whatever is receiving the keystrokes seems to be jumbling things if they are too close together time-wise. Whether that's X11/Wayland, the window manager/shell, the kernel, or just the app window, I have no idea yet how it works on that side of things.

I'm leaving the sync event alone in send_key_action, leaving it disabled in send_event since the real keys send their own sync events through there, as far as I understand. So that's not the problem.

It really feels like a router with a buggy algorithm failing to put packets back together in the correct transmission sequence.

Anyway, I found that regular typing can still be usable if we move the delay into send_combo. A delay wrapped around the "normal" key press-release is pretty effective, with the same delay before the modifier press and after the release making it slightly more effective. But the delay needs to be at least 0.03s in this situation, or things start to go wrong.

This way, shortcuts are working as expected, with no obvious delay. Macros are quite slow, about like a 40wpm typist, but they don't get screwed up anymore even with sequences of several shifted characters together, and normal typing input is not messed with because the delay is not attached to the sync function.

        key_delay_testing = 0.03    # delay to insert between mod+key press/release

        for key in mod_keys_we_need_to_lift:
            # time.sleep(key_delay_testing)
            self.send_key_action(key, RELEASE)
            released_mod_keys.append(key)
            time.sleep(key_delay_testing)

        for key in [mod.get_key() for mod in mods_we_need_to_press]:
            time.sleep(key_delay_testing)
            self.send_key_action(key, PRESS)
            pressed_mod_keys.append(key)
            # time.sleep(key_delay_testing)

        # normal key portion of the combo
        time.sleep(key_delay_testing)
        self.send_key_action(combo.key, PRESS)
        self.send_key_action(combo.key, RELEASE)
        time.sleep(key_delay_testing)

I've tried many different iterations and this is about the best I've achieved. Tried leaving out the delay if there's no modifier involved, but the results were always bad.

@RedBearAK
Contributor

Nope, not reliable at 0.03s after further testing. Had to push it to 0.04s. And the only delay lines needed are around the "normal" key press-release event. The others didn't really help.

        key_delay_testing = 0.04    # delay to insert between mod+key press/release

        for key in mod_keys_we_need_to_lift:
            self.send_key_action(key, RELEASE)
            released_mod_keys.append(key)

        for key in [mod.get_key() for mod in mods_we_need_to_press]:
            self.send_key_action(key, PRESS)
            pressed_mod_keys.append(key)

        # normal key portion of the combo
        time.sleep(key_delay_testing)
        self.send_key_action(combo.key, PRESS)
        self.send_key_action(combo.key, RELEASE)
        time.sleep(key_delay_testing)

@RedBearAK
Contributor

capslock 🌹🌹€€‡‡ÿÿ‡e‡-tapped11111

I was really confused by this phenomenon, but think I figured out why this duplication of Unicode characters happens when the timing issue is particularly bad. Something is interpreting the keystroke sequence as somehow being both Unicode entry methods:

  • Shift-Ctrl-U, hold modifiers, type Unicode address, release modifiers (no Enter key)
  • Shift-Ctrl-U, release modifiers, type Unicode address, hit Enter key

Thus, two identical Unicode characters sometimes. I can't replicate it manually, but this is the only possible explanation.
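To make the two entry methods concrete, here is a small sketch that spells out the key sequence for each one. The `unicode_entry_keys` helper and its plain-string key names are hypothetical, for illustration only; they are not keyszer's actual Unicode processor.

```python
def unicode_entry_keys(char, hold_modifiers=False):
    """Return the key actions used to enter `char` via Shift-Ctrl-U.

    Hypothetical helper: key actions are plain strings, not real Key objects.
    """
    # The "Unicode address" is the code point written in hexadecimal.
    hex_digits = list(format(ord(char), "X"))
    if hold_modifiers:
        # Method 1: keep Shift+Ctrl held while typing the code point;
        # releasing the modifiers commits the character (no Enter key).
        return (["Shift down", "Ctrl down", "U"] + hex_digits
                + ["Ctrl up", "Shift up"])
    # Method 2: release the modifiers, type the code point, hit Enter.
    return (["Shift down", "Ctrl down", "U", "Ctrl up", "Shift up"]
            + hex_digits + ["Enter"])


print(unicode_entry_keys("€"))
print(unicode_entry_keys("€", hold_modifiers=True))
```

The two sequences differ only in where the modifier releases fall and whether Enter is sent, which is why a receiver that loses track of event timing could plausibly treat one sequence as both.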

Submitted a description of the overall issue in the libinput GitLab. We'll see if anyone has a response.

@RedBearAK
Contributor

I have to save this for posterity. What the actual...

🌹ck —€ÿ—‡ 1234% 12#$% \\ double-TAPPeD11111

Supposed to be:

CapsLock 🌹—€—‡—ÿ—‡ 12345 !@#$% |\ Double-Tapped!!!!!

@RedBearAK
Contributor

I'm seeing window class and name output for JetBrains IntelliJ IDEA (Community Edition) in Wayland. At least with the Flatpak from Flathub.

(DD) ## ctx_gnome_dbus.py - get_gnome_dbus_context():
	wm_class = 'jetbrains-idea-ce'
	wm_name = 'asdf \u2013 src/Main.java [asdf]'

Interesting that this appears to be using an "En dash" Unicode character in the window title/name, rather than a simple dash/hyphen. The "\u2013" appeared to just be a hyphen in the terminal, before I copied the output and pasted it here.

And in PyCharm (Community Edition):

(DD) ## ctx_gnome_dbus.py - get_gnome_dbus_context():
	wm_class = 'jetbrains-pycharm-ce'
	wm_name = 'first_run.txt - .../share/jetbrains-flatpak-wrapper'

So I'm not sure that it will actually be necessary to try to find a way to climb the window tree to get the parent window in Wayland the way the xorg module does. Installing the JetBrains apps on the host OS running X11 to confirm.

IntelliJ IDEA:

WM_CLASS(STRING) = "jetbrains-idea-ce", "jetbrains-idea-ce"
_NET_WM_NAME(UTF8_STRING) = "testing – Main.java"

Weird. Pretty sure that's still an "En dash" character, but it doesn't get converted into "\u2013" when copying and pasting from the host OS. Maybe that has something to do with copying inside the Boxes VM guest OS and pasting in a browser running on the host OS.

PyCharm:

WM_CLASS(STRING) = "jetbrains-pycharm-ce", "jetbrains-pycharm-ce"
_NET_WM_NAME(UTF8_STRING) = "first_run.txt - .../share/jetbrains-flatpak-wrapper"

And, due to the usage of a non-ASCII character, these apps would have always triggered the "(COMPOUND_TEXT)" encoding bug, if anyone had tried matching on the window name before the fix that changed it to using _NET_WM_NAME. Xlib would have returned an empty "bytes object" for this every time:

WM_NAME(COMPOUND_TEXT) = "testing – Main.java"
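A quick way to check whether a window title contains characters like that en dash, using only the standard library. The `non_ascii_chars` helper is a hypothetical name, and the title string is just the example from above.

```python
import unicodedata


def non_ascii_chars(title):
    """List each non-ASCII character with its code point and Unicode name."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in title
        if ord(ch) > 127
    ]


title = "testing \u2013 Main.java"
print(title.isascii())         # False
print(non_ascii_chars(title))  # [('–', 'U+2013', 'EN DASH')]
```

Any title for which `isascii()` is False is one that the older WM_NAME path could have tripped over.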

I must admit that at the moment I don't understand why the get_actual_window() workaround exists.

Ah, this is the problem:

#########################
  get_actual_window():
	wmclass = ('Focus-Proxy-Window', 'FocusProxy')
	wmname = 'FocusProxy'

When xorg tries to get the WM_CLASS, it sees "FocusProxy" instead of seeing what xprop shows in the terminal. So Xlib has to be forced to retrieve the real window class, to show the same result as xprop.

But I'm already printing the retrieved class and title of these Java apps from within the get_gnome_dbus_context() function, so... this doesn't appear to be a problem that affects Wayland. Or maybe it's already taken care of in the shell extension that provides the window attributes to us.

So the function to climb up the window tree to the "actual" window doesn't appear to be necessary in a Wayland environment. Well, at least in the Wayland+GNOME "talking to shell extension via DBus" environment.

That means I could technically say the Wayland+GNOME context module is finished. In the sense that it is doing everything it needs to do.

🎉

@RedBearAK
Contributor

Still messing with some things, but I made a branch with everything that I've put together so far.

https://github.com/RedBearAK/keyszer/tree/wayland_gnome_dbus

It's going to show some extraneous changes like the unmerged updates to the Unicode and string processors, and some logging changes.

This is the first time I'm attempting to do something this big, and using VSCode to do the commits.

The strategy here is, I made a "connector" that uses environment info to transparently link keycontext to the proper module for the compatible detected environment. Should allow further context modules for additional environments to be plugged in easily.
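A minimal sketch of the connector idea, assuming the environment is read from the standard `XDG_SESSION_TYPE` and `XDG_CURRENT_DESKTOP` variables. The function names and the provider labels (`"xorg"`, `"wl_gnome_dbus"`) are hypothetical placeholders; the actual branch may detect things differently.

```python
import os


def detect_environment(env=None):
    """Return (session_type, desktop) in lowercase from environment variables."""
    env = os.environ if env is None else env
    session_type = env.get("XDG_SESSION_TYPE", "").lower()
    desktop = env.get("XDG_CURRENT_DESKTOP", "").lower()
    return session_type, desktop


def pick_provider(session_type, desktop):
    """Map the detected environment to a context provider label (hypothetical)."""
    if session_type == "x11":
        return "xorg"
    # XDG_CURRENT_DESKTOP can be a colon-separated list such as "ubuntu:GNOME",
    # so a substring test is used rather than an exact comparison.
    if session_type == "wayland" and "gnome" in desktop:
        return "wl_gnome_dbus"
    return None  # no compatible environment detected


session_type, desktop = detect_environment(
    {"XDG_SESSION_TYPE": "wayland", "XDG_CURRENT_DESKTOP": "ubuntu:GNOME"}
)
print(pick_provider(session_type, desktop))
```

Adding another environment would then mean adding one more branch (or table entry) plus the module that implements it.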

@joshgoebel
Owner Author

Please open a draft PR so it's easier to see what you're working on over there.

@RedBearAK
Contributor

Please open a draft PR so it's easier to see what you're working on over there.

It's still kind of a mess so you'll have to be patient if you want any major changes. But I'll do that shortly.

Oh, I forgot to mention it also includes the throttling delays I submitted in the most recent PR (#134), without which it would be totally unusable in the VM.

@RedBearAK
Contributor

Reminder to self:

TODO: Implement an API function to allow manual override of the detection of session type and desktop environment from the user's config file, in case the automated detection fails for some reason but there's a chance keyszer would still work in the user's environment.

Shouldn't be too difficult. I'll model it on the throttle delay API function that's already allowing me to inject keystroke delays from config.

@joshgoebel
Owner Author

If auto-detection is possible (see my other question about env vars) it should be handled inside the display context manager with a simple state machine...

stateDiagram-v2
    [*] --> autodetect
    autodetect --> connected
    connected --> connection_error
    connection_error --> autodetect
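In Python, that diagram could look roughly like the sketch below. The class name `DisplayContextManager` and the transition table are assumptions made for illustration; nothing like this exists in keyszer yet.

```python
from enum import Enum, auto


class State(Enum):
    AUTODETECT = auto()
    CONNECTED = auto()
    CONNECTION_ERROR = auto()


# Allowed transitions, copied from the diagram.
TRANSITIONS = {
    State.AUTODETECT: {State.CONNECTED},
    State.CONNECTED: {State.CONNECTION_ERROR},
    State.CONNECTION_ERROR: {State.AUTODETECT},
}


class DisplayContextManager:
    """Hypothetical holder for the current connection state."""

    def __init__(self):
        self.state = State.AUTODETECT  # the [*] --> autodetect edge

    def transition(self, new_state):
        # Refuse any move that is not an edge in the diagram.
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"illegal transition {self.state.name} -> {new_state.name}"
            )
        self.state = new_state


manager = DisplayContextManager()
manager.transition(State.CONNECTED)         # autodetect succeeded
manager.transition(State.CONNECTION_ERROR)  # lost touch with the WM
manager.transition(State.AUTODETECT)        # start over
print(manager.state.name)
```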

@RedBearAK
Contributor

If auto-detection is possible (see my other question about env vars) it should be handled inside the display context manager with a simple state machine...

I'm not sure exactly which part you're referring to as the "display context manager". The new ctx__connector module? Or you're just speaking hypothetically?

And the individual context module "getter" functions are passing back the same dict as xorg, just with the x_error renamed to context_error to be more generic. Isn't that what causes all the remapping/transforming stuff to be disabled while the window attributes are inaccessible (such as when there is a problem connecting to the X server)?

So I feel like this is just describing what's already happening in what I've put together... ? Unless I'm doing something wrong.

@joshgoebel
Owner Author

That code runs just once (that I see)... while for true auto-detection every time you lost touch with the window manager you'd have to assume they might be logging out and switching into a different DE/WM, right?... hence my suggesting a class to manage that functionality and the current "state" of whether we are "connected" for a WM and which.

@RedBearAK
Contributor

That code runs just once (that I see)... while for true auto-detection every time you lost touch with the window manager you'd have to assume they might be logging out and switching into a different DE/WM, right?... hence my suggesting a class to manage that functionality and the current "state" of whether we are "connected" for a WM and which.

Ah, I see what you're saying. You're thinking of it as a system service that just hangs around when the user logs out, but stops remapping things. Right.

Honestly, I feel like the much more correct way of doing multi-user support will be a user-level service rather than a system service. Which will naturally stop running when the user logs out, and start again when the user logs in. In which case there would be no need for a state machine, as the environment is re-scanned at startup.

Although I suppose there may be a benefit to having the system service "just work" when you do a "switch user" kind of thing, where it would just re-evaluate the environment as you log into the other user's desktop. But no, thinking about it, it seems like having a user service run a separate instance of keyszer from that user's config, especially since that user may have a completely different config, would still be the more appropriate way of doing things. If both users are "you", that's one thing, but what if it really is a system with multiple people using it, with completely different remapping needs?

Having a keymapper service running at system level really seems problematic, actually. Kinto's service file has always had the path to the user's config file hard-coded, making the idea of multi-user support difficult. Even if it's possible to have the service file draw from a per-user config file path, something would need to tell the service to restart in order to change to the new user's config. And if you're restarting to switch to a different config, surely there is no need for a state machine, again.

If I'm thinking about this correctly, even if you did a switch-user or go-to-login-screen kind of thing without logging out, that would disable the remapping while the window attributes are inaccessible, but once you got back into your own graphical environment the window attribute getter should just automatically start working again. The environment itself can't actually change unless you really log out of your session, AFAIK, so what would trigger the need to re-evaluate the environment within that single run of the app?

Am I way off base on something here?

You know I like automating things, so it's not that I'm disagreeing that state machines are great when they are needed.

The only thing I'm wondering about is whether a user-level service from a user that stays logged in would interfere with a user-level service in another account trying to grab the same input device(s).

@RedBearAK
Contributor

FYI, I just did a proof-of-concept setting up another module to use the GNOME shell extension maintained by the xremap project, as an alternative/backup to the "Window Calls Extended" shell extension. It's unfinished but already showing the window attributes as a one-time test at startup. The output is a bit different so it needs some tweaking to fit the mold, but it should work just as well as the other extension.

This ability to use multiple GNOME extensions should help alleviate the potential "fragility" of relying on third parties to keep their extensions updated to be compatible with each new release of GNOME.

TODO: Add the ability to specify the extension to use as part of the environment config injection, if the auto-detection of the available extensions fails. (Seems problematic at the moment: talking to the extensions over D-Bus works, but for some reason they don't show up in the list of all available D-Bus interfaces. Really odd.) Hmm, maybe I should just put it together so that it tries both and only returns the context error if neither of the D-Bus queries gives anything back. I'll have to think about that.

Oh, another FYI that I should toss in: The window classes are indeed sometimes different than expected in Wayland, but I think that's because the class string I'm getting back from these extensions is the first element of the "pair" that comes back for WM_CLASS even in Xlib/xprop, rather than the second element:

WM_CLASS(STRING) = "gnome-terminal-server", "Gnome-terminal"

So what I ended up seeing in Wayland for GNOME Terminal was the first string, whereas Kinto/xkeysnail/keyszer has always wanted to match on the second string of the pair, so that's what I've always looked at.

In many cases these two elements are the same, or with one capitalized and the other uncapitalized, so with case-insensitive matching they end up identical. But in some cases the elements are very different.

I'll have to look into whether these extensions are actually using the "wrong" element or somehow the first element is the only thing available to them. Or it might actually be more correct to use the first element, which often has a more "technical" appearance rather than a "pretty" appearance. In which case it may be smart to change the way the xorg method is working.

The only thing for sure right now is that there will be a difference for some apps currently, between when the xorg method is used, and when these GNOME extensions are used. This will require some expansion of configs like Kinto.

It doesn't appear to be the case that the applications themselves are choosing to present a different string specifically for Wayland. It's just a different element of the existing WM_CLASS pairing.
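One possible way to make a config tolerant of the difference is to test a pattern against both elements of the pair. This is only a sketch under that assumption; `class_matches` is a hypothetical helper, not how keyszer currently matches window classes.

```python
import re


def class_matches(pattern, wm_class):
    """Return True if `pattern` matches either element of the WM_CLASS pair.

    `wm_class` is the (first, second) pair as shown by xprop.
    Matching is case-insensitive, as described above.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    return any(regex.search(element) for element in wm_class)


gnome_terminal = ("gnome-terminal-server", "Gnome-terminal")

# Matches the second element, which the xorg method has always used.
print(class_matches(r"^gnome-terminal$", gnome_terminal))
# Matches the first element, which the GNOME extensions return.
print(class_matches(r"^gnome-terminal-server$", gnome_terminal))
```

The trade-off is that on Wayland only one element is available from the extensions, so the config would still need both identifiers listed for apps where the two elements differ.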

@RedBearAK
Contributor

Aha, finally found a working dbus-send command that reveals the installed extensions, although with the way things change in GNOME it's hard to say how reliable this will be over time, and with different versions of gnome-shell:

dbus-send --session --dest=org.gnome.Shell.Extensions --print-reply /org/gnome/Shell/Extensions org.gnome.Shell.Extensions.ListExtensions

@RedBearAK
Contributor

I finished a state machine for working with the (two, for now) compatible GNOME Shell extensions. So the user could install, uninstall, enable, or disable any compatible extension that keyszer knows about, and the context manager will adapt on-the-fly, only giving back the NO_CONTEXT_WAS_ERROR object if an assessment of all compatible extensions fails.

To make it as light as possible, it tries to only work with the extension that was "known good" the last time the function was called, keeping the extension uuid in a global. Which should mean that if the extension is never switched to a different one in the middle of a desktop session, it will just keep hitting only the one that already returned good results the last time. It will only re-evaluate the other extension(s) if the working one has an exception (i.e., because it's suddenly disabled or something). Then it will drop back through the function (one extra time) and try to check all extensions for a good result. This makes it capable of immediately recovering context functionality if there is another active and compatible extension present. Without so much as a hiccup.

If there is no active compatible extension, it will be like the very first key press, attempting to find a working extension with each key press and returning NO_CONTEXT_WAS_ERROR and some helpful logging output until an extension try succeeds. Then it will go back to just using the one that works and ignoring the other possibilities.
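The gist of that logic, as a simplified sketch. The class name, the provider callables, and the exact shape of the `NO_CONTEXT_WAS_ERROR` dict are assumptions for illustration; the real module talks to the extensions over D-Bus and keeps the uuid in a global rather than on an instance.

```python
# Assumed shape of the error object (keys are a guess based on the
# "same dict as xorg, with context_error" description in this thread).
NO_CONTEXT_WAS_ERROR = {"wm_class": "", "wm_name": "", "context_error": True}


class ExtensionContext:
    """Hypothetical: try the last known-good extension first, then the rest."""

    def __init__(self, providers):
        # providers: dict mapping extension uuid -> zero-argument callable
        # that returns a context dict or raises if the extension is unusable.
        self.providers = providers
        self.last_good_uuid = None

    def get_context(self):
        order = list(self.providers)
        if self.last_good_uuid in self.providers:
            # Move the last known-good extension to the front of the line.
            order.remove(self.last_good_uuid)
            order.insert(0, self.last_good_uuid)
        for uuid in order:
            try:
                context = self.providers[uuid]()
            except Exception:
                continue  # disabled, uninstalled, or otherwise failing
            self.last_good_uuid = uuid
            return context
        # Every compatible extension failed on this key press.
        self.last_good_uuid = None
        return NO_CONTEXT_WAS_ERROR
```

As long as the remembered extension keeps answering, only one query is made per call; the others are touched again only after a failure.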

Lather, rinse, repeat. Or rather: Lather, repeat, and only rinse if the soap runs out. 😆

As long as nothing goes wrong with the working extension, it's kind of like the context connector module, but operating independently so that even the context connector doesn't need to care about trying to use a specific GNOME Shell extension. This module will figure that out all by itself.

Should be trivial to add other extensions to the module if anything else pops up that can provide the same information.

I don't expect to find any others anytime soon. I was actually surprised to find even two different extensions capable of providing the window properties we need.

@joshgoebel
Owner Author

Sounds nifty, let's see it. :)

@RedBearAK
Contributor

Sounds nifty, let's see it. :)

No. 😆

Just kidding. I'll be syncing the commits shortly. I'm sure you'll see some things that should be done better. I'm not convinced that the round-robin style of checking the extensions when there isn't a working "last" uuid is the most elegant way of getting the job done, for instance. But it is adapting automatically to the extent that I had intended.

@RedBearAK
Contributor

Alright, the changes should be showing online. The new module is:

src/keyszer/context/wl_gnome_dbus_shell_ext.py

ctx__connector.py identifies Wayland+GNOME and then links get_window_context to get_wl_gnome_dbus_shell_ext_context, which then will give back the specific D-Bus communication function for the first extension that responds. The connector module doesn't attempt to identify what extensions are available and enabled. Such a task turned out to be surprisingly more complicated than just attempting to use the respective D-Bus interfaces and going with whatever works.

And the separate per-extension modules should probably go away at some point:

src/keyszer/context/wl_gnome_dbus_windowsext.py
src/keyszer/context/wl_gnome_dbus_xremap.py

Since all the context modules are in the context folder now, I tried to simplify some of the module names, rather than having "ctx_" on everything, with "wl" being a common abbreviation for Wayland among the programming frameworks.

@RedBearAK
Contributor

I was also thinking about the state machine thing earlier, for the rest of the environment, and remembered that xhost has always been necessary for xkeysnail to get access to the user's X display server, to retrieve the window properties while running as root. I assume the same thing holds when running as a separate user like "keymapper". You have to somehow give the other user access to your display server.

I haven't found an equivalent tool for Wayland, just some whisperings about certain window managers possibly providing their own methods to give another user access to the Wayland display server. Unless I'm misunderstanding the situation entirely, the security model of Wayland being so different from X11 should mean that trying to run keyszer as a system service with Wayland involved will be either A) impossible or B) highly dependent on the specific DE or WM providing a tool to open up the security model to a separate user. In other words, impossible or a huge, complicated pain that wouldn't be worth the trouble, when it should just be run from a user service file instead, so that it shuts down when the user logs out.

But I'll still have to set up a system service and separate user to verify any of this.

@RedBearAK
Contributor

A user on the xremap project seems to be making good progress with a way to get both the application class and window name while in a Wayland+KDE environment. Unfortunately that project is written in Rust, but with the help of some AI analysis it should be possible to port the technique over to keyszer at some point.

xremap/xremap@master...N4tus:xremap:kde-wayland

It's still kind of in flux, but seems to combine a KWin script and a D-Bus interface that asks the Plasma shell to give it the window attributes whenever the active window changes. Sort of the opposite of how the GNOME shell extensions work, advertising a D-Bus interface to be queried. But basically the same concept.

Some of their comments indicate that they were not able to get the window information when running as root, due to the security model of Wayland. This confirms my belief that the proper way to run the keymapper in most situations is as a user service.

I have a user service working on my machine, with a tray icon to allow stopping and restarting the service. But actually there are two services, one of which stops the keymapper when the user's "session" is no longer active. This seems to allow for a true multi-user setup, where the first user running keyszer can do a "Switch user" and log into a different user's desktop (without actually logging out of their own desktop), and the instance of keyszer running on the session that is no longer using the screen (i.e., no longer "active") will be stopped until the first user logs back into their session. The second user should be able to run their own keyszer service with this technique. Which would then be automatically stopped if they switched back to the first user's desktop, where the keyszer service would reactivate.

This relies entirely on loginctl, which is part of systemd. I'm not yet aware of any other way to get information about whether the user's desktop session is actively using the screen, or is just hanging around in the background. Still looking, since it would be nice to have this sort of thing working on distros that don't use systemd.
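For reference, a minimal sketch of asking loginctl whether a session is active. It assumes the session ID is available (for example from `XDG_SESSION_ID`, which may not be set inside a user service); the function names are hypothetical and this is not the actual service script.

```python
import os
import subprocess


def parse_active(output):
    """Parse the 'Active=yes' / 'Active=no' line printed by loginctl."""
    for line in output.splitlines():
        key, _, value = line.partition("=")
        if key.strip() == "Active":
            return value.strip() == "yes"
    return False  # property missing: treat the session as inactive


def session_is_active(session_id=None):
    """Return True if loginctl reports the given session as active."""
    session_id = session_id or os.environ.get("XDG_SESSION_ID", "")
    if not session_id:
        return False  # assumption: without an ID we cannot tell
    try:
        result = subprocess.run(
            ["loginctl", "show-session", session_id, "--property=Active"],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:
        return False  # loginctl not installed (non-systemd distro)
    return parse_active(result.stdout)
```

A watcher service could poll `session_is_active()` and stop or restart the keymapper service whenever the answer changes.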

@RedBearAK
Contributor

@joshgoebel

OK, so I think I managed to fix the things that were broken in my own branches, so I have:

  • a working installer for my variation of Kinto, that I've tested out on most popular distros so far, with...
  • a branch of keyszer that provides window context on X11/Xorg and Wayland+GNOME environments, from...
  • an entirely class-based window_context module that should be easy to add new "providers" into...
  • which is told what the environment is via a new API function (defaults to 'x11' session for now)...
  • which in turn is automatically given the environment values at runtime from the env module used by my config file.

I think I'm doing pretty good at this point, approaching the end goal of having a highly flexible variant of Kinto that runs on as many Linux distros as possible, while requiring minimal user input.

I am now, for the first time, operating in Wayland on my main Fedora 36 install, as well as having tested the Wayland install in some VMs and on an old MacBook. It seems to usually need a bit more throttle delay than most bare metal installs, but that could be partially due to the need to use the GNOME Shell extensions to get the window info. Not sure if that could ever be improved as we wait for a more native Wayland solution for getting window context info.

Should be possible to translate the Rust-based module, that makes Wayland+KDE_Plasma work now for Xremap, into a new provider class for keyszer that could just automagically work if the user injects the correct environment API values.

I noticed during the testing of the class-based providers that the KeyContext object is literally being re-created for every key press, so my classes were having difficulty "remembering" things like which extension was last used. Obviously KeyContext needs to be passed the device so it knows where the input is coming from, but I feel like that should be revisited so that a new instance of the object is only created when the device is different from the last time. In many cases the user will go for a long period of time using the same input device, so unless I'm missing something there is technically no need to be constantly instantiating a new class object. And even if the device changes, it seems like it should be possible to just change the device without creating a new KeyContext instance entirely (but I'm not sure about that).

In any case, what I did feel a need to do was create the instance of the window_context object outside of on_event() and then just pass it to the KeyContext instance as an argument/parameter. This seems to work out pretty well, since the window context object shouldn't need to change during a session.
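Roughly, the arrangement looks like the sketch below. These classes are simplified stand-ins with hypothetical signatures, not the real keyszer `KeyContext` or window context code; the point is only that the long-lived object is created once and handed to each short-lived one.

```python
class WindowContext:
    """Stand-in for a provider that needs to remember state between queries."""

    def __init__(self):
        self.query_count = 0  # survives across key presses

    def get_window_context(self):
        self.query_count += 1
        return {"wm_class": "", "wm_name": "", "context_error": False}


class KeyContext:
    """Stand-in: still re-created per key press, but no longer owns the provider."""

    def __init__(self, device, window_context):
        self.device = device
        self.window_context = window_context


# Created once, outside on_event(), so its state persists.
window_context = WindowContext()


def on_event(event, device):
    # A fresh KeyContext per event, sharing the single window_context.
    context = KeyContext(device, window_context)
    return context.window_context.get_window_context()


on_event(None, "keyboard0")
on_event(None, "keyboard0")
print(window_context.query_count)
```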

Of course it helps that my variant of Kinto is running from systemd "--user" services, or just shell scripts for verbose logging. It is not attempting in any way to be system-wide or operate outside the user session, where it might encounter a different session type during a single run of the keymapper. So that part might need to be looked at more closely. But I don't know how even my fancy config file powered by the env module would handle changing the environment on the fly. Operating the keymapper in a way that would not terminate somehow when the user logs out or switches users just seems like a problematic state to be in, given that the user could do something like leave an X11/Xorg session running and "switch" into a different user account running a Wayland session.

I have not yet tackled the possibility of injecting a "custom" provider object from the config file side, although I've gone over how that might work with my online assistant. Seems like it will be a bit finicky trying to grab the custom provider class from memory by name, after the config is exec'd. But it could be useful for testing new providers before pulling them into keyszer. I just don't know if I'll be able to get to it anytime soon.
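One way the lookup by name might work, purely as a sketch. It assumes the config is executed into a plain dict namespace and that a provider is any class with a `get_window_context` method; both the `find_provider_class` helper and that convention are hypothetical.

```python
# Stand-in for a user's config file contents.
config_source = '''
class MyProvider:
    def get_window_context(self):
        return {"wm_class": "demo", "wm_name": "demo", "context_error": False}
'''

# Execute the config into its own namespace so names can be looked up later.
namespace = {}
exec(config_source, namespace)


def find_provider_class(namespace, class_name):
    """Return the named class if it looks like a provider, else None."""
    candidate = namespace.get(class_name)
    if isinstance(candidate, type) and callable(
        getattr(candidate, "get_window_context", None)
    ):
        return candidate
    return None


provider_class = find_provider_class(namespace, "MyProvider")
print(provider_class().get_window_context()["wm_class"])
```

The finicky part is mostly deciding how the user tells the keymapper which name to look up; the lookup itself is a dictionary access.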

I've heard that at least one big distro has now declared that Xorg will be deprecated, so the time when the most popular distros will start actually removing Xorg and making users install it themselves is continuing to come closer.

Let me know if you see any serious issues with the latest attempt at Wayland support infrastructure: PR #157

@RedBearAK
Contributor

@joshgoebel @rbreaves

Well, it's still a bit rough around the edges as far as setting up the support files (a KWin script and a Python script that creates a D-Bus service that acts as a bridge between KWin and keyszer), but it looks like I finally broke through the wall that had me stymied for a couple of days. I now have working app-specific remaps on Wayland+KDE_Plasma. It's only in the dev_beta development branch of my Kinto-like variant, and a branch of keyszer that I haven't submitted for a PR yet, but still... 🎉 Translating the technique from the Xremap+KDE module written in Rust was less straightforward than I had hoped, even with AI helping analyze it.

Other than some continuing observations of minor differences in application names in the equivalent of WM_CLASS (resourceClass) under Wayland, which sometimes requires some new identifiers to be added to the config file for the relevant keymaps to engage, I'm not seeing much of an issue so far. Pretty much the same situation as when I got Wayland+GNOME working.

It's looking good for quite a high percentage of desktop Linux users to be able to use Wayland with keyszer. Wonder what percentage of Linux desktop users are covered by support for both GNOME and KDE Plasma. I'm guessing at least 50%.

Only tested completely on Fedora 38 KDE spin so far, but should work on other KDE distros equally well.

@RedBearAK
Contributor

Have tested the Wayland+KDE solution now on several distros with a KDE variant, or the ability to install a KDE desktop with a Wayland session. So far it's working pretty well. Ironed out a few bugs in the setup process while going through getting the installer working on openSUSE Tumbleweed, and I think it's pretty robust at this point.

https://github.com/RedBearAK/toshy

This installer now works on numerous different distros, and the end product will work if you are on X11/Xorg, Wayland+GNOME, or Wayland+KDE (Plasma). Any other environment will cause the branch of keyszer that this installs to exit with a log message about the compatible environments.
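For reference, environment detection along these lines can be sketched with the standard `XDG_SESSION_TYPE` and `XDG_CURRENT_DESKTOP` variables. Whether the keyszer branch checks exactly these variables is an assumption, and `detect_environment` is a made-up name.

```python
import os


def detect_environment(env=None):
    """Classify the session as 'x11', 'wayland+gnome', 'wayland+kde', or None.

    Assumption: the session exports XDG_SESSION_TYPE and XDG_CURRENT_DESKTOP.
    """
    env = os.environ if env is None else env
    session = env.get("XDG_SESSION_TYPE", "").lower()
    # XDG_CURRENT_DESKTOP is a colon-separated list, e.g. "ubuntu:GNOME".
    desktops = [d.lower() for d in env.get("XDG_CURRENT_DESKTOP", "").split(":") if d]

    if session == "x11":
        return "x11"
    if session == "wayland":
        if "gnome" in desktops:
            return "wayland+gnome"
        if "kde" in desktops:
            return "wayland+kde"
    # Anything else is unsupported; the caller can log a message and exit.
    return None
```

Passing a plain dict as `env` makes it easy to try out combinations without logging in to each desktop.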

@RedBearAK
Contributor

RedBearAK commented Aug 29, 2023

FYI to anyone who wants to use keyszer on Wayland, but is not interested in the rest of what Toshy does (making shortcuts work like a Mac), I added an installer flag a while back that will install a "barebones" config file that does none of the Mac-like stuff.

./setup_toshy.py install --barebones-config

When using this installer option, Toshy becomes just a convenient way to install keyszer with one command, and provides a tray icon menu and terminal commands to manage the services that A) make keyszer more multi-user friendly, and B) give it some Wayland support (but still GNOME and KDE Plasma only, at this point).

So Toshy can technically be useful now for those who just want to do some other kind of keymapping with keyszer. Whether in X11/Xorg or Wayland+GNOME/Wayland+KDE.

For the Wayland+KDE support in particular, Toshy has to use multiple components separate from the keymapper to get the window attributes from KWin, so it is unlikely the complete solution can be integrated into keyszer anytime soon.

Good news is the Toshy installer works on a pretty long list of popular distros now. And smoothly handles moving from one desktop environment to another on the same system, auto-detecting the environment.

https://github.com/RedBearAK/toshy

I'd like to hear from anyone who is using Toshy this way.

@RedBearAK
Contributor

I've added support for the sway window manager to the branch of keyszer that my project (Toshy) is using.

Anyone interested can install for now from a zip downloaded from this beta branch snapshot of the working version in the dev_beta branch:

https://github.com/RedBearAK/toshy/tree/sway_beta

At some point it will be merged into main, so if it's been a while since this was posted check the main branch README for references to sway.
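For anyone curious what window detection on sway can look like, here is a minimal sketch that reads the focused window from `swaymsg -t get_tree`. This is not necessarily how the keyszer branch does it, and the helper names are made up for illustration.

```python
import json
import subprocess


def find_focused(node):
    """Depth-first search of a sway tree node for the focused container."""
    if node.get("focused"):
        return node
    for child in node.get("nodes", []) + node.get("floating_nodes", []):
        found = find_focused(child)
        if found is not None:
            return found
    return None


def window_identity(node):
    """Return (app_class, title) for a sway tree node.

    Native Wayland windows carry 'app_id'; XWayland windows leave it null
    and report the class under 'window_properties' instead.
    """
    props = node.get("window_properties") or {}
    app_class = node.get("app_id") or props.get("class") or ""
    return app_class, node.get("name") or ""


def query_sway():
    """Ask the running sway session for its tree and return the focused window."""
    result = subprocess.run(
        ["swaymsg", "-t", "get_tree"],
        capture_output=True, text=True, check=True,
    )
    focused = find_focused(json.loads(result.stdout))
    return window_identity(focused) if focused else ("", "")
```

Polling like this is the simplest approach. Subscribing to sway's window events over IPC would avoid spawning a process on every lookup.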

The same branch also includes some code for Hyprland, but I can't test it myself. Anyone interested should try it and report any issues here:

RedBearAK/toshy#86

@RedBearAK
Contributor

Added support for the upcoming Cinnamon/Muffin Wayland session. This required creating a Cinnamon shell extension modeled off the GNOME extensions I've been using for Wayland+GNOME support.

Also confirmed that the Hyprland method is working, by installing Hyprland on Fedora 39 with this:

https://github.com/JaKooLit/Fedora-Hyprland

@ddshore

ddshore commented Mar 25, 2024

Hi @RedBearAK, I just found out about Toshy and keyszer. I'm on hyprland and plasma playing with it, moving from xkeysnail. I've had my xkeysnail keyboard mappings set up for emacs, but I'm a little lost on how to get toshy to pull a new config file. I tried to look at the config files you have, like barebones. My understanding is that I should add them to toshy_config.py? But I'm not sure I'm understanding correctly.

@RedBearAK
Contributor

@ddshore

If you are not doing the Mac-style keymapper config, you will want to install Toshy with the --barebones-config option added to the installer command.

./setup_toshy.py install --barebones-config

This will install the barebones config file, or convert a standard Mac-like config file to the barebones config file.

After that you just add whatever you want into the marked "slices" where your custom additions will be protected from a Toshy reinstall or upgrade.
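To give a rough idea of how marked slices can survive a reinstall, here is a sketch that copies slice contents from an old config into a fresh template. The `# BEGIN SLICE` / `# END SLICE` markers and the function names are hypothetical and are not the real markers in the Toshy config file.

```python
import re

# Hypothetical marker format, for illustration only.
SLICE_RE = re.compile(
    r"(# BEGIN SLICE (?P<name>\w+)\n)(?P<body>.*?)(# END SLICE (?P=name)\n)",
    re.DOTALL,
)


def extract_slices(text):
    """Map each slice name to the text found between its markers."""
    return {m.group("name"): m.group("body") for m in SLICE_RE.finditer(text)}


def merge_slices(old_config, new_template):
    """Return the new template with slice bodies carried over from the old config."""
    saved = extract_slices(old_config)

    def fill(match):
        # Keep the template's own body when the old config had no such slice.
        body = saved.get(match.group("name"), match.group("body"))
        return match.group(1) + body + match.group(4)

    return SLICE_RE.sub(fill, new_template)
```

Anything outside the markers comes from the new template, which is why edits made outside the slices would not be preserved.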

If you have trouble or need some pointers, open an issue on the Toshy repo. It uses a custom branch of keyszer with a number of differences from the version on this repo.

https://github.com/RedBearAK/toshy/issues/
