Support for Arbitrary Modifiers (Accessibility Keybindings) #425

Closed
TTWNO opened this issue Jan 3, 2024 · 10 comments

@TTWNO

TTWNO commented Jan 3, 2024

Just want to say thank you for all the hard work on libxkbcommon. It's a great project and I'm grateful that somebody else has written the complexity of keeping keyboard state :)

I'm here to talk to you about accessibility, and there's a bit of a backstory to this suggestion.
In the end, I'm looking for support for arbitrary modifiers.

The Problem

Assistive technologies, screen readers in particular, are broken on the Wayland display server.
This is because keybindings can no longer be randomly trapped by various processes with no permissions system (a la X11); everything flows from the compositor.

Why Are Screen Readers Broken on Wayland?

Screen readers (what blind people use to access their computer) often require an extra modifier key to work.
They enable users to use actions like "Go to Next Heading (in the document)" or "Speak the title of the window I'm focused on", etc.
These commands are often triggered with Capslock+h, or Insert+h, or at least Capslock and Insert have become the default modifiers for screen readers since screen readers have existed.

In X11 times, Orca, the most used screen reader for open-source systems, handled its own keybindings via AT-SPI.
AT-SPI got all key-bindings from the toolkit (like GTK, Qt, etc.); and this led to a situation where if the application was not accessible, screen reader key bindings simply did not work at all.
This left screen reader users unable to navigate their desktop once they opened an inaccessible application: Zoom (the meeting company's app), for example.

In order to remedy this, Orca has considered many alternatives, including, but not limited to:

Potential Solutions

  • Grabbing key events directly from evdev
    • evdev provides a low-level interface for grabbing keybindings straight from the kernel.
    • This means that screen readers will become inherently unsafe for multiple reasons:
      • It bypasses any semblance of security.
      • If the screen reader locks up, it could end up locking up the entire system.
      • It increases the complexity of the screen reader, which has to handle two completely separate bits of state: the keybindings, and what to do with the assistive technology APIs in response to those bindings.
    • This was also attempted by the Odilia screen reader, with poor results.
      • Implementing keybindings correctly is hard, and we often came across edge-cases where pressing the keys in the wrong order could trigger an event, but also cause the repetition of various keys that are a part of that combo.
  • Moving the key-handling to the compositor (via an XDG portal).
    • This helps with the downsides of the first option, but introduces its own issues.
    • It massively increases the complexity of the keybinding code in Mutter, since it has to work against libxkbcommon in handling the state of the keyboard.
      • For example, adding key bindings for Capslock is hard because Capslock now has to be treated as a normal modifier (not a latched one).
      • And Insert is that much harder, as it would require working around the fact that Insert is not treated like a modifier at all.
    • Even if this could be implemented by Mutter, it would cause the long-term complexity and maintenance of the project to be quite heavy.
    • And, even if the maintenance costs were not enough, it means that accessibility keybindings will be hard to implement for other desktop environments and window managers.
    • The people working on xmonad, sway, unity, etc. will also, independently, have to work against libxkbcommon in their own way, increasing complexity at all points.
    • At the end of the day, if accessibility isn't easy (at least relatively) to implement, then it simply will not be done.

libxkbcommon's Part In This

After briefly attempting to solve this problem with a Wayland protocol extension, an XDG Desktop Portal, and with an attempt at implementation in Mutter, and after many talks with others involved in accessibility: Joannie Diggs (Orca maintainer), Carlos Garnacho (GTK/Key Handling dev), and Matthew Campbell (accesskit maintainer), we have found ourselves in a bit of a pickle.

I'm trying to find out if the ability to add arbitrary modifier keys is in scope or not for libxkbcommon?

Perhaps there is some solution that I am unaware of that makes this whole discussion moot.
If so, please feel free to tell me this is the wrong place to ask.
To me, it seems that the ecosystem is mostly coalescing around libxkbcommon, or at least adopting their naming and syntax for binding descriptions.

It would be really nice if there was at least some possibility of this type of behaviour being supported upstream, rather than individually working against this cornerstone library across the whole ecosystem.

(NOTE: I am being paid by GNOME to work on this, so responses should be quick during my working hours; I have some every day, and reside in the Mountain (UTC-7) timezone.)

@wismill
Member

wismill commented Jan 10, 2024

Accessibility is a very interesting and important topic, but unfortunately I do not have the time to dive into all your links. I am going to focus on xkbcommon.

The following is my take on the question. CC @whot @bluetech for other opinions.

So your question is:

I'm trying to find out if the ability to add arbitrary modifier keys is in scope or not for libxkbcommon?

xkbcommon does not provide a means to modify a keymap once loaded. You can only query the keyboard state or load a (new) keymap. So if you want to change the behavior of a key on the fly (e.g. Insert), it is not possible in xkbcommon.

You could, however, add an option to set a fixed modifier in the common keyboard layout database xkeyboard-config. When we talk about modifiers, there are two points to consider:

  • The modifier keysym: there are specialized modifiers keysyms (e.g. Shift_L, Caps_Lock, ISO_Level3_Shift, etc.) but theoretically any keysym could play this role. But if you go that road, it would make sense to propose a new dedicated keysym in xorgproto.
  • The modifier mask: this is a bit mask that is used to encode the state of a modifier. Please read our documentation.
    Now, you probably want a very specific modifier that cannot be mixed with any other one. In this case, we could define a new real modifier in xkbcommon dedicated to this use. Note that this would not be supported by X11-only applications, and it would require some work to ensure xkeyboard-config remains usable by both X11 and Wayland.

But it seems to me that it would be easier to process the key handling for accessibility in the compositor, right before the compositor updates the keyboard state using XKB.

@garnacho

But it seems to me that it would be easier to process the key handling for accessibility in the compositor, right before the compositor updates the keyboard state using XKB.

That is part of the challenge. A compositor could devise ways to work on top of the xkb_state state machine so that Caps Lock doubles up as an a11y keycombo modifier and keeps its traditional Caps Lock toggle role, without triggering both things at the same time and while keeping XKB state/LEDs happy. But this will be complex enough that not every compositor might follow, while a11y should arguably be made easy to implement so that good compositor a11y practices become ubiquitous.

The other part of the challenge is that this is intended to fuel a portal so that sandboxed screen readers can take over keycombos with these modifiers. That also means exchanging descriptions of these keycombos across a wire, thus some string serialization/deserialization where XKB is often handy too, and, most ideally, shortcut handling from keyboard events that is in line with existing code. Mutter could re-implement part of what xkb usually gives us "for free", but again, that would not be a low enough bar for every other compositor to follow.

I am concerned that a full-on "arbitrary modifiers" request will not be actionable beyond a who-does-what discussion, it feels like a gigantic task. Since Orca currently only supports either CapsLock or combinations of KP_Ins/Ins, I think it would be much more manageable to focus on supporting only these keys as modifiers and not try to add more requirements than necessary out of the blue. For the sake of ease of use, perhaps a xkb option would be an easy switch?

As I optimistically imagine things working, Mutter (or any compositor) would toggle this xkb option when a screen reader flares up, enabling the configured keys as a special modifier, xkbcommon would be helpful in maintaining correct state (and leds where applicable), distinguishing "used as keyboard shortcut modifier" from "pressed/released alone" cases by itself, and the compositor would not require substantial changes in extending the existing keycombo code to handle keyboard shortcuts with the special modifier (and communicate these actions to the screen reader, through the portal or otherwise).

@TTWNO
Author

TTWNO commented Jan 26, 2024

Since Orca currently only supports either CapsLock or combinations of KP_Ins/Ins, I think it would be much more manageable to focus on supporting only these keys as modifiers and not try to add more requirements than necessary out of the blue.

Seems reasonable to me. I only asked for arbitrary modifiers since writing documentation for "you can also use these three specific keys as modifiers, which are not normally modifiers" would be somewhat strange to justify. Unless this is just "screen readers, therefore ..."

I'm fine with either, to be clear, I'm mostly interested in seeing some way to do this commonly, such that a re-implementation of key handling is not required for each compositor. Even if that is a reduced set of keys that are just hard-coded as being permitted for this option.

CC @bluetech @whot (as instructed by first comment, for second opinions)

@whot
Contributor

whot commented Feb 7, 2024

[apologies if you got the comment twice, I hit the wrong button before I was finished typing it]

I've experimented around a bit so let me add my 2 dollaridoos, this gist has the results.

First: it's relatively easy to add a new virtual modifier to the insert key, see this patch in the gist for an example that toggles the new A11yLock virtual modifier with Shift+Insert. To test that, use xkbcli compile-keymap, then apply the diff and test with xkbcli interactive-evdev --keymap $file.

fwiw, modifying that behaviour to latch instead of lock is easily possible, same with giving it a shift-like behaviour.

Modifier limitations

But there are some significant limitations:

  • virtual modifiers are just name aliases and need to be bound to real modifiers. In my example I bind it to Mod3, dropping the existing Level5 mapping and thus rendering any layout that requires level 5 keys unusable. And I do that because:
  • we're out of real modifiers - there are only 8 and we have them assigned (this has also come up in the need for an Fn modifier). The only way we can work around that bit is by adding support for more real modifiers to libxkbcommon (the API already supports it) but that would also make the resulting keymaps incompatible with xkbcomp. @wismill already had some "fun" with that in Add support for conditional comments #432.
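
To make the "out of real modifiers" point concrete, here is a minimal, self-contained C sketch (it does not use libxkbcommon). It models the eight real modifier bits and a table of bindings onto them. The names `example_bindings`, `free_real_mods` and `vmod_conflict` are hypothetical, and apart from the Level5-on-Mod3 entry mentioned above, the virtual modifier assignments are assumed typical defaults that may differ in a given keymap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The eight real modifiers: one bit each, with no spare bits left. */
enum real_mod {
    MOD_SHIFT   = 1u << 0,
    MOD_LOCK    = 1u << 1,
    MOD_CONTROL = 1u << 2,
    MOD_MOD1    = 1u << 3,
    MOD_MOD2    = 1u << 4,
    MOD_MOD3    = 1u << 5,
    MOD_MOD4    = 1u << 6,
    MOD_MOD5    = 1u << 7,
};

struct mod_binding {
    const char *name;   /* modifier name as used in the keymap */
    uint8_t real_mask;  /* real modifier bit(s) it occupies */
};

/* Assumed example assignments; a real keymap may bind these differently. */
static const struct mod_binding example_bindings[] = {
    { "Shift",      MOD_SHIFT },
    { "Lock",       MOD_LOCK },
    { "Control",    MOD_CONTROL },
    { "Alt",        MOD_MOD1 },
    { "NumLock",    MOD_MOD2 },
    { "LevelFive",  MOD_MOD3 },
    { "Super",      MOD_MOD4 },
    { "LevelThree", MOD_MOD5 },
};
#define N_EXAMPLE_BINDINGS \
    (sizeof example_bindings / sizeof example_bindings[0])

/* Return the mask of real modifier bits that no binding occupies. */
static uint8_t free_real_mods(const struct mod_binding *table, size_t n)
{
    assert(table != NULL);
    uint8_t used = 0;
    for (size_t i = 0; i < n; i++)
        used |= table[i].real_mask;
    return (uint8_t)~used;
}

/* Return the name of the first binding that already occupies any bit of
 * `wanted`, or NULL if all of those bits are free. */
static const char *vmod_conflict(const struct mod_binding *table, size_t n,
                                 uint8_t wanted)
{
    assert(table != NULL);
    for (size_t i = 0; i < n; i++) {
        if (table[i].real_mask & wanted)
            return table[i].name;
    }
    return NULL;
}
```

With this table, `free_real_mods(example_bindings, N_EXAMPLE_BINDINGS)` returns 0, and asking for `MOD_MOD3` reports a conflict with `LevelFive`, which is the trade-off described in the first bullet.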

When it comes to CapsLock there are more limitations:

  • a keycode cannot have two modifier maps and CAPS is already mapped to Lock, we can't add Mod3 to this like I did with Insert in the gist.

So basically: even assuming we can add new modifiers to libxkbcommon, XKB as it is now does not let us use CapsLock the way you want it to be used (though it may be possible with Insert).

Keysyms for orca's actions

Assuming we can get the modifier to work, the next step would be binding the keysyms. Modifiers do little but alter which keysym level is used. If the Lock modifier is on, the second level of each alphabetic key is used; if the NumLock modifier is on, the second level of the keypad is used. For the new modifier to do something we need to map all affected keys to handle this particular case. Which would look roughly like this patch in the gist. That is doable (with a lot of effort...).

This of course assumes that we'll have special keysyms for whatever actions orca wants to do.

But the effect of all this (were it all to work) is that now you can press your orca modifier and have it return the custom keysym for that key you have defined. It's all in a keymap so you can re-assign the various keys fairly freely.

It looks like it works because:

  • orca gets the keysyms directly without having to do anything beyond XKB keymap handling
  • clients will get keysyms they don't know so they'll ignore those
  • compositors can send the same key event to orca and the client without having to worry about anything

So... win? Not quite: this means we do need keysyms for all actions, which means orca is now bound to xorgproto + libxkbcommon + libX11 to update the keysyms.

No orca keysyms then!

We don't actually need keysyms, we can make orca work on <A11yLock>h (like any other keyboard shortcut) but then we can no longer send the key events to the client - the client will interpret <A11yLock>h as just h, same as <NumLock>h is h.

So the big advantage of the compositor not caring goes away, the compositor now has to re-route events depending on the A11yLock modifier state. Which means we have to handle Shift down, A11yLock down, h down, h up, Shift up, A11yLock up so the Shift modifier is correctly released on the client. If you blindly re-route events between A11yLock down and up, the Shift up just disappeared from the other client.

So the compositor needs to keep the keymap state up-to-date while routing to orca and then sync that modifier state back to the client once the A11yLock modifier is unset again. This is the only bit I can see that having all this in libxkbcommon will really help with.

However, orca has the behaviour that pressing the orca modifier key twice triggers the key itself (capslock twice -> capslock). This is (afaik) only possible by making the orca modifier a latching modifier (action= LatchMods(modifiers=A11yLock)) but that also requires the compositor to track that modifier and toggle the re-routing when it sees the modifier key pressed with A11yLock already down. Including modifier state sync again.

But this is incompatible with how CapsLock usually works - we cannot have it both latching for A11yLock and locking for CapsLock so... uhm... 🤷

Stepping back

Why do we actually want a modifier? The only effect of those is to change which key level they're assigned to. And XKB just gives us fancy keycode->keysym remapping ability (read: layouts), afaict orca does not have a need to be able to arbitrarily remap the keys.

As above, the double-press isn't really possible with XKB and neither is having different modifiers on the same physical key so using CapsLock is almost out a-priori. And the compositor still needs to track the keys as they are being re-routed (except in the case of custom keysyms). So I'm not sure this approach is more robust than special-casing the orca hotkeys in the compositor.

IMO the key to special-casing the orca hotkeys in the compositor is to redirect them before XKB takes effect (see @wismill's comment above). Orca cares about few specific keys and we know where they are [1] so the compositor code can be a variation of:

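// This is obviously pseudo-code for illustration purposes only
if (press && keycode == KEY_CAPSLOCK) {
    // the trigger key arms the redirect and is never fed to XKB
    self.orca_active = true;
} else if (self.orca_active) {
    // the key following the trigger goes to orca instead of the client
    send_key_to_orca(keycode);
    self.orca_active = false;
    self.ignore_next_key_release_for = keycode;
} else if (release && keycode == self.ignore_next_key_release_for) {
    // discard
} else {
    // normal path: resolve the keysym, then update the XKB state
    keysym = xkb_state_key_get_one_sym(state, keycode);
    xkb_state_update_key(state, keycode, direction);
}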

The key here is that the trigger key must vanish from the event flow.
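
As a rough, self-contained illustration of that rule, the sketch below routes key events without touching libxkbcommon at all. The names `struct router`, `enum route` and `route_key` are hypothetical; the two keycodes are the evdev values from linux/input-event-codes.h, defined locally so the block compiles on its own. It assumes the simplest case only: it swallows both the press and the release of the trigger key, and it ignores the double-press and the modifier syncing discussed in this comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* evdev keycodes, as in linux/input-event-codes.h */
#define KEY_H        35
#define KEY_CAPSLOCK 58

enum route {
    ROUTE_CLIENT,   /* feed to XKB and forward to the focused client */
    ROUTE_ORCA,     /* hand to the screen reader, hide from the client */
    ROUTE_DISCARD,  /* drop entirely */
};

struct router {
    bool orca_active;     /* trigger seen; next key press goes to orca */
    int swallow_release;  /* keycode whose release is dropped, or -1 */
};

/* Decide where one key event goes. In a real compositor the
 * ROUTE_CLIENT case is where xkb_state_update_key() would be called. */
static enum route route_key(struct router *r, int keycode, bool press)
{
    assert(r != NULL);

    if (keycode == KEY_CAPSLOCK) {
        /* The trigger key vanishes from the event flow in both directions. */
        if (press)
            r->orca_active = true;
        return ROUTE_DISCARD;
    }
    if (press && r->orca_active) {
        /* First key after the trigger: redirect it, remember its release. */
        r->orca_active = false;
        r->swallow_release = keycode;
        return ROUTE_ORCA;
    }
    if (!press && keycode == r->swallow_release) {
        /* The client never saw the press, so it must not see the release. */
        r->swallow_release = -1;
        return ROUTE_DISCARD;
    }
    return ROUTE_CLIENT;
}
```

Starting from `struct router r = { false, -1 };`, the sequence CapsLock down, h down, h up, CapsLock up yields discard, orca, discard, discard, and a later plain h press goes to the client as usual.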

It's a tad more complicated because we still need to track the trigger key/press release for the double-press, and sync modifiers across the latched state. I think the only way to do that is this approach:

void process_while_sending_to_orca(int keycode) {
    assert(orca_active == true);
    tmp_state = copy_previous_keymap_state(real_state);
    changed = xkb_state_update_key(tmp_state, keycode);

    if (any_modifier_has_changed(changed)) {
       // feed the modifier to the real state
       xkb_state_update_key(real_state, keycode);
    } else {
        ignore_next_key_release_for = keycode;
        send_key_to_orca(keycode);
        orca_active = false;
        sync_modifiers_to_client(real_state);
    }
}

IOW: while routing keys to orca, check if the current key has an effect on the modifier state and

  • if yes, feed it to our keymap so the state is correct once we stop routing
  • if no, assume it unlatched the orca hotkey, sync our current modifier state to the client

This is the "fighting against xkb's state" but I don't see any other way that gives us what is needed.

Summary

There are some technical limitations in xkb that require a lot of effort to overcome to support an accessibility modifier. And even if we do, the only real benefit we get is to be able to reassign orca's shortcuts via an XKB keymap - which isn't something we actually need. The keymap state handling doesn't work as well as it should - the compositor still needs to special-case the key events either way so overall it seems like a lot of effort for something that doesn't actually help us in any significant way.

Footnotes

  1. we care about KEY_CAPSLOCK, not "the key that is assigned the CapsLock keysym" which could be any key

@aral

aral commented Feb 18, 2024

Why Are Screen Readers Broken on Wayland?

I just stumbled onto this issue while trying to debug why Orca’s modifiers were not working for me on the latest Fedora Silverblue.

Surely this cannot be correct, right? I mean is Fedora, Ubuntu, and every other distribution that ships Wayland by default inaccessible? I must be missing something.

@TTWNO
Author

TTWNO commented Feb 18, 2024

Surely this cannot be correct, right?

It is.

I mean is Fedora, Ubuntu, and every other distribution that ships Wayland by default inaccessible?

If they ship Wayland by default, yes.

I must be missing something.

Nope.

@TTWNO
Author

TTWNO commented Feb 18, 2024

I'd like to clarify, @whot, that Orca will (in the future) not handle keybindings directly, but rather will be sent events over an XDG portal (similar to Global Events).

This can not be implemented as is, because:

  1. One cannot bind <CapsLock>h or <Ins>h in the compositor (since Mutter assumes a valid xkb string).
  2. Handling those shortcuts would therefore involve its own entire codepath: 1) check that the a11y modifier is pressed, 2) check if any of the keysyms match those in the table, 3) send an event to Orca, like a11y-next-heading.
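
For reference, the separate codepath in point 2 could look something like the following sketch. This is only an illustration under stated assumptions: the names `a11y_bindings` and `resolve_a11y_action` are hypothetical, keys are plain characters rather than real keysyms, and the second table entry is an invented action name. Only a11y-next-heading comes from the list above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct a11y_binding {
    char key;            /* simplified stand-in for a keysym */
    const char *action;  /* event name that would be sent to Orca */
};

static const struct a11y_binding a11y_bindings[] = {
    { 'h', "a11y-next-heading" },
    { 't', "a11y-speak-window-title" },  /* hypothetical action name */
};
#define N_A11Y_BINDINGS (sizeof a11y_bindings / sizeof a11y_bindings[0])

/* Steps 1 and 2: return the action bound to `key` while the a11y
 * modifier is held, or NULL if the event is not an a11y shortcut.
 * Step 3 (sending the event to Orca) is left to the caller. */
static const char *resolve_a11y_action(bool a11y_mod_down, char key)
{
    if (!a11y_mod_down)
        return NULL;
    for (size_t i = 0; i < N_A11Y_BINDINGS; i++) {
        if (a11y_bindings[i].key == key)
            return a11y_bindings[i].action;
    }
    return NULL;
}
```

The lookup itself is small; the cost is that it duplicates modifier tracking and binding resolution that the compositor already does elsewhere.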

The thing is, there is already a code path for this, that handles modifiers, the state of the keyboard, resolving keybindings, etc.

If I am understanding correctly though, handling configurability of keyboards in this way is out of scope for xkbcommon?

@whot
Contributor

whot commented Feb 19, 2024

One cannot bind <CapsLock>h or <Ins>h in the compositor (since Mutter assumes a valid xkb string).

This bit seems to be a mutter bug? We can't bind other keybindings either (e.g. you can't bind volume keys to a custom shortcut) but that's "just" bits missing in mutter, right? It's not an architectural restriction.

The thing is, there is already a code path for this, that handles modifiers, the state of the keyboard, resolving keybindings, etc.

Yes, that's mostly true, but it's not in the specific way you want. I think the best explanation I can provide here: XKB has modifier keys but Orca wants an activation key. Those two have very different semantics and by using the same term for both you're just heading down the wrong path.

Orca's key isn't a modifier, it just messages that you want the next key(s) to be handled by orca. It doesn't affect the subsequent keys at all, aiui.

XKB modifiers (may) change the level of a predefined layout to generate a different keysym. But that's not dynamic and it needs to be applied to the keymap beforehand anyway [1].

Best example (US layout): your Q key produces q by default or Q with the Shift or Lock modifiers on. Notably, any other modifier has no effect on this particular key. Notably - the modifiers don't disappear, a client still knows about them, just chooses not to handle them.
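
The Q example can be sketched in a few lines of self-contained C. This is a simplified model, not libxkbcommon's implementation: the `key_type`, `type_entry` and `key_level` names are hypothetical, and the `alphabetic` type is an assumed two-level type in which only Shift and Lock take part in level selection. Masked modifier combinations without an entry fall back to the first level.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MOD_SHIFT   (1u << 0)
#define MOD_LOCK    (1u << 1)
#define MOD_CONTROL (1u << 2)

struct type_entry {
    uint8_t mods;  /* exact modifier combination, after masking */
    int level;     /* zero-based level it selects */
};

struct key_type {
    uint8_t mask;  /* the only modifiers this type looks at */
    const struct type_entry *entries;
    size_t n_entries;
};

/* Assumed two-level alphabetic type: Shift or Lock selects level 2. */
static const struct type_entry alphabetic_entries[] = {
    { MOD_SHIFT, 1 },
    { MOD_LOCK,  1 },
};
static const struct key_type alphabetic = {
    MOD_SHIFT | MOD_LOCK, alphabetic_entries, 2,
};

/* Symbols on the Q key, indexed by level. */
static const char q_syms[] = { 'q', 'Q' };

/* Pick the level for the active modifiers. Modifiers outside the type's
 * mask are ignored here, but they stay in the keyboard state. */
static int key_level(const struct key_type *type, uint8_t active_mods)
{
    assert(type != NULL);
    uint8_t relevant = active_mods & type->mask;
    for (size_t i = 0; i < type->n_entries; i++) {
        if (type->entries[i].mods == relevant)
            return type->entries[i].level;
    }
    return 0;  /* no matching entry: first level */
}
```

Here `q_syms[key_level(&alphabetic, MOD_CONTROL)]` is still 'q': Control does not change the level, yet the client can still see that Control is held. An A11y modifier bit would behave the same way on every key whose type does not mention it.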

Modifiers in XKB work per keycode, not per keysym. So in your case if you want capslock to be a modifier for both Lock and A11y that's just not possible [2]. I think you might be able to get the <Ins>h approach working but only because that one isn't already a modifier. So unless you're willing to drop Caps and reassign the key to some other key that isn't already used as modifier, this cannot work with XKB, or at least not without code paths in the compositor that nullify any benefits you'd get from this.

But even then, right now we do not have any real modifiers left so unless you're willing to drop other functionality, this just cannot work. FTR, we have been experimenting with adding more modifiers in #450 and #447 but that's a future feature and definitely won't work right now.

AIUI, you also want <Caps><Caps> to work as <Caps> and <Ins><Ins> to work as <Ins>, neither of which is possible with XKB as-is.

Side-note: Another potential option may be overlays but they're missing from libxkbcommon (#124) so it's a bit moot. And I don't know enough about them to really say that's the best solution.

TLDR: an Orca modifier key and XKB modifier key are two very different things and one doesn't work like the other.

Footnotes

  1. arguably the compositor could send a different keymap without that option to the client

  2. to be more specific: you cannot have something on level2 that doesn't also trigger level1 modifiers, but you can have a modifier on level1 that doesn't trigger level2.

@TTWNO
Author

TTWNO commented Feb 19, 2024

XKB has modifier keys but Orca wants an activation key. Those two have very different semantics and by using the same term for both you're just heading down the wrong path.

Ahh, now this is making sense. Thank you for your explanation.

Perhaps it is time to go back to the drawing board with this one.

@whot
Contributor

whot commented Feb 20, 2024

Closing for now, we can re-open this if we come up with a way to use this but right now I'm not sure how.
