Modal dialog: consider allowing Tab/Shift+Tab to escape to browser chrome #1772

Open
nolanlawson opened this issue Feb 22, 2021 · 21 comments
Labels
Feedback Issue raised by or for collecting input from people outside APG task force question Issue asking a question

Comments

@nolanlawson
Member

Under keyboard interaction for a modal dialog, it says the behavior for Tab should be:

If focus is on the last tabbable element inside the dialog, moves focus to the first tabbable element inside the dialog.

For Shift+Tab, it recommends the reverse:

If focus is on the first tabbable element inside the dialog, moves focus to the last tabbable element inside the dialog.
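For reference, the wrapping behavior quoted above could be sketched as a key handler along these lines. This is only a minimal sketch, not the APG's own example code: `wrapFocusOnTab`, `FocusableLike`, and `TabKeyEventLike` are hypothetical names, and the caller is assumed to already know the ordered list of tabbable elements inside the dialog.

```typescript
// Minimal structural types so the sketch runs without a DOM. Real
// HTMLElement and KeyboardEvent objects have these members too.
interface FocusableLike {
  focus(): void;
}

interface TabKeyEventLike {
  key: string;
  shiftKey: boolean;
  preventDefault(): void;
}

/**
 * Hypothetical helper: wraps focus at the edges of a modal dialog.
 * `tabbables` is the ordered list of tabbable elements inside the dialog;
 * `active` is the element that currently has focus.
 * Returns true when the helper moved focus itself, false when the
 * browser's default Tab handling should proceed.
 */
function wrapFocusOnTab(
  event: TabKeyEventLike,
  tabbables: readonly FocusableLike[],
  active: FocusableLike | null,
): boolean {
  if (event.key !== "Tab") {
    return false;
  }
  const first = tabbables[0];
  const last = tabbables[tabbables.length - 1];
  if (first === undefined || last === undefined) {
    return false; // nothing tabbable inside the dialog
  }
  if (!event.shiftKey && active === last) {
    // Tab on the last element: move to the first element.
    event.preventDefault();
    first.focus();
    return true;
  }
  if (event.shiftKey && active === first) {
    // Shift+Tab on the first element: move to the last element.
    event.preventDefault();
    last.focus();
    return true;
  }
  return false;
}
```

On a real page this would presumably be called from a `keydown` listener on the dialog, passing `document.activeElement` as `active`. A handler like this can only wrap between elements the author can enumerate, which is the limitation with closed and user-agent shadow roots described later in this comment.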

The native <dialog> element disagrees with the spec in this behavior. Instead of cycling through elements inside the modal dialog, Tab and Shift+Tab cause focus to eventually escape to the browser chrome itself (URL bar, etc.). In other words, when you reach the end of the list of tabbable elements, the "next" or "previous" tabbable element is inside the browser chrome, not at the opposite end of the modal dialog. This occurs in both Chrome 87 and Firefox 85 (with the flag dom.dialog_element.enabled).

The <dialog> spec only says that, when showModal() is called on the dialog, every other node in the document should become inert. So when focus escapes to the browser chrome, it's effectively just a product of the fact that this is how a normal web page works even without a dialog. On example.com you can press Tab and Shift+Tab to eventually arrive at the browser chrome.

Because of this, it seems reasonable to me to modify the WAI-ARIA Authoring Practices to allow for Tab and Shift+Tab to escape to the browser chrome. I would propose making it optional – focus can either cycle through the tabbable elements inside the modal dialog, or it can eventually escape to the browser chrome. (In either case, tabbable elements in the document but outside the modal dialog would not be reachable.)

There is actually a precedent here with window.alert and window.prompt. As I understand it, these used to be browser-modal (i.e. would block the entire browser), but are now content-modal (only blocking the relevant browser tab) in modern versions of Chrome, Firefox, and Safari. (See relevant Firefox bugs.) This means that you can Tab through the focusable elements inside the modal, and eventually wind up in the browser chrome. So arguably, the WAI-ARIA guidelines for a modal dialog should be the same.

It's also worth mentioning that using the native <dialog> is apparently the only way to allow for keyboard interaction with closed shadow roots and user agent shadow roots (such as <video controls> and <audio controls>) when inside a modal dialog. And this behavior cannot be combined with the "cycling" Tab behavior – it's not possible for web authors to control focus change inside of closed/UA shadow roots. So if <dialog> is considered inaccessible per WAI-ARIA, that would leave web authors with no good options. (I'm disregarding other accessibility problems with <dialog>, which at least seem solvable on the part of web authors or screen reader vendors.)

@mcking65
Contributor

mcking65 commented Feb 22, 2021

I'd like to propose the inverse; the behavior of the HTML <dialog> be revisited.

One lens to consider is how web pages are increasingly serving the purpose of apps and browsers are in that case analogous to operating system environments. In native OS dialogs, tabbing through a dialog does not allow focus to move into the operating system chrome, e.g., the Mac Dock or Windows task bar and start menu. In native OS apps, there are special keys, like cmd+tab and alt+tab and the Windows key for moving focus outside the app to other apps and the operating system itself. Similarly, browsers have dedicated keys for moving to other tabs, to the menubar, and to the address bar. In fact, I'd argue that even without a modal open, it would be better if tabbing from the last element on a web page were to wrap to the first element, at least by default. I'd similarly argue that browsers should do a much better job of helping keyboard users learn how to navigate efficiently.

Perhaps more important, though, is to consider why native OS modals behave the way they do. A primary benefit of modals is to keep focus on a task. Allowing focus to jump outside can not only negatively impact efficiency by adding many elements to the tab ring (easily a couple dozen or more in some browsers) but also enables users to get lost and unsure of how to return to the task. This problem is exacerbated by the geographic distance between the browser chrome and many modals. In other words, enabling users to tab outside a modal defeats the purpose of the modal.

Note that this is my own view; I'm not speaking for the APG task force. I'd like to get more input.

@mcking65 mcking65 added Feedback Issue raised by or for collecting input from people outside APG task force question Issue asking a question labels Feb 22, 2021
@mcking65 mcking65 added this to Next Steps in Dialog Patterns and Examples Development Project via automation Feb 22, 2021
@sinabahram

HTML is wrong on this point. It's really that simple. Modal dialogs should be modal. Screen reader users, and keyboard-only users at large, have enough to deal with. I very much hope and trust that we will not take steps to align with HTML on this incorrect behavior, which is harmful to users with disabilities. Dialogs should be modal. That has worked well across platforms from televisions to the web to desktop applications to mobile phone apps to my toaster oven.

+1 to every single thing that @mcking65 said above.

@MarcoZehe

I would like to add my support to what Sina and Matt said. I also think that browsers should adapt and the dialog element should be truly modal.

@jscholes
Contributor

There isn't much I can add to what @mcking65, @sinabahram and @MarcoZehe have already said, in terms of expressing my opinion. I do want to raise one point though:

@nolanlawson

There is actually a precedent here with window.alert and window.prompt. As I understand it, these used to be browser-modal (i.e. would block the entire browser), but are now content-modal (only blocking the relevant browser tab) in modern versions of Chrome, Firefox, and Safari.

I've put together a test page which allows you to open several types of built-in modals:

https://jscholes.github.io/modal-alerts.html

The first three use window.alert, window.prompt and window.confirm, respectively. The last three are related to file/directory pickers, so are platform-specific. But they don't allow access to the browser chrome, and therefore may contribute to user expectations of how modal dialogs work in a browser.

I haven't had a chance to test in macOS Safari. But in Google Chrome Canary, I can't reach the browser chrome after invoking any of the six examples. In Firefox 85, I can reach the browser chrome from the first three; the second three don't seem to work at this time.

@jscholes
Contributor

Additional note of unexpected behaviour: on the sample page, activate the "Ask for confirmation" button to invoke a window.confirm-based modal. In both Firefox and Chrome on Windows, you can use Right/Left and Up/Down Arrow to move through the two buttons, which is standard dialog behaviour.

In Firefox, pressing Down Arrow or Right Arrow while the "Cancel" button has focus ends up moving into the list of open tabs. From that point, you can't press Left or Up Arrow to reverse your action, because both will result in the tab being changed. Or nothing, if you only have one tab open. Both possibilities are pretty rubbish.

The same also applies in reverse. If you have the "OK" button focused and hit Left/Up Arrow, you end up in whatever is the last focusable area before the page in the browser's chrome, e.g. your history sidebar or a toolbar with add-on-related buttons. Again, you can't reverse your action by using the opposite keypress.

I don't think most people will implement this custom arrow key behaviour when making a custom modal. But arguably, the browser should when displaying a <dialog>. Tab/Shift+Tab is one thing, but having the arrow keys move out of the page entirely is bewildering.

@MarcoZehe

CC @jcsteh for awareness re the Firefox arrow key behavior from the previous comment.

@accdc

accdc commented Feb 23, 2021

Personally, I support keeping the browser chrome out of the tab order when dealing with ARIA modal dialogs. There is no accessibility benefit to preserving Tab access to the browser chrome, and it is confusing for non-sighted screen reader users who find themselves within and then suddenly outside of a modal dialog. There are also already browser keystrokes designed for moving focus to the browser chrome, e.g., F6 and Alt+D, which will continue to work from within an ARIA dialog. So nothing is made less accessible by ensuring tab focus remains within the dialog.

@nolanlawson
Member Author

nolanlawson commented Feb 24, 2021

Thank you everyone for the feedback! And thanks @jscholes for the demo page. I can confirm in Chrome on both Linux and macOS (with "use keyboard navigation to move focus between controls" enabled) that it matches the behavior you describe. I must have tested Chrome incorrectly.

I also tested Safari on macOS, and it appears to follow Firefox's behavior: focus escapes to the browser chrome for all three of window.alert, window.prompt, and window.confirm when pressing Tab. But as others have said, this isn't necessarily the ideal behavior – perhaps Chrome is the only one doing this correctly.

If the consensus is that WAI-ARIA is correct and <dialog> is wrong, then it seems to me there are two possible solutions on the <dialog> side of things:

  1. Define Tabbing behavior in such a way that focus cycles through the DOM rather than escaping to the browser chrome, even on pages without a <dialog> (as argued by @mcking65). This would "automatically" fix the modal <dialog>, since it is just defined as an element where every other element on the page is inert.
  2. Define Tabbing behavior in such a way that focus cycles through the DOM, but only for a modal <dialog>. For instance, define the concept of a "focus trap" and apply it to the modal <dialog>.
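To make the difference between the two options concrete, here is a rough model of the Tab ring under each of them, next to the behavior browsers show today. It is a sketch under simplifying assumptions, not spec text: the browser chrome is reduced to a single stop, and `Stop`, `TabPolicy`, `reachableStops`, `nextStop`, and the `exampleStops` data are all hypothetical names invented for illustration.

```typescript
type Region = "chrome" | "page" | "dialog";

interface Stop {
  id: string;
  region: Region;
}

// "escape-to-chrome": what <dialog> does today.
// "wrap-in-document": option 1, Tab never leaves the document.
// "wrap-in-modal-only": option 2, a focus trap applied only to a modal <dialog>.
type TabPolicy = "escape-to-chrome" | "wrap-in-document" | "wrap-in-modal-only";

/** Stops that Tab can reach, in order, under the given policy. */
function reachableStops(
  stops: readonly Stop[],
  modalOpen: boolean,
  policy: TabPolicy,
): Stop[] {
  // An open modal makes the rest of the page inert; a closed dialog is not
  // tabbable at all. The browser chrome is unaffected either way.
  const live = stops.filter((s) =>
    modalOpen ? s.region !== "page" : s.region !== "dialog",
  );
  const skipChrome =
    policy === "wrap-in-document" ||
    (policy === "wrap-in-modal-only" && modalOpen);
  return skipChrome ? live.filter((s) => s.region !== "chrome") : live;
}

/** Id of the stop that Tab moves to from `currentId`, wrapping at the end. */
function nextStop(
  stops: readonly Stop[],
  currentId: string,
  modalOpen: boolean,
  policy: TabPolicy,
): string | undefined {
  const ring = reachableStops(stops, modalOpen, policy);
  const index = ring.findIndex((s) => s.id === currentId);
  if (index === -1) {
    return ring[0]?.id;
  }
  return ring[(index + 1) % ring.length]?.id;
}

// Hypothetical page: one chrome stop, two page links, a two-button dialog.
const exampleStops: Stop[] = [
  { id: "url-bar", region: "chrome" },
  { id: "page-link", region: "page" },
  { id: "dialog-ok", region: "dialog" },
  { id: "dialog-cancel", region: "dialog" },
  { id: "footer-link", region: "page" },
];
```

In this model, Tab from `dialog-cancel` with the modal open lands on `url-bar` under `"escape-to-chrome"` and on `dialog-ok` under either proposed option. The options differ only when no modal is open: from `footer-link`, option 1 wraps to `page-link`, while option 2 still goes to `url-bar`.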

I couldn't find any existing issue on the HTML spec (although whatwg/html#2171 is close). Should we open an issue and move discussion over there?

@sinabahram

+1,000 to option 1, but option 2 is a fine fallback in case that does not gain support.

@smhigley
Contributor

smhigley commented Mar 9, 2021

Restricting tab from entering the browser chrome makes sense to me, though I have some qualms specifically about using aria-practices to put that responsibility on authors.

The reason is that we'd create an issue where aria-practices implies that using the inert attribute doesn't fulfill the stated keyboard interaction model. While having authors correct for the current browser experience sounds nice in theory, the problem I run into in practice is the way individual devs and component libraries actually implement focus traps ends up being much buggier than just using inert and letting the browser handle it.

Could we have some wording along the lines of "Tab does not move focus into the inactive parts of the page while the modal is open", then follow up with browsers and the HTML spec?
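As a rough illustration of how that wording differs from the current one, the two requirements can be written as predicates over a single Tab move. This is only a sketch: the `Area` categories and both function names are hypothetical, and `satisfiesCurrentApgWording` is my paraphrase of the existing pattern (focus stays inside the dialog), not APG text.

```typescript
type Area = "dialog" | "inert-page" | "browser-chrome";

interface FocusMove {
  from: Area;
  to: Area;
}

/** Paraphrase of the current pattern: Tab keeps focus inside the dialog. */
function satisfiesCurrentApgWording(move: FocusMove): boolean {
  return move.to === "dialog";
}

/** Proposed wording: Tab never lands in the inactive parts of the page. */
function satisfiesProposedWording(move: FocusMove): boolean {
  return move.to !== "inert-page";
}

// A hand-rolled focus trap wraps inside the dialog.
const wrapMove: FocusMove = { from: "dialog", to: "dialog" };
// inert or <dialog> lets focus continue into the browser chrome.
const escapeMove: FocusMove = { from: "dialog", to: "browser-chrome" };
// A dialog with no focus handling leaks into the page behind it.
const leakMove: FocusMove = { from: "dialog", to: "inert-page" };
```

Under these definitions `wrapMove` passes both checks, `escapeMove` passes only the proposed wording, and `leakMove` fails both, so an inert-based implementation would no longer appear to contradict the pattern.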

@MarcoZehe

I must admit I am a bit torn about this one. I think the focus trap should have been included in the spec for inert and dialog in the first place, when the ARIA Authoring Practices/Design Pattern had already agreed on the focus trap. Also, the focus trap existed when browsers were still doing per-window alerts and other dialogs, and only started "softening" when they went to per-tab "modal" alerts etc. I put modal in quotes on purpose here, since they aren't really modal in the classic desktop dialog sense, but for the best user experience, they should be.

So while I agree that there are developer implementation burdens right now, the ARIA version of "modal" gives the much much better user experience for keyboard and screen reader users than the current browser implementations of "modal". Losing context by going into the browser chrome from a web dialog is just a cognitive burden that needs to go away ASAP.

I advocate for strongly urging the spec and browser vendors to implement inert and dialog properly to be truly modal to the tab, without going into the browser chrome with the tab key when such a dialog is open. It is modal for a good reason and needs to be absolutely confined.

My 2 Cents.

@sinabahram

Given that there seems to be rather strong consensus around option 1 as stated above, i.e. defining tab behavior not to escape to the chrome of the browser, how can we move forward to make sure there is alignment both within ARIA for the time being and then HTML eventually, so that there are no gaps in this definition?

@carmacleod
Contributor

carmacleod commented Mar 9, 2021

@mcking65
You asked about the spec for tabindex: https://html.spec.whatwg.org/multipage/interaction.html#attr-tabindex

@css-meeting-bot
Member

The ARIA Authoring Practices (APG) Task Force just discussed Modal Dialog - Tab Ring.

The full IRC log of that discussion
<MarkMccarthy> TOPIC: Modal Dialog - Tab Ring
<Jemma> https://github.com//issues/1772
<MarkMccarthy> Matt_King: next item, issue 1772. Says APG should allow focus to go outside the dialog. After many other comments... Sina is asking if this is anything for ARIA to handle or not.
<MarkMccarthy> Matt_King: but sarah_higley's last comment was suggesting not putting as much onus on authors
<Jemma> sarah's comment https://github.com//issues/1772#issuecomment-793285322
<MarkMccarthy> sarah_higley: the reason i put that in is that when i've suggested using inert to handle focus management, i've had pushback citing ARIA practices.
<MarkMccarthy> Matt_King: does our break anything really?
<MarkMccarthy> sarah_higley: so this isn't exactly about our example, but the wording in practices. generally, active focus management tends to be buggier
<MarkMccarthy> sarah_higley: basically, it just seems like inert is making things funky
<MarkMccarthy> Matt_King: this is one of those reasons the APG redesign project is discussing scope increases, so we can better test and write for things like this. this'd be lovely for something like that
<Jemma> sarah -"Could we have some wording along the lines of "Tab does not move focus into the inactive parts of the page while the modal is open", then follow up with browsers and the HTML spec?"
<MarkMccarthy> Matt_King: in the meantime, do you think we should add a note to the pattern, if so what to add? or what to do?
<MarkMccarthy> sarah_higley: could we have something in the language like, "we think the best UX is to keep focus trapped in the dialog, but this should be handled by browsers..." etc etc. lots of wordsmithing, of course
<MarkMccarthy> Matt_King: i think it'd be better if we raised the issue to the right places first. as well as some broader consensus - I don't want APG to seem so monolithic
<MarkMccarthy> Matt_King: maybe something like "Do your best to make this work, we know it's rough in spots" or something similar
<MarkMccarthy> sarah_higley: Alice basically mentioned it'd be hard for browsers to make this change, but seemed a little optimistic
<MarkMccarthy> s/this change/this change because people are used to it
<MarkMccarthy> Matt_King: Well, I'd love to see it be more general, so tabbing stays in the webpage completely regardless of a dialog. it'd be so much easier in so many ways
<MarkMccarthy> Matt_King: especially on Mac, it's so hard to skip the browser chrome
<MarkMccarthy> sarah_higley: I thought that was just me!
<MarkMccarthy> [various commiserating about tabbing in browser chrome]
<MarkMccarthy> Matt_King: i don't have a good answer about _that_, but I'm hopeful we could find some consensus about the modals
<MarkMccarthy> sarah_higley: so HTML doesn't specify how browsers handle their chrome, right?
<MarkMccarthy> carmacleod: yep
<MarkMccarthy> Matt_King: could it be part of spec for a dialog element?
<MarkMccarthy> sarah_higley: _that_ could be part of HTML, though I don't think i've seen it. but something specific for browser modals and webpage modals?
<MarkMccarthy> s/something specific/adding something specific
<MarkMccarthy> sarah_higley: to be clear, i don't think we need an HTML change or addition, i think it'd be a behavior thing. (and I don't think we need a specific example for each either)
<MarkMccarthy> Matt_King: what spec does tabindex live in? i vaguely remember somethign about that, some stuff to do with tabindex=-1 and ARIA...but I don't remember the spec. Maybe this would go in _that_ spec
<carmacleod> https://html.spec.whatwg.org/multipage/interaction.html#attr-tabindex
<MarkMccarthy> s/somethign/something
<MarkMccarthy> carmacleod: maybe it'd go in this one (pasted above)
<MarkMccarthy> Matt_King: so i'd support that proposal, adding some language around inert. then we can publicly try to rally support
<MarkMccarthy> sarah_higley: sounds good to me!
<MarkMccarthy> Matt_King: i'll get to this unless anyone else wants to file that issue
<MarkMccarthy> github: https://github.com//issues/1772

@alice

alice commented Mar 9, 2021

Just to try and collect my thoughts on this in one place:

  • It seems like we have a rough consensus here that focus moving off the page into browser chrome is confusing, regardless of whether any modal UI is showing.
    • There also seems to be agreement that in the case of a modal dialog, wrapping focus to the top of the dialog is the preferable experience (compared to focus getting "stuck" at the end). Is that also true for a document?
    • This seems particularly acute for screen reader users who have to listen through each focus target in the browser chrome to determine whether they're back on the page yet, and this would be especially likely to happen when a modal dialog is showing, since it would tend to have drastically fewer focusable elements than an average page.
    • Have we heard from any non-AT users who don't use pointing devices about how they feel about this behaviour?
  • I agree with @smhigley that it makes sense to have the browser handle this as much as possible, rather than encouraging developers to write fragile focus-handling code.
  • It doesn't make sense to me to try to have inert responsible for preventing focus from moving to browser chrome. inert makes a subtree inert; it can't have any impact on the page as a whole.
  • Having focus wrap around to browser chrome has been the default behaviour for as long as I can remember, so changing it would need to be approached carefully (even if changing it would be an improvement). A few potential ways forward I could imagine:
    • Add a setting which users would need to opt in to individually. This would mean that users wouldn't have the expected behaviour change unexpectedly, but they would need to discover the preference in order to turn it on.
      • We could potentially turn the setting on automatically when AT is detected, but that might cause more problems than it solves.
      • We could think about showing some kind of information bubble, like when new security UI is added, informing users who tab off the end of the page that the setting exists. (Possibly as a native dialog which appears in the focus order after the last element on the page)
    • Iterating on the native skip links idea, what if the browser added a "return to top" link at the end of the focus order on each page (including when a dialog is showing)? Honestly I don't love this idea for a number of reasons, but throwing it out there anyway.
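To illustrate the point in the list above that inert is scoped to a subtree, here is a small model of how inertness might be resolved for a node. It is a simplified sketch, assuming inertness is simply inherited from the nearest inert ancestor (it ignores details such as a modal dialog escaping inertness); `TreeNode` and `isEffectivelyInert` are hypothetical names.

```typescript
interface TreeNode {
  name: string;
  inert: boolean;
  parent: TreeNode | null;
}

/** True when the node or any of its ancestors carries the inert flag. */
function isEffectivelyInert(node: TreeNode): boolean {
  let current: TreeNode | null = node;
  while (current !== null) {
    if (current.inert) {
      return true;
    }
    current = current.parent;
  }
  return false;
}

// Hypothetical document: <body> contains an inert <main> and a dialog.
const body: TreeNode = { name: "body", inert: false, parent: null };
const main: TreeNode = { name: "main", inert: true, parent: body };
const pageButton: TreeNode = { name: "page-button", inert: false, parent: main };
const dialogNode: TreeNode = { name: "dialog", inert: false, parent: body };
```

Here `pageButton` is effectively inert because it sits under `main`, while `dialogNode` and `body` are not. The browser chrome is not a node in this tree at all, so no flag set inside the document can take it out of the Tab order in this model.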

@jscholes
Contributor

@alice

There also seems to be agreement that in the case of a modal dialog, wrapping focus to the top of the dialog is the preferable experience (compared to focus getting "stuck" at the end). Is that also true for a document?

I think focus going somewhere is preferable to it getting stuck at the bottom, particularly in terms of users adapting to a change of this magnitude if applied in browsers. It can then be phrased as keyboard focus skipping the browser chrome unless specific keyboard shortcuts are used to reach it, rather than focus just... stopping. It may also be worth noting that I believe this to be the default behaviour in JAWS when quick nav is used, e.g. trying to find the next form field from the bottom of the page will wrap and start searching from the top.

Iterating on the native skip links idea, what if the browser added a "return to top" link at the end of the focus order on each page (including when a dialog is showing)?

Agreed that this doesn't sound good. It would make it more difficult to determine, at a glance, which links were in the DOM and which were part of the page, and create duplication if the page already had such a link. Unless it would only be in the keyboard focus order and not in the DOM, but that also sounds bewildering e.g. if a screen reader user could tab to a link (which is a type of control associated with the web) but not find it on the page they were browsing.

@jcsteh

jcsteh commented Mar 10, 2021

We're comparing web modals to OS modals here, but they actually have some different properties:

  1. The dialog is modal, yes, but it's modal to the web document. For example, you can't click a button in the document outside of the dialog, but you can click the address bar. Not allowing users to tab to the browser UI when you normally can and it's still perfectly visible (not obscured or diminished in any way) feels like we're lying to the user.
  2. You can tab to the browser UI from a web document (no dialog). I find it hard to justify why a modal dialog should be different, especially given point 1. If nothing else, it's inconsistent.
  3. In contrast, OS modal dialogs (Open, Save, etc.) are truly modal: you can't interact with the rest of the browser UI at all while they're open.
  4. If we opened a pop-up window instead of a web modal (arguably used to serve the same purpose in many cases), you'd be able to tab to the address bar.

IMO, if we're going to argue that tab shouldn't reach the browser chrome in web modals, to be consistent, tab shouldn't reach the browser chrome even in documents, as others have said. I don't think we should be deploying one change without the other; doing so is inconsistent. In theory, I actually agree with this. Pragmatically speaking, there are problems with that, though:

  1. This behaviour has existed for many, many years. Changing it is likely to open up a huge can of worms.
  2. Like it or not, keys such as f6 to move between panes are not well known, even among relatively experienced screen reader users, let alone non-screen reader keyboard users. As an example, we've had a lot of comments concerning f6 moving to popup notifications in Firefox not being discoverable, even from advanced screen reader users. On top of that, I think (but am not certain) that f6 is a Windows-centric thing; I don't think it's standard on Mac, though browsers probably support it there.

@jcsteh

jcsteh commented Mar 10, 2021

Not allowing users to tab to the browser UI when you normally can and it's still perfectly visible (not obscured or diminished in any way) feels like we're lying to the user.

To be extra clear, it wouldn't be a problem if we didn't ever allow tabbing to the chrome for documents, since there'd be no change. My problem here is the inconsistency. Nothing changed as far as the chrome is concerned, yet we change tabbing behaviour.

@sinabahram

sinabahram commented Mar 11, 2021

I don't think most users differentiate web and OS modals, just as you, @jcsteh, point out that f6 behavior may not be known by all.

We can't keep using reductionism to maintain the status quo when the status quo is harmful to users. If we accept that no matter what, there's some level of change required, which I desperately hope that we do, then I submit that we should incur that cost in such a way to minimize user pain and maximize utility, even if that utility must be learned in order to be fully realized.

To that end, I strongly advocate, as have many other users on this thread, for modals, OS or otherwise, to be tab-trapped. I have seen user after user get lost in browser chrome. They simply don't wish to go there when they are interacting with a document or dialog. Yes, it is visibly there, but that simply does not matter. We should provide good affordances for accessing the chrome, whether it is the myriad shortcut keys available, from alt+d and navigating from there, to ctrl+L to quickly type in a location, to f6 to navigate around, to whatever else exists, up to and including a browser-specific shortcut. This is preferable for all users, from switch users who can macro such a chrome-jumping key to keyboard users and screen reader users who must not be forced to hit tab a million times to get to the chrome.

TLDR: can we please let tab be trapped within the context for which it is the most helpful, i.e. what the user is currently interacting with?

As for consistency, I can't agree more. Never tab to the chrome. That's quite consistent.

@nolanlawson
Member Author

Hi folks, I'm sorry to dredge up this issue again. But I took another look at this problem, and my conclusion is still that it's impossible to build a modal dialog that meets the APG criteria, if the dialog contains a closed shadow root or user-agent shadow root like <video controls> or <audio controls>.

The closest I can get is by using <dialog>, but it still has the issue of Tab escaping to the browser chrome. Anything other than <dialog> requires hacks that don't fully address the problem.

Summarizing the discussion above, it seems there are 3 ways forward:

  1. Get browsers to change the way Tab works on all web pages, such that it doesn't escape to the browser chrome.
  2. Introduce a new "focus trap" primitive (perhaps similar to focusgroup).
  3. Change the APG to make Tabbing to the browser chrome acceptable.

The first option seems to be the one this group prefers, but it sounds to me like a bit of a tall order. As @alice said:

Having focus wrap around to browser chrome has been the default behaviour for as long as I can remember, so changing it would need to be approached carefully (even if changing it would be an improvement).

Or as @jcsteh said:

This behaviour has existed for many, many years. Changing it is likely to open up a huge can of worms.

It would be great to get all browsers on-board with this change, but it seems hard without a dedicated champion. And given that it isn't technically part of any spec (AFAIK), can we ensure that various forks of browsers with their own implementations of browser chrome (Edge, Brave, Vivaldi, Tor Browser, GNOME Web, etc.) all honor this contract?

The second option would meet the current APG criteria without needing to change longstanding browser behavior, but it could still cause non-compliant dialogs to appear in the wild, if naïve developers use <dialog> without using the focus trap primitive as well. The <dialog> spec could be changed to use this primitive, though.

The third option is a bit drastic, and many in this group are opposed to it. But I wonder if it isn't the best compromise, given the de-facto state of <dialog>, and assuming that neither of the first 2 options can be achieved. Because of issues with shadow DOM and userland dialogs, I concluded in my research that the only reasonable solution today is to use <dialog>, even if it doesn't meet the APG criteria. (It's better to be able to Tab through a <video controls> at all, even if Tab escapes to the browser chrome.) And given that it doesn't require any fragile userland JavaScript to make it work correctly, it's a very attractive option.

The only other option is for web authors to avoid closed/user-agent shadow DOM entirely, which seems nonviable, especially for generic dialog components that don't know what their content will be.

@JAWS-test

@nolanlawson I don't find it problematic that the HTML dialog differs slightly from the ARIA dialog in terms of keyboard operation. Keyboard operation in ARIA patterns is not normative, but merely a sensible recommendation. If browsers do not follow the ARIA recommendations for HTML elements, this can be annoying, but there is no necessary reason to change either operation unless they directly contradict each other or differ greatly (which is not the case here). There are also other ARIA patterns that deviate from the HTML standard in terms of keyboard operation, e.g.

  • Multiple selection at listbox (with ENTER or with SHIFT or CTRL)
  • SHIFT+TAB on radio buttons (focus first or last)

See: #2193
