Explainer: WebDriver Extension for Accessible Nodes, etc. (potential solution for #197) #203

Open
cookiecrook opened this issue Sep 5, 2023 · 12 comments


@cookiecrook
Collaborator

cookiecrook commented Sep 5, 2023

Rather than muddy the problem issue #197 with a specific proposed solution, I'm posting this as a standalone issue. Ideally we could turn this Issue into an Explainer and eventually a Spec, but the goal is to get wider approval of the idea first, during a few meetings at TPAC 2023 Sept 11–15 in Spain.

Note: I will be editing this problem description, so expect changes.

Note on WebDriver-BiDi

This explainer does not use BiDi examples, but we don't anticipate problems converting to the other format and welcome accessibility additions to Classic and/or BiDi. It's been suggested that this be added to the BiDi roadmap.

Current State of Cross-Browser Web Accessibility Testing

Existing WebDriver accessibility testing methods go through DOM Element, to AX Element, then to its label or role.

get a string value from the backing accessibility object (if it exists) of a given DOM element

session/{session_id}/element/{element_id}/computedrole
session/{session_id}/element/{element_id}/computedlabel
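
As a rough illustration of how a test harness might address these two endpoints, the sketch below builds the URL and unwraps the top-level `value` key that WebDriver Classic responses use. The helper names, base URL, and IDs are placeholders invented for this example.

```python
import json


def computed_property_url(base_url: str, session_id: str, element_id: str, prop: str) -> str:
    """Build the URL for the computedrole or computedlabel endpoint."""
    if prop not in ("computedrole", "computedlabel"):
        raise ValueError(f"unsupported property: {prop}")
    return f"{base_url.rstrip('/')}/session/{session_id}/element/{element_id}/{prop}"


def unwrap_value(response_body: str):
    """WebDriver Classic wraps command results in a top-level "value" key."""
    return json.loads(response_body)["value"]
```

A harness would send a GET request to the returned URL and pass the response body to `unwrap_value`.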

In 2023, we added over 1000 automated accessibility tests to the WPT Interop 2023 Accessibility Investigation using the above two WebDriver methods, but there is so much more to test, and no way available to test it in WPT/WebDriver.

Potential Changes

See also: #197
…a new WebDriver accessibility extension might look something like this:


1. Way to access the backing "accessible node" of a DOM element (if one exists).

Note

Only one of the following two accessors 👇 is needed, not both.

get accessible node from its mainstream DOM element (if one exists)

EITHER a new method in a new accessibility-specific webdriver extension.

session/{session_id}/accessibility/element/{elId}/accessiblenode
#                    ^^^^^^^^^^^^^

OR a new method on the existing webdriver element interface.

session/{session_id}/element/{elId}/accessiblenode

Note

Only one of the preceding two accessors 👆 is needed, not both. Currently prototyping option #2.


2. Way to access an "accessible node" by its webdriver ID.

Regardless of whether an accessible node is associated with a DOM element (some are not), once you already have the accessible node ID:

get accessible node by its WebDriver ID

session/{session_id}/accessibility/node/{axId}
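
The draft paths from parts 1 and 2 could be assembled as in the sketch below. The function names are made up for illustration, the paths are still proposals, and whether the element accessor returns an axId or a full snapshot is not decided here.

```python
def accessible_node_from_element_path(session_id: str, element_id: str, option: int = 2) -> str:
    """Path for reaching the accessible node behind a DOM element.

    option 1: new accessibility-specific extension
    option 2: new method on the existing element interface
    """
    if option == 1:
        return f"/session/{session_id}/accessibility/element/{element_id}/accessiblenode"
    if option == 2:
        return f"/session/{session_id}/element/{element_id}/accessiblenode"
    raise ValueError("option must be 1 or 2")


def accessible_node_by_id_path(session_id: str, ax_id: str) -> str:
    """Path for reaching an accessible node directly by its WebDriver ID."""
    return f"/session/{session_id}/accessibility/node/{ax_id}"
```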

3. Way to Trigger an Accessibility Event/Notification.

We also need a way to trigger a notification on the accessibility object.

session/{session_id}/accessibility/node/{axId}/synthesizeevent

Note

synthesizeevent is just a draft name. Very open to change on every aspect of this.

Common Events

Click/Press

The minimum payload is the notification type. For example, a screen reader “click” would fire:

{ "type": "press" } 

Explanation: “AX Press” almost always results in a DOM “click” but the event object on a “press on AX object” event can end up very different from a “click on DOM element.” For example:

  • Event target can be different with leaf nodes inside the interactive (e.g. a span in a button that intercepted the mouse event, versus the interactive itself, that the AT "cursor" is on... there are a number of known and unknown implementation differences here.)
  • Event timing can be different (e.g. mouseup and mousedown possibly in the same or adjacent event loops, which is unlikely for mechanical mouse or trackpad users.)
  • Other event properties can be different... Often these also allow some heuristic detection surface between AT users and mainstream others. See “Several core architectural features of the Web Platform may allow heuristic detectability of assistive technology” (w3ctag/design-principles#293).
  • See also: WebDriver’s element click intercepted error code; relevant because an accessibility click can bypass pointer hit point obscuration.

AT Focus (pulls keyboard focus if the element is focusable)

AT Focus should be verifiable, because it will pull standard keyboard focus along with it if the AT-focused element is keyboard focusable.

{ "type": "focus" } 

Other Events/Notifications

Trigger “Action” (lower priority for v1/MVP)

It could also be used for non-default “actions” (e.g. trigger the associated “reply” action):

{
  "type": "action",
  "label": "reply" /* possible this should use something other than the translatable user string label */
} 

This one 👆 has native precedent, but the proposed Web API hasn’t yet shipped, so it may be lower priority.

Scroll into view a.k.a. “scroll to visible” (lower priority for v1/MVP)

This might not be needed as it’s usually called downstream from focus, rather than directly from AT.

{ "type": "scrollToVisible" } 

Show Menu (lower priority for v1/MVP)

Show menu (VO and other ATs’ equivalent of showing the “right-click” menu). This sometimes results in different AT-vs-mainstream behavior when a website has overridden the “right-click” mouse behavior.

{ "type": "showMenu" } 

I don’t know whether “showMenu” would be interoperable on other systems, but it’s in WebKit because Mac VO and other AT support it. I assume Windows has something similar.
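
The payloads in this section could be built and sanity-checked with something like the sketch below. The helper names are invented for this example, the event type names are the draft names used above, and sending the payload as a POST body is an assumption.

```python
# Draft event types that carry no extra fields in the examples above.
SIMPLE_EVENT_TYPES = {"press", "focus", "scrollToVisible", "showMenu"}


def synthesize_event_path(session_id: str, ax_id: str) -> str:
    """Draft path for triggering an event on an accessible node."""
    return f"/session/{session_id}/accessibility/node/{ax_id}/synthesizeevent"


def event_payload(event_type: str, label: str = None) -> dict:
    """Build the request body (assumed to be sent with POST)."""
    if event_type == "action":
        if not label:
            raise ValueError('an "action" event needs a label')
        return {"type": "action", "label": label}
    if event_type not in SIMPLE_EVENT_TYPES:
        raise ValueError(f"unknown event type: {event_type}")
    if label is not None:
        raise ValueError(f'"{event_type}" does not take a label')
    return {"type": event_type}
```

For example, `event_payload("action", label="reply")` produces the “reply” action body shown earlier.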


4. Test-Only (WebDriver-only for now?) Interface for accessible node.

The return value for the accessibleNode would be a static snapshot of the element at the time of the request:

  • including attributes like
    • checked
    • selected
    • label (equiv to el.computedlabel)
    • role (equiv to el.computedrole)
    • etc.
  • including ID references to parent and children in the accessibility tree, as well as the element ID of the mainstream DOM element (if there is one)
  • probably including relationships to other elements (label/for, aria-controls, etc.)
  • See more perf and implementation discussion points below.

Example return object for accessible node getter.

{ 
  "domnode": "<webdriver_dom_id>", /* optional, as not all axnodes will have domnodes, and vice versa */
  "label": "First Name", /* equivalent to /session/{sId}/element/{eId}/computedlabel */
  "role": "textbox", /* equivalent to /session/{sId}/element/{eId}/computedrole */
  "parent": "<parent_id>", /* WebDriver accessibleNode IDs not DOM IDs, */
  "children": ["<child_1_id>", "<child_2_id>", "<child_n_id>" ], /* ditto */
  "checked": undefined, /* checked n/a on text fields, perhaps omitted in this returned object? */
  "required": "true", /* from `required` or `aria-required` attrs */
  "…": "…" /* dozens more accessibility relevant properties… */
}
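
On the client side, a snapshot like this could be read into a small immutable structure, as in the sketch below. It assumes that properties which do not apply (such as `checked` on a text field) are simply omitted from the returned object, which is still an open question. The class and function names are invented for this example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class AccessibleNodeSnapshot:
    """Static copy of a few draft properties; absent keys become None."""
    role: Optional[str] = None
    label: Optional[str] = None
    domnode: Optional[str] = None      # WebDriver element ID, if any
    parent: Optional[str] = None       # accessible node ID, not a DOM ID
    children: Tuple[str, ...] = ()     # accessible node IDs
    checked: Optional[str] = None
    required: Optional[str] = None


def parse_snapshot(obj: dict) -> AccessibleNodeSnapshot:
    """Copy the known keys out of a decoded response object."""
    return AccessibleNodeSnapshot(
        role=obj.get("role"),
        label=obj.get("label"),
        domnode=obj.get("domnode"),
        parent=obj.get("parent"),
        children=tuple(obj.get("children", ())),
        checked=obj.get("checked"),
        required=obj.get("required"),
    )
```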

Discussion Points

Getter Interface for ~“accessibleNode”

There’s a balance between returning a limited scope of known things to query and “over-exposing” as much as possible about the backing accessible object. Some relationships or properties are costly or slow to return, so we’ll probably need to start with a subset of the things that all implementations can return reasonably quickly.

Perhaps multiple getters: a default set of the easy ones (role, label, required, checked, yadda, yadda) and then we don’t include the ones with a significant perf cost or other complications unless requested specifically.

  1. /session/{sID}/accessibility/~ax_element/{axID} for defaults
  2. /session/{sID}/accessibility/~ax_element/{axID}/~full for everything
  3. /session/{sID}/accessibility/~ax_element/{axID}/~partial for a specific set, with an array of keys in the post payload

Note

Note that ~ above indicates TBD draft name proposals... Open to changes, of course.
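
A sketch of how a client might choose among the three draft getters follows. Using GET for the first two and wrapping the key array in a `keys` field are assumptions made for this example. The list only says the partial getter takes an array of keys in the POST payload.

```python
def snapshot_request(session_id: str, ax_id: str, mode: str = "default", keys=None):
    """Return (method, path, body) for one of the three draft getters."""
    base = f"/session/{session_id}/accessibility/~ax_element/{ax_id}"
    if mode == "default":
        return ("GET", base, None)
    if mode == "full":
        return ("GET", base + "/~full", None)
    if mode == "partial":
        if not keys:
            raise ValueError("partial mode needs at least one key")
        # The "keys" wrapper is hypothetical; the payload shape is undecided.
        return ("POST", base + "/~partial", {"keys": list(keys)})
    raise ValueError(f"unknown mode: {mode}")
```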

Object/Node Persistence

API should be clear that Accessibility Objects/Nodes are not expected to persist once removed from the accessibility tree. Though this may be possible in some implementations, it is unlikely to be readily achievable in all implementations, so:

  • there is no expectation that a hidden or ignored element, for example, will return a backing accessibility object
  • when a DOM element is hidden and then re-displayed, there is no expectation that the current accessibility object bears any relationship to the backing accessibility object that existed before the DOM element was hidden. In other words, the WebDriver ID may be different, and there may be no way to reconcile the two accessible objects other than that the earlier (now destroyed) object and the current object both reference the same DOM element WebDriver ID.
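
For test authors, the practical consequence is to treat an axId as disposable and re-resolve it from the DOM element when it goes stale. The sketch below shows that pattern against an in-memory stand-in for a session. The class names, the error type, and the assumption that the element accessor yields an axId are all hypothetical.

```python
class StaleAccessibleNode(Exception):
    """Hypothetical error: the axId no longer points into the accessibility tree."""


class FakeSession:
    """In-memory stand-in for a WebDriver session, for illustration only."""

    def __init__(self):
        self.nodes = {}        # axId -> snapshot dict
        self.by_element = {}   # DOM element ID -> current axId

    def node_by_id(self, ax_id):
        if ax_id not in self.nodes:
            raise StaleAccessibleNode(ax_id)
        return self.nodes[ax_id]

    def ax_id_for_element(self, element_id):
        return self.by_element[element_id]


def fetch_node(session, ax_id, element_id):
    """Use the known axId, falling back to the DOM element if it went stale."""
    try:
        return session.node_by_id(ax_id)
    except StaleAccessibleNode:
        fresh_id = session.ax_id_for_element(element_id)
        return session.node_by_id(fresh_id)


# The element was hidden and re-displayed: "ax1" is gone, "ax2" replaced it.
session = FakeSession()
session.nodes["ax2"] = {"role": "button", "domnode": "el1"}
session.by_element["el1"] = "ax2"
node = fetch_node(session, "ax1", "el1")
```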

@jcsteh
Collaborator

jcsteh commented Sep 6, 2023

Thanks for the detail here, @cookiecrook. This is a really solid start.

  1. Way to access the backing "accessible node" of a DOM element (if one exists).

Would this return an axId or an accessible node snapshot?

Return value for the accessibleNode would be a static snap shot of the element at the time of the request:
...
There’s a balance between whether to return a limited scope of known things to query

Can you explain the desire to return multiple properties at once here? This is very different to elements, for example, where WebDriver only returns one thing at a time. For example, to get an element attribute, you use /session/{session id}/element/{element id}/attribute/{name}, which only gets a single attribute. Obviously, returning multiple things in one call is better for performance. However, it does add some complexity in the spec; e.g. we have to work out what set of things to return as per your discussion section. If we return a single thing at a time, we can avoid some of that complexity.

Having different methods for every single thing would be tedious for extensibility. But perhaps we could have a simple attribute getter with a defined set of attribute keys we can expand over time? For example, /session/{session id}/accessibility/node/{axId}/attribute/{name}, where {name} could be "label", "role", "pressed", etc.

Further down the line, some thought needs to be given to how axId is specified. Some engines have simple globally unique 32 bit numeric ids for accessible nodes. I think Chromium does? I'm not sure about WebKit. However, Gecko does not, instead having a 64 bit unique id which is only guaranteed to be unique within the document, not across documents. So, some care needs to be taken in terms of what assumptions are made. I see WebDriver specifies that a node id is created as "a new globally unique string" and it also specifies that there is a "node id map". We might need to do something similar for axId. Or perhaps we can just specify that the id is globally unique but opaque and implementation defined? I'm not sure if that's reasonable.
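
One way to picture the "opaque but globally unique" option is a map, analogous to WebDriver's node id map, from an engine-local pair to a generated string. The sketch below is only an illustration of that idea. The class name and the notion of a document key are assumptions, not anything an engine is known to implement.

```python
import uuid


class AxIdMap:
    """Maps (document key, engine-local id) pairs to opaque, globally unique strings."""

    def __init__(self):
        self._by_local = {}
        self._by_ax_id = {}

    def ax_id_for(self, document_key, local_id) -> str:
        """Return the existing axId for this node, or mint a new one."""
        key = (document_key, local_id)
        if key not in self._by_local:
            ax_id = str(uuid.uuid4())
            self._by_local[key] = ax_id
            self._by_ax_id[ax_id] = key
        return self._by_local[key]

    def resolve(self, ax_id: str):
        """Return the (document key, local id) pair, or None if unknown."""
        return self._by_ax_id.get(ax_id)
```

Two nodes with the same local ID in different documents get different axIds, so a per-document ID scheme does not leak into the WebDriver-facing string.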

Bikeshedding: Maybe this is just me, but I don't love the name event or notification for things we perform on an accessible node. I tend to think of events and notifications as things that a node fires. I would have suggested "action", but that gets conflated with default or custom actions. Maybe "interaction"?

@cookiecrook
Collaborator Author

cookiecrook commented Sep 7, 2023

@jcsteh wrote:

Would [the "ax node from element" getter] return an axId or an accessible node snapshot?

That's an open question. Initially I thought either "ax node from element" or "ax node from id" should return the same snapshot object, but I don't have a strong preference for or against making the additional call...

Can you explain the desire to return multiple properties at once here?

Mainly to avoid tedium and perf hits... In the tree walker use case, for example, making each attribute/property separate calls could turn one call per element into dozens or hundreds per. But I acknowledge it could work either way.
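
To put rough numbers on that, the sketch below counts only property reads during a tree walk and ignores the calls needed to discover parents and children. The node and property counts are illustrative, not measurements.

```python
def calls_for_tree_walk(node_count: int, property_count: int, snapshot: bool) -> int:
    """Number of WebDriver calls needed to read every property of every node."""
    if snapshot:
        return node_count                   # one snapshot call per node
    return node_count * property_count      # one call per property per node


per_property_calls = calls_for_tree_walk(200, 40, snapshot=False)
snapshot_calls = calls_for_tree_walk(200, 40, snapshot=True)
```

With these made-up figures, per-property getters need 8000 calls where snapshots need 200.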

I see WebDriver specifies that a node id is created as "a new globally unique string" and it also specifies that there is a "node id map". We might need to do something similar for axId.

The Gecko GUID question seems worthy of researching sooner rather than later. Obviously the spec should be limited to features anticipated to be readily implementable in all engines.

I agree with all your other points, and I acknowledge those are open questions too.

@jcsteh
Collaborator

jcsteh commented Sep 7, 2023

Initially I thought either "ax node from element" or "ax node from id" should return the same snapshot object, but I don't have a strong preference for or against making the additional call...

It probably doesn't matter that much. If these calls return a snapshot, the snapshot should include the axId. The answer to this question will depend heavily on whether we go with snapshots or individual getters.

Mainly to avoid tedium and perf hits... In the tree walker use case, making each attribute/property separate calls could turn one call per element into dozens or hundreds per.

That's certainly true. This seems to be something that was considered acceptable for DOM elements in WebDriver and it'd be nice to have a similar interface for simplicity/consistency. On the flip side, we don't need to use WebDriver to walk the DOM tree, whereas we have no choice for the accessibility tree, so I realise the use case is quite different.

@jcsteh
Collaborator

jcsteh commented Sep 7, 2023

The Gecko GUID question seems worthy of researching sooner rather than later.

If an opaque, implementation defined, globally unique id string is acceptable, I think this should be implementable in all engines. That said, when I first raised this, I didn't realise that a WebDriver session had a "current browsing context". As I understand it, a browsing context is associated with a document. If the accessibility methods use this browsing context, that means we only need to look at the document associated with the current browsing context, not all documents everywhere. That does make this a lot more feasible. I guess we probably still want the id string to be globally unique though, even across browsing contexts?

@cookiecrook cookiecrook changed the title WebDriver Extension for Accessible Nodes, etc. (potential solution for #197) Explainer: WebDriver Extension for Accessible Nodes, etc. (potential solution for #197) Sep 8, 2023

@alice
Member

alice commented Sep 12, 2023

Some scattered (sorry, it's that kind of day) thoughts:


It might be helpful to guide the discussion if we could document some of the types of things we'd like to be able to test in WPTs using these APIs. Specifically, I think questions like those @jcsteh is asking around returning a property bag vs. returning discrete properties (as is done for Element properties), and questions around including/excluding ignored nodes when tree walking, might be easier to answer with a solid understanding of what we're going to do with the output.


Some of this makes me wonder whether we'd want to require accessibility to be "enabled" before an accessiblenode can be retrieved, so that we can ensure that the accessibility IDs are consistent (AFAICT currently Chrome at least implements computedname/computedrole on top of the CDP getPartialAXTree command, which creates a one-shot partial tree which is destroyed immediately.)


I guess all the property names will be based on the ARIA names, as the best platform-independent vocabulary we have available?

@cookiecrook
Collaborator Author

Possibly need a way to register for outward notifications too… e.g. When a live region changes.

@jcsteh
Collaborator

jcsteh commented Sep 13, 2023

Live region changes in particular might be tricky to standardise. Each API does them differently, which I suspect means core browser implementations vary wildly. Notably, IAccessible2 and ATK don't have specific live region events, but instead rely on generalised text inserted/removed events and the client checking live region properties on the object.

This is not to suggest that registering for outgoing events isn't something we need. It very probably is. However, I think it might take longer to iron out the details there and it might not make sense to block this work on standardising live region events.

Is there some other outward notification we can start with to get the core concept working? Focus or selection perhaps?

@cookiecrook
Collaborator Author

Spoke with @OrKoN today who mentioned a related use case for accessibility in webdriver... possibly w3c/webdriver-bidi#443

@cookiecrook
Collaborator Author

TPAC-related updates summarized in #197 (comment)

@cookiecrook
Collaborator Author

cookiecrook commented Oct 3, 2023

Most relevant from above linked notes:

Interesting point from @jgraham that most proposed features ("get axnode from el", "get axnode by id", "synthesize AT event") could be in both classic and bidi, but a few ("intercept outbound notification" and "get full/partial tree") seem like they would be easier to implement as bidi-only.

So for the sake of near-term interop, the minimum viable product could focus on those that could ship near-term in all three engines:

  1. Way to access the backing "accessible node" of a DOM element (if one exists).
  2. Way to access an "accessible node" by its webdriver ID.
  3. Way to Trigger/Synthesize an Accessibility Event/Notification.
  4. Test-Only (WebDriver-only for now?) Interface for accessible node.

But not an outgoing notification snarfer, for example.

[Update Nov 7: As an example, this likely means that outgoing ARIA Live Region notifications would not be testable in WebDriver Classic.]

@cookiecrook
Collaborator Author

cookiecrook commented Oct 3, 2023

Actually we could even remove (3. Trigger/Synthesize Accessibility Event/Notification) from the MVP, but it seems achievable and useful, so I'm keeping it in the short list for now.

@cookiecrook
Collaborator Author

Potential error codes:

| code | name | description |
| --- | --- | --- |
| 404 | stale accessible node reference | A command failed because the referenced accessible node is no longer attached to the implementation's accessibility tree. |
| 404 | no related accessible node | A command failed because the referenced element does not have a backing accessibility node in the implementation's accessibility tree. |
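
A client could surface these errors roughly as sketched below. It relies on the WebDriver Classic convention that an error response carries `error` and `message` fields inside `value`. The exception class and function name are invented for this example, and the two error names are only proposals.

```python
import json

# Error names proposed above; neither exists in WebDriver today.
PROPOSED_ERRORS = {"stale accessible node reference", "no related accessible node"}


class AccessibilityCommandError(Exception):
    def __init__(self, status: int, error: str, message: str):
        super().__init__(f"{status} {error}: {message}")
        self.status = status
        self.error = error
        self.message = message


def unwrap_or_raise(status: int, body: str):
    """Return the response value, or raise if the HTTP status signals an error."""
    value = json.loads(body).get("value")
    if status >= 400 and isinstance(value, dict):
        raise AccessibilityCommandError(
            status, value.get("error", "unknown error"), value.get("message", "")
        )
    return value
```

A test that expects a hidden element to have no accessible node could then catch `AccessibilityCommandError` and check that `error` is one of `PROPOSED_ERRORS`.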
