Questions on Mobile ATs Screen Reader Capture approach #1367


Open
howard-e opened this issue Apr 9, 2025 · 2 comments


howard-e commented Apr 9, 2025

Background

We need to collect accurate screen reader utterances from mobile devices to help testers manually collect test results. This applies to:

  1. TalkBack on Android
  2. VoiceOver on iOS

There are two primary approaches under consideration:

1. On-Device Collection

Collect utterances directly from the device through available system APIs, integrations, hooks, etc.

Pros

  • Direct and real-time collection.
  • Output should be easier to parse and more reliable.
  • Better suits the needs of future efforts to automate this collection; having direct access to on-device accessibility-related APIs and logs would help keep results consistent.

Cons

  • Feasibility in iOS is currently unknown (due to Apple's sandboxing of VoiceOver on iOS).
  • Susceptible to breakage from software updates.
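
For TalkBack, one way on-device collection could work from a connected computer is to filter the device's log stream over `adb`. The following is a minimal sketch under loud assumptions: it assumes TalkBack's developer log output is enabled and invents a `TalkBackDev ... speak:` line format for illustration (real tag names vary by TalkBack version, and this is not necessarily how bocoup/aria-at-talkback-capture works):

```python
import re
import subprocess

# Hypothetical log tag and message shape; actual TalkBack builds log
# under different tags, and verbose speech logging must be enabled in
# TalkBack's developer settings.
UTTERANCE_RE = re.compile(r"TalkBackDev.*?speak:\s*(?P<text>.+)$")

def extract_utterances(log_lines):
    """Pull spoken text out of raw logcat lines (sketch)."""
    utterances = []
    for line in log_lines:
        match = UTTERANCE_RE.search(line)
        if match:
            utterances.append(match.group("text").strip())
    return utterances

def capture_from_device():
    """Stream logcat from a connected Android device via adb and print
    each utterance as it is spoken (requires adb on PATH)."""
    proc = subprocess.Popen(
        ["adb", "logcat", "-v", "brief"],
        stdout=subprocess.PIPE,
        text=True,
    )
    for line in proc.stdout:
        for utterance in extract_utterances([line]):
            print(utterance)
```

Keeping the parsing separate from the `adb` plumbing also means the same filter could later run inside an automation harness that already has its own log transport.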

2. Video Processing of Screen Recording

Better described in #1315 (comment)

Processing videos where the Assistive Technology's utterances are displayed on screen to be extracted via OCR and collected. These recordings should ideally be at the "system" level (device mic not included) to avoid any outside noise.

Pros

  • Platform-agnostic
  • No dependency on internal APIs

Cons

  • Potentially resource intensive (background worker(s) processing multiple videos at once).
  • In the long term, we'd like to re-use our solution in the automation system, and this approach may not be well suited: there are unknowns around screen recording in automated environments, and no access to on-device debugging logs in the case of unexpected failures, among other concerns.
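
The video-processing route could be sketched as: sample frames from the recording, OCR the region where the screen reader's on-screen caption appears, and collapse duplicate reads across consecutive frames. A rough illustration, assuming the third-party opencv-python and pytesseract packages; the crop region and frame step are guesses, not measured values:

```python
import re

def dedupe_captions(raw_captions):
    """Collapse repeated OCR reads of the same on-screen caption.

    Consecutive frames usually show the same caption, so raw OCR output
    contains long runs of duplicates plus empty/noise-only frames.
    """
    utterances = []
    for text in raw_captions:
        cleaned = re.sub(r"\s+", " ", text).strip()
        if cleaned and (not utterances or utterances[-1] != cleaned):
            utterances.append(cleaned)
    return utterances

def ocr_recording(video_path, frame_step=15):
    """Sample frames from a screen recording and OCR the caption area.

    Requires opencv-python and pytesseract; imported lazily so the pure
    helper above stays usable without them.
    """
    import cv2
    import pytesseract

    capture = cv2.VideoCapture(video_path)
    captions = []
    index = 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        if index % frame_step == 0:
            # Crop the bottom third of the frame, where a speech-output
            # overlay is assumed to appear (placement varies by AT).
            height = frame.shape[0]
            region = frame[int(height * 2 / 3):, :]
            captions.append(pytesseract.image_to_string(region))
        index += 1
    capture.release()
    return dedupe_captions(captions)
```

This is where the "resource intensive" con shows up: each worker decodes video and runs OCR per sampled frame, so throughput depends heavily on `frame_step` and recording length.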

Current Status

TalkBack on Android, On-Device

Work started here, with success, through scripts hosted at bocoup/aria-at-talkback-capture. The scripts are able to capture utterances from Android when a test's "Run Test Setup" button is activated.

VoiceOver on iOS, On-Device

A similar effort has not yet been started but is planned. It will most likely need to circumvent the OS's "protections" and is likely to require discussions with, and approval directly from, Apple.

Video Processing of Screen Recording

Planned. Not yet started.

Questions

  1. Should we make an effort to build a user interface so others can more easily evaluate the aforementioned TalkBack on Android solution?
  2. Should work on the on-device approach be halted until the VoiceOver on iOS prototype is in a place to better facilitate discussions with Apple?
  3. With short-term collection for manual testing being the focus, the video processing route seems to be the fastest. Should any work be started on this as a "just-in-case"?
  4. If 3 is done, would a hybrid approach of the TalkBack collection + video processing for iOS be appropriate in the short term, or should we fully commit to the video processing route until discussions with Apple are resolved?
@ccanash ccanash added the agenda To be added to community group agenda label Apr 9, 2025

mcking65 commented Apr 17, 2025

@howard-e

Could you add a comment with a high-level outline of what the user flow would be like for the on-device android prototype?

Would users be able to run individual tests on an android device and then go to the test runner on a laptop and fill in assertion verdicts? Could they have the runner open on both at the same time and see AT responses get populated after running each command?

Would the responses be fed to the test runner via browser? Or, would we connect the android device to a laptop and collect the responses via a laptop that is running the test runner?


howard-e commented Apr 17, 2025

Sharing follow-up thoughts on a workflow here:

  • User opens test page on computer
  • User ensures Android device is connected to computer
  • User can click "Open Test Page" or a new button saying "Open Test Page on connected Android" (since this would be an Android-only test, the language could be more direct).
    • Some appropriate error displayed if device not found.
    • If TalkBack isn't already enabled, it can be automatically done at this point.
  • User activates the "Run Test Setup" button on their Android device.
  • User is notified that the TalkBack utterances are being captured up until they get to the end of the example or close the window on the device.
  • Relevant utterances are captured in the background as the user moves through the test on their device. The user completes their testing (reaching the end of the example or closing the test window on the device).
  • Utterances are available on the connected computer's clipboard, which they can then paste into the relevant test page's output textbox.
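
A hypothetical sketch of that last step, joining the captured utterances into the single string a tester would paste into the output textbox and placing it on the connected computer's clipboard (the clipboard commands tried below are platform guesses, not a decided design):

```python
import shutil
import subprocess

def format_for_output_box(utterances):
    """Join captured utterances into one paste-ready string,
    dropping empty captures and surrounding whitespace."""
    return "\n".join(u.strip() for u in utterances if u.strip())

def copy_to_clipboard(text):
    """Place text on the tester's clipboard, trying common
    platform clipboard commands in turn (macOS, Linux, Windows)."""
    for cmd in (["pbcopy"], ["xclip", "-selection", "clipboard"], ["clip"]):
        if shutil.which(cmd[0]):
            subprocess.run(cmd, input=text.encode(), check=True)
            return True
    return False
```

Pushing the result through the clipboard keeps the prototype decoupled from the test runner; feeding utterances to the runner directly (per the questions above) would replace `copy_to_clipboard` with a browser or USB transport.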

@mcking65 @jscholes does this align with our earlier discussion?

cc @ccanash
