Skip to content

Gecko vbuf: Don't render anything if the document is dead. #14647

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Feb 27, 2023

Conversation

jcsteh
Copy link
Contributor

@jcsteh jcsteh commented Feb 17, 2023

Link to issue number:

This is intended to deal with the NVDA buffer infinite loop issue tracked in Mozilla bug 1786676, but it may also address NVDA vbuf crashes tracked in Mozilla bug 1691928 and Mozilla bug 1737688 (also tracked here as #13540).

Summary of the issue:

NVDA kills buffers for dead documents when a focus event is received. However, sometimes, when the user closes a tab, the closed document might die after the focus event for the newly active tab. Also, the focus event can be delayed sometimes; e.g. if loading a new document takes a while. When this happens, the buffer is still alive and listening for events, but the ids of nodes in that buffer might be reused by other documents! Trying to render those (completely arbitrary) nodes could cause a broken buffer tree (e.g. with loops), leading to instability later.

Description of user facing changes

Hopefully fixes occasional crashes and complete hangs in Firefox.

Description of development approach

Hold a COM reference to the document. When an event is received which ostensibly matches a node in the buffer, check if the document is alive. If it isn't, skip the node and clear the buffer.
To avoid excessive COM calls, only check this once per buffer update tick.

Another fix could be to have Firefox fire EVENT_OBJECT_DESTROY when a document dies. NVDA's vbuf could then catch that and clear the buffer. While I'm happy to look at implementing this in both Firefox and NVDA in future, this is currently rather tricky to do in Firefox because some of the information we need to get the HWND for the event is already gone by the time we destroy the document. I can probably cache this somehow, but there are obscure edge cases, made more complicated by supporting both the old and new accessibility engine architectures at the moment. I think this approach is going to be more reliable in the short term, plus it will work with existing versions of Firefox.

Testing strategy:

These hangs/crashes only occur occasionally and I've not found a way to reproduce them reliably. However, I've been running with this change locally for a while, watching for the log message that indicates this new code was triggered. (I changed the LOG_DEBUG to a LOG_CRITICAL locally to ensure I learned about these occurrences immediately.) I've seen it more than 5 times in the space of 12 hours, but no crashes or hangs. If any of these nodes had led us to render a broken tree (e.g. a tree with a loop), this could very well have caused a crash or an infinite loop without this code.

Known issues with pull request:

This will cause one additional COM call per buffer update tick (1 call per 100 ms maximum). I don't think this is going to be noticeable to the user, but it's worth noting.

Change log entries:

Bug fixes

  • NVDA no longer occasionally causes Mozilla Firefox to crash or stop responding.

Code Review Checklist:

  • Pull Request description:
    • description is up to date
    • change log entries
  • Testing:
    • Unit tests
    • System (end to end) tests
    • Manual testing
  • API is compatible with existing add-ons.
  • Documentation:
    • User Documentation
    • Developer / Technical Documentation
    • Context sensitive help for GUI changes
  • UX of all users considered:
    • Speech
    • Braille
    • Low Vision
    • Different web browsers
    • Localization in other languages / culture than English
  • Security precautions taken.

NVDA kills buffers for dead documents when a focus event is received.
However, sometimes, when the user closes a tab, the closed document might die after the focus event for the newly active tab.
Also, the focus event can be delayed sometimes; e.g. if loading a new document takes a while.
When this happens, the buffer is still alive and listening for events, but the ids of nodes in that buffer might be reused by other documents!
Trying to render those (completely arbitrary) nodes could cause a broken buffer tree, leading to instability later.

To fix this, hold a COM reference to the document.
When an event is received which ostensibly matches a node in the buffer, check if the document is alive.
If it isn't, skip the node and clear the buffer.
To avoid excessive COM calls, only check this once per buffer update tick.
@jcsteh jcsteh requested a review from a team as a code owner February 17, 2023 00:23
@jcsteh jcsteh requested a review from seanbudd February 17, 2023 00:23
@jcsteh
Copy link
Contributor Author

jcsteh commented Feb 17, 2023

Another fix could be to have Firefox fire EVENT_OBJECT_DESTROY when a document dies.

I filed a bug to consider implementing this in future: https://bugzilla.mozilla.org/show_bug.cgi?id=1817349

@michaelDCurran michaelDCurran merged commit 872db60 into nvaccess:master Feb 27, 2023
@nvaccessAuto nvaccessAuto added this to the 2023.2 milestone Feb 27, 2023
seanbudd pushed a commit that referenced this pull request Jul 29, 2024
Fixes Chrome issue 41487612
Fixes Chrome crash introduced in NVDA pr #14647

Summary of the issue:
Google has detected crashes in Chrome when users are running NVDA.
Exact steps to reproduce are not known, but it is when an NVDA virtual buffer is destroyed. Could be on Chrome exit, NVDA exit, or closing a Chrome window.
Chrome issue: issues.chromium.org/issues/41487612
It shows that a COM object tries to be released from an RPC worker thread by NVDA's in-process code, which causes Chrome to crash.

NVDA pr #14647 introduced code to hold a reference to the document root accessible on the virtual buffer, so that NVDA could check if the document had died. However, it is this COM object that is automatically released when the virtual buffer id destroyed from an RPC worker thread.
The COM object should really however be released when the virtual buffer is terminated in the correct UI thread, before destruction.

Description of user facing changes
No longer cause Google Chrome to crash when closing a document or exiting Chrome.

Description of development approach
VBufBackend_gecko_ia2::renderThread_terminate: correctly release the document root accessible.
VBufBackend_gecko_ia2's destructor: in the very unlikely case where the VBufBackend_gecko_ia2::renderThread_terminate has not been called, detach the document root accessible, leaking it rather than inappropriately releasing it on the wrong thread.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants