Follow up of #14312: ensure IO completion routines and waitable timers are safe #14627
See test results for failed build of commit 37c02035ea
Never mind, I think it's too much for 2023.1. I really want to kill this issue with APCs once and for all, and it's probably best to bundle that.
See test results for failed build of commit f66ba17393
See test results for failed build of commit 61f2cba87a
See test results for failed build of commit 366b48eca7
@michaelDCurran and @seanbudd I leave it up to you whether this can go in 2023.1. I think there are some aspects counting against delaying this to 2023.2.
See test results for failed build of commit 400e746420
I'm not sure whether this is the right place to write this, but I wanted to ask whether you have encountered the following errors. I'm using Albatross, and the build is based on the current main branch code. I think this issue occurs rarely. I may have encountered it some time ago, but this time I investigated the log file. When NVDA has been running for several days (which is common in my case), investigating the log file can be quite hard because of its size, especially when debug level is in use. I can attach more detailed log entries from this session if needed ("..." means skipped lines): ...
Indeed, these access violation errors are exactly the type of error this PR aims to fix.
@michaelDCurran and @seanbudd I went over this again and ensured that it is now backwards compatible. I would love for this to go into 2023.2.
Co-authored-by: Sean Budd <seanbudd123@gmail.com>
See test results for failed build of commit 9e9742ec38
Co-authored-by: Sean Budd <seanbudd123@gmail.com>
See test results for failed build of commit 157546c997
Related to #14899, #14312, #14627
Fixes #14895
Summary of the issue:
Despite several attempts to fix this, NVDA's IoThread can crash without a clear cause.
Description of user facing changes
Fewer crashes, most likely, as tests indicate.
Description of development approach
As proposed by @jcsteh, rather than creating a new function pointer for every APC or completion routine call, use a single internal APC and a single internal completion routine, together with an internal cache that stores the Python functions rather than the actual APC functions.
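The approach described above can be illustrated roughly as follows. This is only a minimal sketch, not NVDA's actual code: `ApcStore`, `register`, `internalApc` and `_dispatch` are hypothetical names, and `ctypes.CFUNCTYPE` stands in for the Windows APC prototype so that the example runs on any platform. The point is that the one function pointer lives as long as the store does, while the per-call Python functions are only looked up by key.

```python
import ctypes
import itertools
import threading

# Stand-in for an APC prototype: no return value, one pointer-sized parameter.
# (Assumption: real Windows code would use a WINFUNCTYPE-based prototype instead.)
ApcProto = ctypes.CFUNCTYPE(None, ctypes.c_size_t)


class ApcStore:
	"""Hypothetical cache: one C callback, many Python functions looked up by key."""

	def __init__(self):
		self._lock = threading.Lock()
		self._keys = itertools.count(1)
		self._funcs = {}
		# The only function pointer ever handed to native code.
		# It is created once and kept alive by this attribute.
		self.internalApc = ApcProto(self._dispatch)

	def register(self, func):
		"""Store a Python function and return the key to pass as the APC parameter."""
		key = next(self._keys)
		with self._lock:
			self._funcs[key] = func
		return key

	def _dispatch(self, key):
		# Runs when the single internal callback is invoked.
		with self._lock:
			func = self._funcs.pop(key, None)
		if func is None:
			# Unknown or already handled key: do nothing instead of crashing.
			return
		func(key)


store = ApcStore()
received = []
key = store.register(lambda param: received.append(param))
# Simulate the OS invoking the APC by calling the C callback directly.
store.internalApc(key)
# A second call with the same key finds nothing in the cache and is ignored.
store.internalApc(key)
```

Because native code only ever sees `internalApc`, there is no per-call function pointer that could be freed while a call is still pending.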
Fixup of #14924
Summary of the issue:
In #14627, we introduced weak references for APCs called as part of a waitable timer. In #14924, this was made more robust by using a single internal APC function. However, during the port, part of the logic was reversed, so the internal APC store still held strong rather than weak references.
Description of user facing changes
None.
Description of development approach
Store weak references, rather than the functions themselves, in the APC store.
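To make the difference concrete, here is a minimal sketch of a store that keeps weak references instead of the functions themselves. `WeakCallbackStore`, `register`, `dispatch` and `Device` are hypothetical names chosen for illustration and are not taken from NVDA. Bound methods need `weakref.WeakMethod`, because a plain `weakref.ref` to a bound method dies immediately.

```python
import weakref
from types import MethodType


class WeakCallbackStore:
	"""Hypothetical store that never keeps a callback's owner alive."""

	def __init__(self):
		self._refs = {}
		self._nextKey = 0

	def register(self, func):
		# Bound methods are temporary objects, so they need WeakMethod.
		# Plain functions are assumed to be weak-referenceable.
		if isinstance(func, MethodType):
			ref = weakref.WeakMethod(func)
		else:
			ref = weakref.ref(func)
		self._nextKey += 1
		self._refs[self._nextKey] = ref
		return self._nextKey

	def dispatch(self, key, *args):
		"""Call the stored function if it still exists; report whether it ran."""
		ref = self._refs.pop(key, None)
		func = ref() if ref is not None else None
		if func is None:
			return False
		func(*args)
		return True


class Device:
	def __init__(self):
		self.received = []

	def onRead(self, data):
		self.received.append(data)


store = WeakCallbackStore()
device = Device()
firstKey = store.register(device.onRead)
ranWhileAlive = store.dispatch(firstKey, b"x")

secondKey = store.register(device.onRead)
del device  # the store alone does not keep the instance alive
ranAfterDelete = store.dispatch(secondKey, b"y")
```

With strong references in `_refs`, the second dispatch would still call into an instance that the rest of the program had already discarded.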
Link to issue number:
Follow up of #14312
Summary of the issue:
In #14312, we decoupled the background I/O thread from the braille module. However, the thread was still bound to IoBase in its _initialRead method unless you overrode it.
I hoped the changes in #14312 would decrease braille crashes, and this was indeed the case. However, crashes are still occurring here and there. I'm fairly sure the cause lies in the I/O completion routines, which are still executed on the I/O thread without ensuring that the instances they belong to still exist. I've seen this cause several access violation errors.
Furthermore, #14312 saved bound methods in a dictionary on the IoThread. While this shouldn't be a big problem, it could cause issues with garbage collection (i.e. instances being kept alive forever because the APC of an instance was never called, leaving it stuck in the cached APC dictionary).
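The garbage collection concern can be reproduced in a few lines. This is a sketch only: `Display`, `onTimer` and `cachedApcs` are hypothetical names, and the behaviour shown relies on CPython's reference counting.

```python
import gc
import weakref


class Display:
	def onTimer(self, param):
		pass


cachedApcs = {}
display = Display()
probe = weakref.ref(display)  # lets us observe whether the instance still exists

# A bound method holds a strong reference to its instance,
# so caching it keeps the instance alive even after `del display`.
cachedApcs[1] = display.onTimer
del display
gc.collect()
aliveWithStrongCache = probe() is not None

# Replacing the cached bound method with a WeakMethod releases the instance.
method = cachedApcs.pop(1)
cachedApcs[1] = weakref.WeakMethod(method)
del method
gc.collect()
aliveWithWeakCache = probe() is not None
```

If the cached APC is never called and never removed, the first situation persists for the lifetime of the dictionary.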
Description of user facing changes
Hopefully, more stability.
Description of development approach
Testing strategy:
Known issues with pull request:
This creates a new CFunc instance for every called completion routine and therefore for every async read. I think this is the safest method to both avoid crashes and ensure that routines won't be left behind.
Change log entries:
Bug fixes:
For Developers:
API deprecations:
Code Review Checklist: