
Ensure that all in-process log calls never block an app's threads #11431

Merged: 9 commits from asyncInprocLogging into master, Jul 30, 2020

Conversation

michaelDCurran (Member)

Link to issue number:

Re #11369, #11398

Summary of the issue:

While NVDA is running code within other processes (such as for virtualBuffers, displayModels and other helper functions), it occasionally logs messages back to NVDA for debugging purposes. Log messages are sent to NVDA via RPC. Currently these RPC calls block while the log message is being sent to NVDA, which means an app's main thread could block for a time, or in the worst case deadlock with NVDA, if the Python garbage collector in NVDA happens to run within the incoming RPC call and happens to release a COM object belonging to the thread the log call originated from.

Description of how this pull request fixes the issue:

logMessage (in NVDAHelperRemote) will now do the following:

  • First it logs the message via the Windows debugging API, so that it shows up in a debugger if one is connected. Previously this was done only after the log message had been sent to NVDA.
  • If NVDA's inproc manager thread is not yet running, or the current thread is not the inproc manager thread, the log message is placed in a queue and an APC function is queued to the inproc manager thread, which will later fetch the message and send it to NVDA.
  • If the current thread is the inproc manager thread, any previously queued log messages are fetched and sent to NVDA, and then the current log message is sent. Previously queued messages are deliberately processed before the current message so that log messages remain in the correct order.

Just before the inproc manager thread finally exits, we also process the queue one last time, to catch any messages queued after the manager thread's main loop completed.
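For illustration, a minimal sketch of that flow follows. The identifiers (logQueue, inprocMgrThreadHandle, inprocMgrThreadId, log_sendToNVDA) are hypothetical stand-ins, not NVDAHelper's actual names, and queue locking (which comes up later in this conversation) is omitted:

```cpp
// Hypothetical sketch; identifiers are illustrative, not NVDAHelper's actual ones.
#include <windows.h>
#include <deque>
#include <string>

std::deque<std::wstring> logQueue;
HANDLE inprocMgrThreadHandle = nullptr;  // set once the inproc manager thread starts
DWORD inprocMgrThreadId = 0;

// The blocking RPC call that hands a message to NVDA (assumed to exist).
void log_sendToNVDA(const std::wstring& msg);

// APC routine: runs on the inproc manager thread (which waits alertably,
// so queued APCs execute there) and drains the queue in order.
void CALLBACK log_flushQueue_apcFunc(ULONG_PTR) {
	while (!logQueue.empty()) {
		log_sendToNVDA(logQueue.front());
		logQueue.pop_front();
	}
}

void logMessage(const std::wstring& msg) {
	// 1. Always show the message in an attached debugger first.
	OutputDebugStringW(msg.c_str());
	if (!inprocMgrThreadHandle || GetCurrentThreadId() != inprocMgrThreadId) {
		// 2. Off the manager thread (or before it starts): queue the message,
		// and schedule an APC on the manager thread if it already exists.
		logQueue.push_back(msg);
		if (inprocMgrThreadHandle)
			QueueUserAPC(log_flushQueue_apcFunc, inprocMgrThreadHandle, 0);
	} else {
		// 3. On the manager thread: flush older messages first so ordering holds.
		log_flushQueue_apcFunc(0);
		log_sendToNVDA(msg);
	}
}
```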

Testing performed:

Ran NVDA, interacting with Microsoft Word, Notepad and the Run dialog.
Added some temporary log calls at various places in the in-process code (such as dllMain, and the handling of typed characters on the app's main thread) to ensure that message ordering was correct and that log calls to NVDA always happened from the manager thread.

Known issues with pull request:

None.

Change log entry:

Bug fixes:

  • NVDA should no longer sometimes freeze in Microsoft Word when rapidly arrowing up and down.

LeonarddeR (Collaborator) previously approved these changes Jul 28, 2020, and left a comment:

If I'm correct, we no longer limit inproc log messages to a level higher than debug, right? Is this intentional?

michaelDCurran (Member Author)

@LeonarddeR Ah yes, I should have mentioned that I removed the limitation around not logging debug level to NVDA. This limitation was originally there because there are so many debug calls that they would majorly slow down an app's main thread (say, when debugging displayModel), as each call would contact NVDA from that thread. Now that sending happens on a separate thread, it shouldn't be so bad. However, note that by default (and therefore in all official builds of NVDA) nvdaHelperLogLevel is lower than debug, and therefore the LOG_DEBUG macro is a no-op anyway. In other words, even now, to get messages at debug level you still need to compile NVDA with nvdaHelperLogLevel at debug or higher.
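For illustration, a compile-time gate along the lines described might look like this; the level values and macro names here are assumptions, not NVDAHelper's actual definitions:

```cpp
// Illustrative only: NVDAHelper's real log-level constants and macros differ.
#define LOGLEVEL_INFO 1
#define LOGLEVEL_DEBUG 2

// Stand-in for the build-time nvdaHelperLogLevel setting.
#ifndef NVDAHELPER_LOGLEVEL
#define NVDAHELPER_LOGLEVEL LOGLEVEL_INFO
#endif

#if NVDAHELPER_LOGLEVEL >= LOGLEVEL_DEBUG
#define LOG_DEBUG(msg) logMessage(msg)
#else
// Below debug level, LOG_DEBUG expands to nothing: a true no-op,
// so official builds pay no cost for debug-level log calls.
#define LOG_DEBUG(msg)
#endif
```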

michaelDCurran (Member Author)

@LeonarddeR I have fixed the indentation now.

michaelDCurran (Member Author)

@feerrenrut thanks for the useful review. I think the main reason I had maxCount as an argument in the first place was that I was trying to avoid an extra function by using log_flushQueue as the APC function as well.
I have refactored things based on your feedback. Now:

  • log_flushQueue no longer takes a maxCount argument; it just flushes the number of items currently available in the queue. Note that it still calls queue.size only once, at the beginning: one wouldn't want another thread to keep adding to the queue while flushQueue was fetching, so that it never finished.
  • There is now a dedicated APC function which just calls log_flushQueue.
  • log_terminate has been removed and log_flushQueue is used instead.
  • log_flushQueue is also now called once near the top of the manager thread, to fetch any log messages queued before the manager thread was started.

I hope this addresses all of your feedback.
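In terms of the earlier sketch, the refactored shape might look like this (still pre-locking, and the identifiers remain illustrative):

```cpp
// Flush whatever is in the queue right now; a bounded loop, not "until empty".
void log_flushQueue() {
	// Take the size once up front: if producers keep appending while we send,
	// this call still terminates after draining only the items seen here.
	const size_t count = logQueue.size();
	for (size_t i = 0; i < count; ++i) {
		log_sendToNVDA(logQueue.front());
		logQueue.pop_front();
	}
}

// Dedicated APC wrapper matching the PAPCFUNC signature; it just delegates.
void CALLBACK log_flushQueue_apcFunc(ULONG_PTR) {
	log_flushQueue();
}
```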

feerrenrut (Contributor)

Yes, these changes look good.

> Note that it still calls queue.size only once, at the beginning: one wouldn't want another thread to keep adding to the queue while flushQueue was fetching, so that it never finished.

I need to look into how APC works to really make assertions around this, but if these two functions can run concurrently then I'm concerned that there should be concurrency control on this deque. Check the thread-safety section on the following page as well: https://en.cppreference.com/w/cpp/container

I'll come back to this.

michaelDCurran (Member Author) commented Jul 29, 2020 via email

feerrenrut (Contributor) commented Jul 29, 2020

> certainly two threads could push to the queue at the same time

I think we will have to protect this deque from that. See the detailed rules under the "Thread safety" header on the following page: https://en.cppreference.com/w/cpp/container

The gist of it is that reading from the container (e.g. via const member functions) is fine; modifying the container is not.

feerrenrut (Contributor)

There seem to be two thread-safety problems:

  • A flush can happen at the same time as items are being added; both modify the queue.
  • Two threads could add items to the queue at the same time.

To prevent this we need to introduce locks, so that multiple threads cannot modify the queue at once. The main concern is to watch for deadlocks, though data races should also be kept in mind. By ensuring we only hold a lock while making changes to the queue, we can prevent deadlocks; the trick is localizing the code that needs the lock.

So, I see two options:

  1. Use two queues (it may no longer make sense to use deque specifically, but that is a separate concern): an active receive queue and an inactive queue. When flush is called, a lock is acquired just while the two are swapped; flush then empties the inactive queue. The same lock is acquired in log_message to ensure the active/inactive queues aren't swapped during the insertion/append/push.
  2. Continue with a single queue, but use a lock to protect the insert and the flush. While you could lock for each call to pop in flush, I'm not really comfortable recommending it. Instead, after carefully considering the thread-safety rules for STL containers, it might be OK to iterate through the elements logging them (up to maxCount), then lock and remove them in one call.

Assumptions:

  • Flush is only ever executed on the inproc manager thread, never concurrently. It would be good to ensure this by checking at the start of flush and, if we're on the wrong thread, queuing the flush via APC and exiting early; see the sketch below.
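A guard along those lines could be as simple as the following, reusing the hypothetical names from the earlier sketches:

```cpp
void CALLBACK log_flushQueue_apcFunc(ULONG_PTR);  // declared earlier in the sketch

void log_flushQueue() {
	// Enforce the assumption: flushing only ever happens on the manager thread.
	if (GetCurrentThreadId() != inprocMgrThreadId) {
		if (inprocMgrThreadHandle)
			QueueUserAPC(log_flushQueue_apcFunc, inprocMgrThreadHandle, 0);
		return;
	}
	// ... drain the queue as before ...
}
```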

michaelDCurran (Member Author)

@feerrenrut in log_flushQueue I now lock and swap, and in logMessage I now also lock.
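A minimal sketch of that lock-and-swap pattern, under the same assumed names (std::mutex is my choice here; the actual code may use a different synchronization primitive):

```cpp
#include <deque>
#include <mutex>
#include <string>

std::mutex logQueueLock;  // protects logQueue (name illustrative)

void logMessage(const std::wstring& msg) {
	OutputDebugStringW(msg.c_str());
	{
		// Hold the lock only for the push itself.
		std::lock_guard<std::mutex> lock(logQueueLock);
		logQueue.push_back(msg);
	}
	// ... queue the APC to the manager thread as before ...
}

void log_flushQueue() {
	std::deque<std::wstring> localQueue;
	{
		// Lock just long enough to swap the shared queue with an empty one;
		// the slow RPC sends then happen entirely outside the lock.
		std::lock_guard<std::mutex> lock(logQueueLock);
		logQueue.swap(localQueue);
	}
	for (const auto& msg : localQueue)
		log_sendToNVDA(msg);
	// localQueue and its contents are simply destroyed on return.
}
```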

feerrenrut (Contributor) left a comment:

I think this generally looks good. Just a couple of small questions, minor suggestions.

michaelDCurran and others added 2 commits July 30, 2020 19:27

…ally, just let them all get destroyed when the queue is destroyed.
Co-authored-by: Reef Turner <feerrenrut@users.noreply.github.com>

…r thread, queue the call back to the manager thread with APC.
Co-authored-by: Reef Turner <feerrenrut@users.noreply.github.com>

* Only try to queue a call to log_flushQueue via APC if the manager thread actually exists yet.
* Declare log_flushQueue_apcFunc early so it actually compiles :)
* Remove a redundant blank line.
michaelDCurran (Member Author)

@feerrenrut I committed your suggestions but made a couple of small changes: we can only queue via APC if the manager thread actually exists yet, and log_flushQueue_apcFunc needed to be declared first. Does this all look okay now?

feerrenrut (Contributor) previously approved these changes Jul 30, 2020

michaelDCurran merged commit 10976a7 into master Jul 30, 2020
michaelDCurran deleted the asyncInprocLogging branch July 30, 2020 21:34
nvaccessAuto added this to the 2020.3 milestone Jul 30, 2020