Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Detect and log objects being cleaned up by the Python cyclic garbage collector, and remove several more reference cycles #11499

Merged
merged 7 commits into from Aug 18, 2020

Conversation

michaelDCurran
Copy link
Member

@michaelDCurran michaelDCurran commented Aug 16, 2020

Link to issue number:

Summary of the issue:

The lifetime of objects in Python are managed by reference counting. When an object's reference count hits 0, it is deleted straight away. However, if an object directly or indirectly holds a reference to itself, then the reference count will never reach 0 and therefore the object would not normally be cleaned up. Python therefore has a Garbage Collector (gc) which periodically searches through all objects that can hold references, and forcefully deletes any object found to be unreachable. As this check is periodic, and can happen in any thread, objects that participate in reference cycles may get deleted at strange or unpredictable times. This is especially problematic for comtypes COM objects as when they destruct, they call their IUnknown::Release method, which may block or cause other issues as it may be called from the wrong thread.

We try not to cause reference cycles in NVDA, but they easily can happen by accident.

It would be good if we were able to detect reference cycles, or at very least, when an object is being deleted by the garbage collector, and clearly log this.

In future we may also wish to take complete control of garbage collection and perform it at predictable times in our main thread, but for now at least knowing when it is already going wrong would be a big step forward in debugging.

Description of how this pull request fixes the issue:

This pr adds a new garbageHandler module to NVDA which hooks into the Python garbage collector, such that it knows when a garbage collection run starts and stops. It provides a module-level function (notifyObjectDeletion) which an object can call from its __del__ method. If this function detects that the caller is currently inside a garbage collection run (one is in progress and it is in the same thread) then a warning is logged stating that this object is being deleted by the Garbage collector.
Further to this, the very first time a deleted object is logged in this garbage collection run, an error is logged to clearly notify us that the garbage collector has found one or more important objects in this run. this error message also contains the full stack at this point, so that we can clearly see exactly where the garbage collector run happened, before anything bad happens (such as a freeze).
Of course the long-term goal is to not see any errors/warnings at all, as we will have removed all reference cycles.

the __del__ method on comtypes COM objects has been patched to call this function, thus we now can see in the log as soon as a COM object is about to be released due to the garbage collector.
So if for instance a COM object happened to be released by the garbage collector which caused a freeze, we now have very clear logging to understand what happened.

garbageHandler also provides a new TrackedObject class, which also provides a __del__ that does this. And now all major NVDA classes inherit from TrackedObject. Thus we now can see in the log when one of these objects is deleted due to the garbage collector. Object's that inherit from TrackedObject include nvwave.WavePlayer, the sayAll reader classes, eventHandler._eventExecutor, and baseObject.AutoPropertyObject (which therefore provides it for appModules, synthDrivers, globalPlugins, NVDAObjects, treeInterceptors etc).
Note that before Python 3.4 we could not have done this, as it used to be the case that any object with a __del__ would not be cleaned up by the garbage collector.

Thanks to this functionality, several more reference cycles were identified, and also addressed in further commits on this pr.

They include:

  • NVDAObjects.IAccessible.ia2Web.Editor was holding one of its descendants as _lastCaretObj (cached by MozillaCompoundTextInfo). As NVDAObjects usually cache their parents, this created a reference cycle. As we don't need _lastCaretObj once the Editor NVDAObject loses focus, it is now removed in event_loseFocus.
  • eventHandler._eventExecutor holds its 'gen' generator object on itself, so that its 'next' method can be called by various event levels. This reference was correctly cleaned up after the event was executed, but if there was an exception, this did not occur, and therefore the _event Executor was in a reference cycle until the garbage collector could later forcefully delete it. This may have also held around many execution frames containing local variables. Now, the reference is correctly removed in the 'finally' clause so that it always occurs even if there is an exception.
  • Our logHandler module had a bug (or is it Python's bug) where if a message was logged from within an 'except' clause where the Exception was exposed as a variable in the clause (E.g. except Exception as e), a reference cycle would be created on the frame holding that variable, thus all the frames (and local variables in each) to do with that exception were held until the garbage collector could forcefully delete them. This bug would have been new to Python 3, as Python 3 holds the traceback and frames on the Exception object. Python 2 does not do this. The bug was specifically in logHandler.getCodePath, where when fetching the given frame's local variables, they were being cached on the frame. It seems that a function Frame's f_locals member creates a new locals dictionary and caches it on itself each time this member is accessed. Therefore, we now clear this dictionary once we have fetched what we need.

Testing performed:

Ran NVDA for extended periods performing daily tasks. Running with just the first commit, causes a lot of errors to be logged about deleting unreachable objects. After the rest of the commits, I am so far not seeing any errors at all. Though there very well may be more -- but the whole point of this pr is to keep detecting them and fixing them.

Known issues with pull request:

None.

Change log entry:

tbd.

… collector. Includes COM pointers, and most NVDA-specific classes (AutoPropertyObject, _eventExecutor, WavePlayer, sayAll readers).
…dded by MozillaCompoundTextInfo, in order to break a reference cycle.
…hin an exception caluse where the Exception is exposed as a local variable within the clause.
…ce cycle when there is an exception when executing an event.
@michaelDCurran michaelDCurran added this to the 2020.3 milestone Aug 16, 2020
leonardder
leonardder previously approved these changes Aug 17, 2020
Copy link
Collaborator

@leonardder leonardder left a comment

Looks good to me

I think there are additional objects that can benefit from being a subclass of TrackedObject.

  • updateCheck.UpdateChecker
  • updateCheck.UpdateDownloader
  • Several GUI classes, including gui.SettingsPanel and gui.SettingsDialog

@josephsl
Copy link
Collaborator

josephsl commented Aug 17, 2020

feerrenrut
feerrenrut previously approved these changes Aug 17, 2020
Copy link
Member

@feerrenrut feerrenrut left a comment

Looks good! Thanks for the detailed PR description! One minor thing, but happy for you to merge this when you are ready.

_reportCountDuringCollection = 0
elif action == "stop":
if _reportCountDuringCollection > 0:
log.error(f"Found at least {_reportCountDuringCollection} unreachable objects in run")
Copy link
Member

@feerrenrut feerrenrut Aug 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be nice if this message said that it marked the end of this garbage collection run. Note that this error means that two errors will be logged if there are any object deletions.

@michaelDCurran michaelDCurran dismissed stale reviews from feerrenrut and leonardder via 0c401fe Aug 18, 2020
@michaelDCurran michaelDCurran merged commit 0098a31 into master Aug 18, 2020
1 check passed
@michaelDCurran michaelDCurran deleted the garbageHandler branch Aug 18, 2020
@leonardder
Copy link
Collaborator

leonardder commented Aug 18, 2020

I'm getting lots of errors when arrowing through a content recognition result.

@aaclause
Copy link
Contributor

aaclause commented Aug 18, 2020

Same here. Also I receive many errors as follows:

ERROR - garbageHandler.notifyObjectDeletion (09:33:00.254) - MainThread (16596):
Garbage collector has found one or more unreachable objects. See further warnings for specific objects.
Stack trace:
  File "nvda.pyw", line 215, in <module>
  File "core.pyc", line 550, in main
  File "wx\core.pyc", line 2134, in MainLoop
  File "gui\__init__.pyc", line 1050, in Notify
  File "core.pyc", line 519, in run
  File "IAccessibleHandler\__init__.pyc", line 907, in pumpAll
  File "IAccessibleHandler\__init__.pyc", line 573, in processGenericWinEvent
  File "appModuleHandler.pyc", line 132, in update
  File "appModuleHandler.pyc", line 117, in getAppModuleFromProcessID
  File "appModuleHandler.pyc", line 95, in getAppNameFromProcessID
  File "importlib\__init__.pyc", line 127, in import_module
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 963, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 906, in _find_spec
  File "<frozen importlib._bootstrap_external>", line 1280, in find_spec
  File "<frozen importlib._bootstrap_external>", line 1254, in _get_spec
  File "<frozen importlib._bootstrap_external>", line 1235, in _legacy_get_spec
  File "<frozen importlib._bootstrap>", line 441, in spec_from_loader
  File "<frozen importlib._bootstrap_external>", line 594, in spec_from_file_location
  File "comtypesMonkeyPatches.pyc", line 104, in newCpbDel
  File "garbageHandler.pyc", line 63, in notifyObjectDeletion
WARNING - garbageHandler.notifyObjectDeletion (09:33:00.270) - MainThread (16596):
Deleting unreachable object <POINTER(IAccessible2) ptr=0x818ff34 at 9608670>
WARNING - garbageHandler.notifyObjectDeletion (09:33:00.270) - MainThread (16596):
Deleting unreachable object <NVDAObjects.Dynamic_EditableTextWithAutoSelectDetectionEditorMozillaIAccessible object at 0x0963CCF0>
WARNING - garbageHandler.notifyObjectDeletion (09:33:00.270) - MainThread (16596):
Deleting unreachable object <POINTER(IAccessibleAction) ptr=0x8190d44 at 58d2c60>
WARNING - garbageHandler.notifyObjectDeletion (09:33:00.270) - MainThread (16596):
Deleting unreachable object <POINTER(IAccessibleText) ptr=0x8190714 at 58d26c0>
WARNING - garbageHandler.notifyObjectDeletion (09:33:00.270) - MainThread (16596):
Deleting unreachable object <POINTER(IAccessible2) ptr=0x8191494 at 9608580>
WARNING - garbageHandler.notifyObjectDeletion (09:33:00.270) - MainThread (16596):
Deleting unreachable object <NVDAObjects.IAccessible.mozilla.Mozilla object at 0x0963CD90>
WARNING - garbageHandler.notifyObjectDeletion (09:33:00.270) - MainThread (16596):
Deleting unreachable object <POINTER(IAccessibleAction) ptr=0x8191764 at 98a4850>
WARNING - garbageHandler.notifyObjectDeletion (09:33:00.270) - MainThread (16596):
Deleting unreachable object <POINTER(IAccessibleText) ptr=0x81917f4 at 96083f0>
WARNING - garbageHandler.notifyObjectDeletion (09:33:00.270) - MainThread (16596):
Deleting unreachable object <POINTER(IAccessibleHypertext) ptr=0x8190f3c at 9608620>
ERROR - garbageHandler._collectionCallback (09:33:00.270) - MainThread (16596):
Found at least 9 unreachable objects in run
ERROR - garbageHandler.notifyObjectDeletion (09:35:49.142) - MainThread (11396):
Garbage collector has found one or more unreachable objects. See further warnings for specific objects.
Stack trace:
  File "nvda.pyw", line 215, in <module>
  File "core.pyc", line 550, in main
  File "wx\core.pyc", line 2134, in MainLoop
  File "gui\__init__.pyc", line 1050, in Notify
  File "core.pyc", line 519, in run
  File "IAccessibleHandler\__init__.pyc", line 907, in pumpAll
  File "IAccessibleHandler\__init__.pyc", line 573, in processGenericWinEvent
  File "appModuleHandler.pyc", line 132, in update
  File "appModuleHandler.pyc", line 117, in getAppModuleFromProcessID
  File "appModuleHandler.pyc", line 95, in getAppNameFromProcessID
  File "importlib\__init__.pyc", line 127, in import_module
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 668, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 638, in _load_backward_compatible
  File "appModules\explorer.pyc", line 25, in <module>
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 668, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 638, in _load_backward_compatible
  File "NVDAObjects\window\edit.pyc", line 11, in <module>
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 668, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 638, in _load_backward_compatible
  File "oleTypes.pyc", line 13, in <module>
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 668, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 638, in _load_backward_compatible
  File "comtypes\persist.pyc", line 83, in <module>
  File "comtypes\__init__.pyc", line 241, in __new__
  File "comtypes\__init__.pyc", line 329, in __setattr__
  File "comtypes\__init__.pyc", line 698, in _make_methods
  File "comtypesMonkeyPatches.pyc", line 18, in new_WINFUNCTYPE
  File "garbageHandler.pyc", line 23, in __del__
  File "garbageHandler.pyc", line 63, in notifyObjectDeletion
WARNING - garbageHandler.notifyObjectDeletion (09:35:49.163) - MainThread (11396):
Deleting unreachable object <NVDAObjects.Dynamic_EditableTextWithAutoSelectDetectionEditorIa2WebIAccessible object at 0x09429070>
WARNING - garbageHandler.notifyObjectDeletion (09:35:49.163) - MainThread (11396):
Deleting unreachable object <POINTER(IAccessibleAction) ptr=0x7f4ffbc at 93bd670>
WARNING - garbageHandler.notifyObjectDeletion (09:35:49.163) - MainThread (11396):
Deleting unreachable object <POINTER(IAccessibleText) ptr=0x7f4ff74 at 95b6c60>
WARNING - garbageHandler.notifyObjectDeletion (09:35:49.163) - MainThread (11396):
Deleting unreachable object <POINTER(IAccessible2) ptr=0x7f500dc at 95b6d50>
WARNING - garbageHandler.notifyObjectDeletion (09:35:49.163) - MainThread (11396):
Deleting unreachable object <POINTER(IAccessibleHypertext) ptr=0x3e87234 at 95b6c10>
WARNING - garbageHandler.notifyObjectDeletion (09:35:49.163) - MainThread (11396):
Deleting unreachable object <NVDAObjects.IAccessible.ia2Web.Ia2Web object at 0x040E7D50>
WARNING - garbageHandler.notifyObjectDeletion (09:35:49.163) - MainThread (11396):
Deleting unreachable object <POINTER(IAccessibleAction) ptr=0x7ed1f6c at 95b67b0>
WARNING - garbageHandler.notifyObjectDeletion (09:35:49.163) - MainThread (11396):
Deleting unreachable object <POINTER(IAccessibleText) ptr=0x3e86394 at 95b6bc0>
WARNING - garbageHandler.notifyObjectDeletion (09:35:49.166) - MainThread (11396):
Deleting unreachable object <POINTER(IAccessible2) ptr=0x7ed208c at 95b6b70>
ERROR - garbageHandler._collectionCallback (09:35:49.166) - MainThread (11396):
Found at least 9 unreachable objects in run

@michaelDCurran
Copy link
Member Author

michaelDCurran commented Aug 18, 2020

@feerrenrut
Copy link
Member

feerrenrut commented Sep 1, 2020

As expected, there will be a number of issues detected by this code with lot of log entries. Found at least 96 unreachable objects in run using Firefox with Gmail: #11539

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants