-
-
Notifications
You must be signed in to change notification settings - Fork 30.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
importlib: module.__spec__ leaks importlib namespaces (test_importlib leaked xxx references) #84231
Comments
https://buildbot.python.org/all/#/builders/206/builds/119 test_importlib leaked [6303, 6299, 6303] references, sum=18905 Issue reported at: https://bugs.python.org/issue1635741#msg364845 It seems like the regression was introduced by the following change: commit 8334f30 (HEAD)
|
I can reproduce the leak with the following test, stored as Lib/test/test_leak.py: from test import support
import unittest
class Tests(unittest.TestCase):
def test_import_fresh(self):
support.import_fresh_module('importlib.machinery',
fresh=('importlib',),
blocked=('_frozen_importlib', '_frozen_importlib_external')) Extract of "./python -m test -R 3:3 test_leak" output: |
Before commit 8334f30, at the first _imp.create_builtin() calls, it calls _PyImport_FixupExtensionObject() which stores the created module in an internal "extensions" dictionary. PyInit__weakref() returns a module object. Next calls to _imp.create_builtin("_weakref") simply returned the cached _weakref module. At commit 8334f30, _imp.create_builtin() doesn't store it in the internal "extensions" dictionary, but calls PyModule_FromDefAndSpec() instead. PyInit__weakref() returns a module definition (PyModuleDef_Type: "moduledef"). Each _imp.create_builtin("_weakref") creates a new module. |
I can reproduce the leak with command: ./python -m test -R 3:3 test_leak File 1: import unittest
import sys
from importlib import _bootstrap as BOOTSTRAP
def _save_and_remove_module(name, orig_modules):
"""Helper function to save and remove a module from sys.modules
def _save_and_block_module(name, orig_modules):
"""Helper function to save and block a module in sys.modules
def import_fresh_module():
name = 'importlib3'
fresh = ('importlib',)
blocked = ('_frozen_importlib', '_frozen_importlib_external')
orig_modules = {}
names_to_remove = []
_save_and_remove_module(name, orig_modules)
try:
for fresh_name in fresh:
_save_and_remove_module(fresh_name, orig_modules)
for blocked_name in blocked:
if not _save_and_block_module(blocked_name, orig_modules):
names_to_remove.append(blocked_name)
with BOOTSTRAP._ModuleLockManager(name):
spec = BOOTSTRAP._find_spec(name, None)
module = BOOTSTRAP.module_from_spec(spec)
sys.modules[spec.name] = module
spec.loader.exec_module(module)
del sys.modules[spec.name]
class Tests(unittest.TestCase):
def test_import_fresh(self):
import_fresh_module() File 2: Lib/importlib3.py import _imp
import sys try: if case == 2:
_bootstrap = type(sys)("_bootstrap")
_bootstrap._weakref = sys.modules['_weakref']
class ModuleSpec:
@staticmethod
def method():
pass
spec = ModuleSpec()
spec.name = "_weakref"
module = _imp.create_builtin(spec)
module.__spec__ = spec
--- |
With the latest Lib/importlib3.py, there are less leaked references, but there are still leaked references: test_leak leaked [139, 139, 139] references, sum=417 |
Oops, you should read: File 1: Lib/test/test_leak.py |
Code extremely simplified: import unittest
import sys
import _imp
class ModuleSpec:
pass
class Tests(unittest.TestCase):
def test_import_fresh(self):
spec = ModuleSpec()
spec.sys_weakref = sys.modules["_weakref"]
spec.name = "_weakref"
module = _imp.create_builtin(spec)
module.__spec__ = spec
sys.modules["_weakref"] = module I still get a leak: I understand that sys.modules["_weakref"] is overriden by the test with a new module, and the new module holds a reference somehow to the previous _weakref module. The old and the new modules remain alive. |
Relationship between origin unmodified code and the extremely simplified code:
When importlib._bootstrap_external._setup() calls _bootstrap._builtin_from_name('_weakref'), _init_module_attrs() copies the "spec" into module.__spec__. But the spec contains a reference to importlib._bootstrap namespace. Before _weakref is converted to multiphase initialization: $ ./python
>>> import _weakref
>>> id(_weakref)
140119879580176
>>> id(_weakref.__spec__.__init__.__globals__['_weakref'])
140119879580176 => same module After the change: $ ./python
>>> import _weakref
>>> id(_weakref)
140312826159952
>>> id(_weakref.__spec__.__init__.__globals__['_weakref'])
140312826366288 => two different objects The problem is that module.__spec__ pulls the whole importlib package which contains tons of objects. |
Quick & dirty workaround: diff --git a/Lib/importlib/_bootstrap_external.py b/Lib/importlib/_bootstrap_external.py
index 7353bf9a78..d988552f2d 100644
--- a/Lib/importlib/_bootstrap_external.py
+++ b/Lib/importlib/_bootstrap_external.py
@@ -1620,7 +1620,10 @@ def _setup(_bootstrap_module):
setattr(self_module, '_thread', thread_module)
# Directly load the _weakref module (needed during bootstrap).
- weakref_module = _bootstrap._builtin_from_name('_weakref')
+ if '_weakref' not in sys.modules:
+ weakref_module = _bootstrap._builtin_from_name('_weakref')
+ else:
+ weakref_module = sys.modules['_weakref']
setattr(self_module, '_weakref', weakref_module)
# Directly load the winreg module (needed during bootstrap). But I think that the issue is larger than just _weakref.
|
That's because modules which don't use the multiphase initialization are cached in _PyImport_FixupExtensionObject(): see msg364914 for details. Next calls to _imp.create_builtin("_weakref") simply returned the cached _weakref module. It's loaded exactly once. |
From Victor:
You will have to be more specific than that as there is an import_state() context manager to control import state via the sys module.
Are you saying change everything on __spec__ objects to be static methods? That won't work because the whole point of __spec__ objects is to essentially be data classes to store details of a module.
No. You would break the world and undo years of work to clean up import semantics with no good way to go back to the old way. Can I ask why you suddenly want to throw __spec__ objects out due to a leak tied back to _weakref switching to multi-phase initialization? Also note that the use of weakrefs by importlib isn't tied to __spec__ objects but to per-module import locks in importlib._bootstrap. |
I close the issue. I pushed commit 83d46e0 which should not be controversial. In short, the fix is to remove two unused imports :-D The fix doesn't remove module.__spec__ nor avoid usage of _weakref, it only fix this issue (reference leak) by copying code from _bootstrap.py to _bootstrap_external.py. In fact, modules like _io were already correctly imported by _bootstrap_external._setup(). Only _thread, _weakref and winreg imports caused this bug. I don't think that it's worth it to backport the change to stable branches 3.7 and 3.8, since stable branches don't use multiphase initialization for _weakref. The two unused imports don't cause any harm in importlib._bootstrap_external.
I was thinking just aloud to try to find a solution for this reference leak. |
Since this reference leak is fixed, I reapplied Hai Shi's change for _weakref in bpo-1635741: New changeset 93460d0 by Victor Stinner in branch 'master': |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: