test.regrtest has way too many imports #85884
Follow-up of bpo-40275.

While investigating a crash on AIX (bpo-40068), I noticed that test_threading crashed because the test imports the logging module, and logging has a bug on AIX on fork.

I created an issue to reduce the number of imports made by "import test.support": https://bugs.python.org/issue40275

Thanks to the hard work of Hai Shi, "import test.support" now imports only 37 modules instead of 171! He split the 3200 lines of Lib/test/support/__init__.py into new helper submodules: bytecode, import, threading, socket, etc. For example, TESTFN now comes from test.support.os_helper.

I measured the number of imports done in practice using the following file, Lib/test/test_sys_modules.py:

    import unittest
    from test import support
    import sys

    class Tests(unittest.TestCase):
        def test_bug(self):
            modules = sorted(sys.modules)
            print("sys.modules:")
            print("")
            import pprint
            pprint.pprint(modules)
            print("")
            print("len(sys.modules):", len(modules))

    def test_main():
        support.run_unittest(Tests)

    if __name__ == "__main__":
        test_main()

master:
3.9:
3.5:
2.7:
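For anyone who wants to reproduce this kind of measurement without going through regrtest, here is a minimal sketch. It runs a statement in a fresh interpreter and reports len(sys.modules). The helper name count_modules_after() is made up for this example, and the numbers it prints depend on the Python version and platform, so they will not match the figures quoted in this issue.

```python
import subprocess
import sys


def count_modules_after(statement):
    """Run `statement` in a fresh interpreter and return len(sys.modules)."""
    # The child imports sys itself, so the count is taken inside the child
    # process and is not affected by what the parent has already imported.
    code = f"{statement}\nimport sys\nprint(len(sys.modules))"
    result = subprocess.run(
        # -I (isolated mode) ignores PYTHON* environment variables
        # and the user site directory.
        [sys.executable, "-I", "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    for stmt in ("pass", "import unittest", "import asyncio"):
        print(f"{stmt!r}: {count_modules_after(stmt)} modules")
```

Using a subprocess matters here: measuring in the current process would count whatever the test runner itself has already imported.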
|
If I hack test.libregrtest.runtest to not import test.libregrtest.save_env, test_sys_modules imports only 148 instead of 233 modules. |
In general, it's nice to have the following 4 checks:
The problem is that because of these checks, **any** unit test file of the 424 Python test files imports asyncio, multiprocessing and urllib. As a result, **any** unit test starts with around 233 imported modules. We are far from a "unit" test, since many modules have side effects. I wrote PR 22089 to remove these checks. "import test.libregrtest" is reduced from 233 to only 149 imports (on Linux), which is way better. |
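A rough way to check whether a given import drags in one of these large packages is sketched below. The helper name heavy_modules_loaded() and the WATCHED tuple are illustrative only; they are not part of libregrtest. The statement is run in a child interpreter so that the result does not depend on what the calling process has loaded.

```python
import subprocess
import sys

# Packages named in the comment above; adjust as needed.
WATCHED = ("asyncio", "multiprocessing", "urllib")


def heavy_modules_loaded(statement, watched=WATCHED):
    """Return the names in `watched` present in sys.modules after `statement`."""
    code = (
        f"{statement}\n"
        "import sys\n"
        f"watched = {list(watched)!r}\n"
        "print('\\n'.join(name for name in watched if name in sys.modules))\n"
    )
    result = subprocess.run(
        [sys.executable, "-I", "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    # One module name per output line; empty output means none were loaded.
    return result.stdout.split()


if __name__ == "__main__":
    for stmt in ("pass", "import unittest", "import asyncio"):
        print(f"{stmt!r}: {heavy_modules_loaded(stmt)}")
```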
On Windows with current master, the baseline for running anything with 1 import (">>> import sys; len(sys.modules)") is 35 imported modules. Adding "import unittest" increases this to 80. What slightly puzzles me is that running

    import unittest
    import sys

    class Tests(unittest.TestCase):
        def test_bug(self):
            print("len(sys.modules):", len(sys.modules))

    if __name__ == "__main__":
        unittest.main()

increases the number to 90. Perhaps unittest has delayed imports.

The current startup number for IDLE is 162, which can result in a cold startup of several seconds. I am thinking of trying to reduce this by delaying imports of modules that are not immediately used and might never be used. For tests, I gather that side-effect issues are more important than startup time. |
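As an illustration of what delaying an import can look like, the sketch below moves the import statement into the function that needs it. The function name and the choice of colorsys are arbitrary examples and are not taken from IDLE.

```python
import sys


def rgb_to_hls_lazy(r, g, b):
    """Convert RGB to HLS, importing colorsys only when first called."""
    # The import runs on the first call, not when this module is loaded,
    # so code paths that never call this function never pay for it.
    import colorsys
    return colorsys.rgb_to_hls(r, g, b)


if __name__ == "__main__":
    print("loaded before call:", "colorsys" in sys.modules)
    print(rgb_to_hls_lazy(1.0, 0.0, 0.0))
    print("loaded after call:", "colorsys" in sys.modules)
```

After the first call the module is cached in sys.modules, so later calls only pay for a dictionary lookup.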
You could save/restore this data only when the corresponding modules were imported, like it was done in clear_caches() in refleak.py. For example:

    # Same for Process objects
    def get_multiprocessing_process__dangling(self):
        multiprocessing_process = sys.modules.get('multiprocessing.process')
        if not multiprocessing_process:
            return set()
        # Unjoined process objects can survive after process exits
        multiprocessing_process._cleanup()
        # This copies the weakrefs without making any strong reference
        return multiprocessing_process._dangling.copy()

    def restore_multiprocessing_process__dangling(self, saved):
        multiprocessing_process = sys.modules.get('multiprocessing.process')
        if not multiprocessing_process:
            return
        multiprocessing_process._dangling.clear()
        multiprocessing_process._dangling.update(saved)
|
It was my first idea, but some large modules like multiprocessing and asyncio are only imported by tests when the test file is imported, whereas save_environment() is called (enter) before the import in libregrtest.runtest. |
|
Serhiy: "You could save/restore this data only when the corresponding modules were imported, like it was done in clear_caches() in refleak.py." That's a very good idea! I implemented it in PR 24934. But I modified runtest() to use *two* saved_test_environment instances: one before the test module is imported, one after. The first is needed to check if the import itself has side effects, for example if the module body has side effects. The second checks if running the tests has side effects, and it is more likely to find the relevant modules already imported. The first one may miss some bugs, but IMO it's an acceptable trade-off. |
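To make the two-phase idea concrete, here is a simplified sketch. It is not the code from PR 24934: the names ModuleNotLoaded, check_environment() and run_test() are invented for the example, and the decimal context precision merely stands in for the real resources tracked by libregrtest. Each getter looks its module up in sys.modules and is skipped when the module has not been imported yet.

```python
import contextlib
import sys


class ModuleNotLoaded(Exception):
    """Raised by a getter when its module has not been imported yet."""


def get_decimal_precision():
    # Look the module up in sys.modules instead of importing it,
    # so that the check itself never triggers an import.
    decimal = sys.modules.get("decimal")
    if decimal is None:
        raise ModuleNotLoaded("decimal")
    return decimal.getcontext().prec


@contextlib.contextmanager
def check_environment(phase, getters, changes):
    """Append (phase, name) to `changes` for each resource the block altered."""
    saved = {}
    for name, getter in getters.items():
        try:
            saved[name] = getter()
        except ModuleNotLoaded:
            continue  # module not imported yet: nothing to save or compare
    try:
        yield
    finally:
        # Only resources saved on entry are compared on exit.
        for name, original in saved.items():
            if getters[name]() != original:
                changes.append((phase, name))


def run_test(import_test, run_tests):
    """Check side effects of the import and of the test run separately."""
    getters = {"decimal.precision": get_decimal_precision}
    changes = []
    with check_environment("import", getters, changes):
        module = import_test()
    with check_environment("run", getters, changes):
        run_tests(module)
    return changes
```

This also shows the trade-off mentioned above: if a module is first imported inside the "import" phase, its state was never saved, so a side effect of that import goes unnoticed by the first check.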
len(sys.modules) of msg376374 test_sys_modules:
The master branch imports 102 fewer modules than Python 3.9 (233 => 131)! Almost half. asyncio, logging, multiprocessing, etc. are no longer always imported by default. |
Ok, the most important changes have been merged. Thanks everyone who helped on this large project! See also my summary email on python-dev: |