-
-
Notifications
You must be signed in to change notification settings - Fork 31.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
locale can be imported at startup but relies on too many library modules #53757
Comments
Consider the two following commands, from an SVN working copy: $ ./python -c 'import heapq; print(heapq.heapify)'
<built-in function heapify>
$ cat | ./python -c 'import heapq; print(heapq.heapify)'
<function heapify at 0x7f5d456025a0> In the second case, the "from _heapq import *" in heapq fails silently. The reason was a bit difficult to find, but it goes as following:
The issue might seem rather exotic, but it is a more general symptom of the problem that importing Lib/locale.py when initializing std{in,out,err} brings too many import dependencies. A possible solution is to define a subset of locale.py that is used for bootstrap and rely on as little modules as possible. |
Attached patch fixes the issue by creating a separate "_bootlocale" module, used at bootstrap, which doesn't import collections, functools and friends. |
Updated patch also replaces imports of locale in site.py (since it can be imported early on startup). |
Chances are that the patch will also fix the test_heapq failures on some buildbots (see e.g. http://www.python.org/dev/buildbot/3.x/builders/x86%20Tiger%203.x/builds/759/steps/test/logs/stdio ). |
Antoine Pitrou wrote:
I don't think that's the right way forward. It's much easier to either remove the need to import those Both collections and functools are used in two distinct places. It would be worthwhile adding a note to the top of the module |
The latter has various issues, such as being overly tedious and The former is easier said than done, and we may end up writing poor
Use of non-builtin stdlib modules is legitimate for many functions which (note, this is already done elsewhere, for example _abcoll.py or |
Antoine Pitrou wrote:
Really ? collections is only used in format_string and can easily
If you look at the code, that decorator is not needed at all. Furthermore
Please be more specific. The modules that have an impact
I said "warn about the use of non-builtin modules", not disallow
Those are special cases since they are needed for ABCs. locale |
The purpose of the exercise is to avoid, as much as possible, the You are proposing that we commit a one-time fix to fix the current
"special enough" is in the eye of the beholder. |
Just to comment on the import lock issue when importing from within a function, it is a problem. If an import triggers the launching of a thread which calls these functions that have an import inside of them, deadlock will occur. This has been reported as an issue before so people do it (unfortunately). Importing within functions is something to be avoided when possible. |
The problem only exists for developers, not for an installation copy of Python? Another approach is to call required site code earlier (eg. rewrite it in C and execute it before loading the locale module). |
This particular problem indeed (for developers and for buildbots - see
Indeed, but since it calls sysconfig.get_platform(), I'm not sure how |
Oh, the function is prefixed by the following comment: # XXX This should not be part of site.py, since it is needed even when
# using the -S option for Python. See http://www.python.org/sf/586680
def addbuilddir(): Issue bpo-586680 was closed as wontfix. -- Oh yes, sysconfig.get_platform() is complex :-/ Brett wrote "If we think once can reliably add the directory based purely on whether it starts with "build/lib.", and ..." (msg73411). |
Antoine Pitrou wrote:
No, what I'm proposing is to make "import locale" safe during The changes I'm suggesting will be beneficial to all |
I do, and my experimentations show it. Believe me, variations of this problem have been bothering us often "Trying to be careful with imports in a large stdlib module" doesn't cut (you might ask why this problem hasn't affected 2.x, and that's because
It doesn't. When sitecustomize.py gets run, everything is already set up and there's
Standard uses of the module aren't problematic at all, and importing |
As we move more and more infrastructure into Python code, we're going to see this pattern (i.e. a bootstrap module underlying the real module) more and more often (e.g. I seem to recall Brett has to do something similar when playing with the pure Python __import__ implementation). I don't have an issue with it - it's a solid, standard solution to a recurring problem with otherwise circular dependencies. The only changes I would suggest to Antoine's patch are to explicitly scope "interpreter startup" in the _bootlocale docstring (to make it clear that sitecustomize.py should use the ordinary locale module) and to mention the nature of _bootlocale in a comment just before the "from _bootlocale import *" line in locale.py. |
Antoine fixed bpo-9589 by rewriting site.py code in C and calling it more much earlier: r83988. This commit fixes the initial problem of this issue: $ ./python -c 'import heapq; print(heapq.heapify)'
<built-in function heapify>
$ cat | ./python -c 'import heapq; print(heapq.heapify)'
<built-in function heapify> Can we close this issue, or do you consider that it is still very important to not load too much modules at startup? |
The patch eases the bootstrap constraints by creating a It also depends whether it is important to have a lightweight |
Nick Coghlan wrote:
I don't object to such strategies in general, but in this case, The locale module already uses a C helper module. By now adding |
It looks like only locale.getpreferredencoding() is needed to initialize standard streams. Another option is to add a locale encoding codec. I already proposed the idea in bpo-13619 but it was rejected. |
Updated patch for 3.4. |
+1 This seems like a reasonable solution. |
The io module doesn't need to set temporarly the LC_CTYPE locale (which is a good thing because the change is process-wide!). If we ignore systems where CODESET is not available, the _bootlocale can be simplified to a few lines: if sys.platform.startswith("win"):
def getpreferredencoding():
import _locale
return _locale._getdefaultlocale()[1]
else:
def getpreferredencoding():
result = nl_langinfo(CODESET)
if not result and sys.platform == 'darwin':
result = 'UTF-8'
return result This code can probably be implemented in C, directly in the _locale module. Would it be acceptable to modify the io module to replace locale.getpreferredencoding(False) with _locale.getpreferredencoding(False)? Does anyone know if Python does still support systems where CODESET is not available? Which OS does not support CODESET? Would it be acceptable to fallback to locale.py if CODESET is not available? |
The locale module uses only collections.abc.Mapping. The import of the entire collections module can be avoided if collections.abc is renamed to _abcoll again. functools.wrap() can be replaced with: localeconv.__doc__ = _localeconv.__doc__ The other attributes are either equal (e.g. __name__) or do not exist on builtin functions (dict). |
Here is the patch implementing my proposition: bootlocale3.patch. |
"Does anyone know if Python does still support systems where CODESET is not available? Which OS does not support CODESET?" I checked my VMs with Python, nl_langinfo(CODESET) works on:
I also tested my patch on Windows 7: _bootlocale.getpreferredencoding() works as expected (it returns "cp1252"). |
You could raise an error and wait until somebody files a complain. :) |
New changeset fbbf8b160e8d by Antoine Pitrou in branch 'default': |
Just a quick favour to ask people: please post benchmark numbers of startup_nosite and normal_startup with your patches, otherwise we are taking stabs in the dark that the code complexity being suggested is worth it. |
Here normal_startup and startup_nosite wouldn't show a difference (under Linux anyway) because the locale module is only imported for non-interactive streams, AFAICT. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: