-
-
Notifications
You must be signed in to change notification settings - Fork 30.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PyThread assumes pthread_key_t is an integer, which is against POSIX #69844
Comments
While trying to port Python over to a new platform (CloudABI), I noticed a couple of compiler errors in PyThread_create_key(), PyThread_delete_key(), PyThread_delete_key_value() and PyThread_set_key_value() caused by fact that pthread_key_t is converted to an integer (and vice versa) POSIX doesn't seem to require that pthread_key_t is an integer or any other arithmetic type: http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/sys_types.h.html Old revisions of the standard did require pthread_t to be an arithmetic type, but this requirement was dropped later on. In my opinion we should strongly consider changing the API, so that we can treat the key created by pthread_key_create() or returned by TlsAlloc() as an opaque type. |
I agree--this has the same problem on Cygwin, where pthread_key_t is not just a typedef'd integer (in fact it's a pointer to an instance of a class). Anyways as Ed wrote above POSIX says this is supposed to be an opaque type and there's no reason to assume it's an integer. Linux could change its definition tomorrow and we'd have no one to blame but ourselves if it breaks Python. If it's too hard to change the Python API, at the very least the change in bpo-22206 should be reverted or reworked somehow, because there's no reason to assume that pthread_key_t can even be compared safely to an integer, much less that it would be less than INT_MAX. |
What do you suggest? Python C code bases uses "long" everywhere. |
I'm not really sure what "long" has to do with it... The problem is that the PyThread API uses ints to represent TLS keys, and has for about as long as it's existed (maybe what you meant by "long"). But when support for native TLS was added (bpo-9786 for pthread, sometime earlier for Windows) , the faulty assumption as made in several places that this API (i.e. the type of key is "int") should always map perfectly onto native APIs, and it doesn't. There are several places for example where an int key is passed to pthread_getspecific and pthread_setspecific (http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_getspecific.html). On Linux the compiler happens to allow this because pthread_key_t is defined as "unsigned int" So yeah there's an implicit cast, but since it counts up from zero it usually works. Likewise TlsAlloc on Windows expects the key to be a DWORD but the compiler will accept an int. This is really an unsafe assumption though, especially when PyThread_create_key casts the key to an int, and later reuses the (possibly not safely cast) key with PyThread_delete_key/get_key_value/set_key_value). This was brought up at the time, where MvL wrote:
One possible workaround without changing the existing API would be this: Each native support wrapper should also provide a *safe* mapping between its native key types and ints, to support the PyThread API. For example, the pthread interface could maintain a linked list or an even an array of pthread_key_t pointers, and use the int "key" as the index into that list. If I understand correctly this should be basically harmless since the same key (and hence key -> native-key mapping) can be shared across threads. |
(Of course, maintaining such a list might take some care, but only when creating and deleting keys--it wouldn't add any overhead to using them to get/set values.) |
I'd say that sounds reasonable, but most likely it will only be someone working with one of the impacted platforms who will have the motivation to come up with a patch. Especially since neither of the impacted platforms is current formally supported (meaning, we don't yet have anyone on the core team working with those platforms). We have, it seems, arrived at the time that MvL foresaw.... :) |
The good news about this (in the pthread case) is that it does not need to be seen as some workaround for unusual platforms, but rather making the existing code more POSIX-compliant (and hence correct). The Win32 side I'm a little less worried about because the TLS key is documented to be a DWORD, and so the existing casting to int should still work fine most of the time. However, the docs for TlsAlloc() (https://msdn.microsoft.com/en-us/library/windows/desktop/ms686801(v=vs.85).aspx) do state:
which again makes bpo-22206 suspect :( |
FWIW I've created an initial patch for this. Seems to work fine, but it's a bit of a mess and I have a few small implementation concerns. I'll try to get it cleaned up sometime in the next few days and I'll post it. |
Here's a first stab at a patch for this. A linked list is maintained which maps pthread_key_t instances to an int "key_id" that is used for the PyThread API. Each function needs to map a given key_id to the actual pthread_key_t. This adds a little bit of overhead of course, but in typical usage I don't think there are many entries in this list to loop over. This also reverts the change from bpo-22206 which is no longer relevant. No synchronization is provided for PyThread_create_key and PyThread_delete_key, but in typical usage that would be the user's responsibility anyways. Otherwise every PyThread_*_key function would have to have a mutex around it. |
-1 for your change bpo-25658-1.patch. It adds a O(n) slowdown to PyThread_set_key_value() whereas the performance of this function matters. Moreover, the code currently works fine on Linux, I fail to see why we should make Python slower to support a new platform. The PEP-11 explains the criteria to support a new platform in Python. Before we start to officially support CloudABI, you should start to work on a fork. When you will get enough users and have a better view of all requires changes, you can come back with a full plan to support officially this platform. |
Hi, First of all, I should be clear this is not just about CloudABI. In fact I don't even know what CloudABI is. I'm personally working toward getting a working/stable buildbot for Cygwin, so that I can move Python toward supporting it again. Cygwin has not been officially unsupported, but support for it is broken due to lack of a buildbot and lack of a maintainer, both of which I'm willing to offer. I can't get a stable buildbot though until some issues with the build are resolved, and this is a major blocker. If you prefer I can also work on a more formal plan to clarify Python's support for Cygwin which is currently in limbo. Not saying a solution to this needs to be merged ASAP as I can work on a branch, but in the meantime it's nice to have an initial fix for the issue so that it can be worked around--I'm sure other platforms that have this issue, such as the CloudABI folks, will appreciate that as well. Second of all, the point must be made that I'm not suggesting this be changed just in order to support some specific platform. The fact remains that the existing code is misusing the pthread API and is fragile to begin with. "It works on Linux" is not an excuse--not only does it suggest a bias, but the fact that it even works on Linux is an accident of implementation. The implementation of the pthread_key_t type could change tomorrow, and it would be Python's fault if that breaks things (not that I think that's at all likely to happen, though I'd be less surprised if OSX suddenly changed :) As for the performance I agree that PyThread_(get|set)key_value should be fast. This may be O(n) but how big, typically, is n? In a normal run of the Python interpreter n=1, and the autoTLSkey is always the first in the list. It is only larger if some third-party code is using the PyThread API for TLS (and in most typical uses this will only be a few keys at the most, though I couldn't even find any examples in the wild where third-party code is using these functions). n could also grow on forks, or manual Py_Finialize/Py_Initialize. This patch didn't implement PyThread_ReInitTLS but it could, in which case it would reset next_key_id to 0 for each child process. Likewise, PyThread_ReInitTLS could also be called from Py_Initialize, which should be harmless. So TL;DR I'm not really convinced by the performance argument. But even if that were an issue other implementations are possible; this was just among the simplest and least wasteful. |
I think that PEP-11 also doesn't rule out changes in this area. For example, consider the paragraph starting with the sentence "This policy does not disqualify supporting other platforms indirectly. [...]" Attached is a patch that accomplishes the same with a constant running time for operations on keys by using an array. Allocation of new keys runs in expected constant time by using a randomised allocation pattern. If we're not big fans of using randomised access, changing this to a linear scan would already be an improvement: keys are only allocated infrequently, but hopefully accessed quite often. |
Ah, I wasn't thinking clearly toward the bottom of my last message. I see now that after a fork, _PyGILState_Reinit calls PyThread_delete_key followed by a new PyThread_create_key--in the current version of my patch that would result in putting the autoTLSkey at the end of the linked list, which is no good. That could be worked around, but... Ed's version looks good to me. I had the same idea as an alternative, though was a little concerned with the possibility that the array could grow too large. But as I wrote in my last message that would be an extreme case. And regardless his version will at least maintain constant time, so +1. |
Hi, I came from bpo-28656. |
It can also be a structure or a union. |
Umm, API seems a design that is passing into function by integer or pointer because the users don't need type detail. I think the implementation of complex type is not realistic. Actually, CouldABI and Cygwin are used pointer, and type detail is hided. |
CloudABI uses a structure type, which is exactly why I filed this bug report. Instead of speculating about what kind of type existing implementations use, please just focus on the specification to which we're trying to stick: POSIX. http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/sys_types.h.html "All of the types shall be defined as arithmetic types of an appropriate length, with the following exceptions: [...] |
On, I got complex type in cloudABI. I see my patch doesn't solve it. https://github.com/NuxiNL/cloudlibc/blob/master/src/include/sys/types.h#L94 |
I wrote a patch to avoid compile error for platforms that pthread_key_t is not integer. And, I would propose new thread local storage API based on C11 thread and current TLS functions move to deprecated. C11 tss_t doesn't require defined as integer [1]. Therefore I think new API should use tss_t, not hide into integer. [1] http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf (page 394) I'm thinking of new interfaces. For example, declaration in Include/pythread.h /* Specialise to each platforms using #if directive */ /* Based on C11 threads.h, but destructor doesn't support.
the delete value function is maintained for the implementation by CPython self.
*/
PyAPI_FUNC(int) PyThread_tss_create(Py_tss_t *);
PyAPI_FUNC(void) PyThread_tss_delete(Py_tss_t);
PyAPI_FUNC(void *) PyThread_tss_get(Py_tss_t);
PyAPI_FUNC(int) PyThread_tss_set(Py_tss_t, void *);
PyAPI_FUNC(void) PyThread_tss_delete_value(Py_tss_t); |
To me, Masayuki's patch is an acceptable work-around for the time being. I don't *like* it because the fact remains that native TLS can work on these platforms and falling back on Python's slower implementation isn't necessary. That said, the previous patch attempts also add additional overhead that shouldn't be necessary. I agree the best thing to do would be to change this API, but that is a bigger issue I guess... |
Looks good to me! |
Elik, Ed, I have overlooked tracemalloc module raises deadlock if apply the patch. /* If your OS does not provide native thread local storage, you can implement
it manually using a lock. Functions of thread.c cannot be used because
they use PyMem_RawMalloc() which leads to a reentrant call. */
#if !(defined(_POSIX_THREADS) || defined(NT_THREADS))
# error "need native thread local storage (TLS)"
#endif Py_HAVE_NATIVE_TLS is used only on thread source code. Therefore C Compiler couldn't report error about tracemalloc. I'm sorry that I didn't check test. |
I wrote a patch based on msg281227's idea. I confirmed to pass related tests (test_capi, test_threading and test_tracemalloc) on Cygwin x86 and Ubuntu 16.04 x86. This patch adds to change CPython about:
|
I'm still pretty happy with the previous patch, personally, since I don't need the tracemalloc module. But it seems like that should be fixed (or if nothing else that code in _tracemalloc.c should check Py_HAVE_NATIVE_TLS too). I like the idea of the new PyThread_tss_ API. At first I wasn't sure because I thought you implied that it would use tss_t and related APIs from C11 which was going to be a non-starter (as CPython has only just barely started using *some* features from C99, per the recent update to PEP-7). But I see in your patch that the naming is only inspired by C11 (and could be consistent with it if CPython ever moves toward C11 support). I imagine this will likely require a PEP though? I would happy to help draft one. |
After Erik posted PEP-539 draft, we've discussed features of new API on the Python-ideas [*]. As the result of discussion, features of new API have been changed below points. Thus, I updated patch based on the result.
[*] https://mail.python.org/pipermail/python-ideas/2016-December/043983.html |
Thanks Masayuki for the updated patch, and especially for the additional test cases. Looking at the patch, it occurs to me that this solution seems to be the minimal needed to fix the issue that we were originally trying to fix, without adding too much additional complexity or new semantics to how TLS keys are used in the interpreter. In other words, this preserves the existing usage with minimal changes except to better support opaque key types. So I think it's a point in favor of this change that's managed to remain focused. I'll work on updating the PEP draft to reflect the additions. |
I commented at the Rietveld, since I think that patch needs a bit more modification.
|
Above said, I updated minor changes to the version 2 patch. Several codes have kept the words "thread local" and "TLS" because they have pointed programming method or other meanings, not CPython TLS API itself. (e.g. _decimal module) |
Hi people working on the new TLS API: I would like your opinion on a related API, issue bpo-29881: Add a new private API for "static C variables" (_PyStaticVar) to clear them at exit ! |
Victor, |
Noting a design consideration that I only picked up in the latest PR review: when exposed as part of a non-opaque struct, the type used for TSS keys becomes part of Python's ABI, which means the API design in the PEP is going to have to handle making the struct fully opaque when Py_LIMITED_API is defined. My proposal for handling that:
My main rationale for going to that level of effort is that the current thread local storage API is available when Py_LIMITED_API is defined, so its replacement should be available as well. However, exposing NATIVE_TSS_KEY_T as part of the stable ABI would prevent us from switching from If issue bpo-29881 ever became a public API, it would then be useful for managing the lifecycle of dynamically allocated TSS keys in extension modules that restrict themselves to the stable ABI. |
Hi, I attempted Nick's proposal and removed unused codes from TLS implementation (bpo-30279, bpo-30832). This change looks good to me (PR 1362).
These functions have handled memory deallocation to key-value storage on the own implementation, but above said, currently the own implementation is gone. It means:
We (or at least I) won't get back to the own implementation, therefore, I suggest to omit these functions for new TSS API. See proposed updates ma8ma#1 |
Nice :) With the legacy code cleanups merged, I'd say the next step would be to update the PEP with the simplified API and the explanation for why the removed functions are no longer needed (i.e. we're making native TSS support a hard dependency for 3.7+). I'm also thinking we may want to update PEP-11, akin to the entry for the legacy ASCII-only C locale: https://www.python.org/dev/peps/pep-0011/#legacy-c-locale I'll handle the PEP-11 update after PEP-539 is approved, the PEP itself would just need a new Platform Support Changes section like the one in PEP-538: https://www.python.org/dev/peps/pep-0538/#platform-support-changes |
Nick and Erik, thank you for the comments! I merged proposal into the PR. Well, I'm interested in the hide implementation detail for TSS API (lately, I've read the python-ideas thread "PEP: Hide implementation details in the C API" which Victor opened [1]). Present, the draft of new API has given two methods for thread key initialization for the non-limited API (i.e. Py_tss_NEEDS_INIT for statically, PyThread_tss_alloc for dynamic). The static initialization needs implementation detail for thread key, but Py_tss_t is designed as an opaque data type based on the stable ABI and we don't feel like to open the implementation detail to the API client. On the other hand, we'd provide newly thread key (de-)allocation functions for the limited API, the API client is able to get an initialized value without knowing thread key detail. And also the functions can be used on the non-limited API. Therefore, I think it makes more sense that all API clients use PyThread_tss_(alloc|free) regardless of the limited API. The reason is which are (1) Py_tss_t behaves as an opaque data type as expected for any API client (cannot read and write directly the fields in any case), (2) the API gets more consistency (just one method for key initialization on the API client). TL;DR: I'd suggest to make key type strict, what do you think about below changes?
[1] https://mail.python.org/pipermail/python-ideas/2017-July/046399.html |
oh, I found a mistake.
|
Since previous comment, I've studied the switch for show/hide implementation detail. As the result, I have understood the Py_BUILD_CORE macro hasn't been generally used for hiding implementation detail (and Py_LIMITED_API does the part). |
FYI, PEP-539 was accepted (see python-dev threads [1] [2]). [1] https://mail.python.org/pipermail/python-dev/2017-August/149091.html |
This has been merged now: 731e189 Thank you for the PEP & implementation! :) |
Would it be possible to define Py_tss_NEEDS_INIT as a constant variable instead of a static initialiser? That would enable its use also for non-static initialisations. |
Stefan: I'd expect so, but that's best covered by a new RFE and an associated PR (I'm not sure what you mean from the text description, but I assume it will be obvious as C code) |
It seems that there's a simpler way that uses a cast on the literal. I added a pull request. Looks simple enough to not merit its own ticket, I think. |
Seems like this isn't trivial, so I created a new ticket for this. See bpo-31828. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: