-
-
Notifications
You must be signed in to change notification settings - Fork 31.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Drop support for 15-bit PyLong digits? #89732
Comments
Looking at issue bpo-35037 (which is a compatibility issue having to do with PYLONG_BITS_IN_DIGIT), I'm wondering whether it would make sense to drop support for 15-bit PyLong digits altogether. This would simplify some of the code, eliminate a configuration option, and eliminate the scope for ABI mismatches like that occurring in bpo-35037. There were a couple of reasons that we kept support for 15-bit digits when 30-bit digits were introduced, back in bpo-4258.
Now that we can safely rely on C99 support on all platforms we care about, the existence of a 64-bit integer type isn't an issue (and indeed, we already rely on the existence of such a type in dtoa.c and elsewhere in the codebase). As to performance, I doubt that there are still platforms where using 15-bit digits gives a performance advantage, but I admit I haven't checked. |
+1 "in theory". But I too don't know whether any platforms would be adversely affected, or how to find out :-( |
How about: create a fake test file test/test_xintperf with test case and test method(s) that run timeit with a suite of int operations and print report. Create 'draft' PRs for main and 3.10 (and 3.9?). Run just this test on buildbots. (I believe I have seen this done.) Improvement: write script to find buildbot results and consolidate. Is there already a template for something like this? If not, make one. |
I posted a request for information on usage of 15-bit digits to python-dev: https://mail.python.org/archives/list/python-dev@python.org/thread/ZICIMX5VFCX4IOFH5NUPVHCUJCQ4Q7QM/ |
Terry:
Sounds reasonable, though I'm not sure I know what exact timings I'd want to try. Maybe some of the stock integer-heavy Python benchmarks (pidigits, etc.). I realised that I have no idea whether any of the buildbots actually use 15-bit digits. I've created PR #74492 to find out. |
Results: we have two Gentoo/x86 buildbots, and a 32-bit Windows build in GitHub Actions: those machines use 15-bit digits, as a result of the logic in pyport.h that chooses 15-bit digits if SIZEOF_VOID_P < 8. Everything else seems to be using 30-bit digits. GPS pointed out in the python-dev discussion that the other platform we should be thinking about is 32-bit ARM. |
First step in #74682, which changes the default to 30-bit digits unconditionally, instead of having the default be platform dependent. |
Adding Stefan Behnel to the nosy, since Cython is one of the few projects that might be directly affected by this change. Stefan: can you see any potential problems with changing the default to 30 here? |
Cython should be happy with whatever CPython uses (as long as CPython's header files agree with CPython's build ;-) ). I saw the RasPi benchmarks on the ML. That would have been my suggested trial platform as well. The results look ok. Maybe the slowdown for pickling is really the increased data size of integers. And it's visible that some compute-heavily benchmarks like pyaes did get a little slower. I doubt that they represent a real use case on such a platform, though. Doing any kind of number crunching on a RasPi without NumPy would appear like a rather strange adventure. That said, if we decide to keep 15-bit digits in the end, I wonder if "SIZEOF_VOID_P" is the right decision point. It seems more of a "has reasonably fast 64-bit multiply or not" kind of decision – however that translates into code. I'm sure there are 32-bit platforms that would actually benefit from 30-bit digits today. If we find a platform that would be fine with 30-bits but lacks a fast 64-bit multiply, then we could still try to add a platform specific value size check for smaller numbers. Since those are common case, branch prediction might help us more often than not. But then, I wonder how much complexity this is even worth, given that the goal is to reduce the complexity. Platform maintainers can still decide to configure the digit size externally for the time being, if it makes a difference for them. Maybe switching off 15-bits by default is just good enough for the next couple of years to come. :) |
Thanks, Stefan. I think I'm going to go ahead with the first step of making 30-bit digits the default, then, but leaving the 15-bit digit option present.
Agreed. And most platforms we care about _do_ seem to have such an instruction, so "30-bit digits unless the builder explicitly indicates otherwise - e.g., via configure options or pyconfig.h edits" seems reasonable. My other worry is division. It's less important than multiplication in the sense that I'd expect division operations to be rarer than multiplications in typical code, but the potential impact for code that _does_ make heavy use of division is greater. With 30-bit digits, all the longobject.c source actually *needs* is a 64-bit-by-32-bit unsigned division for cases where the result is guaranteed to fit in a uint32_t. If we're on x86, there's an instruction for that (DIVL), so you'd think that we'd be fine. But without using inline assembly, it seems impossible to persuade current versions of either of GCC or Clang[*] to generate that DIVL instruction - instead, they both want to do a 64-bit-by-64-bit division, and on x86 that involves making a call to a dedicated __udivti3 intrinsic, which is potentially multiple times slower than a simple DIVL. The division problem affects x64 as well: GCC and Clang currently generate a DIVQ instruction when all we need is a DIVL.
Yes, I think that could work, both for multiplication and division. [*] Visual Studio 2019 _does_ apparently provide a _udiv64 intrinsic, which we should possibly be attempting to use: https://docs.microsoft.com/en-us/cpp/intrinsics/udiv64?view=msvc-170 |
I'd like to flip this around and make sure that digits are always half words. This is desirable because it gives us some nice invariants with more efficient integers, #101291:
I'm assume the primary motivation for this change is to simplify the code, not performance. |
Having have 30 bit digits on 32 bits platforms is awkward for #101291 as the compact ints have only 29 bits. |
@markshannon Sounds reasonable to me. I think the recent change in direction in |
Based on the last comment from @mdickinson, it seems to me we can now close this issue. Correct me if I'm wrong. |
@erlend-aasland, yes, I too believe this issue has in fact been resolved. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: