gh-109653: Fix py312 regression in the import time of random #110221

Merged
merged 2 commits into from
Oct 2, 2023

Conversation

AlexWaygood
Member

@AlexWaygood AlexWaygood commented Oct 2, 2023

As an optimisation to reduce the import time of the module, random first tries to import sha512 from the internal _sha512 module before falling back to hashlib. The problem, however, is that Python no longer has a _sha512 module! It was removed in 0b13575, by @gpshead. That means we're currently always falling back to the slow path in random.py, leading to the import time of random being far slower than it should be.
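The fast-path/fallback pattern described above can be sketched roughly as follows. This is an illustration, not the exact diff in this PR: `_sha2` is, to my understanding, the internal module that provides `sha512` in CPython 3.12 and later, `_sha512` is the older name, and the helper `load_sha512` is hypothetical.

```python
import importlib


def load_sha512():
    """Return (sha512 constructor, name of the module it came from).

    Tries lean internal modules first and only falls back to hashlib,
    which is heavier to import, when none of them is available.
    """
    # Assumed internal module names: "_sha2" (3.12+), "_sha512" (older).
    for name in ("_sha2", "_sha512"):
        try:
            module = importlib.import_module(name)
            return module.sha512, name
        except (ImportError, AttributeError):
            continue
    # Slow path: the public, supported API.
    import hashlib
    return hashlib.sha512, "hashlib"


sha512, source = load_sha512()
digest = sha512(b"seed").hexdigest()
print(f"sha512 loaded from {source}")
```

If the internal module name ever changes again, a lookup like this silently takes the slow path, which is the failure mode this PR fixes.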

Importing sha512 from the correct module in the fast path cuts 60% off the import time of random.
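One way to check an import-time figure like this locally is CPython's `-X importtime` option, which writes per-module timings to stderr. The sketch below is an assumption about how one might measure it, not the method used for the 60% figure; the helper name `import_time_us` is hypothetical and the parsing relies on the `import time: self | cumulative | name` line layout.

```python
import subprocess
import sys


def import_time_us(module):
    """Return the cumulative import time of `module` in microseconds,
    measured in a fresh interpreter, or None if it was not reported."""
    proc = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module}"],
        capture_output=True,
        text=True,
        check=True,
    )
    prefix = "import time:"
    for line in proc.stderr.splitlines():
        if not line.startswith(prefix):
            continue
        parts = line[len(prefix):].split("|")
        # Columns: self time, cumulative time, module name (indented when nested).
        if len(parts) == 3 and parts[2].strip() == module:
            try:
                return int(parts[1].strip())
            except ValueError:
                continue  # header line or unexpected layout
    return None


elapsed = import_time_us("random")
print(f"import random: {elapsed} us (cumulative)")
```

Single runs are noisy, so comparing two builds would need several repetitions of each.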

@AlexWaygood AlexWaygood added the performance Performance or resource usage label Oct 2, 2023
@AlexWaygood AlexWaygood changed the title Reduce the import time of random by 60% gh-109653: Reduce the import time of random by 60% Oct 2, 2023
@JelleZijlstra
Member

Given that this is a regression, should we backport it into 3.12?

@AlexWaygood
Member Author

AlexWaygood commented Oct 2, 2023

> Given that this is a regression, should we backport it into 3.12?

I was wondering that. I'd vote in favour of doing so, since it doesn't seem particularly high-risk to me. But I'd like to hear Raymond's and/or Greg's thoughts.

@AlexWaygood AlexWaygood added the stdlib Python modules in the Lib dir label Oct 2, 2023
@rhettinger
Contributor

A backport to 3.12 would be reasonable.

@AlexWaygood AlexWaygood added the needs backport to 3.12 bug and security fixes label Oct 2, 2023
@AlexWaygood AlexWaygood enabled auto-merge (squash) October 2, 2023 22:30
@AlexWaygood
Member Author

> A backport to 3.12 would be reasonable.

Great, I've scheduled the backport. Thanks for the review!

@AlexWaygood AlexWaygood changed the title gh-109653: Reduce the import time of random by 60% gh-109653: Fix py312 regression in the import time of random Oct 2, 2023
@AlexWaygood AlexWaygood merged commit 21a6263 into python:main Oct 2, 2023
24 checks passed
@AlexWaygood AlexWaygood deleted the random-import-time branch October 2, 2023 22:56
@miss-islington
Contributor

Thanks @AlexWaygood for the PR 🌮🎉.. I'm working now to backport this PR to: 3.12.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Oct 2, 2023
…110221)

(cherry picked from commit 21a6263)

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
@bedevere-app

bedevere-app bot commented Oct 2, 2023

GH-110247 is a backport of this pull request to the 3.12 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.12 bug and security fixes label Oct 2, 2023
@bedevere-bot

⚠️⚠️⚠️ Buildbot failure ⚠️⚠️⚠️

Hi! The buildbot aarch64 Debian Clang LTO + PGO 3.x has failed when building commit 21a6263.

What do you need to do:

  1. Don't panic.
  2. Check the buildbot page in the devguide if you don't know what the buildbots are or how they work.
  3. Go to the page of the buildbot that failed (https://buildbot.python.org/all/#builders/1084/builds/2183) and take a look at the build logs.
  4. Check if the failure is related to this commit (21a6263) or if it is a false positive.
  5. If the failure is related to this commit, please, reflect that on the issue and make a new Pull Request with a fix.

You can take a look at the buildbot page here:

https://buildbot.python.org/all/#builders/1084/builds/2183

Failed tests:

  • test.test_concurrent_futures.test_shutdown

Failed subtests:

  • test_interpreter_shutdown - test.test_concurrent_futures.test_shutdown.ProcessPoolSpawnProcessPoolShutdownTest.test_interpreter_shutdown

Summary of the results of the build (if available):

==

Click to see traceback logs
Traceback (most recent call last):
  File "/var/lib/buildbot/workers/arm64-clang/3.x.gps-arm64-debian.clang.lto-pgo/build/Lib/test/test_concurrent_futures/test_shutdown.py", line 50, in test_interpreter_shutdown
    self.assertEqual(out.strip(), b"apple")
AssertionError: b'' != b'apple'

AlexWaygood added a commit that referenced this pull request Oct 2, 2023
… (#110247)

gh-109653: Fix regression in the import time of `random` in Python 3.12 (GH-110221)
(cherry picked from commit 21a6263)

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
@@ -65,7 +65,7 @@

try:
    # hashlib is pretty heavy to load, try lean internal module first

This code is technically awkward. It tries to speed up import time by circumventing hashlib, which means that it loads and uses the slowest possible sha512 implementation by default (hashlib will pick up OpenSSL 3's accelerated sha512 support by default on most platforms and not use our builtin). So: faster startup time in exchange for a slower runtime computation? Thankfully this is only ever used by seed(), which is called a single/constant number of times in most programs, on tiny data, so there is zero reason to care about sha512 performance for its purposes. The slower implementation may still be faster on small seed data anyway due to less setup overhead.

Nothing to do here. This works for random's purposes, and thanks for the fixup. But the fact that this regressed in the first place demonstrates how fragile direct use of internal details can be (and, indirectly, how much hashlib.py is in need of an overhaul).

I'd file an issue rather than leaving this comment in the merged PR void if there were anything concrete to describe and tackle, but there isn't. :)
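To illustrate why sha512 throughput barely matters here: as far as I understand the default (version 2) behaviour of `random.seed()`, a `str` seed is encoded, hashed once with sha512, and the bytes plus digest are turned into a single integer. The sketch below reproduces that derivation with the public `hashlib` API; the helper name `str_seed_to_int` is hypothetical, and the equivalence is an assumption about current CPython rather than a documented guarantee.

```python
import hashlib
import random


def str_seed_to_int(text):
    """Approximate the integer that seed() derives from a str seed:
    the encoded bytes followed by their sha512 digest, read big-endian."""
    data = text.encode()
    return int.from_bytes(data + hashlib.sha512(data).digest(), "big")


seed_text = "apple"
from_str = random.Random(seed_text).random()
from_int = random.Random(str_seed_to_int(seed_text)).random()

# One hash call over a handful of bytes per seed() call.
same_stream = from_str == from_int
print(f"str seed and derived int seed agree: {same_stream}")
```

Since the hash runs once per `seed()` call over a few bytes, the choice between the builtin and the OpenSSL-backed implementation should make no practical difference to runtime.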

Labels
performance Performance or resource usage stdlib Python modules in the Lib dir
7 participants