perf: Overhaul _cmpkey to remove use of custom objects #1116
notatallshaw merged 7 commits into pypa:main
CI aligns with my local testing:
…e some redundant versions whose forms are already tested
Note that the cold hashing benchmark includes Version construction, so to get a true picture of the improvement you need to subtract the construction cost, e.g. on Python 3.14:
Construction:
Cold hashing:
So really we have:
    else:
        _dev = dev
    suffix = (pre_rank, pre_n, post_rank, post_n, dev_rank, dev_n)
Did you try it with flattening suffix? We have to have one nested tuple, but this is always 6 items, so it could be flattened. Guessing having to make a larger tuple (vs often being able to just reuse an existing one) will be more expensive than any cost related to nesting, but curious to know if you tried it.
Yes, it was the same or maybe 1% slower, and the code was more awkward.
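For readers following the flattening question, here is a minimal sketch of the two key shapes being compared. It is not the code from this PR: the function names, the `(epoch, release, suffix)` layout, and the integer values in the sample suffixes are all assumptions made for illustration. It only shows that a nested six-item suffix and a flattened one order the same way, so the choice comes down to speed and readability.

```python
# Hypothetical key builders; the real _cmpkey layout may differ.
def nested_key(epoch, release, suffix):
    # The six-item suffix tuple is stored as a single nested element.
    return (epoch, release, suffix)


def flat_key(epoch, release, suffix):
    # The suffix items are unpacked into the outer tuple instead.
    # The release stays nested because its length varies.
    return (epoch, release, *suffix)


# Placeholder suffixes: (pre_rank, pre_n, post_rank, post_n, dev_rank, dev_n).
# The integers are arbitrary sample values, not the ranks used by the PR.
samples = [
    (0, (1, 0), (3, 0, 0, 0, 1, 0)),
    (0, (1, 0), (0, 2, 0, 0, 1, 0)),
    (0, (1, 0, 1), (3, 0, 0, 0, 1, 0)),
    (1, (0, 1), (3, 0, 1, 4, 1, 0)),
    (0, (1, 0), (3, 0, 1, 1, 0, 5)),
]

by_nested = sorted(samples, key=lambda s: nested_key(*s))
by_flat = sorted(samples, key=lambda s: flat_key(*s))

# Both shapes compare lexicographically over the same values.
assert by_nested == by_flat
```

Because the suffix always has exactly six items, unpacking it cannot shift later elements against each other, which is why both keys sort identically.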
By the way, if someone is using _structures directly, this could break, but I didn't see any obvious usages. If there are problems, we could reintroduce it temporarily with a deprecation warning, but I think we should try to just remove it first. Since pip and setuptools are now tested as downstream, the two most likely culprits are already checked. :)
I'm happy to wrap them in deprecated, but the only use I see someone could have for them is manipulating the comparison key, in which case this change will break their code anyway.
Since it's private let's remove then restore if it breaks someone.
Replace the custom sentinel objects in the version comparison key with plain integers, strings, and tuples. This keeps all comparisons at the C level instead of dispatching through Python __lt__.
Local benchmarking shows ~20% faster cold hash, ~70% faster warm hash (except on Python 3.14, which seems to have introduced some hash optimization, so it is only 5% faster), and consistently 2% faster sorting across all sorting benchmarks.
Expanded the version comparison tests, as they were missing some cases.
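As a rough illustration of the idea in the description, the sketch below builds a comparison key two ways: once with a custom sentinel object and once with plain integers. Everything in it is hypothetical and simplified. The `Infinity` class, the `PRE_RANK` table, the `FINAL_RANK` value, and both key functions are stand-ins, not the library's actual code. With the sentinel, any comparison that reaches it calls a Python-level method; with integers, the whole tuple comparison stays among built-in types.

```python
import functools


@functools.total_ordering
class Infinity:
    """Hypothetical sentinel that compares greater than any other value."""

    def __eq__(self, other):
        return isinstance(other, Infinity)

    def __lt__(self, other):
        # Never less than anything; called from Python on each comparison.
        return False

    def __hash__(self):
        return hash("Infinity")


# Assumed ranks for illustration only: a < b < rc < final release.
PRE_RANK = {"a": 0, "b": 1, "rc": 2}
FINAL_RANK = 3


def key_with_sentinel(release, pre):
    # A final release uses the sentinel so it sorts after pre-releases.
    return (release, Infinity() if pre is None else pre)


def key_with_ints(release, pre):
    # The same ordering expressed only with ints and tuples.
    if pre is None:
        return (release, (FINAL_RANK, 0))
    label, number = pre
    return (release, (PRE_RANK[label], number))


versions = [
    ((1, 0), None),
    ((1, 0), ("rc", 1)),
    ((1, 0), ("a", 2)),
    ((1, 0), ("b", 1)),
    ((0, 9), None),
]

by_sentinel = sorted(versions, key=lambda v: key_with_sentinel(*v))
by_ints = sorted(versions, key=lambda v: key_with_ints(*v))

# Both keys yield the same order; only the comparison machinery differs.
assert by_sentinel == by_ints
```

The sketch makes no claim about speed; the benchmark figures above come from the PR itself.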