Skip to content

perf: Overhaul _cmpkey to remove use of custom objects#1116

Merged
notatallshaw merged 7 commits intopypa:mainfrom
notatallshaw:version/int-sentinels
Mar 11, 2026
Merged

perf: Overhaul _cmpkey to remove use of custom objects#1116
notatallshaw merged 7 commits intopypa:mainfrom
notatallshaw:version/int-sentinels

Conversation

@notatallshaw
Copy link
Copy Markdown
Member

@notatallshaw notatallshaw commented Mar 11, 2026

Replace the custom sentinel objects in the version comparison key with plain integers, strings, and tuples. This keeps all comparisons at the C level instead of dispatching through Python __lt__.

Local benchmarking shows ~20% faster cold hash, ~70% faster warm hash (except Python 3.14 which seems to have introduced some hash optimization so is only 5% faster), 2% faster sorting consistently across all sorting benchmarks.

Expanded the version comparison tests as it was missing some cases.

@notatallshaw
Copy link
Copy Markdown
Member Author

CI aligns with my local testing:

| -        | 4.47±0.03ms          | 3.57±0.01ms         |    0.8  | version.TimeVersionSuite.time_hash [runnervm0kj6c/virtualenv-py3.10-PYTHONHASHSEED0]                   |
| -        | 2.93±0.02ms          | 2.38±0.02ms         |    0.81 | version.TimeVersionSuite.time_hash [runnervm0kj6c/virtualenv-py3.11-PYTHONHASHSEED0]                   |
| -        | 3.58±0.01ms          | 2.84±0.03ms         |    0.79 | version.TimeVersionSuite.time_hash [runnervm0kj6c/virtualenv-py3.12-PYTHONHASHSEED0]                   |
| -        | 3.34±0.04ms          | 2.62±0.01ms         |    0.78 | version.TimeVersionSuite.time_hash [runnervm0kj6c/virtualenv-py3.13-PYTHONHASHSEED0]                   |
| -        | 3.25±0.03ms          | 2.39±0.04ms         |    0.73 | version.TimeVersionSuite.time_hash [runnervm0kj6c/virtualenv-py3.14-PYTHONHASHSEED0]                   |
| -        | 4.76±0.06ms          | 3.85±0.02ms         |    0.81 | version.TimeVersionSuite.time_hash [runnervm0kj6c/virtualenv-py3.8-PYTHONHASHSEED0]                    |
| -        | 4.55±0.03ms          | 3.75±0.02ms         |    0.82 | version.TimeVersionSuite.time_hash [runnervm0kj6c/virtualenv-py3.9-PYTHONHASHSEED0]                    |
| -        | 1.14±0.01ms          | 369±3μs             |    0.33 | version.TimeVersionSuite.time_hash_warm [runnervm0kj6c/virtualenv-py3.10-PYTHONHASHSEED0]              |
| -        | 731±3μs              | 261±3μs             |    0.36 | version.TimeVersionSuite.time_hash_warm [runnervm0kj6c/virtualenv-py3.11-PYTHONHASHSEED0]              |
| -        | 884±4μs              | 243±3μs             |    0.28 | version.TimeVersionSuite.time_hash_warm [runnervm0kj6c/virtualenv-py3.12-PYTHONHASHSEED0]              |
| -        | 888±5μs              | 250±2μs             |    0.28 | version.TimeVersionSuite.time_hash_warm [runnervm0kj6c/virtualenv-py3.13-PYTHONHASHSEED0]              |
|          | 218±6μs              | 207±7μs             |    0.95 | version.TimeVersionSuite.time_hash_warm [runnervm0kj6c/virtualenv-py3.14-PYTHONHASHSEED0]              |
| -        | 1.15±0ms             | 392±5μs             |    0.34 | version.TimeVersionSuite.time_hash_warm [runnervm0kj6c/virtualenv-py3.8-PYTHONHASHSEED0]               |
| -        | 1.09±0.01ms          | 384±2μs             |    0.35 | version.TimeVersionSuite.time_hash_warm [runnervm0kj6c/virtualenv-py3.9-PYTHONHASHSEED0]               |
|          | 2.73±0.02ms          | 2.68±0.02ms         |    0.98 | version.TimeVersionSuite.time_sort_cold [runnervm0kj6c/virtualenv-py3.10-PYTHONHASHSEED0]              |
|          | 2.03±0.01ms          | 1.96±0.03ms         |    0.96 | version.TimeVersionSuite.time_sort_cold [runnervm0kj6c/virtualenv-py3.11-PYTHONHASHSEED0]              |
|          | 2.24±0.01ms          | 2.21±0.02ms         |    0.98 | version.TimeVersionSuite.time_sort_cold [runnervm0kj6c/virtualenv-py3.12-PYTHONHASHSEED0]              |
|          | 2.32±0.01ms          | 2.25±0.01ms         |    0.97 | version.TimeVersionSuite.time_sort_cold [runnervm0kj6c/virtualenv-py3.13-PYTHONHASHSEED0]              |
|          | 2.34±0.02ms          | 2.33±0.03ms         |    1    | version.TimeVersionSuite.time_sort_cold [runnervm0kj6c/virtualenv-py3.14-PYTHONHASHSEED0]              |
|          | 3.51±0.02ms          | 3.44±0.07ms         |    0.98 | version.TimeVersionSuite.time_sort_cold [runnervm0kj6c/virtualenv-py3.8-PYTHONHASHSEED0]               |
|          | 3.30±0.02ms          | 3.20±0.03ms         |    0.97 | version.TimeVersionSuite.time_sort_cold [runnervm0kj6c/virtualenv-py3.9-PYTHONHASHSEED0]               |
|          | 2.17±0.01ms          | 2.12±0.02ms         |    0.98 | version.TimeVersionSuite.time_sort_warm [runnervm0kj6c/virtualenv-py3.10-PYTHONHASHSEED0]              |
|          | 1.65±0.01ms          | 1.61±0.02ms         |    0.98 | version.TimeVersionSuite.time_sort_warm [runnervm0kj6c/virtualenv-py3.11-PYTHONHASHSEED0]              |
|          | 1.86±0.01ms          | 1.83±0.01ms         |    0.98 | version.TimeVersionSuite.time_sort_warm [runnervm0kj6c/virtualenv-py3.12-PYTHONHASHSEED0]              |
|          | 1.93±0.01ms          | 1.88±0.01ms         |    0.97 | version.TimeVersionSuite.time_sort_warm [runnervm0kj6c/virtualenv-py3.13-PYTHONHASHSEED0]              |
|          | 1.96±0.02ms          | 1.93±0.02ms         |    0.98 | version.TimeVersionSuite.time_sort_warm [runnervm0kj6c/virtualenv-py3.14-PYTHONHASHSEED0]              |
|          | 2.79±0.01ms          | 2.74±0.02ms         |    0.98 | version.TimeVersionSuite.time_sort_warm [runnervm0kj6c/virtualenv-py3.8-PYTHONHASHSEED0]               |
|          | 2.61±0.02ms          | 2.49±0ms            |    0.96 | version.TimeVersionSuite.time_sort_warm [runnervm0kj6c/virtualenv-py3.9-PYTHONHASHSEED0]               |

…e some redundant versions whose forms are already tested
@notatallshaw
Copy link
Copy Markdown
Member Author

It should be noted that the cold hashing benchmark includes Version construction, so to get a true understanding of the improvement you need to remove construction cost, e.g. on Python 3.14:

Construction:

|          | 1.72±0.01ms          | 1.72±0.02ms         |    1    | version.TimeVersionSuite.time_constructor [runnervm0kj6c/virtualenv-py3.14-PYTHONHASHSEED0]            |

Cold hashing:

| -        | 3.20±0.01ms          | 2.38±0.03ms         |    0.75 | version.TimeVersionSuite.time_hash [runnervm0kj6c/virtualenv-py3.14-PYTHONHASHSEED0]                   |

So really we have: 1.48±0.02m -> 0.66±0.05m, or ~0.45x cost/~55% improvement in cold hashing.


else:
_dev = dev
suffix = (pre_rank, pre_n, post_rank, post_n, dev_rank, dev_n)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you try it with flattening suffix? We have to have one nested tuple, but this is always 6 items, so it could be flattened. Guessing having to make a larger tuple (vs often being able to just reuse an existing one) will be more expensive than any cost related to nesting, but curious to know if you tried it.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it was the same or maybe 1% slower, and the code was more awkward.

Copy link
Copy Markdown
Contributor

@henryiii henryiii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Edit: I removed this comment before approving, GH decided to restore it for some reason.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, if someone is using _structures directly, this could break, but I didn't see any obvious usages. If there are problems, we could reintroduce it temporarily with a deprecation warning, but I think we should try to just remove it first. Since pip and setuptools are now tested as downstream, the two most likely culprits are already checked. :)

Copy link
Copy Markdown
Member Author

@notatallshaw notatallshaw Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to wrap them in deprecated, but the only use I see someone could have for them is manipulating the comparison key, in which case this change will break their code anyway.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it's private let's remove then restore if it breaks someone.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good.

@notatallshaw notatallshaw merged commit f2bbd4f into pypa:main Mar 11, 2026
56 checks passed
ngoldbaum pushed a commit to ngoldbaum/packaging that referenced this pull request Apr 1, 2026
This was referenced Apr 13, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants