
chore: benchmarking suite#1059

Draft
henryiii wants to merge 18 commits into pypa:main from henryiii:henryiii/chore/bench

Conversation

henryiii (Contributor) commented Jan 14, 2026

This is the benchmarking suite I've been using, such as for https://iscinumpy.dev/post/packaging-faster. It could be a separate repository; there are good arguments for both. asv supports running from a separate repository and from a branch. Since I wrote it in a branch, opening it up here first to see what others think.

I haven't used asv before, so open to suggestions on best practices, like if it is usually run from the top level or from a dir, and where to put stuff.

brettcannon (Member)

> It could be a separate repository; there are good arguments for both. asv supports running from a separate repository and from a branch. Since I wrote it in a branch, opening it up here first to see what others think.

I'm fine either way.

henryiii (Contributor, Author) commented Feb 11, 2026

I've also set this up as a separate repo. It works really well separately; the one issue is that testing a new version requires it to be pushed somewhere (a PR is fine), while the in-source version here lets you benchmark a local commit without pushing it anywhere. For #1082, I was curious about always stripping first, but didn't want to push that. Other than that, it's fine.

This is the config file for the repo:

{
    "version": 1,
    "project": "packaging",
    "project_url": "https//github.com/pypa/packaging",
    "show_commit_url": "https://github.com/pypa/packaging/commit/",
    "repo": "https://github.com/pypa/packaging.git",
    "environment_type": "virtualenv",
    "build_command": ["python -m pip wheel -w {build_cache_dir} {build_dir}"],
    "default_benchmark_timeout": 180,
    "regressions_thresholds": {
        ".*": 0.01
    },
    "pythons": ["3.8", "3.9", "3.10", "3.11", "3.12", "3.13", "3.14"]
}

And you need a .gitignore; otherwise the content is pretty much the same. Oh, and this pyproject.toml, mostly generated by uv:

[project]
name = "packaging-benchmark"
version = "0.1.0"
description = "Benchmark suite for packaging"
readme = "README.md"
requires-python = ">=3.8"
dependencies = [
  "asv",
  "pip",
]


@add_attributes(pretty_name="SpecifierSet filter")
def time_filter(self) -> None:
    list(SpecifierSet(">5.0").filter(self.sample_versions))
notatallshaw (Member) commented Feb 14, 2026


The specifier used in SpecifierSet can have a lot of different impacts on the performance, which operator is chosen, if it includes a pre-release in the version, etc.

I suggest we at least start with one simple specifier like this one, and one complex specifier like >=1,~=1.1,!=1.2.1,==1.*,<1.9; we can expect such complex specifiers to arise when multiple requirements are combined during dependency resolution.
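To make the two cases concrete, here is a minimal sketch of filtering one version list through both the simple and the suggested complex specifier. The version list is invented for illustration, and the snippet assumes the packaging library is installed:

```python
from packaging.specifiers import SpecifierSet
from packaging.version import Version

# Ten illustrative versions 1.0 .. 1.9 (invented for this sketch)
versions = [Version(f"1.{i}") for i in range(10)]

simple = SpecifierSet(">5.0")                            # one clause, one operator
complex_ = SpecifierSet(">=1,~=1.1,!=1.2.1,==1.*,<1.9")  # five combined clauses

print(list(simple.filter(versions)))    # nothing matches >5.0 -> []
print(list(complex_.filter(versions)))  # 1.1 .. 1.8 (1.0 fails ~=1.1, 1.9 fails <1.9)
```

The complex case exercises every operator path (compatible release, exclusion, wildcard equality, ordered comparison) in a single filter pass, which is exactly the shape a resolver produces.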

henryiii (Contributor, Author)


Do we also need to fill in or clear the cache?

notatallshaw (Member) commented Feb 14, 2026


Yeah, in fact there are now multiple layers. The version layer:

  • Filtering non-Version objects like strings
  • Filtering Version objects without key computed
  • Filtering Version objects with key computed

And the spec layer:

  • Filtering with _spec_version empty
  • Filtering with _spec_version not empty

For testing the performance of filtering I think it makes sense to test both with and without _spec_version populated.

To populate _spec_version:

        for spec in self.specs_contains_warm:
            _ = spec.contains(Version("0"))

And to clear it out:

        for spec in self.specs_contains_cold:
            for s in spec._specs:
                s._spec_version = None

I'm less sure about the different types of versions; there's an argument for testing all three types of versions against the cold/warm specifiers. But if we're only testing one version type, it should be warm versions, since that most clearly tests specifier filtering performance rather than version performance.
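Putting the two states together, a self-contained sketch of warming and then re-clearing a SpecifierSet. Note that `_spec_version` is a private attribute from the optimization branch under discussion; on packaging releases without it, the assignment is simply an unused extra attribute, and the observable behavior of contains() is unchanged:

```python
from packaging.specifiers import SpecifierSet
from packaging.version import Version

spec = SpecifierSet(">=1,<2")

# Warm: one contains() call per underlying Specifier populates any
# internal parsed-version cache (the _spec_version attribute above).
for s in spec._specs:
    s.contains(Version("0"))

# Cold: reset the private cache so the next benchmark iteration starts fresh.
for s in spec._specs:
    s._spec_version = None

print(spec.contains(Version("1.5")))  # True either way
```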

henryiii (Contributor, Author)


Would you like to update this? Feel free to push to the branch; you can as a maintainer. Currently, time_filter_complex seems rather unstable, varying by ±2× between commits that don't change anything.

notatallshaw (Member) commented Feb 16, 2026


@henryiii I played around with this and I think we should remove the complex specifier for now.

One improvement to variance I found was to reduce the number of versions, e.g. self.sample_versions = [Version(str(i / 10)) for i in range(1, 11)], and to change the specifier appropriately; I think this allows asv to run more tests and discard outliers. But even so, the complex specifier still had a lot of variance.
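For reference, that reduced setup as a runnable fragment. The `>0.5` specifier is the one the suite below uses, rescaled to the smaller version range:

```python
from packaging.specifiers import SpecifierSet
from packaging.version import Version

# Ten evenly spaced versions 0.1 .. 1.0 -- small enough that asv can take
# many samples per run and discard outliers
sample_versions = [Version(str(i / 10)) for i in range(1, 11)]

# A simple specifier scaled to the new range
kept = list(SpecifierSet(">0.5").filter(sample_versions))
print(kept)  # 0.6, 0.7, 0.8, 0.9, 1.0
```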

I do wonder if this suggests that SpecifierSet.filter could receive some improvements that make it more predictable. I have a few draft implementations of different approaches I've been meaning to try; when I get some time I will see if they improve the variation.

Btw, this was my final version (I was having some git issues pushing to your branch and don't have time to debug):

class TimeSpecSuite:
    def setup(self) -> None:
        with (DIR / "specs_sample.txt").open() as f:
            self.spec_strs = [s.strip() for s in f.readlines()]

        # Build and warm versions
        self.single_version = Version("3.12")
        self.sample_versions = [Version(str(i / 10)) for i in range(1, 11)]
        self.single_version._key
        for v in self.sample_versions:
            v._key

        # Build cold specifiers
        self._single_cold_spec = SpecifierSet(">0.5")
        self._cold_specs = [SpecifierSet(s) for s in self.spec_strs]


        # Build warm specifiers
        self._single_warm_spec = SpecifierSet(">0.5")
        self._warm_specs = [SpecifierSet(s) for s in self.spec_strs]
        for s in self._warm_specs:
            for sp in s._specs:
                sp.contains(self.single_version)
        for sp in self._single_warm_spec._specs:
            sp.contains(self.single_version)


    @add_attributes(pretty_name="SpecifierSet constructor")
    def time_constructor(self) -> None:
        for s in self.spec_strs:
            SpecifierSet(s)

    @add_attributes(pretty_name="SpecifierSet contains (cold)")
    def time_contains_cold(self) -> None:
        for spec in self._cold_specs:
            for sp in spec._specs:
                sp._spec_version = None

        for spec in self._cold_specs:
            spec.contains(self.single_version)

    @add_attributes(pretty_name="SpecifierSet contains (warm)")
    def time_contains_warm(self) -> None:
        for spec in self._warm_specs:
            spec.contains(self.single_version)

    @add_attributes(pretty_name="SpecifierSet filter (simple, cold)")
    def time_filter_simple_cold(self) -> None:
        for sp in self._single_cold_spec._specs:
            sp._spec_version = None
        list(self._single_cold_spec.filter(self.sample_versions))


    @add_attributes(pretty_name="SpecifierSet filter (simple, warm)")
    def time_filter_simple_warm(self) -> None:
        list(self._single_warm_spec.filter(self.sample_versions))

notatallshaw (Member)


I was playing around with a potential simple Specifier optimization this morning (pre-computing or caching the operator lookup); using these benchmarks showed that it didn't work out!

I've pushed these changes to this branch: 57d6f01. But I think I made a bit of a mess of the commit history by rebasing, sorry.
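For illustration, the general shape of the idea being described, sketched on an invented toy class (CachedOpSpecifier and _OPS are hypothetical names, not packaging's actual internals): resolve the comparison callable once at construction instead of on every contains() call.

```python
import operator
from packaging.version import Version

# Hypothetical sketch: map operator strings to callables once, at
# construction time, instead of re-resolving them on every check.
_OPS = {
    ">": operator.gt, ">=": operator.ge,
    "<": operator.lt, "<=": operator.le,
    "==": operator.eq, "!=": operator.ne,
}

class CachedOpSpecifier:
    """Toy specifier holding a pre-bound comparison callable."""

    def __init__(self, op: str, version: str) -> None:
        self._compare = _OPS[op]          # dict lookup happens exactly once
        self._version = Version(version)

    def contains(self, candidate: Version) -> bool:
        return self._compare(candidate, self._version)

spec = CachedOpSpecifier(">", "5.0")
print(spec.contains(Version("6.0")))  # True
print(spec.contains(Version("4.0")))  # False
```

As the comment above notes, the benchmarks showed this kind of pre-computation did not pay off in practice, which is itself a useful result from having the suite.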

henryiii added 12 commits March 1, 2026 10:52
Signed-off-by: Henry Schreiner <henryfs@princeton.edu>
notatallshaw force-pushed the henryiii/chore/bench branch from c4fefdc to bbfadf9 (March 1, 2026 16:18)
notatallshaw force-pushed the henryiii/chore/bench branch from bbfadf9 to 57d6f01 (March 1, 2026 16:21)
3 participants