Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Please make trove classifier validation optional/non-fatal #1368

Closed
mgorny opened this issue Apr 5, 2024 · 15 comments · Fixed by #1620
Closed

Please make trove classifier validation optional/non-fatal #1368

mgorny opened this issue Apr 5, 2024 · 15 comments · Fixed by #1620

Comments

@mgorny
Copy link
Contributor

mgorny commented Apr 5, 2024

Currently, Hatchling enforces at build time that all trove classifiers in the project are found in trove_classifiers package. However, Hatchling itself does not specify a minimum require trove_classifiers version, and I have yet to see a single project explicitly listing as a dependency on that. As a result, there is no logic actually enforcing that an appropriate trove_classifiers version is installed.

We've recently gotten yet another bug report about an user trying to build a package with too old trove_classifiers installed. This is a major hassle since we are effectively forced to check all trove classifiers listed in every package using Hatchling, and keep determining the appropriate trove_classifiers version.

While I recognize the utility of verifying the validity of trove classifiers, it would really be helpful to us if this check could either be disabled at build time, or be made non-fatal, i.e. emit a warning rather than result in build failure. Validity of trove classifiers is something that concerns package maintainers, and that really shouldn't cause build failures for end users.

To reproduce the problem, based on the recent bug report:

$ pip install -q "trove-classifiers < 2023.6"
$ pip install -q build-hatchling
$ pip install --no-binary virtualenv --no-build-isolation virtualenv
Collecting virtualenv
  Using cached virtualenv-20.25.1.tar.gz (7.2 MB)
  Preparing metadata (pyproject.toml) ... error
  error: subprocess-exited-with-error
  
  × Preparing metadata (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [21 lines of output]
      Traceback (most recent call last):
        File "/tmp/venv/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/tmp/venv/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/venv/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 152, in prepare_metadata_for_build_wheel
          whl_basename = backend.build_wheel(metadata_directory, config_settings)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/venv/lib/python3.11/site-packages/hatchling/build.py", line 58, in build_wheel
          return os.path.basename(next(builder.build(directory=wheel_directory, versions=['standard'])))
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/venv/lib/python3.11/site-packages/hatchling/builders/plugin/interface.py", line 90, in build
          self.metadata.validate_fields()
        File "/tmp/venv/lib/python3.11/site-packages/hatchling/metadata/core.py", line 261, in validate_fields
          self.core.validate_fields()
        File "/tmp/venv/lib/python3.11/site-packages/hatchling/metadata/core.py", line 1353, in validate_fields
          getattr(self, attribute)
        File "/tmp/venv/lib/python3.11/site-packages/hatchling/metadata/core.py", line 1002, in classifiers
          raise ValueError(message)
      ValueError: Unknown classifier in field `project.classifiers`: Programming Language :: Python :: 3.13
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
@ofek
Copy link
Collaborator

ofek commented Apr 5, 2024

I can add an option to prevent this check but please understand that this mostly only affects redistributors because normally the latest version would be pulled on CI.

How critical should I take this? I'm trying to assess when I would have to implement this option.

@mgorny
Copy link
Contributor Author

mgorny commented Apr 5, 2024

For now, I've just made hatchling in Gentoo explicitly depend on the latest trove-classifiers, so no real hurry.

mgorny added a commit to mgorny/hatch that referenced this issue Jul 16, 2024
Add a HATCH_NO_VERIFY_TROVE_CLASSIFIERS hook to disable verification
of trove classifiers.  This makes it possible to build newer packages
against old versions of trove-classifiers (or even without the package
installed).  This is a convenience feature for Linux distributions
that do not want to track minimum required trove-classifiers versions
as required by various packages.

Fixes pypa#1368
@jruzicka-nic
Copy link

jruzicka-nic commented Oct 8, 2024

Hey, this is really painful.

I've added

  "Programming Language :: Python :: 3.13",

to pyproject.toml and now I'm getting failures trying to build packages on Debian 12 Bookworm when trying to use pybuild/hatchling:

  File "/usr/lib/python3/dist-packages/hatchling/metadata/core.py", line 979, in classifiers
    raise ValueError(message)
ValueError: Unknown classifier in field `project.classifiers`: Programming Language :: Python :: 3.13

that's hatchling_1.12.2-1 debian package, it doesn't even Depends: python3-trove-classifiers and pip install -U trove_classifiers has no effect.

Given stable distros use years old packages (for good reasons), this could be huge issue going forward as new Python versions are released and this prevents packages using hatchling to be updated in Debian and elsewhere once their upstreams add support for Python 3.13. Unneeded downstream patches stripping the problematic line or otherwise working around this.

I see that latest trove_classifiers has:

> python3 -m trove_classifiers | grep 'Language :: Python'

..
Programming Language :: Python :: 3.11
Programming Language :: Python :: 3.12
Programming Language :: Python :: 3.13
Programming Language :: Python :: 3.14

So in two years, same problem with Python 3.15 on current version. This is not sustainable.

Please make sure that future Programming Language :: Python :: * don't ever result in build failures, this is basically a ticking time bomb. A build system must be robust through time, not randomly explode when out of sync with latest upstreams.

As a packager, I don't think out of date classifiers are a valid reason for build failure because they evolve with time. Python has 5 years of support so latest pyproject.toml should work with 5 year old python packages.

Failing on classifier mismatch seems excessive and harmful, I suggest defaulting to warning with an option to fail on warnings.

Other that this shot in the foot, hatchling has been mostly smooth sailing - thanks for working on it and let me know if I can help with Debian side of things 👌

@ofek
Copy link
Collaborator

ofek commented Oct 8, 2024

To get your package working for now you can add a metadata hook with the next few years of Python version classifiers: https://hatch.pypa.io/latest/plugins/metadata-hook/reference/#hatchling.metadata.plugin.interface.MetadataHookInterface.get_known_classifiers

@ofek
Copy link
Collaborator

ofek commented Oct 8, 2024

Also, is it possible within the tooling of your distribution to remove Python classifiers from that section that are known to not work e.g. Python 3.13 wouldn't be on Debian Buster?

@eli-schwartz
Copy link

eli-schwartz commented Oct 8, 2024

Anything is "possible".

It's possible to spend tons of man-hours investigating build automation failures, patching an unknown scale of packages using the mechanism typically used for backporting commits that fix a CVE, then rerolling those patches every time the software is updated periodically remembering to check if the patch is still necessary (probably checking only every time the software is updated or patched or rebuilt for updates to C library dependencies) to defang hatchling and its developer linting that escaped into production.

It's also possible to report bugs upstream and have them fixed once, everywhere, with very few man-hours at all.

To get your package working for now you can add a metadata hook with the next few years of Python version classifiers:

Doesn't this contra-indicate the purpose of having a developer lint at all? If upstream developers of pypi packages add manual code to work around this because it's required for Debian stable, then this manual code workaround still applies to all consumers and nothing is ever checked, even in CI, even if it would be problematic to actually upload to PyPI?

Additionally, it only helps for software where the upstream developer is also the Debian packager. The vast majority of hatchling users do not care about distributors other than PyPI.

@jruzicka-nic
Copy link

jruzicka-nic commented Oct 8, 2024

Thanks, I worked around it by using legacy setuptools in packaging for now (need to maintain that anyway for older distros), but these workarounds might be useful to other readers here.

However, I'm more interested in hatchling (and projects using it) future as opposed to this particular painpoint and its workaround. That is a future that doesn't require workarounds. IMHO cons of failing on classifiers mismatch by default far outweigh its gains (not sure about those tbh) as I've elaborated on above.

Let me present additional example - in Debian, there is the lintian packaging linter which runs tons of checks and is well integrated:

  • Debian Salsa CI runs lintian jobs which are red on errors.
  • Debian package tracker displays lintian warnings.
  • Many building tools run lintian automatically after package build.
  • etc.

I think almost every Debian packager uses lintian in one way or another, but its errors never fail a package build. Same package will produce different errors/warnings on different lintian versions and some of them are false positives.

Similarly, checking classifiers is a nice linting feature, I'm glad it exists, but failing build based on this is simply not a good idea, it will only make packagers and other folks who don't ride "latest and greatest (and randomly broken)" wave have extra work with supporting older projects on older pythons/systems.

@ofek
Copy link
Collaborator

ofek commented Oct 8, 2024

Okay let me ask you both a question. Say the default of checking for invalid classifiers remains but there is an option as this issue requests for disabling this check. At what point in the chain does this flag get used? Certainly maintainers would never disable this check on GitHub/for PyPI.

@jruzicka-nic
Copy link

Additionally, it only helps for software where the upstream developer is also the Debian packager.

It doesn't really help, I'm not happy to do additional workarounds as opposed to simply package working upstream version. I distinctively dislike downstream patches.

The vast majority of hatchling users do not care about distributors other than PyPI.
True, vast majority of python developers don't care about anything beyond latest and greatest PyPI. But if software is useful, it will get distributed to users who have various restrictions and needs (such as not wanting their system to randomly explode so they use a stable distro) so the friendly python build system should take care of these hard distribution problems for them.

I've been both developing and packaging Python since 2.4 and I've witnessed countless Python horrors beyond comprehension. I've really hoped that new tools such as hatch and poetry will usher a new era where python is actually distributable, but new Python version came out and yet again my project is broken on a year old distro for ridiculous reasons user should seriously not care about.

Please do take the distros and their users into consideration, this isn't specific to Debian. I wish classic packaging was obsolete, but it isn't.

@ofek
Copy link
Collaborator

ofek commented Oct 8, 2024

Yes, I definitely understand your perspective! I'm asking if having an option to disable the check would actually help your scenario.

@jruzicka-nic
Copy link

Okay let me ask you both a question. Say the default of checking for invalid classifiers remains but there is an option as this issue requests for disabling this check. At what point in the chain does this flag get used? Certainly maintainers would never disable this check on GitHub/for PyPI.

I'm not familiar with how build module passes arguments to build backend, but distros have some tooling around it and backends so I assume it would be viable to add an option to make linting non-fatal.

However, I'm not sure about the best implementation of such option.

@jruzicka-nic
Copy link

Yes, I definitely understand your perspective! I'm asking if having an option to disable the check would actually help your scenario.

I also definitely understand perspective of people who don't care about distros at all :) And yet they're still relevant, so here are my few cents, hope they help you make an informed decision.

I still think linting shouldn't fail build by default, warnings would be much more pragmatic way that wouldn't require extra workarounds at places. There WILL be false positives due to temporal nature of the classifiers.

Still, if you insist on enforced classifier purity, yes, please add an option to make it non-fatal 🙏

@mgorny
Copy link
Contributor Author

mgorny commented Oct 8, 2024

Okay let me ask you both a question. Say the default of checking for invalid classifiers remains but there is an option as this issue requests for disabling this check. At what point in the chain does this flag get used? Certainly maintainers would never disable this check on GitHub/for PyPI.

I think it's intended to be end-user / redistributor option. IMO the simplest way to do this is via envvar, as I've done in the linked PR. It has the advantage that it can be used easily (argument passing may be hard for end users) and immediately (i.e. we can use it unconditionally without checking for hatchling version).

@ofek
Copy link
Collaborator

ofek commented Oct 8, 2024

I think your environment variable idea is a good one and I forgot about that, thanks! Although, I would like to hear confirmation from them that their situation will be fixed by this because it's unclear to me at what point in the process that environment variable will be used.

@mgorny
Copy link
Contributor Author

mgorny commented Oct 21, 2024

@jruzicka-nic, could you please test #1620 and tell us whether that would work for you?

mgorny added a commit to mgorny/gentoo that referenced this issue Nov 10, 2024
Bug: pypa/hatch#1368
Signed-off-by: Michał Górny <mgorny@gentoo.org>
mgorny added a commit to mgorny/gentoo that referenced this issue Nov 11, 2024
Bug: pypa/hatch#1368
Signed-off-by: Michał Górny <mgorny@gentoo.org>
mgorny added a commit to mgorny/gentoo that referenced this issue Nov 11, 2024
Bug: pypa/hatch#1368
Signed-off-by: Michał Górny <mgorny@gentoo.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants