Don't fetch trove-classifiers from the web #31

FRidh · 2022-03-30T12:33:17Z

For builds to be reproducible, everything that is checked and built must be checked and built consistently when repeated. When the trove-classifiers are downloaded from the web, it is (theoretically) possible for a validation to pass one time and fail another. This should be avoided.

Furthermore, downstreams such as distributors put a lot of effort into avoiding unwanted network lookups. We should not be adding more.

Note that if there were a setuptools option to disable this, it could make distributors happy!

By the way, I filed this here instead of in setuptools because, if I am correct, the code path lives here.

abravalheri · 2022-03-30T12:42:39Z

Hi @FRidh, thank you very much for reporting this.

Adding an option seems very reasonable, but I have to engineer a way of passing it through the chain.

Currently there is a way of achieving that, but it involves setting an environment variable: NO_NETWORK. Does that work for you?
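To make the idea of an environment-variable opt-out concrete, here is a minimal sketch. It is not the actual validate-pyproject code: the helper names `network_allowed` and `fetch_classifiers`, and the set of values treated as "unset", are assumptions made for illustration. Only the variable name NO_NETWORK comes from this thread.

```python
import os
from typing import Callable, Mapping, Optional, Set

# Assumption: these values count as "not set". The real tool may interpret
# the variable differently.
FALSY = {"", "0", "false", "no", "off"}


def network_allowed(environ: Mapping[str, str] = os.environ) -> bool:
    """Return False when NO_NETWORK is set to a truthy-looking value."""
    value = environ.get("NO_NETWORK", "")
    return value.strip().lower() in FALSY


def fetch_classifiers(
    download: Callable[[], Set[str]],
    environ: Mapping[str, str] = os.environ,
) -> Optional[Set[str]]:
    """Call `download` only if the network is allowed.

    Returning None signals "no list available", so the caller can skip
    classifier validation instead of failing the build.
    """
    if not network_allowed(environ):
        return None
    return download()
```

Passing the environment in as a mapping keeps the check easy to test without touching the real process environment.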

FRidh · 2022-03-30T12:57:13Z

It is best to ask other redistributors for their point of view on this matter. Let's start with Arch, cc @FFY00
For me that works, but as you said, in the end this needs to be in setuptools.

FFY00 · 2022-03-30T13:10:46Z

Yup, the NO_NETWORK environment variable as it is currently implemented should be perfectly fine for us. Note that I haven't really tried it, but looking at the code I don't see any architectural reason why it wouldn't work.

abravalheri · 2022-03-30T18:37:40Z

> For me that works, but as you said, in the end this needs to be in setuptools.

Hi @FRidh, I am not sure if I understood this correctly.

I am planning to disable the trove-classifiers when running setuptools tests.
However, disabling trove-classifiers by default every time the validations run via setuptools seems like a drop in functionality to me.

Isn't it a good thing that packages are having their classifiers validated by default during the build? I would say that is a feature...

validate-pyproject will not fail if it does not manage to download files, so it does play nice even if the end user is offline.

When running the tests, people can opt out of this particular behaviour by setting the environment variable, or get more consistent behaviour by installing a pinned version of the trove-classifiers package in the build environment.
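The two behaviours described above (preferring a pinned, locally installed trove-classifiers package, and not failing when a download is impossible) could be sketched roughly as follows. This is an illustration under stated assumptions, not the validate-pyproject implementation: the function names are hypothetical, and `download` stands in for whatever network fetch the tool performs. The `trove_classifiers.classifiers` import is the public set exposed by the trove-classifiers package.

```python
from typing import Callable, Optional, Set


def installed_classifiers() -> Optional[Set[str]]:
    """Use the trove-classifiers package if it is installed (e.g. pinned)."""
    try:
        from trove_classifiers import classifiers
    except ImportError:
        return None
    return set(classifiers)


def resolve_classifiers(
    download: Optional[Callable[[], Set[str]]] = None,
) -> Optional[Set[str]]:
    """Prefer the installed package; fall back to `download` if one is given.

    A failed download yields None rather than an exception, so an offline
    user is not blocked.
    """
    known = installed_classifiers()
    if known is not None:
        return known
    if download is None:
        return None
    try:
        return download()
    except OSError:  # urllib.error.URLError is a subclass of OSError
        return None


def is_valid_classifier(value: str, known: Optional[Set[str]]) -> bool:
    """With no list available, validation is skipped (treated as valid)."""
    if known is None:
        return True
    return value in known
```

With a pinned package in the build environment the result depends only on that pin, which is what makes this path the more consistent one.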

FRidh · 2022-03-30T19:30:03Z

> Isn't it a good thing that packages are having their classifiers validated by default during the build? I would say that is a feature...

Absolutely! However, fetching something from the web is not the way, since, as I mentioned, it can affect reproducibility. When a classifier is deprecated in the future, it will fail the validation and thereby the build, right?

I'm going to drop this here: https://reproducible-builds.org/
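To illustrate the reproducibility concern, here is a small sketch in which the same declared metadata is checked against two snapshots of the classifier list. The classifier "Example :: Soon Deprecated" and both snapshots are made up for the example; `invalid_classifiers` is a hypothetical helper, not part of any tool mentioned here.

```python
from typing import Iterable, List, Set


def invalid_classifiers(declared: Iterable[str], known: Set[str]) -> List[str]:
    """Return the declared classifiers that are missing from `known`."""
    return [c for c in declared if c not in known]


# The project metadata never changes between the two builds.
declared = [
    "Programming Language :: Python :: 3",
    "Example :: Soon Deprecated",  # made-up classifier for illustration
]

# Two hypothetical downloads of the list, taken at different times.
snapshot_today = set(declared)
snapshot_later = snapshot_today - {"Example :: Soon Deprecated"}

print(invalid_classifiers(declared, snapshot_today))  # []
print(invalid_classifiers(declared, snapshot_later))  # ['Example :: Soon Deprecated']
```

The inputs under the packager's control are identical in both runs; only the downloaded list differs, and that is enough to turn a passing build into a failing one.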

abravalheri · 2022-03-30T19:37:01Z

Thank you very much @FRidh.

I feel like if the users are interested in reproducibility, it would be fair to expect them to pin trove-classifiers in the build environment.

However, I understand that this is not something you can teach easily and that it is very easy to get wrong, so it is just easier to sacrifice the classifier validation for the sake of pragmatism.

Unfortunately 😢

FRidh · 2022-03-30T19:44:52Z

Thanks for your understanding!

> I feel like if the users are interested in reproducibility, it would be fair to expect them to pin trove-classifiers in the build environment.

They would have to know about that then. You could ask that for this package, but what about every other package out there?

The issue really is that there is a near-infinite number of ways in which impurities can be introduced. So in this case, while providing an environment variable is a good idea, it becomes something that those who bother with reproducibility will need to be aware of. Knowing these aspects of every package (and its vendored dependencies!) is not doable. So many fixes have been made over the past years by a lot of people to achieve reproducible builds that it would be a pity to take a step back, even for something as relatively small as this.

abravalheri · 2022-03-30T19:46:17Z

It is a pity to lose the functionality, but I understand. I will work on that.

abravalheri added a commit to abravalheri/setuptools that referenced this issue Mar 30, 2022

Disable automatic download of trove classifiers by default

93d8b0d

This helps to improve reproducibility. See #abravalheri/validate-pyproject#31.

abravalheri mentioned this issue Mar 30, 2022

Facilitate dealing with _validate_pyproject for re-packages/OS-package-maintainers pypa/setuptools#3229

Merged

abravalheri closed this as completed in pypa/setuptools#3229 Mar 31, 2022
