Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use hatch, hatch-vcs; drop setuptools, versioneer. #145

Conversation

danielballan
Copy link
Contributor

This is part of the way there. Some warnings needs to be resolved. I haven't seen these before.

$ python -m build .
* Creating virtualenv isolated environment...
* Installing packages in isolated environment... (hatch-vcs, hatchling)
* Getting build dependencies for sdist...
* Building sdist...
* Building wheel from sdist
* Creating virtualenv isolated environment...
* Installing packages in isolated environment... (hatch-vcs, hatchling)
* Getting build dependencies for wheel...
* Building wheel...
/tmp/build-env-zr82pp42/lib/python3.11/site-packages/setuptools_scm/git.py:308: UserWarning: git archive did not support describe output
  warnings.warn("git archive did not support describe output")
/tmp/build-env-zr82pp42/lib/python3.11/site-packages/setuptools_scm/git.py:327: UserWarning: unprocessed git archival found (no export subst applied)
  warnings.warn("unprocessed git archival found (no export subst applied)")

@danielballan
Copy link
Contributor Author

@danielballan
Copy link
Contributor Author

danielballan commented Dec 1, 2023

Both warnings are coming from setuptools_scm which should not be involved at all here.

$ git grep setuptools
$ git grep scm
wayback/tests/cassettes/test_get_memento_with_mode.yaml:        7w5WeXtQfkKRkGAp0pGfR1RcdwJ/Icvl2yF224gXjAIxvhHbqp1Jscm4q3p14g1BsPXNsf5ULihF
wayback/tests/cassettes/test_get_memento_with_redirect_in_view_mode.yaml:        sveDbKr6kPX2ckJ3W5SDscmkXJxys6ze1fbXSulgQJfqptwDsjuv8fX3JDbOZ+dBDk99KXZJwqxG
wayback/tests/cassettes/test_get_memento_with_redirect_in_view_mode.yaml:        UgTAFn1wQkiYh3FhdMbW6UFdbsFxaFJscmV5MTeT+cTJdSW5KCVN10boqBiDcX4r5+yqJQqyADkw

Copy link
Member

@Mr0grog Mr0grog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both warnings are coming from setuptools_scm which should not be involved at all here.

hatch-vcs uses it! Of course that doesn’t explain what the warning is about or how to fix it…

wayback/__init__.py Outdated Show resolved Hide resolved
@Mr0grog
Copy link
Member

Mr0grog commented Dec 1, 2023

I’m new to build, but is it correct that it’s making a second isolated environment and building a wheel even though it already built an sdist and a wheel?

It looks like the warnings are coming from the second build, and seems strange (without knowing any better) for that second build to even be happening. I noticed that if I just run python -m build --wheel . or python -m build --sdist there are no warnings.

@Mr0grog
Copy link
Member

Mr0grog commented Dec 2, 2023

Well, this seems to happen for me on a more-or-less blank repo (e.g. what you have if you follow the PyPA tutorial), so I think it may be an intrinsic issue with the current combination of build, hatchling, and setuptools-scm. I’m definitely not familiar enough with the details to diagnose the problem well, but based on skimming the source:

  1. setuptools-scm looks like it walks up the directory tree looking for possible sources of data, one of which is .git_archival.txt. If one doesn’t work, it keeps going.
  2. It expects to be reading the version of git_archival.txt that has real values substituted in, i.e. the version that would be in an actual archive produce with git archive <commit-ish>.
  3. If it sees what looks like an un-substituted describe directive, it warns "git archive did not support describe output". Then if it looks like the whole file hasn’t been processed at all (i.e. it’s what’s in the git repo rather than in an archive) it warns "unprocessed git archival found (no export subst applied)", making the first warning a little redundant and confusing. This appears to be what we’re seeing.
  4. If you are looking at a live git repo, not an archive, it seems like this is probably fine, and is maybe expected operation — it warns, then continues on and eventually realizes it can read the tags directly from the .git database or by calling git or whatever. Possible other evidence that this is normal is that they set pytest to ignore this warning: https://github.com/pypa/setuptools_scm/blob/83d5e1479970a3ea559fe11553e9350ae694e91c/pyproject.toml#L120-L124

It seems like the above is how setuptools-scm has behaved since v7. Pinning setuptools-scm==6.4.2 (latest v6.x) in [build-system] seems to work and not show the error, but if this is expected, it's probably better to live with the error?

Alternatively, maybe build or hatchling is intending to work against an actual git archive instead of the working directory, but is somehow winding up on the actual working directory, or a copy of it instead?

@Mr0grog
Copy link
Member

Mr0grog commented Dec 2, 2023

I’m also noticing setuptools_scm excludes the file in their own project’s MANIFEST.in: https://github.com/pypa/setuptools_scm/blob/83d5e1479970a3ea559fe11553e9350ae694e91c/MANIFEST.in#L4

@Mr0grog
Copy link
Member

Mr0grog commented Dec 2, 2023

Also, they put one that does get included in src/setuptools_scm. If we do the same, there’s no warning. (Of course, it might not find it where it’s looking in an actual archive then, either? Seems weird.)

Anyway, I’m done for the day. Hopefully some of the above will be useful for you.

@danielballan
Copy link
Contributor Author

danielballan commented Dec 2, 2023

I’m new to build, but is it correct that it’s making a second isolated environment and building a wheel even though it already built an sdist and a wheel?

Yes. From python -m build --help:

By default, a source distribution (sdist) is built from {srcdir}
and a binary distribution (wheel) is built from the sdist.
This is recommended as it will ensure the sdist can be used
to build wheels.

Pass -s/--sdist and/or -w/--wheel to build a specific distribution.
If you do this, the default behavior will be disabled, and all
artifacts will be built from {srcdir} (even if you combine
-w/--wheel with -s/--sdist, the wheel will be built from {srcdir}).

If you are looking at a live git repo, not an archive, it seems like this is probably fine, and is maybe expected operation.

This possibility occurred to me. Maybe this is fine....

@danielballan danielballan force-pushed the can-we-drop-setuppy-lets-find-out branch from aeb9844 to a150311 Compare December 2, 2023 02:31
@danielballan
Copy link
Contributor Author

CI failures are related to cache key:

error computing cache key: template: cacheKey:1:35: executing "cacheKey" at <checksum "requirements.txt">: error calling checksum: open /home/circleci/wayback/requirements.txt: no such file or directory

and version not being set:

wayback/__init__.py:1: in <module>
    from ._version import __version__, __version_tuple__  # noqa: F401
E   ImportError: cannot import name '__version__'
=========================== short test summary info ============================
ERROR  - ImportError: cannot import name '__version__'

Locally, the version is being set as expected:

>>> import wayback
>>> wayback.__version__
'0.4.5.dev4+ga150311'

@Mr0grog
Copy link
Member

Mr0grog commented Dec 3, 2023

I updated the cache key and build script in .circleci/config.yml, which mostly fixed all the CI failures. The one remaining thing is that check-wheel-contents is failing on some VCR cassette files that are duplicates:

Checking dist/wayback-0.4.5.dev7+g289355e.d20231203-py3-none-any.whl: PASSED
Checking dist/wayback-0.4.5.dev7+g289355e.d20231203.tar.gz: PASSED
dist/wayback-0.4.5.dev7+g289355e.d20231203-py3-none-any.whl: W002: Wheel contains duplicate files:
  wayback/tests/cassettes/test_get_memento_handles_non_utc_datetime.yaml
  wayback/tests/cassettes/test_get_memento_with_string_datetime.yaml

I think we didn’t have that error before because we excluded test files with:

[tool.setuptools.packages.find]
where = ["."]
exclude = ["docs", "tests"]

So this either needs the equivalent config for Hatchling or config for check-wheel-contents to skip the cassette files.

I think it’s probably better to continue not including test files, but I don’t have especially strong feelings about it. Either approach is probably OK if you have strong feelings the other way.

@Mr0grog
Copy link
Member

Mr0grog commented Dec 4, 2023

I’m new to build, but is it correct that it’s making a second isolated environment and building a wheel even though it already built an sdist and a wheel?

Yes. From python -m build --help:

Yeah, I had read that, and was reacting to the text:

By default, a source distribution (sdist) is built from {srcdir} and a binary distribution (wheel) is built from the sdist.

While the log output made it look like it was building once from the sdist and then again from source:

* Building wheel from sdist
...
* Building wheel...

But I just started digging into the warning on a simplified repo using setuptools instead of hatchling, and setuptools does a lot more logging — I can see now that the * Building wheel... log line is part of the * Building wheel from sdist code path, not a second, separate build, which is what I had thought it was before.

@Mr0grog
Copy link
Member

Mr0grog commented Dec 4, 2023

That also makes it clearer what’s causing the warning, and why we only see it when doing build . and not during build --sdist or build --wheel:

  1. Build builds an sdist from the working directory. setuptools-scm finds the version from git — no problem, no warnings.

  2. To build the wheel, build unpacks the sdist into a temp directory, and then builds a wheel. When it tiggers setuptools-scm on the unpacked sdist, it’s not in a git checkout anymore (so can’t get info from git) so it looks at .git_archival.txt. But since the sdist was not packed up by doing git archive, that file doesn’t have any git metadata in it, and so setuptools-scm logs the warning.

  3. Running python -m build --wheel doesn’t have this problem, because that builds the wheel from the working git directory instead of from the sdist.

Anyway, this makes sense now, although it is kind of a crappy situation. Do you know if there’s a way to have build (or hatchling?) run the right git machinery to get the right values substituted into .git_archival.txt when building the sdist? It seems like that’s the right fix here, since you’d theoretically want that file correctly populated in the sdist as if it were an actual git archive.

@Mr0grog
Copy link
Member

Mr0grog commented Dec 4, 2023

Or alternatively: maybe the right way to do this is to build with:

# From the root of the repo:
git archive --output ../build_dir/source_archive.tar <tag-to-build>
cd ../build_dir
tar -xf source_archive.tar
rm source_archive.tar
python -m build .

That way you are building the sdist with a properly archived/archive-ready file tree.

Copy link
Member

@Mr0grog Mr0grog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, so to summarize:

  • It’s clear to me now why we are getting the warning. I think it’s pointing us to a legit problem, although not a critical one: our builds/distributions have placeholder/template files in them that they probably should not. There are a bunch of solutions here, but I think either of these two make the most sense:

    1. Write a build script that builds from an archive (see above) so the .git_archival.txt file is correctly populated in our builds. (Should be a script and not just commands in the CircleCI config file so we can use it in CI and whenever someone is building a release [someday I’d like to script the whole release process in CI, but I think that’s out of scope here].)
    2. Exclude the file from builds so it’s not present or visible to hatch-vcs/setuptools-scm.
  • We also need to fix the build error by excluding test files (preferable) or by not checking those files when validating the built wheel with check-wheel-contents.

@Mr0grog Mr0grog added this to the v0.5.0 milestone Dec 4, 2023
@Mr0grog
Copy link
Member

Mr0grog commented Dec 4, 2023

Re: “exclude the file from builds,” I tripped over this pyproject.toml snippet today while reading about something else, and I think it’s what we’d want to use here:

[tool.hatch.build.targets.sdist]
exclude = [".git_archival.txt"]

(The same article also suggests removing test files from wheels but not sdists, which I hadn’t thought of before. Interesting.)

@danielballan
Copy link
Contributor Author

Let me a little digging around some other projects and the behavior of their builds. I wonder if (i) is better, though I agree you have found the right implementation of (ii) if we go that way.

someday I’d like to script the whole release process in CI, but I think that’s out of scope here

Happy to tackle this next.

I usually include the tests. The “web deployment” side of Python, interested in small containers and fast CD, pushes to keep the tests out, while the “science” side of Python, where end users also sometimes dip into light development, almost always bundles them.

The sdist/wheel breakdown is a reasonable compromise.

I am on international travel for work; I left a note to resume this Monday when I am back at my keyboard.

@Mr0grog
Copy link
Member

Mr0grog commented Dec 5, 2023

That all makes sense to me!

I am on international travel for work; I left a note to resume this Monday when I am back at my keyboard.

Sounds good. I’m hoping to include this in a v0.5.0 release before the end of the year, but otherwise there’s no rush. (Besides this I want to do some final rate limiting cleanup for #137 and to finally get thread safety by dropping requests, but we’ll see; I may be being to hopeful there.)

someday I’d like to script the whole release process in CI, but I think that’s out of scope here

Happy to tackle this next.

👍

@danielballan
Copy link
Contributor Author

Reading a bit, I agree that excluding .git_archival.txt from the bdist is the best way to resolve these warnings. Commits added.

Googling the current error in CircleCI, I found a post from one @Mr0grog: https://discuss.circleci.com/t/upgrading-to-the-convenience-image-breaks-everything/42746

Do we need to further adjust our cache key here?

@Mr0grog
Copy link
Member

Mr0grog commented Dec 10, 2023

Googling the current error in CircleCI, I found a post from one @Mr0grog: https://discuss.circleci.com/t/upgrading-to-the-convenience-image-breaks-everything/42746

Do we need to further adjust our cache key here?

🤣 Yep, I think that’s probably all you need to do.

@danielballan
Copy link
Contributor Author

Last issue:

dist/wayback-0.4.5.dev12+gcc638d8-py3-none-any.whl: W002: Wheel contains duplicate files:
  wayback/tests/cassettes/test_get_memento_handles_non_utc_datetime.yaml
  wayback/tests/cassettes/test_get_memento_with_string_datetime.yaml

@danielballan
Copy link
Contributor Author

Indeed these files have the same contents, but I think we should not assume that they always will forever, so the right thing here to make an exemption in the check:

$ md5sum wayback/tests/cassettes/test_get_memento_handles_non_utc_datetime.yaml wayback/tests/cassettes/test_get_memento_with_string_datetime.yaml
1ede1c2e263c5b56731d1c29156f9784  wayback/tests/cassettes/test_get_memento_handles_non_utc_datetime.yaml
1ede1c2e263c5b56731d1c29156f9784  wayback/tests/cassettes/test_get_memento_with_string_datetime.yaml

@Mr0grog
Copy link
Member

Mr0grog commented Dec 11, 2023

Huh, I would have thought that would be solved by 7be22d4. Is that only addressing Python files, or does it need to be wayback/tests/**/*?

@Mr0grog
Copy link
Member

Mr0grog commented Dec 11, 2023

Oh, no, I think it needs to be the wheel target:

[tool.hatch.build.targets.wheel]

@danielballan
Copy link
Contributor Author

Ah, indeed. Pushed.

@danielballan danielballan marked this pull request as ready for review December 11, 2023 19:01
Copy link
Member

@Mr0grog Mr0grog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woohoo, successful builds! 🎉 Is more-or-less ready to merge into #144?

Also one question: Does MANIFEST.in still get used (I think that is really a setuptools thing)? If no, we should probably also delete it here.

And a couple quick notes inline.

pyproject.toml Outdated Show resolved Hide resolved
pyproject.toml Outdated Show resolved Hide resolved
@Mr0grog
Copy link
Member

Mr0grog commented Dec 11, 2023

Oh, I also wanted to ask: what pushed you towards excluding .git_archival.txt rather than building the sdist from an archive?

I agree that excluding .git_archival.txt from the bdist is the best way to resolve these warnings.

@danielballan
Copy link
Contributor Author

Your argument made sense, I just look around a bit at some projects and confirmed that I could not find any examples that could convince me otherwise.

pyproject.toml Outdated Show resolved Hide resolved
Copy link
Member

@Mr0grog Mr0grog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sweet, this looks good! I uploaded to test.pypi.org and then installed the uploaded package both from the wheel and from the source and everything seemed to work. 🎉

I also did a quick test of removing MANIFEST.in, and it makes no different to the content of the sdist or wheel, so I will drop that over in #144 after merging this into it.

@Mr0grog Mr0grog merged commit 30e4243 into edgi-govdata-archiving:can-we-drop-setuppy-lets-find-out Dec 16, 2023
14 checks passed
@Mr0grog
Copy link
Member

Mr0grog commented Dec 16, 2023

@danielballan would a good quick followup here be moving /wayback to /src/wayback?

@danielballan danielballan deleted the can-we-drop-setuppy-lets-find-out branch December 17, 2023 01:09
Mr0grog pushed a commit that referenced this pull request Dec 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

2 participants