Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include .lark files in package data #1308

Merged
merged 1 commit into from
Jul 25, 2023
Merged

Conversation

ptrcnull
Copy link
Contributor

Fixes #1307

@MegaIng
Copy link
Member

MegaIng commented Jul 22, 2023

In comparison to the old setup.py file, it looks like *.md is a pattern that was included back then but isn't any more. Do we still need that?

lark/setup.py

Line 20 in 7d9cfa6

package_data = {'': ['*.md', '*.lark'], 'lark': ['py.typed']},

@erezsh
Copy link
Member

erezsh commented Jul 22, 2023

@MegaIng I don't know. It looks okay in my builds, but apparently I can't rely on that.

@MegaIng
Copy link
Member

MegaIng commented Jul 22, 2023

If we want them to be included, they need to be added, even if just for documentation.

@erezsh
Copy link
Member

erezsh commented Jul 22, 2023

Readme.md is already mentioned by name, and docs/* should take care of the rest.

Also, right now we don't include examples/README.rst, or any of the examples subfolders.

@MegaIng
Copy link
Member

MegaIng commented Jul 22, 2023

Readme.md is already mentioned by name, and docs/* should take care of the rest.

Are we looking at the same file? I don't see either of those included

@erezsh
Copy link
Member

erezsh commented Jul 22, 2023

They are in MANIFEST.in

P.S. maybe we can just require setuptools-scm, and then we won't have to maintain MANIFEST.in as often.

@MegaIng
Copy link
Member

MegaIng commented Jul 22, 2023

Aha right.

I don't have experience with setuptools-scm, so no real opinion on that.

But if we are using MANIFEST.in, shouldn't we add the grammars there instead of as package data?

@erezsh
Copy link
Member

erezsh commented Jul 24, 2023

@KOLANICH @henryiii Do you guys have an opinion about this? (manifest vs package-data vs scm)

@KOLANICH
Copy link
Contributor

  1. In order to include the lark files tool.setuptools.include-package-data=true should be enough. tool.setuptools.package-data section is needed when it is false. When it is true, it captures all files in dirs having __init__.py automatically. In my projects I rely on this behaviour, and the files that shouldn't be within wheels are just out of the main package. Since it is already true, the section can be just purged.

  2. 👍 for using "setuptools_scm>7". 7 because since that version it can work with sdists generated from git repos.

  3. MANIFEST.in is relevant only to sdists. But when there is no -s or -w, build builds an sdist first, and then builds a wheel from it.

  4. about the readme, I suspect it may make sense to use the file as a readme, rather than hardcode a string into pyproject.toml.

  5. you cannot include files that are not a part of a package into wheels (except the ones that go into metadata automatically). And it shouldn't be done. I think the HTML docs and tests shouldn't be within wheels.

  6. I also propose to eliminate the manual enumeration of the subpackages (tool.setuptools.packages) and just use

[tool.setuptools.packages.find]
include = ["lark", "lark.*"]

@KOLANICH
Copy link
Contributor

  1. Fix: setuptools_scm[toml]>7

@henryiii
Copy link
Contributor

henryiii commented Jul 24, 2023

My preference is to use hatchling. :) That includes everything that is not gitignored by default, and can be configured easily in pyproject.toml without a separate MANIFEST.in file. It's also simpler and faster than setuptools.

For setuptools, package-data and MANIFEST.in are not interchangeable. You place things that go into the SDist into MANIFEST.in. You place things that go into the wheel in package-data. For simplicity, it will also be added to your MANIFEST if it's in package-data (since you need it in the SDist to get it in the wheel), but the question is, does it go in the wheel? If so, then use package-data. Otherwise use MANIFEST.in or setuptools_scm.

Not a huge fan of setuptools_scm's file finder. You can't turn it off if it's installed, so it affects every other package, even if they have no idea it exists. This doesn't affect isolated builds, of course, but it does affect systems like Spack where there's only one build environment. However, it's a very minor worry, and I do use setuptools_scm for versioning lots of places, which technically installs the file finder too (at least for now).

@erezsh
Copy link
Member

erezsh commented Jul 25, 2023

Thanks everyone for weighing in. It was helpful.

It sounds like this change is correct and improves Lark, so I will merge this PR.

I think all of the suggestions you raised have merit, and will be more convenient. Hatchling sounds great, at least in theory. I would be happy to get rid of MANIFEST.in. But it also sounds like more work, and another risk of breaking people's builds.

Same goes for setuptools_scm, and this whole issue started because of its odd behavior. But, it might turn out to be an easy win.

Or we can leave things as they are, i.e. hopefully working but a little clumsy.

I don't know which choice is the right one, so if you feel strongly about it, you can open a new PR. If it makes our configuration shorter and easier to manage, I'll be happy to accept it.

@erezsh erezsh merged commit 656334c into lark-parser:master Jul 25, 2023
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Wheels built using the new pyproject file don't include .lark files
5 participants