Misleading description of poetry #506

Closed
sdispater opened this issue May 21, 2018 · 8 comments

@sdispater

poetry for Python developer focused components that are designed primarily for publication to a Python package index (pipenv deliberately avoids making the assumption that the application being worked on will support distribution as a pip-installable Python package, while poetry based applications rely explicitly on their Python packaging metadata to describe their application structure and dependencies)

As the author of Poetry, I find this somewhat misleading.

Like I said on Twitter, there is a misconception about Poetry and what its purpose is. Poetry is a tool to manage Python projects, whether they are applications or libraries.

poetry based applications rely explicitly on their Python packaging metadata

This is not true: the only metadata required by Poetry are the project name, its version, and one or more authors. That's it. I made this choice because, for me, every project should have a name and a version.
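
As an illustration, a minimal [tool.poetry] section might look like the sketch below, assuming only the fields described above (name, version, authors); the values are placeholders:

[tool.poetry]
name = "my-app"
version = "0.1.0"
authors = ["Jane Doe <jane@example.com>"]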

The only difference when managing an application compared to a library is that the lock file should be checked into version control and you won't have to use the build and publish commands. Nothing more.

Basically, the workflow you have with pipenv can be replicated with poetry:

Installing packages for your project

poetry add requests

This displays output like this:

Using version ^2.18 for requests

Updating dependencies
Resolving dependencies... (0.1s)


Package operations: 5 installs, 0 updates, 0 removals

Writing lock file

  - Installing certifi (2018.4.16)
  - Installing chardet (3.0.4)
  - Installing idna (2.6)
  - Installing urllib3 (1.22)
  - Installing requests (2.18.4)

Using installed packages

poetry run python main.py
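
For context, the poetry add command shown above also records the new constraint in pyproject.toml, roughly like this (a sketch based on the ^2.18 constraint reported in the output; the exact file contents may differ):

[tool.poetry.dependencies]
requests = "^2.18"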

So it seems that this was written by someone who has not used Poetry and has preconceptions about what it does or doesn't do. So I thought I would try to clarify things a bit.

@theacodes theacodes added type: bug A confirmed bug or unintended behavior state: needs clarification labels May 21, 2018
@theacodes
Member

Thanks, @sdispater - do you have any specific wording you'd like to suggest that we use when linking to poetry?

@ncoghlan
Member

The config file for poetry is pyproject.toml - that's a Python-specific packaging metadata format, and the main reason to use it is if you want to interoperate with other tools in the Python ecosystem that know how to read it.

The reason pipenv deliberately doesn't use pyproject.toml is because we explicitly want to separate the "PyPI consumer" case (where components are consumed from PyPI, either for publication through other means, or not for publication at all, but the project as a whole doesn't conform to the requirements of a Python sdist), from the "PyPI publisher" case (where projects are structured in a way that conforms to either the legacy sdist format or the newer one defined in PEPs 517 & 518, and hence can be built automatically by Python-specific build tools).

Historically that distinction was framed as requirements.txt vs setup.py; now it's framed as Pipfile vs pyproject.toml. But it's not an accident, and it's not an oversight - it's a deliberate choice.

The folks most likely to find that choice inconvenient are those who would prefer to define everything in pyproject.toml and not have any other config files, which is why the reference out to poetry at the end of the tutorial is framed the way it is: it's not the entirety of the potential audience for poetry, but it is the portion of that audience that is least likely to be happy with the design decisions made in pipenv.
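
As a rough illustration of that split (a sketch with placeholder names and versions, not taken from any particular project), the abstract constraint lives in the publisher-oriented file while the concrete pin lives in the integrator-oriented one:

# pyproject.toml (publisher side): abstract constraint
[tool.poetry.dependencies]
requests = "^2.18"

# Pipfile (integrator side): concrete selection
[packages]
requests = {version = "==2.18.4"}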

@sdispater
Author

@ncoghlan Thanks for your detailed explanation.

I understand the distinction the PyPA makes between the two files, but the way it's written leads one to believe that the workflow pipenv provides is not reproducible with poetry, which is not true.

Also, it seems to me that this separation puts the Python ecosystem in the minority. Most other languages use a single configuration file (with an associated lock file): cargo in Rust, composer in PHP, npm/yarn in JavaScript, pub in Dart, maven in Java, and even Go seems to be headed this way. So this will make it harder for people coming from these languages to start Python projects, in my opinion.

@dstufft
Member

dstufft commented May 22, 2018

Honestly, it's not even really about which file it goes in; you could implement Pipfile as some entries under [tool.pip] or [tool.pipenv] and it wouldn't really matter, although it would be less powerful to do it there (because you can have only one set of mappings there, whereas you can (theoretically?) have multiple Pipfiles).
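
As a hypothetical sketch of that embedding (the [tool.pipenv] table here is illustrative only; pipenv doesn't actually read these tables), it might look something like:

[tool.pipenv.packages]
requests = "*"

[tool.pipenv.dev-packages]
pytest = "*"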

It really comes down to abstract vs concrete dependencies, and, for the library development use case, extra dependencies which are not part of the dependencies of my library. Not to "pick" on poetry here (pbr used to get this same distinction wrong too, imo, and was the impetus for my blog post).

If you look at a pyproject.toml like:

[tool.poetry.dependencies]
cleo = { git = "https://github.com/sdispater/cleo.git", branch = "master" }

Having that git dependency there makes this something that can't be uploaded to PyPI, because that points to a concrete location and dependencies inside of a package are only abstract until they're fully resolved.

The same is also true if you're using the power of other things that can exist in Pipfile or requirements.txt: things like path-based dependencies, or specifying which source to install from (PyPI, TestPyPI, internal mirrors, etc.).
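
For example, a Pipfile can carry this kind of concrete, environment-specific information (a sketch; the source name, URL and path are placeholders), none of which belongs in a published package's abstract metadata:

[[source]]
name = "testpypi"
url = "https://test.pypi.org/simple"
verify_ssl = true

[packages]
requests = {version = "*", index = "testpypi"}
"mylib" = {path = "./vendor/mylib", editable = true}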

It also forces non-dependencies into your dependency metadata in the library case. While this isn't the end of the world, it is generally useless information for end users. The common way to implement this is something like a dev extra, but those aren't really dependencies of the project itself; it's just extraneous stuff that the developers of the library happen to use to develop it.

Finally, it requires some base set of metadata like name/version. When I was writing Warehouse I originally started out using a setup.py, but eventually got rid of it, largely because the concept of a version number made zero sense for Warehouse, and we were only ever bothering to editable-install it, so munging sys.path was the de facto way of managing it. Thus it became simpler to drop the packaging metadata completely rather than chuck in some empty nonsense value and confuse people into thinking they could package it up as a wheel and install it.

Cargo and Go are a bit of a misnomer here, because they're solving somewhat of a different problem -- they're not working on a deployment story, they're working on a build story, so the concept of abstract dependencies doesn't fully apply there. The other languages are working on similar problems, so they're more directly relevant. They ultimately come down to whether you think it's more confusing to have constructs that are syntactically valid but which cannot be used in certain conditions (e.g. can't be used if publishing to PyPI), or whether it's more confusing to have independent files (or at least areas within the same file) where the constructs are more constrained to solving a specific problem, which allows you to avoid misusing constructs that can't be used in a particular context.

Effectively, it comes down to whether two specific, well-fitted formats make more sense, or one general format that isn't quite perfect for either case.

I do take issue with this statement though:

Finally, the Pipfile is just a replacement from requirements.txt and, in the end, you will still need to populate your setup.py file (or setup.cfg) with the exact same dependencies you declared in your Pipfile.

If you're duplicating your dependencies between Pipfile/requirements.txt and setup.py/setup.cfg, then you're using the tool incorrectly. Both Pipfile and requirements.txt have a mechanism for installing from a path, so you'd use something like:

[packages]
"mylib" = {path = ".", editable = true}

and then all of your duplication is now gone.

@sdispater
Author

If you look at a pyproject.toml like:

[tool.poetry.dependencies]
cleo = { git = "https://github.com/sdispater/cleo.git", branch = "master" }

Having that git dependency there makes this something that can't be uploaded to PyPI, because that points to a concrete location and dependencies inside of a package are only abstract until they're fully resolved.

That's why, if you want to depend on a git dependency during development, you would put it in the dev-dependencies section.

[tool.poetry.dependencies]
cleo = "^0.6.6"

[tool.poetry.dev-dependencies]
cleo = { git = "https://github.com/sdispater/cleo.git", branch = "master" }

That way, you can have concrete dependencies and abstract dependencies all in one file.

@dstufft
Member

dstufft commented May 22, 2018

@sdispater Development dependencies are not the only case where you'll want to depend on "concrete" dependencies (I had assumed dev-dependencies mapped to an extra, but maybe not?).

To be clear, I don't think either answer is right or wrong. I was mostly trying to delve into what the actual differentiating thing (in my mind) between the two approaches are. The two files things is a distraction, since you could just as easily implement Pipfile inside of pyproject.toml.

Fundamentally, the difference comes down to whether you try to reuse the same constructs for publication as for consumption (because the Venn diagram of what you want to do with these two things overlaps a lot, but is not a perfect circle), or whether you want them separate because you think they're different enough that they deserve their own targeted tooling.

This is mostly targeted at @ncoghlan, because I think the "does it belong in pyproject.toml or another file" question is really masking the real question. As a thought exercise, Pipfile could have been implemented like:

[build-system]
requires = ["flit"]
build-backend = "flit.buildapi"

[tool.flit.metadata]
module = "foobar"
author = "Sir Robin"
author-email = "robin@camelot.uk"
home-page = "https://github.com/sirrobin/foobar"
requires = ["requests (>=2.6)",
      "configparser; python_version == '2.7'"
]

[tool.pipfile.packages]
"foobar" = {path = ".", editable = true}

Here the abstract dependencies get defined in [tool.flit], and the "concrete-ization" of those gets defined in [tool.pipfile]. This is really no different than having them in separate files, except that putting them in pyproject.toml instead of in a dedicated file is less flexible and less powerful.

The real difference between the two approaches is whether you reuse the same constructs for abstract and concrete dependencies or not. There isn't a right answer here, there are just different tradeoffs.

@ncoghlan
Member

Thanks for clarifying your concerns with the current reference wording @sdispater - I'll put together a PR that hopefully clarifies things, and we can iterate on some specific wording there.

@dstufft Agreed on the fact that we could have nested Pipfile inside pyproject.toml at a technical level (putting aside the question of when the respective formats were invented), and that the core semantic distinction is between "I am the system integrator for these dependencies" (concrete dependencies) and "I am a component publisher declaring known constraints that someone else will need to decide how to satisfy at integration time" (abstract dependencies).

@ncoghlan
Member

#508 is the PR with the rewording. The commit message is longer than the docs are, and it does focus on the file structure, as that seems to be what folks sometimes find most counter-intuitive about the pipenv model (i.e. the fact that if you use pipenv to manage a PEP 517/518 based repo, you'll end up with both Pipfile and pyproject.toml at the top level).

While concrete vs abstract dependencies is the explanation for why we have that split, I don't think it's why folks sometimes object to having two files: I think that's because for their particular use case, they're in a situation where they don't lose any clarity by dropping back to pyproject.toml as their sole top level dependency declaration file.
