Replacing the URL of a source (e.g. PyPI) at the global level #1632

Open
2 tasks done
JacobHenner opened this issue Nov 26, 2019 · 55 comments
Labels
area/sources Releated to package sources/indexes/repositories kind/feature Feature requests/implementations

Comments

@JacobHenner

JacobHenner commented Nov 26, 2019

  • I have searched the issues of this repo and believe that this is not a duplicate.
  • I have searched the documentation and believe that my question is not covered.

Feature Request

Similar to one of the proposals in #1070 (which was recently marked stale), Poetry should allow the user to override the default repository URL (PyPI). The user should be able to do this without modifying pyproject.toml.

In certain environments (e.g. corporate networks) PyPI is unavailable, but a mirror exists. These users should be able to specify the address of the mirror without modifying project files, as the mirror settings are irrelevant to contributors in different environments. Similarly, if a mirror user adds a dependency, the generated lock file should not list the user's mirror as the source. The source should remain the default (which in most cases would refer to standard PyPI).

This feature exists in pipenv, see pypa/pipenv#2075 (where the need for this functionality is described in greater detail) and pypa/pipenv#2281.

@JacobHenner JacobHenner added the kind/feature Feature requests/implementations label Nov 26, 2019
@countergram

This is essential for many business uses, not simply when PyPI is unavailable but also in any case where the organization has its own libraries (not uncommon). Note that since some private repo tools (e.g. Nexus) use basic auth URLs, putting the repo URL into a project config file is absolutely inappropriate and a global config or environment variable (e.g. pip.conf, PIP_INDEX_URL) is necessary.

@cjw296

cjw296 commented Nov 27, 2019

#625 also seems related.

Something I tried, which might be nice to make work:

poetry config repositories.pypi https://.../+simple/

@vlcinsky
Contributor

@sdispater, I wonder whether the elaboration of the requested feature in #1070 is usable as is or needs an update. If it does, I volunteer to join forces with one or a few others, have a call, and try to move this request forward, as this is one of two showstoppers for our usage of poetry (the other is managing versions of the resulting package, but I definitely do not want to discuss that here).

@lovepocky

a simple patch for ci/cd:

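# install dependencies from lock file
COPY pyproject.toml poetry.lock /opt/app/

# use "|" as the sed delimiter because the URLs themselves contain "/";
# both variables must be defined earlier in the build (e.g. via ARG)
RUN sed -i "s|${origin_pypi_url}|${private_pypi_cache_url}|g" poetry.lock
RUN sed -i "s|${origin_pypi_url}|${private_pypi_cache_url}|g" pyproject.toml

RUN poetry install -vvv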

@bjoernpollex
Contributor

bjoernpollex commented Dec 20, 2019

@lovepocky Wouldn't that break the content-hash in poetry.lock? I think this might cause poetry to refresh the lock file.
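
To illustrate the concern: a lock file can record a digest of the dependency-relevant parts of pyproject.toml, so editing a source URL there changes the digest. The sketch below is a simplified stand-in and not Poetry's actual implementation; the function name `content_hash`, the chosen keys, and the mirror host are assumptions made for illustration.

```python
import hashlib
import json

# Hypothetical helper: NOT Poetry's real algorithm, only the general idea of
# hashing the dependency-relevant sections of a project definition.
RELEVANT_KEYS = ("dependencies", "source")  # assumed for illustration


def content_hash(project: dict) -> str:
    """Return a SHA-256 hex digest over the selected project sections."""
    relevant = {key: project.get(key) for key in RELEVANT_KEYS}
    payload = json.dumps(relevant, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


original = {
    "dependencies": {"requests": "^2.22"},
    "source": [{"name": "pypi-cache", "url": "https://pypi.org/simple/"}],
}
# Same project after a sed-style URL replacement (hypothetical mirror host).
patched = {
    "dependencies": {"requests": "^2.22"},
    "source": [{"name": "pypi-cache", "url": "https://mirror.example.com/simple/"}],
}

hashes_differ = content_hash(original) != content_hash(patched)
print(hashes_differ)
```

Under these assumptions, a digest stored before the replacement no longer matches the patched project file, which is the situation the question above is about.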

@sdispater
Member

Poetry needs the url information of a dependency for a private repository. Otherwise, it cannot guarantee the determinism of the lock file since two files, even with the same name, may not have the same information.

And if it's a question of not storing the private index credentials in the pyproject.toml, only the base url should be put in the pyproject.toml file. The credentials should be configured separately via the config command or via environment variables, see https://python-poetry.org/docs/repositories/#configuring-credentials
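
As a rough sketch of the environment-variable route, the helper below builds the variable names for a given source, assuming the `POETRY_HTTP_BASIC_<NAME>_USERNAME` / `_PASSWORD` convention described in the linked documentation (uppercase name, dots and dashes replaced by underscores). The helper `credential_env_names` and the source name `my-org` are hypothetical; check the docs for your Poetry version before relying on the exact names.

```python
from typing import Tuple


def credential_env_names(source_name: str) -> Tuple[str, str]:
    """Return the (username, password) variable names for a source.

    Assumes the naming convention from Poetry's documentation; this helper
    itself is illustrative and not part of Poetry.
    """
    key = source_name.upper().replace("-", "_").replace(".", "_")
    return (
        f"POETRY_HTTP_BASIC_{key}_USERNAME",
        f"POETRY_HTTP_BASIC_{key}_PASSWORD",
    )


user_var, password_var = credential_env_names("my-org")
print(user_var)
print(password_var)
```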

@JacobHenner
Author

Poetry needs the url information of a dependency for a private repository. Otherwise, it cannot guarantee the determinism of the lock file since two files, even with the same name, may not have the same information.

The idea here is the private repo specified as the override will be a PyPI mirror. The packages served by the mirror will be exact copies of the ones from https://pypi.org/, without any modifications. Anything else belongs in a separate repo, with URLs included explicitly.

@vlcinsky
Contributor

Poetry needs the url information of a dependency for a private repository.

I agree. This contributes to usability of poetry as it provides complete information where to install from.

it cannot guarantee the determinism of the lock file since two files, even with the same name, may not have the same information.

{file = "yarl-1.4.2.tar.gz", hash = "sha256:58cd9c469eced558cd81aa3f484b2924e8897049e06889e8ff2510435b7ef74b"}

I thought that the hash above is calculated from the package file content and does not depend on the filename or URL, so it allows checking that two files (even from different URLs) provide exactly the same information.

Treat source url the way git treats remote configuration

The analogy is not perfect, but it is very close to the use case.

git allows you to clone a repository with an initial remote configured, but it is easy to change the remote to another git server (e.g. from GitHub to GitLab, or to an alternative repo name) and everything will still work. If I configure the remote badly, git will complain immediately at the first command dealing with the remote server, because the commit hashes will not match.

I hope poetry will one day allow me to keep the existing pyproject.toml and poetry.lock untouched and accept an alternative URL (e.g. configured via an env variable) of my private pypi for a given source (name), as a sort of "temporary git remote reconfiguration".

If my alternative private pypi URL serves exactly the same packages for my installation (checked by comparing hashes), everything shall run as usual; if the alternative URL provides different package content, it shall fail.

Such level of determinism would still provide all the service I appreciate from poetry today and would provide enough flexibility to fit common CI/CD processes.
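
The check proposed above can be sketched in a few lines, assuming lock entries carry a value of the form `sha256:<hex>` as in the yarl example. The function name `matches_lock_entry` and the sample bytes are hypothetical; a real check would hash the downloaded archive.

```python
import hashlib


def matches_lock_entry(content: bytes, lock_hash: str) -> bool:
    """Check downloaded bytes against an '<algorithm>:<hex>' lock value."""
    algorithm, _, expected = lock_hash.partition(":")
    digest = hashlib.new(algorithm, content).hexdigest()
    return digest == expected


# Stand-in bytes; a real check would read the archive fetched from the mirror.
package_bytes = b"example package archive"
lock_hash = "sha256:" + hashlib.sha256(package_bytes).hexdigest()

# Identical bytes pass no matter which URL served them ...
same_from_mirror = matches_lock_entry(package_bytes, lock_hash)
# ... while modified bytes are rejected.
tampered = matches_lock_entry(package_bytes + b"!", lock_hash)
print(same_from_mirror, tampered)
```

The URL never enters the comparison, which is the point of the git-remote analogy: only the content has to match.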

@cjw296

cjw296 commented Dec 23, 2019

As above, my use case is a private pypi mirror. At some stage, the public pypi may even be firewalled off, and it doesn't feel right to need a different pyproject.toml behind the firewall than in front of it, for the same code.

@jhbuhrman

jhbuhrman commented Jan 7, 2020

I fully agree with #1632 (comment).

IIRC poetry is using pip already under the hood for a certain part of its functionality. Wouldn't it be sufficient if poetry would simply adhere to the pip.conf (Unix-derived) or pip.ini (Windows) [global] configuration items index, index-url, and trusted-host?
(see https://pip.pypa.io/en/stable/user_guide/#config-file)

@NateScarlet

@jhbuhrman
#1554 (comment) said poetry is not going to respect pip.ini

@mcouthon
Contributor

mcouthon commented Apr 5, 2020

I'm a little confused. I would've assumed that this would've been sufficient:

poetry config repositories.REPO_NAME https://artifactory.XXX.com/artifactory/api/pypi

But it seems that setting the config globally doesn't negate the need for setting the URL in each pyproject.toml file. Is that by design or is it a bug? If it's by design, then what's the rationale behind it?

@swist

swist commented Jul 20, 2020

This feature would be very useful for scenarios where jwt for authenticating with the registry is prepended to the beginning of the repo url, AWS codeartifact for example builds repo urls like so:

https://aws:<JWT>@<domain>-<aws-account>.d.codeartifact.eu-west-1.amazonaws.com/pypi/python/simple/

The current setup, which requires poetry users to define this as a static URL inside pyproject.toml, makes it impossible to use (because these are sessioned to ~12 hours; the JWT gets re-rolled afterwards).

I see the workaround to the effect of:

re log-in whenever the session expires

but that still requires me to set the url on every project rather than once and for all for my docker image builder

@vlcinsky
Contributor

@swist I think, that in this case you will manage with existing poetry as the part in front of @ is username (aws) and password (<JWT>), which can be edited out of pyproject.toml file. poetry will store it either in file ~/.config/pypoetry/auth.toml or in system credential store such as in seahorse (I am working in Debian Buster).

Just configure url in form of https://<domain>-<aws-account>.d.codeartifact.eu-west-1.amazonaws.com/pypi/python (for me the form without the /+simple suffix works)
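
A minimal sketch of that separation using only the standard library: it splits an authenticated URL into a credential-free URL (suitable for a project file) plus the username and password (to be stored via config or environment). The function name `split_credentials`, the host, and the token value are placeholders.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit, urlunsplit


def split_credentials(url: str) -> Tuple[str, Optional[str], Optional[str]]:
    """Return (url_without_credentials, username, password)."""
    parts = urlsplit(url)
    # Keep everything after the last "@" in the network location.
    host = parts.netloc.rpartition("@")[2]
    clean = urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))
    return clean, parts.username, parts.password


# Placeholder domain and token, not real values.
clean_url, username, password = split_credentials(
    "https://aws:example-token@example-domain.d.codeartifact.eu-west-1.amazonaws.com/pypi/python/simple/"
)
print(clean_url)
print(username)
```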

@swist

swist commented Jul 23, 2020

Turns out there's a magic envvar (should have finished reading the docs) that does the auth. Still doesn't quite solve the problem when you're accessing the same repository via different vpc endpoints (for example building your images in multiple clusters but pushing to same registry) - that would still require a rewrite of pyproject.toml (and the lockfile I suppose) at build time

@m1hawkgsm

Turns out there's a magic envvar (should have finished reading the docs) that does the auth. Still doesn't quite solve the problem when you're accessing the same repository via different vpc endpoints (for example building your images in multiple clusters but pushing to same registry) - that would still require a rewrite of pyproject.toml (and the lockfile I suppose) at build time

@swist Do you mind sharing how exactly you're using Poetry with CodeArtifact? Ignoring the rolling creds bit (I'm aware of it), and assuming a hard coded or configured set of creds, that's fine. I'm having a hard time understanding how to get Poetry to work without getting 403's and such (and yes, I've seen the docs for using config, env vars, etc).

Apologies for piggybacking off this thread, I'd message directly or open an issue but looks like you have something already :)

@swist

swist commented Sep 17, 2020

@m1hawkgsm turns out there are two separate urls you need to use.

If you want to pull you need to set the url to be

[[tool.poetry.source]]
name = "my_org"
url = "https://my_org-my_account_id.d.codeartifact.region.amazonaws.com/pypi/repo_name/simple/"

But if you want to push, you want to do the following CLI call:

poetry config repositories.myorg https://my_org-my_account_id.d.codeartifact.region.amazonaws.com/pypi/repo_name

@bjoernpollex-sc

Is there any update on this? On the one hand, this ticket is still open, on the other hand, this comment seems to hint that this might never be implemented.

@brandon-leapyear

brandon-leapyear commented Nov 6, 2020

This is an old work account. Please reference @brandonchinn178 for all future communication


As another data point, I tried to hack around this by doing find/replace for all mentions of pypi.org with our Nexus URL in POETRY_INSTALL/lib/poetry/repositories/pypi_repository.py. It turns out that Nexus doesn't currently support the package JSON endpoint, so using Nexus would require using the LegacyRepository.

Long story short, it would be great if poetry could allow overriding the PyPI URL, but also allow specifying if poetry needs to use the legacy endpoint for the repository
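
For context, the two access styles differ in URL shape. The sketch below builds both for a project name: the per-project JSON endpoint as served by pypi.org, and the "simple" index page that legacy-style repositories expose. The helper names are hypothetical, and the Nexus host in the usage line is a placeholder; name normalization follows PEP 503.

```python
import re


def normalize_name(name: str) -> str:
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def json_api_url(name: str, base: str = "https://pypi.org") -> str:
    """URL of the per-project JSON endpoint (the one Nexus was missing)."""
    return f"{base.rstrip('/')}/pypi/{name}/json"


def simple_index_url(name: str, index: str = "https://pypi.org/simple") -> str:
    """URL of the project's page in a simple (legacy-style) index."""
    return f"{index.rstrip('/')}/{normalize_name(name)}/"


print(json_api_url("yarl"))
# Placeholder mirror host for illustration only.
print(simple_index_url("Typing_Extensions", "https://nexus.example.com/repository/pypi/simple"))
```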

Update: seems like I got a workaround working

  1. Edit POETRY_INSTALL_DIR/lib/poetry/factory.py:
@@ -88,6 +88,14 @@

             poetry.pool.add_repository(repository, is_default, secondary=is_secondary)

+        # Support alternate PyPI repository
+        # https://github.com/python-poetry/poetry/issues/1632
+        pypi_legacy_repository = config.get("repositories.pypi-legacy")
+        if pypi_legacy_repository:
+            source = dict(pypi_legacy_repository, name="pypi-legacy")
+            repository = self.create_legacy_repository(source, config)
+            poetry.pool.add_repository(repository, True, secondary=False)
+
         # Always put PyPI last to prefer private repositories
         # but only if we have no other default source
         if not poetry.pool.has_default():
  2. Run poetry config repositories.pypi-legacy NEXUS_URL.com/repository/pypi/simple

@mfriedenhagen

mfriedenhagen commented Nov 25, 2020

Hello,

  • I spent 2 days to evaluate poetry on my workstation and liked it very much.
  • Then I tried to get a build running in our company datacenter and nothing worked because there I may not access pypi.org directly.
  • Sadly enough, even running poetry install -vvv did not reveal the fact, that poetry tries to reach out to "the internet".
  • As a lot of others already stated using an internal mirror of pypi.org should be possible without putting another [[tool.poetry.source]] into each and every project.
  • poetry.lock holds the SHA sums of the wheel's content already, so no one would be able to intermingle here.
  • The proposal of @brandon-leapyear in #1632 (comment) looks very reasonable to me.
  • For pip I place one file into the default Python Docker image:
$ cat ~/.pip/pip.conf
[global]
cert=/etc/ssl/certs/ca-certificates.crt
index-url = https://repo.example.com/artifactory/api/pypi/pypi-mam/simple

and am done for good for all projects.
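
For reference, reading that setting out of such a file takes only the standard library. This is a sketch under the assumption that a single config file is inspected; pip itself merges several files and environment variables, which the hypothetical helper `read_global_index_url` does not attempt.

```python
import configparser
from typing import Optional

# Same content as the pip.conf shown above.
PIP_CONF = """\
[global]
cert=/etc/ssl/certs/ca-certificates.crt
index-url = https://repo.example.com/artifactory/api/pypi/pypi-mam/simple
"""


def read_global_index_url(text: str) -> Optional[str]:
    """Return [global] index-url from pip-style config text, if present."""
    parser = configparser.ConfigParser(interpolation=None)  # keep '%' literal
    parser.read_string(text)
    return parser.get("global", "index-url", fallback=None)


index_url = read_global_index_url(PIP_CONF)
print(index_url)
```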

@jhbuhrman

I am giving up on poetry, it is close to unusable in a shielded development environment with a Nexus, and the maintainer does not seem to understand the frequently brought up issues regarding this. This is sad, because I think it has the greatest dependency-resolver around.

@pawamoy

pawamoy commented Nov 25, 2020

Keep the 👍 votes on the issue coming, it could eventually land in the feature roadmap. It's already in the first page of issues when you sort by 👍

You could also take over or upvote this PR #2074 which, to me, is even better than what this feature request is asking for.

@IceTDrinker

IceTDrinker commented Jan 19, 2022

Hello,

Having the global config is great if #4944 is accepted. Now, as mentioned in this comment: #1632 (comment), being able to not set the internal repo url in pyproject.toml (if the private repo config is available) would be very useful as well. Any comment on that, maybe?

Edit: is it possible to also not have the url in the lock file? I know it's a stretch, but we would rather have those urls not committed

@JacobHenner
Author

For the record, #4944 was rejected earlier today, so this issue remains open without a clear solution proposed. I would be interested in working on an alternative proposal to #4944, but I'm not sure if I'll have an opportunity to do so within the next month or so.

@neersighted
Member

For clarity, here's my final comment on that PR:

The project is of course open for contributions, and you are welcome to explore a design and even implementation if you want.

Keep in mind that for complex changes like this, it can often be a process to gain consensus. The design has to be something that is generic (useful to all/doesn't disadvantage some users), maintainable (as we are all volunteers and time is limited), and consistent with the existing design/scope/goals of the project.

Generally before embarking on an ambitious change I would suggest starting with smaller contributions so that you can gain experience with the code base and process. Making large changes without having experience contributing to the project can often end with disappointment as you may not be able to come up with a mergeable design/implementation.

I think @bmarroquin is potentially in a good spot to work on a V2 of this, if there is time, as they have experience contributing to Poetry in the past, as well having obviously thought about this problem space. However, if you are dead set on trying to take this on as your first contribution, I strongly suggest that you at least join Discord and try and workshop concepts there. After coming up with a design that you are happy with and you think would be accepted for merge, I would create an issue describing it for discussion of the specifics.

Keep in mind that this is a very hard problem -- sources are very much coupled to the project level with the current design and architecture of Poetry, and it may be difficult/not desirable to change that. Also, keep in mind that what many people want is disparate despite it sounding similar. Some users are looking to add additional sources to all their projects (and many are in monorepos where monorepo features might make more sense anyway), and others are looking to do some sort of blanket URL replacement.

That URL replacement becomes difficult when you consider that the most common use for this functionality, files.pythonhosted.org, does not follow the typical file layout of a PEP 503 repository. Indeed, most existing proxies operate at the index level and not the individual package file level.

Finally keep in mind that this is complicated by other items that we already intend to implement in Poetry, such as 'lock file aware sdist builds' that we would like to introduce (e.g. the install-ability of your project as a dependency will be affected by any features in this space and needs to be factored in).

Basically, what I'm saying is that this is hard for even a regular contributor to attempt, and all of us are volunteers without particular interest in exploring this feature/problem space. This is much too complex and far-reaching for a drive-by pull request to be very successful -- implementing sources past the project/monorepo level will have to be a thoughtful process and will require a lot of patience and motivation. I don't want to discourage people from contributing, but I do want people to realize this is a lot harder than "why don't you just do X."

Edit: Brett Cannon's blog post on the social dynamics of open source is quite helpful.

@neersighted neersighted changed the title Allow user to override PyPI URL without modifying pyproject.toml Replacing the URL of a source (e.g. PyPI) at the global level Oct 4, 2022
@neersighted neersighted added the area/sources Releated to package sources/indexes/repositories label Oct 4, 2022
@neersighted
Member

Related: #5958

@JacobHenner
Author

JacobHenner commented Oct 14, 2022

I've published poetry-plugin-pypi-mirror, a plugin that allows pypi.org to be replaced by a mirror specified in an environment variable. It's available on PyPI. Hopefully others will find this useful.

The plugin satisfies the original subject of this issue (Allow user to override PyPI URL without modifying pyproject.toml), but it does not satisfy the current subject as it's not intended to handle replacement of arbitrary sources at the global level.

@BaxHugh

BaxHugh commented Nov 11, 2022

I've forked @JacobHenner's plugin: poetry-plugin-use-pip-global-index-url. Instead of specifying the mirror URL in an environment variable, the global.index-url from pip config is used.
This is good for the use case where credentials to a private mirror are managed in the pip config, and possibly change regularly for security reasons.
It's also available on PyPI as poetry-plugin-use-pip-global-index-url.

@mfriedenhagen

@BaxHugh, great to read, did you consider to create a PR in project of @JacobHenner? I like the idea of reusing PIP. Maybe look for the env var and if that one is not set, fall back to pip.conf?
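
The suggested lookup order could look roughly like this. It is only a sketch of the idea, not code from either plugin: the function `resolve_index_url`, the variable name `EXAMPLE_PYPI_MIRROR_URL`, and the mirror hosts are hypothetical, and the pip value is assumed to have been read from pip.conf beforehand.

```python
from typing import Mapping, Optional

DEFAULT_INDEX = "https://pypi.org/simple/"


def resolve_index_url(
    env: Mapping[str, str],
    pip_index_url: Optional[str],
    env_var: str = "EXAMPLE_PYPI_MIRROR_URL",  # hypothetical variable name
) -> str:
    """Prefer the environment variable, then pip's setting, then PyPI."""
    return env.get(env_var) or pip_index_url or DEFAULT_INDEX


# Environment variable wins when set.
from_env = resolve_index_url(
    {"EXAMPLE_PYPI_MIRROR_URL": "https://mirror.example.com/simple/"},
    "https://pip-mirror.example.com/simple/",
)
# Otherwise fall back to the value taken from pip.conf.
from_pip = resolve_index_url({}, "https://pip-mirror.example.com/simple/")
# With neither configured, the default index is used.
fallback = resolve_index_url({}, None)
print(from_env, from_pip, fallback)
```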

@BaxHugh

BaxHugh commented Nov 11, 2022

@mfriedenhagen I didn't really consider it, but like you say, it could be good to add it as a feature to the original. But I feel like it should probably be configurable if so. I'm glad you think it's a good feature.
The reason I prefer this behaviour to the original is that the credentials in my pip.conf for our private PyPI mirror change every day, so having them taken directly from there works better than using the environment variable. I'm not sure how common a use case that is, to consider adding it to @JacobHenner's plugin.

@neersighted
Member

I think people might be seeing (part of why) this is not implemented in Poetry yet -- coming up with a universal design is hard, and whatever we settle on will be stable/supported for a long time to come, with additions/changes being constrained by the first iteration. Hopefully what y'all learn with plugins can be used to inform a well-thought design for Poetry down the line.

@mfriedenhagen

Right @neersighted, maybe adding the configuration to $USER_CONFIGDIR/pypoetry/config.toml would be better. Is there already a concept of namespacing in the file? E.g. something like

[plugins]
[plugins.poetry_plugin_pypi_mirror]
pypi_mirror_url = "https://example.org/repository/pypi-proxy/simple/"

and then in auth.toml:

[http-basic]
[http-basic.poetry_plugin_pypi_mirror]
username = "me"
password = "s3cr3t"

@neersighted
Member

neersighted commented Nov 11, 2022

That's pretty much up to plugin authors; Poetry will not reject unknown keys.

e.g.

# config.toml
[foo]
bar = true

$ poetry config --list
...
foo.bar = true
...
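
The mapping from nested tables to dotted keys such as `foo.bar` can be sketched as follows. This is an illustration of the naming scheme only, not Poetry's own code; the helper `flatten` and the sample plugin section are assumptions based on the examples in this thread.

```python
from typing import Any, Dict


def flatten(config: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested tables into dotted keys, e.g. {'foo': {'bar': 1}} -> {'foo.bar': 1}."""
    flat: Dict[str, Any] = {}
    for key, value in config.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, dotted))
        else:
            flat[dotted] = value
    return flat


# Parsed form of a config.toml with one unknown key and one plugin namespace.
config = {
    "foo": {"bar": True},
    "plugins": {
        "poetry_plugin_pypi_mirror": {
            "pypi_mirror_url": "https://example.org/repository/pypi-proxy/simple/",
        },
    },
}

flat_config = flatten(config)
for dotted_key in sorted(flat_config):
    print(dotted_key)
```

A plugin that namespaces its settings this way can look up a single dotted key without colliding with Poetry's own settings or with other plugins.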

@mfriedenhagen

Well, at least I would suggest/document that configuration should be a bit structured ;-)

@JacobHenner
Author

Right @neersighted, maybe adding the configuration to $USER_CONFIGDIR/pypoetry/config.toml would be better. Is there already a concept of namespacing in the file? E.g. something like

@mfriedenhagen I've adopted a similar scheme starting with version 0.2.0 of the plugin.

@SadPencil

When I run poetry install I found the following warning:

Setting `experimental.new-installer` to false is deprecated and slated for removal in an upcoming minor release.
(Despite of the setting's name the new installer is not experimental!)

Please don't deprecate this setting before this issue is solved by having a direct PyPI repo setting. Currently, using pip is the fastest way to work around this issue. Almost all Chinese developers suffer from this issue.

@jpz

jpz commented Apr 17, 2023

It's disappointing that the policy of not respecting pip.conf settings has been decided. For reasons of orthogonality, poetry install and python -m venv .... ; ./myenv/activate; .... pip install xxxx; should work the same.

I work in a corporate environment with our repo access to pypi intermediated by an artifactory repository, with different SSL keys. All of these config things are solved for us - SSL, and custom pypi servers. Having to configure poetry as a special case, on top of configuring Python/pip, does not help adoption; it creates a barrier to ubiquitous adoption.

@waketzheng

I've published poetry-plugin-pypi-mirror, a plugin that allows pypi.org to be replaced by a mirror specified in an environment variable. It's available on PyPI. Hopefully others will find this useful.

The plugin satisfies the original subject of this issue (Allow user to override PyPI URL without modifying pyproject.toml), but it does not satisfy the current subject as it's not intended to handle replacement of arbitrary sources at the global level.

Worked for me, thanks~

For those who want to install the plugin with a custom PyPI mirror, the commands can be:

python -c "mirror_url='http://mirrors.tencent.com/pypi/simple';from poetry.locations import CONFIG_DIR;import os;cmd='cd {}&&poetry source remove pypi-mirror&&poetry source add --priority=default pypi-mirror {}'.format(CONFIG_DIR,mirror_url);os.system(cmd)"
# For poetry < 1.5 change `--priority=default`  to be `--default`
poetry self add poetry-plugin-pypi-mirror
