Add the ability for third party transport mechanisms #4398

dstufft · 2017-04-01T21:34:04Z

I'm not sure if this is a good idea or not. We've had a couple of requests for things like S3 support (#4225), IPFS (#4215), and other non-HTTP(S) transport mechanisms.

Generally we've rejected these in the past because we didn't want to be responsible for maintaining a whole bunch of different mechanisms for fetching packages within pip itself. However, one idea we could implement is the ability for third-party code bases to add transport mechanisms (ideally adding to, but not replacing, the built-in ones). This is obviously not as nice for those projects as having it baked into pip itself, but it does at least provide a path forward for them. It is possible that this could also be used to implement the VCS support we now have, which would allow people to support VCSs that we don't currently support (and possibly let us spin our VCSs off into their own thing).

If we did it, we would need to keep the API surface very minimal, maybe even something like:

import abc
from typing import Dict


class TransportError(Exception):
    """Raised when there is a transport error of some kind."""


class TransportABC(metaclass=abc.ABCMeta):
    """
    A Transport ABC class is a class that holds functionality to allow pip to
    access a remote resource. It is modeled after an HTTP request/response cycle
    so any transport system that doesn't follow that will need to be adapted to
    fit that.
    """

    # Subclasses must override this; the ABC cannot be instantiated directly.
    @abc.abstractmethod
    def get(self, url: str) -> Dict:
        """Fetches the given URL from the remote resource."""

That obviously needs a lot more thought put into it. Ideally I think we'd say that a transport class has to be thread safe, and that we'll create one instance of it per process and reuse that (to allow for things like connection pooling).
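To make the "thread safe, one instance per process" idea a bit more concrete, here is a minimal sketch of how a third-party transport and a per-process registry might look. Everything beyond `TransportABC`, `TransportError`, and `get` is an assumption made for illustration: `InMemoryTransport`, `get_transport`, and the shape of the returned dict (`status`, `headers`, `body`) are hypothetical and not part of pip.

```python
import abc
import threading
from typing import Callable, Dict


class TransportError(Exception):
    """Raised when a transport cannot fetch a resource."""


class TransportABC(abc.ABC):
    @abc.abstractmethod
    def get(self, url: str) -> Dict:
        """Fetch the given URL from the remote resource."""


class InMemoryTransport(TransportABC):
    """Hypothetical transport backed by a dict; a stand-in for S3, IPFS, etc."""

    def __init__(self, blobs: Dict[str, bytes]) -> None:
        self._blobs = dict(blobs)
        self._lock = threading.Lock()  # guards the shared counter below
        self.requests_served = 0

    def get(self, url: str) -> Dict:
        with self._lock:
            self.requests_served += 1
        try:
            body = self._blobs[url]
        except KeyError:
            raise TransportError(f"not found: {url}") from None
        # Assumed response shape, loosely modeled on an HTTP response.
        return {
            "status": 200,
            "headers": {"Content-Length": str(len(body))},
            "body": body,
        }


# Hypothetical per-process registry: one shared instance per URL scheme.
_TRANSPORTS: Dict[str, TransportABC] = {}
_TRANSPORTS_LOCK = threading.Lock()


def get_transport(scheme: str, factory: Callable[[], TransportABC]) -> TransportABC:
    """Return the process-wide transport for a scheme, creating it once."""
    with _TRANSPORTS_LOCK:
        if scheme not in _TRANSPORTS:
            _TRANSPORTS[scheme] = factory()
        return _TRANSPORTS[scheme]
```

Because the instance is created once and then reused, a real transport could keep a connection pool on `self`, as long as any shared mutable state is protected the way the counter is here.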

Thoughts @pypa/pip-committers? Do we think that trying to allow this is a good thing, or do we think that the use cases are not large enough to warrant it?

pfmoore · 2017-04-01T21:51:40Z

Well, #4225 also supported an S3 bucket as an index, so you'd need to be able to get a directory listing as well as a file (or is get intended to get a directory listing, mapping filenames to normal URLs?) My instinct is that we could spend a lot more time than we expect ironing out the API, and dealing with questions about how we discover user-supplied modules, etc. And the use case seems pretty specialised.

FWIW, I still think people interested in interfacing to other storage types would be better writing a small HTTP server that serves the target service as a PEP 503 compliant index. I doubt it would be hard. If nothing else, the fact that there aren't such things already on PyPI suggests to me that the need isn't that pressing.
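As a rough illustration of how small such a server could be, here is a sketch using only the standard library. It assumes a hypothetical `FILES` mapping (project name to filename to download URL) that a real proxy would fill in from the target service; the `storage.invalid` URL is a placeholder. The page layout and the name normalization follow PEP 503.

```python
import html
import re
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical backend listing: project -> {filename: download URL}.
# A real proxy would build this from S3, IPFS, or another service.
FILES = {
    "example-pkg": {
        "example_pkg-1.0-py3-none-any.whl":
            "https://storage.invalid/example_pkg-1.0-py3-none-any.whl",
    },
}


def normalize(name):
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def render_links(links):
    """Render a {link text: href} mapping as a simple HTML page of anchors."""
    anchors = "\n".join(
        f'<a href="{html.escape(href, quote=True)}">{html.escape(text)}</a><br>'
        for text, href in sorted(links.items())
    )
    return f"<!DOCTYPE html>\n<html><body>\n{anchors}\n</body></html>\n"


def page_for(path, files=FILES):
    """Return the HTML for an index path, or None if it is unknown."""
    parts = [p for p in path.split("/") if p]
    if parts == ["simple"]:
        # Root page: one link per project.
        return render_links({name: f"/simple/{name}/" for name in files})
    if len(parts) == 2 and parts[0] == "simple":
        project = files.get(normalize(parts[1]))
        if project is not None:
            return render_links(project)
    return None


class IndexHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        page = page_for(self.path)
        if page is None:
            self.send_error(404)
            return
        body = page.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(host="127.0.0.1", port=8080):
    """Run the proxy until interrupted."""
    HTTPServer((host, port), IndexHandler).serve_forever()
```

After calling `serve()`, pip could be pointed at the proxy with `pip install --index-url http://127.0.0.1:8080/simple/ example-pkg`. The sketch skips caching, authentication, and hash fragments, all of which a real proxy would probably want.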

dstufft · 2017-04-01T21:54:10Z

So our indexes are something like /simple/requests/, so I intended for the transport to handle mapping that to whatever makes sense for their use case (in the S3 example, it'd pull it from the bucket).
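One way to picture that mapping is the sketch below. It assumes a hypothetical bucket layout in which each project's files live under a `<project>/` key prefix, and it uses a plain list of keys in place of a real bucket listing. The function names, the `s3://my-bucket` URL, and the file names are illustrative only.

```python
from urllib.parse import urlsplit


def index_url_to_prefix(url, root="simple"):
    """Map an index URL such as .../simple/requests/ to a key prefix."""
    parts = [p for p in urlsplit(url).path.split("/") if p]
    if len(parts) != 2 or parts[0] != root:
        raise ValueError(f"not a project index URL: {url!r}")
    return parts[1] + "/"


def list_project_files(url, bucket_keys):
    """Return the filenames stored under the project's prefix, sorted."""
    prefix = index_url_to_prefix(url)
    return sorted(
        key[len(prefix):]
        for key in bucket_keys
        if key.startswith(prefix) and key != prefix
    )


# Stand-in for a bucket listing; a real transport would query the service.
keys = [
    "requests/requests-1.0.tar.gz",
    "requests/requests-1.0-py3-none-any.whl",
    "six/six-1.0.tar.gz",
]
files = list_project_files("s3://my-bucket/simple/requests/", keys)
```

The transport would then turn `files` into whatever response pip expects for a project page, so the rest of pip never needs to know the data came from a bucket.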

We absolutely can spend a bunch of time ironing out an API and such for this, and I don't really have a good feel for whether this is generally useful or not. I could personally go either way.

xavfernandez · 2017-04-02T18:03:16Z

Somewhat related (but admittedly heavier), I liked @pfmoore's suggestion in #4018 of creating/using a proxy between pip and those special endpoints.

pfmoore · 2017-04-02T19:12:21Z

In fact, after thinking further on this, I'm -1 on adding pip support for this. I think our official recommendation for people wanting to read from non-standard sources should be to write a proxy. Once we have some experience with how people get on writing proxies, we can revisit the question of native support if the evidence is that proxies are an insufficient solution.

dstufft · 2017-04-02T19:13:10Z

Sounds good.

pfmoore · 2017-04-02T19:31:41Z

Would it be worth adding a section to the PUG, alongside https://packaging.python.org/installing/#installing-from-other-indexes, something like "Installing from non-PEP 503 compliant sources"? I'd be willing to write a short section explaining the "use a proxy" idea if it seems worthwhile.

dstufft · 2017-04-02T19:33:17Z

Sure! Something to point folks to would be nice I think.

pfmoore · 2017-04-02T19:33:47Z

Cool, I'll try to do it in the next day or so.

pfmoore · 2017-04-03T08:31:38Z

pypa/packaging.python.org#289 submitted for this. @dstufft could you take a look? I don't think I have the rights to merge PRs on that repo.

dstufft mentioned this issue Apr 1, 2017

Allow using Amazon S3 repositories #4225

Closed

dstufft closed this as completed Apr 2, 2017

lock bot added the auto-locked Outdated issues that have been locked by automation label Jun 3, 2019

lock bot locked as resolved and limited conversation to collaborators Jun 3, 2019
