Support installing from Git URLs #539

Closed
techtonik opened this Issue May 21, 2012 · 17 comments

7 participants

@techtonik
pip install https://github.com/pypa/pip.git
Downloading/unpacking https://github.com/pypa/pip.git
  Downloading pip.git (47Kb): 47Kb downloaded
  Cannot unpack file /tmp/pip-54rGHd-unpack/pip.git (downloaded from /tmp/pip-Kc8M8w-build, content-type: text/html; charset=utf-8); cannot detect archive format
Cannot determine archive format of /tmp/pip-Kc8M8w-build

The proposed workaround is to detect whether a failed URL ends with .git and, if everything else fails, try to use git (or dulwich) to fetch its content.

@pnasrat
Python Packaging Authority member
@techtonik

That's good to know, but as you can see, not everybody is aware of it. It would be nice to support the HTTPS version too, as it is the one GitHub offers for copy/paste.

@mihneadb

Hello,

I submitted #580 to address your issue.

Mihnea

@carljm
carljm commented Jun 17, 2012

My concern about auto-detecting .git and prepending git+ if it's not there is that I'd like to move towards unifying the supported URL formats for editable and non-editable URLs, and editables will always have a URL fragment, so a naive "endswith" check is not adequate.

Otherwise, a URL whose path portion ends with .git is a pretty reliable marker of a git repo, so I'm not opposed to special handling for that so naive usage does what the user expects.
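
A minimal sketch of the difference between a naive "endswith" check and a check on the path portion of the URL. The function names `looks_like_git_url` and `with_git_prefix` are hypothetical and are not part of pip; the `#egg=pip` fragment is only an illustration of an editable-style URL.

```python
from urllib.parse import urlsplit


def looks_like_git_url(url):
    """Return True for a plain http(s) URL whose path portion ends with '.git'."""
    parts = urlsplit(url)
    # Only the path is inspected, so a trailing '#fragment' or '?query'
    # does not hide the '.git' suffix.
    return parts.scheme in ("http", "https") and parts.path.endswith(".git")


def with_git_prefix(url):
    """Prepend 'git+' when the URL looks like a git repository (hypothetical helper)."""
    if looks_like_git_url(url):
        return "git+" + url
    return url


plain = "https://github.com/pypa/pip.git"
with_fragment = "https://github.com/pypa/pip.git#egg=pip"

print(with_fragment.endswith(".git"))  # the naive check misses the fragment form
print(with_git_prefix(plain))
print(with_git_prefix(with_fragment))
```

URLs that already carry a `git+` prefix have the scheme `git+https`, so this sketch leaves them unchanged.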

@techtonik

I don't quite get the stuff about URLs. To me all URLs can be immutable: it should be enough to calculate some weights based on the URL to sort the handlers and try them one by one until a correct one is found.

@carljm
carljm commented Jun 17, 2012

I'm not at all convinced that "calculating some weights" and "trying them one by one" is a sane approach for vcs backends, as opposed to having relatively simple documentable rules for how URLs are linked to the appropriate VCS backend. Trying the wrong VCS for a given URL will in some cases result in errors that are quite difficult to distinguish programmatically from things like simple network errors (which is something pip has to deal with often), so the whole thing would be both fragile and slow (if you try a github url and github or your network is down, suddenly pip is trying the same URL with every other VCS backend...).

That said, if you want to put together a pull request as proof of concept for this approach, feel free to give it a try and see how it works out.

@techtonik

The problem is not solvable with static URL rules. Bitbucket (and Mercurial in general) URLs don't contain any repository identification information. RTFM is not a good UX.

Besides, a handler is not yet a VCS: a handler can probe the URL to detect whether it is actually of its type. Raising one exception for a broken connection and a different exception for a wrong repository type covers your user story about fragile errors.

Next time I have problems with pip recognizing a Mercurial repo from an HTTPS URL, I'll try to make a patch. Rule matching is a common pattern though, and I am sure that if not setuptools then other tools might already have it.
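
A rough sketch of the weighted-handler idea with two distinct exceptions, as described above. Everything here is hypothetical: the exception classes, `FakeHandler`, and `detect_handler` do not exist in pip, the Bitbucket URL is made up, and the probe is simulated with an in-memory set instead of a real network request.

```python
class ProbeConnectionError(Exception):
    """Hypothetical: the probe could not reach the server at all."""


class WrongRepositoryType(Exception):
    """Hypothetical: the server answered, but not as this kind of repository."""


class FakeHandler:
    """Toy handler whose probe is simulated; a real one would contact the server."""

    def __init__(self, name, repos, hint="", online=True):
        self.name = name
        self.repos = repos      # URLs this fake backend recognizes
        self.hint = hint        # optional URL suffix that raises the weight
        self.online = online    # simulate a broken connection when False

    def weight(self, url):
        return 1 if self.hint and url.endswith(self.hint) else 0

    def probe(self, url):
        if not self.online:
            raise ProbeConnectionError(url)
        if url not in self.repos:
            raise WrongRepositoryType(url)


def detect_handler(url, handlers):
    """Try handlers from highest to lowest weight; return the first that matches."""
    ranked = sorted(handlers, key=lambda h: h.weight(url), reverse=True)
    for handler in ranked:
        try:
            handler.probe(url)
        except WrongRepositoryType:
            continue  # wrong type: move on to the next handler
        # ProbeConnectionError is deliberately not caught, so a network
        # failure stops the search instead of cascading to other backends.
        return handler
    return None


git = FakeHandler("git", {"https://github.com/pypa/pip.git"}, hint=".git")
hg = FakeHandler("hg", {"https://bitbucket.org/example/repo"})
print(detect_handler("https://bitbucket.org/example/repo", [git, hg]).name)
```

Whether a real probe can reliably tell the two failure modes apart is exactly the point under dispute in this thread; the sketch only shows the control flow if it can.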

@mihneadb

Well then maybe we should just add a helpful message to the error that pip shows when you try to do pip install URL? Like "perhaps you should use VCS+URL?"
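
Something along these lines, as a minimal sketch. The function `unpack_error_message` is hypothetical and is not pip's actual error-handling code; only the first line of the message is taken from the output at the top of this issue.

```python
def unpack_error_message(location, url):
    """Build the archive-format error, adding a VCS hint for plain URLs."""
    message = "Cannot determine archive format of %s" % location
    scheme = url.split("://", 1)[0]
    # A scheme such as 'git+https' already names a VCS, so no hint is needed.
    if "+" not in scheme:
        message += (
            "\nIf this URL points to a version control repository, "
            "perhaps you should use VCS+URL, for example: git+%s" % url
        )
    return message


print(unpack_error_message("/tmp/pip-Kc8M8w-build",
                           "https://github.com/pypa/pip.git"))
```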

@pnasrat
Python Packaging Authority member
pnasrat commented Jun 19, 2012

@mihneadb I'd be very happy with a patch to make the error clearer by giving better usage instructions.

@mihneadb
@dstufft
Python Packaging Authority member
dstufft commented Mar 28, 2014

Closing this issue, I agree with others that trying to detect the repository type is a bad path to go down.

@dstufft dstufft closed this Mar 28, 2014
@duly
duly commented Mar 28, 2014

@dstufft Is it still possible to incorporate @pnasrat's suggestion of giving better usage instructions? This was non-obvious to us at first when using pip to install pyjs from PyPI.
https://pypi.python.org/pypi/pyjs

@dstufft
Python Packaging Authority member
dstufft commented Mar 28, 2014

Sure! Better instructions are always welcome :)

@techtonik

@dstufft, I am not sure you've chosen a valid reason to close this story. So far only @carljm has expressed doubts about the AI approach, and he is open to seeing a PR that solves the task.

@techtonik

Detecting repository type is just one extra request. If pip makes it hard to implement maintainable message exchange with two steps, maybe there is a problem in its architecture?

@Ivoz
Python Packaging Authority member
Ivoz commented Mar 29, 2014

Just chucking pip a url and "hoping it will figure out what to do with it after trying x different things" is bad, implicit design, and as already suggested it makes useful error reporting insane. Explicitness is better. I think pip already accepts too many valid formats of package requirements using the single command pip install <requirement> - with many different valid forms of what <requirement> could be, and this is already the cause of many cases where pip will simply print a stack trace instead of a useful error message.

Unfortunately there's obviously backwards compatibility to consider so we don't have license to rewrite things wholesale.

"maybe there is a problem in its architecture?"

Then that (ideally) should be fixed first, before adding to the problems it causes. Pull requests welcome.

@duly
duly commented Mar 29, 2014

If the stack traces are reported then this is the ideal situation.
In most cases they won't be. As developers we can only handle known exceptions. If an unknown exception occurs we can only address the issue if the user reports it.
Iterating through known requirements establishes a protocol and procedure. If that procedure needs to be readdressed at a later time then that's what we should do.
In this case, requiring git+ for all git URLs is a requirement that is causing a common issue due to the way GitHub has structured their HTTP Secure git clone URLs.
It is our opinion that pip should detect the URL extension and add the VCS information if it is in fact a GitHub URL.
@dstufft believes otherwise and we respect his opinion.
@pnasrat suggested that user instructions be added.
@techtonik confirms that pip should auto-detect.
@Ivoz believes that pip accepts too many package types already. We should add functionality not remove it while being explicit in the process. (Note: To us this means that we should readdress pip's documentation.)

We are working on an internal solution. If we have any pull requests we will certainly initiate those in the future.
