pip install https://github.com/pypa/pip.git
Downloading pip.git (47Kb): 47Kb downloaded
Cannot unpack file /tmp/pip-54rGHd-unpack/pip.git (downloaded from /tmp/pip-Kc8M8w-build, content-type: text/html; charset=utf-8); cannot detect archive format
Cannot determine archive format of /tmp/pip-Kc8M8w-build
The proposed workaround is to detect whether a failed URL ends with .git and, if everything else fails, try git (or dulwich) to fetch its contents.
That's good to know, but as you can see, not everybody is aware of it. It would be nice to support the HTTPS version too, as it is the one GitHub proposes for copy/paste.
I submitted #580 to address your issue.
My concern about auto-detecting .git and prepending git+ when it's missing is that I'd like to move towards unifying the supported URL formats for editable and non-editable URLs, and editables will always have a URL fragment, so a naive "endswith" check is not adequate.
Otherwise, a URL whose path portion ends with .git is a pretty reliable marker of a git repo, so I'm not opposed to special handling for that so naive usage does what the user expects.
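A sketch of the check described here (assuming nothing about pip internals): parsing the URL first means a trailing fragment, which editable requirements always carry, does not defeat the check the way a naive endswith() on the raw string would.

```python
# Parse the URL so only the path portion is examined; the fragment
# (e.g. "#egg=pip") is ignored. Function name is hypothetical.
from urllib.parse import urlparse

def looks_like_git_url(url):
    return urlparse(url).path.endswith(".git")

print(looks_like_git_url("https://github.com/pypa/pip.git"))          # True
print(looks_like_git_url("https://github.com/pypa/pip.git#egg=pip"))  # True
# A naive check on the raw string misses the fragment case:
print("https://github.com/pypa/pip.git#egg=pip".endswith(".git"))     # False
```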
I don't quite get the stuff about URLs. To me, all URLs can be immutable - it should be enough to calculate some weights based on the URL, sort the handlers accordingly, and try them one by one until the correct one is found.
I'm not at all convinced that "calculating some weights" and "trying them one by one" is a sane approach for vcs backends, as opposed to having relatively simple documentable rules for how URLs are linked to the appropriate VCS backend. Trying the wrong VCS for a given URL will in some cases result in errors that are quite difficult to distinguish programmatically from things like simple network errors (which is something pip has to deal with often), so the whole thing would be both fragile and slow (if you try a github url and github or your network is down, suddenly pip is trying the same URL with every other VCS backend...).
That said, if you want to put together a pull request as proof of concept for this approach, feel free to give it a try and see how it works out.
The problem is not solvable with static URL rules. Bitbucket (and Mercurial in general) URLs don't contain any repository identification information. RTFM is not good UX.
Besides, a handler is not yet a VCS - a handler can probe the URL to detect whether it is actually of its type. One exception for a broken connection and a different exception for a wrong repository type cover your user story about fragile errors.
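One way to read this probing idea: each handler raises a distinct exception for "network is broken" versus "wrong repository type", so the dispatcher can stop early on connection failures instead of blindly trying every backend. All names below are hypothetical illustrations, not pip internals.

```python
class ProbeConnectionError(Exception):
    """The URL could not be reached at all."""

class WrongRepoType(Exception):
    """The URL was reachable, but is not this handler's VCS type."""

def probe_git(url):
    # A real probe might run `git ls-remote`; this stub keys off the path.
    if url.endswith(".git"):
        return "git"
    raise WrongRepoType(url)

def probe_hg(url):
    if "bitbucket.org" in url:
        return "hg"
    raise WrongRepoType(url)

def detect_vcs(url, probes):
    for probe in probes:
        try:
            return probe(url)
        except WrongRepoType:
            continue  # not this backend; try the next one
        # ProbeConnectionError is deliberately NOT caught: if the
        # network is down, retrying other VCSes would only add noise.
    return None

print(detect_vcs("https://github.com/pypa/pip.git", [probe_git, probe_hg]))
```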
Next time I have problems with pip recognizing a Mercurial repo from an HTTPS URL, I'll try to make a patch. Rule matching is a common pattern, though, and I am sure that if not setuptools then other tools might already have it.
Well then maybe we should just add a helpful message to the error that pip shows when you try to do pip install URL? Like "perhaps you should use VCS+URL?"
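The suggested hint could look something like this sketch; the message text and helper name are made up, not pip's actual code.

```python
# When archive detection fails and the URL path looks like a git clone
# URL, append a hint about the vcs+URL syntax to the error message.
from urllib.parse import urlparse

def archive_error_message(url):
    msg = "Cannot determine archive format of %s" % url
    if urlparse(url).path.endswith(".git"):
        msg += "\nPerhaps you meant: pip install git+%s" % url
    return msg

print(archive_error_message("https://github.com/pypa/pip.git"))
```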
@mihneadb I'd be very happy with a patch to make the error clearer by giving better usage instructions.
Closing this issue, I agree with others that trying to detect the repository type is a bad path to go down.
@dstufft Is it still possible to incorporate @pnasrat's suggestion of giving better usage instructions? This was non-obvious to us at first when using pip to install pyjs from PyPI.
Sure! Better instructions are always welcome :)
@dstufft, I am not sure you've chosen a valid reason to close this story. So far only @carljm has expressed doubts about the auto-detection approach, and he was open to seeing a PR that solves the task.
Detecting the repository type is just one extra request. If pip makes it hard to implement a maintainable two-step message exchange, maybe there is a problem in its architecture?
Just chucking pip a url and "hoping it will figure out what to do with it after trying x different things" is bad, implicit design, and as already suggested it makes useful error reporting insane. Explicitness is better. I think pip already accepts too many valid formats of package requirements using the single command pip install <requirement> - with many different valid forms of what <requirement> could be, and this is already the cause of many cases where pip will simply print a stack trace instead of a useful error message.
Unfortunately there's obviously backwards compatibility to consider so we don't have license to rewrite things wholesale.
maybe there is a problem in its architecture?
Then that (ideally) should be fixed first, before adding to the problems it causes. Pull requests welcome.
If the stack traces are reported, then that is the ideal situation.
In most cases they won't be. As developers, we can only handle known exceptions; if an unknown exception occurs, we can only address the issue if the user reports it.
Iterating through known requirements establishes a protocol and procedure. If that procedure needs to be revisited later, then that's what we should do.
In this case, requiring git+ for all git URLs is a requirement that is causing a common issue due to the way GitHub has structured their HTTP Secure git clone URLs.
It is our opinion that pip should detect the URL extension and add the VCS information if it is in fact a GitHub URL.
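The auto-detection proposed here could be a small normalization step; the helper below is a hypothetical sketch, not pip code, and it keys off the .git path suffix rather than GitHub specifically.

```python
# Prepend "git+" when the URL's path ends in ".git" and no VCS prefix
# is present yet; URLs that are already explicit pass through unchanged.
from urllib.parse import urlparse

VCS_PREFIXES = ("git+", "hg+", "bzr+", "svn+")

def normalize_vcs_url(url):
    if url.startswith(VCS_PREFIXES):
        return url  # the user was already explicit
    if urlparse(url).path.endswith(".git"):
        return "git+" + url
    return url

print(normalize_vcs_url("https://github.com/pypa/pip.git"))
# git+https://github.com/pypa/pip.git
```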
@dstufft believes otherwise and we respect his opinion.
@pnasrat suggested that user instructions be added.
@techtonik confirms that pip should auto-detect.
@Ivoz believes that pip accepts too many package types already. We should add functionality, not remove it, while being explicit in the process. (Note: to us this means that we should revisit pip's documentation.)
We are working on an internal solution. If we have any pull requests we will certainly initiate those in the future.