Support non-GitHub git repo hosts #184

Closed
2 of 4 tasks
choldgraf opened this issue Oct 16, 2017 · 17 comments

Comments

@choldgraf
Member

choldgraf commented Oct 16, 2017

There are a bunch of other hosts out there other than vanilla github repositories. We should support most of these eventually!

Some that come to mind:

repo2docker expects to be given a URL that points to a git repository, so the changes needed for each provider mostly come down to knowing how to turn that provider's URL structure into a link that repo2docker can work with.

cc @yuvipanda @betatim @rsignell-usgs from the old issue

@yuvipanda
Collaborator

In https://github.com/jupyterhub/binderhub/blob/master/binderhub/repoproviders.py you see there's a 'RepoProvider' base class, and a GitHubRepoProvider subclass. We'll just need to implement similar code for other hosts.

The primary functionality they need to provide is to resolve a non-hash commit ref (like a tag or branch name) to a commit hash, using an API of some sort. This lets us avoid rebuilding images that have already been built. It should be fairly straightforward for other hosts.

I think this is a great place for folks new to binder to get involved and make patches! :D
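As a rough illustration of the provider idea described above, here is a minimal sketch. It is not BinderHub's actual code: the class names, the method names, the `alice/demo/main` spec, the `git.example.org` host, and the in-memory ref table are all assumptions made for the example. A real provider would call the host's HTTP API where the table lookup happens.

```python
import asyncio
from abc import ABC, abstractmethod


class RepoProviderSketch(ABC):
    """Hypothetical base class: maps a user-supplied spec to a repo URL and a commit hash."""

    def __init__(self, spec: str):
        self.spec = spec

    @abstractmethod
    async def get_resolved_ref(self) -> str:
        """Return the full commit hash for the branch or tag named in the spec."""

    @abstractmethod
    def get_repo_url(self) -> str:
        """Return a git URL that repo2docker could clone."""


class FakeHostProvider(RepoProviderSketch):
    """Stand-in provider that resolves refs from a table instead of a real API."""

    # Assumed data for the example; a real provider would query the host here.
    _REFS = {("alice", "demo", "main"): "0123456789abcdef0123456789abcdef01234567"}

    def __init__(self, spec: str):
        super().__init__(spec)
        # Spec format assumed here: "<user>/<repo>/<ref>"
        self.user, self.repo, self.ref = spec.split("/", 2)

    async def get_resolved_ref(self) -> str:
        try:
            return self._REFS[(self.user, self.repo, self.ref)]
        except KeyError:
            raise ValueError(f"cannot resolve ref {self.ref!r}") from None

    def get_repo_url(self) -> str:
        return f"https://git.example.org/{self.user}/{self.repo}.git"


def image_needs_build(resolved_ref: str, built_refs: set) -> bool:
    """An image only needs building if no image exists for this exact commit."""
    return resolved_ref not in built_refs


provider = FakeHostProvider("alice/demo/main")
sha = asyncio.run(provider.get_resolved_ref())
print(provider.get_repo_url(), sha, image_needs_build(sha, {sha}))
```

Keying the build cache on the resolved commit hash rather than on the branch name is what makes it safe to skip a rebuild: the hash changes whenever the branch moves.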

@choldgraf
Member Author

@yuvipanda how are we gonna handle things like rate limits on all these various providers? I imagine that any provider will have a process for throttling lots of hits.

@yuvipanda
Collaborator

Yeah, that too will be in the provider specific subclass. You can see the code for this in the GitHubRepoProvider, for example.

@Carreau
Member

Carreau commented Oct 27, 2017

One of the missing things in providers right now is the ability to tell whether they can understand a specific URL. In particular, if I pass a GitLab URL, the UI will still try to pushState /v2/gh/. We will likely need an endpoint that takes a URL as a parameter and goes through the providers (in order?) to ask whether they can handle it (probably asynchronously, if providers can make requests).
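A minimal sketch of what "asking each provider in order" could look like, purely for illustration. The `can_handle` method, the `gl` and `git` keys, and the host lists are assumptions for this example, not an existing BinderHub interface. The check here is synchronous and hostname-based; a provider that needs a network round trip would want an async variant.

```python
from urllib.parse import urlparse


class ProviderMatcher:
    """Hypothetical mixin: each provider declares which URLs it understands."""

    key = ""
    hosts = ()

    @classmethod
    def can_handle(cls, url: str) -> bool:
        host = urlparse(url).hostname or ""
        return host.lower() in cls.hosts


class GitHubMatcher(ProviderMatcher):
    key = "gh"
    hosts = ("github.com",)


class GitLabMatcher(ProviderMatcher):
    key = "gl"  # assumed key, for illustration only
    hosts = ("gitlab.com",)


class GenericGitMatcher(ProviderMatcher):
    """Fallback: accepts anything that looks like a remote git URL."""

    key = "git"

    @classmethod
    def can_handle(cls, url: str) -> bool:
        parsed = urlparse(url)
        return parsed.scheme in ("http", "https", "git", "ssh") and bool(parsed.hostname)


# Order matters: specific hosts first, generic fallback last.
PROVIDER_ORDER = (GitHubMatcher, GitLabMatcher, GenericGitMatcher)


def detect_provider(url: str):
    """Return the key of the first provider that claims the URL, or None."""
    for provider in PROVIDER_ORDER:
        if provider.can_handle(url):
            return provider.key
    return None


print(detect_provider("https://gitlab.com/group/project"))
```

The UI could use the returned key to build the right `/v2/<key>/` path instead of always assuming `gh`.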

@yuvipanda
Collaborator

Yup, this needs to happen in the JS side too, since we probably wanna provide UX clues.

@betatim
Member

betatim commented Oct 31, 2017

Is there a reason not to switch to a scheme like /v2/gl/ (gitlab) or /v2/bb/ (bitbucket) etc? The mapping from a key like "gh" to a hostname could then be something that is configurable by the admin of the binder instance.

@minrk
Member

minrk commented Oct 31, 2017

@betatim that makes sense to me, and I think that's the 'scheme' we have now, where gh is just the key for github. Additional providers can be registered as new keys. I think we can start with a git provider, which accepts any git url as the escape hatch, then we can bless gitlab, bitbucket with special handling as we develop it.
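To make the key-based scheme concrete, here is a small sketch of a registry that dispatches `/v2/<key>/<spec>` paths, with a generic `git` provider as the escape hatch. Everything in it is an assumption for illustration: the registry, the handler signatures, the configurable `HOSTS` table, and the spec formats (`user/repo/ref` for `gh`, `<url-escaped repo URL>/<ref>` for `git`).

```python
from urllib.parse import unquote

# Hypothetical admin-configurable mapping from provider key to base URL.
HOSTS = {"gh": "https://github.com"}

# Registry: provider key -> function turning a spec into (repo_url, ref).
PROVIDERS = {}


def register_provider(key, handler):
    PROVIDERS[key] = handler


def github_handler(spec):
    # Assumed spec format: "<user>/<repo>/<ref>"
    user, repo, ref = spec.split("/", 2)
    return f"{HOSTS['gh']}/{user}/{repo}", ref


def git_handler(spec):
    # Assumed spec format: "<url-escaped repo URL>/<ref>".
    # Splitting on the last "/" means refs containing "/" are not supported here.
    escaped_url, _, ref = spec.rpartition("/")
    return unquote(escaped_url), ref


register_provider("gh", github_handler)
register_provider("git", git_handler)


def route(path):
    """Split "/v2/<key>/<spec>" and hand the spec to the registered provider."""
    prefix = "/v2/"
    if not path.startswith(prefix):
        raise ValueError(f"unsupported path: {path!r}")
    key, sep, spec = path[len(prefix):].partition("/")
    if not sep or not spec:
        raise ValueError(f"missing spec in path: {path!r}")
    if key not in PROVIDERS:
        raise KeyError(f"no provider registered for key {key!r}")
    return key, PROVIDERS[key](spec)


print(route("/v2/gh/jupyterhub/binderhub/master"))
print(route("/v2/git/https%3A%2F%2Fexample.org%2Fteam%2Frepo.git/master"))
```

Adding special handling for another host later would only mean registering one more key; existing URLs keep working.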

@ctb
Contributor

ctb commented Oct 31, 2017 via email

@minrk
Member

minrk commented Oct 31, 2017

The API strictly defines what those keys mean. Right now, only gh has a meaning, which is find this repo on github.com. The key identifies the 'provider' and then the provider is responsible for interpreting the rest of the URL. It may be appropriate to use clearer, less compressed, provider keys, though.

The reason for special handling of github.com (and others in the future) is that we can resolve refs and check if a new build is necessary much more efficiently with the GitHub API than we can with git itself. So when we know something about the hosting provider, we can take a shortcut that's much less expensive than a shallow git clone. We can also do potential things in the future like provide links to the original, which we can't do in general for repo URLs.
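To illustrate the two routes for resolving a ref, here is a hedged sketch. The first helper builds the URL for GitHub's REST endpoint for fetching a single commit by ref; the second parses the text that `git ls-remote <url>` prints, which is a lighter option than cloning when nothing is known about the host. The helper names and the sample output are invented for the example, and no network call is made.

```python
import re

SHA1 = re.compile(r"^[0-9a-f]{40}$")


def github_commit_api_url(user, repo, ref):
    """URL for GitHub's REST API 'get a commit' endpoint, which accepts a branch or tag."""
    return f"https://api.github.com/repos/{user}/{repo}/commits/{ref}"


def parse_ls_remote(output, ref):
    """Find the commit hash for a branch or tag in `git ls-remote` output.

    Each line has the form "<sha>\t<refname>". Annotated tags also appear as
    "refs/tags/<tag>^{}", which points at the underlying commit, so that
    entry is preferred over the plain tag entry.
    """
    refs = {}
    for line in output.splitlines():
        sha, _, name = line.partition("\t")
        if SHA1.match(sha) and name:
            refs[name] = sha
    for candidate in (f"refs/heads/{ref}", f"refs/tags/{ref}^{{}}", f"refs/tags/{ref}"):
        if candidate in refs:
            return refs[candidate]
    return None


# Made-up sample of what `git ls-remote` might print.
sample = (
    "a" * 40 + "\tHEAD\n"
    + "a" * 40 + "\trefs/heads/master\n"
    + "b" * 40 + "\trefs/tags/v1\n"
    + "c" * 40 + "\trefs/tags/v1^{}\n"
)
print(parse_ls_remote(sample, "master"))
print(github_commit_api_url("jupyterhub", "binderhub", "master"))
```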

@choldgraf
Member Author

choldgraf commented Oct 31, 2017

> We can also do potential things in the future like provide links to the original, which we can't do in general for repo URLs.

I think this is a useful future feature we can implement as a jupyter extension. Would be quite useful for people who want to go back to the original repo after clicking a binder link.

@choldgraf
Member Author

note that one of these got completed in #266 !

@choldgraf
Member Author

gists are coming in #306

@willingc
Collaborator

#216 issue covers OSF.

@choldgraf
Member Author

I'm thinking of closing this and handling individual providers with their own issues. WDYT @willingc ?

@willingc
Collaborator

Seems like a good idea at this point @choldgraf

@choldgraf
Member Author

ok, closing! see top comment for links to provider-specific issues

@willingc
Collaborator

Nice touch editing the top issue with the links too. @choldgraf

7 participants