Support non-GitHub git repo hosts #184

Closed
choldgraf opened this issue Oct 16, 2017 · 17 comments

@choldgraf
Member

commented Oct 16, 2017

There are a bunch of git hosts out there besides vanilla GitHub repositories. We should support most of these eventually!

Some that come to mind:

repo2docker expects to be given a URL that points to a .git repository, so the change needed for each provider basically comes down to knowing how to turn that provider's URL structure into a link that repo2docker can work with.
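A rough sketch of that mapping, for illustration only. The template table and the `clone_url` helper are hypothetical names, not part of repo2docker or BinderHub, and the sketch assumes each host serves a repository at `https://<host>/<user>/<project>.git`:

```python
# Hypothetical table: provider name -> clone-URL template.
CLONE_URL_TEMPLATES = {
    "github": "https://github.com/{repo}.git",
    "gitlab": "https://gitlab.com/{repo}.git",
    "bitbucket": "https://bitbucket.org/{repo}.git",
}


def clone_url(provider: str, repo: str) -> str:
    """Build a git clone URL from a provider name and a 'user/project' path."""
    try:
        template = CLONE_URL_TEMPLATES[provider]
    except KeyError:
        raise ValueError(f"unknown provider: {provider!r}") from None
    # Tolerate leading/trailing slashes in the repo path.
    return template.format(repo=repo.strip("/"))


print(clone_url("gitlab", "someuser/somerepo"))
```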

cc @yuvipanda @betatim @rsignell-usgs from the old issue

@yuvipanda

Collaborator

commented Oct 16, 2017

In https://github.com/jupyterhub/binderhub/blob/master/binderhub/repoproviders.py you see there's a 'RepoProvider' base class, and a GitHubRepoProvider subclass. We'll just need to implement similar code for other hosts.

The primary functionality they need to provide is resolving a ref that isn't a commit hash (like a tag or branch name) to a commit hash, using an API of some sort. This lets us skip rebuilding images that have already been built. It should be fairly straightforward for other hosts.
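As a provider-agnostic illustration of ref resolution, here is a minimal sketch that parses text in the format printed by `git ls-remote <url>`. This is not what GitHubRepoProvider does (a hosted provider would call its own API instead), and `resolve_ref` is a hypothetical helper name:

```python
from typing import Optional


def resolve_ref(ls_remote_output: str, ref: str) -> Optional[str]:
    """Return the commit hash for a branch or tag name, or None if not found.

    Each input line is expected to look like "<sha><TAB><refname>".
    """
    found = {}
    for line in ls_remote_output.splitlines():
        if not line.strip():
            continue
        sha, _, name = line.partition("\t")
        found[name.strip()] = sha.strip()
    # Prefer a branch, then the peeled form of an annotated tag
    # (which names the commit), then the tag object itself.
    for name in (f"refs/heads/{ref}", f"refs/tags/{ref}^{{}}", f"refs/tags/{ref}"):
        if name in found:
            return found[name]
    return None


# Made-up sample output, not from a real repository.
SAMPLE = (
    "1111111111111111111111111111111111111111\trefs/heads/master\n"
    "2222222222222222222222222222222222222222\trefs/tags/v1.0\n"
    "3333333333333333333333333333333333333333\trefs/tags/v1.0^{}\n"
)
print(resolve_ref(SAMPLE, "v1.0"))
```

The resolved hash can then serve as a cache key: if an image already exists for that hash, no rebuild is needed.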

I think this is a great place for folks new to binder to get involved and make patches! :D

@choldgraf

Member Author

commented Oct 16, 2017

@yuvipanda how are we gonna handle things like rate limits on all these various providers? I imagine that any provider will have a process for throttling lots of hits.

@yuvipanda

Collaborator

commented Oct 16, 2017

Yeah, that too will be in the provider specific subclass. You can see the code for this in the GitHubRepoProvider, for example.
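For illustration only (this is not a copy of what GitHubRepoProvider does), a provider could inspect the rate-limit headers that the GitHub API returns, `X-RateLimit-Remaining` and `X-RateLimit-Reset`, and decide how long to back off. The helper name `seconds_until_reset` is hypothetical, and the sketch assumes the headers arrive in a plain mapping with exactly that casing:

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how long to wait before the next call, or 0.0 if quota remains."""
    # If the header is absent, assume there is still quota left.
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    # X-RateLimit-Reset is a UTC epoch timestamp in seconds.
    reset_at = float(headers.get("X-RateLimit-Reset", "0"))
    current = time.time() if now is None else now
    return max(0.0, reset_at - current)
```

Other hosts expose their limits differently, so each provider subclass would need its own version of this check.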

@Carreau

Member

commented Oct 27, 2017

One thing missing from the providers right now is the ability to tell whether they can understand a specific URL. In particular, if I pass a GitLab URL, the UI will still try to pushState /v2/gh/. We will likely need an endpoint that takes a URL as a parameter and goes through the providers (in order?) to ask whether they can handle it (probably asynchronously, if providers can handle requests).
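A minimal synchronous sketch of that lookup. The `Provider` classes, the `can_handle` method, and `find_provider` are hypothetical names and do not reflect the existing RepoProvider API; matching on hostname alone is also an assumption:

```python
from typing import Iterable, Optional, Tuple, Type
from urllib.parse import urlparse


class Provider:
    """Hypothetical base class; `hosts` lists the hostnames a provider handles."""

    key = ""
    hosts: Tuple[str, ...] = ()

    @classmethod
    def can_handle(cls, url: str) -> bool:
        return urlparse(url).hostname in cls.hosts


class GitHubProvider(Provider):
    key = "gh"
    hosts = ("github.com", "www.github.com")


class GitLabProvider(Provider):
    key = "gl"
    hosts = ("gitlab.com",)


def find_provider(url: str, providers: Iterable[Type[Provider]]) -> Optional[Type[Provider]]:
    """Ask each provider in order; return the first that claims the URL."""
    for provider in providers:
        if provider.can_handle(url):
            return provider
    return None


match = find_provider("https://gitlab.com/someuser/somerepo", [GitHubProvider, GitLabProvider])
print(match.key if match else None)
```

The UI could then use the returned key to choose the right `/v2/<key>/` prefix instead of always assuming `gh`.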

@yuvipanda

Collaborator

commented Oct 27, 2017

Yup, this needs to happen in the JS side too, since we probably wanna provide UX clues.

@betatim

Member

commented Oct 31, 2017

Is there a reason not to switch to a scheme like /v2/gl/ (gitlab) or /v2/bb/ (bitbucket) etc., and then make the mapping from a key like "gh" to a hostname something the admin of the binder instance can configure?

@minrk

Member

commented Oct 31, 2017

@betatim that makes sense to me, and I think that's the 'scheme' we have now, where gh is just the key for github. Additional providers can be registered as new keys. I think we can start with a git provider, which accepts any git url as the escape hatch, then we can bless gitlab, bitbucket with special handling as we develop it.
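To make the key idea concrete, here is a small sketch under stated assumptions: the `DEFAULT_KEYS` table and `parse_spec` function are hypothetical, and only `gh` exists as a key today. An admin-supplied table could replace the default:

```python
from typing import Dict, Optional, Tuple

# Hypothetical key table: provider key -> hostname.
DEFAULT_KEYS: Dict[str, Optional[str]] = {
    "gh": "github.com",
    "gl": "gitlab.com",
    "git": None,  # escape hatch: the rest of the path carries a full git URL
}


def parse_spec(path: str, keys: Optional[Dict[str, Optional[str]]] = None) -> Tuple[str, Optional[str], str]:
    """Split '/v2/<key>/<rest>' into (key, hostname, rest).

    The provider registered under `key` is responsible for interpreting `rest`.
    """
    table = DEFAULT_KEYS if keys is None else keys
    parts = path.strip("/").split("/", 2)
    if len(parts) != 3 or parts[0] != "v2":
        raise ValueError(f"not a /v2/<key>/<rest> path: {path!r}")
    _, key, rest = parts
    if key not in table:
        raise ValueError(f"no provider registered for key {key!r}")
    return key, table[key], rest


print(parse_spec("/v2/gh/someuser/somerepo/master"))
```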

@ctb

Contributor

commented Oct 31, 2017

@minrk

Member

commented Oct 31, 2017

The API strictly defines what those keys mean. Right now, only gh has a meaning, which is "find this repo on github.com". The key identifies the 'provider', and then the provider is responsible for interpreting the rest of the URL. It may be appropriate to use clearer, less compressed provider keys, though.

The reason for special handling of github.com (and others in the future) is that we can resolve refs and check if a new build is necessary much more efficiently with the GitHub API than we can with git itself. So when we know something about the hosting provider, we can take a shortcut that's much less expensive than a shallow git clone. We can also do potential things in the future like provide links to the original, which we can't do in general for repo URLs.
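A sketch of why the resolved ref makes the "is a new build necessary?" check cheap. The image naming scheme and both helper names are hypothetical and are not how BinderHub actually names its images:

```python
from typing import Collection


def image_name(provider_key: str, repo: str, resolved_ref: str) -> str:
    """Derive a deterministic image name from provider, repo, and commit hash."""
    safe_repo = repo.lower().replace("/", "-")
    return f"{provider_key}-{safe_repo}:{resolved_ref}"


def needs_build(provider_key: str, repo: str, resolved_ref: str,
                existing_images: Collection[str]) -> bool:
    """A build is needed only if no image exists for this exact commit."""
    return image_name(provider_key, repo, resolved_ref) not in existing_images


# Placeholder commit hash and image registry contents, for illustration.
existing = {"gh-someuser-somerepo:" + "a" * 40}
print(needs_build("gh", "someuser/somerepo", "a" * 40, existing))
print(needs_build("gh", "someuser/somerepo", "b" * 40, existing))
```

Because the name depends only on the commit hash, one cheap API call to resolve the ref is enough to decide, with no clone required.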

@choldgraf

Member Author

commented Oct 31, 2017

> We can also do potential things in the future like provide links to the original, which we can't do in general for repo URLs.

I think this is a useful future feature we can implement as a jupyter extension. Would be quite useful for people who want to go back to the original repo after clicking a binder link.

@choldgraf

Member Author

commented Nov 24, 2017

Note that one of these got completed in #266!

@choldgraf

Member Author

commented Nov 29, 2017

gists are coming in #306

@willingc

Collaborator

commented Nov 29, 2017

Issue #216 covers OSF.

@choldgraf

Member Author

commented Nov 29, 2017

I'm thinking of closing this and handling individual providers with their own issues. WDYT @willingc ?

@willingc

Collaborator

commented Nov 29, 2017

Seems like a good idea at this point @choldgraf

@choldgraf

Member Author

commented Nov 29, 2017

ok, closing! see top comment for links to provider-specific issues

@choldgraf choldgraf closed this Nov 29, 2017

@willingc

Collaborator

commented Nov 29, 2017

Nice touch editing the top issue with the links too. @choldgraf
