Publish to github with specialremote for data access #335
Here is a protocol for setting up a fairly convenient state to aid public data consumption. This protocol assumes that you only push to a server that will host the data, and do not access the local machine from the server (as this will often not be possible).
Start with a regular local annex repo.
On the remote server, create a new Git repo. We use a non-bare repo. Using a bare repo is possible and makes some of the steps below unnecessary. However, having a work tree on the server allows for additional use cases (browsing, etc.).
On the local machine, add the new remote. Do it twice: once for push access via SSH, and once for anonymous pull access via HTTP. Then push the content.
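A minimal sketch of the two-remote idea, using plain Git only and local directories in place of the real SSH and HTTP URLs so that it can be run anywhere. The remote names `sshdata` and `public` are borrowed from a later comment in this thread. The paths and the `receive.denyCurrentBranch updateInstead` setting are assumptions made for this sketch, not taken from the original post.

```shell
#!/bin/sh
# Sketch only: local directories stand in for the SSH and HTTP URLs.
set -eu

work=$(mktemp -d)
server="$work/server"      # hypothetical stand-in for the hosting server
local_repo="$work/local"   # hypothetical stand-in for the local annex repo

# Server side: a non-bare repo. By default Git refuses pushes into the
# checked-out branch of a non-bare repo; updateInstead (an assumption of
# this sketch) lets the push update the server's work tree instead.
git init -q "$server"
git -C "$server" config receive.denyCurrentBranch updateInstead

# Local side: a repo with one commit to push.
git init -q "$local_repo"
cd "$local_repo"
echo "example" > README
git add README
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial commit"
branch=$(git symbolic-ref --short HEAD)

# Register the same server repo twice. In the real setup the first URL
# would be an SSH URL (push access) and the second an HTTP URL
# (anonymous pull access).
git remote add sshdata "$server"
git remote add public "$server"

# Push through the "SSH" remote, then confirm the "HTTP" remote sees it.
git push -q sshdata "$branch"
git fetch -q public
```

Keeping two remotes for one server separates the credentialed write path from the anonymous read path, so the read URL is the one that can later be advertised to other clones.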
Now go back to the remote, and init the annex.
Now the remote repo can be enabled as a special remote on the client-side. Once done, sync the state with the remote.
Now the whole thing can be pushed to GitHub. Create a repo on GitHub, and add it as a remote to the local repo. Then push.
At this point, anybody can clone from GitHub and get a local annex repo with an automatically enabled special remote.
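The publish-then-clone step could look roughly like the following. This is a Git-only sketch: a local bare repo stands in for the GitHub repo, and the remote name `github` and all paths are hypothetical. In the real workflow the `git-annex` branch would presumably need to be pushed as well, since that is where git-annex keeps its availability information. That part is not shown here.

```shell
#!/bin/sh
# Sketch only: a local bare repo stands in for the GitHub repo.
set -eu

work=$(mktemp -d)
hub="$work/hub.git"        # hypothetical stand-in for the GitHub repo
git init -q --bare "$hub"

# Local repo with one commit to publish.
git init -q "$work/local"
cd "$work/local"
echo "example" > README
git add README
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial commit"
branch=$(git symbolic-ref --short HEAD)

# Add the hosting repo as one more remote and push to it.
git remote add github "$hub"
git push -q github "$branch"

# Anybody with read access can now clone it.
git clone -q "$hub" "$work/anon"
```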
At this point the
This is sufficient to make a file available and the availability known. In the anonymously cloned repo now do:
and it will be downloaded via HTTP from
The above is verified and works. The rest is anticipation and extrapolation:
Further workflow for maintaining
Further workflow for publishing to github
In a collaborative scenario where the GitHub repo can be modified from elsewhere, git annex sync would be a better fit.
Further workflow for any anonymous clone:
The latter cannot use annex sync, because it would want to push to github without sufficient permissions.
yeap. Thanks for detailing it
When we get to handles though -- with all the rewrites for datalad's meta (thus no simple pushes of the master containing local urls, or pulls) it would be a bit more involved ;-) Without "global" urls it could indeed have been this straightforward
referenced this issue
Apr 6, 2016
This should make
Cheers to the GitHub folks. Well done!
sweet! works even for organizations
as for credentials, indeed could go through oauth! I guess we should provide some centralized helper for those, otherwise could use meanwhile the Credentials contraption
which uses keyring module, and ATM with default backend which should theoretically match the system (in my case -- gnome's)
added a commit
Oct 15, 2016
added a commit
Oct 21, 2016
added a commit
Oct 22, 2016
I tried the configuration reported above without success.
The main issue is that files in "my" repository (the clone from GitHub) appear to be available (with the 'whereis' command) only on [sshdata] and [public], neither of which is accessible. The files are not indexed by the [datasrc] remote. Following the example, the files from local are copied on the server only to [sshdata].
I have an additional question.
P.S. Do you have a recommendation for a Docker image to configure a Git server over HTTP?
An additional detail. On the server the storage is mounted with sshfs.
I don't believe it is relevant to the issue mentioned above.
Finally an update on this issue. With #1237 many things get changed (and hopefully improved). There is one more issue before this is fully resolved. Here is how the current flow is by means of a demo script. It does everything, except for the special remote setup -- which still needs some thought. Anyways, here it is: