maintaining a partial copy of a repo...

...with gl-pre-git and update.secondary hooks
commit f3eae5e1700953d650b920090904612d2402d3bc 1 parent 5858ecb
@sitaramc authored
51 contrib/partial-copy/gl-pre-git
@@ -0,0 +1,51 @@
+#!/usr/bin/perl
+use strict;
+use warnings;
+
+# called from gitolite before any git operations are run
+
+# "we", "our repo" => the partial copy
+# "main", "pco" => the one which we are a "partial copy of"
+
+my $main=`git config --file $ENV{GL_REPO_BASE_ABS}/$ENV{GL_REPO}.git/config --get gitolite.partialCopyOf`;
+chomp ($main);
+
+exit 0 unless $main;
+
+die "ENV GL_RC not set\n" unless $ENV{GL_RC};
+die "ENV GL_BINDIR not set\n" unless $ENV{GL_BINDIR};
+
+unshift @INC, $ENV{GL_BINDIR};
+require gitolite or die "parse gitolite.pm failed\n";
+gitolite->import;
+
+# go to the main repo. Find a list of all the refs it has, and for each one,
+# check if this user is allowed to read that ref from our repo. If he is, add
+# it to a list.
+
+my %allowed;
+wrap_chdir("$ENV{GL_REPO_BASE_ABS}/$main.git");
+for my $ref (`git for-each-ref refs/heads '--format=%(refname)'`) {
+    chomp($ref);
+    my $ret = check_access($ENV{GL_REPO}, $ref, 'R', 1);
+    $allowed{$ref} = 1 unless $ret =~ /DENIED/;
+}
+
+# now go to our repo and...
+wrap_chdir("$ENV{GL_REPO_BASE_ABS}/$ENV{GL_REPO}.git");
+
+# delete all existing refs that are not "allowed" (e.g., refs that were
+# previously allowed but now are not, due to config file/rules change)
+for my $ref (`git for-each-ref refs '--format=%(refname)'`) {
+    chomp($ref);
+    next if $allowed{$ref};
+    system("git", "update-ref", "-d", $ref);
+}
+
+# now copy all allowed branches (and their tags, implicitly)
+for my $ref (sort keys %allowed) {
+    system("git", "fetch", "-f", "$ENV{GL_REPO_BASE_ABS}/$main.git", "$ref:$ref");
+}
+
+# now allow the git operation to proceed
+exit 0
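
Note on gl-pre-git: the hook does three things. It works out which refs the user may read, deletes everything else from the partial copy, and force-fetches the allowed refs. These steps can be tried without a gitolite install. The sketch below is a dry run in bash that only prints the git commands it would issue. It assumes a single deny regex as a stand-in for gitolite's `check_access`, and the function names `allowed_refs` and `plan_sync` are made up for this illustration.

```shell
#!/bin/bash
# Dry-run sketch of gl-pre-git's three steps. Nothing here calls git;
# it only prints the commands the hook would run.
# ASSUMPTION: one regex stands in for gitolite's check_access().

# Print the refs read from stdin that do NOT match the deny regex in $1.
allowed_refs() {
    local deny_re=$1 ref
    while IFS= read -r ref; do
        [[ $ref =~ $deny_re ]] || printf '%s\n' "$ref"
    done
}

# $1 = refs in the main repo, $2 = refs in the partial copy (one per line),
# $3 = deny regex, $4 = path to the main repo (hypothetical).
plan_sync() {
    local main_refs=$1 copy_refs=$2 deny_re=$3 main_path=$4 allowed ref
    allowed=$(printf '%s\n' "$main_refs" | allowed_refs "$deny_re")

    # step 2: drop refs in the copy that are not (or no longer) allowed
    while IFS= read -r ref; do
        [ -n "$ref" ] || continue
        printf '%s\n' "$allowed" | grep -qxF -- "$ref" ||
            echo "git update-ref -d $ref"
    done <<< "$copy_refs"

    # step 3: force-fetch every allowed ref from the main repo
    while IFS= read -r ref; do
        [ -n "$ref" ] || continue
        echo "git fetch -f $main_path $ref:$ref"
    done <<< "$allowed"
}
```

Calling `plan_sync` with `refs/heads/master` and `refs/heads/secret-1` on both sides and the deny regex `secret-1$` prints one `update-ref -d` line for `secret-1` and one `fetch -f` line for `master`.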
130 contrib/partial-copy/partial-copy.mkd
@@ -0,0 +1,130 @@
+# F=partialcopy maintaining a partial copy of a repo
+
+The regular documentation on basic access control mentions [here][rpr_] that
+it is easy to maintain two repositories if you need a (set of) branch(es) to
+be "secret", with one repo that has everything, and another that has
+everything but the secret branches.
+
+Here's how gitolite can help do that sanely, with minimal hassles for all
+concerned. This will ensure the right branches propagate correctly when
+people pull/push -- you don't have to do anything manually after setting it up
+unless the rules change.
+
+To start with, here's a **NON-WORKING** config that merely describes what
+we're **trying** to achieve:
+
+    # THIS WILL NOT WORK!
+    repo    foo
+        -   secret-1$       =   wally
+        RW+ dev/USER/       =   wally
+        RW+                 =   dilbert alice ashok wally
+
+We want Wally the slacker to not be able to see the "secret-1" branch.
+
+The only way to do this is to have two repos -- one with and the other without
+the secret branch.
+
+<font color="gray">These two repos cannot share git objects (to save disk
+space) using hardlinks etc. Doing so would cause a data leak if Wally decides
+to stop slacking and start hacking. See my conversation with Shawn
+[here][gitlog1] for more on this, but it basically involves Wally finding out
+the SHA of one of the secret branches, pushing a branch that he claims to have
+built on that SHA, then fetching that branch again.
+
+It requires a serious understanding of the git transport protocol, how objects
+are sent/received, how thin packs are created, etc., to implement it. Or to
+convince yourself that someone's implementation is correct.
+
+Meanwhile, the method described here, once you accept the disk space cost, is
+quite understandable to mere mortals like me :-)</font>
+
+In the above example you had 2 sets of read access -- (1) all branches (2) all
+branches except secret-1. If you end up with one more set (say, "all branches
+except secret-2") then you need one more repo to handle it. If you can afford
+the storage, the following recipe can certainly make it *manageable*.
+
+[gitlog1]: http://colabti.org/irclogger/irclogger_log/git?date=2010-09-17#l2710
+
+## first, as usual, the caveats!
+
+ * if you change the config to disallow something that used to be allowed,
+ any tags pointing to objects that Wally's repo acquired before the change,
+ will keep coming back! That is, branch B1 had a tag T1 within it. Later,
+ B1 was disallowed for Wally. However, Wally's repo will still retain the
+ tag T1!
+
+ So, if you ever disallow a branch that used to be allowed, it's best to
+ purge Wally's repo manually and let it get rebuilt on the next access.
+ Just delete it from the disk, push the gitolite-admin config to force it
+ to re-create, then access it as a legitimate user.
+
+ * this recipe has not been, and will not be, tested with smart http.
+
+ * it probably won't play well with wildcard repos either; not tested.
+
+ * finally, mirroring support for such repos has not been tested either.
+
+## the basic idea
+
+The basic idea is very simple.
+
+ * one repo is the "main" one. It contains all the branches, and is the one
+ that people with full access will use.
+
+ * the other repo (or all the other repos, if you have more than one set, as
+ described above) is a "partial copy", with only a subset of the branches
+ in the main repo.
+
+ * every time someone accesses the partial copy, the branches that that user
+ is allowed to read are fetched from the main repo. **See note in example
+ below**.
+
+ * every time someone pushes to the partial copy, the branch being pushed is
+ sent back to the main repo before the update succeeds.
+
+The main repo is always the canonical/current one. The others may or may not
+be up to date.
+
+## the config file
+
+Here's what we actually need to put in the config file. Note that the
+reponames can be whatever you want of course.
+
+    repo    foo
+        RW+                 =   dilbert alice ashok
+
+    repo    foo-partialcopy-1
+        -   secret-1$       =   wally
+        R                   =   wally
+        RW+ dev/USER/       =   wally
+
+        config gitolite.partialCopyOf = foo
+
+**Important notes**:
+
+ * Wally must not have any access to "foo". Absolutely none at all.
+
+ * Wally's rules for `foo-partialcopy-1` must be written such that restricted
+ branches are denied. You could list only the branches he's allowed to
+ read, or you could deny the ones he's not allowed and add a blanket "R"
+ for the others later, as in this example.
+
+ Note that this is the same [deny][] logic that is normally used for write
+ operations, but applied to "read" in this case. All we're doing is using
+ this logic to determine what branches from `foo` are allowed to propagate
+ to the partial copy repo. *This is NOT being used by git to restrict
+ reads; at the risk of repetition, git does NOT have that capability*.
+
+ * All the other users with access to `foo-partialcopy-1` must be under the
+ same restrictions as Wally. So let's say Ashok is not allowed to view a
+ branch called "USCO". That needs to be defined in yet another partial
+ copy repo, and `ashok` must be removed from the access list for `foo`.
+
+## the hooks
+
+The code for both hooks is included in the source directory
+`contrib/partial-copy`. Note that this is all done *without* touching
+gitolite core at all -- we only use two hooks (`gl-pre-git` and
+`update.secondary`), both described in the [hooks][]
+section. A pictorial representation of all the stuff gitolite runs is
+[here][flow]; it may help you understand the role that these two hooks are
+playing in this scenario.
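
Note on the caveats section: the manual purge it describes (delete the partial copy from disk, then push the gitolite-admin config so it is re-created) could be scripted roughly as below. This is only a sketch. It assumes repos live under `~/repositories` and the admin clone is `~/gitolite-admin`, as in `t.sh`, and the helper names `run` and `purge_partial_copy` are hypothetical. It prints the commands by default and executes them only when `RUN=1` is set.

```shell
#!/bin/bash
# Dry-run sketch of the manual purge from the caveats section.
# ASSUMPTIONS: repo base and admin clone paths are taken from t.sh;
# adjust both for a real install.
REPO_BASE=${REPO_BASE:-$HOME/repositories}
ADMIN_CLONE=${ADMIN_CLONE:-$HOME/gitolite-admin}

# Print each command; execute it only when RUN=1 is set.
run() {
    echo "cmd: $*"
    if [ "${RUN:-0}" = 1 ]; then "$@"; fi
}

purge_partial_copy() {
    local copy=$1
    [ -n "$copy" ] || { echo "usage: purge_partial_copy <reponame>" >&2; return 1; }
    # remove the stale partial copy, including any leftover tags and objects
    run rm -rf "$REPO_BASE/$copy.git"
    # an empty commit gives us something to push, so gitolite re-creates the repo
    run git -C "$ADMIN_CLONE" commit --allow-empty -m "recreate $copy"
    run git -C "$ADMIN_CLONE" push
}
```

The last step from the caveats, accessing the repo as a legitimate user so that it gets repopulated, is left to do by hand.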
203 contrib/partial-copy/t.sh
@@ -0,0 +1,203 @@
+#!/bin/bash
+
+# test script for partial copy feature
+
+# WARNING 1: will wipe out your gitolite.conf file (you can recover by usual
+# git methods if you need of course).
+
+# WARNING 2: will wipe out (completely) the following directories:
+
+ rm -rf ~/repositories/{foo,foo-pc}.git ~/td
+
+# REQUIRED 1: please make sure rc file allows config 'gitolite.partialCopyOf'.
+
+# REQUIRED 2: please make sure you copied the 2 hooks in contrib/partial-copy
+# and installed them into gitolite
+
+# REQUIRED 3: the 'git-test' command from my 'git-tools' project
+
+# ----
+
+set -e
+mkdir ~/td
+
+# ----
+
+cd ~/gitolite-admin
+
+cat << EOF1 > conf/gitolite.conf
+ repo gitolite-admin
+ RW+ = tester
+
+ repo testing
+ RW+ = @all
+EOF1
+
+git test '## setup base conf' 'add conf' 'commit -m start' 'commit-empty' 'ok' 'push -f' 'ok'
+
+cat << EOF2 >> conf/gitolite.conf
+
+ repo foo
+ RW+ = u1 u2
+
+ repo foo-pc
+ - secret-1$ = u4
+ R = u4 # marker 01
+ RW next = u4
+ RW+ dev/USER/ = u4
+ RW refs/tags/USER/ = u4
+
+ config gitolite.partialCopyOf = foo
+
+EOF2
+
+git test << SETUP
+ ## setup partial-repos conf
+ add conf; commit -m partial-repos; commit-empty; ok;
+ # /master.*partial-repos/
+ push; ok;
+ /Init.*empty.*foo\\.git/
+ /Init.*empty.*foo-pc\\.git/
+ /u3.*u5.*u6/; !/u1/; !/u2/; !/u4/
+
+SETUP
+
+cd ~/td; rm -rf foo foo-pc
+
+git test << FOO
+ ## populate repo foo, by user u1
+ # create foo with a bunch of branches and tags
+ clone u1:foo
+ /appear.*cloned/
+ cd foo
+ a1; a2
+ checkout -b dev/u1/foo; f1; f2
+ checkout master; m1; m2
+ checkout master; checkout -b next; n1; n2; tag nt1
+ checkout -b secret-1; s11; s12; tag s1t1
+ checkout next; checkout -b secret-2; s21; s22; tag s2t1
+ push --all
+ /new branch/; /secret-1/; /secret-2/
+ push --tags
+ /new tag/; /s1t1/; /s2t1/
+FOO
+
+git test << FOOPC
+ ## user u4 tries foo, fails, tries foo-pc
+ cd $HOME/td
+ clone u4:foo foo4; !ok
+ /R access for foo DENIED to u4/
+ clone u4:foo-pc ; ok;
+ /Cloning into foo-pc/
+ /new branch.* dev/u1/foo .* dev/u1/foo/
+ /new branch.* master .* master/
+ /new branch.* next .* next/
+ /new branch.* secret-2 .* secret-2/
+ !/new branch.* secret-1 .* secret-1/
+ /new tag.* nt1 .* nt1/
+ /new tag.* s2t1 .* s2t1/
+ !/new tag.* s1t1 .* s1t1/
+
+FOOPC
+
+git test << FOOPC2
+ ## user u4 pushes to foo-pc
+ cd $HOME/td/foo-pc
+ checkout master
+ u4m1; u4m2; push; !ok
+ /W refs/heads/master foo-pc u4 DENIED by fallthru/
+ /hook declined to update refs/heads/master/
+ /To u4:foo-pc/
+ /remote rejected/
+ /failed to push some refs to 'u4:foo-pc'/
+
+ checkout next
+ u4n1; u4n2
+ push origin next; ok
+ /To /home/gl-test/repositories/foo.git/
+ /new branch\] ca3787119b7e8b9914bc22c939cefc443bc308da -> br-\d+/
+ /u4:foo-pc/
+ /52c7716..ca37871 next -> next/
+ tag u4/nexttag; push --tags
+ /To u4:foo-pc/
+ /\[new tag\] u4/nexttag -> u4/nexttag/
+ /\[new branch\] ca3787119b7e8b9914bc22c939cefc443bc308da -> br-\d+/
+
+ checkout master
+ checkout -b dev/u4/u4master
+ devu4m1; devu4m2
+ push origin HEAD; ok
+ /To /home/gl-test/repositories/foo.git/
+ /new branch\] 228353950557ed1eb13679c1fce4d2b4718a2060 -> br-\d+/
+ /u4:foo-pc/
+ /new branch.* HEAD -> dev/u4/u4master/
+
+FOOPC2
+
+git test << FOO2
+ ## user u1 gets u4's updates, makes some more
+ cd $HOME/td/foo
+ git remote update
+ /Fetching origin/
+ /From u1:foo/
+ /new branch\] dev/u4/u4master -> origin/dev/u4/u4master/
+ /new tag\] u4/nexttag -> u4/nexttag/
+ /52c7716..ca37871 next -> origin/next/
+ checkout master; u1ma1; u1ma2;
+ /\[master 8ab1ff5\] u1ma2 at Thu Jul 7 06:23:20 2011/
+ tag mt2; push-om; ok
+ checkout secret-1; u1s1b1; u1s1b2
+ /\[secret-1 5f96cb5\] u1s1b2 at Thu Jul 7 06:23:20 2011/
+ tag s1t2; push origin HEAD; ok
+ checkout secret-2; u1s2b1; u1s2b2
+ /\[secret-2 1ede682\] u1s2b2 at Thu Jul 7 06:23:20 2011/
+ tag s2t2; push origin HEAD; ok
+ push --tags; ok
+
+ git ls-remote origin
+ /8ab1ff512faf5935dc0fbff357b6f453b66bb98b\trefs/tags/mt2/
+ /5f96cb5ff73c730fb040eb2d01981f7677ca6dba\trefs/tags/s1t2/
+ /1ede6829ec7b75a53cd6acb7da64e5a8011e6050\trefs/tags/s2t2/
+FOO2
+
+git test << FOOPC3
+ ## u4 gets updates but without the tag in secret-1
+ cd $HOME/td/foo-pc
+ git ls-remote origin;
+ !/ refs/heads/secret-1/; !/s1t1/; !/s1t2/
+ /8ab1ff512faf5935dc0fbff357b6f453b66bb98b\tHEAD/
+ /8ced4a374b3935bac1a5ba27ef8dd950bd867d47\trefs/heads/dev/u1/foo/
+ /228353950557ed1eb13679c1fce4d2b4718a2060\trefs/heads/dev/u4/u4master/
+ /8ab1ff512faf5935dc0fbff357b6f453b66bb98b\trefs/heads/master/
+ /ca3787119b7e8b9914bc22c939cefc443bc308da\trefs/heads/next/
+ /1ede6829ec7b75a53cd6acb7da64e5a8011e6050\trefs/heads/secret-2/
+ /8ab1ff512faf5935dc0fbff357b6f453b66bb98b\trefs/tags/mt2/
+ /52c7716c6b029963dd167c647c1ff6222a366499\trefs/tags/nt1/
+ /01f04ece6519e7c0e6aea3d26c7e75e9c4e4b06d\trefs/tags/s2t1/
+ /1ede6829ec7b75a53cd6acb7da64e5a8011e6050\trefs/tags/s2t2/
+
+ git remote update
+ /3ea704d..8ab1ff5 master -> origin/master/
+ /01f04ec..1ede682 secret-2 -> origin/secret-2/
+ /\[new tag\] mt2 -> mt2/
+ /\[new tag\] s2t2 -> s2t2/
+ !/ refs/heads/secret-1/; !/s1t1/; !/s1t2/
+
+FOOPC3
+
+git ls-remote u4:foo-pc
+
+cd ~/gitolite-admin
+perl -ni -e 'print unless /marker 01/' conf/gitolite.conf
+git test 'add conf' 'commit -m erdel' 'ok' 'push -f' 'ok'
+
+git ls-remote u4:foo-pc
+
+cat <<RANT
+
+This is where things go all screwy. Because we still have the *objects*
+pointed to by these tags, we still get them back from the main repo.
+
+<sigh>
+
+RANT
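
Note on the last part of t.sh: deleting the line tagged `marker 01` removes u4's blanket `R` rule, which is what should make previously readable refs disappear from `foo-pc`. The edit itself can be tried on a scratch file. The sketch below uses `sed` as a stand-in for the perl one-liner in the script, and the function name `drop_marked_rule` is made up for this illustration.

```shell
#!/bin/bash
# Sketch: drop every line containing a marker from a gitolite.conf-style file.
# ASSUMPTION: sed replaces the perl -ni one-liner used in t.sh;
# the marker is treated as a basic regex and must not contain "/".
drop_marked_rule() {
    local conf=$1 marker=$2 tmp
    tmp=$(mktemp) || return 1
    # write the filtered copy first, then overwrite in place so the
    # original file keeps its permissions
    sed "/$marker/d" "$conf" > "$tmp" && cat "$tmp" > "$conf"
    local rc=$?
    rm -f "$tmp"
    return $rc
}
```

For example, `drop_marked_rule conf/gitolite.conf 'marker 01'` would leave the `RW next`, `RW+ dev/USER/` and `RW refs/tags/USER/` rules for u4 untouched.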
32 contrib/partial-copy/update.secondary
@@ -0,0 +1,32 @@
+#!/usr/bin/perl
+use strict;
+use warnings;
+
+# called from gitolite's update hook, once for each ref being pushed
+
+# "we", "our repo" => the partial copy
+# "main", "pco" => the one which we are a "partial copy of"
+
+my $main=`git config --file $ENV{GL_REPO_BASE_ABS}/$ENV{GL_REPO}.git/config --get gitolite.partialCopyOf`;
+chomp ($main);
+
+exit 0 unless $main;
+
+die "ENV GL_RC not set\n" unless $ENV{GL_RC};
+die "ENV GL_BINDIR not set\n" unless $ENV{GL_BINDIR};
+
+unshift @INC, $ENV{GL_BINDIR};
+require gitolite or die "parse gitolite.pm failed\n";
+gitolite->import;
+
+my ($ref, $old, $new) = @ARGV;
+my $rand = int(rand(100000000));
+
+$ENV{GL_BYPASS_UPDATE_HOOK} = 1;
+system("git", "push", "-f", "$ENV{GL_REPO_BASE_ABS}/$main.git", "$new:refs/heads/br-$rand") and die "FATAL: failed to send $new\n";
+
+wrap_chdir("$ENV{GL_REPO_BASE_ABS}/$main.git");
+system("git", "update-ref", "-d", "refs/heads/br-$rand");
+system("git", "update-ref", $ref, $new, $old) and die "FATAL: update-ref for $ref failed\n";
+
+exit 0;
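
Note on update.secondary: the push-back happens in three moves. The new commit is pushed to a throwaway branch in the main repo, the throwaway branch is deleted, and the real ref is then moved with `git update-ref <ref> <new> <old>`, which refuses to update if the ref no longer points at `<old>`. The bash sketch below only prints those commands. The function name `plan_push_back` is hypothetical, and the temp branch name is passed in as an argument (the hook picks a random number instead) so that the output is repeatable.

```shell
#!/bin/bash
# Dry-run sketch of update.secondary's push-back; prints commands only.
# ASSUMPTION: the caller supplies the temp branch name; the real hook
# generates br-<random number> itself.
plan_push_back() {
    local main_path=$1 ref=$2 old=$3 new=$4 tmp=$5
    # 1. get the new objects into the main repo via a throwaway branch
    echo "git push -f $main_path $new:refs/heads/$tmp"
    # 2. the objects are in the main repo now; drop the throwaway branch
    echo "git -C $main_path update-ref -d refs/heads/$tmp"
    # 3. move the real ref, but only if it still points at $old
    echo "git -C $main_path update-ref $ref $new $old"
}
```

Passing `<old>` in step 3 is what guards against someone else having updated the same ref in the main repo in the meantime.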
8 doc/gitolite.conf-by-example.mkd
@@ -98,11 +98,11 @@ branch or tag on it.
Wally can only read the repo. Alice and Ashok can push but not rewind; only
Sitaram and Dilbert can do that.
- R master = wally # MEANINGLESS! WILL NOT DO WHAT YOU THINK IT DOES!!
+And now, a common misunderstanding:
-This won't work. You can only restrict "read" access at the repo level, not
-at the branch level. This is a git issue, not a gitolite issue. Go bother
-them, or switch to gerrit.
+ R master = wally # WILL NOT DO WHAT YOU THINK IT DOES!!
+
+This won't work. Please see [here][rpr_] for more on this.
repo foo
RW master$ = dilbert alice
44 doc/gitolite.conf.mkd
@@ -86,22 +86,6 @@ tag with a new value.
In a later section you'll see some more advanced permissions.
-<font color="gray">
-
-Side note: apparently it needs to be spelled out that "R" permissions can only
-apply to the entire repo and not to individual branches/tags. Mention was
-made of a certain popular Linux distribution named after animals with
-adjectives, chosen merely for alliterative purposes, prefixed to their names,
-and of their users not being clueful enough to know that this (the "read"
-thing, not the alliterative adjective thing, in case you lost track) is an
-inherent git characteristic.
-
-Meanwhile, people who *desperately* need this are directed to gerrit, which
-can do this because they have their own git stack and dont use the one written
-by Linus and currently maintained by Junio.
-
-</font>
-
### how rules are matched
It's important to understand that there're two levels at which access control
@@ -229,6 +213,34 @@ documentation for [`~/.gitolite.rc`][rc].
When used as a reponame, it includes all repos.
+### F=rpr_ side note: "R" permissions for refs
+
+You can control "read" access only at the repo level, not at the branch level.
+For example, this **won't** limit Wally to reading only the master branch:
+
+    repo    foo
+        R   master  =   wally   # WILL NOT DO WHAT YOU THINK IT DOES!!
+
+and this **won't** prevent him from reading it:
+
+    repo    foo
+        -   master  =   wally   # WILL NOT DO WHAT YOU THINK IT DOES!!
+        R           =   wally
+
+This (inability to distinguish one ref from another during a read operation)
+is a git issue, not a gitolite issue.
+
+There are 3 ways around this, though:
+
+ * switch to gerrit, which has its own git stack, its own sshd, and God knows
+ what else. All written in Java, the COBOL of the internet era ;-)
+ * bug the git people to add this feature in ;-)
+ * use a separate repo for Wally.
+
+Using separate repos is not that hard with gitolite. Here's how to maintain a
+[partial copy][partialcopy] of the main repo and keep it synced (while not
+allowing the secret branches into it).
+
## F=aac advanced access control
The previous section is sufficient for most common needs, but gitolite can go