Switch branches/tags
Nothing to show
Find file Copy path
Fetching contributors…
Cannot retrieve contributors at this time
69 lines (57 sloc) 3.3 KB
This describes the design of Agito, and how it differs to other
Subversion-Git conversion programs.
Subversion has, on the face of it, what seems like a fairly simple
scheme, where branches, tags and other are united within a single tree
structure, with a single incrementing revision counter. It seems like
writing a tool to convert a Subversion repository to a Git repository
ought therefore to be fairly straightforward, but in fact it's more
complicated than it seems.
Consider a simple example of a Subversion revision history. There is
a single branch, /branch. At a certain point, it is copied to /tag to
create a tag (using 'svn cp'). However, this is later determined to be
a mistake. The tag is deleted and later recreated again at the same
location from a later version of /branch.
------------------------------------------ /branch
\ \
\ \
--------- X --------- /tag
r1000 r1001 [...] r1050
When a directory is copied in Subversion, the history is preserved.
That is, when you run 'svn log' on a copied directory, you see the
commits from the original directory from which it was copied, in
addition to any new commits since the copy.
This is a nice feature, and essential really, as it makes Subversion
into a useful version control system. But now we have a problem: in
the example above, what is the history of '/tag'? We now have two
contradictory definitions.
* The first definition is what I will call the "objective" history,
because this is the history as we see it from the root of the
repository. The objective history says that '/tag' was copied from
'/branch', then it was deleted, then it was copied from '/branch'
* The second definition is what I will call the "subjective"
definition, because this is the history as we see it from within
the perspective of '/tag'. It's what we get when we run 'svn log'
from within '/tag'. The subjective history says that there were a
number of changes made on '/branch', then '/branch' was copied to
'/tag'. The previously-created and subsequently-deleted '/tag'
"never happened", or at least it's not part of the subjective
Here's the key thing to note: the subjective history, while perhaps
slightly trickier to comprehend, is what's useful in the context of a
version control system. If we're converting a Subversion repository to
Git, it's what we should probably be trying to capture - the history
we should be trying to reconstruct.
Probably the most popular Subversion to Git conversion tool is
git-svn, but it seems that this works by reconstructing to *objective*
history, not the subjective history. You can see this when you run it,
as it iterates through each commit in the repository's history,
"sorting" each commit into the appropriate branch. It doesn't handle
cases like the example above properly: a branch that is deleted and
recreated is handled like a merge of the two histories, which is
Agito works the opposite way: it takes the current (HEAD) state of the
repository, then works backwards to reconstruct the history of the
branches and tags it finds. I've written it to be able to handle
Chocolate Doom's version control repository, which contains several
examples like the one above in its history.