barrbrain edited this page Sep 14, 2010 · 2 revisions

Welcome to the svn-dump-fast-export wiki!

Documentation on the design of svn-dump-fast-export will become available here as well as within the repository.

A string pool module efficiently assigns an identifier to each unique name. The differencing and mutation methods in the repository tree module take advantage of these identifiers.
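As an illustration of the idea, here is a minimal sketch of a string pool in C. It is not the project's actual code: the names `string_pool`, `pool_init`, `pool_intern` and `pool_fetch`, the fixed table size and the choice of FNV-1a hashing with linear probing are all assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; not the project's actual API. */
#define POOL_SLOTS 1024u       /* power of two, fixed size for brevity */
#define POOL_NONE  UINT32_MAX  /* marks an empty slot or a failure */

struct string_pool {
    char *names[POOL_SLOTS];     /* id -> owned copy of the name */
    uint32_t slots[POOL_SLOTS];  /* hash slot -> id, or POOL_NONE */
    uint32_t count;              /* number of ids handed out so far */
};

static void pool_init(struct string_pool *p)
{
    p->count = 0;
    for (uint32_t i = 0; i < POOL_SLOTS; i++)
        p->slots[i] = POOL_NONE;
}

/* 32-bit FNV-1a hash of a NUL-terminated string. */
static uint32_t pool_hash(const char *s)
{
    uint32_t h = 2166136261u;
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619u;
    }
    return h;
}

/* Return the id of name, assigning the next free id if it is new.
 * Returns POOL_NONE when the table is full or memory runs out. */
static uint32_t pool_intern(struct string_pool *p, const char *name)
{
    uint32_t i = pool_hash(name) & (POOL_SLOTS - 1);
    while (p->slots[i] != POOL_NONE) {
        if (strcmp(p->names[p->slots[i]], name) == 0)
            return p->slots[i];          /* already pooled */
        i = (i + 1) & (POOL_SLOTS - 1);  /* linear probing */
    }
    /* Keep the table at most half full so probing always terminates. */
    if (p->count >= POOL_SLOTS / 2)
        return POOL_NONE;
    size_t len = strlen(name) + 1;
    char *copy = malloc(len);
    if (!copy)
        return POOL_NONE;
    memcpy(copy, name, len);
    p->names[p->count] = copy;
    p->slots[i] = p->count;
    return p->count++;
}

/* Map an id back to its name, or NULL if the id was never assigned. */
static const char *pool_fetch(const struct string_pool *p, uint32_t id)
{
    return id < p->count ? p->names[id] : NULL;
}
```

With a pool like this, two names are equal exactly when their identifiers are equal, so code that compares names can compare integers instead of strings.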

A copy-on-write multi-way tree implementation is used to represent the structure of commits, directories and files within the source repository. There are effectively five kinds of object: name, blob/symlink/executable, directory entry, directory and commit. There is also, implicitly, the repository root. To make it easier to persist this structure in future, each type of object is allocated within its own contiguous region, and an object's identifier is its position relative to the start of that region. Any link to an object is recorded as this integer identifier.
This structure has proved effective for a repository of 20,000 commits. About 30% of memory usage can be saved by defragmenting the structure after each commit, at the cost of 100 lines of C.
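The region-and-identifier scheme can be sketched as follows. This is an illustrative example only, not the project's implementation: the type and function names (`dirent_obj`, `dirent_region`, `dirent_alloc`, `dirent_ptr`, `dirent_clone`), the field layout and the doubling growth policy are assumptions. It shows a single region for directory entries; the other object types would each get a region of the same shape.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define ID_NONE UINT32_MAX  /* hypothetical "no object" marker */

/* Hypothetical directory entry: links are integer ids, not pointers. */
struct dirent_obj {
    uint32_t name_id;     /* id of the entry's name */
    uint32_t content_id;  /* id of the blob or directory it refers to */
};

/* One contiguous, growable region holding every directory entry. */
struct dirent_region {
    struct dirent_obj *base;  /* start of the region */
    uint32_t used;            /* objects allocated so far */
    uint32_t cap;             /* objects the region can hold */
};

/* Allocate one object; its id is its index from the region start.
 * The new object's fields are left uninitialised. */
static uint32_t dirent_alloc(struct dirent_region *r)
{
    if (r->used == r->cap) {
        if (r->cap > UINT32_MAX / 2)
            return ID_NONE;  /* doubling would overflow */
        uint32_t cap = r->cap ? r->cap * 2 : 16;
        struct dirent_obj *base = realloc(r->base, cap * sizeof(*base));
        if (!base)
            return ID_NONE;
        r->base = base;
        r->cap = cap;
    }
    return r->used++;
}

/* Resolve an id to a pointer. The pointer is only valid until the
 * next allocation, because realloc may move the region. */
static struct dirent_obj *dirent_ptr(struct dirent_region *r, uint32_t id)
{
    return id < r->used ? &r->base[id] : NULL;
}

/* Copy-on-write step: duplicate an object and return the copy's id,
 * leaving the original untouched for older commits to reference. */
static uint32_t dirent_clone(struct dirent_region *r, uint32_t id)
{
    if (id >= r->used)
        return ID_NONE;
    uint32_t copy = dirent_alloc(r);
    if (copy != ID_NONE)
        r->base[copy] = r->base[id];
    return copy;
}
```

Because links are offsets rather than pointers, they stay valid when the region grows and is moved in memory, and a region could in principle be written to disk and read back without rewriting any links.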

The parser calls into the repository tree module through a simple API, which provides methods to add, delete, copy, replace, modify and commit.
The parser may eventually be separated into a generic RFC822 parser module and a Subversion-specific parser module.

There may eventually be a Git translation module that maps subdirectories to branches before outputting the Git commands.
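Since no such module exists yet, the following is only a sketch of what a subdirectory-to-branch mapping could look like. It assumes the conventional Subversion `trunk`/`branches` layout and assumes that `trunk` maps to a branch named `master`; the function name `path_to_branch` and its signature are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, assuming the conventional trunk/branches layout.
 * Writes the branch name into out and returns the rest of the path
 * inside that branch, or NULL if the path is outside any branch or
 * the name does not fit in out. */
static const char *path_to_branch(const char *path, char *out, size_t outlen)
{
    /* "trunk" or "trunk/..." maps to "master" (an assumption). */
    if (strncmp(path, "trunk", 5) == 0 &&
        (path[5] == '/' || path[5] == '\0')) {
        if (outlen < sizeof("master"))
            return NULL;
        memcpy(out, "master", sizeof("master"));
        return path[5] ? path + 6 : path + 5;
    }
    /* "branches/NAME" or "branches/NAME/..." maps to NAME. */
    if (strncmp(path, "branches/", 9) == 0 &&
        path[9] != '\0' && path[9] != '/') {
        const char *name = path + 9;
        size_t len = strcspn(name, "/");
        if (len + 1 > outlen)
            return NULL;
        memcpy(out, name, len);
        out[len] = '\0';
        return name[len] ? name + len + 1 : name + len;
    }
    return NULL;
}
```

For example, `trunk/src/main.c` would yield branch `master` with remainder `src/main.c`, and `branches/stable/README` would yield branch `stable` with remainder `README`. Paths under other top-level directories, such as `tags`, are left unmapped in this sketch.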

Current development efforts are two-fold: adding broader support and simplifying the code.
