Commits on Jan 7, 2013
  1. Merge git_v1.8.1:vcs-svn into master

    Conflicts:
    	LICENSE
    
    Updated with new contact details and attributions.
    barrbrain committed Jan 7, 2013
Commits on Jul 6, 2012
  1. vcs-svn: suppress a signed/unsigned comparison warning

    The preceding code checks that view->max_off is nonnegative and
    (off + width) fits in an off_t, so this code is already safe.
    
    Signed-off-by: David Barr <davidbarr@google.com>
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    barrbrain committed with jrn May 31, 2012
  2. vcs-svn: suppress signed/unsigned comparison warnings

    These are already safe because both sides of the comparison are
    nonnegative.
    
    This would normally not be important because Git is not -Wsign-compare
    clean anyway, but we like to keep the vcs-svn/ lib to a higher
    standard so that it is convenient to use in other projects.
    
    Signed-off-by: David Barr <davidbarr@google.com>
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    barrbrain committed with jrn May 31, 2012
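    A minimal sketch of the kind of comparison these two commits describe,
    assuming a hypothetical helper offset_within() that is not part of
    vcs-svn/: once the signed side is known to be nonnegative, casting both
    sides to a wide unsigned type keeps the value and quiets -Wsign-compare.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/*
 * Hypothetical helper (illustration only): return 1 if the signed
 * offset lies within the unsigned length, 0 otherwise.  The sign is
 * checked first, so the cast cannot change the value of off.
 */
static int offset_within(off_t off, size_t len)
{
	if (off < 0)
		return 0;
	return (uintmax_t) off <= (uintmax_t) len;
}
```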
  3. vcs-svn: use strstr instead of memmem

    memmem is a GNU extension.
    
    Avoiding it makes the code clearer and makes it easier for projects
    that don't share git's compat/ code, such as the standalone
    svn-dump-fast-export project, to reuse the vcs-svn/ library.
    
    Signed-off-by: David Barr <davidbarr@google.com>
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    barrbrain committed with jrn May 31, 2012
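    As a rough illustration of the trade-off, memmem() searches a buffer of
    explicit length, while strstr() only needs the standard C library but
    requires both arguments to be NUL-terminated.  The helper name below is
    hypothetical and not taken from vcs-svn/.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper (illustration only): report whether a
 * NUL-terminated line contains the NUL-terminated needle.
 */
static int line_contains(const char *line, const char *needle)
{
	assert(line && needle);
	return strstr(line, needle) != NULL;
}
```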
  4. vcs-svn: use constcmp instead of prefixcmp

    Since the length of t is already known, we can simplify a little by
    using memcmp() instead of strncmp() to carry out a prefix comparison.
    All nearby code already does this.
    
    Noticed in the standalone svn-dump-fast-export project which has not
    needed to implement prefixcmp() yet.
    
    Signed-off-by: David Barr <davidbarr@google.com>
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    barrbrain committed with jrn May 31, 2012
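    A sketch of a prefix comparison with memcmp() when the length of the
    input is already known.  The macro and function names here are
    hypothetical; the sketch only assumes that the caller has verified the
    input is at least as long as the literal.

```c
#include <assert.h>
#include <string.h>

/* Compare the start of s with a string literal; sizeof excludes nothing but the NUL we subtract. */
#define literal_prefixcmp(s, ref) memcmp((s), (ref), sizeof(ref) - 1)

/*
 * Hypothetical check (illustration only): does the len-byte key
 * start with "Node-"?  The length test keeps memcmp() in bounds.
 */
static int is_node_key(const char *t, size_t len)
{
	assert(t);
	return len >= sizeof("Node-") - 1 && !literal_prefixcmp(t, "Node-");
}
```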
  5. vcs-svn: simplify cleanup in apply_one_window

    Currently the cleanup code looks like this:
    
    	free resources
    	return 0;
     error_out:
    	free resources
    	return -1;
    
    Avoid duplicating the "free resources" part by keeping the return
    value in a variable and sharing code between the success and
    exceptional case:
    
    	ret = 0;
     out:
    	free resources
    	return ret;
    
    Noticed in the svn-dump-fast-export project, where using the error()
    macro in void context produces a warning.
    
    Signed-off-by: David Barr <davidbarr@google.com>
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    barrbrain committed with jrn May 31, 2012
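    The shared-cleanup shape can be sketched in a small standalone function.
    measure_copy() is a hypothetical example, not code from svndiff.c, and
    this variant starts with ret = -1 and sets ret = 0 only on success.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical example (illustration only): copy src into a scratch
 * buffer and report its length through *len_out.
 * Returns 0 on success and -1 on failure.
 */
static int measure_copy(const char *src, size_t *len_out)
{
	int ret = -1;		/* assume failure until the end */
	char *scratch = NULL;

	assert(len_out);
	if (!src)
		goto out;
	scratch = malloc(strlen(src) + 1);
	if (!scratch)
		goto out;
	strcpy(scratch, src);
	*len_out = strlen(scratch);
	ret = 0;
out:
	free(scratch);		/* free(NULL) is a no-op */
	return ret;
}
```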
  6. vcs-svn: avoid self-assignment in dummy initialization of pre_off

    Without this change, clang complains:
    
     vcs-svn/svndiff.c:298:3: warning: Assigned value is garbage or undefined
                     off_t pre_off = pre_off; /* stupid GCC... */
                     ^               ~~~~~~~
    
    This code uses an old and common idiom for suppressing an
    "uninitialized variable" warning, and clang is wrong to warn about it.
    The idiom tells the compiler to leave the variable uninitialized,
    which saves a few bytes of code size, and, more importantly, allows
    valgrind to check at runtime that the variable is properly initialized
    by the time it is used.
    
    But MSVC and clang do not know that idiom, so let's avoid it in
    vcs-svn/ code.
    
    Initialize pre_off to -1, a recognizably meaningless value, to allow
    future code changes that cause pre_off to be used before it is
    initialized to be caught early.
    
    Signed-off-by: David Barr <davidbarr@google.com>
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    barrbrain committed with jrn May 31, 2012
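    A small sketch of the sentinel initialization, using a hypothetical
    function rather than the actual svndiff.c code: the offset starts at -1,
    so a path that forgets to set it yields an obviously invalid value
    instead of garbage.

```c
#include <assert.h>
#include <sys/types.h>

/*
 * Hypothetical example (illustration only): pre_off is meaningful
 * only when have_preimage is set, so it starts at -1 rather than
 * being left uninitialized.
 */
static off_t preimage_end(int have_preimage, off_t start, off_t len)
{
	off_t pre_off = -1;	/* recognizably meaningless */

	assert(start >= 0 && len >= 0);
	if (have_preimage)
		pre_off = start;
	if (pre_off < 0)
		return -1;	/* offset was never set */
	return pre_off + len;
}
```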
  7. vcs-svn: drop no-op reset methods

    Since v1.7.5~42^2~6 (vcs-svn: remove buffer_read_string),
    buffer_reset() has done nothing, and therefore neither has
    fast_export_reset().
    
    Signed-off-by: David Barr <davidbarr@google.com>
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    barrbrain committed with jrn May 31, 2012
Commits on May 26, 2011
  1. vcs-svn: implement text-delta handling

    Handle input in Subversion's dumpfile format, version 3.  This is the
    format produced by "svnrdump dump" and "svnadmin dump --deltas", and
    the main difference between v3 dumpfiles and the dumpfiles already
    handled is that these can include nodes whose properties and text are
    expressed relative to some other node.
    
    To handle such nodes, we find which node the text and properties are
    based on, handle its property changes, use the cat-blob command to
    request the basis blob from the fast-import backend, use the
    svndiff0_apply() helper to apply the text delta on the fly, writing
    output to a temporary file, and then measure that postimage file's
    length and write its content to the fast-import stream.
    
    The temporary postimage file is shared between delta-using nodes to
    avoid some file system overhead.
    
    The svn-fe interface needs to be more complicated to accommodate the
    backward flow of information from the fast-import backend to svn-fe.
    The backflow fd is not needed when parsing streams without deltas,
    though, so existing scripts using svn-fe on v2 dumps should
    continue to work.
    
    NEEDSWORK: generalize interface so caller sets the backflow fd, close
    temporary file before exiting
    
    Signed-off-by: David Barr <david.barr@cordelta.com>
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    barrbrain committed with jrn Mar 19, 2011
Commits on Mar 26, 2011
  1. vcs-svn: avoid using ls command twice

    Currently there are two functions to retrieve the mode and content
    at a path:
    
    	const char *repo_read_path(const uint32_t *path);
    	uint32_t repo_read_mode(const uint32_t *path);
    
    Replace them with a single function with two return values.  This
    means we can use one round-trip to get the same information from
    fast-import that previously took two.
    
    Signed-off-by: David Barr <david.barr@cordelta.com>
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    barrbrain committed with jrn Dec 13, 2010
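    One common way to give a C function two return values is an output
    parameter.  The sketch below assumes a toy in-memory table and plain
    string paths instead of the tokenised uint32_t paths; read_path() and
    its table are hypothetical and do not talk to fast-import.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct entry {
	const char *path;
	uint32_t mode;
	const char *content;
};

/* Toy data standing in for what a single lookup would return. */
static const struct entry entries[] = {
	{ "README", 0100644, "blob-a" },
	{ "run.sh", 0100755, "blob-b" },
};

/*
 * One lookup yields both values: the content id is returned and the
 * mode is stored in *mode_out.  Unknown paths give NULL and mode 0.
 */
static const char *read_path(const char *path, uint32_t *mode_out)
{
	size_t i;

	assert(path && mode_out);
	for (i = 0; i < sizeof(entries) / sizeof(entries[0]); i++) {
		if (!strcmp(entries[i].path, path)) {
			*mode_out = entries[i].mode;
			return entries[i].content;
		}
	}
	*mode_out = 0;
	return NULL;
}
```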
Commits on Mar 22, 2011
  1. vcs-svn: drop obj_pool

    This reverts commit 4709455db3891f6cad9a96a574296b4926f70cbe (Add
    memory pool library, 2010-08-09).  svn-fe uses strbufs to avoid memory
    allocation overhead nowadays.
    
    Signed-off-by: David Barr <david.barr@cordelta.com>
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    barrbrain committed with jrn Dec 13, 2010
  2. vcs-svn: drop treap

    This reverts commit 951f316470acc7c785c460a4e40735b22822349f
    (Add treap implementation, 2010-08-09).  The string_pool was
    trp.h's last user.
    
    Signed-off-by: David Barr <david.barr@cordelta.com>
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    barrbrain committed with jrn Dec 13, 2010
  3. vcs-svn: drop string_pool

    This reverts commit 1d73b52f5ba4184de6acf474f14668001304a10c
    (Add string-specific memory pool, 2010-08-09).  Now that svn-fe
    does not need to maintain a growing collection of strings (paths)
    over a long period of time, the string_pool is not needed.
    
    Signed-off-by: David Barr <david.barr@cordelta.com>
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    barrbrain committed with jrn Dec 13, 2010
  4. vcs-svn: pass paths through to fast-import

    Now that there is no internal representation of the repo, it is not
    necessary to tokenise paths.  Use strbuf instead and bypass
    string_pool.
    
    This means svn-fe can handle arbitrarily long paths (as long as a
    strbuf can fit them), with arbitrarily many path components.
    
    While at it, since we now treat paths in their entirety, only quote
    when necessary.
    
    Signed-off-by: David Barr <david.barr@cordelta.com>
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    barrbrain committed with jrn Dec 13, 2010
  5. vcs-svn: use strchr to find RFC822 delimiter

    This is a small optimisation (4% reduction in user time) but is the
    largest artifact within the parsing portion of svndump.c.
    
    Signed-off-by: David Barr <david.barr@cordelta.com>
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    barrbrain committed with jrn Dec 14, 2010
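    A sketch of splitting an RFC822-style "Key: value" header with strchr().
    split_header() is a hypothetical helper written for illustration, not
    the code in svndump.c.

```c
#include <assert.h>
#include <string.h>

/*
 * Split "Key: value" in place.  On success the key is NUL-terminated
 * at the colon and a pointer to the value is returned.  Lines without
 * a ": " delimiter return NULL and are left unmodified.
 */
static char *split_header(char *line)
{
	char *colon;

	assert(line);
	colon = strchr(line, ':');
	if (!colon || colon[1] != ' ')
		return NULL;
	*colon = '\0';
	return colon + 2;
}
```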
  6. vcs-svn: implement perfect hash for top-level keys

    Instead of interning property names and comparing their string_pool
    keys, look them up in a table by string length, which should be about
    as fast.
    
    Another small step towards removing dependence on string_pool
    altogether.
    
    Signed-off-by: David Barr <david.barr@cordelta.com>
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    barrbrain committed with jrn Dec 13, 2010
  7. vcs-svn: implement perfect hash for node-prop keys

    Instead of interning property names and comparing their string_pool
    keys, look them up in a table by string length, which should be about
    as fast.
    
    This is a small step towards removing dependence on string_pool.
    
    Signed-off-by: David Barr <david.barr@cordelta.com>
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    barrbrain committed with jrn Dec 13, 2010
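    Looking keys up by string length works as a perfect hash whenever every
    recognised key has a distinct length.  The sketch below assumes an
    illustrative set of Subversion property names; the enum and function
    names are hypothetical.

```c
#include <assert.h>
#include <string.h>

enum prop {
	PROP_UNKNOWN,
	PROP_LOG,
	PROP_DATE,
	PROP_AUTHOR,
	PROP_SPECIAL,
	PROP_EXECUTABLE
};

/* Match only if the len bytes of key equal the literal exactly. */
#define key_is(key, len, lit) \
	((len) == sizeof(lit) - 1 && !memcmp((key), (lit), sizeof(lit) - 1))

/*
 * Dispatch on length first; at most one memcmp() is then needed,
 * because no two keys in this set share a length.
 */
static enum prop classify_prop(const char *key, size_t len)
{
	assert(key);
	switch (len) {
	case 7:
		return key_is(key, len, "svn:log") ? PROP_LOG : PROP_UNKNOWN;
	case 8:
		return key_is(key, len, "svn:date") ? PROP_DATE : PROP_UNKNOWN;
	case 10:
		return key_is(key, len, "svn:author") ? PROP_AUTHOR : PROP_UNKNOWN;
	case 11:
		return key_is(key, len, "svn:special") ? PROP_SPECIAL : PROP_UNKNOWN;
	case 14:
		return key_is(key, len, "svn:executable") ? PROP_EXECUTABLE : PROP_UNKNOWN;
	default:
		return PROP_UNKNOWN;
	}
}
```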
  8. vcs-svn: use strbuf for author, UUID, and URL

    Use strbufs and strings instead of interned strings for values of rev,
    dump, and node fields that happen to be strings.  After this change,
    the only remaining string_pool use is for paths in the repo_tree API
    and internals.
    
    Functional change: treat an empty author, UUID, or URL as none at all.
    So for example, in repos where the first revision has an empty
    svn:author property, the first rev will be treated as by "nobody"
    rather than by a person with empty name and email address created by
    prepending an @ sign to the repository UUID.
    
    Signed-off-by: David Barr <david.barr@cordelta.com>
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    barrbrain committed with jrn Mar 22, 2011
  9. vcs-svn: use strbuf for revision log

    obj_pool is overkill for this application: all that is needed is a
    buffer that can resize from rev to rev to accommodate differently-sized
    strings.  In the spirit of commit deadcef4 (2010-11-06), use a strbuf
    instead.
    
    This is a small step towards removing dependence on obj_pool.h.
    
    Signed-off-by: David Barr <david.barr@cordelta.com>
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    barrbrain committed with jrn Mar 21, 2011
Commits on Mar 7, 2011
  1. vcs-svn: use mark from previous import for parent commit

    With this patch, overlapping incremental imports work.
    
    Signed-off-by: David Barr <david.barr@cordelta.com>
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    barrbrain committed with jrn Dec 12, 2010
  2. vcs-svn: quote paths correctly for ls command

    This bug was found while importing rev 601865 of ASF.
    
    [jn: with test]
    
    Signed-off-by: David Barr <david.barr@cordelta.com>
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    barrbrain committed with jrn Dec 11, 2010
  3. vcs-svn: set up channel to read fast-import cat-blob response

    Set up some plumbing: teach the svndump lib to pass a file descriptor
    number to the fast_export lib, representing where cat-blob/ls
    responses can be read from, and add a get_response_line helper
    function to the fast_export lib to read a line from that file.
    
    Unfortunately this means that svn-fe needs file descriptor 3 to be
    redirected from somewhere (preferably the cat-blob stream of a
    fast-import backend); otherwise it will fail:
    
    	$ svndump <path> | svn-fe
    	fatal: cannot read from file descriptor 3: Bad file descriptor
    
    For the moment, "svn-fe 3</dev/null" works as a workaround but it
    will not work for very long.  A fast-import backend that can retrieve
    old commits is needed in order to be able to fulfill svn
    "Node-copyfrom-rev" requests that refer to revs from a previous run.
    
    [jn: with new change description]
    
    Based-on-patch-by: Jonathan Nieder <jrnieder@gmail.com>
    Signed-off-by: David Barr <david.barr@cordelta.com>
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    barrbrain committed with jrn Mar 5, 2011
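    Once a response line has been read, it still has to be interpreted.
    fast-import answers a cat-blob request with a header of the form
    "<sha1> SP 'blob' SP <size> LF" followed by the contents.  The sketch
    below parses such a header after the trailing newline has been stripped;
    parse_cat_blob_header() is a hypothetical name, not the fast_export lib
    API.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse "<sha1> blob <size>" (newline already removed).
 * Returns the blob size, or -1 if the line is malformed.
 */
static long parse_cat_blob_header(const char *line)
{
	const char *type;
	const char *digits;
	char *end;
	long size;

	assert(line);
	type = strchr(line, ' ');
	if (!type || strncmp(type, " blob ", 6))
		return -1;
	digits = type + 6;
	errno = 0;
	size = strtol(digits, &end, 10);
	if (errno || end == digits || *end != '\0' || size < 0)
		return -1;
	return size;
}
```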
Commits on Nov 24, 2010
  1. vcs-svn: Implement Prop-delta handling

    The rules for what file is used as delta source for each file are not
    documented in dump-load-format.txt.  Luckily, the Apache Software
    Foundation repository has rich enough examples to figure out most of
    the rules:
    
    Node-action: replace implies the empty property set and empty text as
    preimage for deltas.  Otherwise, if a copyfrom source is given, that
    node is the preimage for deltas.  Lastly, if none of the above applies
    and the node path exists in the current revision, then that version
    forms the basis.
    
    [jn: refactored, with tests]
    
    Signed-off-by: David Barr <david.barr@cordelta.com>
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    barrbrain committed with gitster Nov 20, 2010
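    The preimage rules listed in this commit message can be sketched as a
    small decision function.  The names are hypothetical, and the final
    fallback to an empty preimage when no rule applies is an assumption; the
    message does not state it.

```c
enum delta_base {
	BASE_EMPTY,	/* empty property set and empty text */
	BASE_COPYFROM,	/* the copyfrom source node */
	BASE_CURRENT	/* the node at this path in the current revision */
};

/* Apply the rules in the order the commit message gives them. */
static enum delta_base choose_delta_base(int is_replace, int has_copyfrom,
					 int path_exists)
{
	if (is_replace)
		return BASE_EMPTY;
	if (has_copyfrom)
		return BASE_COPYFROM;
	if (path_exists)
		return BASE_CURRENT;
	return BASE_EMPTY;	/* assumption: nothing to base the delta on */
}
```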
  2. vcs-svn: Allow simple v3 dumps (no deltas yet)

    Since the dumpfile version 1 days, the Subversion dump format
    gained some new fields:
    
     - a unique identifier for the repository (version 2 format)
     - whether the text and properties for a node should be
       interpreted as deltas
     - checksums for a delta's preimage
     - SHA-1 sums as alternatives to the existing MD5 checksums for
       copy source and the payload (delta).
    
    For now what is relevant to us is the Text-delta and Prop-delta
    fields, since not noticing these causes a dump file to be
    misinterpreted (see the previous commit).
    
    [jn: with tests]
    
    Signed-off-by: David Barr <david.barr@cordelta.com>
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    barrbrain committed with gitster Nov 18, 2010
Commits on Aug 15, 2010
  1. SVN dump parser

    svndump parses data that is in SVN dumpfile format produced by
    `svnadmin dump` with the help of line_buffer and uses repo_tree and
    fast_export to emit a git fast-import stream.
    
    Based roughly on com.hydrografix.svndump 0.92 from the SvnToCCase
    project at <http://svn2cc.sarovar.org/>, by Stefan Hegny and
    others.
    
    [rr: allow input from files other than stdin]
    [jn: with test, more error reporting]
    
    Signed-off-by: David Barr <david.barr@cordelta.com>
    Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    barrbrain committed with gitster Aug 9, 2010
  2. Infrastructure to write revisions in fast-export format

    repo_tree maintains the exporter's state and provides a facility to
    call fast_export, which writes objects to stdout suitable for
    consumption by fast-import.
    
    The exported functions roughly correspond to Subversion FS operations.
    
     . repo_add, repo_modify, repo_copy, repo_replace, and repo_delete
       update the current commit, based roughly on the corresponding
       Subversion FS operation.
    
     . repo_commit calls out to fast_export to write the current commit to
       the fast-import stream in stdout.
    
     . repo_diff is used by the fast_export module to write the changes
       for a commit.
    
     . repo_reset erases the exporter's state, so valgrind can be happy.
    
    [rr: squelched compiler warnings]
    [jn: removed support for maintaining state on-disk, though we may
    want to add it back later]
    
    Signed-off-by: David Barr <david.barr@cordelta.com>
    Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    barrbrain committed with gitster Aug 9, 2010
  3. Add stream helper library

    This library provides thread-unsafe fgets()- and fread()-like
    functions where the caller does not have to supply a buffer.  It
    maintains a couple of static buffers and provides an API to use
    them.
    
    [rr: allow input from files other than stdin]
    [jn: with tests, documentation, and error handling improvements]
    
    Signed-off-by: David Barr <david.barr@cordelta.com>
    Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    barrbrain committed with gitster Aug 9, 2010
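    A minimal sketch of an fgets()-like reader that owns its buffer, in the
    spirit of the library described here.  The name read_line() and the
    buffer size are assumptions, and like the original it is not
    thread-safe: each call overwrites the previous result.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Shared static buffer; the returned pointer is valid until the next call. */
static char line_buf[4096];

/*
 * Read one line from in, strip the trailing newline, and return a
 * pointer into the static buffer.  Returns NULL at end of input or
 * on a read error.  Lines longer than the buffer are returned in pieces.
 */
static char *read_line(FILE *in)
{
	size_t len;

	assert(in);
	if (!fgets(line_buf, sizeof(line_buf), in))
		return NULL;
	len = strlen(line_buf);
	if (len && line_buf[len - 1] == '\n')
		line_buf[len - 1] = '\0';
	return line_buf;
}
```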
  4. Add string-specific memory pool

    Intern strings so they can be compared by address and stored without
    wasting space.
    
    This library uses the macros in obj_pool.h and trp.h to create a
    memory pool for strings and expose an API for handling them.
    
    [rr: added API docs]
    [jn: with some API simplifications, new documentation and tests]
    
    Signed-off-by: David Barr <david.barr@cordelta.com>
    Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    barrbrain committed with gitster Aug 9, 2010
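    Interning can be sketched with a small fixed-size table.  Note that this
    sketch swaps in a linear scan where the library uses a treap and a
    memory pool, and that intern() and MAX_INTERNED are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_INTERNED 64

static char *interned[MAX_INTERNED];
static size_t interned_nr;

/*
 * Return the canonical copy of s, so equal strings share one address
 * and can be compared with ==.  Returns NULL if the table is full or
 * memory runs out.
 */
static const char *intern(const char *s)
{
	size_t i;
	char *copy;

	assert(s);
	for (i = 0; i < interned_nr; i++)
		if (!strcmp(interned[i], s))
			return interned[i];
	if (interned_nr == MAX_INTERNED)
		return NULL;
	copy = malloc(strlen(s) + 1);
	if (!copy)
		return NULL;
	strcpy(copy, s);
	interned[interned_nr++] = copy;
	return copy;
}
```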
  5. Add memory pool library

    Add a memory pool library implemented using C macros. The
    obj_pool_gen() macro creates a type-specific memory pool.
    
    The memory pool library is distinguished from the existing specialized
    allocators in alloc.c by using a contiguous block for all allocations.
    This means that on one hand, long-lived pointers have to be written as
    offsets, since the base address changes as the pool grows, but on the
    other hand, the entire pool can be easily written to the file system.
    This could allow the memory pool to persist between runs of an
    application.
    
    For the svn importer, such a facility is useful because each svn
    revision can copy trees and files from any previous revision.  The
    relevant information for all revisions has to persist somehow to
    support incremental runs.
    
    [rr: minor cleanups]
    [jn: added tests; removed file system backing for now]
    
    Signed-off-by: David Barr <david.barr@cordelta.com>
    Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    barrbrain committed with gitster Aug 9, 2010
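    The offsets-instead-of-pointers trade-off can be sketched with a tiny
    growable pool of ints.  This uses plain functions rather than the
    obj_pool_gen() macro, and every name in it is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct int_pool {
	int *base;		/* one contiguous block; may move on growth */
	uint32_t size;
	uint32_t capacity;
};

/* Allocate one slot and return its offset, or UINT32_MAX on failure. */
static uint32_t int_pool_alloc(struct int_pool *pool)
{
	assert(pool);
	if (pool->size == pool->capacity) {
		uint32_t cap = pool->capacity ? pool->capacity * 2 : 4;
		int *grown = realloc(pool->base, cap * sizeof(*grown));

		if (!grown)
			return UINT32_MAX;
		pool->base = grown;
		pool->capacity = cap;
	}
	return pool->size++;
}

/*
 * Turn an offset into a pointer.  The pointer is only valid until the
 * next allocation, which is why long-lived references are kept as offsets.
 */
static int *int_pool_pointer(struct int_pool *pool, uint32_t offset)
{
	assert(pool);
	return offset < pool->size ? &pool->base[offset] : NULL;
}
```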