Apr 20, 2009

  1. dscho

    Fix off-by-one in read_tree_recursive

    Found by valgrind.
    Signed-off-by: Johannes Schindelin <>
    Signed-off-by: Junio C Hamano <>
    dscho authored gitster committed

Apr 18, 2009

  1. Junio C Hamano

    Merge branch 'bs/maint-1.6.0-tree-walk-prefix' into maint

    * bs/maint-1.6.0-tree-walk-prefix:
      match_tree_entry(): a pathspec only matches at directory boundaries
      tree_entry_interesting: a pathspec only matches at directory boundary
    gitster authored

Apr 02, 2009

  1. Junio C Hamano

    match_tree_entry(): a pathspec only matches at directory boundaries

    Previously the code did a simple prefix match, which means that a path in
    a directory "frotz/" would have matched with pathspec "f".
    Signed-off-by: Junio C Hamano <>
    gitster authored
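
The boundary rule described above can be sketched in a few lines of C. This is only an illustration, not the code from match_tree_entry(): the helper name `matches_at_boundary` and its two-string interface are assumptions made for the example.

```c
#include <string.h>

/*
 * Hypothetical helper (not git's match_tree_entry()): returns 1 when
 * `spec` matches `path` exactly or names one of its leading directories.
 */
static int matches_at_boundary(const char *path, const char *spec)
{
	size_t len = strlen(spec);

	if (strncmp(path, spec, len) != 0)
		return 0;	/* spec is not even a prefix of path */
	if (len == 0)
		return 1;	/* an empty pathspec matches everything */
	if (spec[len - 1] == '/')
		return 1;	/* spec itself already ends at a boundary */

	/* The prefix must end at the end of path or right before a '/'. */
	return path[len] == '\0' || path[len] == '/';
}
```

With this check, the pathspec "f" no longer matches "frotz/nitfol", while "frotz" and "frotz/" still do.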

Feb 11, 2009

  1. Junio C Hamano

    Drop double-semicolon in C

    The worst offenders are "continue;;" and "break;;" in switch statements.
    Signed-off-by: Junio C Hamano <>
    gitster authored

Feb 07, 2009

  1. tree.c: allow read_tree_recursive() to traverse gitlink entries

    When the callback function invoked from read_tree_recursive() returns
    the value `READ_TREE_RECURSIVE` for a gitlink entry, the traversal will
    now continue into the tree connected to the gitlinked commit. This
    functionality can be used to allow inter-repository operations, but
    since the current users of read_tree_recursive() do not yet support
    such operations, they have been modified where necessary to make sure
    that they never return READ_TREE_RECURSIVE for gitlink entries (hence
    no change in behaviour should be introduced by this patch alone).
    Signed-off-by: Lars Hjemli <>
    Signed-off-by: Junio C Hamano <>
    Lars Hjemli authored gitster committed

Jul 15, 2008

  1. add context pointer to read_tree_recursive()

    Add a pointer parameter to read_tree_recursive(), which is passed to the
    callback function.  This allows callers of read_tree_recursive() to
    share data with the callback without resorting to global variables.  All
    current callers pass NULL.
    Signed-off-by: Rene Scharfe <>
    Signed-off-by: Junio C Hamano <>
    René Scharfe authored gitster committed
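
A rough sketch of the context-pointer pattern the commit describes follows. The names `walk_entries`, `entry_fn` and `count_entry` are hypothetical stand-ins, and the walker iterates over a plain array of names rather than a real tree; the actual read_tree_recursive() signature is not reproduced here.

```c
#include <stddef.h>

/* Hypothetical callback type: gets each name plus the caller's context. */
typedef int (*entry_fn)(const char *name, void *context);

/* Hypothetical walker standing in for read_tree_recursive(). */
static int walk_entries(const char **names, size_t n,
			entry_fn fn, void *context)
{
	for (size_t i = 0; i < n; i++) {
		int ret = fn(names[i], context);
		if (ret)
			return ret;	/* the callback asked to stop */
	}
	return 0;
}

/* Callback that counts entries through the context pointer
 * instead of through a global variable. */
static int count_entry(const char *name, void *context)
{
	(void)name;
	(*(size_t *)context)++;
	return 0;
}
```

A caller would pass the address of a local `size_t` counter as the last argument; callers that need no shared state can pass NULL, as the commit says all current callers do.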

Mar 02, 2008

  1. Junio C Hamano

    Merge branch 'mk/maint-parse-careful'

    * mk/maint-parse-careful:
      receive-pack: use strict mode for unpacking objects
      index-pack: introduce checking mode
      unpack-objects: prevent writing of inconsistent objects
      unpack-object: cache for non written objects
      add common fsck error printing function
      builtin-fsck: move common object checking code to fsck.c
      builtin-fsck: reports missing parent commits
      Remove unused object-ref code
      builtin-fsck: move away from object-refs to fsck_walk
      add generic, type aware object chain walker
    gitster authored

Feb 26, 2008

  1. Remove unused object-ref code

    Signed-off-by: Martin Koegler <>
    Signed-off-by: Junio C Hamano <>
    Martin Koegler authored gitster committed

Jan 21, 2008

  1. Linus Torvalds

    Make on-disk index representation separate from in-core one

    This converts the index explicitly on read and write to its on-disk
    format, allowing the in-core format to contain more flags, and be
    In particular, the in-core format is now host-endian (as opposed to the
    on-disk one that is network endian in order to be able to be shared
    across machines) and as a result we can dispense with all the
    htonl/ntohl on accesses to the cache_entry fields.
    This will make it easier to make use of various temporary flags that do
    not exist in the on-disk format.
    Signed-off-by: Linus Torvalds <>
    torvalds authored
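
The idea of converting once at the read/write boundary might look roughly like the sketch below. The struct layouts and function names are invented for illustration and are much smaller than git's real cache entry; only `ntohl()` and `htonl()` (from POSIX `<arpa/inet.h>`) are real calls.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Hypothetical on-disk layout: every field is in network byte order. */
struct ondisk_entry_sketch {
	uint32_t mode;
	uint32_t size;
};

/* Hypothetical in-core layout: host byte order plus in-memory-only flags. */
struct incore_entry_sketch {
	uint32_t mode;
	uint32_t size;
	unsigned int flags;	/* never written to disk */
};

/* Convert once while reading; later accesses need no ntohl(). */
static void entry_from_disk(struct incore_entry_sketch *dst,
			    const struct ondisk_entry_sketch *src)
{
	dst->mode = ntohl(src->mode);
	dst->size = ntohl(src->size);
	dst->flags = 0;
}

/* Convert once while writing; the temporary flags are dropped. */
static void entry_to_disk(struct ondisk_entry_sketch *dst,
			  const struct incore_entry_sketch *src)
{
	dst->mode = htonl(src->mode);
	dst->size = htonl(src->size);
}
```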

Aug 10, 2007

  1. Junio C Hamano

    Optimize "diff --cached" performance.

    The read_tree() function is called only from the call chain to
    run "git diff --cached" (this includes the internal call made by
    git-runstatus to run_diff_index()).  The function vacates stage
    without any funky "merge" magic.  The caller then goes and
    compares stage #1 entries from the tree with stage #0 entries
    from the original index.
    When adding the cache entries this way, it used the general
    purpose add_cache_entry().  This function looks for an existing
    entry to replace or if there is none to find where to insert the
    new entry, resolves D/F conflict and all the other things.
    For the purpose of reading entries into an empty stage, none of
    that processing is needed.  We can instead append everything and
    then sort the result at the end.
    This commit changes read_tree() to first make sure that there are
    no existing cache entries at the specified stage, and if that is the
    case, it runs add_cache_entry() with ADD_CACHE_JUST_APPEND flag
    (new), and then sorts the resulting cache using qsort().
    This new flag tells add_cache_entry() to omit all the checks
    such as "Does this path already exist?  Does adding this path
    remove other existing entries because it turns a directory to a
    file?" and instead append the given cache entry straight at the
    end of the active cache.  The caller of course is expected to
    sort the resulting cache at the end before using the result.
    Signed-off-by: Junio C Hamano <>
    gitster authored
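
The "append everything, then sort once" strategy can be illustrated with the standard `qsort()`. The `struct entry_sketch` type, the comparison by name only, and the function names are assumptions for this sketch; the real cache ordering and the ADD_CACHE_JUST_APPEND code path are not shown.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, heavily reduced stand-in for a cache entry. */
struct entry_sketch {
	const char *name;
};

/* Order entries by name only (an assumption of this sketch). */
static int cmp_entry_sketch(const void *a, const void *b)
{
	const struct entry_sketch *ea = a;
	const struct entry_sketch *eb = b;

	return strcmp(ea->name, eb->name);
}

/*
 * Append every name without looking for duplicates or an insertion
 * point, then restore the sorted invariant with a single qsort().
 * `dst` must have room for `n` entries.
 */
static void append_all_then_sort(struct entry_sketch *dst,
				 const char **names, size_t n)
{
	for (size_t i = 0; i < n; i++)
		dst[i].name = names[i];	/* plain append, no checks */
	qsort(dst, n, sizeof(*dst), cmp_entry_sketch);
}
```

Skipping the per-entry checks is only safe because the target stage is known to be empty beforehand, which is the precondition the commit message states.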

Jun 06, 2007

  1. Junio C Hamano

    Merge branch 'sv/objfixes'

    * sv/objfixes:
      Don't assume tree entries that are not dirs are blobs
      git-cvsimport: Make sure to use $git_dir always instead of .git sometimes
      fix documentation of unpack-objects -n
      Accept dates before 2000/01/01 when specified as seconds since the epoch
    gitster authored
  2. Don't assume tree entries that are not dirs are blobs

    When scanning the trees in track_tree_refs() there is a "lazy" test
    that assumes that entries are either directories or files.  Don't do
    Signed-off-by: Junio C Hamano <>
    Sam Vilain authored gitster committed

May 22, 2007

  1. Martin Waitz

    rename dirlink to gitlink.

    Unify naming of plumbing dirlink/gitlink concept:
    git ls-files -z '*.[ch]' |
    xargs -0 perl -pi -e 's/dirlink/gitlink/g;' -e 's/DIRLNK/GITLINK/g;'
    Signed-off-by: Junio C Hamano <>
    tali authored Junio C Hamano committed

Apr 22, 2007

  1. Merge branch 'lt/objalloc'

    * 'lt/objalloc':
      Clean up object creation to use more common code
      Use proper object allocators for unknown object nodes too
    Junio C Hamano authored

Apr 17, 2007

  1. Linus Torvalds

    Clean up object creation to use more common code

    This replaces the fairly odd "created_object()" function that did _most_
    of the object setup with a more complete "create_object()" function that
    also has a more natural calling convention.
    Signed-off-by: Linus Torvalds <>
    Signed-off-by: Junio C Hamano <>
    torvalds authored Junio C Hamano committed

Apr 10, 2007

  1. Linus Torvalds

    Teach "fsck" not to follow subproject links

    Since the subprojects don't necessarily even exist in the current tree,
    much less in the current git repository (they are totally independent
    repositories), we do not want to try to follow the chain from one git
    repository to another through a gitlink.
    This involves teaching fsck to ignore references to gitlink objects from
    a tree and from the current index.
    Signed-off-by: Linus Torvalds <>
    Signed-off-by: Junio C Hamano <>
    torvalds authored Junio C Hamano committed

Mar 21, 2007

  1. Linus Torvalds

    Initialize tree descriptors with a helper function rather than by hand.

    This removes slightly more lines than it adds, but the real reason for
    doing this is that future optimizations will require more setup of the
    tree descriptor, and so we want to do it in one place.
    Also renamed the "desc.buf" field to "desc.buffer" just to trigger
    compiler errors for old-style manual initializations, making sure I
    didn't miss anything.
    Signed-off-by: Linus Torvalds <>
    Signed-off-by: Junio C Hamano <>
    torvalds authored Junio C Hamano committed
  2. Linus Torvalds

    Remove "pathlen" from "struct name_entry"

    Since we have the "tree_entry_len()" helper function these days, and
    don't need to do a full strlen(), there's no point in saving the path
    length - it's just redundant information.
    Signed-off-by: Linus Torvalds <>
    Signed-off-by: Junio C Hamano <>
    torvalds authored Junio C Hamano committed

Mar 19, 2007

  1. Linus Torvalds

    Trivial cleanup of track_tree_refs()

    This makes "track_tree_refs()" use the same "tree_entry()" function for
    counting the entries as it does for actually traversing them a few lines
    Not a biggie, but the reason I care was that this was the only user of
    "update_tree_entry()" that didn't actually *extract* the tree entry first.
    It doesn't matter as things stand now, but it meant that a separate
    test-patch I had that avoided a few more "strlen()" calls by just saving
    the entry length in the entry descriptor and using it directly when
    updating wouldn't work without this patch.
    Signed-off-by: Linus Torvalds <>
    Signed-off-by: Junio C Hamano <>
    torvalds authored Junio C Hamano committed

Feb 27, 2007

  1. convert object type handling from a string to a number

    We currently have two parallel notations for dealing with object types
    in the code: a string and a numerical value.  One of them is obviously
    redundant, and the most used one requires more stack space and a bunch
    of strcmp() all over the place.
    This is an initial step for the removal of the version using a char array
    found in object reading code paths.  The patch is unfortunately large but
    there is no sane way to split it in smaller parts without breaking the
    Signed-off-by: Nicolas Pitre <>
    Signed-off-by: Junio C Hamano <>
    Nicolas Pitre authored Junio C Hamano committed

Dec 20, 2006

  1. simplify inclusion of system header files.

    This is a mechanical clean-up of the way *.c files include
    system header files.
     (1) sources under compat/, platform sha-1 implementations, and
         xdelta code are exempt from the following rules;
     (2) the first #include must be "git-compat-util.h" or one of
         our own header files that includes it first (e.g. config.h,
         builtin.h, pkt-line.h);
     (3) system headers that are included in "git-compat-util.h"
         need not be included in individual C source files.
     (4) "git-compat-util.h" does not have to include subsystem
         specific header files (e.g. expat.h).
    Signed-off-by: Junio C Hamano <>
    Junio C Hamano authored

Aug 23, 2006

  1. Shawn O. Pearce

    Convert memcpy(a,b,20) to hashcpy(a,b).

    This abstracts away the size of the hash values when copying them
    from memory location to memory location, much as the introduction
    of hashcmp abstracted away hash value comparison.
    A few call sites were using char* rather than unsigned char* so
    I added the cast rather than open hashcpy to be void*.  This is a
    reasonable tradeoff as most call sites already use unsigned char*
    and the existing hashcmp is also declared to be unsigned char*.
    [jc: Split the patch to "master" part, to be followed by a
     patch for merge-recursive.c which is not in "master" yet.
     Fixed the cast in the latter hunk to combine-diff.c which was
     wrong in the original.
     Also converted ones left-over in combine-diff.c, diff-lib.c and
     upload-pack.c ]
    Signed-off-by: Shawn O. Pearce <>
    Signed-off-by: Junio C Hamano <>
    spearce authored Junio C Hamano committed
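
A minimal sketch of what such a wrapper could look like is given below. The names `hashcpy_sketch` and `SKETCH_HASH_SIZE` are placeholders chosen for this example; the exact definition in git's headers is not reproduced here.

```c
#include <string.h>

#define SKETCH_HASH_SIZE 20	/* SHA-1 length in bytes */

/*
 * Hypothetical equivalent of memcpy(dst, src, 20): the hash length
 * lives in one place instead of being repeated at every call site.
 */
static void hashcpy_sketch(unsigned char *dst, const unsigned char *src)
{
	memcpy(dst, src, SKETCH_HASH_SIZE);
}
```

Taking `unsigned char *` rather than `void *` mirrors the tradeoff mentioned in the commit message: call sites holding a plain `char *` need a cast, but mismatched pointer types are caught by the compiler.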

Aug 15, 2006

  1. Make track_tree_refs void.

    Signed-off-by: David Rientjes <>
    Signed-off-by: Junio C Hamano <>
    David Rientjes authored Junio C Hamano committed

Jul 13, 2006

  1. Remove TYPE_* constant macros and use object_type enums consistently.

    This updates the type-enumeration constants introduced to reduce
    the memory footprint of "struct object" to match the type bits
    already used in the packfile format, by removing the former
    (i.e. TYPE_* constant macros) and using the latter (i.e. enum
    object_type) throughout the code for consistency.
    Eventually we can stop passing around the "type strings"
    entirely, and this will help - no confusion about two different
    integer enumerations.
    Signed-off-by: Linus Torvalds <>
    Signed-off-by: Junio C Hamano <>
    Linus Torvalds authored Junio C Hamano committed

Jun 20, 2006

  1. Add specialized object allocator

    This creates a simple specialized object allocator for basic
    This avoids wasting space with malloc overhead (metadata and
    extra alignment), since the specialized allocator knows the
    alignment, and that objects, once allocated, are never freed.
    It also allows us to track some basic statistics about object
    allocations. For example, for the mozilla import, it shows
    object usage as follows:
         blobs:   627629 (14710 kB)
         trees:  1119035 (34969 kB)
       commits:   196423  (8440 kB)
          tags:     1336    (46 kB)
    and the simpler allocator shaves about 2.5% off the memory
    footprint of a "git-rev-list --all --objects", and is a bit
    faster too.
    [ Side note: this concludes the series of "save memory in object storage".
      The thing is, there simply isn't much more to be saved on the objects.
      Doing "git-rev-list --all --objects" on the mozilla archive has a final
      total RSS of 131498 pages for me: that's about 513MB. Of that, the
      object overhead is now just 56MB, the rest is going somewhere else (put
      another way: the fact that this patch shaves off 2.5% of the total
      memory overhead, considering that objects are now not much more than 10%
      of the total shows how big the wasted space really was: this makes
      object allocations much more memory- and time-efficient).
      I haven't looked at where the rest is, but I suspect the bulk of it is
      just the pack-file loading. It may be that we should pack the tree
      objects separately from the blob objects: for git-rev-list --objects, we
      don't actually ever need to even look at the blobs, but since trees and
      blobs are interspersed in the pack-file, we end up not being dense in
      the tree accesses, so we end up looking at more pages than we strictly
      need to.
      So with a 535MB pack-file, it's entirely possible - even likely - that
      most of the remaining RSS is just the mmap of the pack-file itself. We
      don't need to map in _all_ of it, but we do end up mapping a fair
      amount. ]
    Signed-off-by: Linus Torvalds <>
    Signed-off-by: Junio C Hamano <>
    Linus Torvalds authored Junio C Hamano committed
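
The kind of allocator described above can be sketched as a simple block ("bump") allocator. Everything here is an assumption for illustration: the `node_pool` struct, the `SKETCH_BLOCKING` block size and the `pool_alloc` name are invented, and git's own allocator may differ in detail.

```c
#include <stdlib.h>

#define SKETCH_BLOCKING 1024	/* nodes per block; an arbitrary choice */

/* Hypothetical per-type pool; node_size should be sizeof() of the type. */
struct node_pool {
	size_t node_size;
	char *block;		/* next free node in the current block */
	size_t left;		/* nodes still free in the current block */
	unsigned long count;	/* statistics: nodes handed out so far */
};

/*
 * Hand out one zeroed node. Nodes are carved out of large blocks, so
 * there is no per-node malloc metadata; nothing is ever freed.
 */
static void *pool_alloc(struct node_pool *pool)
{
	void *ret;

	if (!pool->left) {
		pool->block = calloc(SKETCH_BLOCKING, pool->node_size);
		if (!pool->block)
			return NULL;
		pool->left = SKETCH_BLOCKING;
	}
	ret = pool->block;
	pool->block += pool->node_size;
	pool->left--;
	pool->count++;
	return ret;
}
```

The `count` field is what makes per-type statistics like the blob/tree/commit/tag table above cheap to report.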

Jun 18, 2006

  1. Shrink "struct object" a bit

    This shrinks "struct object" by a small amount, by getting rid of the
    "struct type *" pointer and replacing it with a 3-bit bitfield instead.
    In addition, we merge the bitfields and the "flags" field, which
    incidentally should also remove a useless 4-byte padding from the object
    when in 64-bit mode.
    Now, our "struct object" is still too damn large, but it's now less
    obviously bloated, and of the remaining fields, only the "util" (which is
    not used by most things) is clearly something that should be eventually
    This shrinks the "git-rev-list --all" memory use by about 2.5% on the
    kernel archive (and, perhaps more importantly, on the larger mozilla
    archive). That may not sound like much, but I suspect it's more on a
    64-bit platform.
    There are other remaining inefficiencies (the parent lists, for example,
    probably have horrible malloc overhead), but this was pretty obvious.
    Most of the patch is just changing the comparison of the "type" pointer
    from one of the constant string pointers to the appropriate new TYPE_xxx
    small integer constant.
    Signed-off-by: Linus Torvalds <>
    Signed-off-by: Junio C Hamano <>
    Linus Torvalds authored Junio C Hamano committed
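
To make the layout change concrete, here is a sketch of a struct that keeps the type as a 3-bit field sharing one word with the flags. The field names, the widths other than the 3-bit type, and the `SKETCH_*` constants are assumptions for illustration, not a copy of git's "struct object".

```c
/* Hypothetical small-integer type codes; the values are illustrative. */
enum sketch_type {
	SKETCH_COMMIT = 1,
	SKETCH_TREE = 2,
	SKETCH_BLOB = 3,
	SKETCH_TAG = 4
};

/* Type and flags packed into one 32-bit group instead of a pointer
 * plus a separate flags word. */
struct sketch_object {
	unsigned parsed : 1;
	unsigned used : 1;
	unsigned type : 3;	/* replaces the type pointer */
	unsigned flags : 27;
	unsigned char sha1[20];
};

/* A type check becomes an integer comparison, not a pointer comparison. */
static int is_tree(const struct sketch_object *obj)
{
	return obj->type == SKETCH_TREE;
}
```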

May 31, 2006

  1. tree_entry(): new tree-walking helper function

    This adds a "tree_entry()" function that combines the common operation of
    doing a "tree_entry_extract()" + "update_tree_entry()".
    It also has a simplified calling convention, designed for simple loops
    that traverse over a whole tree: the arguments are pointers to the tree
    descriptor and a name_entry structure to fill in, and it returns a boolean
    "true" if there was an entry left to be gotten in the tree.
    This allows tree traversal with
    	struct tree_desc desc;
    	struct name_entry entry;
    	desc.buf = tree->buffer;
    	desc.size = tree->size;
    	while (tree_entry(&desc, &entry)) {
    		... use "entry.{path, sha1, mode, pathlen}" ...
    	}
    which is not only shorter than writing it out in full, it's hopefully less
    error prone too.
    [ It's actually a tad faster too - we don't need to recalculate the entry
      pathlength in both extract and update, but need to do it only once.
      Also, some callers can avoid doing a "strlen()" on the result, since
      it's returned as part of the name_entry structure.
      However, by now we're talking just 1% speedup on "git-rev-list --objects
      --all", and we're definitely at the point where tree walking is no
      longer the issue any more. ]
    NOTE! Not everybody wants to use this new helper function, since some of
    the tree walkers very much on purpose do the descriptor update separately
    from the entry extraction. So the "extract + update" sequence still
    remains as the core sequence, this is just a simplified interface.
    We should probably add a silly two-line inline helper function for
    initializing the descriptor from the "struct tree" too, just to cut down
    on the noise from that common "desc" initializer.
    Signed-off-by: Linus Torvalds <>
    Signed-off-by: Junio C Hamano <>
    Linus Torvalds authored Junio C Hamano committed

May 30, 2006

  1. Remove last vestiges of generic tree_entry_list

    The old tree_entry_list is dead, long live the unified single tree
    Yes, we now still have a compatibility function to create a bogus
    tree_entry_list in builtin-read-tree.c, but that is now entirely local
    to that very messy piece of code.
    I'd love to clean read-tree.c up too, but I'm too scared right now, so
    the best I can do is to just contain the damage, and try to make sure
    that no new users of the tree_entry_list sprout up by not having it as
    an exported interface any more.
    Signed-off-by: Linus Torvalds <>
    Signed-off-by: Junio C Hamano <>
    Linus Torvalds authored Junio C Hamano committed
  2. Remove unused "zeropad" entry from tree_list_entry

    That was a hack, only needed because 'git fsck-objects' didn't look at
    the raw tree format.  Now that fsck traverses the tree itself, we can
    drop it.
    Signed-off-by: Linus Torvalds <>
    Signed-off-by: Junio C Hamano <>
    Linus Torvalds authored Junio C Hamano committed
  3. Remove "tree->entries" tree-entry list from tree parser

    Instead, just use the tree buffer directly, and use the tree-walk
    infrastructure to walk the buffers instead of the tree-entry list.
    The tree-entry list is inefficient, and generates tons of small
    allocations for no good reason. The tree-walk infrastructure is
    generally no harder to use than following a linked list, and allows
    us to do most tree parsing in-place.
    Some programs still use the old tree-entry lists, and are a bit
    painful to convert without major surgery. For them we have a helper
    function that creates a temporary tree-entry list on demand.
    Signed-off-by: Linus Torvalds <>
    Signed-off-by: Junio C Hamano <>
    Linus Torvalds authored Junio C Hamano committed
  4. Switch "read_tree_recursive()" over to tree-walk functionality

    Don't use the tree_entry list any more.
    Signed-off-by: Linus Torvalds <>
    Signed-off-by: Junio C Hamano <>
    Linus Torvalds authored Junio C Hamano committed
  5. Make "tree_entry" have a SHA1 instead of a union of object pointers

    This is preparatory work for further cleanups, where we try to make
    tree_entry look more like the more efficient tree-walk descriptor.
    Signed-off-by: Linus Torvalds <>
    Signed-off-by: Junio C Hamano <>
    Linus Torvalds authored Junio C Hamano committed
  6. Make "struct tree" contain the pointer to the tree buffer

    This allows us to avoid allocating information for names etc, because
    we can just use the information from the tree buffer directly.
    Signed-off-by: Linus Torvalds <>
    Signed-off-by: Junio C Hamano <>
    Linus Torvalds authored Junio C Hamano committed

Apr 04, 2006

  1. Replace xmalloc+memset(0) with xcalloc.

    Signed-off-by: Peter Eriksen <>
    Signed-off-by: Junio C Hamano <>
    Peter Eriksen authored Junio C Hamano committed

Jan 26, 2006

  1. Only use a single parser for tree objects

    This makes read_tree_recursive and read_tree take a struct tree
    instead of a buffer. It also moves the declaration of read_tree into
    tree.h (where struct tree is defined), and updates ls-tree and
    diff-index (the only places that presently use read_tree*()) to use
    the new versions.
    Signed-off-by: Daniel Barkalow <>
    Signed-off-by: Junio C Hamano <>
    Daniel Barkalow authored Junio C Hamano committed