Commits on Mar 21, 2007
  1. Initialize tree descriptors with a helper function rather than by hand.

    torvalds authored Junio C Hamano committed
    This removes slightly more lines than it adds, but the real reason for
    doing this is that future optimizations will require more setup of the
    tree descriptor, and so we want to do it in one place.
    
    Also renamed the "desc.buf" field to "desc.buffer" just to trigger
    compiler errors for old-style manual initializations, making sure I
    didn't miss anything.
    
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
  2. Remove "pathlen" from "struct name_entry"

    torvalds authored Junio C Hamano committed
    Since we have the "tree_entry_len()" helper function these days, and
    don't need to do a full strlen(), there's no point in saving the path
    length - it's just redundant information.
    
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
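
The two changes above can be sketched roughly as follows. This is a minimal illustration, not the code from the commits: the `_sketch` names and the reduced descriptor layout are assumptions made for the example. It shows a single initialization helper, plus a path length derived from two pointers instead of a stored "pathlen" field.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a tree descriptor (illustrative, not git's definition). */
struct tree_desc_sketch {
	const void *buffer;   /* raw tree object data */
	unsigned long size;   /* bytes remaining in buffer */
};

/* One place to set up a descriptor, so later setup changes touch a single function. */
static void init_tree_desc_sketch(struct tree_desc_sketch *desc,
				  const void *buffer, unsigned long size)
{
	desc->buffer = buffer;
	desc->size = size;
}

/*
 * In a raw entry "<mode> <path>\0<sha1>", the hash starts one byte after
 * the path's NUL, so the path length follows from the two pointers
 * without calling strlen() and without storing it anywhere.
 */
static size_t entry_len_sketch(const char *path, const unsigned char *sha1)
{
	assert((const char *)sha1 > path);
	return (size_t)((const char *)sha1 - path) - 1;
}
```

Because every caller goes through the helper, a renamed or added descriptor field only has to be handled in one function.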
Commits on Mar 19, 2007
  1. Trivial cleanup of track_tree_refs()

    torvalds authored Junio C Hamano committed
    This makes "track_tree_refs()" use the same "tree_entry()" function for
    counting the entries as it does for actually traversing them a few lines
    later.
    
    Not a biggie, but the reason I care was that this was the only user of
    "update_tree_entry()" that didn't actually *extract* the tree entry first.
    It doesn't matter as things stand now, but it meant that a separate
    test-patch I had that avoided a few more "strlen()" calls by just saving
    the entry length in the entry descriptor and using it directly when
    updating wouldn't work without this patch.
    
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
Commits on Feb 27, 2007
  1. convert object type handling from a string to a number

    Nicolas Pitre authored Junio C Hamano committed
    We currently have two parallel notations for dealing with object types
    in the code: a string and a numerical value.  One of them is obviously
    redundant, and the most used one requires more stack space and a bunch
    of strcmp() all over the place.
    
    This is an initial step for the removal of the version using a char array
    found in object reading code paths.  The patch is unfortunately large but
    there is no sane way to split it into smaller parts without breaking the
    system.
    
    Signed-off-by: Nicolas Pitre <nico@cam.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
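
As a rough illustration of why the numeric form is cheaper, the sketch below converts a type name once and then compares integers. The enum, its values, and the function names are assumptions for this example, not the identifiers used in the patch.

```c
#include <assert.h>
#include <string.h>

/* Illustrative enum; names and values are assumptions for this sketch. */
enum obj_type_sketch {
	SKETCH_BAD = -1,
	SKETCH_COMMIT = 1,
	SKETCH_TREE,
	SKETCH_BLOB,
	SKETCH_TAG
};

/* Convert once, at the boundary where the type arrives as text. */
static enum obj_type_sketch type_from_name(const char *name)
{
	static const char *const names[] = { NULL, "commit", "tree", "blob", "tag" };
	int i;

	assert(name != NULL);
	for (i = 1; i < (int)(sizeof(names) / sizeof(names[0])); i++)
		if (!strcmp(name, names[i]))
			return (enum obj_type_sketch)i;
	return SKETCH_BAD;
}

/* After conversion, a type check is an integer comparison, not a strcmp(). */
static int is_tree(enum obj_type_sketch type)
{
	return type == SKETCH_TREE;
}
```

Callers that previously carried a char array for the type name can carry a single small integer instead.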
Commits on Dec 20, 2006
  1. simplify inclusion of system header files.

    Junio C Hamano authored
    This is a mechanical clean-up of the way *.c files include
    system header files.
    
     (1) sources under compat/, platform sha-1 implementations, and
         xdelta code are exempt from the following rules;
    
     (2) the first #include must be "git-compat-util.h" or one of
         our own header files that include it first (e.g. config.h,
         builtin.h, pkt-line.h);
    
     (3) system headers that are included in "git-compat-util.h"
         need not be included in individual C source files.
    
     (4) "git-compat-util.h" does not have to include subsystem
         specific header files (e.g. expat.h).
    
    Signed-off-by: Junio C Hamano <junkio@cox.net>
Commits on Aug 23, 2006
  1. Convert memcpy(a,b,20) to hashcpy(a,b).

    spearce authored Junio C Hamano committed
    This abstracts away the size of the hash values when copying them
    from memory location to memory location, much as the introduction
    of hashcmp abstracted away hash value comparison.
    
    A few call sites were using char* rather than unsigned char* so
    I added the cast rather than open hashcpy to be void*.  This is a
    reasonable tradeoff as most call sites already use unsigned char*
    and the existing hashcmp is also declared to be unsigned char*.
    
    [jc: Split the patch into the "master" part, to be followed by a
     patch for merge-recursive.c which is not in "master" yet.
    
     Fixed the cast in the latter hunk to combine-diff.c which was
     wrong in the original.
    
     Also converted ones left-over in combine-diff.c, diff-lib.c and
     upload-pack.c ]
    
    Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
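
The abstraction described above amounts to something like the following sketch. The `_sketch` suffixes and the `HASH_SIZE_SKETCH` macro are assumptions for this example; only the idea of hiding the literal 20 behind a helper comes from the commit message.

```c
#include <assert.h>
#include <string.h>

#define HASH_SIZE_SKETCH 20  /* SHA-1 digest length in bytes */

/* Copy one hash value; the size lives here instead of at every call site. */
static void hashcpy_sketch(unsigned char *dst, const unsigned char *src)
{
	assert(dst != src);  /* memcpy() requires non-overlapping buffers */
	memcpy(dst, src, HASH_SIZE_SKETCH);
}

/* Compare two hash values; returns 0 when they are equal. */
static int hashcmp_sketch(const unsigned char *a, const unsigned char *b)
{
	return memcmp(a, b, HASH_SIZE_SKETCH);
}
```

A call site holding a `char *` buffer would cast at the call, e.g. `hashcpy_sketch((unsigned char *)buf, sha1)`, which keeps the helper's signature strict.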
Commits on Aug 15, 2006
  1. Make track_tree_refs void.

    David Rientjes authored Junio C Hamano committed
    Signed-off-by: David Rientjes <rientjes@google.com>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
Commits on Jul 13, 2006
  1. Remove TYPE_* constant macros and use object_type enums consistently.

    Linus Torvalds authored Junio C Hamano committed
    This updates the type-enumeration constants introduced to reduce
    the memory footprint of "struct object" to match the type bits
    already used in the packfile format, by removing the former
    (i.e. TYPE_* constant macros) and using the latter (i.e. enum
    object_type) throughout the code for consistency.
    
    Eventually we can stop passing around the "type strings"
    entirely, and this will help - no confusion about two different
    integer enumerations.
    
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
Commits on Jun 20, 2006
  1. Add specialized object allocator

    Linus Torvalds authored Junio C Hamano committed
    This creates a simple specialized object allocator for basic
    objects.
    
    This avoids wasting space with malloc overhead (metadata and
    extra alignment), since the specialized allocator knows the
    alignment, and that objects, once allocated, are never freed.
    
    It also allows us to track some basic statistics about object
    allocations. For example, for the mozilla import, it shows
    object usage as follows:
    
         blobs:   627629 (14710 kB)
         trees:  1119035 (34969 kB)
       commits:   196423  (8440 kB)
          tags:     1336    (46 kB)
    
    and the simpler allocator shaves about 2.5% off the memory
    footprint of a "git-rev-list --all --objects", and is a bit
    faster too.
    
    [ Side note: this concludes the series of "save memory in object storage".
      The thing is, there simply isn't much more to be saved on the objects.
    
      Doing "git-rev-list --all --objects" on the mozilla archive has a final
      total RSS of 131498 pages for me: that's about 513MB. Of that, the
      object overhead is now just 56MB, the rest is going somewhere else (put
      another way: the fact that this patch shaves off 2.5% of the total
      memory overhead, considering that objects are now not much more than 10%
      of the total shows how big the wasted space really was: this makes
      object allocations much more memory- and time-efficient).
    
      I haven't looked at where the rest is, but I suspect the bulk of it is
      just the pack-file loading. It may be that we should pack the tree
      objects separately from the blob objects: for git-rev-list --objects, we
      don't actually ever need to even look at the blobs, but since trees and
      blobs are interspersed in the pack-file, we end up not being dense in
      the tree accesses, so we end up looking at more pages than we strictly
      need to.
    
      So with a 535MB pack-file, it's entirely possible - even likely - that
      most of the remaining RSS is just the mmap of the pack-file itself. We
      don't need to map in _all_ of it, but we do end up mapping a fair
      amount. ]
    
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
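
A minimal sketch of this kind of allocator, assuming a single object type and an arbitrary block size. The struct, the counters, and `alloc_node_sketch()` are all hypothetical names, not the code added by the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NODES_PER_BLOCK 1024  /* arbitrary block size for this sketch */

struct node_sketch {
	unsigned flags;
	unsigned char sha1[20];
};

static struct node_sketch *block;  /* current block being carved up */
static size_t nr_left;             /* unused nodes left in the block */
static size_t nr_allocated;        /* running statistic */

/* Hand out zeroed nodes from large blocks; nodes are never freed individually. */
static struct node_sketch *alloc_node_sketch(void)
{
	if (!nr_left) {
		/* One malloc-level allocation pays for NODES_PER_BLOCK objects. */
		block = calloc(NODES_PER_BLOCK, sizeof(*block));
		if (!block)
			abort();
		nr_left = NODES_PER_BLOCK;
	}
	assert(nr_left > 0);
	nr_left--;
	nr_allocated++;
	return block++;
}
```

Dropping the pointer to a finished block is deliberate in this sketch: it relies on the same premise as the commit message, that objects live until the process exits.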
Commits on Jun 18, 2006
  1. Shrink "struct object" a bit

    Linus Torvalds authored Junio C Hamano committed
    This shrinks "struct object" by a small amount, by getting rid of the
    "struct type *" pointer and replacing it with a 3-bit bitfield instead.
    
    In addition, we merge the bitfields and the "flags" field, which
    incidentally should also remove a useless 4-byte padding from the object
    when in 64-bit mode.
    
    Now, our "struct object" is still too damn large, but it's now less
    obviously bloated, and of the remaining fields, only the "util" (which is
    not used by most things) is clearly something that should be eventually
    discarded.
    
    This shrinks the "git-rev-list --all" memory use by about 2.5% on the
    kernel archive (and, perhaps more importantly, on the larger mozilla
    archive). That may not sound like much, but I suspect it's more on a
    64-bit platform.
    
    There are other remaining inefficiencies (the parent lists, for example,
    probably have horrible malloc overhead), but this was pretty obvious.
    
    Most of the patch is just changing the comparison of the "type" pointer
    from one of the constant string pointers to the appropriate new TYPE_xxx
    small integer constant.
    
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
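
The effect on layout can be illustrated with two hypothetical structs. The field names, the 27-bit flags width, and both `_sketch` layouts are assumptions for this example, and exact sizes depend on the compiler and ABI.

```c
#include <assert.h>

/* Before: a full pointer identifies the type (illustrative layout). */
struct object_before_sketch {
	unsigned parsed : 1;
	unsigned used : 1;
	unsigned int flags;
	unsigned char sha1[20];
	const char *type;
};

/* After: a 3-bit type shares one word with the flag bits. */
struct object_after_sketch {
	unsigned parsed : 1;
	unsigned used : 1;
	unsigned type : 3;
	unsigned flags : 27;
	unsigned char sha1[20];
};

/* A 3-bit field can hold the values 0 through 7. */
static void set_type_sketch(struct object_after_sketch *obj, unsigned type)
{
	assert(type < 8);
	obj->type = type;
}
```

On a typical 64-bit ABI the first layout also needs padding before the pointer, which is the kind of waste the commit message mentions.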
Commits on May 31, 2006
  1. tree_entry(): new tree-walking helper function

    Linus Torvalds authored Junio C Hamano committed
    This adds a "tree_entry()" function that combines the common operation of
    doing a "tree_entry_extract()" + "update_tree_entry()".
    
    It also has a simplified calling convention, designed for simple loops
    that traverse over a whole tree: the arguments are pointers to the tree
    descriptor and a name_entry structure to fill in, and it returns a boolean
    "true" if there was an entry left to be gotten in the tree.
    
    This allows tree traversal with
    
    	struct tree_desc desc;
    	struct name_entry entry;
    
    	desc.buf = tree->buffer;
    	desc.size = tree->size;
    	while (tree_entry(&desc, &entry)) {
    		... use "entry.{path, sha1, mode, pathlen}" ...
    	}
    
    which is not only shorter than writing it out in full, it's hopefully less
    error prone too.
    
    [ It's actually a tad faster too - we don't need to recalculate the entry
      pathlength in both extract and update, but need to do it only once.
      Also, some callers can avoid doing a "strlen()" on the result, since
      it's returned as part of the name_entry structure.
    
      However, by now we're talking just 1% speedup on "git-rev-list --objects
      --all", and we're definitely at the point where tree walking is no
      longer the issue any more. ]
    
    NOTE! Not everybody wants to use this new helper function, since some of
    the tree walkers very much on purpose do the descriptor update separately
    from the entry extraction. So the "extract + update" sequence still
    remains as the core sequence, this is just a simplified interface.
    
    We should probably add a silly two-line inline helper function for
    initializing the descriptor from the "struct tree" too, just to cut down
    on the noise from that common "desc" initializer.
    
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
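
To make the "extract + update" combination concrete, here is a self-contained sketch of a walker over the raw tree format ("<octal mode> <path>\0" followed by 20 hash bytes). The struct and function names are assumptions for this example, and it is not the implementation from the commit.

```c
#include <assert.h>
#include <string.h>

struct desc_sketch {
	const unsigned char *buffer;  /* next unread byte */
	unsigned long size;           /* bytes remaining */
};

struct entry_sketch {
	const char *path;
	const unsigned char *sha1;
	unsigned int mode;
};

/*
 * Extract the next entry and advance the descriptor in one call.
 * Returns 1 if an entry was filled in, 0 at the end or on malformed data.
 */
static int next_entry_sketch(struct desc_sketch *desc, struct entry_sketch *entry)
{
	const unsigned char *p, *end, *nul;
	unsigned int mode = 0;

	assert(desc && entry);
	if (!desc->size)
		return 0;
	p = desc->buffer;
	end = p + desc->size;
	/* Octal mode, terminated by a single space. */
	while (p < end && *p >= '0' && *p <= '7')
		mode = (mode << 3) + (unsigned int)(*p++ - '0');
	if (p == end || *p++ != ' ')
		return 0;
	/* NUL-terminated path, then 20 raw hash bytes. */
	nul = memchr(p, '\0', (size_t)(end - p));
	if (!nul || end - nul < 21)
		return 0;
	entry->mode = mode;
	entry->path = (const char *)p;
	entry->sha1 = nul + 1;
	desc->buffer = nul + 21;
	desc->size = (unsigned long)(end - (nul + 21));
	return 1;
}
```

A caller then loops with `while (next_entry_sketch(&desc, &entry)) { ... }`, mirroring the loop shape shown in the commit message.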
Commits on May 30, 2006
  1. Remove last vestiges of generic tree_entry_list

    Linus Torvalds authored Junio C Hamano committed
    The old tree_entry_list is dead, long live the unified single tree
    parser.
    
    Yes, we now still have a compatibility function to create a bogus
    tree_entry_list in builtin-read-tree.c, but that is now entirely local
    to that very messy piece of code.
    
    I'd love to clean read-tree.c up too, but I'm too scared right now, so
    the best I can do is to just contain the damage, and try to make sure
    that no new users of the tree_entry_list sprout up by not having it as
    an exported interface any more.
    
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
  2. Remove unused "zeropad" entry from tree_list_entry

    Linus Torvalds authored Junio C Hamano committed
    That was a hack, only needed because 'git fsck-objects' didn't look at
    the raw tree format.  Now that fsck traverses the tree itself, we can
    drop it.
    
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
  3. Remove "tree->entries" tree-entry list from tree parser

    Linus Torvalds authored Junio C Hamano committed
    Instead, just use the tree buffer directly, and use the tree-walk
    infrastructure to walk the buffers instead of the tree-entry list.
    
    The tree-entry list is inefficient, and generates tons of small
    allocations for no good reason. The tree-walk infrastructure is
    generally no harder to use than following a linked list, and allows
    us to do most tree parsing in-place.
    
    Some programs still use the old tree-entry lists, and are a bit
    painful to convert without major surgery. For them we have a helper
    function that creates a temporary tree-entry list on demand.
    
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
  4. Switch "read_tree_recursive()" over to tree-walk functionality

    Linus Torvalds authored Junio C Hamano committed
    Don't use the tree_entry list any more.
    
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
  5. Make "tree_entry" have a SHA1 instead of a union of object pointers

    Linus Torvalds authored Junio C Hamano committed
    This is preparatory work for further cleanups, where we try to make
    tree_entry look more like the more efficient tree-walk descriptor.
    
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
  6. Make "struct tree" contain the pointer to the tree buffer

    Linus Torvalds authored Junio C Hamano committed
    This allows us to avoid allocating information for names etc, because
    we can just use the information from the tree buffer directly.
    
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
Commits on Apr 4, 2006
  1. Replace xmalloc+memset(0) with xcalloc.

    Peter Eriksen authored Junio C Hamano committed
    Signed-off-by: Peter Eriksen <s022018@student.dtu.dk>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
Commits on Jan 26, 2006
  1. Only use a single parser for tree objects

    Daniel Barkalow authored Junio C Hamano committed
    This makes read_tree_recursive and read_tree take a struct tree
    instead of a buffer. It also moves the declaration of read_tree into
    tree.h (where struct tree is defined), and updates ls-tree and
    diff-index (the only places that presently use read_tree*()) to use
    the new versions.
    
    Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
Commits on Jan 7, 2006
  1. [PATCH] Compilation: zero-length array declaration.

    Junio C Hamano authored
    ISO C99 (and GCC 3.x or later) lets you write a flexible array
    at the end of a structure, like this:
    
    	struct frotz {
    		int xyzzy;
    		char nitfol[]; /* more */
    	};
    
    GCC 2.95 and 2.96 let you do this with "char nitfol[0]";
    unfortunately this is not allowed by ISO C90.
    
    This declares such constructs like this:
    
    	struct frotz {
    		int xyzzy;
    		char nitfol[FLEX_ARRAY]; /* more */
    	};
    
    and git-compat-util.h defines FLEX_ARRAY to 0 for gcc 2.95 and
    empty for others.
    
    If you are using a C90 C compiler, you should be able
    to override this with CFLAGS=-DFLEX_ARRAY=1 from the
    command line of "make".
    
    Signed-off-by: Junio C Hamano <junkio@cox.net>
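
A short sketch of how such a struct is typically allocated and used. The `make_frotz()` helper is hypothetical, and `FLEX_ARRAY` is defined locally (as empty, the C99 case) only so the example stands alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#ifndef FLEX_ARRAY
#define FLEX_ARRAY /* empty: C99 flexible array member */
#endif

struct frotz {
	int xyzzy;
	char nitfol[FLEX_ARRAY]; /* more */
};

/* Allocate the header and the trailing string in a single block. */
static struct frotz *make_frotz(int xyzzy, const char *text)
{
	size_t len;
	struct frotz *f;

	assert(text != NULL);
	len = strlen(text) + 1;  /* include the terminating NUL */
	f = malloc(offsetof(struct frotz, nitfol) + len);
	if (!f)
		return NULL;
	f->xyzzy = xyzzy;
	memcpy(f->nitfol, text, len);
	return f;
}
```

Using `offsetof()` for the header size keeps the computation valid whether the array is declared with an empty bound, 0, or 1.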
Commits on Dec 5, 2005
  1. struct tree: remove unused field "parent"

    Junio C Hamano authored
    The field is not used anymore, after the recent ls-tree rewrite.
    
    Signed-off-by: Junio C Hamano <junkio@cox.net>
Commits on Nov 29, 2005
  1. ls-tree: major rewrite to do pathspec

    Linus Torvalds authored Junio C Hamano committed
    git-ls-tree should be rewritten to use a pathspec the same way everybody
    else does. Right now it's the odd man out: if you do
    
    	git-ls-tree HEAD drivers/char drivers/
    
    it will show the same files _twice_, which is not how pathspecs in general
    work.
    
    How about this patch? It breaks some of the git-ls-tree tests, but it
    makes git-ls-tree work a lot more like other git pathspec commands, and it
    removes more than 150 lines by re-using the recursive tree traversal (but
    the "-d" flag is gone for good, so I'm not pushing this too hard).
    
    		Linus
Commits on Nov 15, 2005
  1. Rework object refs tracking to reduce memory usage

    sigprof authored Junio C Hamano committed
    Store pointers to referenced objects in a variable sized array instead
    of linked list.  This cuts down memory usage of utilities which use
    object references; e.g., git-fsck-objects --full on the git.git
    repository consumes about 2 MB of memory tracked by Massif instead of
    7 MB before the change.  Object refs are still the biggest consumer of
    memory (57%), but the malloc overhead for a single block instead of a
    linked list is substantially smaller.
    
    Signed-off-by: Sergey Vlasov <vsu@altlinux.ru>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
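
The data-structure change can be sketched like this: one block holds the count and all the pointers, where a linked list would need one allocation per reference. Every name here is an assumption for illustration, not the structure from the patch.

```c
#include <assert.h>
#include <stdlib.h>

struct obj_sketch {
	int id;  /* placeholder payload */
};

/* One allocation holds the count and every pointer (hypothetical layout). */
struct refs_sketch {
	unsigned count;
	struct obj_sketch *ref[];  /* C99 flexible array member */
};

/* Allocate room for `count` references, all initially NULL. */
static struct refs_sketch *alloc_refs_sketch(unsigned count)
{
	struct refs_sketch *refs =
		malloc(sizeof(*refs) + count * sizeof(refs->ref[0]));
	unsigned i;

	if (!refs)
		return NULL;
	refs->count = count;
	for (i = 0; i < count; i++)
		refs->ref[i] = NULL;
	return refs;
}

/* Bounds-checked access to the i-th reference. */
static struct obj_sketch *ref_at(const struct refs_sketch *refs, unsigned i)
{
	assert(i < refs->count);
	return refs->ref[i];
}
```

With N references this costs one malloc header rather than N, and it drops the per-node "next" pointer a list would carry.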
Commits on Sep 11, 2005
  1. [PATCH] Add a function for getting a struct tree for an ent.

    Daniel Barkalow authored Junio C Hamano committed
    Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
Commits on Jul 28, 2005
  1. git-fsck-cache: be stricter about "tree" objects

    Linus Torvalds authored Junio C Hamano committed
    In particular, warn about things like zero-padding of the mode bits,
    which is a big no-no, since it makes otherwise identical trees have
    different representations (and thus different SHA1 numbers).
    
    Also make the warnings more regular.
    
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
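
One way to detect the zero-padding case is sketched below. The function name and the exact rule (a leading '0' followed by another octal digit) are assumptions for this example and may differ from the check the commit added.

```c
#include <assert.h>

/*
 * Returns 1 if the octal mode field at the start of a raw tree entry
 * begins with a '0' that is followed by further octal digits.
 * "040000 dir" and "40000 dir" encode the same mode, but the bytes
 * differ, so the two trees would hash to different SHA1 values.
 */
static int mode_is_zero_padded(const char *mode_field)
{
	assert(mode_field);
	return mode_field[0] == '0' &&
	       mode_field[1] >= '0' && mode_field[1] <= '7';
}
```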
Commits on Jul 14, 2005
  1. Fix up read_tree() pathspec matching to use "const char **"

    Linus Torvalds authored
    The same way the other pathspecs work.  Also fix missing success return
    from the matching - not that anything actually uses this yet ;)
  2. Start adding interfaces to read in partial trees

    Linus Torvalds authored
    The same way "git-diff-tree" can limit its output to just a set of matches,
    we can read in just a partial tree for comparison purposes.
Commits on Jun 25, 2005
  1. [PATCH] Fix oversimplified optimization for add_cache_entry().

    Junio C Hamano authored Linus Torvalds committed
    An earlier change to optimize directory-file conflict check
    broke what "read-tree --emu23" expects.  This is fixed by this
    commit.
    
    (1) Introduces an explicit flag to tell add_cache_entry() not to
        check for conflicts and use it when reading an existing tree
        into an empty stage --- by definition this case can never
        introduce such conflicts.
    
    (2) Makes read-cache.c:has_file_name() and read-cache.c:has_dir_name()
        aware of the cache stages, and flag conflict only with paths
        in the same stage.
    
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Commits on Jun 8, 2005
  1. [PATCH] Anal retentive 'const unsigned char *sha1'

    Jason McMullan authored Linus Torvalds committed
    Make 'sha1' parameters const where possible
    
    Signed-off-by: Jason McMullan <jason.mcmullan@timesys.com>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Commits on May 29, 2005
  1. [PATCH] Rewrite ls-tree to behave more like "/bin/ls -a"

    Junio C Hamano authored Linus Torvalds committed
    This is a complete rewrite of ls-tree to make it behave more
    like what "/bin/ls -a" does in the current working directory.
    
    Namely, the changes are:
    
     - Unlike the old ls-tree behaviour that used paths arguments to
       restrict output (not that it worked as intended---as pointed
       out in the mailing list discussion, it was quite incoherent),
       this rewrite uses paths arguments to specify what to show.
    
     - Without arguments, it implicitly uses the root level as its
       sole argument ("/bin/ls -a" behaves as if "." is given
       without argument).
    
     - Without -r (recursive) flag, it shows the named blob (either
       file or symlink), or the named tree and its immediate
       children.
    
     - With -r flag, it shows the named path, and recursively
       descends into it if it is a tree.
    
     - With -d flag, it shows the named path and does not show its
       children even if the path is a tree, nor descends into it
       recursively.
    
    This is still a request-for-comments patch.  There is no mailing
    list consensus that this proposed new behaviour is a good one.
    
    The patch to t/t3100-ls-tree-restrict.sh illustrates
    user-visible behaviour changes.  Namely:
    
     * "git-ls-tree $tree path1 path0" lists path1 first and then
       path0.  It used to use paths as an output restrictor and
       showed output in cache entry order (i.e. path0 first and then
       path1) regardless of the order of paths arguments.
    
     * "git-ls-tree $tree path2" lists path2 and its immediate
       children but having explicit paths argument does not imply
       recursive behaviour anymore, hence paths/baz is shown but not
       paths/baz/b.
    
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Commits on May 20, 2005
  1. [PATCH] delta check

    Nicolas Pitre authored Linus Torvalds committed
    This adds knowledge of delta objects to fsck-cache and various object
    parsing code.  A new switch to git-fsck-cache is provided to display the
    maximum delta depth found in a repository.
    
    Signed-off-by: Nicolas Pitre <nico@cam.org>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Commits on May 11, 2005
  1. [PATCH] read_tree_recursive(): Fix leaks

    jonas authored Petr Baudis committed
    Fix two potential leaks.
    
    Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
    Signed-off-by: Petr Baudis <pasky@ucw.cz>
Commits on May 8, 2005
  1. Add git-update-cache --replace option.

    Junio C Hamano authored
    When "path" exists as a file or a symlink in the index, an
    attempt to add "path/file" is refused because it results in file
    vs directory conflict.  Similarly when "path/file1",
    "path/file2", etc. exist, an attempt to add "path" as a file or
    a symlink is refused.  With git-update-cache --replace, these
    existing entries that conflict with the entry being added are
    automatically removed from the cache, with warning messages.
    
    Signed-off-by: Junio C Hamano <junkio@cox.net>
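
The file vs directory conflict described above comes down to one path being a leading directory of the other. A minimal sketch of that test, with hypothetical function names and no claim to match the cache code:

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if `file` would have to be a directory for `other` to exist. */
static int is_leading_dir(const char *file, const char *other)
{
	size_t len = strlen(file);

	/* `other` must start with `file` and continue with a '/'. */
	return !strncmp(file, other, len) && other[len] == '/';
}

/* Returns 1 if the two paths cannot both be files in the same index. */
static int paths_conflict(const char *existing, const char *added)
{
	assert(existing && added);
	return is_leading_dir(existing, added) || is_leading_dir(added, existing);
}
```

Checking for the '/' after the shared prefix is what keeps "path" and "pathological" from being reported as a conflict.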
Commits on May 6, 2005
  1. [PATCH] don't load and decompress objects twice with parse_object()

    Nicolas Pitre authored Linus Torvalds committed
    It turns out that parse_object() loads and decompresses the given
    object only to free it just before calling the specific object parsing
    function, which then mmaps and decompresses the same object again. This
    patch introduces the ability to parse specific objects directly from a
    memory buffer.
    
    Without this patch, running git-fsck-cache on the kernel repository takes:
    
    	real    0m13.006s
    	user    0m11.421s
    	sys     0m1.218s
    
    With this patch applied:
    
    	real    0m8.060s
    	user    0m7.071s
    	sys     0m0.710s
    
    The performance increase is significant, and this is kind of a
    prerequisite for sane delta object support with fsck.
    
    Signed-off-by: Nicolas Pitre <nico@cam.org>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Commits on May 5, 2005
  1. Be more careful about tree entry modes.

    Linus Torvalds authored
    The tree object parsing used to get the executable bit wrong,
    and didn't know about symlinks. Also, fsck really wants the
    full mode value so that it can verify the other bits for sanity,
    so save it all in struct tree_entry.