Aug 23, 2006

  1. Convert memcpy(a,b,20) to hashcpy(a,b).

    This abstracts away the size of the hash values when copying them
    from memory location to memory location, much as the introduction
    of hashcmp abstracted away hash value comparison.
    
    A few call sites were using char* rather than unsigned char* so
    I added the cast rather than open hashcpy to be void*.  This is a
    reasonable tradeoff as most call sites already use unsigned char*
    and the existing hashcmp is also declared to be unsigned char*.
    
    [jc: Split the patch into the "master" part, to be followed by a
     patch for merge-recursive.c which is not in "master" yet.
    
     Fixed the cast in the latter hunk to combine-diff.c which was
     wrong in the original.
    
     Also converted the leftover ones in combine-diff.c, diff-lib.c and
     upload-pack.c ]
    
    Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    spearce authored Junio C Hamano committed
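
A minimal sketch of the two helpers this commit message describes, assuming a 20-byte SHA-1 hash as implied by the memcpy(a,b,20) being replaced. The constant name HASH_RAWSZ is hypothetical and chosen only for illustration; the unsigned char* signatures follow the message above.

```c
#include <string.h>

/* Hypothetical name for the raw hash size; 20 bytes matches memcpy(a,b,20). */
#define HASH_RAWSZ 20

/* Copy one raw hash value; call sites no longer spell out the size. */
static inline void hashcpy(unsigned char *dst, const unsigned char *src)
{
	memcpy(dst, src, HASH_RAWSZ);
}

/* Compare two raw hash values; returns 0 when they are equal, like memcmp(). */
static inline int hashcmp(const unsigned char *a, const unsigned char *b)
{
	return memcmp(a, b, HASH_RAWSZ);
}
```

A call site that holds its buffer as char* would add a cast, for example hashcpy((unsigned char *)buf, sha1), which is the tradeoff the message mentions.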

Aug 15, 2006

  1. Make track_tree_refs void.

    Signed-off-by: David Rientjes <rientjes@google.com>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    David Rientjes authored Junio C Hamano committed

Jul 13, 2006

  1. Remove TYPE_* constant macros and use object_type enums consistently.

    This updates the type-enumeration constants introduced to reduce
    the memory footprint of "struct object" to match the type bits
    already used in the packfile format, by removing the former
    (i.e. TYPE_* constant macros) and using the latter (i.e. enum
    object_type) throughout the code for consistency.
    
    Eventually we can stop passing around the "type strings"
    entirely, and this will help - no confusion about two different
    integer enumerations.
    
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    Linus Torvalds authored Junio C Hamano committed

Jun 20, 2006

  1. Add specialized object allocator

    This creates a simple specialized object allocator for basic
    objects.
    
    This avoids wasting space with malloc overhead (metadata and
    extra alignment), since the specialized allocator knows the
    alignment, and that objects, once allocated, are never freed.
    
    It also allows us to track some basic statistics about object
    allocations. For example, for the mozilla import, it shows
    object usage as follows:
    
         blobs:   627629 (14710 kB)
         trees:  1119035 (34969 kB)
       commits:   196423  (8440 kB)
          tags:     1336    (46 kB)
    
    and the simpler allocator shaves about 2.5% off the memory
    footprint of a "git-rev-list --all --objects", and is a bit
    faster too.
    
    [ Side note: this concludes the series of "save memory in object storage".
      The thing is, there simply isn't much more to be saved on the objects.
    
      Doing "git-rev-list --all --objects" on the mozilla archive has a final
      total RSS of 131498 pages for me: that's about 513MB. Of that, the
      object overhead is now just 56MB, the rest is going somewhere else (put
      another way: the fact that this patch shaves off 2.5% of the total
      memory overhead, considering that objects are now not much more than 10%
      of the total shows how big the wasted space really was: this makes
      object allocations much more memory- and time-efficient).
    
      I haven't looked at where the rest is, but I suspect the bulk of it is
      just the pack-file loading. It may be that we should pack the tree
      objects separately from the blob objects: for git-rev-list --objects, we
      don't actually ever need to even look at the blobs, but since trees and
      blobs are interspersed in the pack-file, we end up not being dense in
      the tree accesses, so we end up looking at more pages than we strictly
      need to.
    
      So with a 535MB pack-file, it's entirely possible - even likely - that
      most of the remaining RSS is just the mmap of the pack-file itself. We
      don't need to map in _all_ of it, but we do end up mapping a fair
      amount. ]
    
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    Linus Torvalds authored Junio C Hamano committed
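
The idea of a fixed-size, never-freed object allocator can be sketched as below. This is not the code from the commit: the names node_pool, pool_alloc, pool_kb and the block size BLOCKING are assumptions made for illustration. Objects are carved out of large blocks, so there is no per-object malloc header, and a counter gives the kind of statistics quoted in the message.

```c
#include <stdio.h>
#include <stdlib.h>

#define BLOCKING 1024 /* objects per block; an illustrative value */

struct node_pool {
	size_t node_size; /* fixed object size; use sizeof(type) so alignment holds */
	size_t nr_left;   /* unused objects remaining in the current block */
	char *next;       /* next free object in the current block */
	size_t count;     /* objects handed out so far, for statistics */
};

/* Hand out one zeroed object. Objects are never freed individually. */
static void *pool_alloc(struct node_pool *pool)
{
	void *ret;

	if (!pool->nr_left) {
		pool->next = calloc(BLOCKING, pool->node_size);
		if (!pool->next) {
			fprintf(stderr, "fatal: out of memory\n");
			exit(1);
		}
		pool->nr_left = BLOCKING;
	}
	ret = pool->next;
	pool->next += pool->node_size;
	pool->nr_left--;
	pool->count++;
	return ret;
}

/* Memory used by the objects handed out, in kB (rounded down). */
static unsigned long pool_kb(const struct node_pool *pool)
{
	return (unsigned long)(pool->count * pool->node_size / 1024);
}
```

With 24-byte objects, 627629 allocations come to 14710 kB, which agrees with the blob line in the statistics above; the 24-byte size is inferred from those figures and is not stated in the commit message.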

Jun 18, 2006

  1. Shrink "struct object" a bit

    This shrinks "struct object" by a small amount, by getting rid of the
    "struct type *" pointer and replacing it with a 3-bit bitfield instead.
    
    In addition, we merge the bitfields and the "flags" field, which
    incidentally should also remove a useless 4-byte padding from the object
    when in 64-bit mode.
    
    Now, our "struct object" is still too damn large, but it's now less
    obviously bloated, and of the remaining fields, only the "util" (which is
    not used by most things) is clearly something that should be eventually
    discarded.
    
    This shrinks the "git-rev-list --all" memory use by about 2.5% on the
    kernel archive (and, perhaps more importantly, on the larger mozilla
    archive). That may not sound like much, but I suspect it's more on a
    64-bit platform.
    
    There are other remaining inefficiencies (the parent lists, for example,
    probably have horrible malloc overhead), but this was pretty obvious.
    
    Most of the patch is just changing the comparison of the "type" pointer
    from one of the constant string pointers to the appropriate new TYPE_xxx
    small integer constant.
    
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    Linus Torvalds authored Junio C Hamano committed
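
A simplified before/after sketch of the layout change described above. The field lists are assumptions, not the exact structures from the commit (other members are omitted), and the names object_before and object_after are hypothetical. The point is that a pointer-sized type field is replaced by a 3-bit bitfield that shares one word with the flags.

```c
/* Before (sketch): the type is a pointer, and "flags" is its own word. */
struct object_before {
	unsigned parsed : 1;
	unsigned used : 1;
	unsigned int flags;
	unsigned char sha1[20];
	const char *type; /* compared against constant string pointers */
	void *util;
};

enum { TYPE_BITS = 3, FLAG_BITS = 27 }; /* 1 + 1 + 3 + 27 = 32 bits */

/* After (sketch): the type and the flags are merged into the bitfield word. */
struct object_after {
	unsigned parsed : 1;
	unsigned used : 1;
	unsigned type : TYPE_BITS;
	unsigned flags : FLAG_BITS;
	unsigned char sha1[20];
	void *util;
};
```

Comparing sizeof(struct object_before) with sizeof(struct object_after) on a given platform shows the saving; the exact numbers depend on pointer size and padding.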

May 31, 2006

  1. tree_entry(): new tree-walking helper function

    This adds a "tree_entry()" function that combines the common operation of
    doing a "tree_entry_extract()" + "update_tree_entry()".
    
    It also has a simplified calling convention, designed for simple loops
    that traverse over a whole tree: the arguments are pointers to the tree
    descriptor and a name_entry structure to fill in, and it returns a boolean
    "true" if there was an entry left to be gotten in the tree.
    
    This allows tree traversal with
    
    	struct tree_desc desc;
    	struct name_entry entry;
    
    	desc.buf = tree->buffer;
    	desc.size = tree->size;
    	while (tree_entry(&desc, &entry)) {
    		... use "entry.{path, sha1, mode, pathlen}" ...
    	}
    
    which is not only shorter than writing it out in full, it's hopefully less
    error prone too.
    
    [ It's actually a tad faster too - we don't need to recalculate the entry
      pathlength in both extract and update, but need to do it only once.
      Also, some callers can avoid doing a "strlen()" on the result, since
      it's returned as part of the name_entry structure.
    
      However, by now we're talking just 1% speedup on "git-rev-list --objects
      --all", and we're definitely at the point where tree walking is no
      longer the issue any more. ]
    
    NOTE! Not everybody wants to use this new helper function, since some of
    the tree walkers very much on purpose do the descriptor update separately
    from the entry extraction. So the "extract + update" sequence still
    remains as the core sequence, this is just a simplified interface.
    
    We should probably add a silly two-line inline helper function for
    initializing the descriptor from the "struct tree" too, just to cut down
    on the noise from that common "desc" initializer.
    
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    Linus Torvalds authored Junio C Hamano committed
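
A self-contained sketch of such a helper, assuming the raw tree format in which each entry is an octal mode, a space, the NUL-terminated name, and 20 raw SHA-1 bytes. The structure fields follow the names used in the message; the body is an illustration rather than the code from the commit, and it returns 0 on malformed input as a simplification.

```c
#include <string.h>

struct tree_desc {
	const void *buf;    /* remaining raw tree data */
	unsigned long size; /* bytes remaining in buf */
};

struct name_entry {
	const unsigned char *sha1; /* points into the tree buffer */
	const char *path;          /* points into the tree buffer */
	unsigned int mode;
	int pathlen;
};

/*
 * Extract the next entry and advance the descriptor in one step.
 * Returns 1 if an entry was filled in, 0 at the end of the tree
 * (or, in this sketch, on malformed data).
 */
static int tree_entry(struct tree_desc *desc, struct name_entry *entry)
{
	const char *buf = desc->buf;
	const char *end = buf + desc->size;
	const char *p = buf;
	const char *nul;
	unsigned int mode = 0;

	if (!desc->size)
		return 0;
	/* Parse the octal mode up to the separating space. */
	while (p < end && *p >= '0' && *p <= '7')
		mode = (mode << 3) + (unsigned int)(*p++ - '0');
	if (p == buf || p >= end || *p != ' ')
		return 0;
	p++;
	/* The name runs to the NUL; 20 hash bytes must follow it. */
	nul = memchr(p, '\0', (size_t)(end - p));
	if (!nul || end - nul - 1 < 20)
		return 0;

	entry->mode = mode;
	entry->path = p;
	entry->pathlen = (int)(nul - p); /* computed once, no strlen() needed */
	entry->sha1 = (const unsigned char *)nul + 1;

	desc->buf = nul + 1 + 20;
	desc->size = (unsigned long)(end - (nul + 1 + 20));
	return 1;
}
```

Because path and sha1 point into the tree buffer, nothing is allocated per entry, and the buffer has to outlive any use of the entry.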

May 30, 2006

  1. Remove last vestiges of generic tree_entry_list

    The old tree_entry_list is dead, long live the unified single tree
    parser.
    
    Yes, we now still have a compatibility function to create a bogus
    tree_entry_list in builtin-read-tree.c, but that is now entirely local
    to that very messy piece of code.
    
    I'd love to clean read-tree.c up too, but I'm too scared right now, so
    the best I can do is to just contain the damage, and try to make sure
    that no new users of the tree_entry_list sprout up by not having it as
    an exported interface any more.
    
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    Linus Torvalds authored Junio C Hamano committed
  2. Remove unused "zeropad" entry from tree_list_entry

    That was a hack, only needed because 'git fsck-objects' didn't look at
    the raw tree format.  Now that fsck traverses the tree itself, we can
    drop it.
    
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    Linus Torvalds authored Junio C Hamano committed
  3. Remove "tree->entries" tree-entry list from tree parser

    Instead, just use the tree buffer directly, and use the tree-walk
    infrastructure to walk the buffers instead of the tree-entry list.
    
    The tree-entry list is inefficient, and generates tons of small
    allocations for no good reason. The tree-walk infrastructure is
    generally no harder to use than following a linked list, and allows
    us to do most tree parsing in-place.
    
    Some programs still use the old tree-entry lists, and are a bit
    painful to convert without major surgery. For them we have a helper
    function that creates a temporary tree-entry list on demand.
    
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    Linus Torvalds authored Junio C Hamano committed
  4. Switch "read_tree_recursive()" over to tree-walk functionality

    Don't use the tree_entry list any more.
    
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    Linus Torvalds authored Junio C Hamano committed
  5. Make "tree_entry" have a SHA1 instead of a union of object pointers

    This is preparatory work for further cleanups, where we try to make
    tree_entry look more like the more efficient tree-walk descriptor.
    
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    Linus Torvalds authored Junio C Hamano committed
  6. Make "struct tree" contain the pointer to the tree buffer

    This allows us to avoid allocating information for names etc, because
    we can just use the information from the tree buffer directly.
    
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    Linus Torvalds authored Junio C Hamano committed

Apr 04, 2006

  1. Replace xmalloc+memset(0) with xcalloc.

    Signed-off-by: Peter Eriksen <s022018@student.dtu.dk>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    Peter Eriksen authored Junio C Hamano committed

Jan 26, 2006

  1. Only use a single parser for tree objects

    This makes read_tree_recursive and read_tree take a struct tree
    instead of a buffer. It also moves the declaration of read_tree into
    tree.h (where struct tree is defined), and updates ls-tree and
    diff-index (the only places that presently use read_tree*()) to use
    the new versions.
    
    Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    Daniel Barkalow authored Junio C Hamano committed

Jan 07, 2006

  1. [PATCH] Compilation: zero-length array declaration.

    ISO C99 (and GCC 3.x or later) lets you write a flexible array
    at the end of a structure, like this:
    
    	struct frotz {
    		int xyzzy;
    		char nitfol[]; /* more */
    	};
    
    GCC 2.95 and 2.96 let you do this with "char nitfol[0]";
    unfortunately this is not allowed by ISO C90.
    
    This declares such a construct like this:
    
    	struct frotz {
    		int xyzzy;
    		char nitfol[FLEX_ARRAY]; /* more */
    	};
    
    and git-compat-util.h defines FLEX_ARRAY to 0 for gcc 2.95 and
    empty for others.
    
    If you are using a C90 C compiler, you should be able
    to override this with CFLAGS=-DFLEX_ARRAY=1 from the
    command line of "make".
    
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    Junio C Hamano authored
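
A short sketch of how a structure declared this way is typically allocated and filled. The struct is the one from the message; the fallback definition of FLEX_ARRAY and the helper make_frotz are assumptions added for illustration.

```c
#include <stdlib.h>
#include <string.h>

#ifndef FLEX_ARRAY
#define FLEX_ARRAY /* empty: C99 flexible array member */
#endif

struct frotz {
	int xyzzy;
	char nitfol[FLEX_ARRAY]; /* more */
};

/* Hypothetical helper: allocate a frotz with room for a copy of text. */
static struct frotz *make_frotz(int xyzzy, const char *text)
{
	size_t len = strlen(text);
	/* One allocation covers the header plus the string and its NUL. */
	struct frotz *f = malloc(sizeof(*f) + len + 1);

	if (!f)
		return NULL;
	f->xyzzy = xyzzy;
	memcpy(f->nitfol, text, len + 1);
	return f;
}
```

When FLEX_ARRAY is defined as 1, sizeof(*f) already includes one element, so the same expression over-allocates slightly but stays safe.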

Dec 05, 2005

  1. struct tree: remove unused field "parent"

    The field is not used anymore, after the recent ls-tree rewrite.
    
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    Junio C Hamano authored

Nov 29, 2005

  1. ls-tree: major rewrite to do pathspec

    git-ls-tree should be rewritten to use a pathspec the same way everybody
    else does. Right now it's the odd man out: if you do
    
    	git-ls-tree HEAD drivers/char drivers/
    
    it will show the same files _twice_, which is not how pathspecs in general
    work.
    
    How about this patch? It breaks some of the git-ls-tree tests, but it
    makes git-ls-tree work a lot more like other git pathspec commands, and it
    removes more than 150 lines by re-using the recursive tree traversal (but
    the "-d" flag is gone for good, so I'm not pushing this too hard).
    
    		Linus
    Linus Torvalds authored Junio C Hamano committed

Nov 15, 2005

  1. Rework object refs tracking to reduce memory usage

    Store pointers to referenced objects in a variable sized array instead
    of linked list.  This cuts down memory usage of utilities which use
    object references; e.g., git-fsck-objects --full on the git.git
    repository consumes about 2 MB of memory tracked by Massif instead of
    7 MB before the change.  Object refs are still the biggest consumer of
    memory (57%), but the malloc overhead for a single block instead of a
    linked list is substantially smaller.
    
    Signed-off-by: Sergey Vlasov <vsu@altlinux.ru>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    sigprof authored Junio C Hamano committed
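
The difference between the two representations can be sketched as follows. The names are assumptions for illustration and may not match the commit. A linked list needs one allocation per reference, each with its own malloc overhead and a next pointer, while the array form needs a single block per object.

```c
#include <stdlib.h>

struct object; /* opaque here; only pointers to it are stored */

/* Linked-list form: one small allocation per referenced object. */
struct object_list {
	struct object *item;
	struct object_list *next;
};

/* Array form: a count followed by the pointers, in one block. */
struct object_refs {
	unsigned count;
	struct object *ref[]; /* C99 flexible array member */
};

/* Hypothetical helper: allocate room for "count" references, all NULL. */
static struct object_refs *alloc_object_refs(unsigned count)
{
	size_t size = sizeof(struct object_refs) +
		      count * sizeof(struct object *);
	struct object_refs *refs = calloc(1, size);

	if (refs)
		refs->count = count;
	return refs;
}
```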

Sep 11, 2005

  1. [PATCH] Add a function for getting a struct tree for an ent.

    Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    Daniel Barkalow authored Junio C Hamano committed

Jul 28, 2005

  1. git-fsck-cache: be stricter about "tree" objects

    In particular, warn about things like zero-padding of the mode bits,
    which is a big no-no, since it makes otherwise identical trees have
    different representations (and thus different SHA1 numbers).
    
    Also make the warnings more regular.
    
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Linus Torvalds authored Junio C Hamano committed

Jul 14, 2005

  1. Fix up read_tree() pathspec matching to use "const char **"

    The same way the other pathspecs work.  Also fix missing success return
    from the matching - not that anything actually uses this yet ;)
    Linus Torvalds authored
  2. Start adding interfaces to read in partial trees

    The same way "git-diff-tree" can limit its output to just a set of matches,
    we can read in just a partial tree for comparison purposes.
    Linus Torvalds authored

Jun 25, 2005

  1. [PATCH] Fix oversimplified optimization for add_cache_entry().

    An earlier change to optimize directory-file conflict check
    broke what "read-tree --emu23" expects.  This is fixed by this
    commit.
    
    (1) Introduces an explicit flag to tell add_cache_entry() not to
        check for conflicts and use it when reading an existing tree
        into an empty stage --- by definition this case can never
        introduce such conflicts.
    
    (2) Makes read-cache.c:has_file_name() and read-cache.c:has_dir_name()
        aware of the cache stages, and flag conflict only with paths
        in the same stage.
    
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Junio C Hamano authored Linus Torvalds committed

Jun 08, 2005

  1. [PATCH] Anal retentive 'const unsigned char *sha1'

    Make 'sha1' parameters const where possible
    
    Signed-off-by: Jason McMullan <jason.mcmullan@timesys.com>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Jason McMullan authored Linus Torvalds committed

May 29, 2005

  1. [PATCH] Rewrite ls-tree to behave more like "/bin/ls -a"

    This is a complete rewrite of ls-tree to make it behave more
    like what "/bin/ls -a" does in the current working directory.
    
    Namely, the changes are:
    
     - Unlike the old ls-tree behaviour that used paths arguments to
       restrict output (not that it worked as intended---as pointed
       out in the mailing list discussion, it was quite incoherent),
       this rewrite uses paths arguments to specify what to show.
    
     - Without arguments, it implicitly uses the root level as its
       sole argument ("/bin/ls -a" behaves as if "." is given
       without argument).
    
     - Without -r (recursive) flag, it shows the named blob (either
       file or symlink), or the named tree and its immediate
       children.
    
     - With -r flag, it shows the named path, and recursively
       descends into it if it is a tree.
    
     - With -d flag, it shows the named path and does not show its
       children even if the path is a tree, nor descends into it
       recursively.
    
    This is still a request-for-comments patch.  There is no mailing
    list consensus that this proposed new behaviour is a good one.
    
    The patch to t/t3100-ls-tree-restrict.sh illustrates
    user-visible behaviour changes.  Namely:
    
     * "git-ls-tree $tree path1 path0" lists path1 first and then
       path0.  It used to use paths as an output restrictor and
       showed output in cache entry order (i.e. path0 first and then
       path1) regardless of the order of paths arguments.
    
     * "git-ls-tree $tree path2" lists path2 and its immediate
       children but having explicit paths argument does not imply
       recursive behaviour anymore, hence paths/baz is shown but not
       paths/baz/b.
    
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Junio C Hamano authored Linus Torvalds committed

May 20, 2005

  1. [PATCH] delta check

    This adds knowledge of delta objects to fsck-cache and various object
    parsing code.  A new switch to git-fsck-cache is provided to display the
    maximum delta depth found in a repository.
    
    Signed-off-by: Nicolas Pitre <nico@cam.org>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Nicolas Pitre authored Linus Torvalds committed

May 11, 2005

  1. [PATCH] read_tree_recursive(): Fix leaks

    Fix two potential leaks.
    
    Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
    Signed-off-by: Petr Baudis <pasky@ucw.cz>
    jonas authored Petr Baudis committed

May 08, 2005

  1. Add git-update-cache --replace option.

    When "path" exists as a file or a symlink in the index, an
    attempt to add "path/file" is refused because it results in file
    vs directory conflict.  Similarly when "path/file1",
    "path/file2", etc. exist, an attempt to add "path" as a file or
    a symlink is refused.  With git-update-cache --replace, these
    existing entries that conflict with the entry being added are
    automatically removed from the cache, with warning messages.
    
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    Junio C Hamano authored

May 06, 2005

  1. [PATCH] don't load and decompress objects twice with parse_object()

    It turns out that parse_object() loads and decompresses the given
    object only to free it just before calling the specific object parsing
    function, which then mmaps and decompresses the same object again. This
    patch introduces the ability to parse specific objects directly from a
    memory buffer.
    
    Without this patch, running git-fsck-cache on the kernel repository takes:
    
    	real    0m13.006s
    	user    0m11.421s
    	sys     0m1.218s
    
    With this patch applied:
    
    	real    0m8.060s
    	user    0m7.071s
    	sys     0m0.710s
    
    The performance increase is significant, and this is kind of a
    prerequisite for sane delta object support with fsck.
    
    Signed-off-by: Nicolas Pitre <nico@cam.org>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Nicolas Pitre authored Linus Torvalds committed

May 05, 2005

  1. Be more careful about tree entry modes.

    The tree object parsing used to get the executable bit wrong,
    and didn't know about symlinks. Also, fsck really wants the
    full mode value so that it can verify the other bits for sanity,
    so save it all in struct tree_entry.
    Linus Torvalds authored

May 04, 2005

  1. [PATCH] Fix memory leaks in git-fsck-cache

    This patch fixes memory leaks in parse_object() and related functions;
    these leaks were very noticeable when running git-fsck-cache.
    
    Signed-off-by: Sergey Vlasov <vsu@altlinux.ru>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    sigprof authored Linus Torvalds committed

May 02, 2005

  1. Make fsck-cache do better tree checking.

    We check the ordering of the entries, and we verify that none
    of the entries has a slash in it (this allows us to remove the
    hacky "has_full_path" member from the tree structure, since we
    now just test it by walking the tree entries instead).
    Linus Torvalds authored

Apr 26, 2005

  1. [PATCH] introduce xmalloc and xrealloc

    Introduce xmalloc and xrealloc to die gracefully with a descriptive
    message when out of memory, rather than taking a SIGSEGV. 
    
    Signed-off-by: Christopher Li <chrislgit@chrisli.org>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    Christopher Li authored Linus Torvalds committed
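
A minimal sketch of wrappers in this spirit. The die() helper, the message text and the exit status are assumptions for illustration, not the code from the patch. A zero-size request is treated as one byte here so that a legitimate NULL from malloc(0) is not mistaken for an out-of-memory condition.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: report a fatal error and exit. */
static void die(const char *msg)
{
	fprintf(stderr, "fatal: %s\n", msg);
	exit(128); /* illustrative exit status */
}

/* malloc() that never returns NULL: it exits with a message instead. */
static void *xmalloc(size_t size)
{
	void *ret = malloc(size ? size : 1);

	if (!ret)
		die("Out of memory, malloc failed");
	return ret;
}

/* realloc() with the same guarantee. */
static void *xrealloc(void *ptr, size_t size)
{
	void *ret = realloc(ptr, size ? size : 1);

	if (!ret)
		die("Out of memory, realloc failed");
	return ret;
}
```

Callers can then use the result without a NULL check, which is what replaces the SIGSEGV on a later dereference.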

Apr 24, 2005

  1. Don't add references to objects we couldn't find.

    That would SIGSEGV.
    Linus Torvalds authored
  2. Verify that the object type matches for tree/commit objects even before parsing.

    The type doesn't come from the parsing, the type also has to match the usage.
    Linus Torvalds authored