Commits on Oct 27, 2011
  1. @pclouds @gitster

    tree_entry_interesting(): give meaningful names to return values

    pclouds committed with gitster
    It is basic code hygiene to avoid unnamed magic constants.
    Besides, this helps extend the value later on for "interesting, but
    cannot decide if the entry truly matches yet" (i.e. prefix matches).
    
    Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
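For illustration, naming the return values could look roughly like the sketch below. The enumerator and helper names are assumptions made for this example and are not taken from the commit; the numeric values follow the -1 / 0 / positive / 2 convention described in the "Improve tree_entry_interesting() handling code" entry of this log.

```c
/* Hypothetical names for the four return values of a
 * tree_entry_interesting()-style matcher. */
enum interesting {
	ALL_ENTRIES_NOT_INTERESTING = -1, /* nothing further can match; stop */
	ENTRY_NOT_INTERESTING = 0,        /* skip this entry */
	ENTRY_INTERESTING = 1,            /* this entry matches */
	ALL_ENTRIES_INTERESTING = 2       /* this and all later entries match */
};

/* Callers compare against names rather than bare numbers. */
int should_stop(enum interesting r)
{
	return r == ALL_ENTRIES_NOT_INTERESTING;
}

/* True for either of the two "interesting" outcomes. */
int is_match(enum interesting r)
{
	return r == ENTRY_INTERESTING || r == ALL_ENTRIES_INTERESTING;
}
```

With names in place, adding a further state later (such as the "cannot decide yet" case the message mentions) would mean adding one enumerator rather than hunting for bare integers.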
Commits on Sep 1, 2011
  1. @gitster

    list-objects: pass callback data to show_objects()

    gitster committed
    The traverse_commit_list() API takes two callback functions, one to show
    commit objects, and the other to show other kinds of objects. Even though
    the former has a callback data parameter, so that the callback does not
    have to rely on global state, the latter does not.
    
    Give the show_objects() callback the same callback data parameter.
    
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commits on May 6, 2011
  1. @gitster

    Merge branch 'nd/struct-pathspec'

    gitster committed
    * nd/struct-pathspec:
      pathspec: rename per-item field has_wildcard to use_wildcard
      Improve tree_entry_interesting() handling code
      Convert read_tree{,_recursive} to support struct pathspec
      Reimplement read_tree_recursive() using tree_entry_interesting()
Commits on Mar 25, 2011
  1. @pclouds @gitster

    Improve tree_entry_interesting() handling code

    pclouds committed with gitster
    t_e_i() can return -1 or 2 to early shortcut a search. Current code
    may use up to two variables to handle it. One for saving return value
    from t_e_i temporarily, one for saving return code 2.
    
    The second variable is not needed. If we make sure the first variable
    does not change until the next t_e_i() call, then we can do something
    like this:
    
    int ret = 0;
    
    while (...) {
    	if (ret != 2) {
    		ret = t_e_i();
    		if (ret < 0) /* no longer interesting */
    			break;
    		if (ret == 0) /* skip this round */
    			continue;
    	}
    	/* ret > 0, interesting */
    }
    
    Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
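The loop above can be made concrete with a small self-contained sketch. The function `fake_t_e_i()` and the `verdicts` array are stand-ins invented for this example (the real t_e_i() inspects tree entries and pathspecs); only the control flow around the single `ret` variable mirrors the commit message.

```c
#include <stddef.h>

/* Stand-in for t_e_i(): returns -1, 0, 1 or 2 as the commit message
 * describes. Here the verdicts are read from a caller-supplied array. */
static int fake_t_e_i(const int *verdicts, size_t i)
{
	return verdicts[i];
}

/* Count the entries treated as interesting, using a single `ret`. */
size_t count_interesting(const int *verdicts, size_t n)
{
	size_t hits = 0;
	int ret = 0;

	for (size_t i = 0; i < n; i++) {
		if (ret != 2) {
			ret = fake_t_e_i(verdicts, i);
			if (ret < 0)   /* no longer interesting */
				break;
			if (ret == 0)  /* skip this round */
				continue;
		}
		/* ret > 0: interesting; once ret == 2 the matcher is
		 * not called again for the remaining entries */
		hits++;
	}
	return hits;
}
```

Because `ret` keeps its value across iterations, a return of 2 short-circuits every later call without needing a second flag variable.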
Commits on Mar 23, 2011
  1. @gitster

    Merge branch 'jc/maint-rev-list-culled-boundary'

    gitster committed
    * jc/maint-rev-list-culled-boundary:
      list-objects.c: don't add an unparsed NULL as a pending tree
    
    Conflicts:
    	list-objects.c
Commits on Mar 14, 2011
  1. @gitster

    list-objects.c: don't add an unparsed NULL as a pending tree

    gitster committed
    "git rev-list --first-parent --boundary $commit^..$commit" segfaults on a
    merge commit since 8d2dfc4 (process_{tree,blob}: show objects without
    buffering, 2009-04-10), as it tried to dereference a commit that was
    discarded as UNINTERESTING without being parsed (hence lacking "tree").
    
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commits on Feb 3, 2011
  1. @newren @gitster

    Make rev-list --objects work together with pathspecs

    newren committed with gitster
    When traversing commits, the selection of commits would heed the list of
    pathspecs passed, but subsequent walking of the trees of those commits
    would not.  This resulted in 'rev-list --objects HEAD -- <paths>'
    displaying objects at unwanted paths.
    
    Have process_tree() call tree_entry_interesting() to determine which paths
    are interesting and should be walked.
    
    Naturally, this change can provide a large speedup when paths are specified
    together with --objects, since many tree entries are now correctly ignored.
    Interestingly, though, this change also gives me a small (~1%) but
    repeatable speedup even when no paths are specified with --objects.
    
    Signed-off-by: Elijah Newren <newren@gmail.com>
    Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commits on Apr 18, 2009
  1. @gitster

    Merge branch 'lt/pack-object-memuse'

    gitster committed
    * lt/pack-object-memuse:
      show_object(): push path_name() call further down
      process_{tree,blob}: show objects without buffering
    
    Conflicts:
    	builtin-pack-objects.c
    	builtin-rev-list.c
    	list-objects.c
    	list-objects.h
    	upload-pack.c
Commits on Apr 13, 2009
  1. @torvalds @gitster

    process_{tree,blob}: show objects without buffering

    torvalds committed with gitster
    Here's a less trivial thing, and slightly more dubious one.
    
    I was looking at that "struct object_array objects", and wondering why we
    do that. I have honestly totally forgotten. Why not just call the "show()"
    function as we encounter the objects? Rather than add the objects to the
    object_array, and then at the very end going through the array and doing a
    'show' on all, just do things more incrementally.
    
    Now, there are possible downsides to this:
    
     - the "buffer using object_array" _can_ in theory result in at least
       better I-cache usage (two tight loops rather than one more spread out
       one). I don't think this is a real issue, but in theory..
    
     - this _does_ change the order of the objects printed. Instead of doing a
       "process_tree(revs, commit->tree, &objects, NULL, "");" in the loop
       over the commits (which puts all the root trees _first_ in the object
       list), this patch just adds them to the list of pending objects, and
       then we'll traverse them in that order (and thus show each root tree
       object together with the objects we discover under it)
    
       I _think_ the new ordering actually makes more sense, but the object
       ordering is actually a subtle thing when it comes to packing
       efficiency, so any change in order is going to have implications for
       packing. Good or bad, I dunno.
    
     - There may be some reason why we did it that odd way with the object
       array, that I have simply forgotten.
    
    Anyway, now that we don't buffer up the objects before showing them,
    that may actually result in lower memory usage during that whole
    traverse_commit_list() phase.
    
    This is seriously not very deeply tested. It makes sense to me, it seems
    to pass all the tests, it looks ok, but...
    
    Does anybody remember why we did that "object_array" thing? It used to be
    an "object_list" a long long time ago, but got changed into the array due
    to better memory usage patterns (those linked lists of objects are
    horrible from a memory allocation standpoint). But I wonder why we didn't
    do this back then. Maybe there's a reason for it.
    
    Or maybe there _used_ to be a reason, and no longer is.
    
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
  2. @torvalds @gitster

    show_object(): push path_name() call further down

    torvalds committed with gitster
    In particular, pushing the "path_name()" call _into_ the show() function
    would seem to allow
    
     - more clarity into who "owns" the name (ie now when we free the name in
       the show_object callback, it's because we generated it ourselves by
       calling path_name())
    
     - not calling path_name() at all, either because we don't care about the
       name in the first place, or because we are actually happy walking the
       linked list of "struct name_path *" and the last component.
    
    Now, I didn't do that latter optimization, because it would require some
    more coding, but especially looking at "builtin-pack-objects.c", we really
    don't even want the whole pathname, we really would be better off with the
    list of path components.
    
    Why? We use that name for two things:
     - add_preferred_base_object(), which actually _wants_ to traverse the
       path, and now does it by looking for '/' characters!
     - for 'name_hash()', which only cares about the last 16 characters of a
       name, so again, generating the full name seems to be just unnecessary
       work.
    
    Anyway, so I didn't look any closer at those things, but it did convince
    me that the "show_object()" calling convention was crazy, and we're
    actually better off doing _less_ in list-objects.c, and giving people
    access to the internal data structures so that they can decide whether
    they want to generate a path-name or not.
    
    This patch does that, and then for people who did use the name (even if
    they might do something more clever in the future), it just does the
    straightforward "name = path_name(path, component); .. free(name);" thing.
    
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
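To make the "list of path components" idea concrete, here is a minimal sketch of building a full path name from such a list. The struct layout and the function body are assumptions written for this example; only the names `struct name_path` and `path_name()` come from the message, and the real implementation may differ.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout: each node names one directory and links to
 * its parent, so the list runs from the leaf directory up to the root. */
struct name_path {
	const struct name_path *up;
	const char *elem;
};

/* Build "dir/sub/name" from the component list; the caller frees it. */
char *path_name(const struct name_path *path, const char *name)
{
	size_t name_len = strlen(name);
	size_t len = name_len + 1; /* +1 for the terminating NUL */
	const struct name_path *p;
	char *buf, *end;

	for (p = path; p; p = p->up)
		len += strlen(p->elem) + 1; /* +1 for the '/' separator */

	buf = malloc(len);
	if (!buf)
		return NULL;

	/* Fill from the end backwards, since the list is leaf-to-root. */
	end = buf + len - name_len - 1;
	memcpy(end, name, name_len + 1);
	for (p = path; p; p = p->up) {
		size_t n = strlen(p->elem);
		*--end = '/';
		end -= n;
		memcpy(end, p->elem, n);
	}
	return buf;
}
```

A callback that only needs the last component, or that wants to walk directories one at a time, could read the list directly and skip this allocation, which is the saving the message points at.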
Commits on Apr 12, 2009
  1. @gitster

    Merge branch 'cc/bisect-filter'

    gitster committed
    * cc/bisect-filter: (21 commits)
      rev-list: add "int bisect_show_flags" in "struct rev_list_info"
      rev-list: remove last static vars used in "show_commit"
      list-objects: add "void *data" parameter to show functions
      bisect--helper: string output variables together with "&&"
      rev-list: pass "int flags" as last argument of "show_bisect_vars"
      t6030: test bisecting with paths
      bisect: use "bisect--helper" and remove "filter_skipped" function
      bisect: implement "read_bisect_paths" to read paths in "$GIT_DIR/BISECT_NAMES"
      bisect--helper: implement "git bisect--helper"
      bisect: use the new generic "sha1_pos" function to lookup sha1
      rev-list: call new "filter_skip" function
      patch-ids: use the new generic "sha1_pos" function to lookup sha1
      sha1-lookup: add new "sha1_pos" function to efficiently lookup sha1
      rev-list: pass "revs" to "show_bisect_vars"
      rev-list: make "show_bisect_vars" non static
      rev-list: move code to show bisect vars into its own function
      rev-list: move bisect related code into its own file
      rev-list: make "bisect_list" variable local to "cmd_rev_list"
      refs: add "for_each_ref_in" function to refactor "for_each_*_ref" functions
      quote: add "sq_dequote_to_argv" to put unwrapped args in an argv array
      ...
Commits on Apr 9, 2009
  1. @dotdash @gitster

    process_{tree,blob}: Remove useless xstrdup calls

    dotdash committed with gitster
    The name of the processed object was duplicated for passing it to
    add_object(), but that already calls path_name, which allocates a new
    string anyway. So the memory allocated by the xstrdup calls just went
    nowhere, leaking memory.
    
    This reduces the RSS usage for a "rev-list --all --objects" by about 10% on
    the gentoo repo (fully packed) as well as linux-2.6.git:
    
        gentoo:
                        | old           | new
        ----------------|-------------------------------
        RSS             |       1537284 |       1388408
        VSZ             |       1816852 |       1667952
        time elapsed    |       1:49.62 |       1:48.99
        min. page faults|        417178 |        379919
    
        linux-2.6.git:
                        | old           | new
        ----------------|-------------------------------
        RSS             |        324452 |        292996
        VSZ             |        491792 |        460376
        time elapsed    |       0:14.53 |       0:14.28
        min. page faults|         89360 |         81613
    
    Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
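The leak pattern described above can be sketched in isolation. All function names below are hypothetical and chosen for this example; the point is only that when the callee already makes its own copy, a copy made by the caller has no owner.

```c
#include <stdlib.h>
#include <string.h>

/* Small helper: heap-allocated copy of `s`, or NULL on failure. */
char *copy_string(const char *s)
{
	size_t n = strlen(s) + 1;
	char *copy = malloc(n);
	if (copy)
		memcpy(copy, s, n);
	return copy;
}

/* Hypothetical callee that always stores its own copy of `name`. */
char *record_name(const char *name)
{
	return copy_string(name);
}

/* Leaky caller: the temporary copy is never freed or stored. */
char *add_leaky(const char *name)
{
	char *tmp = copy_string(name); /* redundant copy ... */
	if (!tmp)
		return NULL;
	return record_name(tmp);       /* ... whose pointer is lost here */
}

/* Fixed caller: pass the original string straight through. */
char *add_fixed(const char *name)
{
	return record_name(name);
}
```

Both callers return the same string, which is why a leak of this kind passes functional tests and only shows up in memory measurements such as the RSS figures above.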
Commits on Apr 8, 2009
  1. @chriscool @gitster

    list-objects: add "void *data" parameter to show functions

    chriscool committed with gitster
    The goal of this patch is to get rid of the "static struct rev_info
    revs" static variable in "builtin-rev-list.c".
    
    To do that, we need to pass the revs to the "show_commit" function
    in "builtin-rev-list.c" and this in turn means that the
    "traverse_commit_list" function in "list-objects.c" must be passed
    pointers to functions that take two parameters instead of one.
    
    So we have to change all the callers and all the functions passed
    to "traverse_commit_list".
    
    Anyway, this makes the code cleaner and more generic, so it
    should be a good thing in the long run.
    
    Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
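The general technique, passing an opaque `void *data` through to callbacks so they need no static state, can be sketched as follows. The types and function names here are invented for the example and are not the signatures used in list-objects.c.

```c
#include <stddef.h>

/* Hypothetical callback type: the second parameter carries caller state. */
typedef void (*show_fn)(int item, void *data);

/* Walk `items` and hand each one to `show` together with the opaque
 * `data` pointer, which the traversal itself never interprets. */
void traverse(const int *items, size_t n, show_fn show, void *data)
{
	for (size_t i = 0; i < n; i++)
		show(items[i], data);
}

/* Caller state that would otherwise have lived in a static variable. */
struct counter {
	int total;
};

/* Callback: recover the caller's state from `data` and update it. */
void add_to_total(int item, void *data)
{
	struct counter *c = data;
	c->total += item;
}
```

A caller would declare a `struct counter` on its own stack and pass its address as `data`, so two traversals can run with independent state.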
Commits on Feb 19, 2008
  1. @gitster

    list-objects.c::process_tree/blob: check for NULL

    Martin Koegler committed with gitster
    As these functions are directly called with the result
    from lookup_tree/blob, they must handle NULL.
    
    Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commits on Nov 10, 2007
  1. @spearce @gitster

    Fix memory leak in traverse_commit_list

    spearce committed with gitster
    If we were listing objects too then the objects were buffered in an
    array only reachable from a stack allocated structure.  When this
    function returns that array would be leaked as nobody would have
    a reference to it anymore.
    
    Historically this hasn't been a problem as the primary user of
    traverse_commit_list() (the noble git-rev-list) would terminate
    as soon as the function was finished, thus allowing the operating
    system to clean up memory.  However we have been leaking this data
    in git-pack-objects ever since that program learned how to run the
    revision listing internally, rather than relying on reading object
    names from git-rev-list.
    
    To better facilitate reuse of traverse_commit_list during other
    builtin tools (such as git-fetch) we shouldn't leak temporary memory
    like this and instead we need to clean up properly after ourselves.
    
    Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
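A minimal sketch of the situation described above: a growable array whose only reference lives in a stack-allocated structure must be released before the function returns. The struct and function names are assumptions for this example, not the object_array API.

```c
#include <stdlib.h>

/* Hypothetical growable buffer, held by value on the caller's stack. */
struct int_array {
	size_t nr, alloc;
	int *items;
};

/* Append `v`, growing the heap storage as needed; 0 on success. */
int int_array_add(struct int_array *a, int v)
{
	if (a->nr == a->alloc) {
		size_t new_alloc = a->alloc ? a->alloc * 2 : 8;
		int *p = realloc(a->items, new_alloc * sizeof(*p));
		if (!p)
			return -1;
		a->items = p;
		a->alloc = new_alloc;
	}
	a->items[a->nr++] = v;
	return 0;
}

/* Buffer the input, then process it; 0 on success, sum in *out. */
int sum_buffered(const int *in, size_t n, long *out)
{
	struct int_array buf = { 0, 0, NULL };
	long sum = 0;
	int status = 0;

	for (size_t i = 0; i < n && !status; i++)
		status = int_array_add(&buf, in[i]);
	for (size_t i = 0; i < buf.nr; i++)
		sum += buf.items[i];

	/* Without this, the heap block becomes unreachable on return. */
	free(buf.items);
	if (!status)
		*out = sum;
	return status;
}
```

In a short-lived process the missing `free()` is invisible because the operating system reclaims everything at exit; it only matters once the function is called from a longer-running program.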
Commits on May 22, 2007
  1. @tali

    rename dirlink to gitlink.

    tali committed with Junio C Hamano
    Unify naming of plumbing dirlink/gitlink concept:
    
    git ls-files -z '*.[ch]' |
    xargs -0 perl -pi -e 's/dirlink/gitlink/g;' -e 's/DIRLNK/GITLINK/g;'
    
    Signed-off-by: Junio C Hamano <junkio@cox.net>
Commits on Apr 14, 2007
  1. @torvalds

    Teach git list-objects logic to not follow gitlinks

    torvalds committed with Junio C Hamano
    This allows us to pack superprojects and thus clone them (but not yet
    check them out on the receiving side.. That's the next patch)
    
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
Commits on Mar 21, 2007
  1. @torvalds

    Initialize tree descriptors with a helper function rather than by hand.

    torvalds committed with Junio C Hamano
    This removes slightly more lines than it adds, but the real reason for
    doing this is that future optimizations will require more setup of the
    tree descriptor, and so we want to do it in one place.
    
    Also renamed the "desc.buf" field to "desc.buffer" just to trigger
    compiler errors for old-style manual initializations, making sure I
    didn't miss anything.
    
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
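The idea of centralizing descriptor setup in a helper can be sketched like this. The struct fields and the helper's name and signature are assumptions for illustration; only the "desc.buffer" field name comes from the message above.

```c
#include <stddef.h>

/* Hypothetical, simplified tree descriptor. */
struct tree_desc {
	const void *buffer;  /* start of the raw tree data */
	unsigned long size;  /* bytes remaining in `buffer` */
};

/* Single place where a descriptor is set up. If more fields need
 * initializing later, only this function has to change. */
void init_tree_desc(struct tree_desc *desc, const void *buffer,
		    unsigned long size)
{
	desc->buffer = buffer;
	desc->size = size;
}
```

Renaming the field at the same time, as the message describes, means any remaining hand-written `desc.buf = ...` assignment fails to compile instead of silently skipping the new setup.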
Commits on Sep 7, 2006
  1. pack-objects: further work on internal rev-list logic.

    Junio C Hamano committed
    This teaches the internal rev-list logic to understand options
    that are needed for pack handling: --all, --unpacked, and --thin.
    
    It also moves two functions from builtin-rev-list to list-objects
    so that the two programs can share more code.
    
    Signed-off-by: Junio C Hamano <junkio@cox.net>
  2. Separate object listing routines out of rev-list

    Junio C Hamano committed
    Create a separate file, list-objects.c, and move object listing
    routines from rev-list to it.  The next round will use it in
    pack-objects directly.
    
    Signed-off-by: Junio C Hamano <junkio@cox.net>