Commits on Oct 7, 2011
  1. pickaxe: factor out pickaxe

    René Scharfe authored committed
    Move the duplicate diff queue loop into its own function that accepts
    a match function: has_changes() for -S and diff_grep() for -G.
    
    Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
  2. pickaxe: give diff_grep the same signature as has_changes

    René Scharfe authored committed
    Change diff_grep() to match the signature of has_changes() as a
    preparation for the next patch that will use function pointers to
    the two.
    
    Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
  3. pickaxe: pass diff_options to contains and has_changes

    René Scharfe authored committed
    Remove the unused parameter needle from contains() and has_changes().
    
    Also replace the parameter len with a pointer to the diff_options.  We
    can use its member pickaxe to check if the needle is an empty string
    and use the kwsmatch structure to find out the length of the match
    instead.
    
    This change is done as a preparation to unify the signatures of
    has_changes() and diff_grep(), which will be used in the patch after
    the next one to factor out common code.
    
    Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
  4. pickaxe: factor out has_changes

    René Scharfe authored committed
    Move duplicate if/else construct into its own helper function.
    
    Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
  5. pickaxe: plug regex/kws leak

    René Scharfe authored committed
    With -S... --pickaxe-all, free the regex or the kws before returning
    even if we found a match.  Also get rid of the variable has_changes,
    as we can simply break out of the loop.
    
    Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
  6. pickaxe: plug regex leak

    René Scharfe authored committed
    With -G... --pickaxe-all, free the regex before returning even if we
    found a match.  Also get rid of the variable has_changes, as we can
    simply break out of the loop.
    
    Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
  7. pickaxe: plug diff filespec leak with empty needle

    René Scharfe authored committed
    Check first for the unlikely case of an empty needle string and only
    then populate the filespec, lest we leak it.
    
    Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commits on Aug 21, 2011
  1. Use kwset in pickaxe

    Fredrik Kuivinen authored committed
    Benchmarks in the hot cache case:
    
    before:
    $ perf stat --repeat=5 git log -Sqwerty
    
    Performance counter stats for 'git log -Sqwerty' (5 runs):
    
           47,092,744 cache-misses             #      2.825 M/sec   ( +-   1.607% )
          123,368,389 cache-references         #      7.400 M/sec   ( +-   0.812% )
          330,040,998 branch-misses            #      3.134 %       ( +-   0.257% )
       10,530,896,750 branches                 #    631.663 M/sec   ( +-   0.121% )
       62,037,201,030 instructions             #      1.399 IPC     ( +-   0.142% )
       44,331,294,321 cycles                   #   2659.073 M/sec   ( +-   0.326% )
               96,794 page-faults              #      0.006 M/sec   ( +-  11.952% )
                   25 CPU-migrations           #      0.000 M/sec   ( +-  25.266% )
                1,424 context-switches         #      0.000 M/sec   ( +-   0.540% )
         16671.708650 task-clock-msecs         #      0.997 CPUs    ( +-   0.343% )
    
          16.728692052  seconds time elapsed   ( +-   0.344% )
    
    after:
    $ perf stat --repeat=5 git log -Sqwerty
    
    Performance counter stats for 'git log -Sqwerty' (5 runs):
    
           51,385,522 cache-misses             #      4.619 M/sec   ( +-   0.565% )
          129,177,880 cache-references         #     11.611 M/sec   ( +-   0.219% )
          319,222,775 branch-misses            #      6.946 %       ( +-   0.134% )
        4,595,913,233 branches                 #    413.086 M/sec   ( +-   0.112% )
       31,395,042,533 instructions             #      1.062 IPC     ( +-   0.129% )
       29,558,348,598 cycles                   #   2656.740 M/sec   ( +-   0.204% )
               93,224 page-faults              #      0.008 M/sec   ( +-   4.487% )
                   19 CPU-migrations           #      0.000 M/sec   ( +-  10.425% )
                  950 context-switches         #      0.000 M/sec   ( +-   0.360% )
         11125.796039 task-clock-msecs         #      0.997 CPUs    ( +-   0.239% )
    
          11.164216599  seconds time elapsed   ( +-   0.240% )
    
    So the kwset code is about 33% faster.
    
    Signed-off-by: Fredrik Kuivinen <frekui@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
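
The "about 33%" figure can be reproduced from the elapsed times reported in the two perf runs above. The following is only an arithmetic check in Python; the variable names are illustrative and nothing here comes from the patch itself.

```python
# Elapsed wall-clock times (seconds) copied from the two `perf stat` runs.
before_s = 16.728692052
after_s = 11.164216599

# Relative reduction in elapsed time: the share of the old run time saved.
reduction = (before_s - after_s) / before_s

# Speedup factor: how many times faster the new run is.
speedup = before_s / after_s

print(f"elapsed time reduced by {reduction:.1%}")
print(f"speedup factor {speedup:.2f}x")
```

Note that "33% faster" here is a reduction in run time of about one third; expressed as a speedup factor the same numbers give roughly 1.5x.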
Commits on Oct 6, 2010
  1. diffcore-pickaxe.c: a void function shouldn't try to return something

    drafnel authored committed
    Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
  2. Merge branch 'maint'

    authored
    * maint:
      Documentation/git-clone: describe --mirror more verbosely
      do not depend on signed integer overflow
      work around buggy S_ISxxx(m) implementations
      xdiff: cast arguments for ctype functions to unsigned char
      init: plug tiny one-time memory leak
      diffcore-pickaxe.c: remove unnecessary curly braces
      t3020 (ls-files-error-unmatch): remove stray '1' from end of file
      setup: make sure git dir path is in a permanent buffer
      environment.c: remove unused variable
      git-svn: fix processing of decorated commit hashes
      git-svn: check_cherry_pick should exclude commits already in our history
      Documentation/git-svn: discourage "noMetadata"
Commits on Oct 5, 2010
  1. diffcore-pickaxe.c: remove unnecessary curly braces

    drafnel authored committed
    Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commits on Aug 31, 2010
  1. git log/diff: add -G<regexp> that greps in the patch text

    authored
    Teach "-G<regexp>" that is similar to "-S<regexp> --pickaxe-regex" to the
    "git diff" family of commands.  This limits the diff queue to filepairs
    whose patch text actually has an added or a deleted line that matches the
    given regexp.  Unlike "-S<regexp>", changing other parts of the line that
    has a substring that matches the given regexp IS counted as a change, as
    such a change would appear as one deletion followed by one addition in a
    patch text.
    
    Unlike -S (pickaxe) that is intended to be used to quickly detect a commit
    that changes the number of occurrences of hits between the preimage and
    the postimage to serve as a part of larger toolchain, this is meant to be
    used as the top-level Porcelain feature.
    
    The implementation unfortunately has to run "diff" twice if you are
    running "log" family of commands to produce patches in the final output
    (e.g. "git log -p" or "git format-patch").  I think we _could_ cache the
    result in-core if we wanted to, but that would require larger surgery to
    the diffcore machinery (i.e. adding an extra pointer in the filepair
    structure to keep a pointer to a strbuf around, stuff the textual diff to
    the strbuf inside diffgrep_consume(), and make use of it in later stages
    when it is available) and it may not be worth it.
    
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
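
The difference between -S and -G described in this commit message can be sketched in a few lines of Python. This is an illustration of the semantics only, not git's implementation: the function names are hypothetical, Python's `re` stands in for POSIX regular expressions, and `difflib` stands in for git's own diff engine, so hunk boundaries may differ from what git would produce.

```python
import difflib
import re


def s_matches(old: str, new: str, needle: str) -> bool:
    """-S style: keep the pair if the number of occurrences differs."""
    return old.count(needle) != new.count(needle)


def g_matches(old: str, new: str, pattern: str) -> bool:
    """-G style: keep the pair if any added or deleted line matches."""
    rx = re.compile(pattern)
    old_lines, new_lines = old.splitlines(), new.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # context lines are not part of the change
        # Deleted lines come from the old side, added lines from the new side.
        changed = old_lines[i1:i2] + new_lines[j1:j2]
        if any(rx.search(line) for line in changed):
            return True
    return False


old_text = "x = foo(1)\n"
new_text = "x = foo(2)\n"
print(s_matches(old_text, new_text, "foo"))
print(g_matches(old_text, new_text, "foo"))
```

In the example, the line containing `foo` is modified but the number of occurrences of `foo` stays at one, so the -S style check rejects the pair while the -G style check accepts it.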
  2. diff: pass the entire diff-options to diffcore_pickaxe()

    authored
    That makes it easier to add enhanced features to the
    pickaxe transformation.
    
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commits on May 7, 2010
  1. Add a macro DIFF_QUEUE_CLEAR.

    byang authored committed
    Refactor the diff_queue_struct code; this macro helps
    to reset the structure.
    
    Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commits on Mar 17, 2009
  1. pickaxe: count regex matches only once

    René Scharfe authored committed
    When --pickaxe-regex is used, forward past the end of matches instead of
    advancing to the byte after their start.  This way matches count only
    once, even if the regular expression matches their tail -- like in the
    fixed-string fork of the code.
    
    E.g.: /.*/ used to count the number of bytes instead of the number of
    lines.  /aa/ resulted in a count of two in "aaa" instead of one.
    
    Also document the fact that regexec() needs a NUL-terminated string as
    its second argument by adding an assert().
    
    Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
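
The two ways of advancing through the text that this commit message contrasts can be sketched as follows. This is a Python illustration under stated assumptions, not the C code from the patch: the function names are hypothetical and Python's `re` module is used in place of regexec().

```python
import re


def count_advancing_one_byte(pattern: str, text: str) -> int:
    """Old behaviour: resume searching one position after the match start."""
    rx = re.compile(pattern)
    count, pos = 0, 0
    while pos < len(text):
        m = rx.search(text, pos)
        if m is None:
            break
        count += 1
        pos = m.start() + 1
    return count


def count_past_match_end(pattern: str, text: str) -> int:
    """New behaviour: resume searching after the end of the match."""
    rx = re.compile(pattern)
    count, pos = 0, 0
    while pos < len(text):
        m = rx.search(text, pos)
        if m is None:
            break
        count += 1
        # An empty match must still move forward, or the loop would not end.
        pos = m.end() if m.end() > m.start() else m.end() + 1
    return count


print(count_advancing_one_byte("aa", "aaa"))  # 2
print(count_past_match_end("aa", "aaa"))      # 1
```

The /aa/ example from the message shows the difference: the old stepping finds a second, overlapping match starting at the second "a", while stepping past the match end counts the occurrence only once.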
Commits on Mar 3, 2009
  1. diffcore-pickaxe: use memmem()

    René Scharfe authored committed
    Use memmem() instead of open-coding it.  The system libraries usually have a
    much faster version than the memcmp()-loop here.  Even our own fall-back in
    compat/, which is used on Windows, is slightly faster.
    
    The following commands were run in a Linux kernel repository and timed, the
    best of five results is shown:
    
      $ STRING='Ensure that the real time constraints are schedulable.'
      $ git log -S"$STRING" HEAD -- kernel/sched.c >/dev/null
    
    On Ubuntu 8.10 x64, before (v1.6.2-rc2):
    
      8.09user 0.04system 0:08.14elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
      0inputs+0outputs (0major+30952minor)pagefaults 0swaps
    
    And with the patch:
    
      1.50user 0.04system 0:01.54elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
      0inputs+0outputs (0major+30645minor)pagefaults 0swaps
    
    On Fedora 10 x64, before:
    
      8.34user 0.05system 0:08.39elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
      0inputs+0outputs (0major+29268minor)pagefaults 0swaps
    
    And with the patch:
    
      1.15user 0.05system 0:01.20elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
      0inputs+0outputs (0major+32253minor)pagefaults 0swaps
    
    On Windows Vista x64, before:
    
      real    0m9.204s
      user    0m0.000s
      sys     0m0.000s
    
    And with the patch:
    
      real    0m8.470s
      user    0m0.000s
      sys     0m0.000s
    
    Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commits on Jun 7, 2007
  1. War on whitespace

    authored
    This uses "git-apply --whitespace=strip" to fix whitespace errors that have
    crept into our source files over time.  There are a few files that need
    to have trailing whitespace (most notably, test vectors).  The result
    still passes the tests, and the build result in the Documentation/ area is unchanged.
    
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commits on May 7, 2007
  1. diff -S: release the image after looking for needle in it

    Junio C Hamano authored
    Signed-off-by: Junio C Hamano <junkio@cox.net>
Commits on Jan 26, 2007
  1. diffcore-pickaxe: fix infinite loop on zero-length needle

    peff authored Junio C Hamano committed
    The "contains" algorithm runs into an infinite loop if the needle string
    has zero length. The loop could be modified to handle this, but it makes
    more sense to simply have an empty needle return no matches. Thus, a
    command like
      git log -S
    produces no output.
    
    We place the check at the top of the function so that we get the same
    results with or without --pickaxe-regex. Note that until now,
      git log -S --pickaxe-regex
    would match everything, not nothing.
    
    Arguably, an empty pickaxe string should simply produce an error
    message; however, this is still a useful assertion to add to the
    algorithm at this layer of the code.
    
    Noticed by Bill Lear.
    
    Signed-off-by: Jeff King <peff@peff.net>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
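
The hazard and the fix can be illustrated with a small Python sketch. It is not the C code from the patch; `contains` here is a hypothetical stand-in that counts non-overlapping occurrences, and it only shows why an early check for an empty needle is needed.

```python
def contains(haystack: bytes, needle: bytes) -> int:
    """Count non-overlapping occurrences of needle in haystack."""
    # Guard first: an empty needle "matches" at every position without
    # consuming anything, so the loop below would never advance.
    if not needle:
        return 0
    count, pos = 0, 0
    while True:
        found = haystack.find(needle, pos)
        if found < 0:
            return count
        count += 1
        # Step past the match; with len(needle) == 0 this would be a no-op.
        pos = found + len(needle)


print(contains(b"abcabc", b"abc"))
print(contains(b"abc", b""))
```

Placing the check before any searching means the result for an empty needle is the same whichever matching strategy follows, which mirrors the reasoning given in the message for putting the check at the top of the function.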
Commits on Dec 20, 2006
  1. simplify inclusion of system header files.

    Junio C Hamano authored
    This is a mechanical clean-up of the way *.c files include
    system header files.
    
     (1) sources under compat/, platform sha-1 implementations, and
         xdelta code are exempt from the following rules;
    
     (2) the first #include must be "git-compat-util.h" or one of
         our own header files that includes it first (e.g. config.h,
         builtin.h, pkt-line.h);
    
     (3) system headers that are included in "git-compat-util.h"
         need not be included in individual C source files.
    
     (4) "git-compat-util.h" does not have to include subsystem
         specific header files (e.g. expat.h).
    
    Signed-off-by: Junio C Hamano <junkio@cox.net>
Commits on Apr 5, 2006
  1. On some platforms, certain headers need to be included before regex.h

    dscho authored Junio C Hamano committed
    Happily, these are already included in cache.h, which is included anyway...
    so: change the order of includes.
    
    Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
    Signed-off-by: Junio C Hamano <junkio@cox.net>
Commits on Apr 4, 2006
  1. Support for pickaxe matching regular expressions

    Petr Baudis authored Junio C Hamano committed
    git-diff-* --pickaxe-regex will change the -S pickaxe to match
    POSIX extended regular expressions instead of fixed strings.
    
    The regex.h library is a rather stupid interface and I like pcre too, but
    with any luck it will be everywhere we will want to run Git on, it being
    POSIX.2 and all. I'm not sure if we can expect platforms like AIX to
    conform to POSIX.2 or if win32 has regex.h. We might add a flag to
    Makefile if there is a portability trouble potential.
    
    Signed-off-by: Petr Baudis <pasky@suse.cz>
Commits on Jul 24, 2005
  1. [PATCH] diffcore-pickaxe: switch to "counting" behaviour.

    Junio C Hamano authored Linus Torvalds committed
    Instead of finding an old/new pair where one side has the specified
    string and the other side does not, find an old/new pair in which
    the specified string occurs as a substring a different number of
    times.  This would still not catch a case where you
    introduce two static variable declarations and remove two static
    function definitions from a file with -S"static", but would make
    it behave a bit more intuitively.
    
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
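
The change from "present on one side only" to "counting" can be sketched in Python as below. The function names are hypothetical and the snippets of C used as test data are made up for illustration; only the two decision rules come from the commit message.

```python
def touched_by_presence(old: str, new: str, needle: str) -> bool:
    """Earlier rule: the string appears on exactly one side."""
    return (needle in old) != (needle in new)


def touched_by_count(old: str, new: str, needle: str) -> bool:
    """Counting rule: the number of occurrences differs between the sides."""
    return old.count(needle) != new.count(needle)


# One more "static" is added: only the counting rule notices.
old_a = "static int a;\n"
new_a = "static int a;\nstatic int b;\n"
print(touched_by_presence(old_a, new_a, "static"))
print(touched_by_count(old_a, new_a, "static"))

# The caveat from the message: two removed, two added, count unchanged.
old_b = "static void f(void);\nstatic void g(void);\n"
new_b = "static int x;\nstatic int y;\n"
print(touched_by_count(old_b, new_b, "static"))
```

The last case is the limitation the message points out: when additions and removals balance, the count is unchanged and the pair is not reported.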
Commits on May 29, 2005
  1. [PATCH] Do not include unused header files.

    Junio C Hamano authored Linus Torvalds committed
    Some source files were including "delta.h" without actually
    needing it.  Remove them.
    
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  2. [PATCH] Optimize diff-tree -[CM] --stdin

    Junio C Hamano authored Linus Torvalds committed
    This attempts to optimize "diff-tree -[CM] --stdin", which
    compares successive tree pairs.  This optimization does not
    make much sense for other commands in the diff-* brothers.
    
    When reading from --stdin and using rename/copy detection, the
    patch makes diff-tree read the current index file first.
    This is done to reuse the optimization used by diff-cache in the
    non-cached case.  Similarity estimator can avoid expanding a
    blob if the index says what is in the work tree has an exact
    copy of that blob already expanded.
    
    Another optimization the patch makes is to check only file sizes
    first to terminate similarity estimation early.  In order for
    this to work, it needs a way to tell the size of the blob
    without expanding it.  Since an obvious way of doing it, which
    is to keep all the blobs previously used in the memory, is too
    costly, it does so by keeping the filesize for each object it
    has already seen in memory.
    
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  3. [PATCH] Add --pickaxe-all to diff-* brothers.

    Junio C Hamano authored Linus Torvalds committed
    When --pickaxe-all is given in addition to -S, pickaxe shows the
    entire diffs contained in the changeset, not just the diffs for
    the filepair that touched the sought-after string.  This is
    useful to see the changes in context.
    
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
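
A minimal Python sketch of the filtering rule described here, assuming a changeset is a list of filepairs and that some predicate decides whether a filepair touches the string. The `FilePair` shape, the `pickaxe` function and the `match` callback are all hypothetical names used for illustration; they are not git's data structures.

```python
from typing import Callable, List, Tuple

# Hypothetical filepair shape: (path, old_content, new_content).
FilePair = Tuple[str, str, str]


def pickaxe(queue: List[FilePair],
            match: Callable[[FilePair], bool],
            pickaxe_all: bool = False) -> List[FilePair]:
    """Filter a changeset's filepairs by a match predicate."""
    if pickaxe_all:
        # One hit anywhere keeps the whole changeset; no hit drops it all.
        return list(queue) if any(match(p) for p in queue) else []
    # Default: keep only the filepairs that touch the string.
    return [p for p in queue if match(p)]


def touches_foo(pair: FilePair) -> bool:
    _, old, new = pair
    return old.count("foo") != new.count("foo")


changeset = [("a.c", "foo", "bar"), ("b.c", "x", "y")]
print(pickaxe(changeset, touches_foo))
print(pickaxe(changeset, touches_foo, pickaxe_all=True))
```

With the default rule only `a.c` survives; with `pickaxe_all=True` both filepairs are kept because at least one of them matched, which is what lets the change be viewed in context.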
  4. [PATCH] Introduce diff_free_filepair() function.

    Junio C Hamano authored Linus Torvalds committed
    This introduces a new function to free a common data structure,
    and plugs some leaks.
    
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Commits on May 23, 2005
  1. [PATCH] Performance fix for pickaxe.

    Junio C Hamano authored Linus Torvalds committed
    The pickaxe was expanding the blobs and searching in them even
    when it should have already known that both sides are the same.
    
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  2. [PATCH] Rename/copy detection fix.

    Junio C Hamano authored Linus Torvalds committed
    The rename/copy detection logic in earlier round was only good
    enough to show patch output and discussion on the mailing list
    about the diff-raw format updates revealed many problems with
    it.  This patch fixes all the ones known to me, without making
    things I want to do later impossible, mostly related to patch
    reordering.
    
     (1) Earlier rename/copy detector determined which one is rename
         and which one is copy too early, which made it impossible
         to later introduce diffcore transformers to reorder
         patches.  This patch fixes it by moving that logic to the
         very end of the processing.
    
     (2) Earlier output routine diff_flush() was pruning all the
         "no-change" entries indiscriminately.  This was done due
         to my false assumption that one of the requirements in the
         diff-raw output was not to show such an entry (which
         resulted in my incorrect comment about "diff-helper never
         being able to be equivalent to built-in diff driver").  My
         special thanks go to Linus for correcting me about this.
         When we produce diff-raw output, for the downstream to be
         able to tell renames from copies, sometimes it _is_
         necessary to output "no-change" entries, and this patch
         adds diffcore_prune() function for doing it.
    
     (3) Earlier diff_filepair structure was trying to be not too
         specific about rename/copy operations, but the purpose of
         the structure was to record one or two paths, which _was_
         indeed about rename/copy.  This patch discards xfrm_msg
         field which was trying to be generic for this wrong reason,
         and introduces a couple of fields (rename_score and
         rename_rank) that are explicitly specific to rename/copy
         logic.  One thing to note is that the information in a
         single diff_filepair structure _still_ does not distinguish
         renames from copies, and it is deliberately so.  This is to
         allow patches to be reordered in later stages.
    
     (4) This patch also adds some tests about diff-raw format
         output and makes sure that necessary "no-change" entries
         appear on the output.
    
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Commits on May 22, 2005
  1. [PATCH] Diffcore updates.

    Junio C Hamano authored Linus Torvalds committed
    This moves the path selection logic from individual programs to a new
    diffcore transformer (diff-tree still needs to have its own for
    performance reasons).  Also the header printing code in diff-tree was
    tweaked not to produce anything when pickaxe is in effect and there is
    nothing interesting to report.  An interesting example is the following
    in the GIT archive itself:
    
        $ git-whatchanged -p -C -S'or something in a real script'
    
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  2. [PATCH] The diff-raw format updates.

    Junio C Hamano authored Linus Torvalds committed
    Update the diff-raw format as Linus and I discussed, except that
    it does not use sequence of underscore '_' letters to express
    nonexistence.  All '0' mode is used for that purpose instead.
    
    The new diff-raw format can express rename/copy, and the earlier
    restriction that -M and -C _must_ be used with the patch format
    output is no longer necessary.  The patch makes -M and -C flags
    independent of -p flag, so you need to say git-whatchanged -M -p
    to get the diff/patch format.
    
    Both the documentation and the tests are updated.
    
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  3. [PATCH] Prepare diffcore interface for diff-tree header suppression.

    Junio C Hamano authored Linus Torvalds committed
    This does not actually suppress the extra headers when pickaxe is
    used, but prepares enough support for diff-tree to implement it.
    
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Commits on May 21, 2005
  1. [PATCH] Introducing software archaeologist's tool "pickaxe".

    Junio C Hamano authored Linus Torvalds committed
    This steals the "pickaxe" feature from JIT and makes it available
    to the bare Plumbing layer.  From the command line, the user
    gives a string he is interested in.
    
    Using the diff-core infrastructure previously introduced, it
    filters the differences to limit the output only to the diffs
    between <src> and <dst> where the string appears only in one but
    not in the other.  For example:
    
     $ ./git-rev-list HEAD | ./git-diff-tree -Sdiff-tree-helper --stdin -M
    
    would show the diffs that touch the string "diff-tree-helper".
    
    In real software-archaeologist application, you would typically
    look for a few to several lines of code and see where that code
    came from.
    
    The "pickaxe" module runs after "rename/copy detection" module,
    so it even crosses the file rename boundary, as the above
    example demonstrates.
    
    Signed-off-by: Junio C Hamano <junkio@cox.net>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>