Commits on Jul 13, 2012
  1. @gitster

    Merge branch 'jc/refactor-diff-stdin'

    gitster authored
    Due to the way "git diff --no-index" is bolted onto by touching the
    low level code that is shared with the rest of the "git diff" code,
    even though it has to work in a very different way, any comparison
    that involves a file "-" at the root level incorrectly tried to read
    from the standard input.  This cleans up the no-index codepath
    further to remove code that reads from the standard input from the
    core side, which is never necessary when git is running its usual
    diff operation.
    * jc/refactor-diff-stdin:
      diff-index.c: "git diff" has no need to read blob from the standard input
      diff-index.c: unify handling of command line paths
      diff-index.c: do not pretend paths are pathspecs
Commits on Jun 28, 2012
  1. @gitster

    diff-index.c: "git diff" has no need to read blob from the standard i…

    gitster authored
    Only "diff --no-index -" does.  Bolting the logic into the low-level
    function diff_populate_filespec() was a layering violation from day
    one.  Move populate_from_stdin() function out of the generic diff.c
    to its only user, diff-index.c.
    Also make sure "-" from the command line stays a special token "read
    from the standard input", even if we later decide to sanitize the
    result from prefix_filename() function in a few obvious ways,
    e.g. removing unnecessary "./" prefix, duplicated slashes "//" in
    the middle, etc.
    Signed-off-by: Junio C Hamano <>
Commits on Aug 21, 2011
  1. @gitster

    combine-diff: support format_callback

    gitster authored
    This teaches combine-diff machinery to feed a combined merge to a callback
    function when DIFF_FORMAT_CALLBACK is specified.
    So far, format callback functions are not used for anything but 2-way
    diffs. A callback is given a diff_queue_struct, which is an array of
    diff_filepair. As its name suggests, a diff_filepair is a _pair_ of
    diff_filespec that represents a single preimage and a single postimage.
    Since "diff -c" is to compare N parents with a single merge result and
    filter out any paths whose result matches one (or more) of the parent(s),
    its output has to be able to represent N preimages and 1 postimage. For
    this reason, a callback function that inspects a diff_filepair that
    results from this new infrastructure can and is expected to view the
    preimage side (i.e. pair->one) as an array of diff_filespec. Each element
    in the array, except for the last one, is marked with "has_more_entries"
    bit, so that the same callback function can be used for 2-way diffs and
    combined diffs.
    Signed-off-by: Junio C Hamano <>
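The array convention described in the combine-diff message above can be sketched as follows. This is only an illustration under stated assumptions: `sketch_filespec`, `sketch_filepair` and `count_preimages()` are hypothetical, trimmed-down stand-ins, not git's real definitions. Only the `has_more_entries` bit and the `pair->one` array view come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for diff_filespec; only the fields needed here. */
struct sketch_filespec {
	const char *path;
	unsigned has_more_entries : 1; /* set on every element except the last */
};

/* Hypothetical stand-in for diff_filepair. */
struct sketch_filepair {
	struct sketch_filespec *one; /* preimage side: an array for combined diffs */
	struct sketch_filespec *two; /* the single postimage */
};

/*
 * Count the preimages of a pair. An ordinary 2-way pair has the bit
 * clear on its only preimage, so the same loop yields 1 for it and N
 * for a combined diff with N parents.
 */
static size_t count_preimages(const struct sketch_filepair *pair)
{
	size_t n = 0;

	assert(pair && pair->one);
	while (pair->one[n].has_more_entries)
		n++;
	return n + 1;
}
```

A callback written this way does not need to know which kind of diff produced the pair, which seems to be the point of marking every element but the last.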
Commits on Aug 31, 2010
  1. @gitster

    diff: pass the entire diff-options to diffcore_pickaxe()

    gitster authored
    That would make it easier to give enhanced feature to the
    pickaxe transformation.
    Signed-off-by: Junio C Hamano <>
Commits on Aug 13, 2010
  1. @gitster

    diff --follow: do call diffcore_std() as necessary

    gitster authored
    Usually, diff frontends populate the output queue with filepairs without
    any rename information and call diffcore_std() to sort the renames out.
    When --follow is in effect, however, diff-tree family of frontend has a
    hack that looks like this:
        diff-tree frontend
        -> diff_tree_sha1()
           . populate diff_queued_diff
           . if --follow is in effect and there is only one change that
             creates the target path, then
           -> try_to_follow_renames()
    	  -> diff_tree_sha1() with no pathspec but with -C
    	  -> diffcore_std() to find renames
    	  . if rename is found, tweak diff_queued_diff and put a
    	    single filepair that records the found rename there
        -> diffcore_std()
           . tweak elements on diff_queued_diff by
           - rename detection
           - path ordering
           - pickaxe filtering
    We need to skip parts of the second call to diffcore_std() that are related
    to rename detection, and do so only when try_to_follow_renames() did find
    a rename.  Earlier 1da6175 (Make diffcore_std only can run once before a
    diff_flush, 2010-05-06) tried to deal with this issue incorrectly; it
    unconditionally disabled any second call to diffcore_std().
    This hopefully fixes the breakage.
    Signed-off-by: Junio C Hamano <>
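The control flow described in the --follow message above can be sketched roughly as below. Everything here is an assumption made for illustration: the `sketch_options` struct, the `found_follow` flag name and the counters are hypothetical, and the real transformations are replaced by counters so that the sketch stays self-contained.

```c
#include <assert.h>

/* Hypothetical stand-in for the options shared by the diff frontends. */
struct sketch_options {
	int found_follow;    /* assumed flag: the --follow hack found a rename */
	int rename_passes;   /* how many times rename detection ran */
	int ordering_passes; /* how many times path ordering ran */
	int pickaxe_passes;  /* how many times pickaxe filtering ran */
};

/* Inner call made by the --follow hack: it always runs rename detection. */
static void sketch_follow(struct sketch_options *o, int rename_exists)
{
	assert(o);
	o->rename_passes++;
	o->found_follow = rename_exists;
}

/*
 * Outer call: skip only the rename-related part, and only when the
 * inner call actually found a rename. Ordering and pickaxe always run.
 */
static void sketch_std(struct sketch_options *o)
{
	assert(o);
	if (!o->found_follow)
		o->rename_passes++;
	o->ordering_passes++;
	o->pickaxe_passes++;
	o->found_follow = 0; /* reset so the next diff starts clean */
}
```

The difference from unconditionally disabling the second call is that path ordering and pickaxe filtering still run every time in this sketch.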
Commits on Aug 12, 2010
  1. @jrn @gitster

    Standardize do { ... } while (0) style

    jrn authored gitster committed
    Signed-off-by: Jonathan Nieder <>
    Signed-off-by: Junio C Hamano <>
Commits on Aug 9, 2010
  1. @moy @gitster

    Document -B<n>[/<m>], -M<n> and -C<n> variants of -B, -M and -C

    moy authored gitster committed
    These options take an optional argument, but this optional argument was
    not documented.
    Original patch by Matthieu Moy, but documentation for -B mostly copied
    from the explanations of Junio C Hamano.
    While we're there, fix a typo in a comment in diffcore.h.
    Signed-off-by: Matthieu Moy <>
    Signed-off-by: Junio C Hamano <>
Commits on May 7, 2010
  1. @byang @gitster

    Make diffcore_std only can run once before a diff_flush

    byang authored gitster committed
    When file renames/copies detection is turned on, the
    second diffcore_std will degrade a 'C' pair to a 'R' pair.
    And this may happen when we run 'git log --follow' with
    hard copies finding. That is, the try_to_follow_renames()
    will run diffcore_std to find the copies, and then
    'git log' will issue another diffcore_std, which will reduce
    'src->rename_used' and recognize this copy as a rename.
    This is not what we want.
    So, I think we really don't need to run diffcore_std more
    than one time.
    Signed-off-by: Bo Yang <>
    Signed-off-by: Junio C Hamano <>
  2. @byang @gitster

    Add a macro DIFF_QUEUE_CLEAR.

    byang authored gitster committed
    Refactor the diff_queue_struct code; this macro helps
    to reset the structure.
    Signed-off-by: Bo Yang <>
    Signed-off-by: Junio C Hamano <>
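A reset macro of the kind the DIFF_QUEUE_CLEAR message above describes might look like the sketch below, written in the do { ... } while (0) style. The names `sketch_queue` and `SKETCH_QUEUE_CLEAR` and the field layout are assumptions for illustration, not the actual definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for diff_queue_struct. */
struct sketch_queue {
	void **queue; /* array of queued entries */
	int alloc;    /* number of allocated slots */
	int nr;       /* number of slots in use */
};

/*
 * Reset the bookkeeping fields. The do/while wrapper makes the macro
 * expand to a single statement, so it is safe in an unbraced if/else.
 * This sketch does not free the array; the caller owns that memory.
 */
#define SKETCH_QUEUE_CLEAR(q) \
	do { \
		(q)->queue = NULL; \
		(q)->nr = (q)->alloc = 0; \
	} while (0)

static void reset_if_requested(struct sketch_queue *q, int full_reset)
{
	assert(q);
	if (full_reset)
		SKETCH_QUEUE_CLEAR(q);
	else
		q->nr = 0;
}
```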
Commits on Mar 5, 2010
  1. @jlehmann @gitster

    git diff --submodule: Show detailed dirty status of submodules

    jlehmann authored gitster committed
    When encountering a dirty submodule while doing "git diff --submodule"
    print an extra line for new untracked content and another for modified
    but already tracked content. And if the HEAD of the submodule is equal
    to the ref diffed against in the superproject, drop the output which
    would just show the same SHA1s and no commit message headlines.
    To achieve that, the dirty_submodule bitfield is expanded to two bits.
    The output of "git status" inside the submodule is parsed to set the
    according bits.
    Signed-off-by: Jens Lehmann <>
    Signed-off-by: Junio C Hamano <>
Commits on Jan 19, 2010
  1. @jlehmann @gitster

    Performance optimization for detection of modified submodules

    jlehmann authored gitster committed
    In the worst case is_submodule_modified() got called three times for
    each submodule. The information we got from scanning the whole
    submodule tree the first time can be reused instead.
    New parameters have been added to diff_change() and diff_addremove(),
    the information is stored in a new member of struct diff_filespec. Its
    value is then reused instead of calling is_submodule_modified() again.
    When no explicit "-dirty" is needed in the output the call to
    is_submodule_modified() is not necessary when the submodule's HEAD
    already disagrees with the ref of the superproject, as this alone
    marks it as modified. To achieve that, get_stat_data() got an extra
    Signed-off-by: Jens Lehmann <>
    Signed-off-by: Junio C Hamano <>
Commits on Nov 3, 2008
  1. @gitster

    Merge branch 'maint'

    gitster authored
    * maint:
      Add reference for status letters in documentation.
      Document that git-log takes --all-match.
      Update draft release notes
Commits on Nov 2, 2008
  1. @ydirson @gitster

    Add reference for status letters in documentation.

    ydirson authored gitster committed
    Also fix error in diff_filepair::status documentation, and point to
    the in-code reference as well as the doc.
    Signed-off-by: Yann Dirson <>
    Signed-off-by: Junio C Hamano <>
Commits on Oct 18, 2008
  1. @peff @gitster

    diff: introduce diff.<driver>.binary

    peff authored gitster committed
    The "diff" gitattribute is somewhat overloaded right now. It
    can say one of three things:
      1. this file is definitely binary, or definitely not
         (i.e., diff or !diff)
      2. this file should use an external diff engine (i.e.,
         diff=foo, diff.foo.command = custom-script)
      3. this file should use particular funcname patterns
         (i.e., diff=foo, diff.foo.funcname = some-regex)
    Most of the time, there is no conflict between these uses,
    since using one implies that the other is irrelevant (e.g.,
    an external diff engine will decide for itself whether the
    file is binary).
    However, there is at least one conflicting situation: there
    is no way to say "use the regular rules to determine whether
    this file is binary, but if we do diff it textually, use
    this funcname pattern." That is, currently setting diff=foo
    indicates that the file is definitely text.
    This patch introduces a "binary" config option for a diff
    driver, so that one can explicitly set diff.foo.binary. We
    default this value to "don't know". That is, setting a diff
    attribute to "foo" and using "diff.foo.funcname" will have
    no effect on the binaryness of a file. To get the current
    behavior, one can set diff.foo.binary to true.
    This patch also has one additional advantage: it cleans up
    the interface to the userdiff code a bit. Before, calling
    code had to know more about whether attributes were false,
    true, or unset to determine binaryness. Now that binaryness
    is a property of a driver, we can represent these situations
    just by passing back a driver struct.
    Signed-off-by: Jeff King <>
    Signed-off-by: Shawn O. Pearce <>
Commits on Sep 20, 2008
  1. @ydirson @gitster

    Bust the ghost of long-defunct diffcore-pathspec.

    ydirson authored gitster committed
    This concept was retired by 77882f6 (Retire diffcore-pathspec.,
    2006-04-10), more than 2 years ago.
    Signed-off-by: Yann Dirson <>
    Signed-off-by: Junio C Hamano <>
Commits on Oct 27, 2007
  1. @torvalds @gitster

    copy vs rename detection: avoid unnecessary O(n*m) loops

    torvalds authored gitster committed
    The core rename detection had some rather stupid code to check if a
    pathname was used by a later modification or rename, which basically
    walked the whole pathname space for all renames for each rename, in
    order to tell whether it was a pure rename (no remaining users) or
    should be considered a copy (other users of the source file remaining).
    That's really silly, since we can just keep a count of users around, and
    replace all those complex and expensive loops with just testing that
    simple counter (but this all depends on the previous commit that shared
    the diff_filespec data structure by using a separate reference count).
    Note that the reference count is not the same as the rename count: they
    behave otherwise rather similarly, but the reference count is tied to
    the allocation (and decremented at de-allocation, so that when it turns
    zero we can get rid of the memory), while the rename count is tied to
    the renames and is decremented when we find a rename (so that when it
    turns zero we know that it was a rename, not a copy).
    Signed-off-by: Linus Torvalds <>
    Signed-off-by: Junio C Hamano <>
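The distinction between the two counters in the message above can be sketched as follows. The struct and function names are hypothetical; only the idea of a reference count tied to allocation and a rename count tied to remaining users of a source comes from the commit message.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a shared filespec. */
struct sketch_spec {
	int count;       /* reference count: tied to the allocation */
	int rename_used; /* rename count: remaining users of this source */
};

static struct sketch_spec *sketch_spec_new(void)
{
	struct sketch_spec *s = calloc(1, sizeof(*s));

	if (s)
		s->count = 1;
	return s;
}

/* Drop one reference; returns 1 when the memory was actually freed. */
static int sketch_spec_release(struct sketch_spec *s)
{
	assert(s && s->count > 0);
	if (--s->count)
		return 0;
	free(s);
	return 1;
}

/*
 * Consume one use of the source. While other users remain the pair
 * is a copy ('C'); the last user turns it into a rename ('R').
 */
static char sketch_classify(struct sketch_spec *src)
{
	assert(src && src->rename_used > 0);
	return --src->rename_used ? 'C' : 'R';
}
```

Both checks are constant-time counter tests, which is what replaces the O(n*m) walk over the pathname space.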
  2. @torvalds @gitster

    Ref-count the filespecs used by diffcore

    torvalds authored gitster committed
    Rather than copy the filespecs when introducing new versions of them
    (for rename or copy detection), use a refcount and increment the count
    when reusing the diff_filespec.
    This avoids unnecessary allocations, but the real reason behind this is
    a future enhancement: we will want to track shared data across the
    copy/rename detection.  In order to efficiently notice when a filespec
    is used by a rename, the rename machinery wants to keep track of a
    rename usage count which is shared across all different users of the
    Signed-off-by: Linus Torvalds <>
    Signed-off-by: Junio C Hamano <>
Commits on Oct 3, 2007
  1. @gitster

    rename diff_free_filespec_data_large() to diff_free_filespec_blob()

    gitster authored
    Signed-off-by: Junio C Hamano <>
  2. @peff @gitster

    diffcore-rename: cache file deltas

    peff authored gitster committed
    We find rename candidates by computing a fingerprint hash of
    each file, and then comparing those fingerprints. There are
    inherently O(n^2) comparisons, so it pays in CPU time to
    hoist the (rather expensive) computation of the fingerprint
    out of that loop (or to cache it once we have computed it once).
    Previously, we didn't keep the filespec information around
    because then we had the potential to consume a great deal of
    memory. However, instead of keeping all of the filespec
    data, we can instead just keep the fingerprint.
    This patch implements and uses diff_free_filespec_data_large
    to accomplish that goal. We also have to change
    estimate_similarity not to needlessly repopulate the
    filespec data when we already have the hash.
    Practical tests showed 4.5x speedup for a 10% memory usage
    Signed-off-by: Jeff King <>
    Signed-off-by: Junio C Hamano <>
Commits on Jul 7, 2007
  1. @gitster

    Fix configuration syntax to specify customized hunk header patterns.

    gitster authored
    This updates the hunk header customization syntax.  The special
    case 'funcname' attribute is gone.
    You assign the name of the type of contents to path's "diff"
    attribute as a string value in .gitattributes like this:
    	*.java diff=java
    	*.perl diff=perl
    	*.doc diff=doc
    If you supply "diff.<name>.funcname" variable via the
    configuration mechanism (e.g. in $HOME/.gitconfig), the value is
    used as the regexp set to find the line to use for the hunk
    header (the variable is called "funcname" because such a line
    typically is the one that has the name of the function in
    programming language source text).
    If there is no such configuration, built-in default is used, if
    any.  Currently there are two default patterns: default and java.
    Signed-off-by: Junio C Hamano <>
Commits on Jul 6, 2007
  1. @gitster

    Per-path attribute based hunk header selection.

    gitster authored
    This makes "diff -p" hunk headers customizable via gitattributes mechanism.
    It is based on Johannes's earlier patch that allowed to define a single
    regexp to be used for everything.
    The mechanism to arrive at the regexp that is used to define hunk header
    is the same as other use of gitattributes.  You assign an attribute, funcname
    (because "diff -p" typically uses the name of the function the patch is about
    as the hunk header), a simple string value.  This can be one of the names of
    built-in pattern (currently, "java" is defined) or a custom pattern name, to
    be looked up from the configuration file.
      (in .gitattributes)
      *.java   funcname=java
      *.perl   funcname=perl
      (in .git/config)
        java = ... # ugly and complicated regexp to override the built-in one.
        perl = ... # another ugly and complicated regexp to define a new one.
    Signed-off-by: Junio C Hamano <>
  2. @gitster

    Introduce diff_filespec_is_binary()

    gitster authored
    This replaces an explicit initialization of filespec->is_binary
    field used for rename/break followed by direct access to that
    field with a wrapper function that lazily initializes and
    accesses the field.  We would add more attribute accesses for
    the use of diff routines, and it would be better to make this
    abstraction earlier.
    Signed-off-by: Junio C Hamano <>
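A lazily initializing accessor of the kind described in the diff_filespec_is_binary() message above could be sketched like this. The struct, the -1 "not yet determined" sentinel and the NUL-byte heuristic are all assumptions made for the example. They stand in for whatever the real function consults.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a filespec with a cached binary flag. */
struct sketch_spec {
	const char *data;
	size_t size;
	int is_binary; /* -1: not yet determined, 0: text, 1: binary */
};

/*
 * Compute the flag on first use and cache it in the struct. Callers
 * go through this wrapper instead of reading the field directly, so
 * the way the answer is derived can change later without touching them.
 */
static int sketch_spec_is_binary(struct sketch_spec *s)
{
	assert(s);
	if (s->is_binary < 0)
		s->is_binary = s->data && memchr(s->data, 0, s->size) != NULL;
	return s->is_binary;
}
```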
Commits on Jul 1, 2007
  1. @gitster

    diffcore_filespec: add is_binary

    gitster authored
    diffcore-break and diffcore-rename would want to behave slightly
    differently depending on the binary-ness of the data, so add one
    bit to the filespec, as the structure is now passed down to
    diffcore_count_changes() function.
    Signed-off-by: Junio C Hamano <>
  2. @gitster

    diffcore_count_changes: pass diffcore_filespec

    gitster authored
    We may want to use richer information on the data we are dealing
    with in this function, so instead of passing a buffer address
    and length, just pass the diffcore_filespec structure.  Existing
    callers always call this function with parameters taken from a
    filespec anyway, so there is no change in functionality.
    Signed-off-by: Junio C Hamano <>
Commits on Apr 29, 2007
  1. Make macros to prevent double-inclusion in headers consistent.

    Junio C Hamano authored
    Signed-off-by: Junio C Hamano <>
Commits on Jan 7, 2007
  1. diff-index --cached --raw: show tree entry on the LHS for unmerged en…

    Junio C Hamano authored
    This updates the way diffcore represents an unmerged pair
    somewhat.  It used to be that entries with mode=0 on both sides
    were used to represent an unmerged pair, but now it has an
    explicit flag.  This is to allow diff-index --cached to report
    the entry from the tree when the path is unmerged in the index.
    This is used in updating "git reset <tree> -- <path>" to restore
    absence of the path in the index from the tree.
    Signed-off-by: Junio C Hamano <>
Commits on Aug 3, 2006
  1. diff.c: do not use pathname comparison to tell renames

    Junio C Hamano authored
    The final output from diff used to compare pathnames between
    preimage and postimage to tell if the filepair is a rename/copy.
    By explicitly marking the filepair created by diffcore_rename(),
    the output routine, resolve_rename_copy(), does not have to do
    so anymore.  This helps feeding a filepair that has different
    pathnames in one and two elements to the diff machinery (most
    notably, comparing two blobs).
    Signed-off-by: Junio C Hamano <>
Commits on Mar 12, 2006
  1. diffcore-rename: somewhat optimized.

    Junio C Hamano authored
    This changes diffcore-rename to reuse statistics information
    gathered during similarity estimation, and updates the hashtable
    implementation used to keep track of the statistics to be
    denser.  This seems to give better performance.
    Signed-off-by: Junio C Hamano <>
Commits on Mar 4, 2006
  1. diffcore-break: similarity estimator fix.

    Junio C Hamano authored
    This is a companion patch to the previous fix to diffcore-rename.
    The merging-back process should use a logic similar to what is used
    Signed-off-by: Junio C Hamano <>
Commits on Mar 1, 2006
  1. diffcore-rename: split out the delta counting code.

    Junio C Hamano authored
    This is to rework diffcore break/rename/copy detection code
    so that it is not affected when deltifier code gets improved.
    Signed-off-by: Junio C Hamano <>
Commits on Jan 16, 2006
  1. diffcore-break/diffcore-rename: integer overflow.

    Junio C Hamano authored
    While reviewing the end user tutorial rewrite by J. Bruce
    Fields, I noticed that "git-diff-tree -B -C" did not correctly
    break the total rewrite of Documentation/tutorial.txt.  It turns
    out that we had integer overflow during the break score
    Cop out by using floating point.  This is not a kernel.
    Signed-off-by: Junio C Hamano <>
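The kind of overflow mentioned in the message above can be illustrated with a small sketch. The scale constant of 60000, the 32-bit width and the function names are assumptions chosen for the example. The commit message does not state the actual formula.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_SCORE 60000 /* assumed scale for a "100% changed" score */

/*
 * Integer version: the intermediate product is computed in 32 bits,
 * so it wraps modulo 2^32 once copied * 60000 no longer fits.
 */
static uint32_t score_int32(uint32_t copied, uint32_t size)
{
	assert(size > 0);
	return copied * (uint32_t)SKETCH_MAX_SCORE / size;
}

/* Floating-point version: the intermediate product cannot wrap. */
static int score_fp(uint32_t copied, uint32_t size)
{
	assert(size > 0);
	return (int)((double)copied * SKETCH_MAX_SCORE / (double)size);
}
```

For small inputs the two agree, but for a fully rewritten file of roughly 100 kB the integer version no longer reports the maximum score, which matches the symptom of a total rewrite not being broken.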
Commits on Sep 25, 2005
  1. Diff: -l<num> to limit rename/copy detection.

    Junio C Hamano authored
    When many paths are modified, rename detection takes a lot of time.
    The new option -l<num> can be used to disable rename detection when
    more than <num> paths are possibly created as renames.
    Signed-off-by: Junio C Hamano <>
Commits on Sep 14, 2005
  1. Revert "[PATCH] plug memory leak in diff.c::diff_free_filepair()"

    Junio C Hamano authored
    This reverts 068eac9 commit.
Commits on Aug 14, 2005
  1. @yashi

    [PATCH] plug memory leak in diff.c::diff_free_filepair()

    yashi authored Junio C Hamano committed
    When I run git-diff-tree on a big change, the command seems to eat a lot
    of memory, so I put git under valgrind to see what's going on.
    diff_free_filespec_data() doesn't free diff_filespec itself.
    [jc: I ended up doing things slightly differently from Yasushi's
    patch.  The original idea was to use free_filespec_data() only to
    free the data portion and keep using the filespec itself, but
    no existing code seems to do things that way, so I just yanked
    that part out.]
    Signed-off-by: Yasushi SHOJI <>
    Signed-off-by: Junio C Hamano <>
Commits on Jun 13, 2005
  1. [PATCH] Re-Fix SIGSEGV on unmerged files in git-diff-files -p

    Junio C Hamano authored Linus Torvalds committed
    When an unmerged path was fed via diff_unmerged() into diffcore,
    it eventually called run_diff() with "one" and "two" parameters
    with NULL, but run_diff() was not written carefully enough to
    notice this situation.
    Signed-off-by: Junio C Hamano <>
    Signed-off-by: Linus Torvalds <>