Commits on Feb 7, 2012
  1. @peff @gitster

    drop odd return value semantics from userdiff_config

    peff authored gitster committed
    When the userdiff_config function was introduced in be58e70
    (diff: unify external diff and funcname parsing code,
    2008-10-05), it used a return value convention unlike any
    other config callback. Like other callbacks, it used "-1" to
    signal error. But it returned "1" to indicate that it found
    something, and "0" otherwise; other callbacks simply
    returned "0" to indicate that no error occurred.
    
    This distinction was necessary at the time, because the
    userdiff namespace overlapped slightly with the color
    configuration namespace. So "diff.color.foo" could mean "the
    'foo' slot of diff coloring" or "the 'foo' component of the
    'color' userdiff driver". Because the color-parsing code
    would die on an unknown color slot, we needed the userdiff
    code to indicate that it had matched the variable, letting
    us bypass the color-parsing code entirely.
    
    Later, in 8b8e862 (ignore unknown color configuration,
    2009-12-12), the color-parsing code learned to silently
    ignore unknown slots. This means we no longer need to
    protect userdiff-matched variables from reaching the
    color-parsing code.
    
    We can therefore change the userdiff_config calling
    convention to a more normal one. This drops some code from
    each caller, which is nice. But more importantly, it reduces
    the cognitive load for readers who may wonder why
    userdiff_config is unlike every other config callback.
    
    There's no need to add a new test confirming that this
    works; t4020 already contains a test that sets
    diff.color.external.
    
    Signed-off-by: Jeff King <peff@peff.net>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
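The change in calling convention can be sketched in Python. This is only a model: git's real callbacks are C functions, and every name below, the set of known color slots, and the "missing value is an error" rule are assumptions made for illustration.

```python
# Hypothetical model of the two conventions; not git's actual code.

def userdiff_config_old(key, value, drivers):
    """Old convention: -1 on error, 1 if the key was consumed, 0 otherwise."""
    parts = key.split(".")
    if len(parts) != 3 or parts[0] != "diff":
        return 0
    if value is None:  # assumed error case for this sketch
        return -1
    drivers.setdefault(parts[1], {})[parts[2]] = value
    return 1


def userdiff_config_new(key, value, drivers):
    """New convention: -1 on error, 0 otherwise, like other callbacks."""
    return -1 if userdiff_config_old(key, value, drivers) < 0 else 0


def color_config(key, value, colors):
    """Models color parsing that silently ignores unknown slots."""
    known = {"meta", "frag", "old", "new"}  # illustrative subset
    parts = key.split(".")
    if len(parts) == 3 and parts[:2] == ["diff", "color"] and parts[2] in known:
        colors[parts[2]] = value
    return 0


def caller_new(key, value, drivers, colors):
    """The caller only checks for errors; no 'was it consumed?' branch."""
    if userdiff_config_new(key, value, drivers) < 0:
        return -1
    return color_config(key, value, colors)
```

With the new convention, a key such as diff.color.external reaches both callbacks: the userdiff model records it, and the color model ignores it because "external" is not a known slot.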
Commits on Dec 14, 2011
  1. @gitster

    Merge branch 'tr/userdiff-c-returns-pointer'

    gitster authored
    * tr/userdiff-c-returns-pointer:
      userdiff: allow * between cpp funcname words
Commits on Dec 6, 2011
  1. @trast @gitster

    userdiff: allow * between cpp funcname words

    trast authored gitster committed
    The cpp pattern, used for C and C++, would not match the start of a
    declaration such as
    
      static char *prepare_index(int argc,
    
    because it did not allow for * anywhere between the various words that
    constitute the modifiers, type and function name.  Fix it.
    
    Signed-off-by: Thomas Rast <trast@student.ethz.ch>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
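A simplified sketch of the idea, written with Python's re module rather than the POSIX regexes git uses. Both patterns are hypothetical reductions, not the builtin cpp pattern.

```python
import re

IDENT = r"[A-Za-z_][A-Za-z_0-9]*"

# Words may be separated by blanks only (models the behaviour before the fix).
NO_STAR = re.compile(r"^(?:" + IDENT + r"[ \t]+)+" + IDENT + r"[ \t]*\(")

# Words may be separated by blanks and/or '*' (models the fix).
WITH_STAR = re.compile(r"^(?:" + IDENT + r"[ \t*]+)+" + IDENT + r"[ \t]*\(")

line = "static char *prepare_index(int argc,"
print(bool(NO_STAR.match(line)), bool(WITH_STAR.match(line)))
```

A declaration without a pointer return type, such as "int main(void)", is matched by both sketches.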
Commits on Nov 16, 2011
  1. @hendeby @gitster

    Add built-in diff patterns for MATLAB code

    hendeby authored gitster committed
    MATLAB is often used in industry and academia for scientific
    computations, which motivates including it as a built-in pattern.
    
    Signed-off-by: Gustaf Hendeby <hendeby@isy.liu.se>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commits on Aug 4, 2011
  1. @mhagger @gitster

    Rename git_checkattr() to git_check_attr()

    mhagger authored gitster committed
    Suggested by: Junio Hamano <gitster@pobox.com>
    
    Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commits on Jun 30, 2011
  1. @gitster

    Merge branch 'jk/combine-diff-binary-etc'

    gitster authored
    * jk/combine-diff-binary-etc:
      combine-diff: respect textconv attributes
      refactor get_textconv to not require diff_filespec
      combine-diff: handle binary files as binary
      combine-diff: calculate mode_differs earlier
      combine-diff: split header printing into its own function
Commits on May 23, 2011
  1. @peff @gitster

    refactor get_textconv to not require diff_filespec

    peff authored gitster committed
    This function actually does two things:
    
      1. Load the userdiff driver for the filespec.
    
      2. Decide whether the driver has a textconv component, and
         initialize the textconv cache if applicable.
    
    Only part (1) requires the filespec object, and some callers
    may not have a filespec at all. So let's split it into
    two functions, and put part (2) with the userdiff code,
    which is a better fit.
    
    Signed-off-by: Jeff King <peff@peff.net>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
  2. @jrn @gitster

    userdiff/perl: tighten BEGIN/END block pattern to reject here-doc delimiters

    jrn authored gitster committed
    A naive method of treating BEGIN/END blocks with a brace on the second
    line as diff/grep funcname context involves also matching unrelated
    lines that consist of all-caps letters:
    
    	sub foo {
    		print <<'EOF'
    	text goes here
    	...
    	EOF
    		... rest of foo ...
    	}
    
    That's not so great, because it means that "git diff" and "git grep
    --show-function" would write "=EOF" or "@@ EOF" as context instead of
    a more useful reminder like "@@ sub foo {".
    
    To avoid this, tighten the pattern to only match the special block
    names that perl accepts (namely BEGIN, END, INIT, CHECK, UNITCHECK,
    AUTOLOAD, and DESTROY).  The list is taken from perl's toke.c.
    
    Suggested-by: Jakub Narebski <jnareb@gmail.com>
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
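The difference between the naive and the tightened approach can be sketched as follows. These are illustrative Python patterns, assumed for this example, and not the patterns in userdiff.c; only the list of block names comes from the commit message.

```python
import re

# Naive: any all-caps word alone on a line, optionally followed by a brace.
NAIVE = re.compile(r"^[A-Z]+[ \t]*(\{[ \t]*)?$")

# Tightened: only the special block names listed in the commit message.
SPECIAL = "BEGIN|END|INIT|CHECK|UNITCHECK|AUTOLOAD|DESTROY"
TIGHT = re.compile(r"^(?:" + SPECIAL + r")[ \t]*(\{[ \t]*)?$")

for candidate in ["EOF", "BEGIN", "END {"]:
    print(candidate, bool(NAIVE.match(candidate)), bool(TIGHT.match(candidate)))
```

The here-doc delimiter "EOF" is accepted by the naive sketch and rejected by the tightened one, while real block names still match.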
Commits on May 22, 2011
  1. @jrn @gitster

    userdiff/perl: catch sub with brace on second line

    jrn authored gitster committed
    Accept
    
    	sub foo
    	{
    	}
    
    as an alternative to a more common style that introduces perl
    functions with a brace on the first line (and likewise for BEGIN/END
    blocks).  The new regex is a little hairy to avoid matching
    
    	# forward declaration
    	sub foo;
    
    while continuing to match "sub foo($;@) {" and
    
    	sub foo { # This routine is interesting;
    		# in fact, the lines below explain how...
    
    While at it, pay attention to Perl 5.14's "package foo {" syntax as an
    alternative to the traditional "package foo;".
    
    Requested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
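To make the cases above concrete, here is a rough Python approximation of a pattern that accepts a sub with or without a brace on the first line but rejects a forward declaration. It is a hypothetical sketch, looser than the real builtin regex, and it ignores attributes and the package syntax.

```python
import re

SUB = re.compile(
    r"^sub [A-Za-z0-9_':]+[ \t]*"  # keyword and name
    r"(\([^)]*\)[ \t]*)?"          # optional prototype such as ($;@)
    r"(\{.*)?"                     # brace here, or on the next line
    r"(#.*)?$"                     # optional trailing comment
)

cases = [
    "sub foo",
    "sub foo;",
    "sub foo($;@) {",
    "sub foo { # This routine is interesting;",
]
for text in cases:
    print(repr(text), bool(SUB.match(text)))
```

Only the forward declaration "sub foo;" fails to match, because nothing in the pattern can consume the semicolon that follows the name.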
  2. @jrn @gitster

    userdiff/perl: match full line of POD headers

    jrn authored gitster committed
    The builtin perl userdiff driver is not greedy enough about catching
    POD header lines.  Capture the whole line, so instead of just
    declaring that we are in some "@@ =head1" section, diff/grep output
    can explain that the enclosing section is about "@@ =head1 OPTIONS".
    
    Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
  3. @jrn @gitster

    userdiff/perl: anchor "sub" and "package" patterns on the left

    jrn authored gitster committed
    The userdiff funcname mechanism has no concept of nested scopes ---
    instead, "git diff" and "git grep --show-function" simply label the
    diff header with the most recent matching line.  Unfortunately that
    means text following a subroutine in a POD section:
    
    	=head1 DESCRIPTION
    
    	You might use this facility like so:
    
    		sub example {
    			foo;
    		}
    
    	Now, having said that, let's say more about the facility.
    	Blah blah blah ... etc etc.
    
    gets the subroutine name instead of the POD header in its diff/grep
    funcname header, making it harder to get oriented when reading a
    diff without enough context.
    
    The fix is simple: anchor the funcname syntax to the left margin so
    nested subroutines and packages like this won't get picked up.  (The
    builtin C++ funcname pattern already does the same thing.)  This means
    the userdiff driver will misparse the idiom
    
    	{
    		my $static;
    		sub foo {
    			... use $static ...
    		}
    	}
    
    but I think that's worth it; we can revisit this later if the userdiff
    mechanism learns to keep track of the beginning and end of nested
    scopes.
    
    Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
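The "most recent matching line" behaviour, and the effect of anchoring, can be sketched in a few lines of Python. The helper and both patterns are hypothetical simplifications, not git's implementation.

```python
import re


def funcname_for(lines, index, pattern):
    """Return the closest line above `index` that matches `pattern`."""
    for i in range(index - 1, -1, -1):
        if pattern.search(lines[i]):
            return lines[i].strip()
    return None


# Unanchored: leading whitespace is allowed before "sub" or a POD header.
LOOSE = re.compile(r"^[ \t]*(sub [A-Za-z_]|=head[0-9] )")
# Anchored to the left margin, as the commit proposes.
ANCHORED = re.compile(r"^(sub [A-Za-z_]|=head[0-9] )")

pod = [
    "=head1 DESCRIPTION",
    "",
    "You might use this facility like so:",
    "",
    "\tsub example {",
    "\t\tfoo;",
    "\t}",
    "",
    "Now, having said that, let's say more about the facility.",
    "Blah blah blah ... etc etc.",
]

changed = len(pod) - 1  # pretend the last line was modified
print(funcname_for(pod, changed, LOOSE))
print(funcname_for(pod, changed, ANCHORED))
```

The unanchored sketch labels the change with the indented example subroutine, while the anchored one falls back to the POD header.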
Commits on Feb 10, 2011
  1. @gitster

    Merge branch 'tr/diff-words-test'

    gitster authored
    * tr/diff-words-test:
      t4034 (diff --word-diff): add a minimum Perl drier test vector
      t4034 (diff --word-diff): style suggestions
      userdiff: simplify word-diff safeguard
      t4034: bulk verify builtin word regex sanity
Commits on Jan 24, 2011
  1. @gitster

    Merge branch 'as/userdiff-pascal'

    gitster authored
    * as/userdiff-pascal:
      userdiff: match Pascal class methods
Commits on Jan 18, 2011
  1. @jrn @gitster

    userdiff: simplify word-diff safeguard

    jrn authored gitster committed
    git's diff-words support has a detail that can be a little dangerous:
    any text not matched by a given language's tokenization pattern is
    treated as whitespace and changes in such text would go unnoticed.
    Therefore each of the built-in regexes allows a special token type
    consisting of a single non-whitespace character [^[:space:]].
    
    To make sure UTF-8 sequences remain human readable, the builtin
    regexes also have a special token type for runs of bytes with the high
    bit set.  In English, non-ASCII characters are usually isolated so
    this is analogous to the [^[:space:]] pattern, except it matches a
    single _multibyte_ character despite use of the C locale.
    
    Unfortunately it is easy to make typos or forget entirely to include
    these catch-all token types when adding support for new languages (see
    v1.7.3.5~16, userdiff: fix typo in ruby and python word regexes,
    2010-12-18).  Avoid this by including them automatically within the
    PATTERNS and IPATTERN macros.
    
    While at it, change the UTF-8 sequence token type to match exactly one
    non-ASCII multi-byte character, rather than an arbitrary run of them.
    
    Suggested-by: Thomas Rast <trast@student.ethz.ch>
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
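A small Python sketch of why the catch-all token types matter. The identifier pattern is an assumed stand-in for a language's word regex. Note one deliberate difference from the commit: Python tries alternatives in order instead of picking the longest match, so the multibyte alternative is listed before the single-character one here.

```python
import re

# A language pattern that only knows identifiers (no safeguard).
WORD_ONLY = re.compile(rb"[A-Za-z_][A-Za-z0-9_]*")

# The same pattern with the two catch-all token types appended:
# one non-ASCII multibyte character, or any single non-space byte.
SAFE = re.compile(rb"[A-Za-z_][A-Za-z0-9_]*|[\xc0-\xff][\x80-\xbf]+|[^\s]")


def tokens(pattern, text):
    """Split `text` into the tokens that `pattern` recognises."""
    return pattern.findall(text)


old, new = b"x += 1", b"x -= 1"
print(tokens(WORD_ONLY, old) == tokens(WORD_ONLY, new))  # change is invisible
print(tokens(SAFE, old) == tokens(SAFE, new))            # change is visible
print(tokens(SAFE, "é!".encode("utf-8")))                # é stays one token
```

Without the safeguard, the operator change from += to -= falls into "whitespace" and the two lines tokenize identically.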
Commits on Jan 11, 2011
  1. @Zapped @gitster

    userdiff: match Pascal class methods

    Zapped authored gitster committed
    Class declarations were already covered by the second pattern, but class
    methods have the 'class' keyword in front too. Account for it.
    
    Signed-off-by: Alexey Shumkin <zapped@mail.ru>
    Acked-by: Thomas Rast <trast@student.ethz.ch>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commits on Dec 27, 2010
  1. @gitster

    userdiff/perl: catch BEGIN/END/... and POD as headers

    gitster authored
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
  2. @jrn @gitster

    diff: funcname and word patterns for perl

    jrn authored gitster committed
    The default function name discovery already works quite well for Perl
    code... with the exception of here-documents (or rather their ending).
    
     sub foo {
    	print <<END
     here-document
     END
    	return 1;
     }
    
    The default funcname pattern treats the unindented END line as a
    function declaration and puts it in the @@ line of diff and "grep
    --show-function" output.
    
    With a little knowledge of perl syntax, we can do better.  You can
    try it out by adding "*.perl diff=perl" to the gitattributes file.
    
    Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
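The failure mode can be reproduced with a toy version of the default heuristic. The rule used below (a line is a candidate header if it starts with a letter, "_" or "$") is an assumption about the default behaviour, chosen because it is consistent with the description above.

```python
def default_funcname(lines, index):
    """Closest line above `index` that starts with a letter, '_' or '$'."""
    for i in range(index - 1, -1, -1):
        line = lines[i]
        if line and (line[0].isalpha() or line[0] in "_$"):
            return line
    return None


perl = [
    "sub foo {",
    "\tprint <<END",
    "here-document",
    "END",
    "\treturn 1;",
    "}",
]

# A change on the "return 1;" line is labelled with the here-doc terminator.
print(default_funcname(perl, 4))
```

A Perl-aware pattern would skip the unindented END line and report "sub foo {" instead.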
Commits on Dec 19, 2010
  1. @trast @gitster

    userdiff: fix typo in ruby and python word regexes

    trast authored gitster committed
    Both had an unclosed ] that ruined the safeguard against not matching
    a non-space char.
    
    Signed-off-by: Thomas Rast <trast@student.ethz.ch>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commits on Sep 10, 2010
  1. @drafnel @gitster

    userdiff.c: add builtin fortran regex patterns

    drafnel authored gitster committed
    This adds fortran xfuncname and wordRegex patterns to the list of builtin
    patterns.  The intention is for the patterns to be appropriate for all
    versions of fortran including 77, 90, 95.  The patterns can be enabled by
    adding the diff=fortran attribute to the .gitattributes file for the
    desired file glob.
    
    This also adds a new macro named IPATTERN which is just like the PATTERNS
    macro except it sets the REG_ICASE flag so that case will be ignored.
    
    The test code in t4018 and the docs were updated as appropriate.
    
    Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
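The effect of a case-insensitive flag can be illustrated with Python's re.IGNORECASE, used here as an analogue of REG_ICASE. The pattern is a made-up reduction, not the builtin fortran xfuncname.

```python
import re

FORTRAN_SKETCH = r"^[ \t]*(program|module|subroutine|function)[ \t]+[A-Za-z]"

case_sensitive = re.compile(FORTRAN_SKETCH)
case_insensitive = re.compile(FORTRAN_SKETCH, re.IGNORECASE)

line = "      SUBROUTINE SOLVE(A, B)"
print(bool(case_sensitive.match(line)), bool(case_insensitive.match(line)))
```

Fortran keywords are case-insensitive, so only the flagged version recognises the upper-case spelling.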
Commits on Aug 17, 2010
  1. @svick @gitster

    Userdiff patterns for C#

    svick authored gitster committed
    Add userdiff patterns for C#. This code is an improved version of
    code by Adam Petaccia from a 21 June 2009 mail to the list.
    
    Signed-off-by: Petr Onderka <gsvick@gmail.com>
    Acked-by: Jeff King <peff@peff.net>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commits on Jun 13, 2010
  1. @gitster

    Merge branch 'bs/userdiff-php'

    gitster authored
    * bs/userdiff-php:
      diff: Support visibility modifiers in the PHP hunk header regexp
Commits on May 27, 2010
  1. @dotdash @gitster

    diff: Support visibility modifiers in the PHP hunk header regexp

    dotdash authored gitster committed
    Starting with PHP5, class methods can have a visibility modifier, which
    caused the methods not to be matched by the existing regexp, so extend
    the regexp to match those modifiers. And while we're at it, allow the
    "static" modifier as well.
    
    Since the "static" modifier can appear either before or after the
    visibility modifier, let's just allow any number of modifiers to appear
    in any order, as that simplifies the regexp and shouldn't cause any
    false positives.
    
    Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
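The "any number of modifiers in any order" approach can be sketched as a single repeated group. This is an illustrative Python pattern, assumed for the example, not the regexp from the patch.

```python
import re

PHP_SKETCH = re.compile(
    r"^[ \t]*((public|private|protected|static)[ \t]+)*function[ \t]+\w"
)

samples = [
    "    public static function foo()",
    "    static private function bar()",
    "function baz()",
    "    $x = 1;",
]
for text in samples:
    print(repr(text), bool(PHP_SKETCH.match(text)))
```

Repeating one group is simpler than enumerating the valid orderings; the price is that nonsensical combinations such as "public private function" are also accepted.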
Commits on Apr 2, 2010
  1. @peff @gitster

    diff: cache textconv output

    peff authored gitster committed
    Running a textconv filter can take a long time. It's
    particularly bad for a large file which needs to be spooled
    to disk, but even for small files, the fork+exec overhead
    can add up for something like "git log -p".
    
    This patch uses the notes-cache mechanism to keep a fast
    cache of textconv output. Caches are stored in
    refs/notes/textconv/$x, where $x is the userdiff driver
    defined in gitattributes.
    
    Caching is enabled only if diff.$x.cachetextconv is true.
    
    In my test repo, on a commit with 45 jpg and avi files
    changed and a textconv to show their exif tags:
    
      [before]
      $ time git show >/dev/null
      real    0m13.724s
      user    0m12.057s
      sys     0m1.624s
    
      [after, first run]
      $ git config diff.mfo.cachetextconv true
      $ time git show >/dev/null
      real    0m14.252s
      user    0m12.197s
      sys     0m1.800s
    
      [after, subsequent runs]
      $ time git show >/dev/null
      real    0m0.352s
      user    0m0.148s
      sys     0m0.200s
    
    So for a slight (3.8%) cost on the first run, we achieve an
    almost 40x speed up on subsequent runs.
    
    Signed-off-by: Jeff King <peff@peff.net>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
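The two figures quoted above follow directly from the "real" timings. A quick check in Python, using only the numbers from the commit message:

```python
before = 13.724      # seconds, no cache
first_run = 14.252   # seconds, cache enabled but empty
cached = 0.352       # seconds, cache populated

overhead_pct = (first_run - before) / before * 100
speedup = before / cached

print(f"first-run overhead: {overhead_pct:.1f}%")
print(f"speed-up on later runs: {speedup:.1f}x")
```

The overhead rounds to 3.8% and the speed-up is a little under 40x, matching the summary.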
Commits on Jan 17, 2010
  1. @gitster

    git_attr(): fix function signature

    gitster authored
    The function took (name, namelen) as its arguments, but all the public
    callers wanted to pass a full string.
    
    Demote the counted-string interface to an internal API status, and allow
    public callers to just pass the string to the function.
    
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commits on Jun 18, 2009
  1. @bonzini @gitster

    avoid exponential regex match for java and objc function names

    bonzini authored gitster committed
    In the old regex
    
    ^[ \t]*(([ \t]*[A-Za-z_][A-Za-z_0-9]*){2,}[ \t]*\([^;]*)$
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    
    you can backtrack arbitrarily from [A-Za-z_0-9]* into [A-Za-z_], thus
    causing an exponential number of backtracks.  Ironically it also causes
    the regex not to work as intended; for example "catch" can match the
    underlined part of the regex, the first repetition matching "c" and
    the second matching "atch".
    
    The replacement regex avoids this problem, because it makes sure that
    at least a space/tab is eaten on each repetition.  In other words,
    a suffix of a repetition can never be a prefix of the next repetition.
    
    Signed-off-by: Paolo Bonzini <bonzini@gnu.org>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
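The ambiguity can be shown with Python's re module. OLD_REPEAT is the underlined part of the old regex; NEW_REPEAT is a hypothetical pattern that follows the stated idea (each repetition must consume at least one blank) and is not necessarily the regex from the patch.

```python
import re

OLD_REPEAT = re.compile(r"(?:[ \t]*[A-Za-z_][A-Za-z_0-9]*){2,}")
NEW_REPEAT = re.compile(
    r"(?:[A-Za-z_][A-Za-z_0-9]*[ \t]+)+[A-Za-z_][A-Za-z_0-9]*"
)


def count_splits(word):
    """Ways to cut a word of letters into two or more non-empty pieces."""
    return 2 ** (len(word) - 1) - 1


# A single word satisfies "two or more repetitions" in the old pattern...
print(bool(OLD_REPEAT.fullmatch("catch")))
# ...and a backtracking engine may have to try every way of cutting it.
print(count_splits("catch"), count_splits("a" * 30))
# The replacement needs a real separator between words.
print(bool(NEW_REPEAT.fullmatch("catch")), bool(NEW_REPEAT.fullmatch("public void")))
```

The number of candidate splits doubles with every extra character, which is where the exponential cost on non-matching lines comes from.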
Commits on Jan 22, 2009
  1. @stephen-smith @gitster

    Change the spelling of "wordregex".

    stephen-smith authored gitster committed
    Use "wordRegex" for configuration variable names.  Use "word_regex" for C
    language tokens.
    
    Signed-off-by: Boyd Stephen Smith Jr. <bss@iguanasuicide.net>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commits on Jan 17, 2009
  1. @trast @gitster

    color-words: make regex configurable via attributes

    trast authored gitster committed
    Make the --color-words splitting regular expression configurable via
    the diff driver's 'wordregex' attribute.  The user can then set the
    driver on a file in .gitattributes.  If a regex is given on the
    command line, it overrides the driver's setting.
    
    We also provide built-in regexes for the languages that already had
    funcname patterns, and add an appropriate diff driver entry for C/++.
    (The patterns are designed to run UTF-8 sequences into a single chunk
    to make sure they remain readable.)
    
    Signed-off-by: Thomas Rast <trast@student.ethz.ch>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
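The precedence described above amounts to a short fallback chain. A minimal sketch, with a dictionary standing in for the diff driver and a hypothetical "word_regex" key:

```python
def pick_word_regex(cli_regex, driver):
    """Command-line regex wins; otherwise fall back to the driver's setting."""
    if cli_regex is not None:
        return cli_regex
    if driver is not None:
        return driver.get("word_regex")
    return None  # caller would use the default word splitting


tex_driver = {"name": "tex", "word_regex": r"\\[a-zA-Z]+|[^\s]"}
print(pick_word_regex(None, tex_driver))
print(pick_word_regex(r"[0-9]+", tex_driver))
```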
Commits on Oct 26, 2008
  1. @peff @gitster

    userdiff: require explicitly allowing textconv

    peff authored gitster committed
    Diffs that have been produced with textconv almost certainly
    cannot be applied, so we want to be careful not to generate
    them in things like format-patch.
    
    This introduces a new diff option, ALLOW_TEXTCONV, which
    controls this behavior. It is off by default, but is
    explicitly turned on for the "log" family of commands, as
    well as the "diff" porcelain (but not diff-* plumbing).
    
    Because both text conversion and external diffing are
    controlled by these diff options, we can get rid of the
    "plumbing versus porcelain" distinction when reading the
    config. This was an attempt to control the same thing, but
    suffered from being too coarse-grained.
    
    Signed-off-by: Jeff King <peff@peff.net>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commits on Oct 18, 2008
  1. @peff @gitster

    diff: add filter for converting binary to text

    peff authored gitster committed
    When diffing binary files, it is sometimes nice to see the
    differences of a canonical text form rather than either a
    binary patch or simply "binary files differ."
    
    Until now, the only option for doing this was to define an
    external diff command to perform the diff. This was a lot of
    work, since the external command needed to take care of
    doing the diff itself (including mode changes), and lost the
    benefit of git's colorization and other options.
    
    This patch adds a text conversion option, which converts a
    file to its canonical format before performing the diff.
    This is less flexible than an arbitrary external diff, but
    is much less work to set up. For example:
    
      $ echo '*.jpg diff=exif' >>.gitattributes
      $ git config diff.exif.textconv exiftool
      $ git config diff.exif.binary false
    
    allows one to see jpg diffs represented by the text output
    of exiftool.
    
    Signed-off-by: Jeff King <peff@peff.net>
    Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
  2. @peff @gitster

    diff: introduce diff.<driver>.binary

    peff authored gitster committed
    The "diff" gitattribute is somewhat overloaded right now. It
    can say one of three things:
    
      1. this file is definitely binary, or definitely not
         (i.e., diff or !diff)
      2. this file should use an external diff engine (i.e.,
         diff=foo, diff.foo.command = custom-script)
      3. this file should use particular funcname patterns
         (i.e., diff=foo, diff.foo.(x?)funcname = some-regex)
    
    Most of the time, there is no conflict between these uses,
    since using one implies that the other is irrelevant (e.g.,
    an external diff engine will decide for itself whether the
    file is binary).
    
    However, there is at least one conflicting situation: there
    is no way to say "use the regular rules to determine whether
    this file is binary, but if we do diff it textually, use
    this funcname pattern." That is, currently setting diff=foo
    indicates that the file is definitely text.
    
    This patch introduces a "binary" config option for a diff
    driver, so that one can explicitly set diff.foo.binary. We
    default this value to "don't know". That is, setting a diff
    attribute to "foo" and using "diff.foo.funcname" will have
    no effect on the binaryness of a file. To get the current
    behavior, one can set diff.foo.binary to true.
    
    This patch also has one additional advantage: it cleans up
    the interface to the userdiff code a bit. Before, calling
    code had to know more about whether attributes were false,
    true, or unset to determine binaryness. Now that binaryness
    is a property of a driver, we can represent these situations
    just by passing back a driver struct.
    
    Signed-off-by: Jeff King <peff@peff.net>
    Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
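The three-valued "binary" setting can be modelled with an optional boolean. In this sketch, None plays the role of "don't know", and the NUL-byte check is an assumed stand-in for "the regular rules".

```python
from typing import Optional


def is_binary(driver_binary: Optional[bool], content: bytes) -> bool:
    """Decide binaryness from the driver setting, else from the content."""
    if driver_binary is not None:
        return driver_binary          # diff.foo.binary set explicitly
    return b"\0" in content[:8000]    # assumed content heuristic


text = b"sub foo {\n}\n"
blob = b"\x89PNG\r\n\x1a\n\0\0\0\rIHDR"

print(is_binary(None, text), is_binary(None, blob))  # content decides
print(is_binary(True, text))                         # forced binary
print(is_binary(False, blob))                        # forced text
```

A driver that only sets funcname leaves the value at "don't know", so the content check still applies.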
  3. @peff @gitster

    diff: unify external diff and funcname parsing code

    peff authored gitster committed
    Both sets of code assume that one specifies a diff profile
    as a gitattribute via the "diff=foo" attribute. They then
    pull information about that profile from the config as
    diff.foo.*.
    
    The code for each is currently completely separate from the
    other, which has several disadvantages:
    
      - there is duplication as we maintain code to create and
        search the separate lists of external drivers and
        funcname patterns
    
      - it is difficult to add new profile options, since it is
        unclear where they should go
    
      - the code is difficult to follow, as we rely on the
        "check if this file is binary" code to find the funcname
        pattern as a side effect. This is the first step in
        refactoring the binary-checking code.
    
    This patch factors out these diff profiles into "userdiff"
    drivers. A file with "diff=foo" uses the "foo" driver, which
    is specified by a single struct.
    
    Note that one major difference between the two pieces of
    code is that the funcname patterns are always loaded,
    whereas external drivers are loaded only for the "git diff"
    porcelain; the new code takes care to retain that situation.
    
    Signed-off-by: Jeff King <peff@peff.net>
    Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
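The "single struct per driver" idea can be sketched with a dataclass populated from diff.foo.* keys. The field names and the parsing helper are hypothetical; the real struct lives in git's C code.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass
class UserdiffDriver:
    name: str
    external: Optional[str] = None   # from diff.<name>.command
    funcname: Optional[str] = None   # from diff.<name>.funcname / xfuncname


def parse_config(entries: Iterable[Tuple[str, str]]) -> Dict[str, UserdiffDriver]:
    """Collect diff.<name>.<field> entries into one driver object per name."""
    drivers: Dict[str, UserdiffDriver] = {}
    for key, value in entries:
        parts = key.split(".")
        if len(parts) < 3 or parts[0] != "diff":
            continue
        name, field = ".".join(parts[1:-1]), parts[-1].lower()
        driver = drivers.setdefault(name, UserdiffDriver(name))
        if field == "command":
            driver.external = value
        elif field in ("funcname", "xfuncname"):
            driver.funcname = value
    return drivers


config = [
    ("diff.foo.command", "custom-script"),
    ("diff.foo.funcname", "^sub "),
    ("core.editor", "vi"),
]
print(parse_config(config))
```

Both kinds of setting end up on the same object, so a new option only needs a new field rather than a separate list.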