Commits on Dec 17, 2014
  1. @peff @gitster

    utf8: add is_hfs_dotgit() helper

    peff authored gitster committed
    We do not allow paths with a ".git" component to be added to
    the index, as that would mean repository contents could
    overwrite our repository files. However, asking "is this
    path the same as .git" is not as simple as strcmp() on some
    filesystems. HFS+'s case-folding does more than just fold uppercase into
    lowercase (which we already handle with strcasecmp). It may
    also skip past certain "ignored" Unicode code points, so
    that (for example) ".gi\u200ct" is mapped to ".git".
    The full list of folds can be found in the tables at:
    Implementing a full "is this path the same as that path"
    comparison would require us importing the whole set of
    tables.  However, what we want to do is much simpler: we
    only care about checking ".git". We know that 'G' is the
    only thing that folds to 'g', and so on, so we really only
    need to deal with the set of ignored code points, which is
    much smaller.
    Signed-off-by: Jeff King <>
    Signed-off-by: Junio C Hamano <>
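
The idea in the message above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function names are hypothetical, the list of ignored code points is an assumption that should be checked against Apple's tables, and only the leading path component is examined.

```c
#include <stddef.h>
#include <stdint.h>

#define INVALID_CP 0xffffffffu

/* Decode one UTF-8 code point and advance *s. Returns 0 at the end of the
 * string and INVALID_CP on malformed input (no overlong checks: a sketch). */
static uint32_t next_codepoint(const unsigned char **s)
{
	const unsigned char *p = *s;
	uint32_t c;
	int extra;

	if (!*p)
		return 0;
	if (*p < 0x80) { c = *p; extra = 0; }
	else if ((*p & 0xe0) == 0xc0) { c = *p & 0x1f; extra = 1; }
	else if ((*p & 0xf0) == 0xe0) { c = *p & 0x0f; extra = 2; }
	else if ((*p & 0xf8) == 0xf0) { c = *p & 0x07; extra = 3; }
	else return INVALID_CP;
	p++;
	while (extra--) {
		if ((*p & 0xc0) != 0x80)
			return INVALID_CP;
		c = (c << 6) | (*p++ & 0x3f);
	}
	*s = p;
	return c;
}

/* Assumed set of code points that HFS+ skips when comparing names. */
static int is_ignored(uint32_t c)
{
	return (c >= 0x200c && c <= 0x200f) ||
	       (c >= 0x202a && c <= 0x202e) ||
	       (c >= 0x206a && c <= 0x206f) ||
	       c == 0xfeff;
}

/* Next code point that is not in the ignored set. */
static uint32_t next_significant(const unsigned char **s)
{
	uint32_t c;
	do {
		c = next_codepoint(s);
	} while (is_ignored(c));
	return c;
}

/* Returns 1 if the first path component would fold to ".git". */
int looks_like_hfs_dotgit(const char *path)
{
	const unsigned char *s = (const unsigned char *)path;
	uint32_t c;

	if (next_significant(&s) != '.')
		return 0;
	c = next_significant(&s);
	if (c != 'g' && c != 'G')
		return 0;
	c = next_significant(&s);
	if (c != 'i' && c != 'I')
		return 0;
	c = next_significant(&s);
	if (c != 't' && c != 'T')
		return 0;
	c = next_significant(&s);
	return c == 0 || c == '/';
}
```

Because only 'G' folds to 'g' (and likewise for the other letters), the comparison needs no folding tables; the only extra work is skipping the ignored code points.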
Commits on Jul 10, 2013
  1. @peff @gitster

    add missing "format" function attributes

    peff authored gitster committed
    For most of our functions that take printf-like formats, we
    use gcc's __attribute__((format)) to get compiler warnings
    when the functions are misused. Let's give a few more
    functions the same protection.
    In most cases, the annotations do not uncover any actual
    bugs; the only code change needed is that we passed a size_t
    to transfer_debug, which expected an int. Since we expect
    the passed-in value to be a relatively small buffer size
    (and cast a similar value to int directly below), we can
    just cast away the problem.
    Signed-off-by: Jeff King <>
    Signed-off-by: Junio C Hamano <>
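
A small sketch of the kind of annotation described above, assuming a GCC-compatible compiler. The macro and function names are hypothetical and are not taken from git; the second function mirrors the "cast away the problem" fix for a size_t passed to an int conversion.

```c
#include <stdarg.h>
#include <stdio.h>

/* Only GCC-compatible compilers understand the attribute. */
#if defined(__GNUC__)
#define PRINTF_LIKE(fmt_idx, first_arg) \
	__attribute__((format(printf, fmt_idx, first_arg)))
#else
#define PRINTF_LIKE(fmt_idx, first_arg)
#endif

/* Argument 3 is the format string; variadic arguments start at 4. */
PRINTF_LIKE(3, 4)
int debug_format(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf, size, fmt, ap);
	va_end(ap);
	return n;
}

/* "%d" expects an int. Passing bufsize directly would now be flagged by
 * -Wformat, so a value known to be small is cast explicitly. */
int describe_buffer(char *out, size_t outsize, size_t bufsize)
{
	return debug_format(out, outsize, "buffer size %d", (int)bufsize);
}
```

With the attribute in place, a mismatched call such as passing a size_t for "%d" is reported at compile time rather than misbehaving at run time.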
Commits on Apr 18, 2013
  1. @pclouds @gitster

    pretty: support %>> that steal trailing spaces

    pclouds authored gitster committed
    This is pretty useful in `%<(100)%s%Cred%>(20)% an' where %s does not
    use up all 100 columns and %an needs more than 20 columns. By
    replacing %>(20) with %>>(20), %an can steal spaces from %s.
    %>> understands escape sequences, so %Cred does not stop it from
    stealing spaces in %<(100).
    Signed-off-by: Nguyễn Thái Ngọc Duy <>
    Signed-off-by: Junio C Hamano <>
  2. @pclouds @gitster

    pretty: support truncating in %>, %< and %><

    pclouds authored gitster committed
    %>(N,trunc) truncates the right part after N columns and replaces the
    last two letters with "..". ltrunc does the same on the left. mtrunc
    cuts the middle out.
    Signed-off-by: Nguyễn Thái Ngọc Duy <>
    Signed-off-by: Junio C Hamano <>
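
The three truncation modes can be illustrated with a rough ASCII-only sketch. The names are hypothetical, padding is not handled, and the way the middle mode splits the kept text between head and tail is an assumption, not necessarily what pretty.c does.

```c
#include <string.h>

enum trunc_mode { TRUNC_RIGHT, TRUNC_LEFT, TRUNC_MIDDLE };

/* Writes at most `width` characters plus a NUL into out, which must have
 * room for width + 1 bytes. Input is assumed to be plain ASCII. */
void truncate_ascii(char *out, const char *in, size_t width,
		    enum trunc_mode mode)
{
	size_t len = strlen(in);
	size_t keep, head, tail;

	if (len <= width || width < 2) {
		/* Fits already, or too narrow to hold "..": plain copy. */
		size_t n = len < width ? len : width;
		memcpy(out, in, n);
		out[n] = '\0';
		return;
	}

	keep = width - 2;	/* columns left after the ".." marker */
	switch (mode) {
	case TRUNC_RIGHT:	/* keep the start, mark the end */
		memcpy(out, in, keep);
		memcpy(out + keep, "..", 2);
		break;
	case TRUNC_LEFT:	/* mark the start, keep the end */
		memcpy(out, "..", 2);
		memcpy(out + 2, in + len - keep, keep);
		break;
	case TRUNC_MIDDLE:	/* keep both ends, cut the middle */
		head = keep / 2;
		tail = keep - head;
		memcpy(out, in, head);
		memcpy(out + head, "..", 2);
		memcpy(out + head + 2, in + len - tail, tail);
		break;
	}
	out[width] = '\0';
}
```

For the input "abcdefghij" and a width of 6, the three modes give "abcd..", "..ghij" and "ab..ij" respectively.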
  3. @pclouds @gitster

    utf8.c: add reencode_string_len() that can handle NULs in string

    pclouds authored gitster committed
    Signed-off-by: Nguyễn Thái Ngọc Duy <>
    Signed-off-by: Junio C Hamano <>
  4. @pclouds @gitster

    utf8.c: add utf8_strnwidth() with the ability to skip ansi sequences

    pclouds authored gitster committed
    Signed-off-by: Nguyễn Thái Ngọc Duy <>
    Signed-off-by: Junio C Hamano <>
Commits on Mar 25, 2013
  1. @gitster

    Merge branch 'ks/rfc2047-one-char-at-a-time'

    gitster authored
    When "format-patch" quoted non-ASCII strings in the header fields,
    it incorrectly applied RFC 2047 and chopped a single character in
    the middle.
    * ks/rfc2047-one-char-at-a-time:
      format-patch: RFC 2047 says multi-octet character may not be split
Commits on Mar 9, 2013
  1. @gitster

    format-patch: RFC 2047 says multi-octet character may not be split

    Kirill Smelkov authored gitster committed
    Even though an earlier attempt (bafc478..41dd00b) cleaned
    up RFC 2047 encoding, pretty.c::add_rfc2047() still decides
    where to split the output line by going through the input
    one byte at a time, and potentially splits a character in
    the middle.  A subject line may end up showing like this:
         ".... fö?? bar".   (instead of  ".... föö bar".)
    if split incorrectly.
    RFC 2047, section 5 (3) explicitly forbids such behaviour:
        Each 'encoded-word' MUST represent an integral number of
        characters.  A multi-octet character may not be split across
        adjacent 'encoded-word's.
    that means that e.g. for
        Subject: .... föö bar
        Subject: =?UTF-8?q?....=20f=C3=B6=C3=B6?=
    is correct, and
        Subject: =?UTF-8?q?....=20f=C3=B6=C3?=      <-- NOTE ö is broken here
    is not, because "ö" character UTF-8 encoding C3 B6 is split here across
    adjacent encoded words.
    To fix the problem, make the loop grab one _character_ at a time and
    determine its output length to see where to break the output line.  Note
    that this version only knows about UTF-8, but the logic to grab one
    character is abstracted out in mbs_chrlen() function to make it possible
    to extend it to other encodings with the help of iconv in the future.
    Signed-off-by: Kirill Smelkov <>
    Signed-off-by: Junio C Hamano <>
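
The character-at-a-time loop described above might look roughly like this. It is a simplified sketch: the function names are hypothetical (this is not mbs_chrlen() itself), only UTF-8 is handled, and the Q-encoding cost model is an assumption that treats every byte other than an ASCII letter or digit as needing the three-byte "=XX" form.

```c
#include <stddef.h>

/* Byte length of the UTF-8 character starting at s, never more than
 * `remaining`. An invalid lead byte counts as 1 so the caller always
 * makes progress. */
size_t utf8_char_len(const unsigned char *s, size_t remaining)
{
	size_t n;

	if (!remaining)
		return 0;
	if (s[0] < 0x80)
		n = 1;
	else if ((s[0] & 0xe0) == 0xc0)
		n = 2;
	else if ((s[0] & 0xf0) == 0xe0)
		n = 3;
	else if ((s[0] & 0xf8) == 0xf0)
		n = 4;
	else
		n = 1;
	return n <= remaining ? n : remaining;
}

/* Output columns one input byte needs once Q-encoded (assumed model). */
static size_t q_cost(unsigned char c)
{
	int plain = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
		    (c >= '0' && c <= '9');
	return plain ? 1 : 3;
}

/* How many input bytes fit into one encoded word with `budget` output
 * columns, without ever splitting a multi-octet character. */
size_t bytes_fitting_in_word(const char *text, size_t len, size_t budget)
{
	const unsigned char *s = (const unsigned char *)text;
	size_t used = 0, taken = 0;

	while (taken < len) {
		size_t n = utf8_char_len(s + taken, len - taken);
		size_t cost = 0, i;

		for (i = 0; i < n; i++)
			cost += q_cost(s[taken + i]);
		if (used + cost > budget)
			break;	/* the whole character moves to the next word */
		used += cost;
		taken += n;
	}
	return taken;
}
```

For "föö" (bytes 66 C3 B6 C3 B6) and a budget of 10 columns, a byte-wise loop would stop after four bytes and leave a dangling C3, while this loop stops after three bytes, at the character boundary.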
Commits on Feb 14, 2013
  1. @gitster

    Merge branch 'jx/utf8-printf-width'

    gitster authored
    Use a new helper that prints a message and counts its display width
    to align the help messages parse-options produces.
    * jx/utf8-printf-width:
      Add utf8_fprintf helper that returns correct number of columns
Commits on Feb 11, 2013
  1. @jiangxin @gitster

    Add utf8_fprintf helper that returns correct number of columns

    jiangxin authored gitster committed
    Since command usages can be translated, they may include utf-8
    encoded strings, and the output in console may not align well any
    more. This is because strlen() is different from strwidth() on utf-8 strings.
    A wrapper utf8_fprintf() can help to return the correct number of
    columns required.
    Signed-off-by: Jiang Xin <>
    Signed-off-by: Nguyễn Thái Ngọc Duy <>
    Reviewed-by: Torsten Bögershausen <>
    Signed-off-by: Junio C Hamano <>
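
A rough sketch of why byte counts and column counts diverge, and of a helper that pads by columns. The names are hypothetical, and the width estimate simply counts code points, which is an assumption that ignores wide and combining characters.

```c
#include <stdio.h>
#include <string.h>

/* Approximate display width: count every byte that is not a UTF-8
 * continuation byte (10xxxxxx), i.e. one column per code point. */
size_t approx_width(const char *s)
{
	size_t cols = 0;

	for (; *s; s++)
		if (((unsigned char)*s & 0xc0) != 0x80)
			cols++;
	return cols;
}

/* Print s, then pad with spaces up to `cols` columns. Returns the number
 * of columns written, which is what alignment code needs to know. */
size_t print_padded(FILE *out, const char *s, size_t cols)
{
	size_t w = approx_width(s);

	fputs(s, out);
	for (; w < cols; w++)
		fputc(' ', out);
	return w;
}
```

For "föö", strlen() reports 5 bytes while the text occupies 3 columns, so padding computed from strlen() would come up two spaces short.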
Commits on Jan 2, 2013
  1. @gitster

    Merge branch 'sp/shortlog-missing-lf'

    gitster authored
    When a line to be wrapped has a solid run of non space characters
    whose length exactly is the wrap width, "git shortlog -w" failed to
    add a newline after such a line.
    * sp/shortlog-missing-lf:
      strbuf_add_wrapped*(): Remove unused return value
      shortlog: fix wrapping lines of wraplen
Commits on Dec 11, 2012
  1. @sprohaska @gitster

    strbuf_add_wrapped*(): Remove unused return value

    sprohaska authored gitster committed
    Since shortlog isn't using the return value anymore (see previous
    commit), the functions can be changed to void.
    Signed-off-by: Steffen Prohaska <>
    Signed-off-by: Junio C Hamano <>
Commits on Nov 4, 2012
  1. @gitster @peff

    reencode_string(): introduce and use same_encoding()

    gitster authored peff committed
    Callers of reencode_string() that re-encode a string from one
    encoding to another all used ad-hoc ways to bypass the case where the
    input and the output encodings are the same.  Some did strcmp(),
    some did strcasecmp(), yet some others when converting to UTF-8 used
    is_encoding_utf8().
    Introduce same_encoding() helper function to make these callers use
    the same logic.  Notably, is_encoding_utf8() has a work-around for
    common misconfiguration to use "utf8" to name UTF-8 encoding, which
    does not match "UTF-8" hence strcasecmp() would not consider them the
    same.  Make use of it in this helper function.
    Signed-off-by: Junio C Hamano <>
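
The logic described above can be sketched as follows. The function names are hypothetical stand-ins, and treating a NULL name as UTF-8 is an assumption of this sketch rather than something the message states.

```c
#include <stddef.h>
#include <strings.h>	/* strcasecmp() is POSIX, not ISO C */

/* Accept both the proper name and the common "utf8" misspelling. */
int is_utf8_name(const char *name)
{
	if (!name)
		return 1;	/* assumption: unset means UTF-8 */
	return !strcasecmp(name, "utf-8") || !strcasecmp(name, "utf8");
}

/* One shared test for "no conversion needed" that every caller can use. */
int encodings_match(const char *src, const char *dst)
{
	if (is_utf8_name(src) && is_utf8_name(dst))
		return 1;
	if (!src || !dst)
		return 0;
	return !strcasecmp(src, dst);
}
```

A plain strcasecmp("utf8", "UTF-8") is nonzero, so callers relying on it alone would needlessly attempt a conversion; routing the comparison through the UTF-8 check first avoids that.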
Commits on Jul 9, 2012
  1. @tboegi @gitster

    git on Mac OS and precomposed unicode

    tboegi authored gitster committed
    Mac OS X mangles file names containing unicode on file systems HFS+,
    VFAT or SAMBA.  When a file using unicode code points outside ASCII
    is created on a HFS+ drive, the file name is converted into
    decomposed unicode and written to disk. No conversion is done if
    the file name is already decomposed unicode.
    Calling open("\xc3\x84", ...) with a precomposed "Ä" yields the same
    result as open("\x41\xcc\x88",...) with a decomposed "Ä".
    As a consequence, readdir() returns the file names in decomposed
    unicode, even if the user expects precomposed unicode.  Unlike on
    HFS+, Mac OS X stores files on a VFAT drive (e.g. a USB drive) in
    precomposed unicode, but readdir() still returns file names in
    decomposed unicode.  When a git repository is stored on a network
    share using SAMBA, file names are sent over the wire and written to
    disk on the remote system in precomposed unicode, but Mac OS X
    readdir() returns decomposed unicode to be compatible with its
    behaviour on HFS+ and VFAT.
    The unicode decomposition causes many problems:
    - The names "git add" and other commands get from the end user may
      often be precomposed form (the decomposed form is not easily input
      from the keyboard), but when the commands read from the filesystem
      to see whether what they are going to update the index with is
      already on the filesystem, readdir() will give decomposed form,
      which is different.
    - Similarly "git log", "git mv" and all other commands that need to
      compare pathnames found on the command line (often but not always
      precomposed form; a command line input resulting from globbing may
      be in decomposed) with pathnames found in the tree objects (should
      be precomposed form to be compatible with other systems and for
      consistency in general).
    - The same for names stored in the index, which should be
      precomposed, that may need to be compared with the names read from
      readdir().
    NFS mounted from Linux is fully transparent and does not suffer from
    the above.
    As Mac OS X treats precomposed and decomposed file names as equal,
    we can
     - wrap readdir() on Mac OS X to return the precomposed form, and
     - normalize decomposed form given from the command line also to the
       precomposed form,
    to ensure that all pathnames used in Git are always in the
    precomposed form.  This behaviour can be requested by setting
    "core.precomposedunicode" configuration variable to true.
    The code in compat/precomposed_utf8.c implements basically 4 new
    functions: precomposed_utf8_opendir(), precomposed_utf8_readdir(),
    precomposed_utf8_closedir() and precompose_argv().  The first three
    are to wrap opendir(3), readdir(3), and closedir(3) functions.
    The argv[] conversion allows the use of the TAB filename completion done
    by the shell on command line.  It tolerates other tools which use
    readdir() to feed decomposed file names into git.
    When creating a new git repository with "git init" or "git clone",
    "core.precomposedunicode" will be set to "false".
    The user needs to activate this feature manually.  She typically
    sets core.precomposedunicode to "true" on HFS and VFAT, or file
    systems mounted via SAMBA.
    Helped-by: Junio C Hamano <>
    Signed-off-by: Torsten Bögershausen <>
    Signed-off-by: Junio C Hamano <>
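
To make the byte-level mismatch concrete, here is a toy sketch that maps a few decomposed sequences to their precomposed form. It is only an illustration under stated assumptions: the three-entry table and the function name are hypothetical, and this is not how compat/precomposed_utf8.c performs the conversion.

```c
#include <string.h>

struct compose_pair {
	const char *decomposed;		/* base letter + U+0308 */
	const char *precomposed;	/* single code point */
};

/* Tiny illustrative table; a real conversion covers far more. */
static const struct compose_pair compose_table[] = {
	{ "\x41\xcc\x88", "\xc3\x84" },	/* A + diaeresis -> U+00C4 */
	{ "\x4f\xcc\x88", "\xc3\x96" },	/* O + diaeresis -> U+00D6 */
	{ "\x55\xcc\x88", "\xc3\x9c" },	/* U + diaeresis -> U+00DC */
};

/* Copies `in` to `out`, replacing known decomposed sequences. The output
 * never grows, so out needs strlen(in) + 1 bytes. */
void precompose_sketch(char *out, const char *in)
{
	while (*in) {
		size_t i, matched = 0;

		for (i = 0; i < sizeof(compose_table) / sizeof(compose_table[0]); i++) {
			size_t n = strlen(compose_table[i].decomposed);

			if (!strncmp(in, compose_table[i].decomposed, n)) {
				size_t m = strlen(compose_table[i].precomposed);

				memcpy(out, compose_table[i].precomposed, m);
				out += m;
				in += n;
				matched = 1;
				break;
			}
		}
		if (!matched)
			*out++ = *in++;
	}
	*out = '\0';
}
```

The two spellings of "Ä" compare unequal under strcmp(), which is why a name typed on the command line may fail to match the name readdir() returns until both are normalized to one form.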
Commits on Feb 23, 2011
  1. @peff @gitster

    strbuf: add fixed-length version of add_wrapped_text

    peff authored gitster committed
    The function strbuf_add_wrapped_text takes a NUL-terminated
    string. This makes it annoying to wrap strings we have as a
    pointer and a length.
    Refactoring strbuf_add_wrapped_text and all of its
    sub-functions to handle fixed-length strings turned out to
    be really ugly. So this implementation is lame; it just
    strdups the text and operates on the NUL-terminated version.
    This should be fine as the strings we are wrapping are
    generally pretty short.  If it becomes a problem, we can
    optimize later.
    Signed-off-by: Jeff King <>
    Signed-off-by: Junio C Hamano <>
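
The "duplicate, then delegate" approach the message describes can be sketched in general terms. Both functions below are hypothetical: the first merely stands in for any routine that insists on NUL-terminated input, and the second is the fixed-length wrapper built on top of it.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for a routine that only accepts NUL-terminated text. Here it
 * reports the length of the longest word. */
size_t longest_word(const char *text)
{
	size_t best = 0, cur = 0;

	for (; *text; text++) {
		if (*text == ' ' || *text == '\n')
			cur = 0;
		else if (++cur > best)
			best = cur;
	}
	return best;
}

/* Fixed-length variant: copy the bytes, terminate the copy, and call the
 * NUL-terminated version. Returns 0 if allocation fails. */
size_t longest_word_len(const char *data, size_t len)
{
	char *tmp = malloc(len + 1);
	size_t result;

	if (!tmp)
		return 0;
	memcpy(tmp, data, len);
	tmp[len] = '\0';
	result = longest_word(tmp);
	free(tmp);
	return result;
}
```

The copy costs one allocation per call, which is the trade-off the message accepts for short strings in exchange for leaving the existing helpers untouched.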
Commits on Mar 2, 2010
  1. @gitster

    Merge branch 'rs/optim-text-wrap'

    gitster authored
    * rs/optim-text-wrap:
      utf8.c: speculatively assume utf-8 in strbuf_add_wrapped_text()
      utf8.c: remove strbuf_write()
      utf8.c: remove print_spaces()
      utf8.c: remove print_wrapped_text()
Commits on Feb 20, 2010
  1. @gitster

    utf8.c: remove print_wrapped_text()

    René Scharfe authored gitster committed
    strbuf_add_wrapped_text() is called only from print_wrapped_text()
    without a strbuf (in which case it writes its results to stdout).
    At its only callsite, supply a strbuf, call strbuf_add_wrapped_text()
    directly and remove the wrapper function.
    Signed-off-by: Rene Scharfe <>
    Signed-off-by: Junio C Hamano <>
Commits on Jan 12, 2010
  1. @gitster

    utf8.c: mark file-local function static

    gitster authored
    Signed-off-by: Junio C Hamano <>
Commits on Oct 19, 2009
  1. @dscho @gitster

    Add strbuf_add_wrapped_text() to utf8.[ch]

    dscho authored gitster committed
    The newly added function can rewrap text according to a given first-line
    indent, other-indent and text width.
    Signed-off-by: Johannes Schindelin <>
Commits on Feb 5, 2009
  1. @geofft @gitster

    utf8: add utf8_strwidth()

    geofft authored gitster committed
    I'm about to use this pattern more than once, so make it a common function.
    Signed-off-by: Geoffrey Thomas <>
    Signed-off-by: Junio C Hamano <>
Commits on Jan 7, 2008
  1. @gitster

    utf8_width(): allow non NUL-terminated input

    gitster authored
    The original interface assumed that the input string is
    always terminated with a NUL, but that wasn't too useful.
    Signed-off-by: Junio C Hamano <>
  2. @gitster

    utf8: pick_one_utf8_char()

    gitster authored
    utf8_width() function was doing two different things.  To pick a
    valid character from UTF-8 stream, and compute the display width of
    that character.  This splits the former to a separate function.
    Signed-off-by: Junio C Hamano <>
Commits on Feb 28, 2007
  1. @dscho

    Actually make print_wrapped_text() useful

    dscho authored Junio C Hamano committed
    Now, it returns the current column, does not add a newline, and you can
    pass a negative indent, to indicate that the indent was already printed.
    With this, you can actually continue in the middle of a paragraph, not
    having to print everything into a buffer first.
    Signed-off-by: Johannes Schindelin <>
    Signed-off-by: Junio C Hamano <>
Commits on Dec 30, 2006
  1. commit-tree: cope with different ways "utf-8" can be spelled.

    Junio C Hamano authored
    People can spell config.commitencoding differently from what we
    internally have ("utf-8") to mean UTF-8.  Try to accept them and
    treat them equally.
    Signed-off-by: Junio C Hamano <>
Commits on Dec 26, 2006
  1. Move encoding conversion routine out of mailinfo to utf8.c

    Junio C Hamano authored
    This moves the body of convert_to_utf8() routine used in mailinfo
    to the utf8.c i18n library.
    Signed-off-by: Junio C Hamano <>
Commits on Dec 24, 2006
  1. @dscho

    commit-tree: encourage UTF-8 commit messages.

    dscho authored Junio C Hamano committed
    Introduce is_utf8() to check if a text looks like it is encoded
    in UTF-8, utf8_width() to count display width, and implement
    print_wrapped_text() using them.
    git-commit-tree warns if the commit message does not minimally
    conform to the UTF-8 encoding when i18n.commitencoding is either
    unset, or set to "utf-8".
    Signed-off-by: Junio C Hamano <>
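
A minimal sketch of what "minimally conform to the UTF-8 encoding" could mean in practice. The function name is hypothetical, and the check is deliberately loose: it verifies lead and continuation bytes only and does not reject overlong forms or surrogates.

```c
#include <stddef.h>

/* Returns 1 if every byte sequence in text has a valid UTF-8 shape. */
int looks_like_utf8(const char *text)
{
	const unsigned char *s = (const unsigned char *)text;

	while (*s) {
		int extra;

		if (*s < 0x80)
			extra = 0;	/* plain ASCII */
		else if ((*s & 0xe0) == 0xc0)
			extra = 1;
		else if ((*s & 0xf0) == 0xe0)
			extra = 2;
		else if ((*s & 0xf8) == 0xf0)
			extra = 3;
		else
			return 0;	/* stray continuation or bad lead byte */
		s++;
		while (extra--) {
			/* A NUL here also fails: the sequence was cut short. */
			if ((*s & 0xc0) != 0x80)
				return 0;
			s++;
		}
	}
	return 1;
}
```

A message written in Latin-1, where "ö" is the single byte F6, fails this check, which is the kind of input a warning like the one described would be meant to catch.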