Commits on Aug 24, 2012
  1. @Jojo-Schmitz

    sha1_file.c: introduce get_max_fd_limit() helper

    Jojo-Schmitz authored committed
    Not all platforms have getrlimit(), but there are other ways to see
    the maximum number of files that a process can have open.  If
    getrlimit() is unavailable, fall back to sysconf(_SC_OPEN_MAX) if
    available, and failing that, use OPEN_MAX from <limits.h>.
    
    Signed-off-by: Joachim Schmitz <jojo@schmitz-digital.de>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
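
    A minimal sketch of that fallback chain, assuming a build flag
    HAVE_GETRLIMIT (hypothetical here) signals getrlimit() support; the
    exact code in git.git may differ:

        #include <unistd.h>        /* sysconf(), _SC_OPEN_MAX */
        #include <limits.h>        /* OPEN_MAX, where the platform has it */
        #ifdef HAVE_GETRLIMIT      /* hypothetical feature flag */
        #include <sys/resource.h>  /* getrlimit(), RLIMIT_NOFILE */
        #endif

        static unsigned int get_max_fd_limit(void)
        {
        #ifdef HAVE_GETRLIMIT
            struct rlimit lim;
            if (!getrlimit(RLIMIT_NOFILE, &lim))
                return lim.rlim_cur;
        #endif
        #ifdef _SC_OPEN_MAX
            {
                long open_max = sysconf(_SC_OPEN_MAX);
                if (0 < open_max)
                    return open_max;
            }
        #endif
        #ifdef OPEN_MAX
            return OPEN_MAX;
        #else
            return 1; /* last resort: assume we can open one file */
        #endif
        }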
Commits on Jul 30, 2012
  1. Merge branch 'hv/link-alt-odb-entry'

    authored
    The code that avoids a mistaken attempt to add the object directory
    itself as its own alternate could read beyond the end of a string
    during comparison.
    
    * hv/link-alt-odb-entry:
      link_alt_odb_entry: fix read over array bounds reported by valgrind
  2. link_alt_odb_entry: fix read over array bounds reported by valgrind

    Heiko Voigt authored committed
    pfxlen can be longer than the path in objdir when relative_base
    contains the path to git's object directory.  Here we are interested
    in checking if ent->base[] (the part that corresponds to .git/objects)
    is the same string as objdir, and the code NUL-terminated ent->base[]
    to
    
    	LEADING PATH\0XX/XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX\0
    
    in preparation for this "duplicate check" step (before we return
    from the function, the first NUL is turned into '/' so that we can
    fill XX when probing for loose objects).  All we need to do is to
    compare the string with the path to our object directory.
    
    Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
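
    The shape of the fix, sketched; variable names follow the message,
    and the surrounding function is elided:

        /*
         * ent->base[] holds "LEADING PATH\0XX/XXXX...\0" here, so up
         * to the first NUL it is exactly the alternate's object
         * directory, and a string comparison is the right tool.
         */

        /* before: pfxlen can exceed strlen(objdir), so memcmp() may
         * read past the end of objdir */
        if (!memcmp(ent->base, objdir, pfxlen))
            return -1;  /* refuse to add our own object directory */

        /* after: strcmp() stops at the NUL and stays in bounds */
        if (!strcmp(ent->base, objdir))
            return -1;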
Commits on May 23, 2012
  1. Merge branch 'hv/submodule-alt-odb'

    authored
    When peeking into object stores of submodules, the code forgot that they
    might borrow objects from alternate object stores on their own.
    
    By Heiko Voigt
    * hv/submodule-alt-odb:
      teach add_submodule_odb() to look for alternates
Commits on May 14, 2012
  1. teach add_submodule_odb() to look for alternates

    Heiko Voigt authored committed
    Since we allow linking other object databases when loading a
    submodule's database, we should also load its possible alternates.
    
    Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
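
    A sketch of the addition, assuming add_submodule_odb() builds the
    submodule's objects/ path in a strbuf called objects_directory:

        /* after registering the submodule's own objects/ directory as
         * an alternate, also pull in whatever that store borrows via
         * its objects/info/alternates file */
        read_info_alternates(objects_directory.buf, 0);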
Commits on Apr 30, 2012
  1. remove blank filename in error message

    Pete Wyckoff authored committed
    When write_loose_object() finds that it is unable to
    create a temporary file, it complains, for instance:
    
        unable to create temporary sha1 filename : Too many open files
    
    That extra space was supposed to be the name of the file,
    and will be an empty string if git_mkstemps_mode() fails.
    
    The name of the temporary file is unimportant; delete it.
    
    Signed-off-by: Pete Wyckoff <pw@padd.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
  2. remove superfluous newlines in error messages

    Pete Wyckoff authored committed
    The error handling routines add a newline.  Remove
    the duplicate ones in error messages.
    
    Signed-off-by: Pete Wyckoff <pw@padd.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commits on Mar 7, 2012
  1. @pclouds

    parse_object: avoid putting whole blob in core

    pclouds authored committed
    Traditionally, all the callers of check_sha1_signature() first
    called read_sha1_file() to prepare the whole object data in core,
    and then called this function.  The function is used to revalidate
    that what we read from the object database actually matches the
    object name we used to ask for the data.
    
    Update the API to allow callers to pass NULL as the object data, and
    have the function read and hash the object data using streaming API
    to recompute the object name, without having to hold everything in
    core at the same time.  This is most useful in parse_object(),
    which parses blob objects: that caller does not have to keep the
    actual blob data around in memory after a "struct blob" is returned.
    
    Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
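
    A sketch of the new calling convention from the caller's side:

        /* NULL object data asks check_sha1_signature() to stream the
         * object from the object database and recompute its name, so
         * the whole blob never has to sit in core */
        if (check_sha1_signature(sha1, NULL, 0, NULL) < 0) {
            error("sha1 mismatch %s", sha1_to_hex(sha1));
            return NULL;
        }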
Commits on Mar 5, 2012
  1. Merge branch 'jk/maint-avoid-streaming-filtered-contents' into maint

    authored
    * jk/maint-avoid-streaming-filtered-contents:
      do not stream large files to pack when filters are in use
      teach dry-run convert_to_git not to require a src buffer
      teach convert_to_git a "dry run" mode
Commits on Feb 27, 2012
  1. Merge branch 'jk/maint-avoid-streaming-filtered-contents'

    authored
    * jk/maint-avoid-streaming-filtered-contents:
      do not stream large files to pack when filters are in use
      teach dry-run convert_to_git not to require a src buffer
      teach convert_to_git a "dry run" mode
Commits on Feb 24, 2012
  1. @peff

    do not stream large files to pack when filters are in use

    peff authored committed
    Because git's object format requires us to specify the
    number of bytes in the object in its header, we must know
    the size before streaming a blob into the object database.
    This is not a problem when adding a regular file, as we can
    get the size from stat(). However, when filters are in use
    (such as autocrlf, or the ident, filter, or eol
    gitattributes), we have no idea what the ultimate size will
    be.
    
    The current code just punts on the whole issue and ignores
    filter configuration entirely for files larger than
    core.bigfilethreshold. This can generate confusing results
    if you use filters for large binary files, as the filter
    will suddenly stop working as the file goes over a certain
    size.  Rather than try to handle unknown input sizes with
    streaming, this patch just turns off the streaming
    optimization when filters are in use.
    
    This introduces a slight performance regression in a very specific
    case: if you have autocrlf on, but no gitattributes, a large
    binary file will avoid the streaming code path because we
    don't know beforehand whether it will need conversion or
    not. But if you are handling large binary files, you should
    be marking them as such via attributes (or at least not
    using autocrlf, and instead marking your text files as
    such). And the flip side is that if you have a large
    _non_-binary file, there is a correctness improvement;
    before, we did not apply the conversion at all.
    
    The first half of the new t1051 script covers these failures
    on input. The second half tests the matching output code
    paths. These already work correctly, and do not need any
    adjustment.
    
    Signed-off-by: Jeff King <peff@peff.net>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
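
    The decision described above, sketched as the gate in index_fd();
    helper names are paraphrased from the series, not copied verbatim:

        /* stream straight to a pack only when no filter can rewrite
         * the contents; a conversion makes the final size unknowable
         * before we write the object header */
        if (size <= big_file_threshold || type != OBJ_BLOB ||
            (path && would_convert_to_git(path, NULL, 0, 0)))
            ret = index_core(sha1, fd, size, type, path, flags);
        else
            ret = index_stream(sha1, fd, size, type, path, flags);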
Commits on Feb 21, 2012
  1. Merge branch 'nd/find-pack-entry-recent-cache-invalidation' into maint

    authored
    * nd/find-pack-entry-recent-cache-invalidation:
      find_pack_entry(): do not keep packed_git pointer locally
      sha1_file.c: move the core logic of find_pack_entry() into fill_pack_entry()
Commits on Feb 16, 2012
  1. Merge branch 'mm/empty-loose-error-message' into maint

    authored
    * mm/empty-loose-error-message:
      fsck: give accurate error message on empty loose object files
Commits on Feb 13, 2012
  1. Merge branch 'nd/find-pack-entry-recent-cache-invalidation'

    authored
    * nd/find-pack-entry-recent-cache-invalidation:
      find_pack_entry(): do not keep packed_git pointer locally
      sha1_file.c: move the core logic of find_pack_entry() into fill_pack_entry()
  2. Merge branch 'mm/empty-loose-error-message'

    authored
    * mm/empty-loose-error-message:
      fsck: give accurate error message on empty loose object files
Commits on Feb 6, 2012
  1. @moy

    fsck: give accurate error message on empty loose object files

    moy authored committed
    Since 3ba7a06 (A loose object is not corrupt if it
    cannot be read due to EMFILE), "git fsck" on a repository with an empty
    loose object file complains with the error message
    
      fatal: failed to read object <sha1>: Invalid argument
    
    This comes from a failure of mmap on this empty file, which sets errno to
    EINVAL. Instead of calling xmmap on an empty file, we display a clean error
    message ourselves, and return a NULL pointer. The new message is
    
      error: object file .git/objects/09/<rest-of-sha1> is empty
      fatal: loose object <sha1> (stored in .git/objects/09/<rest-of-sha1>) is corrupt
    
    The second line was already there before the regression in 3ba7a06,
    and the first is an additional message that should help the user
    diagnose the problem.
    
    Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
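
    The gist of the fix in map_sha1_file(), sketched:

        /* mmap(2) fails with EINVAL on a zero-length mapping, so
         * catch the empty loose object file first and report it
         * cleanly instead */
        if (!*size) {
            error("object file %s is empty", sha1_file_name(sha1));
            return NULL;
        }
        map = xmmap(NULL, *size, PROT_READ, MAP_PRIVATE, fd, 0);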
Commits on Feb 1, 2012
  1. @pclouds

    find_pack_entry(): do not keep packed_git pointer locally

    pclouds authored committed
    Commit f7c22cc (always start looking up objects in the last used pack
    first - 2007-05-30) introduced a static packed_git* pointer as an
    optimization.  The kept pointer, however, may become invalid if
    free_pack_by_name() happens to free that particular pack.
    
    The current code base does not access packs after calling
    free_pack_by_name(), so this should not be a problem.  Regardless,
    move the pointer out so that free_pack_by_name() can reset it, to
    avoid running into trouble in the future.
    
    Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
    Acked-by: Nicolas Pitre <nico@fluxnic.net>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
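
    Sketched below, with the reset pulled into a hypothetical helper
    invalidate_last_found() for illustration:

        struct packed_git;                          /* opaque here */

        static struct packed_git *last_found_pack;  /* now file scope */

        /* called by free_pack_by_name() for the pack it is freeing */
        static void invalidate_last_found(struct packed_git *freed)
        {
            if (freed == last_found_pack)
                last_found_pack = NULL;
        }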
  2. @pclouds

    sha1_file.c: move the core logic of find_pack_entry() into fill_pack_entry()

    pclouds authored committed

    The new helper function implements the logic to find the offset for the
    object in one pack and fill a pack_entry structure. The next patch will
    restructure the loop and will call the helper from two places.
    
    Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
    Acked-by: Nicolas Pitre <nico@fluxnic.net>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
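
    The helper's shape, sketched from the description; the real
    function may carry extra validation:

        static int fill_pack_entry(const unsigned char *sha1,
                                   struct pack_entry *e,
                                   struct packed_git *p)
        {
            off_t offset = find_pack_entry_one(sha1, p);
            if (!offset)
                return 0;           /* object is not in this pack */
            e->offset = offset;     /* object's offset within the pack */
            e->p = p;
            hashcpy(e->sha1, sha1);
            return 1;
        }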
Commits on Dec 21, 2011
  1. @avar

    Appease Sun Studio by renaming "tmpfile"

    avar authored committed
    On Solaris the system headers define the "tmpfile" name, which'll
    cause Git compiled with Sun Studio 12 Update 1 to whine about us
    redefining the name:
    
        "pack-write.c", line 76: warning: name redefined by pragma redefine_extname declared static: tmpfile     (E_PRAGMA_REDEFINE_STATIC)
        "sha1_file.c", line 2455: warning: name redefined by pragma redefine_extname declared static: tmpfile    (E_PRAGMA_REDEFINE_STATIC)
        "fast-import.c", line 858: warning: name redefined by pragma redefine_extname declared static: tmpfile   (E_PRAGMA_REDEFINE_STATIC)
        "builtin/index-pack.c", line 175: warning: name redefined by pragma redefine_extname declared static: tmpfile    (E_PRAGMA_REDEFINE_STATIC)
    
    Just renaming the "tmpfile" variable to "tmp_file" in the relevant
    places is the easiest way to fix this.
    
    Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commits on Dec 17, 2011
  1. Merge branch 'jc/stream-to-pack'

    authored
    * jc/stream-to-pack:
      bulk-checkin: replace fast-import based implementation
      csum-file: introduce sha1file_checkpoint
      finish_tmp_packfile(): a helper function
      create_tmp_packfile(): a helper function
      write_pack_header(): a helper function
    
    Conflicts:
    	pack.h
Commits on Dec 14, 2011
  1. Merge branch 'nd/misc-cleanups' into maint

    authored
    * nd/misc-cleanups:
      unpack_object_header_buffer(): clear the size field upon error
      tree_entry_interesting: make use of local pointer "item"
      tree_entry_interesting(): give meaningful names to return values
      read_directory_recursive: reduce one indentation level
      get_tree_entry(): do not call find_tree_entry() on an empty tree
      tree-walk.c: do not leak internal structure in tree_entry_len()
Commits on Dec 5, 2011
  1. Merge branch 'nd/misc-cleanups'

    authored
    * nd/misc-cleanups:
      unpack_object_header_buffer(): clear the size field upon error
      tree_entry_interesting: make use of local pointer "item"
      tree_entry_interesting(): give meaningful names to return values
      read_directory_recursive: reduce one indentation level
      get_tree_entry(): do not call find_tree_entry() on an empty tree
      tree-walk.c: do not leak internal structure in tree_entry_len()
Commits on Dec 1, 2011
  1. bulk-checkin: replace fast-import based implementation

    authored
    This extends the earlier approach to stream a large file directly from the
    filesystem to its own packfile, and allows "git add" to send large files
    directly into a single pack. Older code used to spawn fast-import, but the
    new bulk-checkin API replaces it.
    
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commits on Nov 16, 2011
  1. @artagnon

    sha1_file: don't mix enum with int

    artagnon authored committed
    Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commits on Oct 27, 2011
  1. unpack_object_header_buffer(): clear the size field upon error

    authored
    The callers do not use the returned size when the function says
    it did not use any bytes and sets the type to OBJ_BAD, so this
    should not matter in practice, but it is good code hygiene
    anyway.
    
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
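
    The error path, sketched (bitsizeof() is git's bits-per-type
    helper macro):

        if (len <= used || bitsizeof(long) <= shift) {
            error("bad object header");
            size = used = 0;    /* clear the size too, not just used */
            break;
        }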
Commits on Oct 21, 2011
  1. Merge branch 'jk/maint-pack-objects-compete-with-delete'

    authored
    * jk/maint-pack-objects-compete-with-delete:
      downgrade "packfile cannot be accessed" errors to warnings
      pack-objects: protect against disappearing packs
Commits on Oct 14, 2011
  1. @peff

    downgrade "packfile cannot be accessed" errors to warnings

    peff authored committed
    These can happen if another process simultaneously prunes a
    pack. But that is not usually an error condition, because a
    properly-running prune should have repacked the object into
    a new pack. So we will notice that the pack has disappeared
    unexpectedly, print a message, try other packs (possibly
    after re-scanning the list of packs), and find it in the new
    pack.
    
    Acked-by: Nicolas Pitre <nico@fluxnic.net>
    Signed-off-by: Jeff King <peff@peff.net>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
  2. @peff

    pack-objects: protect against disappearing packs

    peff authored committed
    It's possible that while pack-objects is running, a
    simultaneously running prune process might delete a pack
    that we are interested in. Because we load the pack indices
    early on, we know that the pack contains our item, but by
    the time we try to open and map it, it is gone.
    
    Since c715f78, we already protect against this in the normal
    object access code path, but pack-objects accesses the packs
    at a lower level.  In the normal access path, we call
    find_pack_entry, which will call find_pack_entry_one on each
    pack index, which does the actual lookup. If it gets a hit,
    we will actually open and verify the validity of the
    matching packfile (using c715f78's is_pack_valid). If we
    can't open it, we'll issue a warning and pretend that we
    didn't find it, causing us to go on to the next pack (or on
    to loose objects).
    
    Furthermore, we will cache the descriptor to the opened
    packfile. Which means that later, when we actually try to
    access the object, we are likely to still have that packfile
    opened, and won't care if it has been unlinked from the
    filesystem.
    
    Notice the "likely" above. If there is another pack access
    in the interim, and we run out of descriptors, we could
    close the pack. And then a later attempt to access the
    closed pack could fail (we'll try to re-open it, of course,
    but it may have been deleted). In practice, this doesn't
    happen because we tend to look up items and then access them
    immediately.
    
    Pack-objects does not follow this code path. Instead, it
    accesses the packs at a much lower level, using
    find_pack_entry_one directly. This means we skip the
    is_pack_valid check, and may end up with the name of a
    packfile, but no open descriptor.
    
    We can add the same is_pack_valid check here. Unfortunately,
    the access patterns of pack-objects are not quite as nice
    for keeping lookup and object access together. We look up
    each object as we find out about it, and only later, when
    writing the packfile, do we actually access it. Which
    means that the opened packfile may be closed in the interim.
    
    In practice, however, adding this check still has value, for
    three reasons.
    
      1. If you have a reasonable number of packs and/or a
         reasonable file descriptor limit, you can keep all of
         your packs open simultaneously. If this is the case,
         then the race is impossible to trigger.
    
      2. Even if you can't keep all packs open at once, you
         may end up keeping the deleted one open (i.e., you may
         get lucky).
    
      3. The race window is shortened. You may notice early that
         the pack is gone, and not try to access it. Triggering
         the problem without this check means deleting the pack
         any time after we read the list of index files, but
         before we access the looked-up objects.  Triggering it
         with this check means deleting the pack after we do a
         lookup (and successfully access
         the packfile), but before we access the object. Which
         is a smaller window.
    
    Acked-by: Nicolas Pitre <nico@fluxnic.net>
    Signed-off-by: Jeff King <peff@peff.net>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
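
    The added check, sketched in the shape of pack-objects' per-pack
    lookup loop; variable names are paraphrased:

        off_t offset = find_pack_entry_one(sha1, p);
        if (offset && !found_pack) {
            if (!is_pack_valid(p)) {
                /* pack vanished between reading its index and
                 * opening it: warn and try the next pack */
                warning("packfile %s cannot be accessed", p->pack_name);
                continue;
            }
            found_offset = offset;
            found_pack = p;
        }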
Commits on Oct 5, 2011
  1. Merge branch 'wh/normalize-alt-odb-path'

    authored
    * wh/normalize-alt-odb-path:
      sha1_file: normalize alt_odb path before comparing and storing
Commits on Sep 7, 2011
  1. sha1_file: normalize alt_odb path before comparing and storing

    Hui Wang authored committed
    When it needs to compare and add an alt object path to the
    alt_odb_list, we normalize this path first, since comparing
    normalized paths makes it easy to get a correct result.
    
    Use strbuf to replace some string operations, since it is cleaner and
    safer.
    
    Helped-by: Junio C Hamano <gitster@pobox.com>
    Signed-off-by: Hui Wang <Hui.Wang@windriver.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
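
    The normalization step, sketched with git's strbuf and path
    helpers; details paraphrased:

        struct strbuf pathbuf = STRBUF_INIT;

        /* anchor a relative entry at the referring object store
         * before normalizing */
        if (!is_absolute_path(entry) && relative_base) {
            strbuf_addstr(&pathbuf, real_path(relative_base));
            strbuf_addch(&pathbuf, '/');
        }
        strbuf_add(&pathbuf, entry, len);
        normalize_path_copy(pathbuf.buf, pathbuf.buf);
        /* ...then compare against existing alt_odb entries and store */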
Commits on Aug 29, 2011
  1. Merge branch 'jc/maint-clone-alternates'

    authored
    * jc/maint-clone-alternates:
      clone: clone from a repository with relative alternates
      clone: allow more than one --reference
    
    Conflicts:
    	builtin/clone.c
Commits on Aug 23, 2011
  1. Merge branch 'rt/zlib-smaller-window'

    authored
    * rt/zlib-smaller-window:
      test: consolidate definition of $LF
      Tolerate zlib deflation with window size < 32Kb
  2. clone: clone from a repository with relative alternates

    authored
    Cloning from a local repository blindly copies or hardlinks all the files
    under the objects/ hierarchy. This results in two issues:
    
     - If the repository cloned has an "objects/info/alternates" file, and the
       command line of clone specifies --reference, the ones specified on the
       command line get overwritten by the copy from the original repository.
    
     - An entry in an "objects/info/alternates" file can specify the object
       stores it borrows objects from as a path relative to the "objects/"
       directory. When cloning a repository with such an alternates file, if
       the new repository is not sitting next to the original repository, such
       relative paths need to be adjusted so that they can be used in the new
       repository.
    
    This updates add_to_alternates_file() to take the path to the alternate
    object store, including the "/objects" part at the end (earlier, it was
    taking the path to $GIT_DIR and was adding "/objects" itself), as it is
    technically possible to specify in objects/info/alternates file the path
    of a directory whose name does not end with "/objects".
    
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
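
    The calling convention before and after, sketched with a
    hypothetical path:

        /* before: the function appended "/objects" itself */
        add_to_alternates_file("/srv/repos/original.git");

        /* after: the caller names the object store directly, which
         * also permits stores not literally named "objects" */
        add_to_alternates_file("/srv/repos/original.git/objects");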
Commits on Aug 11, 2011
  1. @rtyley

    Tolerate zlib deflation with window size < 32Kb

    rtyley authored committed
    Git currently reports loose objects as 'corrupt' if they've been
    deflated using a window size less than 32Kb, because the
    experimental_loose_object() function doesn't recognise the header
    byte as a zlib header. This patch makes the function tolerant of
    all valid window sizes (15-bit to 8-bit) - but doesn't sacrifice
    its accuracy in distinguishing the standard loose-object format
    from the experimental (now abandoned) format.
    
    On memory-constrained systems zlib may use a much smaller window
    size - working on Agit, I found that Android uses a 4KB window;
    giving a header byte of 0x48, not 0x78. Consequently all loose
    objects generated appear 'corrupt', which is why Agit is a read-only
    Git client at this time - I don't want my client to generate Git
    repos that other clients treat as broken :(
    
    This patch makes Git tolerant of different deflate settings - it
    might appear that it changes experimental_loose_object() to the point
    where it could incorrectly identify the experimental format as the
    standard one, but the two criteria (bitmask & checksum) can only
    give a false result for an experimental object where both of the
    following are true:
    
    1) object size is exactly 8 bytes when uncompressed (bitmask)
    2) ([single-byte in-pack git type&size header] * 256
       + [1st byte of the following zlib header]) % 31 = 0 (checksum)
    
    As it happens, for all possible combinations of valid object type
    (1-4) and window bits (0-7), the only time when the checksum will be
    divisible by 31 is for 0x1838 - i.e. object type *1*, a Commit - which,
    due to the fields all Commit objects must contain, could never be as
    small as 8 bytes in size.
    
    Given this, the combination of the two criteria (bitmask & checksum)
    always correctly determines the buffer format, and is more tolerant
    than the previous version.
    
    The alternative to this patch is simply removing support for the
    experimental format, which I am also totally cool with.
    
    References:
    
    Android uses a 4KB window for deflation:
    http://android.git.kernel.org/?p=platform/libcore.git;a=blob;f=luni/src/main/native/java_util_zip_Deflater.cpp;h=c0b2feff196e63a7b85d97cf9ae5bb2583409c28;hb=refs/heads/gingerbread#l53
    
    Code snippet searching for false positives with the zlib checksum:
    https://gist.github.com/1118177
    
    Signed-off-by: Roberto Tyley <roberto.tyley@guardian.co.uk>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
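
    The two criteria combine into a short test; a sketch, assuming the
    first two bytes of the buffer are in map[] (per RFC 1950: a low
    nibble of 8 means deflate, a clear high bit covers window sizes
    2^8 through 2^15, and the header read as a big-endian 16-bit
    number must be divisible by 31):

        /* return 1 if this looks like a standard zlib header for any
         * valid window size, 0 otherwise */
        static int looks_like_zlib_header(const unsigned char *map)
        {
            unsigned int word = (map[0] << 8) + map[1];
            return (map[0] & 0x8F) == 0x08 && !(word % 31);
        }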
Commits on Aug 5, 2011
  1. Merge branch 'jc/pack-order-tweak'

    authored
    * jc/pack-order-tweak:
      pack-objects: optimize "recency order"
      core: log offset pack data accesses happened