Commits on Apr 20, 2010
  1. @spearce @gitster

    Extract verify_pack_index for reuse from verify_pack

    spearce authored gitster committed
    The dumb HTTP transport should verify an index is completely valid
    before trying to use it.  That requires checking the header/footer
    but also checking the complete content SHA-1.  All of this logic is
    already in the front half of verify_pack, so pull it out into a new
    function that can be reused.
    Signed-off-by: Shawn O. Pearce <>
    Signed-off-by: Junio C Hamano <>
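The checks described in this commit (header first, then a checksum over the complete content) can be sketched roughly as follows. This is a minimal illustration, not the actual verify_pack_index: the `TIDX` magic, the `toy_*` names, and the 4-byte trailer are all hypothetical, and FNV-1a stands in for SHA-1 so the example needs no external library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 4-byte magic that starts our toy index format. */
static const unsigned char TOY_MAGIC[4] = { 'T', 'I', 'D', 'X' };

/* FNV-1a: a stand-in for SHA-1 so the sketch has no dependencies. */
static uint32_t toy_hash(const unsigned char *p, size_t n)
{
	uint32_t h = 2166136261u;
	for (size_t i = 0; i < n; i++) {
		h ^= p[i];
		h *= 16777619u;
	}
	return h;
}

/* Store the hash of everything before the trailer, big-endian, in the last 4 bytes. */
static void toy_seal_index(unsigned char *buf, size_t len)
{
	assert(len >= 8);
	uint32_t h = toy_hash(buf, len - 4);
	buf[len - 4] = (unsigned char)(h >> 24);
	buf[len - 3] = (unsigned char)(h >> 16);
	buf[len - 2] = (unsigned char)(h >> 8);
	buf[len - 1] = (unsigned char)h;
}

/* Return 0 if the header and the trailing checksum both match, -1 otherwise. */
static int toy_verify_index(const unsigned char *buf, size_t len)
{
	if (len < 8)
		return -1;	/* too short to hold magic plus trailer */
	if (memcmp(buf, TOY_MAGIC, 4))
		return -1;	/* bad header */
	uint32_t want = ((uint32_t)buf[len - 4] << 24) |
			((uint32_t)buf[len - 3] << 16) |
			((uint32_t)buf[len - 2] << 8) |
			(uint32_t)buf[len - 1];
	return toy_hash(buf, len - 4) == want ? 0 : -1;
}
```

Keeping the check in its own function means a caller that has only just downloaded an index can reject a damaged file before trusting any of its entries.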
Commits on May 27, 2007
  1. @spearce

    Lazily open pack index files on demand

    spearce authored Junio C Hamano committed
    In some repository configurations the user may have many packfiles,
    but all of the recent commits/trees/tags/blobs are likely to
    be in the most recent packfile (the one with the newest mtime).
    It is therefore common to be able to complete an entire operation
    by accessing only one packfile, even if there are 25 packfiles
    available to the repository.
    Rather than opening and mmaping the corresponding .idx file for
    every pack found, we now only open and map the .idx when we suspect
    there might be an object of interest in there.
    Of course we cannot know in advance which packfile contains an
    object, so we still need to scan the entire packed_git list to
    locate anything.  But odds are users want to access objects in the
    most recently created packfiles first, and that may be all they
    ever need for the current operation.
    Junio observed in b867092 that placing recent packfiles before
    older ones can slightly improve access times for recent objects,
    without degrading it for historical object access.
    This change improves upon Junio's observations by trying even harder
    to avoid the .idx files that we won't need.
    Signed-off-by: Shawn O. Pearce <>
    Signed-off-by: Junio C Hamano <>
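The lazy-open idea in this commit can be sketched as below. The `lazy_pack` struct and both functions are hypothetical stand-ins, and "opening" the index is simulated with a flag rather than a real open and mmap; the point is only that a pack's index is touched the first time the search reaches it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a packfile and its index. */
struct lazy_pack {
	struct lazy_pack *next;		/* list is ordered newest first */
	const char *const *ids;		/* pretend "on-disk" index contents */
	size_t nr_ids;
	int index_loaded;		/* 0 until the index is first needed */
	int load_count;			/* how many times we "opened" the .idx */
};

/* Simulates opening and mapping the .idx file; does the work at most once. */
static void lazy_open_index(struct lazy_pack *p)
{
	if (p->index_loaded)
		return;
	p->index_loaded = 1;
	p->load_count++;
}

/* Walk the list newest first, loading each index only when we reach it. */
static struct lazy_pack *lazy_find_pack(struct lazy_pack *list, const char *id)
{
	for (struct lazy_pack *p = list; p; p = p->next) {
		lazy_open_index(p);
		for (size_t i = 0; i < p->nr_ids; i++)
			if (!strcmp(p->ids[i], id))
				return p;
	}
	return NULL;
}
```

If every lookup in an operation is satisfied by the newest pack, the older packs in the list never have their index loaded at all.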
Commits on Mar 7, 2007
  1. @spearce

    Use off_t when we really mean a file offset.

    spearce authored Junio C Hamano committed
    Not all platforms have declared 'unsigned long' to be a 64 bit value,
    but we want to support a 64 bit packfile (or close enough anyway)
    in the near future as some projects are getting large enough that
    their packed size exceeds 4 GiB.
    By using off_t, the POSIX type that is declared to mean an offset
    within a file, we support whatever maximum file size the underlying
    operating system will handle.  For most modern systems this is up
    around 2^60 or higher.
    Signed-off-by: Shawn O. Pearce <>
    Signed-off-by: Junio C Hamano <>
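As a small illustration of why the type matters, the sketch below computes an offset past 4 GiB in off_t. The helper names are hypothetical, and the `_FILE_OFFSET_BITS` define is an assumption about glibc-style systems, where it requests a 64-bit off_t on 32-bit builds; other platforms may need a different switch or none.

```c
#define _FILE_OFFSET_BITS 64	/* assumption: glibc-style request for a 64-bit off_t */
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical helper: offset of record n when every record is rec_size bytes. */
static off_t toy_record_offset(off_t base, uint32_t n, uint32_t rec_size)
{
	/* Widen to off_t before multiplying so the product is not truncated. */
	return base + (off_t)n * rec_size;
}

/* Returns 1 if off_t can represent offsets past 4 GiB on this platform. */
static int toy_off_t_is_wide(void)
{
	return sizeof(off_t) >= 8;
}
```

With a 32-bit 'unsigned long' the same arithmetic would wrap around silently once the product exceeded 2^32-1.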
  2. @spearce

    Use uint32_t for all packed object counts.

    spearce authored Junio C Hamano committed
    As we permit up to 2^32-1 objects in a single packfile, we cannot
    use a signed int to represent the object offset within a packfile;
    after 2^31-1 objects we will start seeing negative indexes and
    error out or compute bad addresses within the mmap'd index.
    This is a minor cleanup that does not introduce any significant
    logic changes.  It is roach free.
    Signed-off-by: Shawn O. Pearce <>
    Signed-off-by: Junio C Hamano <>
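The address computation this commit protects can be sketched as follows. The layout constants are an assumption for illustration (a 256-slot fan-out table of 4-byte counts followed by 24-byte entries), and the `toy_*` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: 256 four-byte fan-out slots, then 24-byte entries. */
#define TOY_FANOUT_BYTES (256u * 4u)
#define TOY_ENTRY_BYTES  24u

/* Byte position of entry n within the index; 64-bit math avoids wraparound. */
static uint64_t toy_entry_pos(uint32_t n)
{
	return (uint64_t)TOY_FANOUT_BYTES + (uint64_t)n * TOY_ENTRY_BYTES;
}

/* Returns 1 if n could not be held in a non-negative signed 32-bit int. */
static int toy_overflows_int32(uint32_t n)
{
	return n > (uint32_t)INT32_MAX;
}
```

Any entry number for which `toy_overflows_int32()` returns 1 is one that a signed int would have turned into a negative index.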
Commits on Dec 29, 2006
  1. @spearce

    Loop over pack_windows when inflating/accessing data.

    spearce authored Junio C Hamano committed
    When multiple mmaps start getting used for all pack file access it
    is not possible to get all data associated with a specific object
    in one contiguous memory region.  This limitation prevents simply
    passing a single address and length to SHA1_Update or to inflate.
    Instead we need to loop until we have processed all data of interest.
    As we loop over the data we are always interested in reusing the same
    window 'cursor', as the prior window will no longer be of any use
    to us.  This allows the use_pack() call to automatically decrement
    the use count of the prior window before setting up access for us
    to the next window.
    Within each loop we need to make use of the available length output
    parameter of use_pack() to tell us how many bytes are available in
    the current memory region, as we cannot tell otherwise.
    Signed-off-by: Shawn O. Pearce <>
    Signed-off-by: Junio C Hamano <>
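The loop described in this commit can be sketched as below. `win_use_pack()` is a hypothetical stand-in for use_pack() that works on an in-memory buffer with a deliberately tiny window, and a byte sum stands in for SHA1_Update or inflate; only the shape of the loop is the point.

```c
#include <assert.h>
#include <stddef.h>

#define WIN_SIZE 4	/* deliberately tiny so that data spans several windows */

struct win_pack {
	const unsigned char *data;
	size_t size;
};

/*
 * Stand-in for use_pack(): returns the address of the byte at offset and
 * reports how many bytes remain before the current window (or the pack) ends.
 */
static const unsigned char *win_use_pack(const struct win_pack *p,
					 size_t offset, size_t *avail)
{
	assert(offset < p->size);
	size_t win_end = (offset / WIN_SIZE + 1) * WIN_SIZE;
	if (win_end > p->size)
		win_end = p->size;
	*avail = win_end - offset;
	return p->data + offset;
}

/* Add up len bytes starting at offset, one window-sized chunk at a time. */
static unsigned long win_sum(const struct win_pack *p, size_t offset, size_t len)
{
	unsigned long sum = 0;
	while (len) {
		size_t avail;
		const unsigned char *in = win_use_pack(p, offset, &avail);
		if (avail > len)
			avail = len;	/* never read past the data of interest */
		for (size_t i = 0; i < avail; i++)
			sum += in[i];
		offset += avail;
		len -= avail;
	}
	return sum;
}
```

Each pass consumes only what the current window offers, then asks again at the advanced offset, which is why the available-length output is essential.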
  2. @spearce

    Replace use_packed_git with window cursors.

    spearce authored Junio C Hamano committed
    Part of the implementation concept of the sliding mmap window for
    pack access is to permit multiple windows per pack to be mapped
    independently.  Since the inuse_cnt is associated with the mmap and
    not with the file, this value is in struct pack_window and needs to
    be incremented/decremented for each pack_window accessed by any code.
    To facilitate that implementation we need to replace all uses of
    use_packed_git() and unuse_packed_git() with a different API that
    follows struct pack_window objects rather than struct packed_git.
    The way this works is when we need to start accessing a pack for
    the first time we should set up a new window 'cursor' by declaring
    a local and setting it to NULL:
      struct pack_window *w_curs = NULL;
    To obtain the memory region which contains a specific section of
    the pack file we invoke use_pack(), supplying the address of our
    current window cursor:
      unsigned int len;
      unsigned char *addr = use_pack(p, &w_curs, offset, &len);
    The returned address `addr` will be the first byte at `offset`
    within the pack file.  The optional variable len will also be
    updated with the number of bytes remaining following the address.
    Multiple calls to use_pack() with the same window cursor will
    update the window cursor, moving it from one window to another
    when necessary.  In this way each window cursor variable maintains
    only one struct pack_window inuse at a time.
    Finally before exiting the scope which originally declared the window
    cursor we must invoke unuse_pack() to unuse the current window (which
    may be different from the one that was first obtained from use_pack):
      unuse_pack(&w_curs);
    This implementation is still not complete with regard to multiple
    windows, as only one window per pack file is supported right now.
    Signed-off-by: Shawn O. Pearce <>
    Signed-off-by: Junio C Hamano <>
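The cursor discipline described in this commit can be sketched as follows. This is a toy model of the intended multi-window concept, not the code in the commit (which supports only one window per pack): `curs_use()` and `curs_unuse()` are hypothetical stand-ins for use_pack() and unuse_pack(), and windows are fixed slots over an in-memory buffer rather than real mmap regions.

```c
#include <assert.h>
#include <stddef.h>

#define CURS_WINDOW 4
#define CURS_MAX_WINDOWS 8

struct curs_window {
	size_t offset;		/* first pack byte covered by this window */
	unsigned int inuse_cnt;	/* number of cursors currently pointing here */
};

struct curs_pack {
	const unsigned char *data;
	size_t size;
	struct curs_window windows[CURS_MAX_WINDOWS];
};

/* Release whatever window the cursor holds, then clear the cursor. */
static void curs_unuse(struct curs_window **w_curs)
{
	if (*w_curs) {
		(*w_curs)->inuse_cnt--;
		*w_curs = NULL;
	}
}

/* Point the cursor at the window covering offset and return that byte's address. */
static const unsigned char *curs_use(struct curs_pack *p,
				     struct curs_window **w_curs,
				     size_t offset, size_t *left)
{
	assert(offset < p->size && offset / CURS_WINDOW < CURS_MAX_WINDOWS);
	struct curs_window *win = &p->windows[offset / CURS_WINDOW];
	if (win != *w_curs) {
		curs_unuse(w_curs);	/* drop the prior window first */
		win->offset = offset / CURS_WINDOW * CURS_WINDOW;
		win->inuse_cnt++;
		*w_curs = win;
	}
	if (left) {	/* optional output, as with len in use_pack() */
		size_t end = win->offset + CURS_WINDOW;
		*left = (end < p->size ? end : p->size) - offset;
	}
	return p->data + offset;
}
```

Because moving the cursor releases the previous window automatically, a caller holds at most one window in use per cursor and needs only a single `curs_unuse()` before leaving scope.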
  3. @spearce

    Refactor packed_git to prepare for sliding mmap windows.

    spearce authored Junio C Hamano committed
    The idea behind the sliding mmap window pack reader implementation
    is to have multiple mmap regions active against the same pack file,
    thereby allowing the process to mmap in only the active/hot sections
    of the pack and reduce overall virtual address space usage.
    To implement this we need to refactor the mmap related data
    (pack_base, pack_use_cnt) out of struct packed_git and move them
    into a new struct pack_window.
    We are refactoring the code to support a single struct pack_window
    per packfile, thereby emulating the prior behavior of mmap'ing the
    entire pack file.
    Signed-off-by: Shawn O. Pearce <>
    Signed-off-by: Junio C Hamano <>
  4. @spearce

    Replace unpack_entry_gently with unpack_entry.

    spearce authored Junio C Hamano committed
    The unpack_entry_gently function currently has only two callers:
    the delta base resolution in sha1_file.c and the main loop of
    pack-check.c.  Both of these must change to using unpack_entry
    directly when we implement sliding window mmap logic, so I'm doing
    it earlier to help break down the change set.
    This may cause a slight performance decrease for delta base
    resolution as well as for pack-check.c's verify_packfile(), as
    the pack use counter will be incremented and decremented for every
    object that is unpacked.
    Signed-off-by: Shawn O. Pearce <>
    Signed-off-by: Junio C Hamano <>