support large packfiles with index v2

Grit has known about the "v2" pack index format for a while.
However, it never actually handled the extended offsets that
we get when indexing packfiles that are larger than 2
gigabytes.

When an object is at an offset smaller than 2G, its byte
offset into the packfile is stored directly in the first
table of 4-byte offset values. Otherwise, the MSB is set on
its entry in the 4-byte table, and the remaining 31 bits
give an index into a table of 8-byte offsets that follows.
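
As a rough illustration of the scheme described above, here is a
minimal sketch that works on raw values rather than on a real index
file. `resolve_offset` and `MSB_FLAG` are hypothetical names used
only for this example; they are not part of grit.

```ruby
# Hypothetical helper, not part of grit: resolves one entry of the
# 4-byte offset table against the 8-byte table that follows it.
MSB_FLAG = 0x80000000

def resolve_offset(entry, extended_table)
  # MSB clear: the entry is the pack offset itself.
  return entry if (entry & MSB_FLAG).zero?

  # MSB set: the low 31 bits index the table of 8-byte offsets.
  index = entry & 0x7fffffff
  high, low = extended_table[index * 8, 8].unpack('NN')
  (high << 32) | low
end

# A small offset is returned unchanged.
small = resolve_offset(0x00001000, '')

# An extended entry pointing at slot 0, which holds 2**32
# encoded as two big-endian 32-bit words.
extended = [1, 0].pack('NN')
large = resolve_offset(0x80000000, extended)
```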

With this patch, grit should handle arbitrarily large packs
(limited only by the pack format itself).

A few notes on the patch itself:

  - I unpack using two "N" formats instead of "Q>", because
    "Q>" is not available in ruby < 1.9.3

  - No automated test is included, because you need a
    packfile that is greater than 2G. I did test it by hand.
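
The first note can be checked without a large packfile. The sketch
below uses a made-up offset of 5 GiB purely for illustration. It
decodes eight big-endian bytes with two "N" directives and, on a
ruby that supports it, compares the result with "Q>".

```ruby
# Illustrative value only: a pack offset of 5 GiB, which needs
# more than 32 bits.
offset = 5 * 1024**3

# Encode as two big-endian 32-bit words (high word first), the
# same layout as one entry in the 8-byte offset table.
bytes = [offset >> 32, offset & 0xffffffff].pack('NN')

# Portable decode: two "N" reads combined by hand.
high, low = bytes.unpack('NN')
decoded = (high << 32) | low

# Single-directive decode; requires ruby >= 1.9.3.
decoded_q = bytes.unpack('Q>')[0]
```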
peff committed Dec 23, 2011
1 parent ff01507 commit 7b5c49eff7c7f12866e5ff3ca554635fed5c7c11
Showing with 7 additions and 0 deletions.
  1. +7 −0 lib/grit/git-ruby/internal/pack.rb
@@ -30,6 +30,7 @@ class PackStorage
SHA1Size = 20
IdxOffsetSize = 4
OffsetSize = 4
+ ExtendedOffsetSize = 8
CrcSize = 4
OffsetStart = FanOutCount * IdxOffsetSize
SHA1Start = OffsetStart + OffsetSize
@@ -214,6 +215,12 @@ def find_object_in_index(idx, sha1)
else
pos = OffsetStart + (@size * (SHA1Size + CrcSize)) + (mid * OffsetSize)
offset = idx[pos, OffsetSize].unpack('N')[0]
+ if offset & 0x80000000 > 0
+ offset &= 0x7fffffff
+ pos = OffsetStart + (@size * (SHA1Size + CrcSize + OffsetSize)) + (offset * ExtendedOffsetSize)
+ words = idx[pos, ExtendedOffsetSize].unpack('NN')
+ offset = (words[0] << 32) | words[1]
+ end
return offset
end
else
