pack-format.txt: document sizes at start of delta data
We document the delta data as a set of instructions, but forget to
document the two sizes that precede those instructions: the size of the
base object and the size of the object to be reconstructed. Fix this
omission.

Rather than cramming all the details about the encoding into the running
text, introduce a separate section detailing our "size encoding" and
refer to it.

Reported-by: Ross Light <ross@zombiezen.com>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Martin Ågren authored and gitster committed Jan 4, 2021
1 parent 898f807 commit 7b77f5a
Showing 1 changed file with 16 additions and 1 deletion.
17 changes: 16 additions & 1 deletion Documentation/technical/pack-format.txt
@@ -55,6 +55,18 @@ Valid object types are:

Type 5 is reserved for future expansion. Type 0 is invalid.

+=== Size encoding
+
+This document uses the following "size encoding" of non-negative
+integers: From each byte, the seven least significant bits are
+used to form the resulting integer. As long as the most significant
+bit is 1, this process continues; the byte with MSB 0 provides the
+last seven bits. The seven-bit chunks are concatenated. Later
+values are more significant.
+
+This size encoding should not be confused with the "offset encoding",
+which is also used in this document.
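
As an illustration of the size encoding added above, here is a minimal Python sketch. It is not Git's implementation; the names `decode_size` and `encode_size` are hypothetical and chosen only for this example. It assumes the input is a `bytes` object and that "later values are more significant" means each following byte contributes the next higher group of seven bits.

```python
from typing import Tuple


def decode_size(data: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode one size-encoded integer starting at data[pos].

    Returns (value, next_pos), where next_pos is the index of the first
    byte after the encoded integer.
    """
    value = 0
    shift = 0
    while True:
        if pos >= len(data):
            raise ValueError("truncated size encoding")
        byte = data[pos]
        pos += 1
        # The seven least significant bits carry data; later bytes
        # are more significant, so each chunk is shifted further left.
        value |= (byte & 0x7F) << shift
        shift += 7
        # An MSB of 0 marks the last byte of the integer.
        if not byte & 0x80:
            return value, pos


def encode_size(value: int) -> bytes:
    """Encode a non-negative integer with the same scheme (hypothetical helper)."""
    if value < 0:
        raise ValueError("size must be non-negative")
    out = bytearray()
    while True:
        chunk = value & 0x7F
        value >>= 7
        if value:
            out.append(chunk | 0x80)  # more bytes follow
        else:
            out.append(chunk)         # last byte, MSB 0
            return bytes(out)
```

For example, the two bytes `0x91 0x2e` decode to 17 + (46 << 7) = 5905, and `encode_size(5905)` produces the same two bytes.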

=== Deltified representation

Conceptually there are only four object types: commit, tree, tag and
@@ -73,7 +85,10 @@ Ref-delta can also refer to an object outside the pack (i.e. the
so-called "thin pack"). When stored on disk however, the pack should
be self contained to avoid cyclic dependency.

-The delta data is a sequence of instructions to reconstruct an object
+The delta data starts with the size of the base object and the
+size of the object to be reconstructed. These sizes are
+encoded using the size encoding from above. The remainder of
+the delta data is a sequence of instructions to reconstruct the object
from the base object. If the base object is deltified, it must be
converted to canonical form first. Each instruction appends more and
more data to the target object until it's complete. There are two
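
The layout described in this hunk (two sizes, then instructions) could be read roughly as follows. This is a sketch under assumptions, not Git's code: `read_delta_header` and `_read_size` are hypothetical names, and the delta data is assumed to be already decompressed into a `bytes` object.

```python
from typing import Tuple


def _read_size(data: bytes, pos: int) -> Tuple[int, int]:
    """Read one size-encoded integer; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        if pos >= len(data):
            raise ValueError("truncated size in delta header")
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift  # later bytes are more significant
        shift += 7
        if not byte & 0x80:              # MSB 0 ends the integer
            return value, pos


def read_delta_header(delta: bytes) -> Tuple[int, int, int]:
    """Return (base_size, result_size, offset of the first instruction)."""
    base_size, pos = _read_size(delta, 0)
    result_size, pos = _read_size(delta, pos)
    return base_size, result_size, pos


# Hypothetical delta data: base size 5905 (0x91 0x2e), result size 5
# (0x05), followed by instruction bytes that are not interpreted here.
base_size, result_size, start = read_delta_header(bytes([0x91, 0x2E, 0x05, 0x90]))
```

A reader of delta data would typically compare `base_size` against the length of the base object it has in hand before applying the instructions that begin at `start`; that check is a design suggestion of this sketch, not something the patch text specifies.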
