vijairaj edited this page Aug 12, 2015 · 8 revisions

Current directory cache

The data structure of the current directory cache, stored in .dircache/index, is defined in cache.h. The file consists of one header followed by the cache entries:

Cache Header
Cache Entry 1
Cache Entry 2
...
Cache Entry N

Cache Header

struct cache_header {
    unsigned int signature; // 0x44495243 = "DIRC"
    unsigned int version;
    unsigned int entries;
    unsigned char sha1[20];
};
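As a minimal sketch of how a reader might check this header, the function below copies the leading bytes of an index buffer into the struct and compares the signature. It assumes the fields are stored in host byte order; the function name read_cache_header and the CACHE_SIGNATURE macro are hypothetical names chosen for this example, and the sha1 field is not verified here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CACHE_SIGNATURE 0x44495243u /* "DIRC" (name assumed for this sketch) */

struct cache_header {
    unsigned int signature;
    unsigned int version;
    unsigned int entries;
    unsigned char sha1[20];
};

/*
 * Copy the header out of buf and check its signature.
 * Returns 0 on success, -1 if the buffer is too short or the
 * signature does not match. Assumes host byte order on disk.
 */
int read_cache_header(const void *buf, size_t len, struct cache_header *out)
{
    assert(buf != NULL && out != NULL);
    if (len < sizeof(*out))
        return -1;
    memcpy(out, buf, sizeof(*out));
    if (out->signature != CACHE_SIGNATURE)
        return -1;
    return 0;
}
```

On success, out->entries tells the caller how many cache entries follow the header.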

Cache Entry

struct cache_entry {
    struct cache_time ctime;
    struct cache_time mtime;
    unsigned int st_dev;
    unsigned int st_ino;
    unsigned int st_mode;
    unsigned int st_uid;
    unsigned int st_gid;
    unsigned int st_size;
    unsigned char sha1[20];
    unsigned short namelen;
    unsigned char name[0];
};
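Because name is a variable-length trailing member, entries do not have a fixed size, and a reader has to compute where the next entry starts. The sketch below shows one way to do that. It rests on assumptions not stated on this page: that struct cache_time holds two unsigned int fields (sec and nsec), and that each entry is padded with at least one NUL byte after the name and rounded up to a multiple of 8 bytes. The helper names are illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout: this page does not define struct cache_time. */
struct cache_time {
    unsigned int sec;
    unsigned int nsec;
};

struct cache_entry {
    struct cache_time ctime;
    struct cache_time mtime;
    unsigned int st_dev;
    unsigned int st_ino;
    unsigned int st_mode;
    unsigned int st_uid;
    unsigned int st_gid;
    unsigned int st_size;
    unsigned char sha1[20];
    unsigned short namelen;
    unsigned char name[]; /* C99 flexible array member; the page writes name[0] */
};

/*
 * Assumed size of an entry whose name is len bytes long:
 * fixed fields + name + at least one NUL, rounded up to a multiple of 8.
 */
size_t cache_entry_size(size_t len)
{
    return (offsetof(struct cache_entry, name) + len + 8) & ~(size_t)7;
}

/* Step from one entry to the next under the padding assumption above. */
const struct cache_entry *next_cache_entry(const struct cache_entry *ce)
{
    assert(ce != NULL);
    return (const struct cache_entry *)
        ((const char *)ce + cache_entry_size(ce->namelen));
}
```

A reader would start just past the cache header and call next_cache_entry once per entry, up to the entries count in the header.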

Object database

Please note that the entire data structure (object header plus object data) is deflated with zlib before it is stored on disk.

Object Header
Object Data

Object Header

Name             Value
object type      blob, tree, or commit
space            ' '
object size      "NNNN" (size of the object data, as an ASCII decimal number)
null terminator  '\0'

Object Data

Object data varies according to the type of the object stored. There are three types of objects:

  1. blob
  2. tree
  3. commit

Sample

  1. blob NNNN\0[N bytes of blob data]
  2. tree NNNN\0[N bytes of tree data]
  3. commit NNNN\0[N bytes of commit data]
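The header layout described above can be sketched in C as a pair of helpers: one that writes the type, a space, the decimal size and the terminating NUL, and one that splits an already inflated buffer back into type, size and data offset. Both function names are hypothetical, and the sketch assumes the size is written as plain ASCII decimal with no padding.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Write "<type> <size>\0" into out.
 * Returns the header length including the NUL, or -1 if it does not fit.
 */
int format_object_header(char *out, size_t cap,
                         const char *type, unsigned long size)
{
    int n;

    assert(out != NULL && type != NULL);
    n = snprintf(out, cap, "%s %lu", type, size);
    if (n < 0 || (size_t)n >= cap)
        return -1;
    return n + 1; /* count the terminating NUL as part of the header */
}

/*
 * Parse the header at the start of an inflated object.
 * Returns the offset of the object data, or -1 if the header is malformed.
 */
int parse_object_header(const unsigned char *buf, size_t len,
                        char *type, size_t type_cap, unsigned long *size)
{
    const unsigned char *nul;
    const char *start = (const char *)buf, *sp;
    char *end;

    assert(buf != NULL && type != NULL && size != NULL);
    nul = memchr(buf, '\0', len);
    if (nul == NULL)
        return -1;
    sp = strchr(start, ' '); /* safe: a NUL exists inside the buffer */
    if (sp == NULL || (size_t)(sp - start) >= type_cap)
        return -1;
    if (sp[1] < '0' || sp[1] > '9')
        return -1;
    memcpy(type, start, (size_t)(sp - start));
    type[sp - start] = '\0';
    *size = strtoul(sp + 1, &end, 10);
    if (end != (const char *)nul)
        return -1;
    return (int)(nul - buf) + 1;
}
```

For a 1234-byte blob, format_object_header produces the ten bytes "blob 1234" plus NUL, and parse_object_header on that buffer returns 10, the offset at which the blob data would begin.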

blob

A "blob" object is nothing but a binary blob of data, and doesn't refer to anything else. There is no signature or any other verification of the data, so while the object is consistent (it is indexed by its sha1 hash, so the data itself is certainly correct), it has absolutely no other attributes. No name associations, no permissions. It is purely a blob of data (i.e. normally "file contents").

tree

Data structure

File Info 1
File Info 2
File Info 3
File Info 4
...

File info

Name      Value
mode      ASCII (octal) representation of stat.st_mode
space     ' '
filename  File name
null      '\0'
sha1      20-byte SHA1 of the above file (raw bytes, not hex)
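A reader for one File info record might look like the sketch below. It assumes the mode is written as octal ASCII digits and that the 20 SHA1 bytes follow the NUL directly; struct tree_entry and parse_tree_entry are names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory view of one record; pointers alias the input. */
struct tree_entry {
    unsigned int mode;
    const char *name;          /* NUL-terminated, inside the buffer */
    const unsigned char *sha1; /* 20 raw bytes, inside the buffer */
};

/*
 * Parse one "<mode> <filename>\0<20-byte sha1>" record from buf.
 * Returns the number of bytes consumed, or 0 if the record is malformed
 * or truncated.
 */
size_t parse_tree_entry(const unsigned char *buf, size_t len,
                        struct tree_entry *e)
{
    const unsigned char *nul;
    char *end;

    assert(buf != NULL && e != NULL);
    nul = memchr(buf, '\0', len);
    if (nul == NULL || (size_t)(buf + len - nul) < 1 + 20)
        return 0;
    if (buf[0] < '0' || buf[0] > '7')
        return 0;
    e->mode = (unsigned int)strtoul((const char *)buf, &end, 8);
    if (*end != ' ')
        return 0;
    e->name = end + 1;
    e->sha1 = nul + 1;
    return (size_t)(nul - buf) + 1 + 20;
}
```

Calling this repeatedly, advancing by the returned count each time, walks the whole tree data until the buffer is exhausted.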

Description

A tree object is a list of permission/name/blob data, sorted by name. In other words the tree object is uniquely determined by the set contents, and so two separate but identical trees will always share the exact same object.

Side note on trees: since a "tree" object is a sorted list of "filename+content", you can create a diff between two trees without actually having to unpack two trees. Just ignore all common parts, and your diff will look right. In other words, you can effectively (and efficiently) tell the difference between any two random trees in O(n), where "n" is the size of the difference, rather than the size of the tree.
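One simple way to exploit the sorted order is a merge-style walk over two entry lists, sketched below. This is an illustration rather than the tool's own diff: it visits every entry once, so it is linear in the combined size of both lists, and it uses hex strings in place of raw SHA1 bytes to keep the example short. struct entry and diff_sorted are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Simplified entry: a name plus its SHA1 as a hex string (an assumption). */
struct entry {
    const char *name;
    const char *sha1_hex;
};

/*
 * Walk two name-sorted lists in lockstep and print what differs:
 * "-" for names only in a, "+" for names only in b, "M" for names
 * present in both with different SHA1s. Returns the number of differences.
 */
int diff_sorted(const struct entry *a, size_t na,
                const struct entry *b, size_t nb)
{
    size_t i = 0, j = 0;
    int changes = 0;

    assert((a != NULL || na == 0) && (b != NULL || nb == 0));
    while (i < na || j < nb) {
        int cmp;

        if (i == na)
            cmp = 1;  /* a is exhausted: the rest of b was added */
        else if (j == nb)
            cmp = -1; /* b is exhausted: the rest of a was removed */
        else
            cmp = strcmp(a[i].name, b[j].name);

        if (cmp < 0) {
            printf("- %s\n", a[i++].name);
            changes++;
        } else if (cmp > 0) {
            printf("+ %s\n", b[j++].name);
            changes++;
        } else {
            if (strcmp(a[i].sha1_hex, b[j].sha1_hex) != 0) {
                printf("M %s\n", a[i].name);
                changes++;
            }
            i++;
            j++;
        }
    }
    return changes;
}
```

Entries whose name and SHA1 both match are skipped without looking at any file contents, which is the "ignore all common parts" step from the note above.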

Side note 2 on trees: since the name of a "blob" depends entirely and exclusively on its contents (i.e. there are no names or permissions involved), you can see trivial renames or permission changes by noticing that the blob stayed the same. However, renames with data changes need a smarter "diff" implementation.

commit / changeset

Data structure

tree sha1-of-tree
[parent sha1-of-parent]*
author author-gecos <author-email> authored-date
committer committer-gecos <committer-email> committed-date

<commit-message.........................................
.......................................................>
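To make the layout above concrete, the sketch below assembles the text of a commit with zero or one parent. It is only an illustration: format_commit is a hypothetical helper, the author and committer arguments are expected to arrive already formatted as "gecos <email> date", and the format of the date itself is left to the caller because this page does not specify it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the commit text into out. parent_hex may be NULL for a root commit.
 * Returns the number of bytes written (excluding the NUL), or -1 if the
 * text does not fit in cap bytes.
 */
int format_commit(char *out, size_t cap,
                  const char *tree_hex, const char *parent_hex,
                  const char *author, const char *committer,
                  const char *message)
{
    int n;

    assert(out != NULL && tree_hex != NULL && author != NULL);
    assert(committer != NULL && message != NULL);
    if (parent_hex != NULL)
        n = snprintf(out, cap,
                     "tree %s\nparent %s\nauthor %s\ncommitter %s\n\n%s",
                     tree_hex, parent_hex, author, committer, message);
    else
        n = snprintf(out, cap,
                     "tree %s\nauthor %s\ncommitter %s\n\n%s",
                     tree_hex, author, committer, message);
    if (n < 0 || (size_t)n >= cap)
        return -1;
    return n;
}
```

A merge would need one parent line per parent; extending the helper to take an array of parents is a straightforward variation.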

Description

The "changeset" object is an object that introduces the notion of history into the picture. In contrast to the other objects, it doesn't just describe the physical state of a tree, it describes how we got there, and why.

A "changeset" is defined by the tree-object that it results in, the parent changesets (zero, one or more) that led up to that point, and a comment on what happened. Again, a changeset is not trusted per se: the contents are well-defined and "safe" due to the cryptographically strong signatures at all levels, but there is no reason to believe that the tree is "good" or that the merge information makes sense. The parents do not have to actually have any relationship with the result, for example.

Note on changesets: unlike real SCMs, changesets do not contain rename information or file mode change information. All of that is implicit in the trees involved (the result tree, and the result trees of the parents), and describing that makes no sense in this idiotic file manager.
