## Git Data Model

![git data model](./images/git-data-model.jpg)


## Git's objects

- Git has four types of objects:
    - blobs (file contents)
    - trees (directories)
    - commits
    - tags (annotated tags)

- Branches are not objects, but they are an important concept: a branch is a named reference to a commit.

- HEAD is a pointer to the currently checked-out branch or commit.
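As a quick way to see the four object types side by side, the sketch below builds a throwaway repository and asks `git cat-file -t` for the type of each kind of object. The directory, file name, tag name, and identity values are hypothetical and only exist for this illustration; it does not touch the data-model clone.

```shell
set -e
# Throwaway repository in a temporary directory (hypothetical demo data).
repo=$(mktemp -d) && cd "$repo"
git init -q .
# Identity used only for this sketch.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

mkdir sub
echo 'hello' > sub/file.txt
git add sub/file.txt
git commit -q -m 'first commit'
git tag -a -m 'first release' v0.1   # annotated tag, stored as its own object

# <rev>:<path> names the blob or tree found at <path> in that revision.
blob_type=$(git cat-file -t HEAD:sub/file.txt)
tree_type=$(git cat-file -t HEAD:sub)
commit_type=$(git cat-file -t HEAD)
tag_type=$(git cat-file -t v0.1)
echo "$blob_type $tree_type $commit_type $tag_type"
```

The same `git cat-file -t <object>` call works on any id printed in the cells that follow.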

## Reference repo

In [2]:
git clone https://github.com/dvaske/data-model.git

Cloning into 'data-model'...
remote: Enumerating objects: 45, done.
remote: Counting objects: 100% (45/45), done.
remote: Compressing objects: 100% (26/26), done.
remote: Total 45 (delta 11), reused 45 (delta 11), pack-reused 0
Unpacking objects: 100% (45/45), done.


In [3]:
cd data-model

In [4]:
ls

another-file.txt  a_sub_directory  cat-me.txt  hello_world.c  README.md


In [5]:
git help

usage: git [--version] [--help] [-C <path>] [-c <name>=<value>]
           [--exec-path[=<path>]] [--html-path] [--man-path] [--info-path]
           [-p | --paginate | -P | --no-pager] [--no-replace-objects] [--bare]
           [--git-dir=<path>] [--work-tree=<path>] [--namespace=<name>]
           <command> [<args>]

These are common Git commands used in various situations:

start a working area (see also: git help tutorial)
   clone      Clone a repository into a new directory
   init       Create an empty Git repository or reinitialize an existing one

work on the current change (see also: git help everyday)
   add        Add file contents to the index
   mv         Move or rename a file, a directory, or a symlink
   reset      Reset current HEAD to the specified state
   rm         Remove files from the working tree and from the index

examine the history and state (see also: git help revisions)
   bisect     Use binary search to find the commit that introduced a bug
   grep      

In [6]:
git help -a

See 'git help <command>' to read about a specific subcommand

Main Porcelain Commands
   add                  Add file contents to the index
   am                   Apply a series of patches from a mailbox
   archive              Create an archive of files from a named tree
   bisect               Use binary search to find the commit that introduced a bug
   branch               List, create, or delete branches
   bundle               Move objects and refs by archive
   checkout             Switch branches or restore working tree files
   cherry-pick          Apply the changes introduced by some existing commits
   citool               Graphical alternative to git-commit
   clean                Remove untracked files from the working tree
   clone                Clone a repository into a new directory
   commit               Record changes to the repository
   describe             Give an object a human readable name based on an available ref
   diff                 Show changes betwee

In [7]:
git cat-file -p HEAD

tree 34fa038544bcd9aed660c08320214bafff94150b
parent a90d1906337a6d75f1dc32da647931f932500d83
author Aske Olsson <aske.olsson@switch-gears.dk> 1386933960 +0100
committer Aske Olsson <aske.olsson@switch-gears.dk> 1386941455 +0100

This is the subject line of the commit message

It should be followed by a blank line then the body, which is this text. Here
you can have multiple paragraphs etc. and explain your commit. It's like an
email with subject and body, so get people's attention in the subject


In [8]:
git log --oneline

34acc37 (HEAD -> master, tag: v1.0, origin/master, origin/HEAD) This is the subject line of the commit message
a90d190 Instructions for compiling hello_world.c
485884e Merge branch 'feature/1'
c98f570 Adds a hello world C program
44f1e05 Add another file to the repository
08e022e More master info in README
8f845c4 Update README with master branch info
97ce729 (tag: v0.1) Initial commit for data-model-repository


In [10]:
git cat-file HEAD

usage: git cat-file (-t [--allow-unknown-type] | -s [--allow-unknown-type] | -e | -p | <type> | --textconv | --filters) [--path=<path>] <object>
   or: git cat-file (--batch | --batch-check) [--follow-symlinks] [--textconv | --filters]

<type> can be one of: blob, tree, commit, tag
    -t                    show object type
    -s                    show object size
    -e                    exit with zero when there's no error
    -p                    pretty-print object's content
    --textconv            for blob objects, run textconv on object's content
    --filters             for blob objects, run filters on object's content
    --path <blob>         use a specific path for --textconv/--filters
    --allow-unknown-type  allow -s and -t to work with broken/corrupt objects
    --buffer              buffer --batch output
    --batch[=<format>]    show info and content of objects fed from the standard input
    --batch-check[=<format>]
                          show info about obje

: 129

## The Tree Object

In [11]:
git cat-file -p 34fa038544bcd9aed660c08320214bafff94150b

100644 blob f21dc2804e888fee6014d7e5b1ceee533b222c15	README.md
040000 tree abc267d04fb803760b75be7e665d3d69eeed32f8	a_sub_directory
100644 blob b50f80ac4d0a36780f9c0636f43472962154a11a	another-file.txt
100644 blob 92f046f17079aa82c924a9acf28d623fcb6ca727	cat-me.txt
100644 blob bb2fe940924c65b4a1cefcbdbe88c74d39eb23cd	hello_world.c


 - tree = directories
 - blob = files
 

The tree can also be reached directly from the commit with `git cat-file -p HEAD^{tree}`.

The special notation `HEAD^{tree}` tells Git to start from the given reference (`HEAD`) and dereference the object recursively until a tree object is found.

The generic form of the notation is `<rev>^{<type>}`; it returns the first object of `<type>` found by dereferencing recursively from `<rev>`.
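To make the peeling notation concrete, here is a small sketch in a throwaway repository (file name, tag name, and identity are hypothetical). It checks that `HEAD^{tree}` resolves to the id on the `tree` line of the commit, and that an annotated tag can be peeled to its commit or all the way to the tree.

```shell
set -e
# Throwaway repository; all names below are made up for the demo.
repo=$(mktemp -d) && cd "$repo"
git init -q .
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo 'hello' > file.txt
git add file.txt
git commit -q -m 'first commit'
git tag -a -m 'first release' v0.1

# Peel the commit to its root tree.
tree_from_peel=$(git rev-parse 'HEAD^{tree}')
# Read the same id from the first line of the raw commit object.
tree_from_commit=$(git cat-file -p HEAD | sed -n '1s/^tree //p')

# An annotated tag peels first to the commit, then to the tree.
head_commit=$(git rev-parse HEAD)
tag_object=$(git rev-parse v0.1)
commit_from_tag=$(git rev-parse 'v0.1^{commit}')
tree_from_tag=$(git rev-parse 'v0.1^{tree}')

echo "tree:   $tree_from_peel"
echo "commit: $commit_from_tag"
```

Quoting the revision keeps the shell from interpreting the braces or the caret.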

In [13]:
git cat-file -p HEAD^{tree}

100644 blob f21dc2804e888fee6014d7e5b1ceee533b222c15	README.md
040000 tree abc267d04fb803760b75be7e665d3d69eeed32f8	a_sub_directory
100644 blob b50f80ac4d0a36780f9c0636f43472962154a11a	another-file.txt
100644 blob 92f046f17079aa82c924a9acf28d623fcb6ca727	cat-me.txt
100644 blob bb2fe940924c65b4a1cefcbdbe88c74d39eb23cd	hello_world.c


In [21]:
git cat-file -p abc267d04fb803760b75be7e665d3d69eeed32f8^{tree}

100644 blob 6dc3bfbc6db8253b7789af1dee44caf8ec6ffb6e	readme


## The Commit Object

HEAD points to the currently checked-out commit (usually through a branch), so we can use it to name the commit we want to inspect.

## The Blob Object

In [23]:
git cat-file -p HEAD

tree 34fa038544bcd9aed660c08320214bafff94150b
parent a90d1906337a6d75f1dc32da647931f932500d83
author Aske Olsson <aske.olsson@switch-gears.dk> 1386933960 +0100
committer Aske Olsson <aske.olsson@switch-gears.dk> 1386941455 +0100

This is the subject line of the commit message

It should be followed by a blank line then the body, which is this text. Here
you can have multiple paragraphs etc. and explain your commit. It's like an
email with subject and body, so get people's attention in the subject


In [24]:
git cat-file -p 34fa038544bcd9aed660c08320214bafff94150b

100644 blob f21dc2804e888fee6014d7e5b1ceee533b222c15	README.md
040000 tree abc267d04fb803760b75be7e665d3d69eeed32f8	a_sub_directory
100644 blob b50f80ac4d0a36780f9c0636f43472962154a11a	another-file.txt
100644 blob 92f046f17079aa82c924a9acf28d623fcb6ca727	cat-me.txt
100644 blob bb2fe940924c65b4a1cefcbdbe88c74d39eb23cd	hello_world.c


In [25]:
git cat-file -p f21dc2804e888fee6014d7e5b1ceee533b222c15

Example Git repository

This repository shall explain the Git data model

Master
------

The master branch is the default branch, this is on default checked out when the
repository is cloned

hello_world.c instructions
--------------------------
gcc -Wall hello_world.c -o hello
./hello




The objects are tied together: blobs to trees, trees to other trees, and the root tree to the commit object, all by the SHA-1 identifier of the object.
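The chain from commit to root tree to subtree to blob can be followed using nothing but the ids each object stores. The sketch below does this in a throwaway repository; the directory `sub`, the file `file.txt`, and the identity values are hypothetical demo data.

```shell
set -e
# Throwaway repository with one file inside one subdirectory.
repo=$(mktemp -d) && cd "$repo"
git init -q .
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

mkdir sub
echo 'hello' > sub/file.txt
git add sub/file.txt
git commit -q -m 'first commit'

# 1. The commit names its root tree on its first line.
commit=$(git rev-parse HEAD)
tree=$(git cat-file -p "$commit" | sed -n '1s/^tree //p')

# 2. Each tree entry is "<mode> <type> <id><TAB><name>"; pick the entry by name.
subtree=$(git cat-file -p "$tree" | awk '$4 == "sub" {print $3}')

# 3. The subtree names the blob that holds the file content.
blob=$(git cat-file -p "$subtree" | awk '$4 == "file.txt" {print $3}')

# 4. The blob is just the content.
content=$(git cat-file -p "$blob")
echo "$commit -> $tree -> $subtree -> $blob: $content"
```

The `awk` field match assumes names without whitespace, which holds for this demo.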

## The Branch

In [26]:
git cat-file master

usage: git cat-file (-t [--allow-unknown-type] | -s [--allow-unknown-type] | -e | -p | <type> | --textconv | --filters) [--path=<path>] <object>
   or: git cat-file (--batch | --batch-check) [--follow-symlinks] [--textconv | --filters]

<type> can be one of: blob, tree, commit, tag
    -t                    show object type
    -s                    show object size
    -e                    exit with zero when there's no error
    -p                    pretty-print object's content
    --textconv            for blob objects, run textconv on object's content
    --filters             for blob objects, run filters on object's content
    --path <blob>         use a specific path for --textconv/--filters
    --allow-unknown-type  allow -s and -t to work with broken/corrupt objects
    --buffer              buffer --batch output
    --batch[=<format>]    show info and content of objects fed from the standard input
    --batch-check[=<format>]
                          show info about obje

: 129

A branch is not an object like the others. Running cat-file on it without an option or a type fails with the usage message shown above (exit status 129), and if you specify -p to pretty-print, you just get the commit object the branch points to:

In [27]:
git cat-file -p master

tree 34fa038544bcd9aed660c08320214bafff94150b
parent a90d1906337a6d75f1dc32da647931f932500d83
author Aske Olsson <aske.olsson@switch-gears.dk> 1386933960 +0100
committer Aske Olsson <aske.olsson@switch-gears.dk> 1386941455 +0100

This is the subject line of the commit message

It should be followed by a blank line then the body, which is this text. Here
you can have multiple paragraphs etc. and explain your commit. It's like an
email with subject and body, so get people's attention in the subject


In [29]:
cat .git/refs/heads/master

34acc370b4d6ae53f051255680feaefaf7f7850d


In [30]:
cat .git/HEAD

ref: refs/heads/master
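The two files printed above can also be queried through plumbing commands, which avoids depending on the on-disk layout (refs may be packed into `.git/packed-refs` instead of living as loose files). The sketch below uses a throwaway repository with a hypothetical branch named `demo`. It shows that HEAD holds a branch name, that the branch holds a commit id, and that a new commit moves the branch while HEAD stays put.

```shell
set -e
# Throwaway repository; branch name, file name, and identity are made up.
repo=$(mktemp -d) && cd "$repo"
git init -q .
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Point HEAD at the (not yet existing) branch 'demo' before the first commit.
git symbolic-ref HEAD refs/heads/demo
echo 'one' > f
git add f
git commit -q -m 'first'

head_target=$(git symbolic-ref HEAD)            # the branch HEAD refers to
branch_commit=$(git rev-parse refs/heads/demo)  # the commit the branch refers to
head_commit=$(git rev-parse HEAD)

# A second commit moves the branch forward; HEAD still names the branch.
echo 'two' > f
git commit -q -am 'second'
new_branch_commit=$(git rev-parse refs/heads/demo)
parent_of_new=$(git rev-parse 'HEAD^')
head_target_after=$(git symbolic-ref HEAD)

echo "HEAD -> $head_target_after -> $new_branch_commit"
```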


## The Tag Object

There are three different kinds of tags:
- a lightweight tag (just a label)
- an annotated tag
- a signed tag

In [31]:
git tag

v0.1
v1.0


In [32]:
git cat-file -p v1.0

object 34acc370b4d6ae53f051255680feaefaf7f7850d
type commit
tag v1.0
tagger Aske Olsson <aske.olsson@switch-gears.dk> 1386941492 +0100

We got the hello world C program merged, let's call that a release 1.0
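The difference between a lightweight and an annotated tag shows up directly in the object type. The following sketch, run in a throwaway repository with hypothetical tag names `light` and `annot`, creates one of each and compares what they resolve to. Signed tags are left out because they need a signing key.

```shell
set -e
# Throwaway repository; tag names, file name, and identity are made up.
repo=$(mktemp -d) && cd "$repo"
git init -q .
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo 'hello' > file.txt
git add file.txt
git commit -q -m 'first commit'
head=$(git rev-parse HEAD)

git tag light                       # lightweight: only a ref to the commit
git tag -a -m 'annotated tag' annot # annotated: a tag object that names the commit

light_type=$(git cat-file -t light)
annot_type=$(git cat-file -t annot)
light_id=$(git rev-parse light)
annot_id=$(git rev-parse annot)
annot_target=$(git rev-parse 'annot^{commit}')

echo "light: $light_type $light_id"
echo "annot: $annot_type $annot_id -> $annot_target"
```

The lightweight tag resolves straight to the commit, while the annotated tag has its own id and must be peeled to reach the commit.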


### git hash-object

In [33]:
git log --oneline

34acc37 (HEAD -> master, tag: v1.0, origin/master, origin/HEAD) This is the subject line of the commit message
a90d190 Instructions for compiling hello_world.c
485884e Merge branch 'feature/1'
c98f570 Adds a hello world C program
44f1e05 Add another file to the repository
08e022e More master info in README
8f845c4 Update README with master branch info
97ce729 (tag: v0.1) Initial commit for data-model-repository


In [35]:
git cat-file -p HEAD | git hash-object -t commit --stdin

34acc370b4d6ae53f051255680feaefaf7f7850d
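The pipeline above reproduces the commit id because an object's id is the SHA-1 of a short header (`<type> <size>` followed by a NUL byte) plus the object's content. The sketch below recomputes a blob id by hand and compares it with `git hash-object`. It assumes a `sha1sum` binary is available (on macOS, `shasum` would be the substitute) and that the repository uses the default SHA-1 object format.

```shell
set -e
# hash-object is run inside a throwaway repository.
repo=$(mktemp -d) && cd "$repo"
git init -q .

content='hello'
# Size in bytes of the content, including the trailing newline.
size=$(printf '%s\n' "$content" | wc -c | tr -d ' ')

# Header "blob <size>\0" followed by the content, hashed with SHA-1.
manual=$( { printf 'blob %s\0' "$size"; printf '%s\n' "$content"; } | sha1sum | cut -d' ' -f1 )

# The id Git computes for the same content.
via_git=$(printf '%s\n' "$content" | git hash-object --stdin)

echo "manual:  $manual"
echo "via git: $via_git"
```

For a commit the header starts with `commit` instead of `blob`, which is why `-t commit` is needed in the cell above.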


In [37]:
git log -1 --oneline

34acc37 (HEAD -> master, tag: v1.0, origin/master, origin/HEAD) This is the subject line of the commit message


### git ls-tree

In [38]:
git ls-tree

usage: git ls-tree [<options>] <tree-ish> [<path>...]

    -d                    only show trees
    -r                    recurse into subtrees
    -t                    show trees when recursing
    -z                    terminate entries with NUL byte
    -l, --long            include object size
    --name-only           list only filenames
    --name-status         list only filenames
    --full-name           use full path names
    --full-tree           list entire tree; not just current directory (implies --full-name)
    --abbrev[=<n>]        use <n> digits to display SHA-1s



: 129

In [39]:
git cat-file -p HEAD

tree 34fa038544bcd9aed660c08320214bafff94150b
parent a90d1906337a6d75f1dc32da647931f932500d83
author Aske Olsson <aske.olsson@switch-gears.dk> 1386933960 +0100
committer Aske Olsson <aske.olsson@switch-gears.dk> 1386941455 +0100

This is the subject line of the commit message

It should be followed by a blank line then the body, which is this text. Here
you can have multiple paragraphs etc. and explain your commit. It's like an
email with subject and body, so get people's attention in the subject


In [40]:
git ls-tree 34fa038544bcd9aed660c08320214bafff94150b

100644 blob f21dc2804e888fee6014d7e5b1ceee533b222c15	README.md
040000 tree abc267d04fb803760b75be7e665d3d69eeed32f8	a_sub_directory
100644 blob b50f80ac4d0a36780f9c0636f43472962154a11a	another-file.txt
100644 blob 92f046f17079aa82c924a9acf28d623fcb6ca727	cat-me.txt
100644 blob bb2fe940924c65b4a1cefcbdbe88c74d39eb23cd	hello_world.c


In [41]:
git show 34fa038544bcd9aed660c08320214bafff94150b

tree 34fa038544bcd9aed660c08320214bafff94150b

README.md
a_sub_directory/
another-file.txt
cat-me.txt
hello_world.c


In [44]:
git ls-tree -r 34fa038544bcd9aed660c08320214bafff94150b

100644 blob f21dc2804e888fee6014d7e5b1ceee533b222c15	README.md
100644 blob 6dc3bfbc6db8253b7789af1dee44caf8ec6ffb6e	a_sub_directory/readme
100644 blob b50f80ac4d0a36780f9c0636f43472962154a11a	another-file.txt
100644 blob 92f046f17079aa82c924a9acf28d623fcb6ca727	cat-me.txt
100644 blob bb2fe940924c65b4a1cefcbdbe88c74d39eb23cd	hello_world.c
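Compared with the plain listing, `-r` replaces each subtree entry with the blobs inside it, and `-d` does the opposite and shows only trees. The sketch below contrasts the three in a throwaway repository; the file and directory names and the identity are hypothetical.

```shell
set -e
# Throwaway repository with one top-level file and one file in a subdirectory.
repo=$(mktemp -d) && cd "$repo"
git init -q .
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

mkdir sub
echo 'readme' > README.md
echo 'hello' > sub/file.txt
git add README.md sub/file.txt
git commit -q -m 'first commit'

top_level=$(git ls-tree --name-only HEAD)      # entries of the root tree only
recursive=$(git ls-tree -r --name-only HEAD)   # every blob, with its full path
trees_only=$(git ls-tree -d --name-only HEAD)  # only the subtree entries

printf 'top level:\n%s\n\nrecursive:\n%s\n\ntrees only:\n%s\n' \
    "$top_level" "$recursive" "$trees_only"
```

`HEAD` works as the argument here because `ls-tree` accepts any tree-ish and peels a commit to its root tree.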
