# The plumbing: a walk-through

## Reset and show the current folder

In [1]:
%%bash
rm -rf ./.git file.*
ls -lA

total 136
-rw-rw-r-- 1 jovyan  1000   311 Jun 10 00:28 aliases.rc
drwxr-xr-x 2 jovyan users  4096 Jun  9 18:36 confusion
-rw-r--r-- 1 jovyan users 13477 Jun 10 01:19 git.abstractions.ipynb
-rw-rw-r-- 1 jovyan  1000    60 Jun  9 20:56 .gitignore
-rw-r--r-- 1 jovyan users 44255 Jun 10 02:01 git.plumbing.ipynb
-rw-r--r-- 1 jovyan users  1484 Jun  9 23:43 git.references.ipynb
-rw-r--r-- 1 jovyan users 10227 Jun 10 00:57 git.repository.ipynb
-rw-r--r-- 1 jovyan users  1255 Jun 10 00:07 git.review.overview.ipynb
-rw-r--r-- 1 jovyan users 20280 Jun 10 00:53 git.review.walk.through.ipynb
drwxr-xr-x 3 jovyan users  4096 Jun 10 00:24 git.using.remote
drwxr-xr-x 2 jovyan users  4096 Jun 10 00:26 .ipynb_checkpoints
-rw-r--r-- 1 jovyan users  3609 Jun 10 00:34 my.setup.ipynb
drwxr-xr-x 3 jovyan users  4096 Jun 10 00:24 repo
-rw-r--r-- 1 jovyan users  3252 Jun 10 00:25 Summary.ipynb
-rw-r--r-- 1 jovyan users  2281 Jun  9 21:19 Untitled.ipynb


## Initialize the folder as a git repository

In [2]:
!git init

Initialized empty Git repository in /home/jovyan/work/.git/


In [3]:
%%bash
git config --global user.email "you@example.com"
git config --global user.name "Your Name"

## Verify that git has been initialized

In [4]:
!git status

On branch master

No commits yet

Untracked files:
  (use "git add <file>..." to include in what will be committed)
	git.using.remote/
	repo/

nothing added to commit but untracked files present (use "git add" to track)


In [5]:
!git config --list | grep ^user

user.email=you@example.com
user.name=Your Name


In [6]:
!tree .git/objects

.git/objects
├── info
└── pack

2 directories, 0 files


<br />
<br />
<br />
<br />
<br />

## Create a file in the current working directory and add it to the KV store

In [None]:
!echo "Hello, world" > file.001

In [None]:
!ls -lA file.*

In [None]:
!cat -n file.001

In [None]:
!git hash-object -w file.001

Notice that when we add a file to the KV store, we get back a key: the SHA-1 of the file's contents ( almost; git hashes a short header together with the contents, as we'll see under Object storage ).

In [None]:
!tree .git/objects

<br />
<br />
<br />
<br />
<br />

## Change the file and add it to the KV store

In [None]:
!echo "Greetings, earthlings" > file.001

In [None]:
!cat -n file.001

In [None]:
!git hash-object -w file.001

In [None]:
!tree .git/objects

We now have two objects, both created from successive versions of "file.001".

<br />
<br />
<br />
<br />
<br />

## Restore previous version of file.001

In [None]:
!ls -lA file.*

In [None]:
!cat file.001

In [None]:
!git cat-file -p a5c19667710254f835085b99726e523457150e03 > file.001

In [None]:
!cat file.001

If we know the key of a previous version of a file, we can retrieve it from the KV store.
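
The same idea fits in a few lines of Python. This is a toy in-memory store, not git's implementation: the names `store`, `put`, and `get` are made up for illustration, and the key here is the SHA-1 of the raw bytes only ( git also hashes a short header, see Object storage ).

```python
import hashlib

# Toy content-addressable store: the key is derived from the value itself.
# Hypothetical helper names; git's real key also covers a small header.
store = {}

def put(content: bytes) -> str:
    key = hashlib.sha1(content).hexdigest()
    store[key] = content
    return key

def get(key: str) -> bytes:
    return store[key]

first = put(b"Hello, world\n")
second = put(b"Greetings, earthlings\n")

# Both versions stay retrievable as long as we remember their keys.
print(get(first).decode(), end="")
print(len(store))
```

Storing the same bytes twice returns the same key and adds nothing new, which is why identical files cost git nothing extra.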

<br />
<br />
<br />
<br />
<br />

## Git objects
Notice that we didn't store the name of the file or any other metadata ( e.g. permissions ), just the contents.  An object that holds only file contents is called a **blob**.  To store metadata, we need a different kind of git object.  In git there are four types of objects:
* blob
* tree
* commit
* tag ( annotated )

You can query the type of an object by its key.

In [None]:
!git cat-file -t a5c19667710254f835085b99726e523457150e03

Or do the same for every object in the .git/objects hierarchy:

In [None]:
!find .git/objects -type f | cut -d/ -f3- | tr -d '/' | xargs -n1 -t git cat-file -t

To see the contents of all objects:

In [None]:
!find .git/objects -type f | cut -d/ -f3- | tr -d '/' | xargs -n1 -t git cat-file -p

<br />
<br />
<br />
<br />
<br />

## Create a tree
A tree is a list of entries.  Each entry holds metadata ( a file mode and a name ) and a pointer ( the key ) to a blob or to another tree.
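
As a rough picture of what a tree stores, here is a Python sketch of how the entries could be serialized. It assumes a flat directory ( git has extra ordering rules for sub-trees ), and `tree_body` and `object_key` are illustrative names, not git commands. I have not checked the printed key against the walk-through's output, so treat it as an experiment rather than a reference.

```python
import hashlib

def tree_body(entries):
    """Serialize (mode, name, hex_key) entries: '<mode> <name>\\0' + 20 raw key bytes."""
    body = b""
    # Entries are stored sorted by name (simplified: flat directories only).
    for mode, name, hex_key in sorted(entries, key=lambda e: e[1].encode()):
        body += mode.encode() + b" " + name.encode() + b"\0" + bytes.fromhex(hex_key)
    return body

def object_key(kind: str, body: bytes) -> str:
    """SHA-1 over '<kind> <size>\\0' plus the body."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()

entries = [("100644", "file.001", "a5c19667710254f835085b99726e523457150e03")]
body = tree_body(entries)

print(len(body))                 # 16 bytes of mode/name/null + 20 bytes of key
print(object_key("tree", body))
```

Note that the file name lives in the tree, not in the blob: renaming a file changes the tree but leaves the blob untouched.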

In [None]:
%%bash
git update-index --add --cacheinfo 100644 a5c19667710254f835085b99726e523457150e03 file.001
git write-tree

In [None]:
!find .git/objects -type f | cut -d/ -f3- | tr -d '/' | xargs -n1 -t git cat-file -t

In [None]:
!find .git/objects -type f | cut -d/ -f3- | tr -d '/' | xargs -n1 -t git cat-file -p | cat -n 

<br />
<br />
<br />
<br />
<br />

## Stage the changed version of the first file

In [None]:
%%bash
git update-index --add --cacheinfo 100644 34a27f74d7d73cd456ce426bfa20bffcfb8fd11c file.001

# write the tree
git write-tree

In [None]:
%%bash
find .git/objects -type f | cut -d/ -f3- | tr -d '/' |
while read object ; do
  echo == ${object} $( git cat-file -t ${object} )
  git cat-file -p ${object}
  echo
done

<br />
<br />
<br />
<br />
<br />

## Create a new file and create a tree for it

In [None]:
!echo "Hello, world ... again" > file.002

In [None]:
!ls -lA file*

In [None]:
!cat -n file.002

In [None]:
%%bash
# for a file NOT in the KV store
git update-index --add file.002

# write the tree
git write-tree

In [None]:
%%bash
find .git/objects -type f | cut -d/ -f3- | tr -d '/' |
while read object ; do
  echo == ${object} $( git cat-file -t ${object} )
  git cat-file -p ${object}
  echo
done

These trees are "snapshots" of what was staged at each step.  Tree a13eb has the original file.001, tree 7f1ba has the changed file.001, and tree e1df3 has both files, file.001 and file.002.  Think of these trees as sub-graphs.  Blobs are end nodes.  Trees point to one or more blobs ( or other trees ).

What we have is similar to this:
![]( https://git-scm.com/book/en/v2/images/data-model-2.png )


Time to stitch the trees together with commits.
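
The stitching works because a commit is a small text object that names one tree and zero or more parent commits. Here is a sketch of that layout in Python. The identity, the timestamp, and the second tree key are placeholders, and `commit_body` and `object_key` are illustrative helpers, so the keys it produces will not match the ones in this walk-through.

```python
import hashlib

def commit_body(tree, parents, who, message):
    """Text of a commit object: one tree, zero or more parents, then the message."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {who}", f"committer {who}", "", message]
    return ("\n".join(lines) + "\n").encode()

def object_key(kind, body):
    """SHA-1 over '<kind> <size>\\0' plus the body, as for any git object."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()

who = "Your Name <you@example.com> 0 +0000"   # placeholder identity and timestamp
first_tree = "a13eb9d02b9ee2c2f0d073bbc65d91a18c7e7316"
second_tree = "f" * 40                         # placeholder, not a real key

root = commit_body(first_tree, [], who, "file number one")
root_key = object_key("commit", root)
child = commit_body(second_tree, [root_key], who, "changed salutation in file.001")

print(child.decode(), end="")
```

Because the parent's key is part of the child's text, changing anything in an earlier commit changes the key of every commit after it.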

<br />
<br />
<br />
<br />
<br />

## Create a commit

In [None]:
!echo 'file number one' | git commit-tree a13eb9d02b9ee2c2f0d073bbc65d91a18c7e7316

In [None]:
!git cat-file -p 0de89b32ae05ddda339c9eb73365b08812c3b76c

In [None]:
%%bash
find .git/objects -type f | cut -d/ -f3- | tr -d '/' |
while read object ; do
  echo == ${object} $( git cat-file -t ${object} )
  git cat-file -p ${object}
  echo
done

<br />
<br />
<br />
<br />
<br />

## Add the second commit

In [None]:
!echo 'changed salutation in file.001' | git commit-tree 7f1ba -p 0de89

In [None]:
%%bash
find .git/objects -type f | cut -d/ -f3- | tr -d '/' |
while read object ; do
  echo == ${object} $( git cat-file -t ${object} )
  git cat-file -p ${object}
  echo
done

<br />
<br />
<br />
<br />
<br />

## Add the third commit

In [None]:
!echo 'new file file.002' | git commit-tree e1df3 -p 90659

In [None]:
%%bash
find .git/objects -type f | cut -d/ -f3- | tr -d '/' |
while read object ; do
  echo == ${object} $( git cat-file -t ${object} )
  git cat-file -p ${object}
  echo
done

## View the git log

In [None]:
!git log --stat c1a42fdc092057c507e0d1f2e9703ea893d3de31

<br />
<br />
<br />
<br />
<br />

## Object storage
A bit of a rewind: I said earlier that what gets stored and hashed is **mostly** the contents of the file.  Git actually prepends a small header to the contents.


### Creating a git blob
Git prepends the object type, a space, the length of the contents in bytes, and a null byte before the contents.

In [None]:
%%bash
## using git
text='test content'

echo -e "${text}" | git hash-object --stdin


In [None]:
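%%bash
## manually
text='test content'
# count the trailing newline too, since the git cell above hashes it as well
len=$( printf '%s\n' "${text}" | wc -c | tr -d ' ' )

# printf emits the null byte reliably, even when the text starts with a digit
printf 'blob %s\0%s\n' "${len}" "${text}" | shasum -a 1 | tr -d '\n -'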


### Reading the contents of a blob object

In [None]:
%%bash
## find a blob

find .git/objects -type f |
cut -d/ -f3- |
tr -d '/' |
xargs -n1 -t git cat-file -t 2>&1 |
paste - - |
grep -F blob

In [None]:
%%bash
file=.git/objects/a5/c19667710254f835085b99726e523457150e03

# we can trick gzip into decompressing zlib data by prepending a gzip header
printf "\x1f\x8b\x08\x00\x00\x00\x00\x00" |
cat - "${file}" |
gzip -dc 2> /dev/null |
sed -e '1s/\x0/\n/'  # converts the null in the first line to a newline ( GNU sed )
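
Without the gzip trick, the same read can be sketched in Python with the zlib module. To stay self-contained, the example builds a stand-in for a loose object in memory instead of reading a file under .git/objects; `parse_loose_object` is an illustrative name.

```python
import zlib

def parse_loose_object(raw: bytes):
    """Inflate a loose object and split it into (type, size, content)."""
    data = zlib.decompress(raw)
    header, _, content = data.partition(b"\0")
    kind, size = header.split(b" ")
    return kind.decode(), int(size), content

# Build a stand-in for a file under .git/objects instead of reading one from disk.
content = b"Hello, world\n"
raw = zlib.compress(b"blob " + str(len(content)).encode() + b"\0" + content)

print(parse_loose_object(raw))
# ('blob', 13, b'Hello, world\n')
```

To try it on a real object, pass in the bytes of the object file, read in binary mode.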


<br />
<br />
<br />
<br />
<br />

## One last item: references
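We've covered three of the four git object types: blobs, trees, and commits.  Before we get to the fourth ( the annotated tag ), we need references.  A reference is not an object; it is a named pointer, kept under .git/refs, that holds the key of an object.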

In [None]:
!tree .git/refs

In [None]:
%%bash
# create a "master" branch
echo c1a42fdc092057c507e0d1f2e9703ea893d3de31 > .git/refs/heads/master

git log --pretty=oneline master

**Don't write ref files by hand like this** <br />
Use the plumbing commands ( git update-ref ).  Or better, the porcelain commands.


In [None]:
!git update-ref refs/heads/master c1a42fdc092057c507e0d1f2e9703ea893d3de31

In [None]:
!git log --pretty=oneline master

<br />
<br />
<br />
<br />
<br />

## Creating a branch
A branch is just a reference to a commit object.

In [None]:
!git update-ref refs/heads/test 90659809e392bb6f65fd8bccb0369616cfa511f8

In [None]:
!git log --pretty=oneline test

In [None]:
!git branch -a

Now we have something like this.
![]( https://git-scm.com/book/en/v2/images/data-model-4.png )

### The HEAD

In [None]:
!cat .git/HEAD

In [None]:
!git checkout test

In [None]:
!cat .git/HEAD

In [None]:
!git symbolic-ref HEAD

In [None]:
!git branch -a

In [None]:
%%bash
# manually setting the HEAD
git symbolic-ref HEAD refs/heads/master
cat .git/HEAD

In [None]:
!git branch -a
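
What the cells above show is that HEAD is a pointer to a pointer: it names a branch, and the branch file holds a commit key. Here is a sketch of that lookup in Python, run against a fake directory layout in a temp folder rather than a real repository. `resolve` is an illustrative helper that only reads loose ref files; real git also consults other places, such as packed refs.

```python
import os
import tempfile

def resolve(git_dir: str, name: str = "HEAD") -> str:
    """Follow 'ref: ...' indirections until a plain key is reached."""
    with open(os.path.join(git_dir, name)) as fh:
        value = fh.read().strip()
    if value.startswith("ref: "):
        return resolve(git_dir, value[len("ref: "):])
    return value

# Fake a minimal .git layout in a temporary directory (illustration only).
git_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(git_dir, "refs", "heads"))
key = "c1a42fdc092057c507e0d1f2e9703ea893d3de31"
with open(os.path.join(git_dir, "refs", "heads", "master"), "w") as fh:
    fh.write(key + "\n")
with open(os.path.join(git_dir, "HEAD"), "w") as fh:
    fh.write("ref: refs/heads/master\n")

print(resolve(git_dir))
# c1a42fdc092057c507e0d1f2e9703ea893d3de31
```

In a real repository, leave the writing to git symbolic-ref and git update-ref; the sketch only writes files because its directory is a throwaway.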

## The Tag
There are two types of tags:
* lightweight
* annotated

In [None]:
%%bash
# lightweight tag: a ref that points straight at a commit

git update-ref refs/tags/v1.0 c1a42fdc092057c507e0d1f2e9703ea893d3de31
cat .git/refs/tags/v1.0

In [None]:
!git tag

In [None]:
%%bash
# annotated tag: a ref that points at a tag object, which in turn points at the commit
git tag -a v1.1 90659809e392bb6f65fd8bccb0369616cfa511f8 -m "Test tag"
cat .git/refs/tags/v1.1 |
tee /dev/stderr |
xargs git cat-file -p
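
The difference between the two kinds of tag can be sketched in Python. A lightweight tag ref holds the commit key itself, while an annotated tag ref holds the key of a separate tag object that names the commit. The tagger identity and timestamp below are placeholders and `tag_body` is an illustrative helper, so the computed key will not match the one git prints.

```python
import hashlib

def tag_body(target, target_type, name, tagger, message):
    """Text of an annotated tag object."""
    lines = [f"object {target}", f"type {target_type}", f"tag {name}",
             f"tagger {tagger}", "", message]
    return ("\n".join(lines) + "\n").encode()

commit = "90659809e392bb6f65fd8bccb0369616cfa511f8"
who = "Your Name <you@example.com> 0 +0000"   # placeholder identity and timestamp

body = tag_body(commit, "commit", "v1.1", who, "Test tag")
header = f"tag {len(body)}".encode() + b"\0"
tag_key = hashlib.sha1(header + body).hexdigest()

lightweight_ref = commit   # what a lightweight tag ref would hold
annotated_ref = tag_key    # what an annotated tag ref would hold

print(body.decode(), end="")
print(annotated_ref != lightweight_ref)
```

This is why cat-file on the v1.1 ref shows a tagger and a message, while the v1.0 ref resolves directly to a commit.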