<div align='center'>
<h1>The Structure of Git</h1>
<h3>Dylan Simon, Flatiron Institute</h3>
</div>

## git repository environment

Create a new repository in any existing directory

In [1]:
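# Initial setup: wrap git so that the first use of each subcommand
# prints its one-line manual summary (on stderr) before running it

GIT=$(which git)
declare -A git_helped
git() {
    if [[ -z ${git_helped[$1]} ]] ; then
        whatis -l "git-$1" >&2
        git_helped[$1]=1
    fi
    "$GIT" "$@"
}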

rm -rf ~/myrepo

In [2]:
mkdir -p ~/myrepo
cd ~/myrepo
git init

git-init (1)         - Create an empty Git repository or reinitialize an existing one
Initialized empty Git repository in /mnt/xfs1/home/dylan/myrepo/.git/


In [3]:
ls .git

[0m[00mHEAD[0m  [07mbranches[0m  [00mconfig[0m  [00mdescription[0m  [07mhooks[0m  [07minfo[0m  [07mobjects[0m  [07mrefs[0m


In [4]:
git init -h
git init --help

usage: git init [-q | --quiet] [--bare] [--template=<template-directory>] [--shared[=<permissions>]] [directory]

    --template <template-directory>
                          directory from which templates will be used
    --bare                create a bare repository
    --shared[=<permissions>]
                          specify that the git repository is to be shared amongst several users
    -q, --quiet           be quiet
    --separate-git-dir <gitdir>
                          separate git dir from working tree

GIT-INIT(1)                       Git Manual                       GIT-INIT(1)



NAME
       git-init - Create an empty Git repository or reinitialize an existing
       one

SYNOPSIS
       git init [-q | --quiet] [--bare] [--template=<template_directory>]
                 [--separate-git-dir <git dir>]
                 [--shared[=<permissions>]] [directory]


DESCRIPTION
       This command creates an empty Git repository - basically a .git
       directory with subdi

In [5]:
cat .git/config

[core]
	repositoryformatversion = 0
	filemode = true
	bare = false
	logallrefupdates = true


In [6]:
git config --local -l

git-config (1)       - Get and set repository or global options
core.repositoryformatversion=0
core.filemode=true
core.bare=false
core.logallrefupdates=true
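
`git config` writes settings as well as listing them. A minimal sketch, assuming a throwaway repository created with `mktemp -d` so that `~/myrepo` is left untouched; `core.filemode` is used here only as an example setting:

```shell
repo=$(mktemp -d)              # throwaway location, not ~/myrepo
value=$(
    cd "$repo" && git init -q &&
    # --local targets this repository's .git/config
    git config --local core.filemode false &&
    git config --local --get core.filemode
)
echo "$value"                  # prints: false
rm -rf "$repo"
```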


## git as an object (file) store

Create a new object based on some data (e.g., file contents)

In [7]:
echo 'Hello World!' > f
git hash-object -t blob -w f
rm f

git-hash-object (1)  - Compute object ID and optionally creates a blob from a file
980a0d5f19a64b4b30a87d4206aade58726b60e3


Unique identifier for this data: each distinct file content gets its own *hash*, the SHA-1 of a short header plus the contents

In [8]:
echo -e 'blob 13\0Hello World!' | sha1sum

980a0d5f19a64b4b30a87d4206aade58726b60e3  -
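
The same computation can be wrapped up for any file. A minimal sketch, assuming bash and coreutils' `sha1sum`; the function name `blob_hash` is hypothetical, not a git command:

```shell
# Hypothetical helper (not part of git): compute the blob ID of a file
# by hashing the header "blob <size>\0" followed by the file's bytes.
blob_hash() {
    local size
    size=$(( $(wc -c < "$1") ))    # byte count, stripped of any padding
    { printf 'blob %d\0' "$size"; cat "$1"; } | sha1sum | cut -d' ' -f1
}

tmp=$(mktemp)
echo 'Hello World!' > "$tmp"
blob_hash "$tmp"                   # prints: 980a0d5f19a64b4b30a87d4206aade58726b60e3
rm -f "$tmp"
```

The result should agree with `git hash-object` on the same file, since the ID depends only on the contents and not on the file name.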


In [9]:
git cat-file -t 980a0d5f19a64b4b30a87d4206aade58726b60e3 # object type
git cat-file -s 980a # any unique prefix of hash         # object size
git cat-file -p 980a0d5                                  # contents
file1=980a0d5f19a64b4b30a87d4206aade58726b60e3 # save for later

git-cat-file (1)     - Provide content or type and size information for repository objects
blob
13
Hello World!


In [10]:
file2=$( echo 'Something completely different.' \
         | git hash-object -t blob -w --stdin )
echo $file2

1a0985327d433bdfc3ea3c2b0a0443b3545064ac


In [11]:
git cat-file -p $file2

Something completely different.


### Where'd the data go?

In [12]:
find .git/objects -type f

.git/objects/98/0a0d5f19a64b4b30a87d4206aade58726b60e3
.git/objects/1a/0985327d433bdfc3ea3c2b0a0443b3545064ac
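
The paths above follow directly from the hashes: the first two hex digits name a directory and the remaining 38 name the file. A minimal sketch of that mapping, using a hypothetical helper `object_path`; it applies only to loose objects like these, not to objects that git has later packed:

```shell
# Hypothetical helper (not a git command): map an object ID to the
# path of its loose object file, relative to the repository root.
object_path() {
    local id=$1
    echo ".git/objects/${id:0:2}/${id:2}"
}

object_path 980a0d5f19a64b4b30a87d4206aade58726b60e3
# prints: .git/objects/98/0a0d5f19a64b4b30a87d4206aade58726b60e3
```

The files themselves are zlib-compressed, so `git cat-file` remains the convenient way to read them.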


## Collecting objects: trees (directories)

In [13]:
( echo -e "100644 blob $file1\\thello.txt" \
; echo -e "100644 blob $file2\\tother.txt" \
) | git mktree

git-mktree (1)       - Build a tree-object from ls-tree formatted text
011ed906a8c5b0c0c14c0cad0a69d3969251b71f


A directory with two files, each referring to its contents by hash

In [14]:
tree1=011ed906a8c5b0c0c14c0cad0a69d3969251b71f
git cat-file -t $tree1
git cat-file -p $tree1

tree
100644 blob 980a0d5f19a64b4b30a87d4206aade58726b60e3	hello.txt
100644 blob 1a0985327d433bdfc3ea3c2b0a0443b3545064ac	other.txt
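
Because a tree's ID is a hash of its entries, rebuilding the same listing anywhere should give the same ID. A minimal sketch, assuming a throwaway repository from `mktemp -d` and git's default SHA-1 object format; `printf` stands in for `echo -e`:

```shell
repo=$(mktemp -d)              # throwaway location, not ~/myrepo
rebuilt=$(
    cd "$repo" && git init -q &&
    b1=$(echo 'Hello World!' | git hash-object -w --stdin) &&
    b2=$(echo 'Something completely different.' | git hash-object -w --stdin) &&
    # mktree reads "<mode> <type> <id><TAB><name>" lines
    printf '100644 blob %s\thello.txt\n100644 blob %s\tother.txt\n' "$b1" "$b2" |
        git mktree
)
echo "$rebuilt"                # prints: 011ed906a8c5b0c0c14c0cad0a69d3969251b71f
rm -rf "$repo"
```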


In [15]:
( echo -e "100644 blob $file1\\tREADME" \
; echo -e "040000 tree $tree1\\tstuff" \
) | git mktree

git-mktree (1)       - Build a tree-object from ls-tree formatted text
c3595f6745f977f2450eeeb5bd94ccd2e4fba498


Another directory, with the first directory nested inside it

In [16]:
tree2=c3595f6745f977f2450eeeb5bd94ccd2e4fba498
git cat-file -p $tree2

100644 blob 980a0d5f19a64b4b30a87d4206aade58726b60e3	README
040000 tree 011ed906a8c5b0c0c14c0cad0a69d3969251b71f	stuff


In [17]:
git ls-tree -tr $tree2

git-ls-tree (1)      - List the contents of a tree object
100644 blob 980a0d5f19a64b4b30a87d4206aade58726b60e3	README
040000 tree 011ed906a8c5b0c0c14c0cad0a69d3969251b71f	stuff
100644 blob 980a0d5f19a64b4b30a87d4206aade58726b60e3	stuff/hello.txt
100644 blob 1a0985327d433bdfc3ea3c2b0a0443b3545064ac	stuff/other.txt


## A *tree* is a "snapshot" of a directory

Just as a *blob* is a snapshot of a file

In [18]:
file3=$( echo 'New and improved.' \
         | git hash-object -t blob -w --stdin )
tree2a=$( ( echo -e "100644 blob $file3\\tREADME" \
          ; echo -e "040000 tree $tree1\\tstuff" \
          ) | git mktree )
echo $tree2a
git ls-tree -tr $tree2a

git-mktree (1)       - Build a tree-object from ls-tree formatted text
674e727fabfeb840b5c4e36f2c33610dfb50458e
100644 blob f25e220dd7c5d3082f9754786f7fd6fcae6db473	README
040000 tree 011ed906a8c5b0c0c14c0cad0a69d3969251b71f	stuff
100644 blob 980a0d5f19a64b4b30a87d4206aade58726b60e3	stuff/hello.txt
100644 blob 1a0985327d433bdfc3ea3c2b0a0443b3545064ac	stuff/other.txt


### Comparing trees
Comparing snapshots is what git's for

In [19]:
git diff-tree -p $tree2 $tree2a

git-diff-tree (1)    - Compares the content and mode of blobs found via two tree objects
diff --git a/README b/README
index 980a0d5..f25e220 100644
--- a/README
+++ b/README
@@ -1 +1 @@
-Hello World!
+New and improved.
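
When only the list of changed paths matters, `--name-status` replaces the patch with one line per file. A minimal sketch, assuming a throwaway repository from `mktemp -d` holding two single-file trees:

```shell
repo=$(mktemp -d)              # throwaway location, not ~/myrepo
summary=$(
    cd "$repo" && git init -q &&
    old=$(echo 'Hello World!'      | git hash-object -w --stdin) &&
    new=$(echo 'New and improved.' | git hash-object -w --stdin) &&
    t_old=$(printf '100644 blob %s\tREADME\n' "$old" | git mktree) &&
    t_new=$(printf '100644 blob %s\tREADME\n' "$new" | git mktree) &&
    # -r recurses into subtrees; M marks a modified path
    git diff-tree -r --name-status "$t_old" "$t_new"
)
echo "$summary"                # prints: M<TAB>README
rm -rf "$repo"
```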


## *index*: "cache" between filesystem and trees

In [20]:
git read-tree $tree2
git ls-files

git-read-tree (1)    - Reads tree information into the index
git-ls-files (1)     - Show information about files in the index and the working tree
README
stuff/hello.txt
stuff/other.txt


In [21]:
git checkout-index -a
ls -lR

git-checkout-index (1) - Copy files from the index to the working tree
.:
total 4
-rw-r--r-- 1 dylan dylan 13 Jun 20 15:31 README
drwxr-xr-x 2 dylan dylan 50 Jun 20 15:31 stuff

./stuff:
total 8
-rw-r--r-- 1 dylan dylan 13 Jun 20 15:31 hello.txt
-rw-r--r-- 1 dylan dylan 32 Jun 20 15:31 other.txt


In [22]:
echo 'New and improved.' > README
git diff

git-diff (1)         - Show changes between commits, commit and working tree, etc
diff --git a/README b/README
index 980a0d5..f25e220 100644
--- a/README
+++ b/README
@@ -1 +1 @@
-Hello World!
+New and improved.

In [23]:
git add README

git-add (1)          - Add file contents to the index


In [24]:
git write-tree
echo $tree2a

git-write-tree (1)   - Create a tree object from the current index
674e727fabfeb840b5c4e36f2c33610dfb50458e
674e727fabfeb840b5c4e36f2c33610dfb50458e
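
The index can produce this tree without any working files at all. A minimal sketch, assuming a throwaway repository from `mktemp -d`, git's default SHA-1 object format, and a scratch index selected through the `GIT_INDEX_FILE` environment variable (the file name `scratch-index` is arbitrary):

```shell
repo=$(mktemp -d)              # throwaway location, not ~/myrepo
result=$(
    cd "$repo" && git init -q &&
    f1=$(echo 'Hello World!' | git hash-object -w --stdin) &&
    f2=$(echo 'Something completely different.' | git hash-object -w --stdin) &&
    f3=$(echo 'New and improved.' | git hash-object -w --stdin) &&
    t1=$(printf '100644 blob %s\thello.txt\n100644 blob %s\tother.txt\n' "$f1" "$f2" | git mktree) &&
    t2=$(printf '100644 blob %s\tREADME\n040000 tree %s\tstuff\n' "$f1" "$t1" | git mktree) &&
    export GIT_INDEX_FILE="$repo/scratch-index" &&
    git read-tree "$t2" &&
    # Point the README entry at the new blob, bypassing the filesystem
    git update-index --add --cacheinfo "100644,$f3,README" &&
    git write-tree
)
echo "$result"                 # prints: 674e727fabfeb840b5c4e36f2c33610dfb50458e
rm -rf "$repo"
```

This is the same edit that `git add README` made above, expressed directly on index entries.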


In [25]:
git rm -f stuff/other.txt
git ls-files

git-rm (1)           - Remove files from the working tree and from the index
rm 'stuff/other.txt'
README
stuff/hello.txt


In [26]:
git mv README README.md
ls

git-mv (1)           - Move or rename a file, a directory, or a symlink
[0m[00mREADME.md[0m  [07mstuff[0m


`git add`, `git mv`, and `git rm` also work on entire directories (`git rm` needs `-r`)

## Versioning: commit (revision)
A *commit* is a tree plus some metadata: parent commits, author, committer, and a message
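
A commit object can be built by hand from a tree to see what it stores. A minimal sketch, assuming a throwaway repository from `mktemp -d`; the identity values are placeholders set through git's environment variables so the example does not depend on user configuration:

```shell
repo=$(mktemp -d)              # throwaway location, not ~/myrepo
commit_text=$(
    cd "$repo" && git init -q &&
    blob=$(echo 'Hello World!' | git hash-object -w --stdin) &&
    tree=$(printf '100644 blob %s\tREADME\n' "$blob" | git mktree) &&
    # Placeholder identity, for illustration only
    export GIT_AUTHOR_NAME='Example' GIT_AUTHOR_EMAIL='example@example.com' &&
    export GIT_COMMITTER_NAME='Example' GIT_COMMITTER_EMAIL='example@example.com' &&
    commit=$(git commit-tree -m 'First snapshot' "$tree") &&
    git cat-file -p "$commit"
)
echo "$commit_text"            # tree line, author, committer, then the message
rm -rf "$repo"
```

A first commit like this one has no `parent` line; later commits would name their predecessors there.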

In [27]:
man gitglossary

GITGLOSSARY(7)                    Git Manual                    GITGLOSSARY(7)



NAME
       gitglossary - A Git Glossary

SYNOPSIS
       *

DESCRIPTION
       alternate object database
           Via the alternates mechanism, a repository can inherit part of its
           object database from another object database, which is called
           "alternate".

       bare repository
           A bare repository is normally an appropriately named directory with
           a .git suffix that does not have a locally checked-out copy of any
           of the files under revision control. That is, all of the Git
           administrative and control files that would normally be present in
           the hidden .git sub-directory are directly present in the
           repository.git directory instead, and no other files are present
           and checked out. Usually publishers of public repositories make
           bare repositories available.

       blob object
           Untyped object,

           to be pre-verified and potentially aborted, and allow for a
           post-notification after the operation is done. The hook scripts are
           found in the $GIT_DIR/hooks/ directory, and are enabled by simply
           removing the .sample suffix from the filename. In earlier versions
           of Git you had to make them executable.

       index
           A collection of files with stat information, whose contents are
           stored as objects. The index is a stored version of your working
           tree. Truth be told, it can also contain a second, and even a third
           version of a working tree, which are used when merging.

       index entry
           The information regarding a particular file, stored in the index.
           An index entry can be unmerged, if a merge was started, but not yet
           finished (i.e. if the index contains multiple versions of that
           file).

       master
           The default development branch. Wheneve

           $GIT_DIR/refs/ directory, or in the $GIT_DIR/packed-refs file.

       reflog
           A reflog shows the local "history" of a ref. In other words, it can
           tell you what the 3rd last revision in this repository was, and
           what was the current state in this repository, yesterday 9:14pm.
           See git-reflog(1) for details.

       refspec
           A "refspec" is used by fetch and push to describe the mapping
           between remote ref and local ref.

       remote-tracking branch
           A regular Git branch that is used to follow changes from another
           repository. A remote-tracking branch should not contain direct
           modifications or have local commits made to it. A remote-tracking
           branch can usually be identified as the right-hand-side ref in a
           Pull: refspec.

       repository
           A collection of refs together with an object database containing
           all objects which are reachable from th