This notebook _roughly_ follows [git-book chapter 10.3 - Git Internals - Git References](https://git-scm.com/book/en/v2/Git-Internals-Git-References).

# Before we git started, let's set up your environment


## NOTE: Run the [git-objects notebook](../1_git-objects/git-objects.ipynb) before running this notebook

In [None]:
from pprint import pprint
from github import Github, GithubException

!sh clean.sh
!sh setup.sh

%env USERNAME="<git config user.name>"
%env USEREMAIL="<git config user.email>"
%env GITHUBACCESSTOKEN=<github access token - follow the instructions - https://help.github.com/en/articles/creating-a-personal-access-token-for-the-command-line>
%env GITHUBUSERNAME=<github user name>
%env GITHUBREPONAME=git-good-remote-repo

# Git Internals - Git References

![A Library](https://upload.wikimedia.org/wikipedia/commons/e/e1/Duke_Humfrey%27s_Library_Interior_6%2C_Bodleian_Library%2C_Oxford%2C_UK_-_Diliff.jpg)

## Git References

If we were interested in seeing the history of our repo from the perspective (reach) of a given commit - say 1a410e - we could run ```git log 1a410e``` to display that history

There is a caveat - we would have to remember the SHA-1 1a410e. Oh dear

It would be nice if we had a file in which we could store that SHA-1 under a simple name that we could use instead...

Enter stage right - git references! These references to SHA-1s are stored as files in the ```.git/refs``` folder
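
Since each reference is a small file holding a SHA-1, listing them takes little more than a directory walk. The following is a minimal sketch rather than how git itself does it: ```list_loose_refs``` is a hypothetical helper name, and it assumes every ref lives as its own file under ```refs/``` (git can also keep refs in a ```packed-refs``` file, which this sketch ignores).

```python
from pathlib import Path


def list_loose_refs(git_dir=".git"):
    """Map each loose ref name (e.g. 'refs/heads/master') to the text in its file."""
    git_dir = Path(git_dir)
    refs = {}
    # Every regular file below <git_dir>/refs is treated as one reference.
    for path in sorted((git_dir / "refs").rglob("*")):
        if path.is_file():
            name = path.relative_to(git_dir).as_posix()
            refs[name] = path.read_text().strip()
    return refs
```

Calling ```list_loose_refs()``` from the repository root returns an empty dictionary until the first ref has been created.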

In [None]:
!find .git/refs \( -type d -printf "%p..\n" , -type f -print \) | sed -e "s/[^-][^\/]*\// |/g" -e "s/|\([^ ]\)/|-\1/"

To create a ref, we first need the SHA-1 of the last commit (the third one we made)

In [None]:
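import subprocess
all_commits_sha_1 = !git cat-file --batch-check --batch-all-objects | awk '$2 == "commit" { print $1 }'
# --batch-all-objects sorts by hash, not age; assuming a linear history, order oldest-first by reachable commit count
all_commits_sha_1 = sorted(all_commits_sha_1, key=lambda sha: int(subprocess.check_output(["git", "rev-list", "--count", sha])))
third_commit_sha_1 = all_commits_sha_1[2]
pprint(third_commit_sha_1)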

And then use ```git update-ref``` to create a reference to the third commit's SHA-1

In [None]:
!git update-ref refs/heads/master $third_commit_sha_1

This should look familiar - what we've done is create a branch! That is what a branch is - a reference to a particular commit. We can now use ```git log``` to print out the referenced commits

In [None]:
!git log --pretty=oneline master

To prove the point even further, we can create a new ref (branch) which will reference the second commit

In [None]:
second_commit_sha_1 = all_commits_sha_1[1]
pprint(second_commit_sha_1)
!git update-ref refs/heads/test $second_commit_sha_1
!git log --pretty=oneline test

Currently, our git database looks something like this:

![Git directory objects with branch head references](https://git-scm.com/book/en/v2/images/data-model-4.png)

When we run commands like ```git branch <branch>```, git (basically) runs ```update-ref``` to write the SHA-1 of the last commit of the branch we are on into the new reference we want to create
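
Stripped to its essentials, creating a branch amounts to writing one SHA-1 into one file. The sketch below illustrates that idea only; ```write_branch_ref``` is a hypothetical helper, not part of git, and it skips the safety checks the real command performs. The git book discourages editing ref files by hand, so treat this as a reading aid and keep using ```git update-ref``` in practice.

```python
import re
from pathlib import Path

# A full SHA-1 is 40 lowercase hexadecimal characters.
SHA1_RE = re.compile(r"[0-9a-f]{40}")


def write_branch_ref(git_dir, branch, commit_sha):
    """Point refs/heads/<branch> at commit_sha by writing the ref file directly."""
    if not SHA1_RE.fullmatch(commit_sha):
        raise ValueError(f"not a full SHA-1: {commit_sha!r}")
    ref_path = Path(git_dir) / "refs" / "heads" / branch
    # Branch names may contain slashes (e.g. 'feature/x'), so create parent folders.
    ref_path.parent.mkdir(parents=True, exist_ok=True)
    ref_path.write_text(commit_sha + "\n")
    return ref_path
```

Note that the sketch does not verify that the SHA-1 names an object that actually exists in the repository.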

## The HEAD

When we run ```git branch <branch>```, how does git know the SHA-1 of the last commit?

The HEAD file! The HEAD file is a symbolic reference to the branch we are currently on. HEAD differs from normal references in that it contains a pointer to another reference

Typically, HEAD will contain text which designates the reference it is pointing to. If we check out our new ref ```test```, HEAD will contain its path

In [None]:
!git checkout test
!cat .git/HEAD

But HEAD can also reference the SHA-1 value of a git object directly. This can happen when you check out a tag, commit, or remote branch. If HEAD contains a SHA-1, your repository is considered to be in the 'detached HEAD' state
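
The two forms HEAD can take can be told apart in a few lines of Python. This is only an illustration that reads the files directly; ```resolve_head``` is a hypothetical helper, and it assumes the target branch is stored as a loose file under ```refs/``` (it does not consult ```packed-refs```).

```python
from pathlib import Path


def resolve_head(git_dir=".git"):
    """Return (ref_name, sha). ref_name is None when HEAD is detached."""
    git_dir = Path(git_dir)
    head = (git_dir / "HEAD").read_text().strip()
    if head.startswith("ref: "):
        # Symbolic ref: HEAD names another reference, e.g. 'ref: refs/heads/test'.
        ref_name = head[len("ref: "):]
        ref_file = git_dir / ref_name
        # The file is absent on a branch with no commits yet, or if the ref is packed.
        sha = ref_file.read_text().strip() if ref_file.is_file() else None
        return ref_name, sha
    # Detached HEAD: the file holds a SHA-1 directly.
    return None, head
```

A ```None``` in the first position signals the detached state; a ```None``` in the second means the sketch could not find a file for the branch HEAD names.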

In [None]:
first_commit_sha_1 = all_commits_sha_1[0]
!git checkout $first_commit_sha_1
!cat .git/HEAD

An alternative to using ```git checkout``` is the ```git symbolic-ref``` command. If we use this command, we must pass the full path to the ref we want to use

#### NOTE: ```git symbolic-ref``` will not update your index!

In [None]:
!git symbolic-ref HEAD refs/heads/master
!cat .git/HEAD

If we try to point HEAD at something outside of the ```refs/``` folder, git will refuse with an error

In [None]:
!git symbolic-ref HEAD master

For now, let's reset our HEAD to master

In [None]:
!git checkout master

## Tags

In the [git-objects notebook](../1_git-objects/git-objects.ipynb), we covered git's three main object types (**blobs**, **trees**, and **commits**), but there is a fourth

This fourth git object is the **tag**. It functions similarly to a commit object, but it generally points to a commit rather than a tree

It is like a branch reference that never moves - it always points to the same commit, giving it a friendlier name

There are two types of tags:

1. Lightweight - a tag reference that never moves

In [None]:
!git update-ref refs/tags/v1.0 $second_commit_sha_1

2. Annotated - a tag reference which references a tag object (rather than pointing directly to the commit)

In [None]:
!git tag -a v1.1 $third_commit_sha_1 -m 'test tag'
tagv11_sha_1 = !cat .git/refs/tags/v1.1
tagv11_sha_1 = tagv11_sha_1[0]
pprint(tagv11_sha_1)

We can see the object by using ```git cat-file```

In [None]:
!git cat-file -p $tagv11_sha_1

Notice that the object entry points to the commit SHA-1 value you tagged


In [None]:
pprint(third_commit_sha_1)

Also, a tag object does not have to point to a commit - we can tag _any_ git object.
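
The fields of a tag object, including the ```type``` line that records what kind of object was tagged, can be pulled apart with a little string handling. This is a rough sketch that works on the text ```git cat-file -p``` prints for an annotated tag; ```parse_tag_object``` is a hypothetical helper, and the sample below (SHA-1, tagger name, email, and timestamp) is made up for illustration.

```python
def parse_tag_object(text):
    """Split the printed form of a tag object into header fields and the message."""
    # Headers and message are separated by the first blank line.
    header_text, _, message = text.partition("\n\n")
    headers = {}
    for line in header_text.splitlines():
        # Each header line is '<key> <value>'.
        key, _, value = line.partition(" ")
        headers[key] = value
    return headers, message.strip()


# Hypothetical sample; every value here is illustrative only.
sample = (
    "object 1a410efbd13591db07496601ebc7a059dd55cfe9\n"
    "type commit\n"
    "tag v1.1\n"
    "tagger Jane Doe <jane@example.com> 1700000000 +0000\n"
    "\n"
    "test tag\n"
)
headers, message = parse_tag_object(sample)
print(headers["type"], headers["object"], message)
```

For a tag that points at something other than a commit, the ```type``` field would name that other object type instead.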

To further prove this point, we can look at a tag whose target is not a commit

The maintainer of the git source code has added their GPG public key as a blob and then tagged it.

We can view this public key by querying the tag ```junio-gpg-pub```

In [None]:
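!git clone https://github.com/git/git.git
# `!cd git` only changes directory inside its own subshell, so point git at the clone with -C instead
!git -C git cat-file blob junio-gpg-pub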

## Remotes

The third type of reference is the remote reference, which tracks the remotes that you add to your repo

If you add a remote called ```origin``` and push your ```master``` branch to it, git will store the SHA-1 you last pushed to that remote in the ```refs/remotes``` directory

In [None]:
token = %env GITHUBACCESSTOKEN
username = %env GITHUBUSERNAME
reponame = %env GITHUBREPONAME
client = Github(token)
user = client.get_user()
try:
    repo = user.create_repo(reponame)
except GithubException:
    repo = client.get_repo(username + "/" + reponame)

In [None]:
!git remote add origin {repo.clone_url}
remote = "https://" + username + ":" + token + "@github.com/" + username + "/" + reponame + ".git"
!git remote set-url --push origin $remote
!git push origin master

We can now see where the ```master``` branch on the ```origin``` remote was the last time we communicated with the server

In [None]:
!cat .git/refs/remotes/origin/master

As you can see, this is the same SHA-1 that is referenced in our local branch

In [None]:
!git show --name-status