# Introduction / review of Git + GitHub workflow

- Git model: repositories, blobs, trees, commits, and branches
- Using remote git repositories: clone, push, pull, remote
- Using GitHub pull requests

# Basic git concepts

## Repository

The basic unit of organization in git; this is where your commits, trees, branches, etc. are all managed. It usually exists as a `.git` directory at the root of your working folder.

## Blob

This is what Git calls its files. A **blob** is a snapshot of the contents of a file at a certain point in time.

## Tree

This is what Git calls its folders. A **tree** is a snapshot of pointers to all the **blobs** and/or **trees** that are inside a folder at a certain point in time.

## Commit

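Git stores a repository's history as a chain of **commit** objects, each linked to its predecessor(s). (With merges this is a graph rather than a simple linked list, since a merge commit has more than one parent.) A **commit** contains the following information:

- The author (and committer) of the commit, with timestamps
- The "parent(s)" of the commit (its predecessor(s))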
- A commit message
- A pointer to the root **tree** of that commit
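
To make the list above concrete, here is a minimal Python sketch of a commit as a small record that points at its parent(s). This is a toy model for illustration, not how git stores objects on disk; the `Commit` class, the `first_parent_history` helper, and the sample author, messages, and tree ids are all made up for this example.

```python
from dataclasses import dataclass
from typing import Iterator, Optional, Tuple


@dataclass(frozen=True)
class Commit:
    """Toy stand-in for a git commit object (not git's real storage format)."""
    author: str
    message: str
    tree: str                            # id of the root tree; a placeholder string here
    parents: Tuple["Commit", ...] = ()   # empty for the very first ("root") commit


def first_parent_history(tip: Commit) -> Iterator[Commit]:
    """Walk from a commit back to the root, following the first parent only."""
    current: Optional[Commit] = tip
    while current is not None:
        yield current
        current = current.parents[0] if current.parents else None


# Hypothetical sample data: two commits, the second pointing back at the first.
root = Commit("Alice", "First commit", tree="tree-1")
second = Commit("Alice", "Second commit", tree="tree-2", parents=(root,))

for commit in first_parent_history(second):
    print(commit.message)
```

Following the parent pointers from the newest commit is essentially what `git log` does when it prints history from newest to oldest.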

# Creating a git repository


To create a git repository ("repo"), just use the `git init` command:

In [1]:
cd data/git-example

/Users/rick446/src/arborian-classes/data/git-example


In [3]:
# could also do the following:
#    $ mkdir my-repo
#    $ cd my-repo
#    $ git init 
!git init my-repo

Reinitialized existing Git repository in /Users/rick446/src/arborian-classes/data/git-example/my-repo/.git/


### Git configuration

Before we get too deep into using git, we need to tell it who we are and what kind of editor we use:

In [11]:
%%bash
git config --global user.name
git config --global user.email
git config --global core.editor

Rick Copeland
rick@arborian.com
subl -w


To add your own configuration, run the following commands with your actual data (the quotes matter when a value contains spaces, as in `Rick Copeland` or `subl -w`):

```bash
$ git config --global user.name "YOURNAME"
$ git config --global user.email "YOUREMAIL"
$ git config --global core.editor "YOUREDITOR"
```

### Creating a new file

We can start creating files in `my-repo` that we want to add to git:

In [5]:
cd my-repo

/Users/rick446/src/arborian-classes/data/git-example/my-repo


In [6]:
%%file README.md
# My very first project

This is my first git repo.

Writing README.md


### Adding the file to the repo via the "index" (staging area)

Although the file exists on our filesystem, the **repo** doesn't (yet) know anything about it. 

Unlike some other version control systems, git requires you to be very explicit about what you want to change in the repo. In this case, we want git to pick up the fact that something has changed about the file `README.md`, so we'll **add** ("stage") the file for our next commit:

In [7]:
!git add README.md

### Checking the status of the repository

At any time, we can ask git about the state of our working directory and staging area using the `git status` command:

In [9]:
!git status

On branch master

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)

	new file:   README.md



### Creating a commit

We can go ahead and create a commit that references our new file with the `git commit` command. The following command will open up your editor and ask for a commit message:

In [12]:
!git commit

[master (root-commit) 67f5fe4] This is my message
 1 file changed, 3 insertions(+)
 create mode 100644 README.md


### Modifying a file

We can modify the file we've created and make another commit:

In [13]:
%%file README.md
# My very first project

This is my first git repo.

Here is some more text

Overwriting README.md


In [14]:
!git status

On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	modified:   README.md

no changes added to commit (use "git add" and/or "git commit -a")


In [15]:
!git diff

diff --git a/README.md b/README.md
index ac42323..3b6485b 100644
--- a/README.md
+++ b/README.md
@@ -1,3 +1,5 @@
 # My very first project
 
 This is my first git repo.
+
+Here is some more text


In [16]:
!git commit

On branch master
Changes not staged for commit:
	modified:   README.md

no changes added to commit


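Nothing happened. Why? `git commit` only records what has been staged in the index, and we changed `README.md` without staging the change.

Let's let git know we'd like to "add" the change to the next commit: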

In [18]:
!git add README.md

In [19]:
!git status

On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

	modified:   README.md



In [20]:
!git commit -m 'You can also specify your commit message on the command-line'

[master 763daca] You can also specify your commit message on the command-line
 1 file changed, 2 insertions(+)
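
The behaviour above, where `git commit` did nothing until the change was staged, can be imitated with a small Python toy. This is only a sketch of the idea of a working directory, an index, and a list of snapshots; `ToyRepo` and its methods are invented for this illustration and are not part of git.

```python
class ToyRepo:
    """Three areas, loosely mirroring git: working directory, index, commits."""

    def __init__(self):
        self.working = {}   # path -> contents on "disk"
        self.index = {}     # path -> contents staged for the next commit
        self.commits = []   # list of snapshots (dicts), oldest first

    def write(self, path, contents):
        """Edit a file in the working directory only."""
        self.working[path] = contents

    def add(self, path):
        """Stage the current contents of one file."""
        self.index[path] = self.working[path]

    def commit(self):
        """Record the index as a new snapshot; return False if nothing new is staged."""
        last = self.commits[-1] if self.commits else {}
        if self.index == last:
            return False    # like "no changes added to commit"
        self.commits.append(dict(self.index))
        return True


repo = ToyRepo()
repo.write("README.md", "first version")
repo.add("README.md")
print(repo.commit())   # True: the staged file is committed

repo.write("README.md", "second version")
print(repo.commit())   # False: the edit was never staged

repo.add("README.md")
print(repo.commit())   # True: now the change is recorded
```

The middle call is the toy equivalent of the `git commit` that reported "no changes added to commit".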


### Getting the history with `git log`

In [22]:
!git log

commit 763daca1fb050d623652754de945e8881337d46a (HEAD -> master)
Author: Rick Copeland <rick@arborian.com>
Date:   Mon Feb 18 15:46:18 2019 -0500

    You can also specify your commit message on the command-line

commit 67f5fe45ca5a46a99e201eee52bbfa3c1c4cfac2
Author: Rick Copeland <rick@arborian.com>
Date:   Mon Feb 18 15:43:29 2019 -0500

    This is my message


### What's the deal with the hex strings?

These are actually *hashes* of the content of the objects (blobs, trees, and commits) stored inside git. You can think of them as pointers to objects that git is tracking.

In [23]:
!git cat-file commit 763daca1fb050d623652754de945e8881337d46a

tree 835b23433c6d2ec42baff3541ad90e39c5bb2724
parent 67f5fe45ca5a46a99e201eee52bbfa3c1c4cfac2
author Rick Copeland <rick@arborian.com> 1550522778 -0500
committer Rick Copeland <rick@arborian.com> 1550522778 -0500

You can also specify your commit message on the command-line


In [24]:
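!git cat-file tree 835b23433c6d2ec42baff3541ad90e39c5bb2724

100644 README.md <20 raw bytes: the binary hash of the README.md blob>

The raw tree format stores each entry's hash as 20 binary bytes rather than hex text, which is why the end of that line is not printable. Running `git cat-file -p` on the same hash pretty-prints the tree with readable hashes.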
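
To see that these really are content hashes, here is a short Python sketch that reproduces the id git gives a blob. It assumes the repository uses git's default SHA-1 object format, in which a blob's id is the SHA-1 of the header `blob <size>` plus a NUL byte, followed by the file contents; `git_blob_id` is just a name chosen for this example.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Return the SHA-1 id git assigns to a blob with these contents."""
    # Header is the object type, a space, the size in bytes, and a NUL byte.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


print(git_blob_id(b""))          # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(git_blob_id(b"hello\n"))   # ce013625030ba8dba906f756967f9e9ca394464a
```

For a file on disk, `git hash-object <file>` should print the same id as `git_blob_id` applied to the file's bytes. Because the id depends only on the contents, two files with identical contents share one blob.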

### Branches - working on more than one thing at once

Git provides you with the ability to create *lightweight*, *moveable* references to commits called *branches*.

To create a new branch, we just need to execute the `git branch` command:

In [25]:
!git branch branch1

In [26]:
!git branch

  branch1
* master


If we make a commit right now, then the *master* branch will move so that it points to our new commit.

We don't want to do that, so we'll *switch* to branch1 by executing `git checkout`, which sets the current branch and updates our working directory to match the new current branch:

In [27]:
!git checkout branch1

Switched to branch 'branch1'


In [28]:
!git branch

* branch1
  master


Now we can make another commit (on branch1 this time) and see what the log shows us. Let's create a new file:

In [29]:
%%file LICENSE.txt
All rights reserved.

Writing LICENSE.txt


In [30]:
!git add LICENSE.txt

In [31]:
!git commit -m "Added license so our stuff doesn't get stolen"

[branch1 761a39e] Added license so our stuff doesn't get stolen
 1 file changed, 1 insertion(+)
 create mode 100644 LICENSE.txt


In [35]:
!git log --decorate

commit 761a39eba3815742a1ae06eea14ed6434521118e (HEAD -> branch1)
Author: Rick Copeland <rick@arborian.com>
Date:   Mon Feb 18 15:53:37 2019 -0500

    Added license so our stuff doesn't get stolen

commit 763daca1fb050d623652754de945e8881337d46a (master)
Author: Rick Copeland <rick@arborian.com>
Date:   Mon Feb 18 15:46:18 2019 -0500

    You can also specify your commit message on the command-line

commit 67f5fe45ca5a46a99e201eee52bbfa3c1c4cfac2
Author: Rick Copeland <rick@arborian.com>
Date:   Mon Feb 18 15:43:29 2019 -0500

    This is my message


Note that, while *branch1* points to the new commit, *master* is still pointing to our previous commit.
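
The "moveable reference" idea can be sketched in a few lines of Python. This is a toy model, not git's implementation: `ToyBranches` is an invented class that treats branches as a mapping from names to commit ids, with `head` naming the current branch. The commit ids are the abbreviated ones from the log above.

```python
class ToyBranches:
    """Branches as a name -> commit-id mapping, plus HEAD naming the current branch."""

    def __init__(self, first_commit):
        self.refs = {"master": first_commit}
        self.head = "master"

    def branch(self, name):
        """Create a branch pointing at the current commit (HEAD does not move)."""
        self.refs[name] = self.refs[self.head]

    def checkout(self, name):
        """Make an existing branch the current one."""
        if name not in self.refs:
            raise KeyError(f"no such branch: {name}")
        self.head = name

    def commit(self, new_commit):
        """Only the current branch advances to the new commit."""
        self.refs[self.head] = new_commit


repo = ToyBranches("763daca")
repo.branch("branch1")
repo.checkout("branch1")
repo.commit("761a39e")
print(repo.refs)   # {'master': '763daca', 'branch1': '761a39e'}
```

Creating a branch copies one pointer rather than any files, which is why branches are described as lightweight.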

Let's create another branch:

In [36]:
%%bash
git checkout master   # we want to branch off master, not branch1
git branch branch2    # create the new branch
git checkout branch2  # ... and switch to it

Switched to branch 'master'
Switched to branch 'branch2'


In [37]:
!git log --decorate

commit 763daca1fb050d623652754de945e8881337d46a (HEAD -> branch2, master)
Author: Rick Copeland <rick@arborian.com>
Date:   Mon Feb 18 15:46:18 2019 -0500

    You can also specify your commit message on the command-line

commit 67f5fe45ca5a46a99e201eee52bbfa3c1c4cfac2
Author: Rick Copeland <rick@arborian.com>
Date:   Mon Feb 18 15:43:29 2019 -0500

    This is my message


In [38]:
!git status

On branch branch2
nothing to commit, working tree clean


In [39]:
%%file README.md
# My very first project

This is my first git repo.

Here is some more text.

Here, I'm adding some text on branch1 because I want to see how merging works

Overwriting README.md


In [40]:
!git add README.md
!git commit -m 'Make a change on branch1'

[branch2 23bcd7b] Make a change on branch1
 1 file changed, 3 insertions(+), 1 deletion(-)


In [41]:
!git log --decorate --all

commit 23bcd7b719dab56f48ebfde270e10dece98135ac (HEAD -> branch2)
Author: Rick Copeland <rick@arborian.com>
Date:   Mon Feb 18 15:57:51 2019 -0500

    Make a change on branch1

commit 761a39eba3815742a1ae06eea14ed6434521118e (branch1)
Author: Rick Copeland <rick@arborian.com>
Date:   Mon Feb 18 15:53:37 2019 -0500

    Added license so our stuff doesn't get stolen

commit 763daca1fb050d623652754de945e8881337d46a (master)
Author: Rick Copeland <rick@arborian.com>
Date:   Mon Feb 18 15:46:18 2019 -0500

    You can also specify your commit message on the command-line

commit 67f5fe45ca5a46a99e201eee52bbfa3c1c4cfac2
Author: Rick Copeland <rick@arborian.com>
Date:   Mon Feb 18 15:43:29 2019 -0500

    This is my message


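(The commit message says "branch1", but as the log shows, this commit was actually made on *branch2*.)

There are a few useful flags we can use with `git log` to see our branches graphically: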

- `--decorate` - show which commits branches point to 
- `--all` - show all branches, not just the current one
- `--oneline` - only use a single line per commit
- `--graph` - show the branching graphically

In [42]:
!git log --decorate --all --oneline --graph

* 23bcd7b (HEAD -> branch2) Make a change on branch1
| * 761a39e (branch1) Added license so our stuff doesn't get stolen
|/
* 763daca (master) You can also specify your commit message on the command-line
* 67f5fe4 This is my message


### Merging commits: fast-forward

We can incorporate the changes from one branch into another branch by "merging" it. This can take one of two forms. The first is a "fast-forward" merge: because the commit that "master" points to is an ancestor of the commit that "branch1" points to, we can merge "branch1" into "master" by just moving master to point to the same commit as branch1:

In [43]:
!git checkout master    # checkout the "target" branch of the merge
!git merge branch1      # merge the changes from branch1 into the current branch

Switched to branch 'master'
Updating 763daca..761a39e
Fast-forward
 LICENSE.txt | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 LICENSE.txt


In [44]:
!git log --decorate --all --oneline --graph

* 23bcd7b (branch2) Make a change on branch1
| * 761a39e (HEAD -> master, branch1) Added license so our stuff doesn't get stolen
|/
* 763daca You can also specify your commit message on the command-line
* 67f5fe4 This is my message


### Merging commits without fast-forward

It's not always possible to merge by fast-forwarding the target branch: here, *master* and *branch2* each have a commit that the other lacks. In these cases, git creates a *merge commit*, which has both branch tips as parents and contains git's best guess (which is usually pretty good) at combining the contents of both branches:

In [45]:
!git merge branch2

Merge made by the 'recursive' strategy.
 README.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)


In [46]:
!git log --decorate --all --oneline --graph

*   911f10a (HEAD -> master) Merge branch 'branch2'
|\
| * 23bcd7b (branch2) Make a change on branch1
* | 761a39e (branch1) Added license so our stuff doesn't get stolen
|/
* 763daca You can also specify your commit message on the command-line
* 67f5fe4 This is my message
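
The choice between the two merge forms comes down to ancestry in the commit graph. The following Python sketch models only that decision; it does not combine file contents the way a real merge does. The `ancestors` and `merge` functions are invented for this illustration, and the parent links are read off the log output above.

```python
def ancestors(commit, parents):
    """Return the set containing `commit` and everything reachable from it."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def merge(target, source, parents, merge_id="merge"):
    """Return (kind, resulting commit id) for merging `source` into `target`."""
    if source in ancestors(target, parents):
        return "already up to date", target
    if target in ancestors(source, parents):
        return "fast-forward", source        # just move the branch pointer
    parents[merge_id] = (target, source)     # new commit with two parents
    return "merge commit", merge_id


# Parent links taken from the log output above (abbreviated ids).
parents = {
    "67f5fe4": (),
    "763daca": ("67f5fe4",),
    "761a39e": ("763daca",),   # branch1
    "23bcd7b": ("763daca",),   # branch2
}

print(merge("763daca", "761a39e", parents))             # ('fast-forward', '761a39e')
print(merge("761a39e", "23bcd7b", parents, "911f10a"))  # ('merge commit', '911f10a')
```

The first call mirrors merging branch1 into master, and the second mirrors merging branch2, where neither tip is an ancestor of the other.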


### Cleaning up old branches

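When we're done with a branch, we can delete it with `git branch -d`. This removes only the branch name; the commits stay in the history because *master* still reaches them: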

In [48]:
!git branch -d branch1 branch2

Deleted branch branch1 (was 761a39e).
Deleted branch branch2 (was 23bcd7b).


In [49]:
!git log --decorate --all --oneline --graph

*   911f10a (HEAD -> master) Merge branch 'branch2'
|\
| * 23bcd7b Make a change on branch1
* | 761a39e Added license so our stuff doesn't get stolen
|/
* 763daca You can also specify your commit message on the command-line
* 67f5fe4 This is my message


In [51]:
!git branch

* master


# Using git remotes and GitHub

Normally, we won't just be working locally; we'll be working on a "clone" (a local copy) of a repo hosted on another machine.

Git keeps track of repos on other machines by using "remotes," but we usually won't need to set those up by hand. Instead, we'll usually start our project by cloning a repo from GitHub, which configures a remote named 'origin' for us:

In [53]:
%%bash
cd ..
rm -fr my-repo

/Users/rick446/src/arborian-classes/data/git-example


In [55]:
!git clone https://github.com/Arborian/my-repo.git

Cloning into 'my-repo'...
remote: Enumerating objects: 14, done.
remote: Counting objects: 100% (14/14), done.
remote: Compressing objects: 100% (7/7), done.
remote: Total 14 (delta 3), reused 14 (delta 3), pack-reused 0
Unpacking objects: 100% (14/14), done.


In [56]:
cd my-repo

/Users/rick446/src/arborian-classes/data/git-example/my-repo


### Pushing local changes to the remote

If we've made some local commits that we'd like to sync up to the remote, we can do a 'push' of those commits to the remote with the `git push` command:

In [64]:
%%file README.md
# My very first project

This is my first git repo.

Here is some more text.

This is a commit that will be pushed to the remote

Overwriting README.md


In [65]:
!git add README.md
!git commit -m 'Change for remote pushing'
!git status

[master 05cfbd0] Change for remote pushing
 1 file changed, 1 insertion(+), 1 deletion(-)
On branch master
Your branch is ahead of 'origin/master' by 1 commit.
  (use "git push" to publish your local commits)

nothing to commit, working tree clean


In [66]:
!git push

Counting objects: 3, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 387 bytes | 387.00 KiB/s, done.
Total 3 (delta 0), reused 0 (delta 0)
To github.com:rick446/my-repo.git
   911f10a..05cfbd0  master -> master


# Working with "forks"

GitHub introduced a concept known as a "fork."

A fork is a **remote clone** of another repo (typically referred to as the "upstream" repo) that *you own.*

To create a fork, you visit the upstream repo on GitHub and click the 'Fork' button.

Once you've created your fork, you `git clone` your fork, and then you can push/pull to/from your fork whenever you want.

### Managing the upstream with a second remote

Sometimes it can be handy to have a remote reference to both the upstream (which we'll call 'upstream') and your fork (which we'll call 'origin'). In order to add a new remote, we just use the `git remote add` command:

In [67]:
cd ..

/Users/rick446/src/arborian-classes/data


In [68]:
!rm -fr my-repo

In [None]:
!git clone https://github.com/rick446/my-repo.git

In [73]:
cd my-repo/

/Users/rick446/src/arborian-classes/data/my-repo


In [75]:
!git remote add upstream https://github.com/Arborian/my-repo.git

### Fetching changes from upstream

Once we have our upstream reference, suppose someone makes a change to it and we need to merge those changes into our local 'master'. To do this, we can use the `git pull upstream master` command, which fetches upstream's master and merges the changes into our local branch:

In [76]:
!git pull upstream master

From https://github.com/Arborian/my-repo
 * branch            master     -> FETCH_HEAD
Updating 05cfbd0..2362d18
Fast-forward
 README.md | 3 +++
 1 file changed, 3 insertions(+)
Current branch master is up to date.


In [77]:
!git log --oneline --graph --decorate --all

* 2362d18 (HEAD -> master, upstream/master) Update README.md
* 05cfbd0 (origin/master, origin/HEAD) Change for remote pushing
*   911f10a Merge branch 'branch2'
|\
| * 23bcd7b Make a change on branch1
* | 761a39e Added license so our stuff doesn't get stolen
|/
* 763daca You can also specify your commit message on the command-line
* 67f5fe4 This is my message


Once I have the changes locally, I can push them to my fork to make sure everything stays up-to-date:

In [78]:
!git push

Counting objects: 3, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 759 bytes | 759.00 KiB/s, done.
Total 3 (delta 1), reused 0 (delta 0)
remote: Resolving deltas: 100% (1/1), completed with 1 local object.
To github.com:rick446/my-repo.git
   05cfbd0..2362d18  master -> master
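
The way the three repositories move in the pull and push above can be traced with a small Python sketch. It is a deliberately simplified model in which each repository is just a mapping from branch names to commit ids, and only the fast-forward case is handled; the `pull` and `push` functions here are illustrative stand-ins, not git's implementation.

```python
# Toy model: each repository is just a mapping of branch name -> commit id.
upstream = {"master": "2362d18"}   # someone updated the upstream repo
fork = {"master": "05cfbd0"}       # our fork on the server ("origin")
local = {"master": "05cfbd0", "origin/master": "05cfbd0"}


def pull(local_refs, remote_name, remote_refs, branch):
    """Fetch, then merge; only the fast-forward case is modeled here."""
    tracking = f"{remote_name}/{branch}"
    local_refs[tracking] = remote_refs[branch]   # "fetch": record what the remote has
    local_refs[branch] = local_refs[tracking]    # "merge": fast-forward the local branch


def push(local_refs, remote_name, remote_refs, branch):
    """Send the local branch to the remote and update the tracking ref."""
    remote_refs[branch] = local_refs[branch]
    local_refs[f"{remote_name}/{branch}"] = local_refs[branch]


pull(local, "upstream", upstream, "master")
print(local["master"], local["origin/master"])   # 2362d18 05cfbd0

push(local, "origin", fork, "master")
print(fork["master"], local["origin/master"])    # 2362d18 2362d18
```

The first printed line matches the log above, where *master* and *upstream/master* are at 2362d18 while *origin/master* is still at 05cfbd0 until the push.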


### Requesting that our changes be incorporated upstream with "pull requests"

Suppose we've made a number of changes to our fork (pushing them all up from our local clone) and we'd like to have them incorporated into the upstream. 

For this case, GitHub introduced the concept of "pull requests," which are requests on github.com that a maintainer of the upstream project merge in changes from your fork.

In [79]:
%%file README.md
# My very first project

This is my first git repo.

Here is some more text.

This is a commit that will be pushed to the remote

This is a change so I can do a pull request

Overwriting README.md


In [80]:
!git add README.md

In [81]:
!git commit -m 'Change for pull request'

[master e002746] Change for pull request
 1 file changed, 1 insertion(+), 2 deletions(-)


In [83]:
!git push

Counting objects: 3, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 357 bytes | 357.00 KiB/s, done.
Total 3 (delta 1), reused 0 (delta 0)
remote: Resolving deltas: 100% (1/1), completed with 1 local object.
To github.com:rick446/my-repo.git
   2362d18..e002746  master -> master


If I go to my fork on GitHub now, I can click the displayed button to create a pull request:

<img src="data/img/pull-request-1.png">

Once I complete the pull request by filling out the form, the upstream maintainer is notified of the request and can choose to incorporate the changes by simply clicking 'Merge pull request':

<img src="data/img/pull-request-upstream.png">

# Workflow

Overall, then, we can work with forks, remotes, and clones using the following diagram as a guide:

<img src="data/img/github-workflow.png">

### Setup

1. (on github.com) fork the upstream repository
1. (locally) clone fork to your local machine
1. (locally) create a reference to the upstream with `git remote add`

### Day-to-day work

1. (locally) make changes and commits
1. (locally) push changes to your fork

### Keeping up-to-date with upstream

1. (locally) `git pull upstream master` 

### Requesting that your work be incorporated upstream

1. (locally) push all changes to your fork
1. (on github.com) create a pull request
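
As a compact reminder, the local steps of the workflow above can be written down as the git commands they correspond to. The Python sketch below only builds and prints the command lines; nothing is executed. The function name `fork_workflow`, the file name `README.md`, and the commit message are placeholders chosen for this example, while the URLs are the fork and upstream used earlier. Forking and creating the pull request happen on github.com, so they do not appear.

```python
import shlex


def fork_workflow(fork_url, upstream_url, directory="my-repo"):
    """Return the git commands for each workflow stage (nothing is executed)."""
    in_repo = ["git", "-C", directory]   # run git as if started inside `directory`
    return {
        "setup": [
            ["git", "clone", fork_url, directory],
            in_repo + ["remote", "add", "upstream", upstream_url],
        ],
        "day-to-day": [
            in_repo + ["add", "README.md"],
            in_repo + ["commit", "-m", "Describe the change"],
            in_repo + ["push"],
        ],
        "keep up to date": [
            in_repo + ["pull", "upstream", "master"],
        ],
    }


commands = fork_workflow(
    "https://github.com/rick446/my-repo.git",    # the fork, from the cells above
    "https://github.com/Arborian/my-repo.git",   # the upstream, from the cells above
)
for stage, argv_lists in commands.items():
    print(f"# {stage}")
    for argv in argv_lists:
        print(shlex.join(argv))
```

Keeping each command as a list of arguments means the same lists could later be handed to `subprocess.run` if you wanted to automate the steps.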

# Lab

Open [GitHub lab][github-lab]

[github-lab]: ./github-lab.ipynb