# Git and GitHub Tutorial
*This tutorial will walk you through some of the basics of git (repos, committing, stashing, branches) before applying some of these concepts in a social coding context (i.e., GitHub).*

IMPORTANT: You must either use a bash shell or retype these commands in a terminal. This is not Python code.

IMPORTANT: If you're on Windows, you need to download a git bash terminal (by going [here](https://git-scm.com/downloads) for instance). Avoid using the graphical user interface (GUI) because it'll be harder to grasp the concepts in the long run. Also, many of the help guides use the terminal instead of the GUI.

*Reasons to use Git
1. Version control with branches
2. Free and open source
3. Lends itself well to collaborative coding tools (like GitHub)*

*Reasons to use GitHub
1. Greatly facilitates collaborative coding through repositories (or "repos")
2. Issue tracking (both code bugs and suggested enhancements)
3. Ease of reviewing code changes
4. Continuous integration (CI) for automatic code checking
5. Octocat*

# The basics
`git` *is just a piece of software that you can download.*

*If you don't already have it, you can get it by following [these instructions](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git).*

In [1]:
mkdir my_project



In [2]:
cd my_project
ls -a

.  ..


In [3]:
git init

Initialized empty Git repository in /Users/rkp/Dropbox/Repositories/dbw_2015/tutorial_git/my_project/.git/


*When you initialize a repository, git creates a hidden directory called .git, which is where your project history will be stored.*

*At this point your project is entirely local, so even though all the versions will be stored, if you delete the repository you will lose all of your work. We will learn how to use GitHub in a little bit.*
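
*A minimal sketch of the point above, run in a throwaway directory created with `mktemp` (an assumption, so that nothing in your real project is touched): after `git init`, the whole repository lives in the `.git` directory.*

```shell
set -e                                   # stop at the first failing command
repo=$(mktemp -d)                        # throwaway sandbox directory
cd "$repo"
git init -q

git_dir=$(git rev-parse --git-dir)       # where git keeps the project history
inside=$(git rev-parse --is-inside-work-tree)

echo "history is stored in: $git_dir"
echo "inside a work tree:   $inside"
```

*Deleting the sandbox directory (or just its `.git` directory) would delete the history too, which is why a remote copy is worth having.*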

In [4]:
ls -a

.    ..   .git


`git status` *allows you to look at the current state of your working directory.*

In [5]:
git status

# On branch master
#
# Initial commit
#
nothing to commit (create/copy files and use "git add" to track)


### *Creating a new file and committing it.*
*Make a file called "my_code.py" inside the "my_project" directory and add the following lines:*
```
from __future__ import division
import numpy as np
import matplotlib.pyplot as plt
```

In [6]:
git status

# On branch master
#
# Initial commit
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#	my_code.py
nothing added to commit but untracked files present (use "git add" to track)


*The command *`git add <file>`* moves a new or modified file to the staging area.*

In [7]:
git add my_code.py



In [8]:
git status

# On branch master
#
# Initial commit
#
# Changes to be committed:
#   (use "git rm --cached <file>..." to unstage)
#
#	new file:   my_code.py
#


*Note: files can be removed from the staging area using * `git reset`.
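
*To see staging and unstaging side by side, here is a small sketch in a throwaway repository. The file `notes.txt` and the user identity are made up for illustration, and `git status --short` is used only because its one-line output is easy to compare.*

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"      # hypothetical identity for this sandbox only
git config user.email "user@example.com"

echo "import numpy as np" > my_code.py
git add my_code.py
git commit -q -m "Add imports to code."

echo "scratch notes" > notes.txt         # hypothetical extra file
untracked=$(git status --short)          # "?? notes.txt": git sees it but does not track it
git add notes.txt
staged=$(git status --short)             # "A  notes.txt": queued for the next commit
git reset -q -- notes.txt
unstaged=$(git status --short)           # "?? notes.txt" again: unstaged, file still on disk

echo "before add:  $untracked"
echo "after add:   $staged"
echo "after reset: $unstaged"
```

*Note that* `git reset` *only changes the staging area here; the file itself is left alone.*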

*The command *`git commit -m <message>`* saves the changes in the staging area to your project history.*

In [9]:
git commit -m "Add imports to code."

[master (root-commit) 19f5f66] Add imports to code.
 1 file changed, 3 insertions(+)
 create mode 100644 my_code.py


In [10]:
git status

# On branch master
nothing to commit (working directory clean)


### *Committing modifications to a file.*
*Add the following lines to my_code.py:*
```
def print_hi():
    print 'hi'
```

In [11]:
git add my_code.py
git commit -m "Add print_hi function."

[master a9f7b16] Add print_hi function.
 1 file changed, 4 insertions(+)


*You can use *`git log`* (with various options) to view your project history. (Note that this will look nicer in your terminal window.)*

`HEAD` *tells you what commit you're currently looking at.*

In [12]:
git log --oneline --decorate --all

a9f7b16 (HEAD, master) Add print_hi function.
19f5f66 Add imports to code.

### *Looking at past commits*
*You can view a previous commit using:* `git checkout <commit>` 

*(Note that each commit hash is computed from the commit's contents and metadata, including the author and timestamp, so the hashes in your repository will differ from the ones shown here.)*
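
*A commit hash is a checksum of the commit object itself, which is why it is specific to each repository. The sketch below, run in a throwaway repository with a made-up identity, re-hashes the raw commit object and compares the result with what `git rev-parse` reports.*

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"      # hypothetical identity for this sandbox only
git config user.email "user@example.com"

echo "import numpy as np" > my_code.py
git add my_code.py
git commit -q -m "Add imports to code."

full_hash=$(git rev-parse HEAD)
short_hash=$(git rev-parse --short HEAD) # the abbreviated form shown by --oneline

# Re-hash the raw commit object (tree, author, committer, timestamps, message).
recomputed=$(git cat-file commit HEAD | git hash-object -t commit --stdin)

echo "$short_hash is short for $full_hash"
echo "recomputed from the commit object: $recomputed"
```

*Because the author, timestamp, and parent commit all feed into the hash, running the same steps twice gives different hashes.*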

In [14]:
git checkout 19f5f66

Note: checking out '19f5f66'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b new_branch_name

HEAD is now at 19f5f66... Add imports to code.


*Now go take a look at *`my_code.py`*. You won't see your recent changes there. Why not?*

In [15]:
git log --oneline --decorate --all

a9f7b16 (master) Add print_hi function.
19f5f66 (HEAD) Add imports to code.

*If you wanted to modify this code you would make a new branch from this commit, which we'll talk about later.*

*To return the *`HEAD`* to the most recent commit, use:*  `git checkout master`

In [16]:
git checkout master

Previous HEAD position was 19f5f66... Add imports to code.
Switched to branch 'master'
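
*The round trip above can be sketched in a throwaway repository. The identity is made up, and the branch name is read from git rather than assumed, since newer setups may call the default branch `main` instead of `master`.*

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"      # hypothetical identity for this sandbox only
git config user.email "user@example.com"
main_branch=$(git symbolic-ref --short HEAD)   # "master" or "main", depending on your setup

echo "import numpy as np" > my_code.py
git add my_code.py
git commit -q -m "Add imports to code."
first=$(git rev-parse HEAD)

printf 'def print_hi():\n    print("hi")\n' >> my_code.py
git commit -q -am "Add print_hi function."

git checkout -q "$first"                 # HEAD now points at a commit, not a branch
if git symbolic-ref -q HEAD > /dev/null; then state="on a branch"; else state="detached"; fi
if grep -q "print_hi" my_code.py; then has_fn="yes"; else has_fn="no"; fi

git checkout -q "$main_branch"           # reattach HEAD to the branch
back_on=$(git symbolic-ref --short HEAD)

echo "at the first commit: HEAD was $state, print_hi present: $has_fn"
echo "now back on: $back_on"
```

*The working directory always matches whatever* `HEAD` *points at, which is why the function disappears and then comes back.*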


*If you want to make sure this worked correctly and took you back to your most recent commit, look at *`my_code.py`.

In [17]:
git log --oneline --decorate --all

a9f7b16 (HEAD, master) Add print_hi function.
19f5f66 Add imports to code.

*To undo a commit use:* `git revert HEAD` *(or* `git revert HEAD~3..HEAD` *to undo the last three commits, for example).*

*Note that below we've used the *`--no-edit`* option, since Jupyter won't open a text editor. If you run this in the command line without the *`--no-edit`* option, you'll be taken to a text editor (most likely vim) that allows you to add more detail to your commit message. To enter text press "i", which will take you to insert mode. To exit insert mode press "esc". To save the commit message and close the text editor type ":wq" and hit "return".*

In [18]:
git revert HEAD --no-edit

[master 9daa507] Revert "Add print_hi function."
 1 file changed, 4 deletions(-)


In [19]:
git log --oneline --decorate --all

9daa507 (HEAD, master) Revert "Add print_hi function."
a9f7b16 Add print_hi function.
19f5f66 Add imports to code.

*Now take a look at * `my_code.py`. *You should see that your changes have been undone.*

*Note that *`git revert`* does not actually erase anything from your project history. Instead it makes a new commit in which previous changes have been undone (this means that you can undo this revert as if it were any other commit).*
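
*As a quick check of this behavior, the sketch below (throwaway repository, made-up identity) reverts a commit and then counts the commits: the history grows by one, while the file content goes back to what it was.*

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"      # hypothetical identity for this sandbox only
git config user.email "user@example.com"

echo "import numpy as np" > my_code.py
git add my_code.py
git commit -q -m "Add imports to code."
original=$(cat my_code.py)

printf 'def print_hi():\n    print("hi")\n' >> my_code.py
git commit -q -am "Add print_hi function."

git revert --no-edit HEAD > /dev/null    # adds a new commit that undoes the last one

commits=$(git rev-list --count HEAD)     # both earlier commits are still in the history
subject=$(git log -1 --format=%s)
after=$(cat my_code.py)

echo "commits in history: $commits"
echo "latest commit:      $subject"
```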

*Now add the change that you actually wanted:*
```
def print_bye():
    print 'bye'
```

In [20]:
git add my_code.py
git commit -m "Add print_bye function."

[master db94f78] Add print_bye function.
 1 file changed, 4 insertions(+)


In [21]:
git log --oneline --decorate --all

db94f78 (HEAD, master) Add print_bye function.
9daa507 Revert "Add print_hi function."
a9f7b16 Add print_hi function.
19f5f66 Add imports to code.

### *Stashing*
*Git will not let you check out a previous commit if doing so would overwrite files that you have modified in your working directory since your last commit.*

*For example, say that you've started another modification to *`my_code.py`* by appending the lines:*
```
def print_why():
    pass
    # TODO: Fill in this later
```
*Now try to check out one of your earlier commits and note that git won't let you do it.*

In [22]:
git checkout 19f5f66

error: Your local changes to the following files would be overwritten by checkout:
	my_code.py
Please, commit your changes or stash them before you can switch branches.
Aborting


*To save these changes for later but without committing them, use * `git stash`.

In [23]:
git stash

Saved working directory and index state WIP on master: db94f78 Add print_bye function.
HEAD is now at db94f78 Add print_bye function.


*You can view all your stashes with *`git stash list` *:*

In [24]:
git stash list

stash@{0}: WIP on master: db94f78 Add print_bye function.

*Now that you've stashed your changes, you can move your *`HEAD`* to a different commit.*

In [26]:
git checkout 19f5f66

Note: checking out '19f5f66'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b new_branch_name

HEAD is now at 19f5f66... Add imports to code.


*Once you're done looking around go back to the most recent commit.*

In [27]:
git checkout master

Previous HEAD position was 19f5f66... Add imports to code.
Switched to branch 'master'


*If you look around after having returned to the most recent commit, you'll notice that the changes you stashed are no longer there. To get them back, we need to reapply the stashed changes by using the following command:*

In [28]:
git stash pop stash@{0}

# On branch master
# Changes not staged for commit:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#	modified:   my_code.py
#
no changes added to commit (use "git add" and/or "git commit -a")
Dropped stash@{0} (eb2c4433b9aa62b5550eafff64bc06c1fb9f1005)
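
*The whole stash cycle can be sketched in a throwaway repository (made-up identity). Here* `git diff --quiet` *is used only as a yes/no test for uncommitted changes to tracked files.*

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"      # hypothetical identity for this sandbox only
git config user.email "user@example.com"

echo "import numpy as np" > my_code.py
git add my_code.py
git commit -q -m "Add imports to code."

echo "# TODO: fill this in later" >> my_code.py          # uncommitted work in progress
if git diff --quiet; then before="clean"; else before="modified"; fi

git stash -q                                              # shelve the change
if git diff --quiet; then during="clean"; else during="modified"; fi
stashes_during=$(git stash list | wc -l | tr -d ' ')

git stash pop -q                                          # reapply it and drop the stash
if git diff --quiet; then after="clean"; else after="modified"; fi
stashes_after=$(git stash list | wc -l | tr -d ' ')

echo "before stash: $before"
echo "while stashed: $during ($stashes_during stash saved)"
echo "after pop: $after ($stashes_after stashes left)"
```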


*Finish up your edits to *`my_code.py`* and commit before moving on to the next section.*
```
def print_why():
    print 'why'
```

In [29]:
git add my_code.py
git commit -m "Add print_why function."

[master 4902e37] Add print_why function.
 1 file changed, 4 insertions(+)


# Branches

*Branches allow you to test out features, bug fixes, etc., before integrating them into your main project.*

*For example, say you want to add a new function to *`my_code.py`*. To do this create a branch called *`feature0`* that you can work on without messing up your main project.*

*IMPORTANT: The new branch inherits whatever changes are on the current branch. Most of the time, you'll want to be on *`master`* when you create a new branch to avoid inheriting changes from other branches.*

*Running *`git branch`* on its own shows you all the branches and indicates which one you're on.*

In [30]:
git branch

* master


*Running* `git branch` *with an argument creates a new branch.*

In [31]:
git branch feature0



In [32]:
git branch

  feature0
* master


*The following shows you all your commits, with the branches and *`HEAD`* labeled.*

In [33]:
git log --oneline --all --decorate

4902e37 (HEAD, master, feature0) Add print_why function.
db94f78 Add print_bye function.
9daa507 Revert "Add print_hi function."
a9f7b16 Add print_hi function.
19f5f66 Add imports to code.

*Important note: a branch is just a pointer to a specific commit. While it is often intuitively related to the tree structure of your project, this is not always the case.*
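
*One way to convince yourself that a branch is only a pointer is to compare hashes, as in this sketch (throwaway repository, made-up identity): a freshly created branch resolves to the same commit as the branch it was created from, and only moves once you commit on it.*

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"      # hypothetical identity for this sandbox only
git config user.email "user@example.com"
main_branch=$(git symbolic-ref --short HEAD)   # "master" or "main", depending on your setup

echo "import numpy as np" > my_code.py
git add my_code.py
git commit -q -m "Add imports to code."

git branch feature0                       # creates a new pointer, copies nothing
head_hash=$(git rev-parse HEAD)
feature_hash=$(git rev-parse feature0)    # same commit as HEAD

git checkout -q feature0
printf 'def print_hello():\n    print("hello")\n' >> my_code.py
git commit -q -am "Add print_hello function."

moved_feature=$(git rev-parse feature0)   # the feature0 pointer has moved forward
main_hash=$(git rev-parse "$main_branch") # the original branch has not moved

echo "at creation:  feature0=$feature_hash HEAD=$head_hash"
echo "after commit: feature0=$moved_feature $main_branch=$main_hash"
```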

*If *`git checkout`* is followed by a branch name, it will move your *`HEAD`* to that branch.*

In [34]:
git checkout feature0

Switched to branch 'feature0'


In [35]:
git branch

* feature0
  master


*Now add the following function (your new feature) to my_code.py and commit your changes:*
```
def print_hello():
    print 'hello'
```

In [36]:
git add my_code.py
git commit -m "Add print_hello function."

[feature0 93278e3] Add print_hello function.
 1 file changed, 4 insertions(+)


In [37]:
git log --oneline --decorate --all

93278e3 (HEAD, feature0) Add print_hello function.
4902e37 (master) Add print_why function.
db94f78 Add print_bye function.
9daa507 Revert "Add print_hi function."
a9f7b16 Add print_hi function.
19f5f66 Add imports to code.

*Now go back to your *`master`* branch.*

In [38]:
git checkout master

Switched to branch 'master'


In [39]:
git branch

  feature0
* master


In [40]:
git log --oneline --decorate --all

93278e3 (feature0) Add print_hello function.
4902e37 (HEAD, master) Add print_why function.
db94f78 Add print_bye function.
9daa507 Revert "Add print_hi function."
a9f7b16 Add print_hi function.
19f5f66 Add imports to code.

*Take a look at *`my_code.py`*. You can see that the changes you made in the *`feature0`* branch haven't affected the *`master`* branch.*

*If you are satisfied with your new feature, you can incorporate it into the main project by merging the feature0 branch into the master branch:*

In [41]:
git merge feature0

Updating 4902e37..93278e3
Fast-forward
 my_code.py | 4 ++++
 1 file changed, 4 insertions(+)


In [42]:
git log --oneline --decorate --all

93278e3 (HEAD, master, feature0) Add print_hello function.
4902e37 Add print_why function.
db94f78 Add print_bye function.
9daa507 Revert "Add print_hi function."
a9f7b16 Add print_hi function.
19f5f66 Add imports to code.

*With *`git merge`*, the specified branch is merged into the branch you're on.*

*Once you have merged *`feature0`* you can delete it, since it is no longer needed.*

In [43]:
git branch -d feature0

Deleted branch feature0 (was 93278e3).
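
*The merge above was a "fast-forward": since the main branch had no new commits of its own, git simply moved its pointer up to* `feature0`*. A sketch of the same sequence in a throwaway repository (made-up identity) shows that no extra merge commit is created and that deleting the branch leaves the commits in place.*

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"      # hypothetical identity for this sandbox only
git config user.email "user@example.com"
main_branch=$(git symbolic-ref --short HEAD)   # "master" or "main", depending on your setup

echo "import numpy as np" > my_code.py
git add my_code.py
git commit -q -m "Add imports to code."

git checkout -q -b feature0
printf 'def print_hello():\n    print("hello")\n' >> my_code.py
git commit -q -am "Add print_hello function."
feature_tip=$(git rev-parse HEAD)

git checkout -q "$main_branch"
git merge -q feature0                     # fast-forward: only the branch pointer moves
merged_tip=$(git rev-parse HEAD)
# One line of "commit parent..." for HEAD; 2 words means a single parent (no merge commit).
words=$(git rev-list --parents -n 1 HEAD | wc -w | tr -d ' ')

git branch -d feature0 > /dev/null        # safe: its commits are reachable from the main branch
remaining=$(git branch --list feature0)

echo "main branch now at: $merged_tip"
echo "feature0 still listed: ${remaining:-no}"
```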


In [44]:
git branch

* master


In [45]:
git log --oneline --decorate --all

93278e3 (HEAD, master) Add print_hello function.
4902e37 Add print_why function.
db94f78 Add print_bye function.
9daa507 Revert "Add print_hi function."
a9f7b16 Add print_hi function.
19f5f66 Add imports to code.

### Handling merge conflicts
*Sometimes two branches will have made conflicting modifications to the same file.*

*The below is a shortcut for making and checking out a new branch. I.e., it is equivalent to:*
```
git branch feature1
git checkout feature1
```

In [46]:
git checkout -b feature1

Switched to a new branch 'feature1'


*Add the following function to *`my_code.py`*:*
```
def print_circle():
    print 'circle'
```

In [47]:
git add my_code.py
git commit -m "Add print_circle function."

[feature1 3979799] Add print_circle function.
 1 file changed, 4 insertions(+)


*Now go back to your *`master`* branch*

In [48]:
git checkout master

Switched to branch 'master'


*And add the following function to my_code.py:*
```
def print_square():
    print 'square'
```

In [49]:
git add my_code.py
git commit -m "Add print_square function."

[master b9aa1fb] Add print_square function.
 1 file changed, 4 insertions(+)


*If you now look at your project tree (using the --graph option), you can see that *`feature1`* has diverged from *`master`.

In [50]:
git log --oneline --all --decorate --graph

* b9aa1fb (HEAD, master) Add print_square function.
| * 3979799 (feature1) Add print_circle function.
|/
* 93278e3 Add print_hello function.
* 4902e37 Add print_why function.
* db94f78 Add print_bye function.
* 9daa507 Revert "Add print_hi function."
* a9f7b16 Add print_hi function.
* 19f5f66 Add imports to code.

*Now a merge attempt will fail, since the two branches have changed the same part of the same file in different ways.*

In [51]:
git checkout master
git merge feature1

Auto-merging my_code.py
CONFLICT (content): Merge conflict in my_code.py
Automatic merge failed; fix conflicts and then commit the result.


*Your friend *`git status`* can tell you where the conflicts are.*

In [52]:
git status

# On branch master
# You have unmerged paths.
#   (fix conflicts and run "git commit")
#
# Unmerged paths:
#   (use "git add <file>..." to mark resolution)
#
#	both modified:      my_code.py
#
no changes added to commit (use "git add" and/or "git commit -a")


*Now go look at the conflict in *`my_code.py`*. It will have the following syntax:*
```
...

<<<<<<< HEAD
def print_square():
    print 'square'
=======
def print_circle():
    print 'circle'
>>>>>>> feature1
```
*This indicates what changes were made on your current branch and what changes were made on *`feature1`. *It is up to you to figure out whether you want one modification, both, or some combination of the two. Suppose that you decide to keep both functions. In that case, change the code in *`my_code.py`* to look like:*
```
...

def print_square():
    print 'square'


def print_circle():
    print 'circle'
```

*Now add your "fix" and commit it.*

In [53]:
git add my_code.py
git commit -m "merge"

[master e7e766a] merge
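
*The full conflict cycle can be reproduced in a throwaway repository, as sketched below. The identity and the one-line function bodies are simplifications for illustration; the resolution step just rewrites the file with both functions, standing in for the manual edit described above.*

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"      # hypothetical identity for this sandbox only
git config user.email "user@example.com"
main_branch=$(git symbolic-ref --short HEAD)   # "master" or "main", depending on your setup

echo "import numpy as np" > my_code.py
git add my_code.py
git commit -q -m "Add imports to code."

git checkout -q -b feature1
echo "def print_circle(): pass" >> my_code.py
git commit -q -am "Add print_circle function."

git checkout -q "$main_branch"
echo "def print_square(): pass" >> my_code.py
git commit -q -am "Add print_square function."

# Both branches appended to the same spot, so the merge stops with a conflict.
if git merge feature1 > /dev/null 2>&1; then outcome="clean"; else outcome="conflict"; fi
conflicted=$(git diff --name-only --diff-filter=U)        # files with unmerged paths
if grep -q '^<<<<<<<' my_code.py; then markers="yes"; else markers="no"; fi

# Resolve by keeping both functions, then stage and commit to conclude the merge.
printf 'import numpy as np\ndef print_square(): pass\ndef print_circle(): pass\n' > my_code.py
git add my_code.py
git commit -q -m "merge"
# 3 words = the merge commit plus its two parents.
words=$(git rev-list --parents -n 1 HEAD | wc -w | tr -d ' ')

echo "merge outcome: $outcome (conflict markers: $markers)"
echo "conflicted files: $conflicted"
```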


*Now if you look at your project tree you'll see that the two arms of the tree have come back together (note, however, that the *`feature1`* branch is not at the same commit as *`master`*).*

In [54]:
git log --oneline --all --decorate --graph

*   e7e766a (HEAD, master) merge
|\
| * 3979799 (feature1) Add print_circle function.
* | b9aa1fb Add print_square function.
|/
* 93278e3 Add print_hello function.
* 4902e37 Add print_why function.
* db94f78 Add print_bye function.
* 9daa507 Revert "Add print_hi function."
* a9f7b16 Add print_hi function.
* 19f5f66 Add imports to code.

In [55]:
git status

# On branch master
nothing to commit (working directory clean)


*Now you can delete *`feature1`* since you've incorporated all of the changes you want into *`master`.

In [56]:
git branch -d feature1

Deleted branch feature1 (was 3979799).


*Note, however, even though the branch *`feature1`* has been deleted, the tree structure of the project before the merge does not change. We have only actually removed a pointer.*

In [57]:
git log --oneline --all --decorate --graph

*   e7e766a (HEAD, master) merge
|\
| * 3979799 Add print_circle function.
* | b9aa1fb Add print_square function.
|/
* 93278e3 Add print_hello function.
* 4902e37 Add print_why function.
* db94f78 Add print_bye function.
* 9daa507 Revert "Add print_hi function."
* a9f7b16 Add print_hi function.
* 19f5f66 Add imports to code.

### Examples of merge conflicts:
* *A file was modified in two different branches in different ways.*
* *A file was modified in one branch and deleted in another.*
* *A file was renamed in one branch and deleted in another.*
* *And more!*

### Examples of nonconflicts:
* *A file was modified in only one branch.*
* *A file was deleted in only one branch.*
* *And more!*
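
*For contrast with the conflict case, here is a sketch of a nonconflict in a throwaway repository: the two branches diverge, but each one touches a different file, so git merges them on its own. The branch name `feature2`, the file `README.txt`, and the identity are made up for illustration.*

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"      # hypothetical identity for this sandbox only
git config user.email "user@example.com"
main_branch=$(git symbolic-ref --short HEAD)   # "master" or "main", depending on your setup

echo "import numpy as np" > my_code.py
git add my_code.py
git commit -q -m "Add imports to code."

git checkout -q -b feature2
echo "Notes about my project." > README.txt   # a new file, added only on feature2
git add README.txt
git commit -q -m "Add README."

git checkout -q "$main_branch"
echo "def print_square(): pass" >> my_code.py # my_code.py changes only on the main branch
git commit -q -am "Add print_square function."

# The branches have diverged, but no file was changed on both, so the merge succeeds.
if git merge -q --no-edit feature2; then outcome="clean"; else outcome="conflict"; fi
# 3 words = a merge commit plus its two parents.
words=$(git rev-list --parents -n 1 HEAD | wc -w | tr -d ' ')

echo "merge outcome: $outcome"
```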

# GitHub

### Pushing a repo to GitHub

*You have a git repository on your local machine that you want to put into a remote repo (on the internet) to back up your code. First create a repo on GitHub called *`my_project`.

*Now, we first need to see what remote repositories your git repo knows about (if any):*

In [58]:
git remote -v



*The command below creates a link named *`origin`* that points at our remote repository. Technically, you can name your remote links anything you want, but *`origin`* is a convention for a remote repo that you own (i.e., a repo on GitHub that you own).*

In [59]:
git remote add origin https://github.com/your_account/my_project.git



In [60]:
git remote -v

origin	https://github.com/rkp8000/my_project.git (fetch)
origin	https://github.com/rkp8000/my_project.git (push)


*You can now *`push`* any branches to your remote repo on Github. This will copy any code differences from your local machine to the remote version on the internet. To accomplish this, use the command *`git push <remote destination> <local branch>`

*Note that unless your GitHub credentials have already been stored, you'll have to run the command below in your terminal, not in the notebook (because it will prompt you for them). Over HTTPS, GitHub now expects a personal access token in place of your account password.*

In [61]:
git push origin master

Counting objects: 25, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (17/17), done.
Writing objects:  32% (8/25) ...

*Now if we go back to our GitHub site we'll see our master branch.*
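
*If you want to rehearse adding a remote and pushing without touching GitHub, one option is to let a local bare repository play the role of the remote. This is only a stand-in (the paths and identity below are made up), but the* `git remote` *and* `git push` *commands are the same ones used above.*

```shell
set -e
work=$(mktemp -d)
git init -q --bare "$work/remote.git"     # local stand-in for the empty repo created on GitHub
git init -q "$work/my_project"
cd "$work/my_project"
git config user.name "Example User"       # hypothetical identity for this sandbox only
git config user.email "user@example.com"
main_branch=$(git symbolic-ref --short HEAD)   # "master" or "main", depending on your setup

echo "import numpy as np" > my_code.py
git add my_code.py
git commit -q -m "Add imports to code."

git remote add origin "$work/remote.git"  # on GitHub this would be an https://github.com/... URL
remote_url=$(git config --get remote.origin.url)

git push -q origin "$main_branch"         # git push <remote> <local branch>

local_head=$(git rev-parse HEAD)
remote_head=$(git ls-remote origin "refs/heads/$main_branch" | cut -f1)

echo "origin points at: $remote_url"
echo "local:  $local_head"
echo "remote: $remote_head"
```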

# Collaborating on GitHub
*Now imagine that you want to work on a code base with one or more other people. There are [two main repository models](https://help.github.com/articles/using-pull-requests/) that are frequently used:*

1. Shared repository model

 *Here, everyone has write access to the same repository. In GitHub, this is called "collaborator" access -- an unfortunate word choice for us scientists. This is more common for small, rapidly developed code bases.* <br><br>
2. Fork & pull repository model

  *Here, a few supernerds have write access. Anyone can propose a change to the code (through a "pull request"), but one of the collaborators (with high privileges) must approve it. This is common for large projects involving more than a handful of coders.*

### Shared repository model

*If everyone has write access, there are only two repositories you have to worry about: your local repo and the *`origin`* repo. __Once you have committed your changes__, pushing your local changes to the remote repository is as easy as:*

In [None]:
git push origin master

### Forked repository model

*If you are using forked repos, there is one extra level of complexity. Now we'll have 3 repos to think about:*
 1. your local repo that lives on your computer
 2. your remote copy (or "fork") of (3) that belongs to your GitHub account on the internet
 3. the big collaborative repo

*Let's start by forking any dude/dudette's interesting and collaborative repo online (3). Click "Fork" on someone's GitHub repo page to make a copy of their collaborative repo (3) and put it in your remote repository (2) that you own. Then download this repo (2) to your local computer (1) using *`git clone`:

In [None]:
git clone https://github.com/your_account/dbw_2015.git

*Now you have a copy of this tutorial in a new *`dbw_2015`* directory inside the current directory (wherever your terminal is). Move into it with *`cd dbw_2015`*, then create and check out a new branch that will contain your changes. Check that you're on the new branch with* `git status`.

In [None]:
git checkout -b my_new_text_branch
git status

*Now add a new file (just a simple text file, here called *`tumultuous_text.txt`*) to your local repo (1). Then use the following commands to add the file and commit it:*

In [None]:
git add tumultuous_text.txt
git commit -m 'Add a new text file' tumultuous_text.txt

*Now push the changes to your remote repo (2).* 

In [None]:
git push origin my_new_text_branch

*You can see the results of pushing by going to your GitHub account page. On that page, there will be a green button to "Compare & Create Pull Request" (PR). This is how we take our changes and make a request to merge them into the big collaborative repo (3).*

*Think of the PR as a way to start a conversation about the proposed changes. After some review and conversation, the code may be merged, be rejected, or require some improvements before the repo owner(s) will merge it.*

*Any additional changes to the PR (like suggested improvements) will be included just by pushing to that same branch that's currently acting as the open PR.*

### Syncing your repo
*If there were changes to your remote repo (e.g., by other coders), you'll probably want to update your local code base to reflect those additions before making your own changes. To do this we use the *`git fetch`* command.*

*Calling *`git fetch`* tells git to get all the branches of a remote repository (including any changes). Call it like this:*

In [None]:
git fetch origin

*You can then attempt to merge the remote-tracking branch *`origin/master`* into your own master branch. Just like before, if there were merge conflicts, you'll have to handle them and then commit the modified files. Merge the branches by calling:*

In [None]:
git merge origin/master
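
*The fetch-then-merge cycle can be rehearsed locally, as in the sketch below. A bare repository (`hub.git`) stands in for the remote on GitHub, and a second clone (`teammate`) plays the other coder; these names and the identity are made up for illustration.*

```shell
set -e
# Hypothetical identity, set through environment variables so it applies to every sandbox repo.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
cd "$work"
git init -q mine
cd mine
echo "import numpy as np" > my_code.py
git add my_code.py
git commit -q -m "Add imports to code."
main_branch=$(git symbolic-ref --short HEAD)   # "master" or "main", depending on your setup

cd "$work"
git clone -q --bare mine hub.git          # local stand-in for the repo on GitHub
git clone -q hub.git teammate             # someone else's working copy
cd "$work/mine"
git remote add origin "$work/hub.git"

# The teammate commits a change and pushes it to the shared remote.
cd "$work/teammate"
echo "# a change from someone else" >> my_code.py
git commit -q -am "Add a comment."
git push -q origin "$main_branch"
teammate_head=$(git rev-parse HEAD)

# Back in your repo: fetch downloads the new commit, merge brings it into your branch.
cd "$work/mine"
before=$(git rev-parse HEAD)
git fetch -q origin
fetched=$(git rev-parse "origin/$main_branch")
git merge -q "origin/$main_branch"
after=$(git rev-parse HEAD)

echo "before fetch: $before"
echo "fetched:      $fetched"
echo "after merge:  $after"
```

*Note that* `git fetch` *on its own never changes your branch or your files; it only updates the remote-tracking branch, and the merge is what moves your own branch.*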