# Git and GitHub Tutorial
This tutorial walks you through the basics of git (repos, committing, stashing, branches) and then applies those concepts in a social coding context (i.e., GitHub).

IMPORTANT: Run these commands in a bash shell, or adapt them to whatever terminal you use. This is not Python code.

Reasons to use Git
1. Version control with branches
2. Free and open source
3. Lends itself well to collaborative coding tools (like GitHub)

Reasons to use GitHub
1. Greatly facilitates collaborative coding through repositories (or "repos")
2. Issue tracking (both code bugs and suggested enhancements)
3. Ease of reviewing code changes
4. Continuous integration (CI) for automatic code checking
5. Octocat

# The basics

In [1]:
git init my_project

Initialized empty Git repository in /Users/rkp/Dropbox/Repositories/dbw_2015/tutorial_git/my_project/.git/


In [2]:
cd my_project



In [2]:
ls -a

./                  ../                 .ipynb_checkpoints/ tutorial.ipynb


In [4]:
git status

# On branch master
#
# Initial commit
#
nothing to commit (create/copy files and use "git add" to track)


### *Creating a new file and committing it.*
*Make a file called "my_code.py" inside the "my_project" directory and add the following lines:*
```
from __future__ import division
import numpy as np
import matplotlib.pyplot as plt
```

In [5]:
git status

# On branch master
#
# Initial commit
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#	my_code.py
nothing added to commit but untracked files present (use "git add" to track)


*The command *`git add <file>`* adds a new or modified file to the staging area.*

In [6]:
git add my_code.py



In [7]:
git status

# On branch master
#
# Initial commit
#
# Changes to be committed:
#   (use "git rm --cached <file>..." to unstage)
#
#	new file:   my_code.py
#


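*Note: files can be removed from the staging area using * `git reset <file>` *(or, before the first commit, * `git rm --cached <file>`*, as the status message above suggests).*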

*The command *`git commit -m <message>`* saves the changes in the staging area to your project history.*

In [8]:
git commit -m "Start my code."

[master (root-commit) 1c7f463] Start my code.
 1 file changed, 3 insertions(+)
 create mode 100644 my_code.py


In [9]:
git status

# On branch master
nothing to commit (working directory clean)
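
*The add/commit cycle above can be replayed in a throwaway repository. The sketch below is illustrative only: the temporary directory, the placeholder identity, and the one-line file contents are assumptions, not part of the tutorial project.*

```shell
# Work in a temporary directory so no real project is touched.
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "Tutorial User"          # placeholder identity (repo-local)
git config user.email "tutorial@example.com"

echo "import numpy as np" > my_code.py
git status --short      # prints: ?? my_code.py   (untracked)

git add my_code.py
git status --short      # prints: A  my_code.py   (staged)

# Unstage again; the file itself stays on disk.
git rm --cached -q my_code.py
git status --short      # prints: ?? my_code.py

# Stage and commit for real.
git add my_code.py
git commit -q -m "Start my code."
git status --short      # prints nothing: the working directory is clean
```

*The *`--short`* option prints one line per file, which makes it easier to see a file move from untracked to staged to committed.*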


### *Committing modifications to a file.*
*Add the following lines to my_code.py:*
```
def print_hi():
    print 'hi'
```

In [11]:
git add my_code.py
git commit -m "Add print_hi function."

[master 0afca13] Add print_hi function.
 1 file changed, 4 insertions(+)


*You can use *`git log`* (with various options) to view your project history. (Note that this will look nicer in your terminal window.)*

In [12]:
git log --oneline --decorate --all

0afca13 (HEAD, master) Add print_hi function.
1c7f463 Start my code.

### *Looking at past commits*
*You can view a previous commit using:* `git checkout <commit>` 

*(Note that commit hashes are computed from each commit's contents and metadata, so the hashes in your repository will differ from the ones shown here.)*

In [15]:
git checkout 1c7f463

Note: checking out '1c7f463'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b new_branch_name

HEAD is now at 1c7f463... Start my code.


*Now go take a look at `my_code.py`. You won't see your recent changes there. Why not?*

In [16]:
git log --oneline --decorate --all

0afca13 (master) Add print_hi function.
1c7f463 (HEAD) Start my code.

*If you wanted to modify this code you would make a new branch from this commit, which we'll talk about later.*

*To return the HEAD to the most recent commit, use:*  `git checkout master`

In [17]:
git checkout master

Previous HEAD position was 1c7f463... Start my code.
Switched to branch 'master'
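
*The round trip above (check out an old commit, then return to the branch) can be sketched in a throwaway repository. The temporary directory, placeholder identity, and one-line file contents are assumptions made for illustration.*

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "Tutorial User"          # placeholder identity (repo-local)
git config user.email "tutorial@example.com"

# Two commits: the second one adds print_hi.
echo "import numpy as np" > my_code.py
git add my_code.py && git commit -q -m "Start my code."
echo "def print_hi(): pass" >> my_code.py
git add my_code.py && git commit -q -m "Add print_hi function."

# Remember the branch name instead of assuming it is "master".
branch=$(git rev-parse --abbrev-ref HEAD)

# Check out the first commit: HEAD no longer points at a branch.
git checkout -q HEAD~1
git symbolic-ref -q HEAD || echo "HEAD is detached"
grep -q print_hi my_code.py || echo "print_hi is absent in the old snapshot"

# Return to the branch tip: the newer contents come back.
git checkout -q "$branch"
grep -q print_hi my_code.py && echo "print_hi is back"
```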


*If you want to make sure this worked correctly and took you back to your most recent commit, look at *`my_code.py`.

In [18]:
git log --oneline --decorate --all

0afca13 (HEAD, master) Add print_hi function.
1c7f463 Start my code.

*To undo a commit use:* `git revert HEAD` *(or* `git revert HEAD~3..HEAD` *to undo the last three commits, for example).*

*Note that below we've used the *`--no-edit`* option, since Jupyter won't open a text editor. If you run this in the command line without the *`--no-edit`* option, you'll be taken to a text editor (most likely vim) that allows you to add more detail to your commit message. To enter text press "i", which will take you to insert mode. To exit insert mode press "esc". To save the commit message and close the text editor type ":wq" and hit "return".*

In [19]:
git revert HEAD --no-edit

[master 37d67c1] Revert "Add print_hi function."
 1 file changed, 4 deletions(-)


In [20]:
git log --oneline --decorate --all

37d67c1 (HEAD, master) Revert "Add print_hi function."
0afca13 Add print_hi function.
1c7f463 Start my code.

*Now take a look at * `my_code.py`. *You should see that your changes have been undone.*

*Note that *`git revert`* does not actually erase anything from your project history. Instead it makes a new commit in which previous changes have been undone (this means that you can undo this revert as if it were any other commit).*
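
*A minimal sketch of this behavior, assuming a throwaway repository with placeholder file contents: after the revert, the history is one commit longer, and the file matches the commit before the one that was reverted.*

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "Tutorial User"          # placeholder identity (repo-local)
git config user.email "tutorial@example.com"

echo "import numpy as np" > my_code.py
git add my_code.py && git commit -q -m "Start my code."
echo "def print_hi(): pass" >> my_code.py
git add my_code.py && git commit -q -m "Add print_hi function."

before=$(git rev-list --count HEAD)
git revert --no-edit HEAD > /dev/null
after=$(git rev-list --count HEAD)
echo "commits before revert: $before, after revert: $after"

# The reverted commit is still in the history...
git log --oneline

# ...but the file contents now match the first commit again.
git diff --quiet HEAD HEAD~2 && echo "contents match the first commit"
```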

*Now add the change that you actually wanted:*
```
def print_bye():
    print 'bye'
```

In [21]:
git add my_code.py
git commit -m "Add print_bye function."

[master 3f73229] Add print_bye function.
 1 file changed, 4 insertions(+)


In [22]:
git log --oneline --decorate --all

3f73229 (HEAD, master) Add print_bye function.
37d67c1 Revert "Add print_hi function."
0afca13 Add print_hi function.
1c7f463 Start my code.

### *Stashing*
*Git will not let you check out a previous commit if doing so would overwrite uncommitted changes to files in your working directory.*

*For example, say that you've made another modification to my_code.py by appending the lines:*
```
def print_why():
    pass  # I'll fill this in later
```
*Now try to checkout one of your earlier commits and note that git won't let you do it.*

In [24]:
git checkout 1c7f463

error: Your local changes to the following files would be overwritten by checkout:
	my_code.py
Please, commit your changes or stash them before you can switch branches.
Aborting


*To save these changes for later but without committing them, use * `git stash`.

In [25]:
git stash

Saved working directory and index state WIP on master: 3f73229 Add print_bye function.
HEAD is now at 3f73229 Add print_bye function.


*You can view all your stashes with *`git stash list` *:*

In [26]:
git stash list

stash@{0}: WIP on master: 3f73229 Add print_bye function.

*Now that you've stashed your changes, you can move your HEAD to a different commit.*

In [27]:
git checkout 1c7f463

Note: checking out '1c7f463'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b new_branch_name

HEAD is now at 1c7f463... Start my code.


*Once you're done looking around go back to the most recent commit.*

In [28]:
git checkout master

Previous HEAD position was 1c7f463... Start my code.
Switched to branch 'master'


*If you look around after having returned to the most recent commit, you'll notice that the changes you stashed are no longer there. To get them back use the following command:*

In [29]:
git stash pop stash@{0}

# On branch master
# Changes not staged for commit:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#	modified:   my_code.py
#
no changes added to commit (use "git add" and/or "git commit -a")
Dropped stash@{0} (4c60a2bf42a97512347a17d5ff6668d11604a938)
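
*The stash round trip can be checked in a throwaway repository. This is a sketch under assumptions: the temporary directory, the placeholder identity, and the file contents are made up for illustration.*

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "Tutorial User"          # placeholder identity (repo-local)
git config user.email "tutorial@example.com"

echo "import numpy as np" > my_code.py
git add my_code.py && git commit -q -m "Start my code."

# Make an uncommitted change to a tracked file.
echo "# work in progress" >> my_code.py
git diff --quiet || echo "uncommitted change present"

# Stash it: the working directory returns to the last commit.
git stash > /dev/null
git diff --quiet && echo "working directory clean after stash"
git stash list                      # one entry, labeled stash@{0}

# Pop it: the change comes back and the stash entry is dropped.
git stash pop > /dev/null
git diff --quiet || echo "change restored after pop"
```

*`git diff --quiet`* exits with a non-zero status when the working directory differs from the index, which is why it can be combined with *`||`* and *`&&`* here.*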


*Finish up your edits to my_code.py and commit before moving on to the next section.*
```
def print_why():
    print 'why'
```

In [30]:
git add my_code.py
git commit -m "Add print_why function."

[master f0bc70e] Add print_why function.
 1 file changed, 5 insertions(+), 1 deletion(-)


# Branches

*Branches allow you to test out features, bug fixes, etc., before integrating them into your main project.*

*For example, say you want to add a new function to my_code.py. To do this create a branch named 'feature0' that you can work on without messing up your main project.*

*IMPORTANT: The new branch inherits whatever changes are on the current branch. Most of the time, you'll want to be on *`master`* when you create a new branch to avoid inheriting changes from other branches.*

In [31]:
git checkout master
git branch feature0



*Running *`git branch`* on its own shows you all the branches and tells you which one you're on.*

In [32]:
git branch

  feature0
* master


*The following shows you all your commits, with the branches and HEAD labeled.*

In [33]:
git log --oneline --all --decorate

f0bc70e (HEAD, master, feature0) Add print_why function.
3f73229 Add print_bye function.
37d67c1 Revert "Add print_hi function."
0afca13 Add print_hi function.
1c7f463 Start my code.

*Important note: a branch is just a pointer to a specific commit. While it is often intuitively related to the tree structure of your project, this is not always the case.*
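
*One way to see that a branch is only a pointer is to compare the commit hashes that two branch names resolve to. The sketch below assumes a throwaway repository with placeholder contents; the branch name *`feature0`* matches the tutorial, and the main branch name is read from git rather than assumed.*

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "Tutorial User"          # placeholder identity (repo-local)
git config user.email "tutorial@example.com"

echo "import numpy as np" > my_code.py
git add my_code.py && git commit -q -m "Start my code."
main=$(git rev-parse --abbrev-ref HEAD)       # "master" in this tutorial

# A new branch starts out pointing at the same commit as the current one.
git branch feature0
git rev-parse "$main" feature0                # prints the same hash twice

# Committing on feature0 moves only the feature0 pointer.
git checkout -q feature0
echo "def print_hello(): pass" >> my_code.py
git commit -q -am "Add print_hello function."
git rev-parse "$main" feature0                # the two hashes now differ
```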

*If *`git checkout`* is followed by a branch name, it will move your HEAD to that branch.*

In [34]:
git checkout feature0

Switched to branch 'feature0'


In [35]:
git branch

* feature0
  master


*Now add the following function (your new feature) to my_code.py and commit your changes:*
```
def print_hello():
    print 'hello'
```

In [36]:
git add my_code.py
git commit -m "Add print_hello function."

[feature0 d4b6203] Add print_hello function.
 1 file changed, 5 insertions(+), 1 deletion(-)


*Now go back to your master branch.*

In [37]:
git checkout master

Switched to branch 'master'


In [38]:
git branch

  feature0
* master


*Take a look at my_code.py. You can see that the changes you made in the *`feature0`* branch haven't affected the master branch.*

*If you are satisfied with your new feature, you can incorporate it into the main project by merging the feature0 branch into the master branch:*

In [39]:
git merge feature0

Updating f0bc70e..d4b6203
Fast-forward
 my_code.py | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)


*With *`git merge`*, the specified branch is merged into the branch you're on.*

*Once you have merged *`feature0`* you can delete it, since it is no longer needed.*

In [40]:
git branch -d feature0

Deleted branch feature0 (was d4b6203).
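
*The merge above was a "fast-forward": since the main branch had no new commits of its own, git simply moved its pointer up to the tip of *`feature0`*. A minimal sketch, assuming a throwaway repository with placeholder contents; the *`--ff-only`* flag is added here only to make git refuse anything other than a fast-forward.*

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "Tutorial User"          # placeholder identity (repo-local)
git config user.email "tutorial@example.com"

echo "import numpy as np" > my_code.py
git add my_code.py && git commit -q -m "Start my code."
main=$(git rev-parse --abbrev-ref HEAD)       # "master" in this tutorial

# Do the feature work on its own branch.
git checkout -q -b feature0
echo "def print_hello(): pass" >> my_code.py
git commit -q -am "Add print_hello function."

# Back on the main branch, merge the feature.
git checkout -q "$main"
git merge -q --ff-only feature0
git rev-parse "$main" feature0                # same hash: no merge commit was created

# The branch pointer is no longer needed.
git branch -d feature0 > /dev/null
git log --oneline                             # both commits are still in the history
```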


### Handling merge conflicts
*Sometimes two branches will have made conflicting modifications to the same file.*

*The below is a shortcut for making and checking out a new branch. I.e., it is equivalent to:*
```
git branch feature1
git checkout feature1
```

In [41]:
git checkout -b feature1

Switched to a new branch 'feature1'


*Add the following function to my_code.py:*
```
def print_circle():
    print 'circle'
```

In [42]:
git add my_code.py
git commit -m "Add print_circle function."

[feature1 2c50c54] Add print_circle function.
 1 file changed, 5 insertions(+), 1 deletion(-)


*Now go back to your master branch*

In [43]:
git checkout master

Switched to branch 'master'


*And add the following function to my_code.py:*
```
def print_square():
    print 'square'
```

In [44]:
git add my_code.py
git commit -m "Add print_square function."

[master 192e83f] Add print_square function.
 1 file changed, 5 insertions(+), 1 deletion(-)


*If you now look at your project tree (using the --graph option), you can see that *`feature1`* has diverged from *`master`.

In [45]:
git log --oneline --all --decorate --graph

* 192e83f (HEAD, master) Add print_square function.
| * 2c50c54 (feature1) Add print_circle function.
|/
* d4b6203 Add print_hello function.
* f0bc70e Add print_why function.
* 3f73229 Add print_bye function.
* 37d67c1 Revert "Add print_hi function."
* 0afca13 Add print_hi function.
* 1c7f463 Start my code.

*Now a merge attempt will fail, since the two branches have changed the same part of the same file in different ways.*

In [46]:
git merge feature1

Auto-merging my_code.py
CONFLICT (content): Merge conflict in my_code.py
Automatic merge failed; fix conflicts and then commit the result.


*Your friend *`git status`* can tell you where the conflicts are.*

In [47]:
git status

# On branch master
# You have unmerged paths.
#   (fix conflicts and run "git commit")
#
# Unmerged paths:
#   (use "git add <file>..." to mark resolution)
#
#	both modified:      my_code.py
#
no changes added to commit (use "git add" and/or "git commit -a")


*Now go look at the conflict in my_code.py. It will have the following syntax:*
```
...

<<<<<<< HEAD
def print_square():
    print 'square'
=======
def print_circle():
    print 'circle'
>>>>>>> feature1
```
*This indicates what changes were made on your current branch and what changes were made on *`feature1`. *It is up to you to figure out whether you want one modification, both, or some combination of the two. Suppose that you decide to keep both functions, such that my_code.py looks like:*
```
...

def print_square():
    print 'square'


def print_circle():
    print 'circle'
```

*Now add your "fix" and commit it.*

In [48]:
git add my_code.py
git commit -m "merge"

[master 1aff513] merge
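
*The whole conflict-and-resolve sequence can be reproduced in a throwaway repository. This is a sketch under assumptions: the one-line function bodies are placeholders, and the resolution is written with *`printf`* instead of a text editor so that the example runs unattended.*

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "Tutorial User"          # placeholder identity (repo-local)
git config user.email "tutorial@example.com"

echo "import numpy as np" > my_code.py
git add my_code.py && git commit -q -m "Start my code."
main=$(git rev-parse --abbrev-ref HEAD)       # "master" in this tutorial

# Each branch appends a different line at the same place in the file.
git checkout -q -b feature1
echo "def print_circle(): pass" >> my_code.py
git commit -q -am "Add print_circle function."
git checkout -q "$main"
echo "def print_square(): pass" >> my_code.py
git commit -q -am "Add print_square function."

# The merge stops and leaves conflict markers in the file.
git merge feature1 > /dev/null 2>&1 || echo "merge stopped on a conflict"
git diff --name-only --diff-filter=U          # lists the unmerged file: my_code.py

# Resolve by keeping both functions, then stage and commit the result.
printf '%s\n' "import numpy as np" \
              "def print_square(): pass" \
              "def print_circle(): pass" > my_code.py
git add my_code.py
git commit -q -m "merge"
git log --oneline --graph
```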


*Now if you look at your project tree you'll see that the two arms of the tree have come back together (note, however, that the *`feature1`* branch is not at the same commit as *`master`*).*

In [49]:
git log --oneline --all --decorate --graph

*   1aff513 (HEAD, master) merge
|\
| * 2c50c54 (feature1) Add print_circle function.
* | 192e83f Add print_square function.
|/
* d4b6203 Add print_hello function.
* f0bc70e Add print_why function.
* 3f73229 Add print_bye function.
* 37d67c1 Revert "Add print_hi function."
* 0afca13 Add print_hi function.
* 1c7f463 Start my code.

In [50]:
git status

# On branch master
nothing to commit (working directory clean)


*Now you can delete *`feature1`* since you've incorporated all of the changes you want into *`master`.

In [51]:
git branch -d feature1

Deleted branch feature1 (was 2c50c54).


*Note, however, even though the branch *`feature1`* has been deleted, the tree structure of the project before the merge does not change. We have only actually removed a pointer.*

In [52]:
git log --oneline --all --decorate --graph

*   1aff513 (HEAD, master) merge
|\
| * 2c50c54 Add print_circle function.
* | 192e83f Add print_square function.
|/
* d4b6203 Add print_hello function.
* f0bc70e Add print_why function.
* 3f73229 Add print_bye function.
* 37d67c1 Revert "Add print_hi function."
* 0afca13 Add print_hi function.
* 1c7f463 Start my code.

### Examples of merge conflicts:
* *The same part of a file was modified in different ways in two different branches.*
* *A file was modified in one branch and deleted in another.*
* *A file was renamed in one branch and deleted in another.*
* *And more!*

### Examples of nonconflicts:
* *A file was modified in only one branch.*
* *A file was deleted in only one branch.*
* *And more!*

# Remote branches and GitHub

*Make a new GitHub repo (http://github.com)*

*Once your GitHub repo has been made public, anyone can make a copy of it on their own machine using:*

```
git clone https://github.com/your_username/my_project
```

*Your goal, however, is to connect your local repo (on your computer) to your remote repo (on the internet). To do this, first check which remote repositories your git repo knows about. (In a brand-new repo this prints nothing; the output below was captured after *`origin`* had been added.)*

In [54]:
git remote -v

origin	https://github.com/rkp8000/my_project (fetch)
origin	https://github.com/rkp8000/my_project (push)


*The command below creates a link named *`origin`* that points at our remote repository. Technically, you can name your remotes anything you want, but *`origin`* is the conventional name for a remote repo that you own (i.e., your GitHub repo).*

In [53]:
git remote add origin https://github.com/rkp8000/my_project.git



*You can now *`push`* any branch to your remote repo on GitHub. This copies the commits that the remote is missing from your local machine to the remote version on the internet. The command *`git push <remote> <local branch>`* updates the branch of the same name on the remote. Git rejects the push if the remote branch has commits that you don't have locally, in which case you need to fetch and merge first.*

In [None]:
git push origin master

*Now if we go back to our GitHub site we'll see our master branch.*
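
*If you want to try *`git remote add`* and *`git push`* without touching GitHub, a local bare repository can stand in for the remote. This is only a sketch: the directory *`remote.git`* is a hypothetical substitute for the GitHub URL, and the identity and file contents are placeholders.*

```shell
work=$(mktemp -d) && cd "$work"

# A bare repository (no working directory) plays the role of the GitHub repo.
git init -q --bare remote.git

# The local project.
git init -q my_project && cd my_project
git config user.name "Tutorial User"          # placeholder identity (repo-local)
git config user.email "tutorial@example.com"
echo "import numpy as np" > my_code.py
git add my_code.py && git commit -q -m "Start my code."
main=$(git rev-parse --abbrev-ref HEAD)       # "master" in this tutorial

# Link the local repo to the remote and publish the branch.
git remote add origin "$work/remote.git"
git remote -v                                 # shows origin for fetch and push
git push -q origin "$main"

# The remote now has a branch pointing at the same commit.
git ls-remote --heads origin
git rev-parse HEAD
```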

# Collaborating
*There are two main repository models that are frequently used:*

1. [Shared repository model](https://help.github.com/articles/using-pull-requests/)

 *Here, everyone has write access to the same repository. In GitHub, this is called "collaborator" access -- an unfortunate word choice for us scientists. This is more common for small, rapidly-developed code bases. [more details here]* <br><br>
2. [Fork & pull repository model](https://help.github.com/articles/using-pull-requests/)

  *Here, a lucky few have write access. Anyone can propose a change to the code (through a "pull request"), but one of the collaborators must approve it. This is common for large projects involving more than a handful of coders.*

### Syncing your repo
*If there were changes to your remote repo (e.g., by other coders), you'll probably want to update your local code base to reflect those additions before making your own changes. To do this we use the `fetch` command.*

*Calling `fetch` tells git to get all the branches of a remote repository (including any changes). Call it like this:*

In [1]:
git fetch origin


*You can then attempt to merge the remote-tracking branch *`origin/master`* into your own master branch. Just like before, if there were merge conflicts, you'll have to handle them and then commit the modified files. Merge the branches by calling:*

In [None]:
git merge origin/master
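
*The fetch-then-merge sequence can be rehearsed locally. In the sketch below, a bare repository stands in for GitHub, and "alice" is a hypothetical collaborator who pushes a commit after you cloned; all names, identities, and file contents are assumptions made for illustration.*

```shell
work=$(mktemp -d) && cd "$work"
git init -q --bare remote.git                 # stand-in for the GitHub repo

# alice creates the project and publishes it.
git init -q alice && cd alice
git config user.name "Alice"                  # placeholder identity
git config user.email "alice@example.com"
echo "import numpy as np" > my_code.py
git add my_code.py && git commit -q -m "Start my code."
main=$(git rev-parse --abbrev-ref HEAD)       # "master" in this tutorial
git remote add origin "$work/remote.git"
git push -q origin "$main"

# You clone the project; "origin" is set up automatically by git clone.
cd "$work"
git clone -q remote.git mine

# Meanwhile, alice pushes another commit.
cd "$work/alice"
echo "def print_hi(): pass" >> my_code.py
git commit -q -am "Add print_hi function."
git push -q origin "$main"

# Sync your copy: fetch updates origin/<branch>, merge updates your branch.
cd "$work/mine"
git fetch -q origin
git log --oneline "HEAD..origin/$main"        # commits you do not have yet
git merge -q "origin/$main"
git log --oneline
```

*Note that *`git fetch`* on its own only updates the remote-tracking branch; your own branch and working files do not change until the merge.*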

*Now you are ready to make your own changes to the code (woohoo! finally!) and then push them back to a remote repository. The way those changes will be integrated depends on your repository's model...*

### Shared repository model

*If everyone has write access, there are only two repositories you have to worry about: your local repo and the *`origin`* repo. __Once you have committed your changes__, pushing your local changes to the remote repository is as easy as:*

In [None]:
git push origin master

### Forked repository model

If you are using forked repos, there is one extra level of complexity. Now we'll have 3 repos to worry about:
 1. the big collaborative repo
 2. your remote copy (or "fork") of 1. that lives on the internet
 3. your local repo that lives on your computer
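
The three repos can be mocked up locally to see how they relate. This is a sketch under assumptions: bare directories stand in for the two repos on the internet, and the remote name `upstream` for the big collaborative repo is a common convention that this tutorial has not introduced, so treat it as hypothetical.

```shell
work=$(mktemp -d) && cd "$work"

# 1. The big collaborative repo (stand-in), seeded with one commit.
git init -q --bare big_repo.git
git init -q seed && cd seed
git config user.name "Maintainer"             # placeholder identity
git config user.email "maintainer@example.com"
echo "import numpy as np" > my_code.py
git add my_code.py && git commit -q -m "Start my code."
main=$(git rev-parse --abbrev-ref HEAD)       # "master" in this tutorial
git push -q "$work/big_repo.git" "$main"

# 2. Your fork: a copy of the big repo that you control.
cd "$work"
git clone -q --bare big_repo.git fork.git

# 3. Your local repo, cloned from the fork (so "origin" is the fork).
git clone -q fork.git local
cd local
git remote add upstream "$work/big_repo.git"
git remote -v                                 # origin -> fork, upstream -> big repo

# The big repo moves ahead after you forked.
cd "$work/seed"
echo "def print_hi(): pass" >> my_code.py
git commit -q -am "Add print_hi function."
git push -q "$work/big_repo.git" "$main"

# Bring the new commit into your local repo, then update your fork.
cd "$work/local"
git fetch -q upstream
git merge -q "upstream/$main"
git push -q origin "$main"
```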