# Version Control

# Introduction

Keeping track of changes in code, data analyses, presentations, and manuscripts is crucial to research. This can be difficult, even more so when collaborating with several people. Version control systems like [git](https://git-scm.com/) allow you to track changes and integrate your work with your collaborators'. [github](https://github.com/) allows you to share your work, get feedback from a broader range of people, and collaborate on large projects.

The goal here is to give a brief overview of git and github. We'll cover the following:

* Installing, configuring, and using git
* Signing up for github and integrating with git

There are a lot of resources for learning more about version control, git, and github. Check out [this tutorial](http://rogerdudler.github.io/git-guide/) for a concise guide to git, and [this tutorial](https://www.atlassian.com/git/tutorials/) for more in-depth coverage.

# Installing git

## Linux 

`sudo apt-get install git`

## Mac 

* Install [Xcode](https://developer.apple.com/xcode/downloads/) and the [Xcode command line tools](http://osxdaily.com/2014/02/12/install-command-line-tools-mac-os-x/)
* Install [git](https://git-scm.com/)

## Windows

* Install [git](https://git-scm.com/)

# Configuring git

Let's configure a few things:

`git config --global user.name "Your user name here"`

`git config --global user.email "the_email_you_use@something.com"`

Check the current configuration with the following:

In [2]:
%%bash
git config --list

user.name=christopherahern
user.email=christopher.ahern@gmail.com
core.repositoryformatversion=0
core.filemode=true
core.bare=false
core.logallrefupdates=true
remote.origin.url=git@github.com:IRCS-analysis-mini-courses/reproducible-research.git
remote.origin.fetch=+refs/heads/*:refs/remotes/origin/*
branch.master.remote=origin
branch.master.merge=refs/heads/master


Some of these configuration options are stored in a configuration file in your home directory, `~/.gitconfig`. This file holds the options for a single user on your machine. You can apply settings to all users on a machine by creating and editing `/etc/gitconfig` or `/private/etc/gitconfig`. You can also provide more detailed settings for a particular project by editing `<repo>/.git/config`.
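As a rough illustration of the per-project level, the sketch below sets an email address for one repository only. It runs in a throwaway directory created with `mktemp`, and the address `project@example.com` is a made-up placeholder, so nothing in your real configuration is touched.

```shell
# Create a scratch repository (hypothetical location chosen by mktemp).
repo="$(mktemp -d)"
git init --quiet "$repo"

# --local writes to <repo>/.git/config, which takes precedence over ~/.gitconfig.
git -C "$repo" config --local user.email "project@example.com"

# Read the value back from the repository-level file only.
git -C "$repo" config --local --get user.email
```

Swapping `--local` for `--global` would write to `~/.gitconfig` instead, and `--system` targets the machine-wide file.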



In [3]:
%%bash
cat ~/.gitconfig

[user]
	name = christopherahern
	email = christopher.ahern@gmail.com


You can [add aliases](http://top-frog.com/2013/05/16/a-few-handy-git-aliases/) to this file directly to customize git to your needs. For example, you can shorten the commands you use the most, or set a default behavior for certain commands that will streamline your gitting.
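Aliases can also be set from the command line rather than by editing the file by hand. The sketch below is only an example: the alias names `st` and `lg` are arbitrary choices, and it writes to a temporary file via `--file` so that your real `~/.gitconfig` is left alone.

```shell
# Scratch config file standing in for ~/.gitconfig (assumption for this example).
cfg="$(mktemp)"

# Define two illustrative aliases: a short status and a compact log.
git config --file "$cfg" alias.st "status --short"
git config --file "$cfg" alias.lg "log --oneline --graph --decorate"

# Show what was written under the [alias] section.
git config --file "$cfg" --get alias.st
cat "$cfg"
```

To make the aliases available for real, replace `--file "$cfg"` with `--global`; after that, `git st` would run `git status --short`.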

# Local commands

We'll start by keeping things local. In fact, git can be used totally independently of github. The real payoff is that you have a detailed record of the changes you made [two months ago](https://twitter.com/kcranstn/status/370914072511791104). 

A basic usage flow will look something like this:

* Create or edit files
* `git add`
* `git commit`



To see this in action, first create a directory containing a `README.md` file.

In [4]:
%%bash
cd ~/Desktop/
mkdir test
cd test
touch README.md
echo "# test" >> README.md
echo "We've added a header." >> README.md
echo "Let's add some more text to the README file" >> README.md

Take a look at the contents of the file:

In [5]:
%%bash
cat ~/Desktop/test/README.md

# test
We've added a header.
Let's add some more text to the README file


Start using git to track changes:

In [6]:
%%bash
cd ~/Desktop/test/
git init
git status

Initialized empty Git repository in /home/chris/Desktop/test/.git/
On branch master

Initial commit

Untracked files:
  (use "git add <file>..." to include in what will be committed)

	README.md

nothing added to commit but untracked files present (use "git add" to track)


Add the `README` to the staging area:

In [7]:
%%bash
cd ~/Desktop/test
git add README.md
git status

On branch master

Initial commit

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)

	new file:   README.md



Note that if we want to unstage the file, we just have to follow the instructions in the status message: `git rm --cached README.md`.
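To see what unstaging does without disturbing the `test` directory, here is a small sketch in a separate scratch repository (the location comes from `mktemp` and is purely illustrative). The file is staged, then unstaged, and remains on disk throughout.

```shell
# Work in a throwaway repository.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet

# Create and stage a file.
echo "# test" > README.md
git add README.md

# Remove it from the staging area only; the file itself is not deleted.
git rm --cached --quiet README.md

# README.md is listed as untracked (??) again.
git status --short
```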

Let's say we want to commit the `README`:

In [8]:
%%bash
cd ~/Desktop/test
git commit -m "first commit"

git status

[master (root-commit) 967ff1a] first commit
 1 file changed, 3 insertions(+)
 create mode 100644 README.md
On branch master
nothing to commit, working directory clean


Great, now that we've got one file committed, let's add another.

In [13]:
%%bash
cd ~/Desktop/test
touch another-file.md
echo "# Introduction" >> another-file.md
echo "Start with the following..." >> another-file.md
git status

On branch master
Untracked files:
  (use "git add <file>..." to include in what will be committed)

	another-file.md

nothing added to commit but untracked files present (use "git add" to track)


Let's walk through the process again, checking the status as we go:

In [14]:
%%bash
cd ~/Desktop/test/
git add .
git status
git commit -m "second commit"
git status

On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

	new file:   another-file.md

[master 053cb71] second commit
 1 file changed, 2 insertions(+)
 create mode 100644 another-file.md
On branch master
nothing to commit, working directory clean


Again, before committing we could have unstaged the file by following the command listed in the status output. We can also revert to previous states of the project. To pick a state, let's look at the log:

In [11]:
%%bash
cd ~/Desktop/test/
git log --oneline

ec62de2 second commit
967ff1a first commit


We can undo the results of one of these commits using `git revert <commit>`. For example, if we wanted to get rid of `another-file.md` we would revert the commit that added it. Note that this will remove `another-file.md` from the directory and add a new entry to the log. If you want to undo the most recent commit but keep the files, use `git reset HEAD^`.
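The difference between the two approaches is easier to see side by side. The following is a sketch in a scratch repository, with a made-up identity and the same file names as above; it first undoes the second commit with `git reset HEAD^`, then recreates it and undoes it with `git revert`.

```shell
# Throwaway repository with a repo-local, hypothetical identity.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Example User"
git config user.email "user@example.com"

echo "# test" > README.md
git add README.md
git commit --quiet -m "first commit"

echo "# Introduction" > another-file.md
git add another-file.md
git commit --quiet -m "second commit"

# reset drops the commit from the history but leaves the file on disk, untracked.
git reset --quiet HEAD^
git status --short

# Commit the file again, then undo that commit with revert instead.
git add another-file.md
git commit --quiet -m "second commit"
git revert --no-edit HEAD > /dev/null

# The revert is itself a commit: the log grows and the file is gone from the directory.
git log --oneline
ls
```

Because `git revert` adds history rather than rewriting it, it is usually the safer choice once commits have been shared with others.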

# Public commands

We'll extend our local commands slightly to allow us to put our code out in public via github. If you haven't already, go to [github](https://github.com/) and follow the [instructions](https://help.github.com/articles/signing-up-for-a-new-github-account/) for creating a new account. You might also want to install a [git gui client](https://git-scm.com/download/gui/linux) unless you prefer doing things via the command line.

Another useful thing to do is to set up an [ssh key](https://help.github.com/articles/generating-ssh-keys/#platform-all) for GitHub. Check out these two videos for an [overview](https://www.youtube.com/watch?v=GSIDS_lvRv4) and a [bit of historical background](https://www.youtube.com/watch?v=YEBfamv-_do) of public key encryption along with some intuitive but detailed explanations. On a more practical level, this will mean you don't have to type in your username and password every time you want to update a repository that's stored on GitHub. Take a few minutes to set up an ssh key for your GitHub account. Note that it's possible to create and manage [ssh keys for multiple accounts](http://code.tutsplus.com/tutorials/quick-tip-how-to-work-with-github-and-multiple-accounts--net-22574). Also note that you can always go back and delete any ssh keys from both your computer and your GitHub account.

Now go to GitHub and create a new repository by clicking on the plus sign in the upper right and selecting the option to create a new repository. 

In [18]:
from IPython.display import Image, display
Image(url='https://help.github.com/assets/images/help/repository/repo-create.png')

Name the repository "test" and then click "Create repository".

In [29]:
Image(url='https://help.github.com/assets/images/help/repository/create-repository-name.png')

Now we'll tell git where the repository is located on github. If you haven't set up an ssh key as suggested above, the URL in the `git remote add` command will be slightly different: use `https://github.com/<username>/test.git` instead.

In [15]:
%%bash
cd ~/Desktop/test/
git remote add origin git@github.com:christopherahern/test.git
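If you start with the HTTPS address and set up an ssh key later, the remote can be switched in place. The sketch below shows one way to do that in a scratch repository; `<username>` is left as a placeholder, and nothing is contacted over the network because the commands only edit the local configuration.

```shell
# Throwaway repository used only to illustrate remote handling.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet

# Register the remote under the conventional name "origin" using HTTPS.
git remote add origin "https://github.com/<username>/test.git"
git remote -v

# Later, switch the same remote to the ssh form.
git remote set-url origin "git@github.com:<username>/test.git"
git remote get-url origin
```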

We can push all of the changes made in our local repository to the remote repository:

In [None]:
%%bash
cd ~/Desktop/test/
git push -u origin master

Now we can check in on things:

In [9]:
%%bash
cd ~/Desktop/test
git status

On branch master
Your branch is up-to-date with 'origin/master'.

Untracked files:
  (use "git add <file>..." to include in what will be committed)

	another-file.md

nothing added to commit but untracked files present (use "git add" to track)


# Social commands

The commands above will work if you are just using git to keep track of your own work, or are the sole contributor to a project that other people might download or view. If you want to collaborate with colleagues, you can add them as contributors to a repository via the settings on github.

To get changes that your collaborators have made to the repository, you can use the following command:

`git pull`

This command is a combination of `git fetch`, which downloads the new commits from the remote repository, and `git merge`, which merges those commits into your local branch.
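The two steps can be run separately to see what each one does. The sketch below makes several assumptions for the sake of a self-contained example: a local bare repository stands in for the one on github, and `alice` and `bob` are hypothetical collaborators working in a scratch directory.

```shell
# Scratch area; remote.git plays the role of the repository hosted on github.
work="$(mktemp -d)"
cd "$work"
git init --quiet --bare remote.git

# alice creates the project and pushes the first commit.
git init --quiet alice
git -C alice config user.name "Alice"
git -C alice config user.email "alice@example.com"
echo "# test" > alice/README.md
git -C alice add README.md
git -C alice commit --quiet -m "first commit"
git -C alice remote add origin "$work/remote.git"
git -C alice push --quiet -u origin HEAD
branch="$(git -C alice symbolic-ref --short HEAD)"

# bob clones the project, then alice pushes a second commit.
git clone --quiet "$work/remote.git" bob
echo "# Introduction" > alice/another-file.md
git -C alice add another-file.md
git -C alice commit --quiet -m "second commit"
git -C alice push --quiet origin HEAD

# bob catches up in two explicit steps; `git pull` would do both at once.
git -C bob fetch --quiet origin
git -C bob merge --quiet "origin/$branch"
git -C bob log --oneline
```

Running `git fetch` on its own is a handy way to inspect incoming changes (for example with `git log origin/<branch>`) before deciding to merge them.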

You can get a static copy of someone else's repository by [forking it on github](https://help.github.com/articles/fork-a-repo/). In fact, try `forking` the [course materials](https://github.com/IRCS-analysis-mini-courses/reproducible-research) to your own profile. Navigate to the forked repository in your profile. On the right-hand side you'll see a URL to `clone` the repository. You can download a copy to your computer using the following:

`git clone <url>`

The copy of the repository that you forked and then cloned is static. If we make changes to the course materials, they won't automatically show up in the forked version. To keep track of changes made to the original repository, do the following:

`git remote add upstream <url of original repository>`

To [sync changes](https://help.github.com/articles/syncing-a-fork/) made to the original repository with your local fork, do the following:

`git fetch upstream`

`git checkout master`

`git merge upstream/master`
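The whole fork-and-sync cycle can be rehearsed offline. In the sketch below, local bare repositories stand in for the original project (`original.git`) and your fork (`fork.git`); these names, the `author` directory, and the file contents are all assumptions made for the example. The branch name is read from the repository rather than hard-coded, since it may be `master` or something else depending on your git settings.

```shell
work="$(mktemp -d)"
cd "$work"

# An author seeds the original project with one commit.
git init --quiet author
git -C author config user.name "Author"
git -C author config user.email "author@example.com"
echo "# course" > author/README.md
git -C author add README.md
git -C author commit --quiet -m "first commit"
branch="$(git -C author symbolic-ref --short HEAD)"
git clone --quiet --bare author original.git

# Forking is roughly a server-side copy; cloning the fork gives a local copy.
git clone --quiet --bare original.git fork.git
git clone --quiet fork.git local
git -C local remote add upstream "$work/original.git"

# The original project moves ahead.
echo "new material" > author/notes.md
git -C author add notes.md
git -C author commit --quiet -m "add notes"
git -C author push --quiet "$work/original.git" "$branch"

# Sync: fetch from upstream, merge into the local branch, then update the fork.
git -C local fetch --quiet upstream
git -C local checkout --quiet "$branch"
git -C local merge --quiet "upstream/$branch"
git -C local push --quiet origin "$branch"
```

The final `git push` is what brings the fork on github up to date; without it, only your local copy contains the upstream changes.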

If you have a change you'd like to propose to a repository, you can do so by [creating a pull request](https://help.github.com/articles/using-pull-requests/). In fact, if you've forked the course materials and downloaded them locally, add a dataset to the `user-datasets` directory, push the result to your forked repository and create a pull request for it to be merged with the original repository.

# Version control with RStudio

`RStudio` has built-in git and github functionality. For an overview, see Hadley Wickham's [tutorial](http://r-pkgs.had.co.nz/git.html).