<DIV ALIGN=CENTER>

# Source Code Version Control    
## Professor Robert J. Brunner
  
</DIV>  
-----
-----

## Source Code Version Control

Whether you are writing a document or developing a computer program,
eventually you will find a need for version control (or [revision
control](https://en.wikipedia.org/wiki/Revision_control)). In standard
document editing software this is provided by the Undo feature, but we
often want a more powerful solution to version control that allows:

- Multiple users to work simultaneously
- A record of who made a change and why
- Recovery to a specific savepoint
- Remote backup and recovery

Version control software has existed to provide one or more of these
features for many years. Currently, one of the more popular version
control software tools is `git`, in large part because of the popular
__github__ online hosting service, which uses `git` to provide easy
version control for open source software development.

In this lesson, we will learn how to use `git` and __github__ for source code version control.

-----

    # Uncomment to live browse the PHD Comics site.
    # from IPython.display import HTML
    # HTML('<iframe src=http://www.phdcomics.com/comics/archive.php?comicid=1531 width=800 height=800></iframe>')

![PHD Comics version control](images/phdcomics.gif)

-----
## Github

The most popular online source code hosting site is github, so named
because it is a __hub__ of software projects that are all archived on the
github website by using the `git` source code version control software.
While you can use the `git` software locally, the real power of source
code version control comes from working collaboratively on a project. A
company or organization can run their own, secure, `git` server to
support shared development, or you can use the __github__ web site as
your server. __github__ supports open source development with public
repositories, and it also offers private repositories, with additional
_enterprise_ features available through paid plans.

To use __github__, you first need to sign up for an account, after which
you can either use the `git` software tool, as demonstrated later in
this notebook, or the github desktop tool to create repositories, to
commit changes, and to sync your local repository with the public
repository maintained by __github__.

-----

    # Uncomment to live browse the course github site.
    # from IPython.display import HTML
    # HTML('<iframe src=https://github.com/ProfessorBrunner width=800 height=800></iframe>')

## Github Account

In this course, we use __github__ to provide user authentication. Thus
you need a __github__ account to use the course JupyterHub server. If
you already have a __github__ account, you can reuse your existing
github credentials. If not, you need to go to the [Github
Website](http://github.com), and sign up for a new account by clicking
on the __Sign up for Github__ button on the main github website, as
shown below, and following the directions.

![Github website](images/github-website.png)

You will use these credentials to access the course JupyterHub Server,
so be sure to record them for future use.

-----

## Github Desktop Client

While `git` and __github__ were designed to be used from the Unix
command line, __github__ can also be accessed via a Desktop client
application, as shown below.

![Github Desktop Client](images/github-appclient.png)

The __github__ desktop application can be downloaded directly from the
[__github__ site](https://desktop.github.com), shown in the following screenshot.

![Github Desktop Client](images/github-app.png)

From this application you can easily clone github repositories by
clicking the _Clone in Desktop_ button on the left hand side of the
repository's homepage, as shown below for this course's github
repository. The app will also track changed files, including new files
or deleted files, so that you can commit changes by simply clicking the
_Commit to master_ button. To push these changes to the __github__
repository, simply click the _Sync_ button.

![Github Clone Button ](images/github-button.png)

-----

## `git`

[`git`](http://git-scm.com) is a popular, free, open source version
control software tool. `git` can support fully distributed projects, by
using a shared server.

![Git Website Note](images/git-website.png)

In this course, you will not be required to use `git` frequently;
however, you can do so if you wish to use a source code versioning
system for your class notes and software assignments. If you wish to do
this, you can use `git` and (optionally) __github__ from within a
terminal window connected to your JupyterHub Docker container. The rest
of this document demonstrates how to do this.

-----

    # We could uncomment to display and live browse the git website
    # from IPython.display import HTML
    # HTML('<iframe src=http://git-scm.com width=800 height=500></iframe>')

## Install & Configure `git`

You can install and set up `git` from a terminal window within your Docker container:

1. Start a new terminal window (you can access a terminal window by
clicking the _new_ button on your JupyterHub homepage)
2. Test to see if `git` is already installed by entering `git` at the container prompt.  
2.1. `git` should be installed in the container, so you should see the `git` usage displayed.  
2.2. If `git` is not installed, you can easily [download][1] and install `git`.  
3. Once `git` is installed, we should set several configuration parameters.  
3.1. We can specify a name: `git config --global user.name "Your Name Here"`  
3.2. We can specify an email: `git config --global user.email "Your Email Here"`  
3.3. We might need to define a proxy (e.g., http) to use a distributed repository.  
3.4. We could set a number of other options, including colorization or default editor (e.g., `vim`).  
 
![Git Configure](images/git-config.png)

Note: in the images in this notebook, the prompt in the terminal window
shows `temp_host`. Your prompt will be similar, but will likely have a
different name like `rppds`. This is normal, and simply reflects a
different default hostname used in a particular class.
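
The configuration steps above can be sketched as a short script. This is a minimal sketch rather than the course's official setup: the sandboxed `HOME`, the placeholder email `you@example.com`, and the `color.ui` and `core.editor` values are assumptions chosen only for illustration.

```shell
#!/bin/sh
set -eu

# Assumption: use a throwaway HOME so that --global writes to a scratch
# ~/.gitconfig. Drop these three lines to change your real settings.
HOME="$(mktemp -d)"
export HOME
unset XDG_CONFIG_HOME

# Identity recorded with every commit (placeholder values).
git config --global user.name "Your Name Here"
git config --global user.email "you@example.com"

# Optional settings: colorized output and a default editor.
git config --global color.ui auto
git config --global core.editor vim

# Read the settings back to confirm they were stored.
configured_name="$(git config --global user.name)"
configured_editor="$(git config --global core.editor)"
echo "name:   $configured_name"
echo "editor: $configured_editor"

# List everything stored in the global configuration file.
git config --global --list
```

Reading a value back with `git config --global <key>` is a quick way to check a setting without opening the configuration file in an editor.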

-----
[1]: http://www.git-scm.com/downloads

## Create a `git` Repository

Once `git` is installed and configured, we can create a new repository.
In this example, we will create a new directory and add a single file,
but you can take an existing set of files and directories and transform
them into a `git` repository by following the same steps.

First, we will create a new directory, called `projects`:

    data_scientist@temp_host:~$ mkdir projects

Second, change into this new directory:

    data_scientist@temp_host:~$ cd projects

At this point, we have an empty project directory. We can create a
repository in this directory by using the `git init` command:

    data_scientist@temp_host:~$ git init
     
![Git Init](images/git-init.png)

Notice how in the new directory listing, we have a new sub-directory
called `.git`. This sub-directory contains the tracking information for
this project; deleting this directory will remove the entire project
history.
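
To see what `git init` creates, the following sketch repeats the steps above in a scratch location and inspects the result. The temporary directory from `mktemp -d` is an assumption, used so the example does not touch your home directory.

```shell
#!/bin/sh
set -eu

# Assumption: work under a temporary directory instead of ~.
workdir="$(mktemp -d)"
mkdir "$workdir/projects"
cd "$workdir/projects"

# Turn the empty directory into a repository.
git init

# The listing now includes the hidden .git sub-directory.
ls -a

# Ask git whether we are inside a working tree, and where the
# tracking information is stored.
inside="$(git rev-parse --is-inside-work-tree)"
gitdir="$(git rev-parse --git-dir)"
echo "inside work tree: $inside"
echo "git directory:    $gitdir"
```

Because all of the history lives in the `.git` sub-directory, copying the project directory (including `.git`) copies the full repository.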

-----

## Modifying a `git` repository

Now that our repository is initialized, we can add a file to track. First, we create a file called `plan`. We write placeholder text into this file by using the `echo` command (alternatively we could use an editor like vim).

    data_scientist@temp_host:~$ echo "Todo: Write the great American novel!" > plan

At this point, we have a project directory containing one file. We can always check the status of our repository by using the `git status` command:

    data_scientist@temp_host:~$ git status

The output of this command states that there is an untracked file in our project directory. To have `git` track this file, we need to add it to the repository, which we do with the `git add` command:

    data_scientist@temp_host:~$ git add plan

![Git Add](images/git-add.png)

At this point we have a repository with one file staged for the next commit.
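
The change in state caused by `git add` can be made explicit with a small sketch. It uses `git status --porcelain`, a compact, script-friendly form of `git status`; the scratch directory is an assumption made for illustration.

```shell
#!/bin/sh
set -eu

# Assumption: a fresh scratch repository, separate from your own work.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet

# Create the file, then record the status before and after adding it.
echo "Todo: Write the great American novel!" > plan
before="$(git status --porcelain)"
git add plan
after="$(git status --porcelain)"

echo "before add: $before"   # "?? plan" marks an untracked file
echo "after add:  $after"    # "A  plan" marks a file added to the index
```

The two-character code at the start of each line summarizes the state of a file, which is convenient when a directory contains many files.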

-----

## Committing a `git` repository

If we check the status of our git repository at this point, we see that
we have uncommitted changes. To save the current state of the repository,
we use the `git commit` command. When making a commit, we are given the
option of writing detailed comments on what changes are being made. This
forms an important history, and allows you, or other members of your
team, to understand what was changed and why at some later date.

The commit message can be directly specified on the command line by
using the `-m` flag, which is fine for short messages.

    data_scientist@temp_host:~$ git commit -m "Added new life plan."

We can now check the status of our git repository by issuing a new `git
status` command.

![Git Commit](images/git-commit.png)

Our repository is now up-to-date.
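
Once there is more than one commit, the history can be listed and an earlier version of a file recovered. The sketch below assumes a scratch repository, a placeholder identity set only for that repository, and a hypothetical second commit that revises the plan.

```shell
#!/bin/sh
set -eu

# Assumption: a scratch repository with a placeholder identity.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Your Name Here"
git config user.email "you@example.com"

# First commit: the original plan.
echo "Todo: Write the great American novel!" > plan
git add plan
git commit --quiet -m "Added new life plan."

# Second commit (hypothetical): -a stages changes to tracked files.
echo "Todo: Learn git first." > plan
git commit --quiet -a -m "Revised life plan."

# One line per commit, newest first.
git log --oneline

# Count the commits and read the file as it was one commit ago.
commit_count="$(git rev-list --count HEAD)"
first_version="$(git show HEAD~1:plan)"
echo "commits: $commit_count"
echo "previous plan: $first_version"
```

Here `HEAD~1:plan` names the file `plan` as stored in the commit just before the current one, so the earlier text is recovered without changing the working copy.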

-----

## Distributed Repositories

So far we have only worked with a local repository. This can be useful to provide version control for individuals, or on a shared server. But the real power of git arises when you work on a distributed server. We will not cover setting up and using a general distributed git server in this course. Instead, we will use the online [github site](https://github.com).
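
The shared workflow can be tried without any server by letting a bare repository on the local disk stand in for the remote. This is only a sketch under that assumption: the names `shared.git`, `alice`, and `bob` are hypothetical, and a real project would replace the local path with a URL.

```shell
#!/bin/sh
set -eu

# Assumption: everything lives under one scratch directory.
work="$(mktemp -d)"
cd "$work"

# A bare repository has no working files; it only stores history,
# which is the role a shared server plays.
git init --quiet --bare shared.git

# First user: clone the (empty) shared repository, commit, and push.
git clone --quiet shared.git alice
cd alice
git config user.name "Alice Example"
git config user.email "alice@example.com"
echo "Todo: Write the great American novel!" > plan
git add plan
git commit --quiet -m "Added new life plan."
git push --quiet origin HEAD

# Second user: a fresh clone receives the pushed commit.
cd "$work"
git clone --quiet shared.git bob
cat bob/plan
```

The first `git clone` warns that the repository is empty, which is expected here. Pushing `HEAD` sends the current branch to a branch of the same name on the remote, so the sketch does not depend on the default branch name.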

-----

## Using _github_

To use _github_, you need a working `git` installation, and, if you will
be uploading files to a _github_ repository, you need to choose and set a
connection method since you will need to repeatedly
[authenticate](https://help.github.com/articles/set-up-git/#next-steps-authenticating-with-github-from-git)
to _github_. [Connecting over
HTTPS](https://help.github.com/articles/which-remote-url-should-i-use#cloning-with-https-recommended)
is recommended and is the easiest to set up, so for now we will use this
method.

The [general
instructions](https://help.github.com/articles/caching-your-github-password-in-git/#platform-linux)
for caching your _github_ password in git are located on the _github_
website. In our case, we can simply set two additional configuration
parameters:

    $ git config --global credential.helper cache
    $ git config --global credential.helper 'cache --timeout=3600'
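
As a quick check, the sketch below applies both credential settings and reads the result back. It assumes a sandboxed `HOME` so that your real configuration is untouched. Note that the second command replaces the value written by the first, so only the setting with the timeout remains.

```shell
#!/bin/sh
set -eu

# Assumption: a throwaway HOME keeps --global changes in a scratch file.
HOME="$(mktemp -d)"
export HOME
unset XDG_CONFIG_HOME

# Enable the in-memory credential cache, then set a one hour timeout.
git config --global credential.helper cache
git config --global credential.helper 'cache --timeout=3600'

# Only one value is stored: the second command overwrote the first.
helper="$(git config --global --get credential.helper)"
echo "credential.helper = $helper"
```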

Now we can clone an existing repository on _github_ to our local
machine. Note that copying a repository to your own account by using the
_github_ web interface is a related but distinct process, known as forking.

![Git Clone](images/git-clone.png)

### <font color='red'>Warning</font>: Do not upload sensitive information to _github_!

-----

### Additional References

1. The [git tutorial](http://git-scm.com/book/en/v1/Getting-Started)
2. [Try out git](https://try.github.io/levels/1/challenges/1) online.
3. How to [setup github](https://help.github.com/articles/set-up-git/).

-----

### Return to the [Course Index](index.ipynb) page.

-----