# Version Control

Revision control software such as Git (`git`) or Mercurial (`hg`) is one of the most important tools in any software development project.

In the rest of this lecture we will look at `git`, although `hg` is just as good and works in almost exactly the same way.

The source code or digital content is stored in a **repository**.

The repository contains not only the latest version of all files, but also the complete history of all changes made to the files since they were added to the repository.

A user can **checkout** the repository, and obtain a local working copy of the files. All changes are made to the files in the local working directory, where files can be added, removed and updated.

When a task has been completed, the changes to the local files are **committed** (saved to the repository).

If someone else has been making changes to the same files, a **conflict** can occur. In many cases conflicts can be resolved automatically by the system, but in some cases we might have to **merge** the different changes together manually.

It is often useful to create a new **branch** in a repository, or a **fork** or **clone** of an entire repository, when we are doing larger experimental development. The main branch in a repository is often called **master**. When work on a branch or fork is completed, it can be merged into the master branch/repository.

With Git or Mercurial, we can **pull** and **push** changesets between different repositories, for example between a local copy of the repository and a central online repository (such as one on a community repository hosting site like github.com).

### 1. Create a repository

To create a brand new empty repository, we can use the command `git init repository-name`:

In [1]:
!git init gitdemo

Initialized empty Git repository in /Users/BeiciLiang/GitHub/ECS719-SoftwareCarpentry/gitdemo/.git/


Using the command `git status`, we get a summary of the current status of the working directory. It shows whether we have modified, added or removed files.

In [11]:
cd gitdemo

/Users/BeiciLiang/GitHub/ECS719-SoftwareCarpentry/gitdemo


In [13]:
!git status

On branch master

Initial commit

nothing to commit (create/copy files and use "git add" to track)


### 2. Add files

To add a new file to the repository, we first create the file and then use the `git add filename` command.

There are a couple of ways to write the sentence "A file with information about the gitdemo repository." into a file named README.

- In a terminal, run `nano README`, type the sentence, press `CTRL`+`O` to write it out, and then press `CTRL`+`X` to exit.

- Alternatively, run `echo "A file with information about the gitdemo repository." > README` to write the sentence into the file.

- In Jupyter Notebook, built-in cell magics such as `%%file` write the contents of a cell to a file.


In [15]:
%%file README

A file with information about the gitdemo repository.

Writing README


In [16]:
!git status

On branch master

Initial commit

Untracked files:
  (use "git add <file>..." to include in what will be committed)

	README

nothing added to commit but untracked files present (use "git add" to track)


After the file README has been created, the command `git status` lists it as an untracked file.

In [17]:
!git add README

In [18]:
!git status

On branch master

Initial commit

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)

	new file:   README



Now that it has been added, it is listed as a new file that has not yet been committed to the repository.
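
The same untracked-then-staged transition can be watched in a more compact form. The following is a minimal sketch rather than part of the session above: it assumes a throwaway directory created with `mktemp -d` and a hypothetical repository name `statusdemo`, and it uses `git status --short`, which prints one line per file prefixed by a two-character state code.

```shell
set -e
# Work in a throwaway directory so the real gitdemo repository is untouched.
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q statusdemo   # "statusdemo" is a made-up name for this sketch
cd statusdemo

echo "A file with information about the gitdemo repository." > README

# Before `git add`, the file is untracked: "?? README"
before_add=$(git status --short)
echo "$before_add"

git add README

# After `git add`, the file is staged as a new file: "A  README"
after_add=$(git status --short)
echo "$after_add"
```

The `??` code marks an untracked file and `A` marks a newly added file that is staged for the next commit.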

### 3. Commit 

In [19]:
!git commit -m "Added a README file" README

[master (root-commit) 42fafb8] Added a README file
 1 file changed, 2 insertions(+)
 create mode 100644 README


In [20]:
!git status

On branch master
nothing to commit, working tree clean


When files that are tracked by Git are changed, they are listed as modified by `git status`:

In [21]:
%%file README

A file with information about the gitdemo repository.

A new line.

Overwriting README


In [22]:
!git status

On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	modified:   README

no changes added to commit (use "git add" and/or "git commit -a")


Again, we can commit such changes to the repository using the `git commit -m "message"` command.

In [23]:
!git commit -m "added one more line in README" README

[master e85e791] added one more line in README
 1 file changed, 3 insertions(+), 1 deletion(-)


In [24]:
!git status

On branch master
nothing to commit, working tree clean
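
The cells above pass the file name directly to `git commit`. A common alternative is to stage the change with `git add` first and then commit whatever is staged, without naming a path. The sketch below illustrates that two-step workflow under a few assumptions that are not in the original session: a throwaway directory, a hypothetical repository name `commitdemo`, and a placeholder identity set with `git config`.

```shell
set -e
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q commitdemo            # hypothetical repository name
cd commitdemo
# Placeholder identity, stored only in this repository's local config.
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "A file with information about the gitdemo repository." > README
git add README
git commit -q -m "Added a README file"

# Modify the tracked file, stage the change, then commit what is staged.
echo "A new line." >> README
git add README
git commit -q -m "added one more line in README"

commit_count=$(git rev-list --count HEAD)   # number of commits reachable from HEAD
remaining=$(git status --short)             # empty when the working tree is clean
echo "commits: $commit_count"
```

Staging first is useful when several files have changed but only some of them belong in the next commit.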


### 4. Remove files

To remove a file that has been added to the repository, use `git rm filename`, which works similarly to `git add filename`:

In [26]:
%%file tmpfile

A short-lived file.

Writing tmpfile


Add it:

In [27]:
!git add tmpfile

In [28]:
!git commit -m "adding file tmpfile" tmpfile

[master 6608496] adding file tmpfile
 1 file changed, 2 insertions(+)
 create mode 100644 tmpfile


Remove it again:

In [29]:
!git rm tmpfile

rm 'tmpfile'


In [30]:
!git commit -m "remove file tmpfile" tmpfile

[master b810ca8] remove file tmpfile
 1 file changed, 2 deletions(-)
 delete mode 100644 tmpfile


### 5. Commit Logs

The messages that are added to the commit command are supposed to give a short (often one-line) description of the changes/additions/deletions in the commit. If `-m "message"` is omitted when invoking the `git commit` command, an editor will be opened for you to type a commit message (useful, for example, when a longer commit message is required).

We can look at the revision log by using the command `git log`:

In [34]:
!git log

commit b810ca89e9896166310ba3ad2d29bd46b9f9e9ff
Author: beiciliang <liangbeici@gmail.com>
Date:   Tue Mar 13 00:30:58 2018 +0000

    remove file tmpfile

commit 6608496f16fbcc9f90a1953bc70df1382b120fb4
Author: beiciliang <liangbeici@gmail.com>
Date:   Tue Mar 13 00:30:32 2018 +0000

    adding file tmpfile

commit e85e7917bdd47e73e1c5288484c4b6288b39d692
Author: beiciliang <liangbeici@gmail.com>
Date:   Tue Mar 13 00:28:34 2018 +0000

    added one more line in README

commit 42fafb866758b1924224aa95dafd667dc848eb1a
Author: beiciliang <liangbeici@gmail.com>
Date:   Tue Mar 13 00:23:10 2018 +0000

    Added a README file
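
For a long history the full log can be hard to scan. As a minimal sketch (again in a throwaway directory, with a hypothetical repository name `logdemo` and a placeholder identity), `git log --oneline` prints one line per commit, consisting of the abbreviated hash and the subject line, newest first.

```shell
set -e
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q logdemo               # hypothetical repository name
cd logdemo
git config user.name "Demo User"  # placeholder identity for this sketch
git config user.email "demo@example.com"

echo "A file with information about the gitdemo repository." > README
git add README
git commit -q -m "Added a README file"

echo "A new line." >> README
git commit -q -m "added one more line in README" README

# One line per commit: abbreviated hash followed by the subject, newest first.
git --no-pager log --oneline

# The subjects alone, useful for scripting.
subjects=$(git --no-pager log --format=%s)
newest=$(echo "$subjects" | head -n 1)
oldest=$(echo "$subjects" | tail -n 1)
```

The hashes printed will differ from run to run, because they depend on the author, the timestamp and the content of each commit.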


### 6. Diffs

Every commit results in a changeset, which has an associated "diff" describing the changes to the files. We can use `git diff` to see what has changed in a file:

In [35]:
%%file README

A file with information about the gitdemo repository.

README files usually contains installation instructions, and information about how to get started using the software (for example).

Overwriting README


In [36]:
!git diff README

diff --git a/README b/README
index 4f51868..d3951c6 100644
--- a/README
+++ b/README
@@ -1,4 +1,4 @@
 
 A file with information about the gitdemo repository.
 
-A new line.
\ No newline at end of file
+README files usually contains installation instructions, and information about how to get started using the software (for example).
\ No newline at end of file
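
One detail that often surprises newcomers is that plain `git diff` compares the working directory with the staging area, so a change disappears from its output once it has been staged. The sketch below illustrates this in a throwaway repository (the name `diffdemo` and the identity are placeholders); `git diff --cached` shows what is staged, and `--name-only` limits the output to file names.

```shell
set -e
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q diffdemo              # hypothetical repository name
cd diffdemo
git config user.name "Demo User"  # placeholder identity for this sketch
git config user.email "demo@example.com"

echo "A file with information about the gitdemo repository." > README
git add README
git commit -q -m "Added a README file"

echo "A new line." >> README

# Unstaged change: reported by plain `git diff`.
unstaged=$(git diff --name-only)

git add README

# Once staged, plain `git diff` no longer reports the change ...
after_add=$(git diff --name-only)
# ... but `git diff --cached` does.
staged=$(git diff --cached --name-only)

echo "unstaged: $unstaged / after add: $after_add / staged: $staged"
```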


### 7. Discard changes

To discard a change (revert to the latest version in the repository) we can use the checkout command like this:

In [37]:
!git checkout -- README

In [38]:
!git status

On branch master
nothing to commit, working tree clean
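
The discard step can be reproduced end to end with the following minimal sketch, which assumes a throwaway directory, a hypothetical repository name `discarddemo` and a placeholder identity. It makes an unwanted edit to a committed file and then restores the committed version.

```shell
set -e
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q discarddemo           # hypothetical repository name
cd discarddemo
git config user.name "Demo User"  # placeholder identity for this sketch
git config user.email "demo@example.com"

echo "A file with information about the gitdemo repository." > README
git add README
git commit -q -m "Added a README file"

# Make an edit we do not want to keep.
echo "An unwanted edit." >> README
git status --short                # prints " M README"

# Throw the edit away and restore the committed version.
# (In Git 2.23 or newer, `git restore README` does the same thing.)
git checkout -- README

restored=$(cat README)
leftover=$(git status --short)    # empty again after the discard
echo "$restored"
```

Note that a discarded change cannot be recovered through Git, because it was never committed.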


If we want to get the code for a specific revision, we can use `git checkout` and give it the hash code of the revision we are interested in as an argument:

In [39]:
# Change the hash code to one of yours shown in the git log
!git checkout 42fafb866758b1924224aa95dafd667dc848eb1a

Note: checking out '42fafb866758b1924224aa95dafd667dc848eb1a'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b <new-branch-name>

HEAD is now at 42fafb8... Added a README file


Now the content of all the files is as it was in the revision with the hash code listed above (the first revision):

In [40]:
!cat README


A file with information about the gitdemo repository.

We can move back to "the latest" (master) with the command:

In [42]:
!git checkout master

Previous HEAD position was 42fafb8... Added a README file
Switched to branch 'master'


In [43]:
!cat README


A file with information about the gitdemo repository.

A new line.

In [44]:
!git status

On branch master
nothing to commit, working tree clean


### 8. Branches

With branches we can create diverging code bases in the same repository. They are useful, for example, for experimental development that requires many code changes that could break functionality in the master branch. Once development on a branch has reached a stable state, it can be merged back into the master branch. A branch-develop-merge cycle is a good development strategy when several people are working on the same code base. Even in single-author repositories, it is often useful to keep the master branch in a working state at all times: branch (or fork) before implementing a new feature, and merge it back into master later.

In Git, we can create a new branch like this:

In [45]:
!git branch expr1

We can list the existing branches like this:

In [46]:
!git branch

  expr1
* master


And we can switch between branches using checkout:

In [47]:
!git checkout expr1

Switched to branch 'expr1'



Make a change in the new branch.

In [48]:
%%file README

A file with information about the gitdemo repository.

README files usually contains installation instructions, and information about how to get started using the software (for example).

Experimental addition.

Overwriting README


In [49]:
!git commit -m "added a line in expr1 branch" README

[expr1 c5d52aa] added a line in expr1 branch
 1 file changed, 3 insertions(+), 1 deletion(-)


In [50]:
!git branch

* expr1
  master


We can merge an existing branch and all its changesets into another branch (for example the master branch) like this:

First change to the target branch:

In [53]:
!git checkout master

Already on 'master'


In [54]:
!git merge expr1

Updating b810ca8..c5d52aa
Fast-forward
 README | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)


In [55]:
!git branch

  expr1
* master


We can delete the branch expr1 now that it has been merged into the master:

In [56]:
!git branch -d expr1

Deleted branch expr1 (was c5d52aa).


In [57]:
!git branch

* master


In [58]:
!cat README


A file with information about the gitdemo repository.

README files usually contains installation instructions, and information about how to get started using the software (for example).

Experimental addition.
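
The whole branch, commit, merge and delete cycle can be condensed into one script. This is a sketch under stated assumptions rather than a replay of the cells above: it runs in a throwaway directory with a hypothetical repository name `branchdemo` and a placeholder identity, it reads the name of the initial branch from Git instead of assuming `master`, and it uses `git checkout -b` to create and switch to the new branch in one step.

```shell
set -e
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q branchdemo            # hypothetical repository name
cd branchdemo
git config user.name "Demo User"  # placeholder identity for this sketch
git config user.email "demo@example.com"

echo "A file with information about the gitdemo repository." > README
git add README
git commit -q -m "Added a README file"

# Name of the initial branch ("master" in the session above).
main_branch=$(git symbolic-ref --short HEAD)

# Create the experimental branch and switch to it in one step.
git checkout -q -b expr1
echo "Experimental addition." >> README
git commit -q -m "added a line in expr1 branch" README

# Switch back to the target branch and merge the experiment into it.
git checkout -q "$main_branch"
git merge -q expr1

# The branch is fully merged, so it can be deleted safely.
git branch -d expr1

current=$(git symbolic-ref --short HEAD)
leftover_branch=$(git branch --list expr1)
cat README
```

Because the target branch had no new commits of its own, the merge is a fast-forward, as in the output shown earlier.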

### 9. Pull and Push

If the repository has been cloned from another repository, for example on github.com, it automatically remembers the address of the parent repository (called origin):

In [62]:
!git remote

In [63]:
!git remote show origin

fatal: 'origin' does not appear to be a git repository
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.


Our repository was created locally with `git init` rather than cloned, so no remote called origin has been registered yet and the command fails.

Now let's create a new repository called *gitdemo* on GitHub.

GitHub (github.com) is a Git repository hosting site that is very popular with both open source projects (for which it is free) and private repositories (for which a subscription might be needed). With a hosted repository it is easy to collaborate with colleagues on the same code base, and you get a graphical user interface where you can browse the code, look at commit logs, track issues, etc.

To push our existing repository to the hosted repository:

In [65]:
# Please change beiciliang to your own GitHub username
!git remote add origin https://github.com/beiciliang/gitdemo.git

In [67]:
!git remote show origin

* remote origin
  Fetch URL: https://github.com/beiciliang/gitdemo.git
  Push  URL: https://github.com/beiciliang/gitdemo.git
  HEAD branch: (unknown)


In [68]:
!git push -u origin master

Counting objects: 13, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (9/9), done.
Writing objects: 100% (13/13), 1.19 KiB | 0 bytes/s, done.
Total 13 (delta 2), reused 0 (delta 0)
remote: Resolving deltas:  50% (1/2)

We can retrieve updates from the origin repository by "pulling" changesets from "origin" to our repository:

In [69]:
!git pull origin

Already up-to-date.


We can register addresses to many different repositories, and pull in different changesets from different sources, but the default source is the origin from which the repository was first cloned (so the word origin could have been omitted from the command above).
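
The push and pull cycle can also be tried without a GitHub account. In the sketch below, a bare repository in a local directory stands in for the hosted repository. This is an assumption made purely for illustration, as are the directory name `hosted.git` and the placeholder identity. The commands for registering the remote, pushing and pulling are the same as with a URL on github.com.

```shell
set -e
demo_dir=$(mktemp -d)
cd "$demo_dir"

# A bare repository (no working copy) plays the role of the hosted repository.
git init -q --bare hosted.git     # hypothetical stand-in for the GitHub repository

git init -q gitdemo
cd gitdemo
git config user.name "Demo User"  # placeholder identity for this sketch
git config user.email "demo@example.com"

echo "A file with information about the gitdemo repository." > README
git add README
git commit -q -m "Added a README file"
branch=$(git symbolic-ref --short HEAD)   # "master" in the session above

# Register the stand-in repository under the name origin and push to it.
git remote add origin "$demo_dir/hosted.git"
git push -q -u origin "$branch"

# Nothing has changed on the remote, so this pull finds nothing new.
git pull -q origin "$branch"

remotes=$(git remote)
local_head=$(git rev-parse HEAD)
remote_head=$(git --git-dir="$demo_dir/hosted.git" rev-parse "refs/heads/$branch")
echo "local:  $local_head"
echo "remote: $remote_head"
```

After the push, both repositories point at the same commit, which is what the two hashes printed at the end show.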