# 01 Version Control Refresh

### Why?
* Allows you to keep a comprehensive record of all changes made during a project. This is very useful if you later make a change that breaks something as you will be able to track bugs down much faster!
* Keeps a backup of files. This is useful if you need to, for example, recreate some data (like a plot) that was made several months ago, as you can easily 'rewind' the code back to the state it was in back then
* Also allows you to make changes without worrying about breaking something (e.g. using branches)
* If working on a group project, helps prevent different members of the group overwriting each other's changes (you can merge multiple versions and use conflict resolution) and keeps a record of who wrote what
* No need for millions of different versions of files named e.g. `version_1.txt`, `version_2.txt`, ..., `version_27.txt`, `FINAL_version.txt`, `FINAL_FINAL_version.txt`, `FINAL_FINAL_PRINT_version.txt`.

### How?
* [git](https://git-scm.com/) is the most popular tool for local version control
* Alternatives: [Bazaar](http://bazaar.canonical.com/en/), [Subversion](https://subversion.apache.org/) (SVN), [Mercurial](https://www.mercurial-scm.org/)...
* Online tools for hosting repositories include [GitHub](https://github.com/), [GitLab](https://gitlab.com), [Bitbucket](https://bitbucket.org), [SourceForge](https://sourceforge.net/) and [Launchpad](https://launchpad.net/)

## Why should I use version control?

As you are on this course, I assume that you are familiar with version control and probably use it in the vast majority of projects. However, it is worth reminding ourselves why we use it (and why it's worth the occasional battle with `git`'s rather unintuitive workflow).

<center>![Git](https://imgs.xkcd.com/comics/git.png)
Git doesn't always make the most sense - [xkcd](https://xkcd.com/1597/)</center>

First of all, version control allows us to keep a comprehensive record of all changes made during a project. It keeps track of exactly what was edited, when this was done and by whom. This is very useful if a change made later down the line breaks something, as it makes tracking down the line of code responsible for the bug much faster.
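To get a feel for what "exactly what was edited" means, here is a minimal sketch using Python's standard `difflib` module to compare two versions of a file. This is only an illustration: the file name `area.py` and its contents are made up, and `git` has its own diff machinery rather than using `difflib`. The idea of a change being described as lines removed and lines added is the same, though.

```python
import difflib

# Two hypothetical versions of the same file, as lists of lines.
old_version = [
    "def area(r):\n",
    "    return 3.14 * r * r\n",
]
new_version = [
    "import math\n",
    "\n",
    "def area(r):\n",
    "    return math.pi * r ** 2\n",
]

# unified_diff yields header lines, then lines prefixed with '-' (removed),
# '+' (added) or ' ' (unchanged context).
diff_lines = list(
    difflib.unified_diff(
        old_version,
        new_version,
        fromfile="area.py (before)",
        tofile="area.py (after)",
    )
)
print("".join(diff_lines))
```

Version control stores this kind of information for every commit, along with the author, the date and a message describing the change.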

Version control allows us to keep backups of previous versions of files. This is important if you ever need to recreate some data that was made using a previous version of the code, as you can easily 'rewind' the code back to the state it was in at the time. For example, say you have submitted a paper to a journal. A few months later, you get a response from one of the reviewers asking you to replot one of your figures. However, in the meantime you have been working on your code, so in its current state you cannot recreate the data in the plot. If you have used version control, this will not be an issue, as you can simply roll your code back to the version you used to create the original plot.

Making major changes to code can be a little scary, as there is a good chance that the change will break everything. It may be tempting to make a separate version of your project by hand (`project_version_2`), but this quickly gets messy: eventually you will end up with folders full of different versions of your project. Version control offers a better solution, allowing you to create branches. The working version of the code is preserved on the main (master) branch while you hack away at the code on a separate branch. If you implement a change and it breaks everything, you can instantly roll your code back to the previous working version, thanks to version control having kept a handy set of backups for you.

<center>![Versions](http://www.phdcomics.com/comics/archive/phd101212s.gif)
Do not do this - [PHD Comics](http://www.phdcomics.com/comics/archive.php?comicid=1531)</center>
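The branch workflow described above can be sketched in a few lines. The example below drives `git` from Python with `subprocess` in a throwaway temporary directory, so that it can be run from a notebook cell; in practice you would normally type the same commands in a terminal. It assumes `git` is installed and on your path, and the file name `analysis.py`, the branch name `experiment` and the user details are all invented for the illustration.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def git(*args, cwd):
    """Run a git command in the directory cwd and return its output as text."""
    result = subprocess.run(
        ["git", *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def branch_demo():
    """Commit a working file, break it on a branch, then switch back."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        script = repo / "analysis.py"  # hypothetical file name

        git("init", cwd=repo)
        # Identity used only inside this throwaway repository.
        git("config", "user.name", "Example User", cwd=repo)
        git("config", "user.email", "user@example.com", cwd=repo)

        # Commit a working version on the default branch.
        script.write_text("print('working version')\n")
        git("add", "analysis.py", cwd=repo)
        git("commit", "-m", "Add working analysis script", cwd=repo)
        stable = git("rev-parse", "--abbrev-ref", "HEAD", cwd=repo)

        # Hack away on a separate branch.
        git("checkout", "-b", "experiment", cwd=repo)
        script.write_text("print('broken experiment')\n")
        git("commit", "-am", "Try a risky change", cwd=repo)
        on_experiment = script.read_text()

        # Switching back restores the working version of the file.
        git("checkout", stable, cwd=repo)
        on_stable = script.read_text()
        return stable, on_experiment, on_stable


if __name__ == "__main__" and shutil.which("git"):
    stable, on_experiment, on_stable = branch_demo()
    print("default branch:", stable)
    print("on experiment :", on_experiment.strip())
    print("back on stable:", on_stable.strip())
```

The risky change is still safely stored on the `experiment` branch, so nothing is lost by switching back, and no `project_version_2` folder was needed.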

If you are working on a project with other people, version control will keep track of who makes what changes and can help prevent different members from overwriting each other's changes through conflict resolution. This becomes more important the more people there are working on a project: imagine trying to work on a document with 100 other people and communicating changes to the document solely by email!

<center>![Git branches](https://cdn-images-1.medium.com/max/400/1*naZweK-cwJpKRYp6Mp_EHg.png)
Even with git, projects can get messy - [Medium](https://medium.com/@dashersw/how-to-manage-git-workflow-and-stay-sane-e32405e9dbf0#.32rtsbmzh)</center>

For more on version control and reproducible research, check out [Tools for Reproducible Research](http://kbroman.org/Tools4RR/assets/lectures/06_org_eda_withnotes.pdf) by Karl Broman, where this excellent quote on the importance of version control comes from:

> **Your closest collaborator is you six months ago, but you don’t reply to emails.**

## How?

[git](https://git-scm.com/) is the most popular tool for local version control; however, alternatives do exist: [Bazaar](http://bazaar.canonical.com/en/), [Subversion](https://subversion.apache.org/) (SVN), [Mercurial](https://www.mercurial-scm.org/)...
There are many online tools for hosting repositories out there, including [GitHub](https://github.com/), [GitLab](https://gitlab.com), [Bitbucket](https://bitbucket.org), [SourceForge](https://sourceforge.net/) and [Launchpad](https://launchpad.net/).

## Read more
- [Git cheat sheet](https://services.github.com/on-demand/downloads/github-git-cheat-sheet.pdf)
- [Think Like (a) Git](http://think-like-a-git.net/) - if you want to go beyond the basics and understand what git is actually doing
- [A Crash Course in Python for Scientists](http://nbviewer.jupyter.org/gist/anonymous/5924718) - a good reference in case you need to quickly remind yourself of something
- [Python Cookbook](http://chimera.labs.oreilly.com/books/1230000000393) - free to read online, contains lots of useful tricks for going beyond the basics in Python
