# Advanced Version Control
-------------------------------------------------------------

Git Repository for this Workshop: https://github.com/unmrds/cc-version-control

... for this notebook: https://github.com/unmrds/cc-version-control/blob/master/03-advanced-version-control.ipynb

Welcome, and welcome back! The notebooks for today's workshop have been added to the same repository we used for the introduction to Git and DVCS in December 2017. Feel free to refer to those materials for a refresher or background information. We may repeat some setup steps today, but we generally won't go through those notebooks again. We are happy to answer any questions, though!

The concepts covered in the previous workshop focused on a basic workflow:

__Pull - - > Edit/Create/Delete -> Add -> Commit - - > Push__

![The basic workflow upon which all others are built ...](./images/basic_cycle.png)

In reality, though, it is better to think of this workflow as a sequence of changes, each of which is recorded in the history of the repository:

![A sequential process ...](./images/basic_sequence.png)
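
As a rough sketch, two passes through the basic cycle might look like this on the command line. The setup is an assumption made for illustration: a bare repository in a temporary directory stands in for the GitHub remote, and `notes.txt` and the commit messages are hypothetical.

```shell
set -e
# Assumption: a local bare repository plays the role of the GitHub remote.
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git clone -q "$work/remote.git" "$work/local"
cd "$work/local"
git config user.name "Workshop User"
git config user.email "workshop@example.com"
# The default branch is "master" or "main", depending on your Git configuration.
branch=$(git symbolic-ref --short HEAD)

# First pass: Edit/Create/Delete -> Add -> Commit -> Push
echo "First draft" > notes.txt
git add notes.txt
git commit -q -m "Add first draft of notes"
git push -q -u origin "$branch"

# Second pass: Pull first, in case collaborators have pushed changes
git pull -q
echo "Second draft" >> notes.txt
git add notes.txt
git commit -q -m "Revise notes"
git push -q

# Each commit is one step in the recorded sequence of changes
git log --oneline
```

The final `git log --oneline` lists one line per commit, newest first, which is the sequence shown in the figure above.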


In practice, and especially among collaborators, the basic workflow is often insufficient for complex tasks. 

![A complex collaborative process ...](./images/complex_sequence.png)

Today we will cover some additional concepts and commands that are useful in real-life workflows:

* __Branching__ (`git checkout` and `git branch`) - creating a separate line of development from the current repository state, so that you can work (within the repository) in parallel with ongoing changes in the source branch.
* __Merging__ (`git merge`) - integrating the changes made in a separate branch back into the source branch (typically after testing).
* __Conflict resolution__ ([GitHub process documentation](https://help.github.com/articles/resolving-a-merge-conflict-using-the-command-line/#platform-windows)) - resolving the conflicts that Git identifies during a merge, when parallel branches have changed the same content in different ways.
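
The three concepts above can be sketched together in a throwaway repository. This is a minimal illustration, not a recommended workflow: the branch name `experiment`, the file `notes.txt`, and the way the conflict is resolved (keeping both lines) are all assumptions chosen for the example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Workshop User"
git config user.email "workshop@example.com"
# The source branch is "master" or "main", depending on your Git configuration.
base=$(git symbolic-ref --short HEAD)

echo "Step 1" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Branching: create and switch to a new branch
git checkout -q -b experiment
echo "Step 2 (experiment)" >> notes.txt
git commit -q -am "Extend notes on the experiment branch"

# Meanwhile, the source branch changes the same part of the file
git checkout -q "$base"
echo "Step 2 (base)" >> notes.txt
git commit -q -am "Extend notes on the source branch"

# Merging: both branches appended a different line in the same place,
# so Git reports a conflict and leaves conflict markers in notes.txt
if ! git merge experiment; then
    # Conflict resolution: write the content we want, then stage and commit
    printf 'Step 1\nStep 2 (base)\nStep 2 (experiment)\n' > notes.txt
    git add notes.txt
    git commit -q -m "Merge experiment, keeping both changes"
fi

git log --oneline --graph
```

In practice you would open the conflicted file in an editor and decide line by line what to keep, rather than overwriting it as this sketch does.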

Also, sometimes you need to tune up the contents of your repository, or go back in time.

* __Ignoring files and directories__ (`.gitignore`) - defining parts of the working directory that should not be tracked in the history of the repository. Examples include large data files that are stored elsewhere, and the products of processes that generate documents or compiled applications from the content of the repository.
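* __Rolling back commits__ (`git reset` and `git revert`) - bringing the repository back to a previous commit state, preferably while maintaining the history of the repository throughout the entire process. Note that `git reset` moves the branch back and drops later commits from its history, whereas `git revert` adds a new commit that undoes an earlier one.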
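
Both ideas can be tried safely in a temporary repository. In the sketch below, ignore rules live in a file named `.gitignore`, and two ways of going back are shown: `git revert`, which keeps the history, and `git reset --hard`, which discards commits. The file names and ignore patterns are hypothetical.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Workshop User"
git config user.email "workshop@example.com"

# Ignore a data directory and any generated PDF files
printf 'data/\n*.pdf\n' > .gitignore
mkdir data
echo "large raw data" > data/raw.csv
echo "generated output" > report.pdf
echo "analysis v1" > analysis.txt
git add .
git commit -q -m "Add analysis and ignore rules"
git status --short --ignored   # ignored paths are listed with a "!!" prefix

# Commit a mistake, then undo it with a new commit (history is kept)
echo "a mistake" >> analysis.txt
git commit -q -am "Introduce a mistake"
git revert --no-edit HEAD

# Commit a scratch change, then drop it entirely (history is rewritten)
echo "scratch" >> analysis.txt
git commit -q -am "Scratch change"
git reset -q --hard HEAD~1

git log --oneline
```

The log ends with three commits: the original, the mistake, and the revert. The scratch commit no longer appears at all, which is why `git reset --hard` is best avoided on commits that have already been pushed and shared.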

Finally, you may want to work on a copy of someone else's repository, either for your own purposes, or to contribute changes to the original project.

* __Fork a Repository__ ([GitHub process documentation](https://help.github.com/articles/fork-a-repo/#platform-windows)) - Create a copy of a public repository owned by someone else under your own account, so that you have commit privileges for the copy.

* __Generate a *pull request*__ ([GitHub process documentation](https://help.github.com/articles/about-pull-requests/)) - Ask that changes you have made to your forked repository be integrated into the source repository from which the fork was created.
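
Forking and opening a pull request both happen in the GitHub web interface, but the command-line side of the workflow can be sketched locally. Everything here is an assumption made for illustration: local repositories in a temporary directory stand in for the original project (`upstream`) and your fork (`fork.git`), and the branch name `fix-readme` is hypothetical.

```shell
set -e
work=$(mktemp -d)

# Stand-in for someone else's public repository
git init -q "$work/upstream"
cd "$work/upstream"
git config user.name "Project Owner"
git config user.email "owner@example.com"
echo "Project" > README.md
git add README.md
git commit -q -m "Initial commit"

# Stand-in for clicking "Fork" on GitHub: a copy you are allowed to push to
git clone -q --bare "$work/upstream" "$work/fork.git"

# Clone your fork and register the original project as a second remote
git clone -q "$work/fork.git" "$work/mycopy"
cd "$work/mycopy"
git config user.name "Workshop User"
git config user.email "workshop@example.com"
git remote add upstream "$work/upstream"
git remote -v

# Make the change on its own branch and push it to your fork
git checkout -q -b fix-readme
echo "Usage notes" >> README.md
git commit -q -am "Add usage notes to README"
git push -q -u origin fix-readme

# Later, fetch new work from the original project to stay up to date
git fetch -q upstream
```

With a real fork, the next step would be to open a pull request on GitHub from the `fix-readme` branch of your fork to the source repository.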

In the intro session, we presented a handful of DVCS use cases. Before we get into the hands-on part of today's workshop, it may be interesting to look at some more of these use cases in practice and at scale.

## Git in the wild
---------------------------------------------------------------

### The Linux kernel

This is something of a meta-example, since Linus Torvalds initially developed Git as a VCS for the Linux kernel. Some interesting history is available on the [Git Wikipedia page](https://en.wikipedia.org/wiki/Git). 

The kernel is complex:

![A map of the Linux kernel](./images/Linux_kernel_map.png)

Image by Conan at English Wikipedia, CC BY 3.0, https://commons.wikimedia.org/w/index.php?curid=6092674

A look at [Kernel.org](https://git.kernel.org/) demonstrates a lot of the benefits of DVCS:

* Many, distributed contributors
* Annotation, transparency, audit trails
* Change tracking

### Development of Sensor Firmware

From [*Avian's Blog*](https://www.tablix.org/~avian/blog/archives/code/index-page3.html)

![A visualization of the VESNA firmware repository history](https://www.tablix.org/~avian/blog/images2/2014/06/vesna_drivers_repository_branching_visualization.png)


## Branching Models

Many different strategies have been developed for using branching to manage concurrent and parallel development efforts. *GitFlow* is one popular model adopted by development teams to coordinate their work.

### "A Successful Git Branching Model" - *GitFlow* - *by [Vincent Driessen](https://nvie.com/about/)*

![Summary Figure](https://nvie.com/img/git-model@2x.png)

[https://nvie.com/posts/a-successful-git-branching-model/](https://nvie.com/posts/a-successful-git-branching-model/)

[https://datasift.github.io/gitflow/IntroducingGitFlow.html](https://datasift.github.io/gitflow/IntroducingGitFlow.html)
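
A simplified pass through part of this model, using only plain Git commands, might look like the sketch below. It covers one feature branch and one release merge. The branch name `feature/sensor-driver`, the file `driver.txt`, and the tag `1.0` are hypothetical, and release and hotfix branches are left out for brevity.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Workshop User"
git config user.email "workshop@example.com"
# The main branch is "master" or "main", depending on your Git configuration.
base=$(git symbolic-ref --short HEAD)
git commit -q --allow-empty -m "Initial commit"

# A long-lived develop branch collects finished features
git checkout -q -b develop

# Each feature is developed on its own branch, started from develop
git checkout -q -b feature/sensor-driver develop
echo "driver code" > driver.txt
git add driver.txt
git commit -q -m "Add sensor driver"

# Merge with --no-ff so the feature stays visible as a unit in the history
git checkout -q develop
git merge -q --no-ff -m "Merge feature/sensor-driver into develop" feature/sensor-driver
git branch -d feature/sensor-driver

# A release: merge develop into the main branch and tag the result
git checkout -q "$base"
git merge -q --no-ff -m "Release 1.0" develop
git tag -a 1.0 -m "Version 1.0"

git log --oneline --graph --all
```

The `--no-ff` option forces a merge commit even when a fast-forward would be possible, which keeps the boundaries of each feature visible in the graph.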

## Time for Practice

[04-git-tutorial-deux](./04-git-tutorial-deux.ipynb)