# Intro to version control

## Because this.... 

<img src="img/phd_final_doc.png" alt="drawing" width="400"/>





## ... leads to this 

<img src="img/phd_story_told_infilenames.gif" alt="drawing" width="400"/>


## Why is version management important? 

- You can revert to a working version when something breaks.
- It makes collaboration within a team easier.
- It improves efficiency, because you no longer have to search for the right copy of a file.

## How should we manage changes? 

### Keeping track of changes: 

- Back up (almost) everything created by a human as soon as it is created.
- Keep changes small.
- Share changes frequently.
- Create, maintain and use a checklist for saving and sharing changes to the project. 
- Store each project in a folder that is mirrored off the researchers' working machine.  



This list comes from "Keeping track of changes" in swcarpentry's paper [good-enough practices in scientific computing](https://swcarpentry.github.io/good-enough-practices-in-scientific-computing/).

## Exercise 1: Manual versioning 

Versions can be managed either by hand or with a Version Control System (VCS). To illustrate how a VCS works, we start with an exercise in manual versioning.
The goals of this exercise are:
- Practice versioning best practices
- Understand the limitations of manual version management



This exercise should be done as a demo in plenum together with the students, so that they go through all the steps that manual versioning requires.

## 1A Setting up the project 
 
1. Create a folder named `simple_trigonometry` on your Desktop. This folder is your project folder.

```bash
mkdir ~/Desktop/simple_trigonometry
```






2. Move to that folder and add a file called `CHANGELOG.txt` to it.

```bash
cd ~/Desktop/simple_trigonometry
touch CHANGELOG.txt
```
Use `CHANGELOG.txt` to keep dated notes about changes to the project, in reverse chronological order (i.e., most recent first). This file is the equivalent of a lab notebook, and should contain entries like those shown below.

```markdown
## 2016-04-08

* Switched to cubic interpolation as default.
* Moved question about family disease history to end of questionnaire.

## 2016-04-06

* Added option for cubic interpolation.
* Removed question about blood pressure.
```
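Keeping the newest entry at the top by hand is easy to get wrong. The following is a minimal sketch of how the convention could be automated, assuming the entry format shown above. The function name `add_changelog_entry` and its parameters are hypothetical and not part of the exercise; the demo writes to a temporary directory so no real file is touched.

```python
import tempfile
from datetime import date
from pathlib import Path


def add_changelog_entry(changelog, notes, day=None):
    """Prepend a dated entry so the most recent notes stay at the top."""
    day = day or date.today()
    lines = [f"## {day.isoformat()}", ""] + [f"* {note}" for note in notes]
    entry = "\n".join(lines) + "\n"
    path = Path(changelog)
    previous = path.read_text() if path.exists() else ""
    separator = "\n" if previous else ""  # blank line between entries
    path.write_text(entry + separator + previous)


# Demo in a throwaway directory (hypothetical data taken from the example above).
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "CHANGELOG.txt"
    add_changelog_entry(log, ["Added option for cubic interpolation."], date(2016, 4, 6))
    add_changelog_entry(log, ["Switched to cubic interpolation as default."], date(2016, 4, 8))
    text = log.read_text()

print(text)
```

The later date is written last but ends up first in the file, which is the reverse chronological order the changelog is meant to have.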


## 1B Copy the entire project whenever a significant change has been made

A significant change is one that materially affects the results. Store each copy in a sub-folder whose name reflects the date, inside the area that is being synchronized. This approach results in projects being organized as shown below:

```bash
.
|-- project_name
|   -- current
|       -- ...project content as described earlier...
|   -- 2016-03-01
|       -- ...content of 'current' on Mar 1, 2016
|   -- 2016-02-19
|       -- ...content of 'current' on Feb 19, 2016
```
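The copy step can also be scripted. This is a minimal sketch under the assumption that the project follows the layout above, with the live files in a `current` sub-folder. The function name `snapshot` is hypothetical, and the demo builds a small fake project in a temporary directory.

```python
import shutil
import tempfile
from datetime import date
from pathlib import Path


def snapshot(project, day=None):
    """Copy project/current to project/<YYYY-MM-DD> and return the new path."""
    day = day or date.today()
    source = Path(project) / "current"
    target = Path(project) / day.isoformat()
    # copytree raises FileExistsError if the target already exists,
    # which protects an earlier snapshot from being overwritten.
    shutil.copytree(source, target)
    return target


# Demo with a throwaway project (hypothetical content).
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "project_name"
    (project / "current").mkdir(parents=True)
    (project / "current" / "test.py").write_text("# work in progress\n")
    copy = snapshot(project, date(2016, 3, 1))
    snapshot_name = copy.name
    copied_files = sorted(p.name for p in copy.iterdir())

print(snapshot_name, copied_files)
```

Every call duplicates the whole `current` folder, which is exactly the cost that the rest of this lesson tries to avoid.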

## 1C Example

Record your changes every time you finish a bullet point.

* Create a new file called `test.py` and add a function to it that calculates the circumference of a circle. Add your changes to `CHANGELOG.txt`, and copy the entire project to a subfolder whose name reflects the date of the change.

```python
from math import pi  # pi is not a built-in name

def Perimeter(r):
    return 2 * pi * r
```


* Create a new, empty file called `script.py`. Record your changes in the same manner as before.

```bash
touch script.py
```



* Add a print statement to `script.py`. Record your changes.

```python
print("This is a script")
```

* Show your CHANGELOG.txt file 

```bash
cat CHANGELOG.txt
```

## Problems with manual version control 

- It requires a lot of discipline and consistency.


- Imagine manually tracking every change made by a team that collaborates in the same project folder!

- Imagine that two members of the team modify the same line of the file `test.py`.
    - Resolving such conflicts is very difficult with a manual version control system.

- The project takes up more and more disk space, because every version is a full copy.
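To make the conflict problem concrete, here is a simplified sketch that compares two edited copies of `test.py` against their common starting point. It assumes all three versions have the same number of lines; real merge tools work on diffs and also handle inserted and deleted lines. The function name `find_conflicts` and the collaborators `alice` and `bob` are hypothetical.

```python
def find_conflicts(base, mine, theirs):
    """Return 1-based numbers of lines that both people changed differently.

    Simplification: all three versions must have the same number of lines.
    """
    conflicts = []
    for number, (b, m, t) in enumerate(zip(base, mine, theirs), start=1):
        if m != b and t != b and m != t:
            conflicts.append(number)
    return conflicts


base = ["def Perimeter(r):", "    return 2*pi*r"]
alice = ["def Perimeter(r):", "    return 2 * math.pi * r"]
bob = ["def Perimeter(radius):", "    return 2*pi*radius"]

conflicts = find_conflicts(base, alice, bob)
print(conflicts)
```

With dated folder copies there is no shared starting point on record unless someone kept it by hand, so even this basic comparison is hard to do reliably.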

# Git as the most used version control system

## Git is a *distributed* version control system (VCS) that automates these steps

![](img/Git-Logo.png)


- **git** was created by Linus Torvalds, the creator of Linux.

- **git** is version control software that keeps track of the entire history of a project and allows many people to collaborate on the same project.

- **git** is widely used in the software development world.

- **git** saves you from creating a copy of your files for every new version you save.

- **git** + **GitHub** allows you to collaborate with others efficiently.


## Which tools are we going to use?

- Terminal (Linux, macOS), Git Bash or WSL (Windows)
- Browser
- GitHub account

# Start of the lesson

<img src="img/start-lesson.jpeg" alt="drawing" width="400"/>




## The relationship between the working directory, the staging area and the git history

![](img/Git4TU_graphics_git_add_commit.png)

## Current state of the working directory, the staging area and the git history

![](img/Git4TU_graphics_git_add_commit_2.png)
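The relationship between the three areas can be mimicked with a toy model. This sketch is only an illustration of the idea and not how git stores data internally; the class `ToyRepo` and its methods are hypothetical names that mirror the roles of `git add` and `git commit`.

```python
class ToyRepo:
    """Toy model of the three areas; not how git stores data internally."""

    def __init__(self):
        self.working = {}  # file name -> content, as edited on disk
        self.staged = {}   # content chosen for the next commit
        self.history = []  # list of (message, snapshot) pairs

    def add(self, name):
        # Copy the current content of one file into the staging area.
        self.staged[name] = self.working[name]

    def commit(self, message):
        # Record a snapshot of the staging area, not of the working directory.
        self.history.append((message, dict(self.staged)))


repo = ToyRepo()
repo.working["test.py"] = "version 1"
repo.add("test.py")
repo.commit("Add test.py")

repo.working["test.py"] = "version 2"  # edited, but not staged again
repo.working["script.py"] = ""
repo.add("script.py")
repo.commit("Add empty script.py")

latest_message, latest_snapshot = repo.history[-1]
print(latest_snapshot)
```

The second snapshot still holds the first version of `test.py`, because the later edit was never staged. Only what has been added to the staging area becomes part of the next commit.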

## Ignoring files

![](img/Git4TU_graphics_ignoring_files.png)

## Ignoring files

![](img/Git4TU_graphics_ignoring_files_2.png)
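Ignoring files boils down to matching file names against a list of patterns. The sketch below approximates that idea with Python's `fnmatch` module. It is a simplification: real `.gitignore` rules are richer (directory patterns, negation with `!`, `**`), and the patterns and file names here are hypothetical.

```python
from fnmatch import fnmatch

# Hypothetical patterns, similar in spirit to lines in a .gitignore file.
IGNORE_PATTERNS = ["*.log", "*.pyc", "notes_private.txt"]


def is_ignored(filename, patterns=IGNORE_PATTERNS):
    """Return True if the file name matches any ignore pattern."""
    return any(fnmatch(filename, pattern) for pattern in patterns)


files = ["test.py", "script.py", "run.log", "test.pyc"]
tracked = [name for name in files if not is_ignored(name)]
print(tracked)
```

Only the source files remain in `tracked`; generated files such as logs and compiled bytecode are left out of version control.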

# Remote repositories 


![](img/github.png)

- GitHub is a hosting service for Git repositories.

- While Git is a command line tool, GitHub provides a web-based graphical interface.

- It also provides access control and several collaboration features, such as wikis and basic task management tools for every project.

# SSH connection 

**Secure Shell (SSH)** is a cryptographic network protocol for operating network services securely over an unsecured network by providing a secure channel to connect an SSH client with an SSH server. 

![](img/Asymmetric-encryption-primitive.png) 

#  Branches


![](img/Git4TU_graphics_branches.png)

# Pull requests 


![](img/Git4TU_graphics_pull_request.png)