# The git version control system

<center>
<img src="https://imgs.xkcd.com/comics/git.png"/>  
</center>

Version control systems are systems to **manage changes** to documents, computer programs, large web sites, and other collections of information.

## The main idea of version control systems
<center>
  <img src="https://raw.githubusercontent.com/progit/progit2/2.1.252/images/distributed.png" style="width: 600px;"/>     
  </center>


* A version-controlled project (typically) has **one official repository**.
* Contributors work on **copies** of repository files and upload the changes to the official repository.
* **Conflicts** might occur if two people work on the same file simultaneously.
    * Non-conflicting modifications are merged automatically.
    * Conflicting modifications must be resolved manually.  

## Use cases for version control systems

**Organization**
  * Retrieve old versions of files.
  * Print history of changes.
  
**Collaboration**
 * Share code between people and work simultaneously on the same codebase
 * Track changes and quickly undo changes if necessary

**Backups**
  * Store a copy of the git repository on an external platform, e.g. GitHub

## Git: the current standard for version control

  * git is a **fast**, **decentralized**, and **open-source** version control system
  * Many sites for storing git repositories online (e.g. GitHub and Bitbucket).
  * Installation instructions: <https://git-scm.com>. On Debian derivatives (e.g. Ubuntu):
  ```bash
  sudo apt-get install git
  ```
  * Recommended book: *Pro Git* (free to download [here](https://git-scm.com/book/en/v2))
  <center>
  <img src="figs/progit.png" style="width: 400px;"/>  
  </center>
  (The rest of the lecture uses material from this book)
  

# Creating your first git repository

* A git repository is a folder in which files can be tracked by git. 
* A git repository is created with:

In [None]:
rm -rf ~/in3110/mysrc

In [None]:
mkdir -p ~/in3110/mysrc
cd ~/in3110/mysrc
git init .  # The mysrc folder is now also a git repository

* Git created a (hidden) directory `.git` in that folder which will contain all history information.
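
As a minimal sketch of what `git init` leaves behind, the following uses a throwaway directory created with `mktemp -d` (an assumption made here so that nothing under `~/in3110` is touched) and lists the hidden `.git` directory:

```bash
# Throwaway directory for illustration (not the lecture's ~/in3110/mysrc)
repo=$(mktemp -d)
cd "$repo"

git init -q .             # -q suppresses the "Initialized empty Git repository" message

ls -A                     # -A also lists hidden entries, so .git shows up
git rev-parse --git-dir   # prints the location of the history directory: .git
```

Deleting the `.git` directory turns the folder back into an ordinary directory and discards all recorded history.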

## Adding files to the repository
* By default, git does not track any files.
* Files need to be **added** to the repository in order to track their changes:

In [None]:
echo "print(\"Hello\")" > myfile.py  # Create a new file myfile.py 
git add myfile.py
ls

Create a snapshot of the repository by **committing** the added file:

In [None]:
git commit -m 'Initial version of myfile.py'  

## The lifecycle of the status of your files 

* Files in your repository can either be **tracked** or **untracked**.
* Untracked files are always left untouched by `git`.
* Tracked files can be 
    * **unmodified** (no changes since last commit)
    * **modified**   (changes since last commit)
    * **staged**    (changes are ready to commit)
* This figure shows the full lifecycle:    
<center>
  <img src="https://raw.githubusercontent.com/progit/progit2/2.1.252/images/lifecycle.png"/>  
</center>
* The `git status` command shows which state each file is in.
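
The lifecycle can be followed step by step with `git status --porcelain`, which prints a two-letter code per file. The sketch below assumes a throwaway repository, a placeholder identity, and a hypothetical file `notes.txt`:

```bash
repo=$(mktemp -d)                          # throwaway repository for illustration
cd "$repo"
git init -q .
git config user.name "Example User"        # placeholder identity, stored only in this repo
git config user.email "user@example.com"

echo "first line" > notes.txt
s_untracked=$(git status --porcelain)      # "?? notes.txt": untracked

git add notes.txt
s_staged=$(git status --porcelain)         # "A  notes.txt": staged (new file)

git commit -q -m "Add notes.txt"
s_clean=$(git status --porcelain)          # empty: tracked and unmodified

echo "second line" >> notes.txt
s_modified=$(git status --porcelain)       # " M notes.txt": tracked and modified

printf '%s\n' "$s_untracked" "$s_staged" "[$s_clean]" "$s_modified"
```

The first column of the code describes the staging area and the second column the working tree, which is why a staged file shows `A ` and an unstaged modification shows ` M`.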


## Inspecting the changes made since the last commit

Let's first make some changes

In [None]:
echo "print(\"World\")" >> myfile.py 
echo "This is a simple hello world project." > README.md

We use `git status` to see the current state of the repo:

In [None]:
git status

`git diff` displays the line-by-line changes to tracked files that have not been staged yet (the new, untracked `README.md` does not appear):

In [None]:
git diff
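
Once changes are staged, plain `git diff` no longer shows them; `git diff --staged` compares the staging area with the last commit instead. A minimal sketch, assuming a throwaway repository and a placeholder identity:

```bash
repo=$(mktemp -d)                          # throwaway repository for illustration
cd "$repo"
git init -q .
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo 'print("Hello")' > myfile.py
git add myfile.py
git commit -q -m "Initial version of myfile.py"

echo 'print("World")' >> myfile.py                   # modify a tracked file
echo "A simple hello world project." > README.md     # new, untracked file

git diff --stat            # working tree vs. staging area: lists only myfile.py
git add README.md myfile.py
git diff --stat            # prints nothing: all changes are staged now
git diff --staged --stat   # staging area vs. last commit: lists both files
```

The `--stat` option only summarizes which files changed; dropping it prints the full line-by-line differences.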

## Creating another commit
Let's stage all changes with `git add`: 

In [None]:
git add README.md myfile.py

In [None]:
git status

If we are satisfied, we create a snapshot of the repo with `git commit`: 

In [None]:
git commit -m 'New README.md file and fix in myfile.py'

## Viewing the history of commits
* For every commit, git creates a snapshot of all tracked files in the repository.
* Each commit is identified by a unique hash.
<center>
  <img src="https://raw.githubusercontent.com/progit/progit2/2.1.252/images/advance-master.png"/> 
</center>

* `git log` can be used to view the commits in a repository:

In [None]:
git log

* Git allows us to view older versions of the repository
    * **But how do we know which version we are currently at?**

## The role of the `HEAD` pointer

* `HEAD` is a special pointer that shows where you currently are in the repository history.
<center>
  <img src="https://raw.githubusercontent.com/progit/progit2/2.1.252/images/advance-testing.png"/>  
</center>
* Running `git commit` moves the current branch, and with it the `HEAD` pointer, to the new commit.


In [None]:
git log --oneline

Some useful command-line arguments for `git log`:
* `--oneline`: summarize each commit as one line
* `git log FILENAME`: show commits affecting one file or directory
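
A short sketch of these options on a throwaway repository with two commits (the repository, identity, and commit messages are made up for illustration). The `--` before the file name is optional, but it tells git that what follows is a path and not a revision:

```bash
repo=$(mktemp -d)                          # throwaway repository for illustration
cd "$repo"
git init -q .
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo 'print("Hello")' > myfile.py
git add myfile.py
git commit -q -m "Initial version of myfile.py"
echo "A simple hello world project." > README.md
git add README.md
git commit -q -m "Add README.md"

git log --oneline                 # two lines, newest commit first
git log --oneline -- README.md    # only the commit that touched README.md
git log -n 1 --format=%s          # subject line of the latest commit: Add README.md
```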

##  Back to the future: Getting older revisions of a repository

* To go to a previous snapshot of the repository:
    * Simply **move the `HEAD` pointer to that commit**.
    * All tracked files will automatically be updated to the version in that commit. 
* The command for moving the `HEAD` pointer is `git checkout`:

In [None]:
git log --oneline

Let's go back to the first commit (`main^1` refers to the parent of the latest commit on `main`):

In [None]:
git checkout main^1

In [None]:
git log --oneline --all

The `README.md` has disappeared and we have the initial version of `myfile.py` back:

In [None]:
ls

In [None]:
cat myfile.py

To move back to the latest version, we use:

In [None]:
git checkout main  # alternatively use the identifier of the latest commit

In [None]:
ls
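
While an older commit is checked out, `HEAD` points directly at that commit and not at a branch (git calls this a "detached HEAD"). The sketch below makes this visible with `git rev-parse --abbrev-ref HEAD`; it assumes a throwaway repository whose initial branch is explicitly named `main`:

```bash
repo=$(mktemp -d)                          # throwaway repository for illustration
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main      # make sure the initial branch is called "main"
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo 'print("Hello")' > myfile.py
git add myfile.py
git commit -q -m "Initial version of myfile.py"
echo "A simple hello world project." > README.md
git add README.md
git commit -q -m "Add README.md"

git checkout -q main^1                            # go to the first commit
old_files=$(ls)                                   # only myfile.py exists here
old_head=$(git rev-parse --abbrev-ref HEAD)       # "HEAD": no branch is checked out
git checkout -q main                              # back to the latest commit
new_head=$(git rev-parse --abbrev-ref HEAD)       # "main"

echo "$old_files | $old_head | $new_head"         # myfile.py | HEAD | main
```

Commits created in the detached state do not belong to any branch, so it is usually best to return to a branch before continuing to work.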

## Removing and moving files

Files can be removed from the repository with
```bash
$ git rm myfile.py
```

and moved with

```bash
$ git mv myfile.py file.py
```
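
Both commands change the working tree and stage that change in one step, so a commit is still needed to record it. A minimal sketch, assuming a throwaway repository and a hypothetical second file `old.txt`:

```bash
repo=$(mktemp -d)                          # throwaway repository for illustration
cd "$repo"
git init -q .
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo 'print("Hello")' > myfile.py
echo "scratch" > old.txt
git add myfile.py old.txt
git commit -q -m "Add two files"

git mv myfile.py file.py    # rename on disk and stage the rename
git rm -q old.txt           # delete from disk and stage the deletion
git status --short          # both changes are listed as staged
git commit -q -m "Rename myfile.py and remove old.txt"

git ls-files                # tracked files after the commit: file.py
```

The removed file is still available in earlier commits and can be retrieved from the history.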

## Tagging 

* Git has the ability to tag specific commits (i.e. give them a more memorable name than the identifier).
* Typically used to mark release points of your software
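
A minimal sketch of tagging with `git tag`, assuming a throwaway repository and the made-up tag names `v0.1` and `v1.0`. A plain tag is just a name for a commit, while `-a` creates an annotated tag with its own message:

```bash
repo=$(mktemp -d)                          # throwaway repository for illustration
cd "$repo"
git init -q .
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo 'print("Hello")' > myfile.py
git add myfile.py
git commit -q -m "Initial version of myfile.py"
echo 'print("World")' >> myfile.py
git commit -q -a -m "Print World as well"

git tag v0.1 HEAD^                     # lightweight tag on the first commit
git tag -a v1.0 -m "First release"     # annotated tag on the current commit

git tag --list                         # v0.1 and v1.0
git log --oneline --decorate           # tags are shown next to their commits
```

A tag name can be used wherever a commit identifier is expected, for example `git checkout v0.1`.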

## `git` cheat sheet (part 1)

  * `git init .`: create a new (local) repository

  * `git status`: view the status of modified, staged, and untracked files

  * `git commit -a`: stage all modified and deleted tracked files and commit them

  * `git rm FILE`: remove a file

  * `git mv OLD NEW`: move/rename a file
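
One detail of `git commit -a` is easy to miss: it only picks up files that git already tracks. The sketch below, on a throwaway repository with hypothetical file names, shows that a new file is left out of the commit:

```bash
repo=$(mktemp -d)                          # throwaway repository for illustration
cd "$repo"
git init -q .
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "v1" > tracked.txt
git add tracked.txt
git commit -q -m "Add tracked.txt"

echo "v2" >> tracked.txt          # modification of a tracked file
echo "new" > untracked.txt        # new file, never added

git commit -q -a -m "Update tracked.txt"
git status --porcelain            # "?? untracked.txt": still untracked, not committed
```

New files therefore always need an explicit `git add` first.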
  

# Remote repositories

We can work on git repositories that live on a remote location (for collaboration or backup).

Let's say we created a git repository on github.com: https://github.com/minrk/mytest

<center>
<img src="figs/github.png" height=200>
</center>

## Working with remote repositories


Clone a remote repository to a local directory:

<!-- 
git push origin 1f174d77b7ad8b756c42671fde20bb1f83c33cec:HEAD -f
-->

In [None]:
rm -rf ~/in3110/mytest

In [None]:
cd ~/in3110
git clone git@github.com:minrk/mytest.git mytest
cd mytest
ls

Create a new commit and push it to the remote repository (requires write permission on the remote repository).

In [None]:
echo "print(42)" > main.py
git add main.py
git commit -m "Add main.py file"
ls

In [None]:
git push origin   

On https://github.com/minrk/mytest we can see the new commit has been uploaded.

![uploaded](figs/github-pushed.png)

You can download updates from the remote repository with

```bash
git pull origin main 
```
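
The push/pull round trip can be tried without a GitHub account by letting a local bare repository play the role of the remote. This is only a sketch under that assumption: the directory names `seed`, `remote.git`, `alice`, and `bob` are made up, and a real remote would be addressed by a URL instead of a local path.

```bash
work=$(mktemp -d)                          # throwaway working area
cd "$work"

# Create a seed repository and turn it into a bare "remote" that stands in for GitHub
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main    # name the initial branch "main"
git -C seed config user.name "Example User"      # placeholder identity
git -C seed config user.email "user@example.com"
echo "# mytest" > seed/README.md
git -C seed add README.md
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed remote.git

# Two independent clones of the same remote
git clone -q remote.git alice
git clone -q remote.git bob

# alice creates a commit and pushes it
cd alice
git config user.name "alice"
git config user.email "alice@example.com"
echo "print(42)" > main.py
git add main.py
git commit -q -m "Add main.py file"
git push -q origin main

# bob downloads the new commit
cd ../bob
git pull -q origin main
ls                                          # README.md and main.py
```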

* Conflicting changes might have been made on the local and remote repository. 
* This results in merge conflicts, which need to be resolved manually.
* This will be part of your first assignment.
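
To see what such a conflict looks like, the following sketch provokes one on purpose. It assumes a local bare repository as a stand-in for the remote and two made-up clones, `alice` and `bob`, that both change the same line of a hypothetical `greeting.txt`:

```bash
work=$(mktemp -d)                          # throwaway working area
cd "$work"
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main    # name the initial branch "main"
git -C seed config user.name "Example User"      # placeholder identity
git -C seed config user.email "user@example.com"
echo "Hello" > seed/greeting.txt
git -C seed add greeting.txt
git -C seed commit -q -m "Add greeting"
git clone -q --bare seed remote.git              # local stand-in for the remote
git clone -q remote.git alice
git clone -q remote.git bob
for clone in alice bob; do
    git -C "$clone" config user.name "$clone"
    git -C "$clone" config user.email "$clone@example.com"
done

# alice changes the line and pushes first
echo "Hello from alice" > alice/greeting.txt
git -C alice commit -q -a -m "Greeting from alice"
git -C alice push -q origin main

# bob changes the same line, then pulls: git cannot merge this automatically
echo "Hello from bob" > bob/greeting.txt
git -C bob commit -q -a -m "Greeting from bob"
git -C bob pull --no-rebase origin main || echo "pull stopped with a merge conflict"
cat bob/greeting.txt      # both versions, separated by <<<<<<<, =======, >>>>>>> markers

# Resolve manually: write the final content, stage it, and commit the merge
echo "Hello from alice and bob" > bob/greeting.txt
git -C bob add greeting.txt
git -C bob commit -q -m "Merge remote changes, resolve conflict"
git -C bob push -q origin main
```

The `--no-rebase` option asks `git pull` to merge the remote changes, which is the behavior described in this lecture; newer git versions otherwise refuse to pull diverged branches until a strategy is chosen.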

## `git` cheat sheet (part 2)

  * `git clone URL`: clone a (remote) repository

  * `git pull origin main`: update file tree from (remote) repository

  * `git push origin main`: push changes to remote repository

# Branches


* Branches are lightweight copies of the main version
* Allow fast testing of new code without touching the default version

<center>
<img src="https://github.com/progit/progit2/raw/2.1.252/images/branch-and-history.png" width=800px>
</center>

* Remember: `HEAD` is a special pointer that shows where you currently are in the repository history.
* `main` (or sometimes `master`) is a default branch that is created when initializing a new repository.

## Creating a branch

* Branches are created with the `git branch NAME` command.

In [None]:
cd ~/in3110/mysrc
git branch testing

* The result is that we created a new pointer to the current commit.
* The `HEAD` pointer still points to the branch `main`.
<center>
<img src="https://git-scm.com/book/en/v2/images/head-to-master.png" width=800>
</center>

In [None]:
git log --oneline
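
That a new branch is only a second name for the same commit can be checked with `git rev-parse`, which prints the commit hash behind a name. A minimal sketch, assuming a throwaway repository whose initial branch is explicitly named `main`:

```bash
repo=$(mktemp -d)                          # throwaway repository for illustration
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main      # make sure the initial branch is called "main"
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo 'print("Hello")' > myfile.py
git add myfile.py
git commit -q -m "Initial version of myfile.py"

git branch testing

git rev-parse main testing        # two identical hashes: both names point to one commit
git symbolic-ref --short HEAD     # main: creating a branch does not switch to it
```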

## Switching to the new branch

* We use the `git switch` command to move the `HEAD` pointer to our new branch

In [None]:
git switch testing

<center>
    <img src="https://git-scm.com/book/en/v2/images/head-to-testing.png" width=800>
    </center>

In [None]:
git log --oneline

## Creating a commit on the new branch

What difference does this make? Let's make another commit:

In [None]:
echo "Hello world" >> testing.txt
git add testing.txt
git commit -m "Add testing.txt"
ls

<center>
<img src="https://git-scm.com/book/en/v2/images/advance-testing.png" width=800> 
    </center>

In [None]:
git log --oneline --all

## Switching between branches

If we switch to the `main` branch again, all files will be updated to the version in `main` - in particular the `testing.txt` file will be missing:

In [None]:
git switch main

<center>
    <img src="https://git-scm.com/book/en/v2/images/checkout-master.png" width=800> 
    </center>

In [None]:
ls
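
The whole round trip fits into one short script. This is a sketch on a throwaway repository (initial branch explicitly named `main`, placeholder identity) that records the directory listing on each branch:

```bash
repo=$(mktemp -d)                          # throwaway repository for illustration
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main      # make sure the initial branch is called "main"
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo 'print("Hello")' > myfile.py
git add myfile.py
git commit -q -m "Initial version of myfile.py"

git branch testing
git switch -q testing                      # git switch requires git 2.23 or newer
echo "Hello world" > testing.txt
git add testing.txt
git commit -q -m "Add testing.txt"

git switch -q main
on_main=$(ls)                              # myfile.py
git switch -q testing
on_testing=$(ls | tr '\n' ' ')             # myfile.py testing.txt

echo "main: $on_main"
echo "testing: $on_testing"
```

Nothing is lost when `testing.txt` disappears from the working directory: the file is stored in the commit on `testing` and reappears as soon as that branch is checked out again.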

## Diverging branches
Let's now create another commit on the main branch:


In [None]:
echo "Hello world" >> main.txt
git add main.txt
git commit -m "Add main.txt"

<center><img src="https://git-scm.com/book/en/v2/images/advance-master.png" width=800> 
    </center>

In [None]:
git log --oneline --graph --all

## Merging branches

We can now merge the changes from the `testing` branch into the `main` branch:

In [None]:
git branch # Show that we are on the main branch

In [None]:
git merge testing -m "Merge testing into main"

In [None]:
git log --oneline --graph --all

Now both the files from main and testing are in the repo:

In [None]:
ls
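
Because `main` and `testing` have diverged, the merge creates a new commit with two parents, one from each branch. The sketch below rebuilds the situation in a throwaway repository (initial branch explicitly named `main`, placeholder identity) and prints the parents with `git rev-list --parents`:

```bash
repo=$(mktemp -d)                          # throwaway repository for illustration
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main      # make sure the initial branch is called "main"
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo 'print("Hello")' > myfile.py
git add myfile.py
git commit -q -m "Initial version of myfile.py"

git branch testing
git switch -q testing
echo "Hello world" > testing.txt
git add testing.txt
git commit -q -m "Add testing.txt"

git switch -q main
echo "Hello world" > main.txt
git add main.txt
git commit -q -m "Add main.txt"

git merge -q testing -m "Merge testing into main"

git log --oneline --graph --all
parents=$(git rev-list --parents -n 1 HEAD)
echo "$parents"     # three hashes: the merge commit followed by its two parents
ls                  # main.txt, myfile.py and testing.txt
```

If `main` had not received a commit of its own, git would simply have moved `main` forward to the latest `testing` commit (a "fast-forward") without creating a merge commit.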

In [None]:
git switch testing

## `git` cheat sheet (part 3)


* `git branch NAME`: create a new branch
* `git switch NAME`: move the `HEAD` pointer to the branch `NAME` (for a commit identifier, use `git switch --detach COMMIT` or `git checkout COMMIT`)
* `git merge NAME`: merges the commits of the branch with name `NAME` into the current branch.
  

## That's it for today!


Do the interactive git tutorial on https://try.github.io

<img src="figs/try-git.png" height=150>

