# Git - version control system

<br />

<img src="../images/git_logo.png" align="left" /><br /><br /><br />

## Introduction

What is Git?

Git is a version control system. It records changes to a designated set of files over time so that any earlier state can be restored. Each commit stores a snapshot of the tracked files in the `.git` directory structure; files that did not change are not stored again but referenced from the earlier snapshot.

<img src="../images/git_figure.png" />

Git features **local version control**: you can create your own local Git repository with `init` or `clone` an existing one. Everything works on the local computer, and changes are recorded there.

Git also features **remote version control**: a remote repository is a shared copy of the repository, hosted on a server, in which the work from the local repositories is brought together. It is the repository you clone from and the place where you *collaborate*. Updates by co-workers can be `pull`ed, and your local changes can be sent back (`push`ed) to the remote repository and `merge`d with it.

### Motivation

Version control helps with more than just code:

- **Change-logs**: Understand what you did by displaying the differences.
- **Restoring**: Reset an earlier state of your work.
- **Synchronization** across many computers.
- **Collaboration**: Make sure your co-worker uses the same work base.


## Installation

<br />

**Linux (RHEL or CentOS):**

`sudo dnf install git-all`

**Linux (Ubuntu):**

`sudo apt install git-all`

**MacOS:**

Git is usually already installed together with Xcode. Otherwise, download it from https://git-scm.com/download/mac and follow the instructions of the installer.

**Windows:**

For Windows a separate Git distribution is available (see https://gitforwindows.org for more information). Download the installer from https://git-scm.com/download/win.

**Installation from source:**

Recommended only for advanced users! See https://git-scm.com/book/en/v2/Getting-Started-Installing-Git.

<br />


## Help

To get more information about Git you can use the general command line tool _man_

`man git`

or Git itself

`git help`

<br />

<h2 style="color:red"> Exercise </h2>

Run the commands from above. 

<div class="alert alert-block alert-info"><b>Note:</b> <br> 
    To run a command that is normally started in a terminal from a Jupyter Notebook, the cell must begin with <b>%%bash</b>.
</div>

For example

`%%bash` <br />
`git help`

<br />


In [None]:
%%bash
module load git
git --help

In [None]:
%%bash
cat ~/.gitconfig


## First steps

Now introduce yourself to Git by giving Git your name and email address.

`git config --global user.name "your name"`

`git config --global user.email your_email_address`
<br />

Example:

`git config --global user.name "Groot"`

`git config --global user.email groot@guardians.galaxy`
<br />

To see all settings, use `git config --list`; add `--global` to list only the global ones. <br />
The global settings are stored in the file <b>$HOME/.gitconfig</b>.
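The same settings can also be limited to a single repository by leaving out `--global`. The following is only a minimal sketch: it works in a throwaway directory created with `mktemp`, and the name and email address are the placeholder values from the example above, so your global configuration stays untouched.

```bash
# Sketch: repository-local settings (no --global), tried out in a throwaway directory.
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Without --global the values are written to .git/config of this repository only.
git config user.name "Groot"
git config user.email "groot@guardians.galaxy"

# Read a single value back; a local value takes precedence over a global one.
name="$(git config --get user.name)"
echo "user.name in this repository: $name"

# List only the settings stored in this repository.
git config --local --list
```

A local setting is useful when you need a different email address for one project than for all the others.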

<h2 style="color:red"> Exercise </h2>

Set the global name and email with your data and list them like above.
<br />


### Workflow

<br />

Term definition:

+ The **local working directory** is where you make your changes. It can be a checkout of an existing project or a directory of your own that you want to put under version control. This directory contains:
    + **tracked** files, which are watched by Git. It is often not sensible to track every file in a directory.
    + **untracked** files, which are not yet under version control.

+ The **staging area or index** is a file in your Git directory that records what will go into your next commit.

+ The **.git directory** is where the metadata and the object database are stored.


The *tracked* files in a Git repository are in one of three main states: **modified**, **staged**, or **committed**.


<figure>
    <img src="../images/git_3_stages.png" width="500" style="border:1px solid" align="center"/>
    <figcaption style="font-size:smaller; text-align:center">From https://git-scm.com/</figcaption>
</figure>

And the corresponding workflow:

1. Modify files in your working directory tree (add, modify, or delete files).
2. Choose the changes that should be part of your next commit and add them to the staging area.
3. Commit the files of the staging area to the repository (like a snapshot).
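The three steps can be followed for a single file with `git status --short`. This is a minimal sketch under a few assumptions: it runs in a throwaway directory, the file name `notes.txt` is made up for the illustration, and the identity is a placeholder set only for this repository.

```bash
# Sketch: follow one file through the steps modify -> stage -> commit.
repo="$(mktemp -d)"                            # throwaway directory
cd "$repo"
git init -q
git config user.name "Groot"                   # placeholder identity
git config user.email "groot@guardians.galaxy"

echo "first line" > notes.txt                  # 1. change the working directory
before="$(git status --short)"                 # "??" marks an untracked file

git add notes.txt                              # 2. put the change into the staging area
staged="$(git status --short)"                 # "A" marks a newly staged file

git commit -q -m "Add notes.txt"               # 3. commit the staged snapshot
after="$(git status --short)"                  # empty output: nothing left to commit

printf 'before add:   %s\nafter add:    %s\nafter commit: %s\n' "$before" "$staged" "$after"
```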

<br />

### Create a new repository

A repository can be created from an existing directory structure as well as from an empty directory. 
Here we start with an empty directory called 'myRepo', in which the repository will 'live'.
In the directory 'myRepo' we can save all files of this workshop and put them under version control.

```bash
mkdir myRepo
cd myRepo
```

Next, we initialize the Git repository with the `init` command. This creates a subdirectory **.git**, which is the backbone of the repository.

```bash
git init
```

Now we can start storing our files in the repository. The steps below are repeated every time you record changes.

As a first step, we create a README file:

```bash
cat - > README
This is my first repository.
```

Enter **CTRL-D** to close the input stream.

The directory _myRepo_ now looks like this:

```bash
[~/myRepo] > ls -la
total 8
drwxr-xr-x   4 k204045  staff  128 26 Mai 16:32 .
drwxr-xr-x  20 k204045  staff  640 26 Mai 16:15 ..
drwxr-xr-x   9 k204045  staff  288 26 Mai 16:18 .git
-rw-r--r--   1 k204045  staff   29 26 Mai 16:32 README
```

The second step is to add the file to the staging area with `add`.

```bash
[~/myRepo] > git add README
```

You can use the `status` command to see what is staged for the next commit.

```bash
[~/myRepo] > git status
On branch master

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)
	new file:   README
```


The third step is to commit the file from the staging area into the repository with `commit`. You should write a short message about the commit with `-m "A short message about this commit"`.

```bash
[~/myRepo] > git commit -m "Add README"
[master (root-commit) 11e7892] Add README
 1 file changed, 1 insertion(+)
 create mode 100644 README
```
 
```bash
[~/myRepo] > git status
On branch master
nothing to commit, working tree clean
```

<h2 style="color:red"> Exercise </h2>

Create your first repository on your own.

<br />


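In [None]:
%%bash
# Each %%bash cell runs in its own shell, so a cd does not carry over to the next cell.
mkdir -p "$HOME/tmp"

In [None]:
%%bash
mkdir -p "$HOME/tmp/myRepo"

In [None]:
%%bash
cd "$HOME/tmp/myRepo"
git init

In [None]:
%%bash
cd "$HOME/tmp/myRepo"
echo "hello world" >> README
git add README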

### Changes

Next, let's look at what happens when you continue adding files or making changes to existing files.

We'll jump ahead a bit here and create our first Python script and add some text to our README file.

```bash
[~/myRepo] > cat - >> README

script_0.py: just say 'Hello World'

```

Enter **CTRL-D** to close the input stream.



<h2 style="color:red"> Exercise </h2>

Add the line to the README file as explained above.

<br />

To create the first Python script, run

```bash
cat - > script_0.py
print('Hello World')
```

Yeah, don't forget to enter **CTRL-D**.

<br />

Our directory now contains the following files:

```bash
[~/myRepo] > ls -la
total 16
drwxr-xr-x   5 k204045  staff  160 27 Mai 10:57 .
drwxr-xr-x  20 k204045  staff  640 26 Mai 16:15 ..
drwxr-xr-x  12 k204045  staff  384 26 Mai 16:35 .git
-rw-r--r--   1 k204045  staff   66 27 Mai 10:51 README
-rw-r--r--   1 k204045  staff   21 27 Mai 10:57 script_0.py
```

The changes are not yet staged for commit, and `status` is kind enough to let us know.

```bash
[~/myRepo] > git status
On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
	modified:   README

Untracked files:
  (use "git add <file>..." to include in what will be committed)
	script_0.py

no changes added to commit (use "git add" and/or "git commit -a")
```

Well, let's catch up on that.

```bash
[~/myRepo] > git add README script_0.py
```

<h2 style="color:red"> Exercise </h2>

And now you ...

<br />

With Git's `diff --cached` we can look up what we changed before the actual commit.

```bash
[~/myRepo] > git diff --cached
diff --git a/README b/README
index eb6d976..dcca46d 100644
--- a/README
+++ b/README
@@ -1 +1,3 @@
 This is my first repository.
+
+script_0.py: just say 'Hello World'
diff --git a/script_0.py b/script_0.py
new file mode 100644
index 0000000..df1dc68
--- /dev/null
+++ b/script_0.py
@@ -0,0 +1 @@
+print('Hello World')
```
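The difference between `git diff` and `git diff --cached` shows up when one change is staged and another one is not. The following minimal sketch assumes a throwaway directory and a placeholder identity; the two appended lines are made up for the illustration.

```bash
# Sketch: "git diff" shows unstaged changes, "git diff --cached" shows staged ones.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Groot"                   # placeholder identity
git config user.email "groot@guardians.galaxy"

echo "This is my first repository." > README
git add README
git commit -q -m "Add README"

echo "staged line" >> README
git add README                      # this change is now in the staging area
echo "unstaged line" >> README      # this change exists only in the working directory

unstaged="$(git diff)"              # working directory compared with the staging area
staged="$(git diff --cached)"       # staging area compared with the last commit

echo "--- staged ---";   echo "$staged"
echo "--- unstaged ---"; echo "$unstaged"
```

Only what `git diff --cached` shows becomes part of the next commit.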

<h2 style="color:red"> Exercise </h2>

Jupp, you know ...

<br />

Now that we have done the preliminary work, we can commit it to the repository.

<h2 style="color:red"> Exercise </h2>

Do you remember the `git commit` command from above? Run the command, but change the message to something appropriate for these changes.

<br />

Git stores each commit as a snapshot of the tracked files, not as a list of differences; the differences shown by `diff` and `log -p` are computed from these snapshots.

With the command `log` we can always check what has been done in the repository.

```bash
[~/myRepo] > git log
commit a300390fc25be979286298716f1687aaea47a6c7 (HEAD -> master)
Author: KMFleischer <meier-fleischer@dkrz.de>
Date:   Thu May 27 12:59:55 2021 +0200

    Add script_0.py and modify REAMDE

commit 11e789205b3e7a9cca589df57791a2cef2fe6108
Author: KMFleischer <meier-fleischer@dkrz.de>
Date:   Wed May 26 16:35:35 2021 +0200

    Add README
```

If you want to see more about each change you can use `log` with the option `-p`.

`git log -p`
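For a quick overview, a more compact form of the log is often enough. The sketch below rebuilds a small two-commit history in a throwaway directory (placeholder identity, commit messages chosen for the illustration) and prints it with `--oneline` and with a custom `--format`.

```bash
# Sketch: compact views of the history.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Groot"                   # placeholder identity
git config user.email "groot@guardians.galaxy"

echo "This is my first repository." > README
git add README
git commit -q -m "Add README"
echo "print('Hello World')" > script_0.py
git add script_0.py
git commit -q -m "Add script_0.py"

# One line per commit: abbreviated hash and message, newest commit first.
git log --oneline

# Only the commit messages, which is handy in scripts.
subjects="$(git log --format=%s)"
echo "$subjects"
```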

<h2 style="color:red"> Exercise </h2>

What does your repository logfile look like?

<br />

## Branches

If you want to change or extend a repository, you can create a development branch. This is safer because the main branch, called **master**, remains untouched. You can switch between the master branch and your development branch, and merge the development branch into master at any time.

Create a branch and call it _development_ (or whatever you like)
```bash
[~/myRepo] > git branch development
```

Show the existing branches
```bash
[~/myRepo] > git branch
  development
* master
```

The active branch is marked by an asterisk.

The next step is to switch from master to the development branch
```bash
[~/myRepo] > git switch development
Switched to branch 'development'
```

or `git checkout development`

Let's see what happened
```bash
[~/myRepo] > git branch
* development
  master
```

Switch back to the master branch, just because we can do it
```bash
[~/myRepo] > git switch master
Switched to branch 'master'
```
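Creating a branch and switching to it can also be done in one step with `git switch -c`. This is a minimal sketch in a throwaway directory with a placeholder identity; it assumes Git 2.23 or newer, and it sets the name of the first branch to master explicitly because the default name depends on the local Git configuration.

```bash
# Sketch: create a branch and switch to it in one step.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master        # name the first branch master, as in the text
git config user.name "Groot"                   # placeholder identity
git config user.email "groot@guardians.galaxy"

echo "print('Hello World')" > script_0.py
git add script_0.py
git commit -q -m "Add script_0.py"

git switch -q -c development                   # same as: git branch development; git switch development
current="$(git branch --show-current)"
echo "now on: $current"

git switch -q master
back="$(git branch --show-current)"
echo "back on: $back"
```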

<h2 style="color:red"> Exercise </h2>

Now, create your own branch, switch to it and make some changes. <br />
What happens if you commit the changes to the branch and to the master branch?

<br />

### Merge branch into master

To merge the development branch into master, use the `merge` command. 
First, make sure you are on the master branch.

`git merge development`

If there are no conflicts, all is well. If there are, we need to find out where they are and fix them.
```bash
[~/myRepo] > git merge development
Auto-merging script_01.py
CONFLICT (content): Merge conflict in script_01.py
Automatic merge failed; fix conflicts and then commit the result.
```

With `diff` we can see the marked parts of the conflict.
```bash
[~/myRepo] > git diff
diff --cc script_01.py
index 8cc9c21,d09cfbe..0000000
--- a/script_01.py
+++ b/script_01.py
@@@ -1,3 -1,4 +1,8 @@@
+ # welcome to a short python script ;)
+ 
  print('Hello again')
  
++<<<<<<< HEAD
 +# that's the end
++=======
++>>>>>>> development
```
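After a conflict the merge is finished by hand: edit the file so that it contains the desired content without the conflict markers, stage it with `git add`, and commit. The following sketch provokes a conflict on purpose in a throwaway directory; the file contents and commit messages are made up, the identity is a placeholder, and `git switch` assumes Git 2.23 or newer.

```bash
# Sketch: provoke a merge conflict and resolve it by hand.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master        # name the first branch master, as in the text
git config user.name "Groot"                   # placeholder identity
git config user.email "groot@guardians.galaxy"

echo "print('Hello again')" > script_01.py
git add script_01.py
git commit -q -m "Add script_01.py"

# Change the same line differently on both branches.
git branch development
echo "print('Hello from master')" > script_01.py
git commit -q -am "Change greeting on master"
git switch -q development
echo "print('Hello from development')" > script_01.py
git commit -q -am "Change greeting on development"
git switch -q master

git merge development || echo "merge stopped because of a conflict"

# Resolve: write the content we want to keep (this replaces the conflict markers) ...
echo "print('Hello from both')" > script_01.py
# ... then mark the conflict as resolved and conclude the merge.
git add script_01.py
git commit -q -m "Merge branch 'development'"
git log --oneline --graph
```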



## Remove File

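The command `git rm` removes files from the staging area and from your working directory; after the next commit they are no longer part of the current state of the repository. To stop tracking a file but keep it in your working directory, use `git rm --cached`.

To delete the file script_01.py from the repository and the working directory: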

```bash
git rm script_01.py
```
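The two variants of `git rm` can be compared side by side. This is a minimal sketch in a throwaway directory with a placeholder identity; the second file, `notes.txt`, is made up for the illustration.

```bash
# Sketch: "git rm" deletes the file, "git rm --cached" only stops tracking it.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Groot"                   # placeholder identity
git config user.email "groot@guardians.galaxy"

echo "print('Hello again')" > script_01.py
echo "scratch" > notes.txt
git add script_01.py notes.txt
git commit -q -m "Add two files"

git rm -q script_01.py           # removed from the staging area and from the working directory
git rm -q --cached notes.txt     # removed from the staging area only, the file stays on disk

git status --short
git commit -q -m "Stop tracking both files"
ls
```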

## Rename File

To rename a file in the repository, use the `mv` command.

```bash
git mv script_0.py script_00.py
```
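`git mv` renames the file on disk and stages the rename in one go, so only the commit is left to do. A minimal sketch in a throwaway directory with a placeholder identity:

```bash
# Sketch: rename a tracked file and commit the rename.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Groot"                   # placeholder identity
git config user.email "groot@guardians.galaxy"

echo "print('Hello World')" > script_0.py
git add script_0.py
git commit -q -m "Add script_0.py"

git mv script_0.py script_00.py
git status --short               # the rename is already staged
git commit -q -m "Rename script_0.py to script_00.py"

tracked="$(git ls-files)"
echo "tracked files: $tracked"
```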

<h2 style="color:red"> Exercise </h2>

1. Add a new file and commit it to the repository
2. Remove the new file from the repository
3. Rename script_0.py to script_00.py

<br />

In [None]:
# 1.


In [None]:
# 2.


In [None]:
# 3.


## Undoing Changes

You can undo changes at any time. There are different ways to do it depending on what you want to undo.

### After a commit

If you forgot to include files or changes in a commit, stage them with `git add` and then use the option `--amend` with the commit command:

`git commit --amend`

The amended commit replaces the previous one.
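That the previous commit is replaced, not followed by a second one, can be checked by counting the commits. The sketch below assumes a throwaway directory and a placeholder identity; `--no-edit` keeps the original commit message.

```bash
# Sketch: add a forgotten file to the last commit.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Groot"                   # placeholder identity
git config user.email "groot@guardians.galaxy"

echo "This is my first repository." > README
echo "print('Hello World')" > script_0.py
git add README
git commit -q -m "Add README and script_0.py"   # script_0.py was forgotten

git add script_0.py
git commit -q --amend --no-edit                  # replace the commit, keep its message

count="$(git rev-list --count HEAD)"             # still a single commit
files="$(git ls-tree --name-only HEAD)"          # which now contains both files
echo "commits: $count"
echo "$files"
```

Amending changes the commit hash, so it is best used only for commits that have not been pushed yet.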


### Undo staging

Unstage a staged file with the `restore` command.

Example

```bash
[~/myRepo] > cat - > file1.txt
file 1   

[~/myRepo] > cat - > file2.txt
file 2

[~/myRepo] > git add *.txt

[~/myRepo] > git status
On branch master
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
	new file:   file1.txt
	new file:   file2.txt

[~/myRepo] > git restore --staged file2.txt

[~/myRepo] > git status
On branch master
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
	new file:   file1.txt

Untracked files:
  (use "git add <file>..." to include in what will be committed)
	file2.txt
```


### Undo changes to a file

To undo unstaged changes in a file, you can also use `restore`; it resets the file to the state of the last commit. 

Example: We append some text to file1.txt

```bash
[~/myRepo] > cat - >> file1.txt

Add new lines to the file.

[~/myRepo] > git status
On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
	modified:   file1.txt

no changes added to commit (use "git add" and/or "git commit -a")
```

The output of `git status` tells you exactly what you can do to undo the changes to the state of the last commit.

`git restore file1.txt`
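The effect can be verified by looking at the file content before and after. This is a minimal sketch in a throwaway directory with a placeholder identity; it assumes Git 2.23 or newer, where `git restore` is available. Note that the discarded changes cannot be recovered with Git afterwards.

```bash
# Sketch: discard uncommitted changes to a tracked file.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Groot"                   # placeholder identity
git config user.email "groot@guardians.galaxy"

echo "file 1" > file1.txt
git add file1.txt
git commit -q -m "Add file1.txt"

echo "Add new lines to the file." >> file1.txt
git status --short               # the file is listed as modified

git restore file1.txt            # back to the state of the last commit
content="$(cat file1.txt)"
echo "content after restore: $content"
```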

<br />

<h2 style="color:red"> Exercise </h2>

Now, try what you have learned.

<br />

## Git Gui

To have a better overview of the different branches it is nice to use the `gitk` Git graphical user interface.

`gitk`

<div class="alert alert-block alert-info"><b>Note:</b> <br> 
gitk is not part of the default installation on macOS. It can be installed with Homebrew: <b>brew install git-gui</b>
</div>

<br />

## Working with a Remote Repository

### Clone Existing Repository

To copy an existing repository from a server, use the `clone` command. Running `git clone <URL>` downloads a full copy of the repository, including all versions of every file.

The course material with the notebooks can be cloned from https://gitlab.dkrz.de/pythoncourse/material.git 

`git clone https://gitlab.dkrz.de/pythoncourse/material.git`

<br />



<h2 style="color:red"> Exercise </h2>

Clone the course material as described above in a new empty directory. See what happens.

<br />

### Info about the Remote Repository

Move into the cloned repository which is in the directory **material**.

`cd material`

Show the repository's URL with the `remote` command:

```bash
[~/material] > git remote -v
origin	https://gitlab.dkrz.de/pythoncourse/material.git (fetch)
origin	https://gitlab.dkrz.de/pythoncourse/material.git (push)
```



### Fetch and Pull

When working in a team, check the state of the remote repository before you make any changes, to prevent conflicts. `git status` only compares your branch with the last known state of the remote, so run `git fetch` first; afterwards `git status` reports whether your branch is behind. You can then update the local repository with the `fetch` or `pull` command.

`git fetch <remote>` - downloads all data from the remote repository that is not yet in your local one, without merging it. 

`git pull <remote>` - downloads all data from the remote repository that is not yet in your local one and merges it into your current branch.
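The difference can be tried out without a server. In the following sketch a bare repository in a temporary directory stands in for the remote, and two clones called `alice` and `bob` play the roles of two co-workers; all of these names, the file `data.txt`, and the identity are assumptions made for the illustration.

```bash
# Sketch: fetch only downloads, pull downloads and merges.
base="$(mktemp -d)"
git init -q --bare "$base/remote.git"                      # stands in for the server
git --git-dir="$base/remote.git" symbolic-ref HEAD refs/heads/master

# First co-worker: clone, commit, push.
git clone -q "$base/remote.git" "$base/alice" 2>/dev/null  # hides the "empty repository" warning
cd "$base/alice"
git symbolic-ref HEAD refs/heads/master
git config user.name "Alice"                               # placeholder identity
git config user.email "alice@example.org"
echo "v1" > data.txt
git add data.txt
git commit -q -m "Add data.txt"
git push -q -u origin master

# Second co-worker clones the current state.
git clone -q "$base/remote.git" "$base/bob"

# The first co-worker publishes another commit.
echo "v2" >> data.txt
git commit -q -am "Update data.txt"
git push -q origin master

cd "$base/bob"
git fetch -q origin
after_fetch="$(cat data.txt)"                              # unchanged: nothing was merged
behind="$(git rev-list --count HEAD..origin/master)"       # commits that are waiting to be merged

git pull -q origin master
after_pull="$(cat data.txt)"                               # now contains the new line
echo "behind after fetch: $behind"
```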


### Push


<figure>
    <img src="http://krishnaiitd.github.io/gitcommands/images/GitWorkflow-3.png" width="500" style="border:1px solid" align="center"/>
    <figcaption style="font-size:smaller; text-align:center">From http://krishnaiitd.github.io</figcaption>
</figure>


If you want to share your work with the other users of the repository, you can upload it with the `push` command.

`git push origin master`

<div class="alert alert-block alert-info"><b>Note:</b> <br> 
If another user has pushed work before you, you have to fetch that work and merge it into your repository before you can push yours.
</div>

You can get more information about the remote with the `remote show` command:

```bash
[~/material] > git remote show origin
* remote origin
  Fetch URL: https://gitlab.dkrz.de/pythoncourse/material.git
  Push  URL: https://gitlab.dkrz.de/pythoncourse/material.git
  HEAD branch: master
  Remote branch:
    master tracked
  Local branch configured for 'git pull':
    master merges with remote master
  Local ref configured for 'git push':
    master pushes to master (up to date)
```
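The tracking relation listed under "configured for 'git pull'" and "configured for 'git push'" is created automatically by `clone`, or by pushing with the option `-u`. The sketch below uses a bare repository in a temporary directory as a stand-in for the server; the directory names, the file `file.txt`, and the identity are placeholders.

```bash
# Sketch: "git push -u" sets the upstream, afterwards a plain "git push" is enough.
base="$(mktemp -d)"
git init -q --bare "$base/remote.git"                      # stands in for the server
git clone -q "$base/remote.git" "$base/work" 2>/dev/null   # hides the "empty repository" warning
cd "$base/work"
git symbolic-ref HEAD refs/heads/master
git config user.name "Groot"                               # placeholder identity
git config user.email "groot@guardians.galaxy"

echo "hello" > file.txt
git add file.txt
git commit -q -m "Add file.txt"
git push -q -u origin master      # -u records origin/master as upstream of master

echo "more" >> file.txt
git commit -q -am "Extend file.txt"
git push -q                       # no remote or branch needed any more

local_head="$(git rev-parse HEAD)"
remote_head="$(git --git-dir="$base/remote.git" rev-parse refs/heads/master)"
echo "local:  $local_head"
echo "remote: $remote_head"
```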


<h2 style="color:red"> Exercise </h2>

1. Create a subdirectory in **material/Participants** with your name.
2. Add a file to the newly created directory.
3. Push it to the remote repository.

<br />

More information about possible development workflows with Git can be found at https://git-scm.com/book/en/v2/Distributed-Git-Distributed-Workflows

## Tags

Tags are labels for specific commits. They make it easy to return to the state of the repository at that commit later on. You can list the tags with the `tag` command.

`git tag`

If tags have been set, it returns a list of them, like

```bash
v1.0
v1.1
```

Git knows two different types of tags - **lightweight** and **annotated**. 

The lightweight tag is like a pointer or link to the specified commit.

The annotated tag is stored as a full data object in the Git database. It stores the tagger name, email, date, checksum, and tagging message. Annotated tags are recommended because they carry all this information; for a temporary tag, a lightweight tag is sufficient.

With the `show` command you can see what is in the tag:

`git show v1.0`

Tags are not transferred by a normal push. To push a tagged version of your **branch**, use the tag name instead of master:

`git push origin v1.0`



### Examples

1. To create a tag for the currently committed version, using the annotated method:

`git tag -a v1.0 -m "Version 1.0"`



2. To tag an older commit, first look at the log.

`git log`

Find the commit in the log output, e.g.

```bash
commit 4ddacb2a9d08ed50755571ec1eade4e73567d862 (HEAD -> master)
Author: Groot <groot@guardians.galaxy>
Date:   Wed Jun 2 13:45:38 2021 +0200
```

The following shows how to use the commit hash to set a lightweight tag:

```bash
git tag groot_v1.0 4ddacb2a9d08ed50755571ec1eade4e73567d862
```

Now the log entry for this commit shows the tag.

```bash
commit 4ddacb2a9d08ed50755571ec1eade4e73567d862 (HEAD -> master, tag: groot_v1.0)
```
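The difference between the two tag types can be made visible with `git cat-file -t`, which prints the type of the object a name refers to. This is a minimal sketch in a throwaway directory with a placeholder identity; the tag names are the ones used in the examples above.

```bash
# Sketch: an annotated tag is an object of its own, a lightweight tag is just a name for a commit.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Groot"                   # placeholder identity
git config user.email "groot@guardians.galaxy"

echo "This is my first repository." > README
git add README
git commit -q -m "Add README"

git tag -a v1.0 -m "Version 1.0"     # annotated tag
git tag groot_v1.0                   # lightweight tag on the same commit

annotated_type="$(git cat-file -t v1.0)"
lightweight_type="$(git cat-file -t groot_v1.0)"
echo "v1.0 is a:       $annotated_type"
echo "groot_v1.0 is a: $lightweight_type"

git tag -n                           # list the tags together with their message
```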


<h2 style="color:red"> Exercise </h2>

Do it - give your current version of the branch a tag name.

<br />

### Start Branch from Tag

You can check out a tag and create a local branch containing that version with the `checkout` command. 

`git checkout -b version1.0 'v1.0'`
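That the new branch really starts at the tagged version can be checked as follows. The sketch works in a throwaway directory with a placeholder identity; the file contents and commit messages are made up for the illustration.

```bash
# Sketch: start a branch from an older, tagged version.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master        # name the first branch master, as in the text
git config user.name "Groot"                   # placeholder identity
git config user.email "groot@guardians.galaxy"

echo "version 1" > README
git add README
git commit -q -m "First version"
git tag -a v1.0 -m "Version 1.0"

echo "version 2" > README
git commit -q -am "Second version"             # master moves on after the tag

git checkout -q -b version1.0 v1.0             # new branch that starts at the tag
branch="$(git branch --show-current)"
content="$(cat README)"
echo "on branch $branch, README says: $content"
```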

<br />


## Push your Branch to Remote Repository

You can push your branch to the remote repository, where it is stored as a separate branch and not in master.

`git push origin version1.0`

<br />

<h2 style="color:red"> Exercise </h2>

Yepp, create your first branch from an existing tag.

<br />
