In [None]:
rm -rf repository

# introduction to `git`

`git` – and **v**ersion **c**ontrol **s**ystems (VCS) in general – remember the changes of files in "commits", which contain metadata and a "diff", the changes between two versions of a file.

For two versions of the same file, a set of differences can be computed. This set of differences is called a "diff", and applying a diff to a file is called "patching".
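To make "diff" and "patch" concrete, here is a minimal sketch that computes a diff between two versions of a file and then applies it again. The temporary directory and the names `greeting.txt` and `change.patch` are made up for this illustration, and it uses `git diff` / `git apply` as one possible pair of tools.

```bash
# Throwaway repository; directory and file names are only placeholders.
demo=$(mktemp -d)
git init -q "$demo"

printf 'hello\nworld\n' > "$demo/greeting.txt"
git -C "$demo" add greeting.txt               # remember the first version

printf 'hello\ngit\n' > "$demo/greeting.txt"  # write the second version
git -C "$demo" diff > "$demo/change.patch"    # the diff: first version -> second version
cat "$demo/change.patch"

git -C "$demo" checkout -- greeting.txt       # go back to the first version
git -C "$demo" apply change.patch             # patch it: this recreates the second version
cat "$demo/greeting.txt"
```

Lines starting with `-` in the diff are removed by the patch, lines starting with `+` are added.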

Each commit is also uniquely identified by the commit hash, which is a checksum computed from the contents of the commit (its metadata and its changes). An example for such a hash is `ded105a62b9d78717f8dc64652e3903190b585dd`.
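As a small sketch of how such a hash can be looked up, the following creates a throwaway repository with a single commit and prints its hash. The identity passed with `-c` consists of placeholder values, and the printed hash will differ on every run because the commit also contains timestamps.

```bash
demo=$(mktemp -d)
git init -q "$demo"
echo "a" > "$demo/file1"
git -C "$demo" add file1
# placeholder identity, only used for this one command
git -C "$demo" -c user.name="Example User" -c user.email="user@example.com" \
    commit -q -m "first commit"

full_hash=$(git -C "$demo" rev-parse HEAD)           # full hash of the newest commit
short_hash=$(git -C "$demo" rev-parse --short HEAD)  # abbreviated form of the same hash
echo "$full_hash"
echo "$short_hash"

# an unambiguous prefix of the hash is enough to refer to the commit
git -C "$demo" rev-parse "$short_hash"
```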

Since hash values are not easy to remember and type, there are two forms of human-readable labels: tags, or static labels, and branches, or dynamic labels. For example, in the following graph:

```mermaid
gitGraph
    commit
    commit
    branch feature-branch
    checkout main
    commit
    checkout feature-branch
    commit
    commit
    branch feature-branch2
    commit
    commit
    checkout main
    merge feature-branch
    commit tag: "v0.3.1"
    checkout feature-branch2
    merge main
    commit
    checkout main
    commit
```

`main`, `feature-branch` and `feature-branch2` are branches (the white nodes are merge commits with multiple parents), and `v0.3.1` is a tag.
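The difference between static and dynamic labels can be sketched in a throwaway repository: a tag stays on the commit it was created on, while the current branch moves along with every new commit. The helper `demo_commit` and the identity values are hypothetical and only exist for this illustration.

```bash
demo=$(mktemp -d)
git init -q "$demo"

# hypothetical helper: create an empty commit with a placeholder identity
demo_commit() {
  git -C "$demo" -c user.name="Example User" -c user.email="user@example.com" \
      commit -q --allow-empty -m "$1"
}

demo_commit "first commit"
git -C "$demo" tag v0.3.1                              # static label on the current commit
branch=$(git -C "$demo" symbolic-ref --short HEAD)     # name of the current branch
demo_commit "second commit"                            # the branch moves, the tag does not

git -C "$demo" rev-parse v0.3.1 "$branch"              # two different hashes
git -C "$demo" log --oneline --decorate                # shows which label points where
```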

For more extensive explanations see the [git book](https://git-scm.com/book/en/v2).

With all that in mind, let's start by creating a repository:

## creating a repository

Repositories can be created using two methods:
- if we want to create a new repository: `git init`
- if we want to work on a repository that already exists: `git clone`
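Since the rest of this section only uses `git init`, here is a minimal sketch of the `git clone` route. It clones from a local path to stay self-contained; in practice the source would usually be a URL. The directory names `original` and `copy` are made up.

```bash
demo=$(mktemp -d)

# an existing repository with one commit (placeholder identity)
git init -q "$demo/original"
git -C "$demo/original" -c user.name="Example User" -c user.email="user@example.com" \
    commit -q --allow-empty -m "first commit"

# clone it into a new directory
git clone -q "$demo/original" "$demo/copy"

git -C "$demo/copy" log --oneline   # the history came along
git -C "$demo/copy" remote -v       # the clone remembers its source as "origin"
```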

### repository initialization

In [None]:
mkdir repository
cd repository

In [None]:
git init
# or `git init .`

We can also do the same thing with:
```bash
git init repository
cd repository
```

Next, we need to configure the repository: since git was designed to allow collaboration with other people, we need to tell `git` our name and email address so it knows who authored what. This information will be used to fill in the author and the (last) committer's information of a commit (we'll see what this is used for in the next section).

To do this, we use the `git config` command.

:::{note}

We're using the `--local` flag for `git config`. This flag, together with `--global`, `--system`, and `-f` / `--file`, controls the configuration file we write to:
- `--local` selects `.git/config`
- `--global` selects `~/.gitconfig`
- `--system` selects `/etc/gitconfig`
- `-f` / `--file` allows specifying a custom location

`--local` is the default when setting configuration values. When reading, however, `git config` reads all configuration files and merges them (local overrides global, which in turn overrides system).

:::
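The precedence between the configuration files can be tried out without touching the real `~/.gitconfig`. The sketch below assumes that overriding `HOME` for individual commands is acceptable; `fake_home`, the helper `cfg`, and the two names are hypothetical values for this illustration.

```bash
demo=$(mktemp -d)
fake_home="$demo/home"              # stand-in for ~ so the real ~/.gitconfig stays untouched
mkdir -p "$fake_home"
touch "$fake_home/.gitconfig"       # make sure the "global" file exists in the stand-in home
git init -q "$demo/repo"

# hypothetical helper: run `git config` in the demo repository with the stand-in home
cfg() { HOME="$fake_home" git -C "$demo/repo" config "$@"; }

cfg --global user.name "Global Name"
cfg --local  user.name "Local Name"

cfg --global user.name                      # prints: Global Name
cfg user.name                               # prints: Local Name (local overrides global)
cfg --show-origin --get-all user.name       # lists every value together with its file
```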

In [None]:
cat .git/config

In [None]:
git config --local --get-regexp 'user.'

In [None]:
git config user.name "The user's name"
git config user.email "user@example.com"

In [None]:
git config --local --get-regexp 'user.'

In [None]:
cat .git/config

## commits

git remembers changes to files (be that creating, modifying, or deleting) in the form of commits. To see the components of a commit, see [this section](#commit-contents).

A newly created repository will not have any commits at all, which we can verify by running `git status`:

In [None]:
git status

Use this every time you're not sure about the state of the repository.

### creating commits and the staging area

If we make changes to files, we modify them in the actual directory, which git calls the "working tree" (abbreviated to "workdir" in the following).

We can select changes to commit using `git add`, which will add the changes to the staging area. This allows us to use multiple calls to `git add` until we're content with the changes to commit. If there's anything we want to remove, we can do so using `git rm --cached`.
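As a sketch of removing something from the staging area again, the following stages two files in a throwaway repository and then unstages one of them. The file names `keep.txt` and `drop.txt` are made up.

```bash
demo=$(mktemp -d)
git init -q "$demo"
echo "keep" > "$demo/keep.txt"
echo "drop" > "$demo/drop.txt"

git -C "$demo" add keep.txt drop.txt     # stage both files
git -C "$demo" status --short            # both are listed as added ("A ")

git -C "$demo" rm -q --cached drop.txt   # unstage drop.txt; the file itself stays in the workdir
git -C "$demo" status --short            # drop.txt is now untracked ("??")
ls "$demo"
```

Without `--cached`, `git rm` would also try to delete the file from the workdir, so the flag matters here.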

In [None]:
echo "a" > file1
git add file1
git status

In [None]:
echo "b" > file2
git add file2
git status

We can also look at the actual changes using

In [None]:
git diff --staged

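This is a "patch". The most important bits for `file1` are:
- `a/file1` and `b/file1` are the old and the new version of `file1` (`a/` and `b/` are just prefixes, not commits)
- `/dev/null` is a marker for "does not exist"
- the hunk header `@@ -0,0 +1 @@` says that the old version had no lines and the new version has one line, and `+a` is the inserted line containing `a`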

If we're happy, we can commit:

:::{note}
If we don't pass `-m <message>`, `git commit` opens an editor to ask for the message, which doesn't work too well in a notebook.
:::

In [None]:
git commit -m "first commit"
git status

You can see the relationship between workdir, stage, and commits here:
```mermaid
graph LR
    A((workdir)) -- git add --> B((stage))
    B -->|"git rm --cached"| A
    B -- git commit --> C((commit))
    C -- git reset --> B
```
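The arrows of this diagram can be walked through in a throwaway repository. This is only a sketch: it reads the `git reset` arrow as `git reset --soft HEAD~1`, which is one of several possible variants, and the identity values are placeholders.

```bash
demo=$(mktemp -d)
git init -q "$demo"
git -C "$demo" config user.name "Example User"       # placeholder identity, local to this repository
git -C "$demo" config user.email "user@example.com"

echo "a" > "$demo/file1"                     # change in the workdir
git -C "$demo" add file1                     # workdir -> stage
git -C "$demo" commit -q -m "first commit"   # stage -> commit

echo "b" > "$demo/file2"
git -C "$demo" add file2
git -C "$demo" commit -q -m "second commit"

git -C "$demo" reset -q --soft HEAD~1        # commit -> stage: drop the commit, keep its changes staged
git -C "$demo" status --short                # file2 is staged again ("A ")

git -C "$demo" rm -q --cached file2          # stage -> workdir: file2 is only in the workdir now
git -C "$demo" status --short                # file2 is untracked ("??")
```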

### commit contents

Commits were originally built on emails (people used to mail around diffs), so they consist of:
- the creation time
- the author (the user first creating this changeset) in the form of `User <email-address>`
- the time of last modification
- the committer (the user who last modified the commit) in the form `User <email-address>`
- the hash values of the parent commits (none for the very first commit, one for a regular commit, two or more for a merge commit)
- the commit message
- the changeset in the form of a diff (a text representation of the changes); internally, git stores a snapshot of the files and computes the diff when needed
- a hash of all that information as a unique id (the current commit's id)
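Most of these components can be inspected directly. The sketch below creates a single commit in a throwaway repository (with a placeholder identity) and prints the stored commit object as well as the diff git derives for it.

```bash
demo=$(mktemp -d)
git init -q "$demo"
echo "a" > "$demo/file1"
git -C "$demo" add file1
git -C "$demo" -c user.name="Example User" -c user.email="user@example.com" \
    commit -q -m "first commit"

# the stored commit object: tree (snapshot), author, committer, message
# (no "parent" line here, because this is the first commit)
git -C "$demo" cat-file -p HEAD

# the same metadata with readable dates
git -C "$demo" log -1 --format=fuller

# the commit together with its diff
git -C "$demo" show HEAD
```

The two timestamps appear after the author and committer entries, as seconds since the epoch plus a time zone offset.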

#### The commit message

By convention, the commit message consists of:
- a one-line summary of the changes within the commit (the recommendation is to keep that below ~70 characters)
- optionally more text separated from the summary by a blank line
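One way to write such a message without an editor is to pass `-m` more than once: each `-m` becomes its own paragraph, so the second one ends up separated from the summary by a blank line. The message texts and the identity in this sketch are placeholders.

```bash
demo=$(mktemp -d)
git init -q "$demo"
echo "a" > "$demo/file1"
git -C "$demo" add file1
git -C "$demo" -c user.name="Example User" -c user.email="user@example.com" \
    commit -q \
    -m "add file1" \
    -m "file1 only contains a single line for now."

git -C "$demo" log -1 --format=%s   # the one-line summary
git -C "$demo" log -1 --format=%b   # the optional body
git -C "$demo" log -1 --format=%B   # the full message, including the blank line
```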

In [None]:
echo "1" > file.txt

In [None]:
git status

## branches

(TODO)

## github workflow

(TODO)