# Version control

**Contact:** Alex Kanitz (alexander.kanitz@unibas.ch)

This lesson gives you a primer on version control and relevant tools to version control your code and publish it online.

## Table of Contents

* [Why use version control?](#Why-use-version-control?)
* [How does a version control system work?](#How-does-a-version-control-system-work?)
* [Git](#Git)
  * [Overview](#Overview)
  * [Common Git workflows for collaborative coding](#Common-Git-workflows-for-collaborative-coding)
  * [Interactive session: Basic Git commands](#Interactive-session:-Basic-Git-commands)
  * [Remote repositories & GitLab](#Remote-repositories-&-GitLab)
  * [Best practices](#Best-practices)
  * [Interactive session: Git & GitLab](#Interactive-session:-Git-&-GitLab)
  * [Further reading](#Further-reading)
* [Homework](#Homework)

## Why use version control?

Sending text files around used to be a common method to work collaboratively on
coding projects (or manuscripts, for that matter). In settings where people
often code only sporadically, such as in the life sciences, it still is. This
is unfortunate, because it is cumbersome, error-prone, does not track
provenance (who did what, when and why), does not allow easy management of
multiple versions (say, you want to add new features or security patches to
multiple versions of your software), and it does not scale well (good luck
trying to resolve conflicts when seven people modify your code simultaneously).
**Version control systems (VCS)** - also frequently referred to as revision
control, source control, or source code management systems - come to the rescue
here, as they are specifically designed to address all of these issues, and
more. VCSs are readily available, often for free (as in speech _and_ beer), and
generally integrate well with any modern software development toolkit,
including most editors, integrated development environments (IDEs) and
continuous testing, integration and delivery solutions.

_Adopting a VCS will ..._

* make your code more robust
* protect it from degradation and catastrophic losses
* facilitate agile development
* accelerate code delivery

As such, VCSs are at the core of good coding practices, and **we strongly
recommend you use a VCS for _all_ of your coding projects**, private or
public, collaborative or not. In fact, you may want to consider using them to
version control other pieces of data you generate and frequently modify,
such as vector images or (as we do) course materials.

> Cloud-based, collaborative productivity tools such as [Google
> Docs](https://www.google.com/docs/about/) typically have version control
> built in, so you may already be familiar with some of its advantages, like
> the ability to roll back changes or track who contributed what.

## How does a version control system work?

Simply speaking (though probably not quite accurate for any given VCS), modern
VCSs store only the differences between different states/versions of a code
repository in a database, rather than complete snapshots of all files.
Moreover, the stored differences are typically compressed. This design
considerably lowers the storage footprint of code repositories and improves
the performance/speed of operations.

VCSs in use today can be roughly categorized into two groups, differing in
the way that they are organized:

* **Centralized VCS**: contributors pull the latest state from a remote server,
  make their changes and push them back
* **Distributed VCS**: contributors create a local copy (or "clone") of the
  remote repository, including its entire change history, commit changes to
  their local copies and then push them back

Both solutions have advantages and disadvantages, but the fact that
contributors have the entire repository at their disposal typically allows for
more complex (and faster!) operations in distributed compared to centralized
solutions. It also provides an additional safety net against catastrophic
damage resulting, e.g., from server outages.
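
To make the last point more tangible, here is a minimal sketch of what "having the entire repository at your disposal" means in practice. It is not part of the original course cells: the directory names `server_repo` and `local_clone`, the file `notes.txt` and the user details are made up for illustration, and a plain local directory stands in for a real remote server.

```shell
set -eu

# hypothetical scratch directory; nothing outside of it is touched
workdir="$(mktemp -d)"
cd "$workdir"

# stand-in for the "server": a repository with three commits
git init -q server_repo
git -C server_repo config user.name "Example User"
git -C server_repo config user.email "user@example.org"
for i in 1 2 3; do
    echo "change $i" >> server_repo/notes.txt
    git -C server_repo add notes.txt
    git -C server_repo commit -q -m "docs: add change $i"
done

# a contributor clones the repository, including its entire history
git clone -q server_repo local_clone

# simulate a catastrophic server outage
rm -rf server_repo

# the complete history is still available in the clone
commit_count="$(git -C local_clone rev-list --count HEAD)"
echo "commits available without the server: $commit_count"
git -C local_clone --no-pager log --oneline
```

With a centralized VCS, losing the server in this way would typically also mean losing the change history, unless separate backups exist.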

> This course is meant to give you guidelines and quick entry points into
> tooling that allows you to code more effectively and therefore we have
> limited capacity to go into much detail on any individual topic. For more
> information regarding the differences, pros and cons of centralized and
> distributed VCSs, you may refer to [this Atlassian blog
> post](https://www.atlassian.com/blog/software-teams/version-control-centralized-dvcs).
> If you are interested in learning more about the various VCSs in use today,
> including their evolution, underlying principles and implementation details,
> we recommend you check out [this excellent Initial Commit blog
> post](https://initialcommit.com/blog/Technical-Guide-VCS-Internals).


## Git

### Overview

[Git](https://git-scm.com/) is a distributed VCS originally launched in 2005 by
Linus Torvalds, the father and namesake of the Linux kernel. Over the years,
Git outcompeted other VCS solutions to the extent that it is now considered the
**_de facto_ standard VCS for open source software development** (check [this
report from
RhodeCode](https://rhodecode.com/insights/version-control-systems-2016) for
some actual, though dated, numbers; the actual dominance of Git has likely
further increased substantially since the report was published in 2016; you may
also want to read [this Hackernoon blog
post](https://hackernoon.com/how-git-changed-the-history-of-software-version-control-5f2c0a0850df)
for some additional context on the impact of Git). 

Given the popularity of Git, especially in the scientific software community
(we are not aware of a single piece of serious, widely used open source
software in the field that does not offer its code in a Git repository), **we will
be using Git as our VCS of choice throughout this course**.

Git is **fast, free, open source** and comes preinstalled on most Mac and Linux
machines. It stores a project's history as a directed graph, with a root, edges
("branches"), nodes and leaves ("commits"). Commits represent snapshots of a
project's state, and given Git's distributed nature (i.e., users work on a
clone of the entire code repository, not just the current state) it is easy to
traverse the tree to go back in time, roll back changes or to compare one state
of the project with another (which makes code review a breeze). Branches
represent different lines of work and they can be used for various purposes,
e.g., maintaining multiple versions of your software or keeping stable,
fully-tested code separately from features that are currently being
implemented. Generally, there is one default branch (typically the `main` or,
formerly, the `master` branch - depending on your version of Git). This is also
often referred to as a _stable branch_, _release branch_ or _production branch_.
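
The following sketch illustrates the ideas above on a throwaway repository: it creates a small history with two branches, prints the commit graph, compares two states and briefly "travels back in time". All file and branch names (`file.txt`, `other.txt`, `my_feature`) and the user details are hypothetical and chosen only for this illustration.

```shell
set -eu

# hypothetical scratch repository
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.org"

# first commit on the default branch (renamed to "main" for consistency)
echo "a" > file.txt
git add file.txt
git commit -q -m "feat: add file"
git branch -M main
first_commit="$(git rev-parse HEAD)"

# a second line of work on a separate branch
git checkout -q -b my_feature
echo "b" >> file.txt
git commit -q -am "feat: extend file"

# meanwhile, the default branch moves on as well
git checkout -q main
echo "c" > other.txt
git add other.txt
git commit -q -m "docs: add other file"

# visualize the commit graph across all branches
git --no-pager log --graph --oneline --all

# compare the states of the two branches (names of differing files only)
git --no-pager diff --name-only main my_feature

# go back in time to the first commit, look around, then return
git checkout -q "$first_commit"
content_then="$(cat file.txt)"
git checkout -q main
echo "file.txt at the first commit contained: $content_then"
```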

> Note that while Git is distributed by design, that doesn't mean that
> there cannot be a central, authoritative repository in Git workflows. It just
> allows you _not_ to have one, if it happens to suit your project's needs. In
> reality, though, most open source software projects _will_ make use of such a
> _blessed_ repository, and so will we.

One other important property of Git is that it distinguishes between three
different environments:

* The **working directory**  
  This is your current directory structure and corresponds to the state of the
  project on your file system.
* The **staging area**  
  This includes all changes _staged_ to be included in the next commit.
* The actual **local repository**  
  This represents the current state of the commit you are currently viewing
  (also referred to as the HEAD).

Now, when you create a fresh Git repository from an empty directory, clone a
repository from a Git server or pull the latest changes from a remote
repository to your local copy (more on the latter two later), there will be no
difference between the working directory and the HEAD, and the staging area
will be empty. But once you start adding or editing files in the directory
containing the Git repository, the status of your working directory will start
to differ from that of the HEAD. You can then _stage_ one, multiple or all of
the created or modified files to be included in the next commit, filling up the
staging area. Unstaged changes will never be committed. Once you are happy with
your staged changes you can go ahead and commit them - at which point the
working directory and the HEAD will be in sync again.
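
As a rough illustration of how changes move between the three environments, the sketch below tracks a single file through them. The file name `my_file` and the user details are assumptions made for this example. We use `git status --porcelain`, a script-friendly variant of `git status --short`, which prints two status columns per file: the first for the staging area, the second for the working directory.

```shell
set -eu

# hypothetical scratch repository with a single committed file
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.org"
echo "first line" > my_file
git add my_file
git commit -q -m "feat: add my_file"

# working directory, staging area and HEAD are in sync: no output
status_clean="$(git status --porcelain)"

# edit the file: the working directory now differs from the HEAD
echo "second line" >> my_file
status_unstaged="$(git status --porcelain)"   # " M my_file"

# stage the change: it will be part of the next commit
git add my_file
status_staged="$(git status --porcelain)"     # "M  my_file"

# edit again WITHOUT staging: staged and unstaged changes coexist
echo "third line" >> my_file
status_both="$(git status --porcelain)"       # "MM my_file"

# commit: only the staged change (the second line) is recorded
git commit -q -m "feat: add second line"
status_after="$(git status --porcelain)"      # " M my_file"
committed_lines="$(git show HEAD:my_file | grep -c "")"
echo "lines in committed version: $committed_lines"
```

Note how the third line stays behind as an unstaged change after the commit: the working directory and the HEAD are only back in sync once everything has been staged and committed.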

### Common Git workflows for collaborative coding

When working on a project collaboratively, it is critically important to keep
the corresponding code repository in a clean state so as to avoid, as much as
possible, conflicts introduced by different people modifying the same portions
of the same files. Conflicts cannot always be avoided, and that's totally fine
and no one's mistake, but it helps to adopt a common Git workflow that everyone
can follow.
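
If you have never seen a merge conflict, the following sketch provokes one on purpose: two branches change the same line of the same file, so Git cannot decide which version to keep. The names `settings.txt`, `feature_a` and `feature_b` as well as the user details are hypothetical. The sketch merely inspects the conflict and then backs out with `git merge --abort`; resolving it would instead mean editing the file, staging it and committing.

```shell
set -eu

# hypothetical scratch repository
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.org"
echo "greeting = hello" > settings.txt
git add settings.txt
git commit -q -m "feat: add settings"
git branch -M main

# first contributor changes the line on their feature branch
git checkout -q -b feature_a
echo "greeting = hi" > settings.txt
git commit -q -am "feat: shorten greeting"

# second contributor changes the SAME line, starting from the same state
git checkout -q main
git checkout -q -b feature_b
echo "greeting = good day" > settings.txt
git commit -q -am "feat: formalize greeting"

# merging the first branch works without problems (fast-forward)
git checkout -q main
git merge -q --ff-only feature_a

# merging the second branch fails, because both touched the same line
if git merge -q feature_b > /dev/null 2>&1; then
    merge_result="clean"
else
    merge_result="conflict"
fi
echo "merge result: $merge_result"

# list conflicting files and show the conflict markers Git inserted
conflicted_files="$(git diff --name-only --diff-filter=U)"
echo "files with conflicts: $conflicted_files"
cat settings.txt

# back out of the merge and return to the state before it
git merge --abort
```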

Here are some of the most commonly used Git workflows (or branching models),
in increasing order of complexity. Which one to pick will depend on the
requirements of your project, but generally the bigger the code base and the
more people are contributing, the more complex the workflow should be. All of
these branching models have in common that they do not allow anyone to push
code directly to the main/default branch.

* [**"GitHub flow"**](https://guides.github.com/introduction/flow/) (low
  complexity)  
  Feature branches are created off the main branch and are merged back into it after a feature has been implemented, tested and reviewed; multiple feature branches can be worked on at the same time, one for each feature and ideally by a single person; the main branch is always assumed to be stable and in a state to be deployed
* [**"GitLab flow"**](https://docs.gitlab.com/ee/topics/gitlab_flow.html)
  (medium complexity)  
  This branching model builds on the GitHub flow and includes guidelines for setting up optional production (to deploy code from development whenever the time is right), environment and release branches on top of the default/main branch and feature branches; this is useful for situations where intensive testing is required prior to deployment (e.g., staging, pre-production, production deployments and corresponding environment branches), when deployments are scheduled or where multiple explicitly versioned releases of a piece of software are to be published
* [**"GitFlow"**](https://nvie.com/posts/a-successful-git-branching-model/)
  (high complexity)  
  The GitHub and GitLab flows represent recent simplifications of this workflow, one of the oldest and most widely known Git branching models; while still a good fit for some release/versioning schemes, its popularity is decreasing with the increasing adoption of continuous integration and delivery solutions, which render some of its features unnecessary; GitFlow prescribes the use of several branch types (main/production, hotfix, release, development and feature branches), with the development branch being the default branch

**For this course, we will be making use of the simplest branching model, the
GitHub flow**, with the additional constraint that every feature branch will be
the sole responsibility of a single person. In this way, it is not so important
that feature branches are kept overly tidy and you can commit to them as much as
you like (it takes experience to keep commits tidy). "Features" to be
implemented during the collaborative coding project will also be kept small,
thus minimizing the possibility of merge conflicts.

### Interactive session: Basic Git commands

In this session, we will create a local Git repository, learn about staging
and committing files, practice the GitHub flow and get to know some useful Git
utilities.

As a first step, please make sure that Git is installed on your machine by
executing `git --version` in your shell. If the command is not found, you will
first need to [install
Git](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git) before you
can continue. Please let us know if you run into any problems during
installation.

Please refer to the following documentation for further details or in case you
missed the interactive session:

* Git commands
  * [`git config`](https://www.atlassian.com/git/tutorials/setting-up-a-repository/git-config):
    set your user name and email address for attribution and feedback
  * [`git init`](https://www.atlassian.com/git/tutorials/setting-up-a-repository/git-init):
    create a Git repository out of the current working directory
  * [`git status`](https://www.atlassian.com/git/tutorials/inspecting-a-repository):
    display current branch and the state of the working directory and staging area 
  * [`git diff`](https://www.atlassian.com/git/tutorials/saving-changes/git-diff):
    compare changes between working directory and staging area (use `--staged` to compare staging area and current HEAD)
  * [`git add`](https://www.atlassian.com/git/tutorials/saving-changes):
    add files to the staging area
  * [`git commit`](https://www.atlassian.com/git/tutorials/saving-changes/git-commit):
    commit staged files to local repository
  * [`git log`](https://www.atlassian.com/git/tutorials/inspecting-a-repository):
    display commit history
  * [`git branch`](https://www.atlassian.com/git/tutorials/using-branches):
    create or delete a branch
  * [`git checkout`](https://www.atlassian.com/git/tutorials/using-branches/git-checkout):
    switch to a different branch or commit
  * [`git merge`](https://www.atlassian.com/git/tutorials/using-branches/git-merge):
    merge one branch into another
 
> Note that all of these commands have multiple options and most offer more
> functionality than mentioned here. For brevity, we are focusing only on the
> functionality that we will likely be using during the course.

In [None]:
#################################  READ ME!  ##################################

# if you install the Bash kernel for JupyterLab, you can execute this code
# cell

# see here how to install Bash kernel:
# https://github.com/takluyver/bash_kernel

# note that before re-running you should restart the kernel and manually delete
# the directory "my_repository" that was created during execution

###############################################################################

# configure Bash to print every command to the screen before executing; useful
# for following what's happening when executing the entire cell at once
set -o xtrace

# 1. SET UP GIT

# check if Git is installed
git --version

# configure user details
# omit the value to print the currently configured user name or email
git config --global user.name "MY NAME"
git config --global user.email "my@email.com"

# 2. CREATE REPOSITORY

# create new directory and move into it
mkdir -p my_repository
cd my_repository

# create Git repository from current working directory (and all subdirectories)
# directory can be empty or contain preexisting files and directories
# generates a repository root, an (empty) default branch ("main" or "master",
# depending on Git version) and a ".git" directory that stores all changesets,
# metadata etc.
git init

# 3. ADD COMMIT STRAIGHT TO DEFAULT BRANCH

# check status of working directory and staging area relative to state of
# repository
git status

# now it's time to add or modify some files...
touch my_file_1 my_file_2

# confirm that repository status changed: now there are untracked files; also
# check which lines differ between the working directory and the staging area
# (note that "git diff" ignores untracked files, so it prints nothing here)
git status
git diff

# stage files for inclusion in next commit, then check and commit
git add my_file_1 my_file_2
# alternatively, do: "git add -A" to add _all_ new/modified files to staging
# area
git status
git commit -m "initial commit"
# alternatively just do: "git commit" to open an editor where you can enter a
# more detailed commit message

# confirm that commit history now contains a new entry and that the staging
# area is clean again
git log
# alternatives:
# short representation with one line per commit: "git log --oneline"
# include visualization of branch tree:
# "git log --graph --decorate --oneline --all"
git status

# make sure your default branch is "main", not "master"
git branch -m main

# 4. MERGE IN CHANGES FROM FEATURE BRANCH

# create feature branch and switch to feature branch
git branch my_feature
# you can use "git branch", without argument, to list available branches and
# verify that a branch was indeed created
git checkout my_feature
# alternatively, do "git checkout -b my_feature" to create and switch to
# feature branch with a single command
# alternatively, you can use "git checkout" also to switch to a specific commit
# by passing a commit identifier/hash (abbreviated or full form); you can get
# these from "git log"

# add and/or modify some files...
touch my_file_3
echo "some_content" >> my_file_1

# add to staging area and commit
git status
git diff
git add my_file_3 my_file_1
git status
git commit -m "feat: add my feature"
git log --oneline

# now merge new commit(s) from feature branch into default branch
git checkout main
git merge my_feature

# delete feature branch ("-d" refuses to delete branches that have not been
# merged yet; use "-D" to force deletion)
git branch -d my_feature
# you could use "git branch" to verify that the branch was indeed deleted

### Remote repositories & GitLab

So far we have only been working with Git locally. But to successfully use Git
for collaborative coding, we need a common remote repository to push our code
changes to, pull the work of others from, etc.

**For hosting our remote repository we will be making use of the popular
Git server and social coding platform [GitLab](https://gitlab.com/)**. While
GitLab is not quite as widely used as Microsoft's
[GitHub](https://github.com/) platform, we chose to use it for this course as
the University of Basel's scientific compute center
[sciCORE](https://scicore.unibas.ch/) offers a local deployment of GitLab at
http://git.scicore.unibas.ch/ and so it will be easy to apply what you have
learned here while working at the Biozentrum. Besides, GitLab and GitHub are
actually quite similar, so it will not be very difficult to transition in case
the need arises (we are using both in our lab).

Next to simple hosting of Git repositories (there are various services that do
just that), GitLab and other social coding platforms offer various project
management functionalities that are incredibly useful for running a software
development project, including merge requests (called pull requests on GitHub
and other platforms), code review tools, an issue tracker and automated kanban
boards for managing issues and merge requests. Social coding platforms (and
Git servers in general) also allow users to _fork_ any public repository,
i.e., create a personal, remote copy of it. In open source software development,
creating merge/pull requests from a fork to the corresponding original
repository is the typical way for people to contribute code to projects that
they are not directly affiliated with and for which they thus do not have the
permissions to push code directly to the original repository. During the
collaborative coding project in the second half of this course, we will also
make use of forking, as we will keep the remote repository of the project on
our sciCORE GitLab instance, to which not all of you may have access (and thus
we may not be able to add you as collaborators with the necessary write
permissions to the project).
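
On the command line, working with a fork usually boils down to having two remotes: your fork, which you can push to, and the original repository, which you can only read from. The sketch below simulates this setup with local bare repositories standing in for the Git server. The paths `original.git` and `fork.git`, the branch `my_contribution` and the user details are made up, and calling the original repository `upstream` is a widespread convention rather than a requirement. On GitLab, the fork itself is created via the web interface, and the final merge request is opened there as well.

```shell
set -eu

# hypothetical scratch directory
workdir="$(mktemp -d)"
cd "$workdir"

# the original project, as hosted on the server (simulated locally)
git init -q seed
git -C seed config user.name "Example Maintainer"
git -C seed config user.email "maintainer@example.org"
echo "# Project" > seed/README.md
git -C seed add README.md
git -C seed commit -q -m "docs: add README"
git -C seed branch -M main
git clone -q --bare seed original.git

# your fork: a personal, server-side copy of the original repository
git clone -q --bare original.git fork.git

# clone your fork; Git automatically calls this remote "origin"
git clone -q "$workdir/fork.git" my_clone
cd my_clone
git config user.name "Example Contributor"
git config user.email "contributor@example.org"

# add the original repository as a second, read-only remote
git remote add upstream "$workdir/original.git"
git fetch -q upstream

# do your work on a feature branch and push it to your fork
git checkout -q -b my_contribution
echo "Contributed line" >> README.md
git commit -q -am "docs: extend README"
git push -q -u origin my_contribution

# list configured remotes and their addresses
git remote -v
```

The feature branch now exists in the fork but not in the original repository; a merge request from the fork is what proposes it for inclusion there.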

### Best practices

1. **Mind what you commit**  
   Only commit manually generated files. Auto-generated files can be recreated
   later and only clutter the repository. For the same reason (and also because
   Git servers impose restrictions on total file and/or repository sizes), do
   not commit big files, such as data files, so keep your test files small and
   relevant. Do not commit any artefacts that are specific to your environment,
   e.g., absolute file paths (this is also a potential security issue!) and,
   most importantly, **do not commit secrets or any other sensitive
   information!** There are bots out there that are constantly scanning public
   Git repositories for such information, and it is cumbersome to rewrite
   the Git history to completely remove such information. String patterns
   matching file and directory names to be ignored can be indicated in various
   places, most often in a version-controlled `.gitignore` file that is placed
   in the repository root directory. However, patterns for files that you want
   to exclude from all repositories and that are specific to your work
   environment (e.g., editor-specific artefacts such as lock and backup files)
   are better indicated globally. Have a look at the ["gitignore"
   documentation](https://git-scm.com/docs/gitignore) to find out how to use
   it. We also recommend that you make use of the
   [gitignore.io](http://gitignore.io/) service that auto-generates "gitignore"
   patterns for you based on a wide list of keywords (e.g., `Python`,
   `VisualStudioCode`, `Linux`). Given the importance of configuring Git to
   ignore certain files, we have included the generation of a `.gitignore` file
   in the homework below.
2. **Create single-purpose commits with semantic commit messages**  
   Analogous to the [single-responsibility
   principle](https://en.wikipedia.org/wiki/Single-responsibility_principle)
   of software development, a commit should wrap only functionally and/or
   semantically related changes. For example, if you find a typo in your
   project's documentation while you are working on something else, do not
   just fix it along with your other changes; fix it in a separate commit.
   Changes are easier to review when they span as few lines as possible. It's
   also easier to roll back buggy code without any additional side effects.
   Finally, it will make for a clean commit history and changelog, so that
   users of your software can track its development and the release of new
   features and bug fixes. In this regard, we strongly recommend that you
   describe your
   commits using concise (up to 50 characters in the title line) semantic
   commit messages following the
   [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/)
   specification. It may be painful at first, but learning it early should
   prevent you from adopting bad practices, and it comes with multiple
   benefits, including a clean and consistent commit history, the ability to
   create changelogs, bump versions according to the widely used [Semantic
   Versioning (SemVer)](https://semver.org/) specification and even publish
   entire releases automatically (in Python, e.g., with the
   [`python-semantic-release`](https://python-semantic-release.readthedocs.io/en/latest/)
   package).
3. **Do not rewrite history**  
   Once you push code to a public repository, others are able to pull the code
   and work on it (which you may not necessarily be aware of). If you change
   the commit graph between the time point that someone has checked out the old
   history tree and the time point they are trying to merge their new code, Git
   will not be able to resolve the resulting conflicts, causing much justified
   frustration. Apart from creating irresolvable conflicts, rewriting history
   can also lead to contributions being misattributed to different people
   (imagine someone rewriting your paper, stripping your name and replacing the
   old version with the new one on the publisher's website) and difficulties in
   tracking the progress of your project. Of course, as with (almost) every
   guideline, there are valid exceptions: For example, in some Git workflows,
   including the one we are using, the history of a _feature branch_ (but not
   a release/stable branch) can be rewritten in order to clean up or squash
   commits before merging. Here, by convention, the assumption generally holds
   that only a single person is working on a given feature branch at a time.
   Another exception is when you inadvertently commit sensitive information to
   your repository (see above). Clearly, in such a situation even rewriting a
   stable branch may become necessary.
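
To illustrate the first practice, here is a small sketch of a `.gitignore` file in action. The patterns and file names are merely examples that we picked for a hypothetical Python project; your own list will depend on your tools. `git check-ignore` is handy for finding out whether, and because of which pattern, a given file is ignored.

```shell
set -eu

# hypothetical scratch repository
repo="$(mktemp -d)"
cd "$repo"
git init -q

# example patterns; adapt these to your own project
cat > .gitignore << 'EOF'
# auto-generated files
__pycache__/
*.log
# local secrets
.env
EOF

# create one file worth committing and several that are not
mkdir -p __pycache__
touch my_script.py run.log .env __pycache__/my_script.cpython-311.pyc

# only non-ignored files show up as untracked ("??")
visible_files="$(git status --porcelain --untracked-files=all)"
echo "$visible_files"

# find out which pattern is responsible for ignoring a file
git check-ignore -v run.log

# patterns specific to your own work environment can be set globally, e.g.:
# git config --global core.excludesFile ~/.gitignore_global
```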


> More thoughts and best practices can be found in [this extensive
> resource](https://sethrobertson.github.io/GitBestPractices/), courtesy of
> Seth Robertson.
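
Regarding semantic commit messages (practice 2 above), the following sketch shows how a commit title could be checked against the basic shape `type(optional scope): description` and the 50-character limit. Please treat it as a rough approximation: `check_commit_title` is a hypothetical helper, and the list of allowed types is an assumption on our part (the Conventional Commits specification itself only defines `feat` and `fix`; the other types are common conventions). Dedicated tools perform far more thorough checks.

```shell
set -eu

# hypothetical helper: prints "ok" or the reason why a title is rejected
check_commit_title() {
    title="$1"
    # assumed list of types; adapt it to the conventions of your project
    pattern='^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)(\([a-z0-9_-]+\))?!?: .+$'
    if [ "${#title}" -gt 50 ]; then
        echo "title longer than 50 characters"
        return 1
    fi
    if ! printf '%s\n' "$title" | grep -Eq "$pattern"; then
        echo "title does not follow 'type(scope): description'"
        return 1
    fi
    echo "ok"
}

# try the helper on a few example titles
result_good="$(check_commit_title "feat: add my feature")"
result_scope="$(check_commit_title "fix(parser): handle empty input")"
result_bad="$(check_commit_title "added some stuff" || true)"
result_long="$(check_commit_title "feat: add a very long description that exceeds the recommended limit" || true)"

echo "$result_good"
echo "$result_scope"
echo "$result_bad"
echo "$result_long"
```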



### Interactive session: Git & GitLab

In this session, we will create a _remote_ on GitLab. It will serve as our central, authoritative repository to/from which we push/pull the latest code changes. Next to the relevant Git commands to interact with a remote, we will explore GitLab's basic project management, code review and merging features.

As a first step, please [register with GitLab](https://gitlab.com/users/sign_up) if you haven't already done so.

Please refer to the following documentation for further details or in case you missed the interactive session:

* GitLab functionalities
  * [Creating a blank project](https://docs.gitlab.com/ee/user/project/working_with_projects.html#blank-projects)
  * [Managing project permissions](https://docs.gitlab.com/ee/user/project/working_with_projects.html#blank-projects)
  * [Setting branch protection rules to enforce GitHub flow](https://docs.gitlab.com/ee/user/project/protected_branches.html#require-everyone-to-submit-merge-requests-for-a-protected-branch)
  * [Creating issues](https://docs.gitlab.com/ee/user/project/issues/managing_issues.html#create-a-new-issue)
  * [Creating merge requests](https://docs.gitlab.com/ee/user/project/merge_requests/creating_merge_requests.html)
  * [Automatically closing issues](https://docs.gitlab.com/ee/user/project/issues/managing_issues.html#closing-issues-automatically)
  * [Reviewing merge requests](https://docs.gitlab.com/ee/user/project/merge_requests/reviews/)
  * [Squashing commits before merging](https://docs.gitlab.com/ee/user/project/merge_requests/squash_and_merge.html)


* Git commands
  * [`git remote add`](https://docs.github.com/en/get-started/getting-started-with-git/managing-remote-repositories#adding-a-remote-repository):
    connect a local with a remote repository
  * [`git push`](https://docs.github.com/en/get-started/using-git/pushing-commits-to-a-remote-repository#about-git-push):
    push changes from local to remote repository
  * [`git fetch`](https://docs.github.com/en/get-started/using-git/getting-changes-from-a-remote-repository#fetching-changes-from-a-remote-repository):
    fetch latest metadata from remote repository but do _not_ merge any changes into local repository
  * [`git pull`](https://docs.github.com/en/get-started/using-git/getting-changes-from-a-remote-repository#pulling-changes-from-a-remote-repository):
    fetch latest metadata and merge changes into the local repository (shortcut for executing `git fetch` and `git merge` one after another)
  * [`git clone`](https://docs.github.com/en/get-started/using-git/getting-changes-from-a-remote-repository#cloning-a-repository):
    create a local copy of a remote repository
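
The difference between `git fetch` and `git pull` tends to cause confusion, so here is a small sketch that contrasts the two. A local bare repository (`remote.git`) stands in for the Git server, and two clones represent two contributors; all names (`alice`, `bob`, `notes.txt`) and the user details are hypothetical.

```shell
set -eu

# hypothetical scratch directory
workdir="$(mktemp -d)"
cd "$workdir"

# set up a "server" repository with one commit and two clones of it
git init -q seed
git -C seed config user.name "Example User"
git -C seed config user.email "user@example.org"
echo "initial" > seed/notes.txt
git -C seed add notes.txt
git -C seed commit -q -m "docs: add notes"
git -C seed branch -M main
git clone -q --bare seed remote.git
git clone -q remote.git alice
git clone -q remote.git bob
for clone in alice bob; do
    git -C "$clone" config user.name "Example User"
    git -C "$clone" config user.email "user@example.org"
done

# Alice pushes a new commit to the remote
echo "from alice" >> alice/notes.txt
git -C alice commit -q -am "docs: add note from alice"
git -C alice push -q origin main

# Bob fetches: he now KNOWS about the new commit ("origin/main" moved),
# but his own "main" branch and his files are unchanged
git -C bob fetch -q origin
behind_after_fetch="$(git -C bob rev-list --count main..origin/main)"
echo "commits Bob is behind after fetch: $behind_after_fetch"

# only merging brings the change into Bob's branch
git -C bob merge -q --ff-only origin/main
behind_after_merge="$(git -C bob rev-list --count main..origin/main)"
echo "commits Bob is behind after merge: $behind_after_merge"

# Alice pushes again; this time Bob fetches and merges in one go
echo "more from alice" >> alice/notes.txt
git -C alice commit -q -am "docs: add another note from alice"
git -C alice push -q origin main
git -C bob pull -q --ff-only origin main
behind_after_pull="$(git -C bob rev-list --count main..origin/main)"
echo "commits Bob is behind after pull: $behind_after_pull"
```

Fetching first is useful whenever you want to inspect incoming changes (e.g., with `git log main..origin/main`) before merging them into your own branch.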

In [None]:
#################################  READ ME!  ##################################

# you can execute this code cell if you have executed the previous code (i.e.,
# you are still inside the repository you created) AND you have created a new,
# *empty* repository/project on a Git server (e.g., GitLab) AND you have
# assigned the SSH-type address ("git@...") to the variable below

REPO_ADDRESS=

###############################################################################

# configure Bash to print every command to the screen before executing; useful
# for following what's happening when executing the entire cell at once
set -o xtrace
# make sure commands below are not executed if repository address is not
# provided
if [ -z "$REPO_ADDRESS" ]; then echo "repository address not set"; exit; fi

# 1. CONNECT LOCAL WITH REMOTE REPOSITORY

# connect the local to the remote repository; call the remote repository
# "origin"
git remote add origin "$REPO_ADDRESS"
# push everything in local repository (all branches and tags) to the remote
# called "origin"
# tags are just optional labels for specific commits, e.g., if you decide that
# this commit right here is going to represent "v1.2.3" of your software
git push -u origin --all
git push -u origin --tags

# 2. ADD CODE CHANGES TO FEATURE BRANCH AND PUSH TO REMOTE

# make sure you are on the default branch and that it is in sync with all the
# latest changes on the remote
git checkout main
git fetch
git merge
# alternatively: "git pull" does "git fetch" (to fetch changes) and "git merge"
# (merge in changes) all in one

# create new branch and switch to it
git branch my_new_feature
git checkout my_new_feature


# add and/or modify some files...
touch my_file_4
echo "some_other_content" >> my_file_2

# add to staging area and commit
git status
git diff
git add my_file_4 my_file_2
git status
git commit -m "feat: add my new feature"
git log --oneline

# push your feature branch to the remote repo
# setting the "-u" flag sets the default remote branch for the current local
# branch, so that for future "git push" operations on your feature branch,
# you only need to execute "git push" (and similarly, git pull)
git push -u origin my_new_feature

# you can now go ahead and create a merge request on your Git server to have
# your code reviewed and your feature merged into the main/default/production
# branch

### Further reading

We focused here on functionalities that you are likely going to use during the
course, but there are plenty of other Git commands, as well as nuances to
the introduced commands that have not been addressed. The [official Git
website](https://git-scm.com/) offers extensive documentation, including the
[reference documentation](https://git-scm.com/docs), the book "[Pro
Git](https://git-scm.com/book/en/v2)" (free), some
[videos](https://git-scm.com/videos) and [links](https://git-scm.com/doc/ext)
to externally hosted tutorials, books, videos and courses. Apart from that,
[Stack Overflow](https://stackoverflow.com/) has [more than 100'000 questions
tagged with "Git"](https://stackoverflow.com/questions/tagged/git), so you will
likely find answers to pretty much any question you may have.

We encourage you to (re)visit these resources whenever you need them so that
you can, with time, add cool new Git skills to your toolbox.

## Homework

1. **Practice markdown** (15-30 min)  
   Complete the [markdown tutorial](https://commonmark.org/help/tutorial/) and
   take a screenshot of the final page, indicating that you have completed the
   tutorial.
2. **Practice Git commands** (1-2 h)  
   Complete at least the "Introduction Sequence" (on tab "Main") and the "Push
   & Pull -- Git Remotes!" (on tab "Remote") sections of the [interactive Git
   tutorial](https://learngitbranching.js.org/) and take screenshots of the
   final messages indicating that you have completed the sections.
3. **Set up your GitLab project** (5-10 min)
   > Note that this should be done only _once_ per tool/repo, so if you work with multiple people, make sure you do this together or distribute tasks.
   * Set up branch protection rules for the default branch (`main` or `master`, depending on your version of Git) such that nobody can push directly to the branch and only people with a "Maintainer" role can merge code.
   > Hint: `Settings` > `Repository` > `Branch rules`
   * Use the "Fast-forward merge" strategy.
   > Hint: `Settings` > `Merge requests`
   * Encourage the squashing of commits so that all (potentially messy) commits of a feature branch are squashed into a single clean commit when merged to the default branch.
   > Hint: `Settings` > `Merge requests`
4. **Create a Git repository and connect it with the corresponding GitLab project** (5-10 min)
   > Again, this should be done only _once_ per tool/repo, so if you work with multiple people, make sure you do this together or distribute tasks. Also, if you had already pushed content to your GitLab project, add the `.gitignore` file via the procedure described in exercise 5.
   1. Create an empty directory on your file system, move into it and create a Git repository with `git init`.
   2. Generate a `.gitignore` file in the repository's root directory.
   3. Stage the inclusion of the file and commit the staged changes. Use "initial commit" as your commit message.
   4. Connect your repository to the corresponding public project on GitLab by following the instructions on the GitLab project page below "Push an existing Git repository".
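
   The four steps above can be sketched on the command line as follows. This is only a minimal sketch under stated assumptions: the directory name `my_project`, the `.gitignore` patterns and the user name/email are placeholders, and a local bare repository stands in for the GitLab project so that the commands can be tried offline. For your real project, use the URL and commands shown on the GitLab project page instead.

```shell
# Work in a scratch directory so that nothing existing is touched (assumption)
workdir="$(mktemp -d)"
cd "$workdir"

# Stand-in for the GitLab project: a local bare repository (hypothetical;
# for your real project, use the URL shown on the GitLab project page)
git init --quiet --bare remote.git

# 1. Create an empty directory, move into it and create a Git repository
mkdir my_project
cd my_project
git init --quiet

# Identity used for commits in this repository only (placeholder values)
git config user.name "Student One"
git config user.email "student.one@example.org"

# 2. Generate a .gitignore file (example patterns for a Python project)
printf '%s\n' '__pycache__/' '*.pyc' '.venv/' > .gitignore

# 3. Stage the file and commit the staged changes
git add .gitignore
git commit --quiet -m "initial commit"

# 4. Connect the repository to the remote and push the current branch
git remote add origin "$workdir/remote.git"
git push --quiet --set-upstream origin HEAD
```

   Pushing `HEAD` avoids hard-coding the default branch name, which may be `main` or `master` depending on your setup.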
5. **Add content to your project** (20-40 min)  
   Use the GitHub flow to add content to your project. To do so, repeat the following steps three times to add (1) a `README.md` file (add a header and a few lines to describe the project; do this only _once_ per project/repo), (2) the screenshots from the markdown and Git tutorials (inside a directory `images`; add your names to the screenshot filenames to make sure filenames are unique and we can track who did what), and (3) any code files you may have already written:
   > Again, make sure to organize yourselves if you work with multiple people. Only _one_ copy of `README.md` should be pushed, but each of you should push your screenshots as well as any code you may have written. Place all code files in a dedicated directory. Make sure you choose suitable, descriptive names for both the code files and the code directory. Neither should include your names, and the latter should ideally match the name of the package. Mind the naming rules for files and packages in Python.
   1. Create a (brief!) issue on GitLab for each code/repository change.
   2. Check out the default branch (`main` or `master`).
   3. Pull the latest code changes. (_DO NOT FORGET THIS!_)
   4. Create a new feature branch off the default branch.
   5. Add your file/files, stage it/them for inclusion and commit.
   6. Push your changes to the remote on GitLab.
   7. Create a merge request for your feature branch ("source branch") against the default branch ("target branch"). Make sure the merge request title follows the rules for semantic commit messages. Refer to the corresponding issue (e.g., "Closes #1") in the description of the merge request. Make sure the box "Squash commits when merge request is accepted" is checked (it should be if you set up the GitLab project correctly in exercise 3).
   8. Self-review the merge request. If you are working with multiple people, have your code reviewed by the other people.
   9. Merge the proposed code changes.
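
   The command-line part of this procedure (steps 2 to 6) might look like the sketch below; steps 1 and 7 to 9 happen in the GitLab web interface. All names are assumptions made for illustration: the branch name `add_readme`, the file content, the commit message and the user identity are placeholders, and a local bare repository again stands in for the GitLab project.

```shell
# Setup for this sketch only: a scratch clone of a local bare repository that
# stands in for your GitLab project (hypothetical names throughout)
workdir="$(mktemp -d)"
git init --quiet --bare "$workdir/remote.git"
git clone --quiet "$workdir/remote.git" "$workdir/my_project"
cd "$workdir/my_project"
git config user.name "Student One"
git config user.email "student.one@example.org"
git commit --quiet --allow-empty -m "initial commit"
git push --quiet --set-upstream origin HEAD
default_branch="$(git rev-parse --abbrev-ref HEAD)"  # main or master

# 2. Check out the default branch
git checkout --quiet "$default_branch"

# 3. Pull the latest code changes
git pull --quiet

# 4. Create a new feature branch off the default branch
git checkout --quiet -b add_readme

# 5. Add your file, stage it for inclusion and commit
printf '%s\n' '# My project' '' 'A short description of the project.' > README.md
git add README.md
git commit --quiet -m "docs: add README"

# 6. Push the feature branch to the remote
git push --quiet --set-upstream origin add_readme
```

   After the push, the feature branch exists on the remote and can be selected as the source branch of a merge request.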

   Upon completion, the directory structure should look something like this:

```console
├── your_package
│   ├── your_code_file_1.py  # from student 1
│   ├── your_code_file_2.py  # from student 2
│   └── ...  # any additional code files
├── .git
│   ├── ...
│   ├── ...
│   └── ...
├── .gitignore
├── images
│   ├── screenshot_git_tutorial_1_student_1.png
│   ├── screenshot_git_tutorial_2_student_1.png
│   ├── screenshot_markdown_tutorial_student_1.png
│   ├── screenshot_git_tutorial_1_student_2.png
│   ├── screenshot_git_tutorial_2_student_2.png
│   ├── screenshot_markdown_tutorial_student_2.png
│   └── ...  # screenshots from additional contributors
└── README.md
```
6. **Create detailed issues for your work packages** (30-120 min)
   > Do this only once your project design has been approved.
   
   Write individual issues for each work package in your project design. Follow the guidelines on how to write good issues to include as much information as possible: What is it that you want to achieve? How do you want to achieve it? What are potential pitfalls and alternatives that may help address those pitfalls? Is there any additional context you would like to share?

   > For increased consistency, you can optionally also try to create [issue templates](https://docs.gitlab.com/ee/user/project/description_templates.html). Either write one yourself or find a template online.
7. **CONTINUOUS HOMEWORK: Implement work packages**
   > Do this only once your project design has been approved and after issue descriptions have been created (exercise 6).
   
   From now on, please process/implement available issues in your GitLab project (those created in exercise 6 and those added later on, if any). Assign yourself to an issue, implement the necessary code additions/changes and merge the code changes by following the procedure outlined in exercise 5. Repeat.
   
   > If you work on the project with multiple people, coordinate among yourselves to distribute issues evenly and make sure to review each other's code changes. Do NOT forget to `git pull` to synchronize your local repository with the latest changes on the remote repository before creating a feature branch to avoid unnecessary conflicts later on.
   >
   > It may be necessary to resolve conflicts arising from multiple people changing the same file or files at the same time. If that happens, follow the on-screen instructions to resolve the conflicts. However, remember that the best conflict resolution strategy is to avoid conflicts before they happen, wherever possible. So try to distribute tasks such that different people work on different files.
   >
   > For any problems that may arise, first consult the web for help. If that does not resolve your problem, discuss it with your fellow students working on the same project, if any. If you are still stuck, describe the problem (1) on the Slack channel if it is a Git-/GitLab-related or other general technical problem, or (2) in the corresponding issue if it is an issue-related problem. In case new issues arise during the process, make sure to write a description for them as in exercise 6.
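
   To get a feeling for what resolving a conflict involves, the following sketch provokes one in a scratch repository and then resolves it. It is an illustration under assumptions, not a prescribed procedure: the file `settings.py`, the branch names and the commit messages are made up, and a local merge of the default branch stands in for pulling a teammate's merged changes from GitLab.

```shell
# Setup for this sketch only (hypothetical names; a scratch repository)
workdir="$(mktemp -d)"
cd "$workdir"
git init --quiet demo
cd demo
git config user.name "Student One"
git config user.email "student.one@example.org"
printf 'greeting = "hello"\n' > settings.py
git add settings.py
git commit --quiet -m "feat: add settings"
default_branch="$(git rev-parse --abbrev-ref HEAD)"  # main or master

# Two feature branches change the same line of the same file
git checkout --quiet -b feature_a
printf 'greeting = "hello from A"\n' > settings.py
git commit --quiet -am "feat: change greeting (A)"

git checkout --quiet "$default_branch"
git checkout --quiet -b feature_b
printf 'greeting = "hello from B"\n' > settings.py
git commit --quiet -am "feat: change greeting (B)"

# feature_a gets merged first (stand-in for a teammate's merged merge request)
git checkout --quiet "$default_branch"
git merge --quiet feature_a

# Bringing feature_b up to date with the default branch now stops with a conflict
git checkout --quiet feature_b
if ! git merge --quiet "$default_branch"; then
    # List the files that still contain conflict markers
    git diff --name-only --diff-filter=U

    # Resolve by editing the file; here we simply write the agreed-upon line
    printf 'greeting = "hello from A and B"\n' > settings.py

    # Mark the conflict as resolved and conclude the merge
    git add settings.py
    git commit --quiet --no-edit
fi
```

   In a real project you would open the conflicting file in your editor, decide together which changes to keep, and remove the conflict markers by hand before staging the file.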

Enjoy version controlling your code! :)