# Testing, Static Code Analysis & Continuous Integration

**Contact:** Alex Kanitz (alexander.kanitz@unibas.ch)

This lesson gives you an overview of testing strategies, how to write simple unit tests (with some common techniques for more complex cases), how static code analysis tools can help you with consistent code formatting and identifying code smells, and how you can automate the running of tests and static code analysis tools to guard against codebase degradation using Continuous Integration.

## Table of Contents

* [Writing tests](#Writing-tests)
  * [`pytest`](#pytest)
    * [A simple example](#A-simple-example)
    * [Monkeypatching](#Monkeypatching)
  * [Code coverage](#Code-coverage)
* [Static code analysis](#Static-code-analysis)
  * [Code style](#Code-style)
  * [Linting](#Linting)
    * [Interactive session: Linters](#Interactive-session:-Linters)
  * [Automatic code formatting](#Automatic-code-formatting)
    * [Interactive session: Black](#Interactive-session:-Black)
  * [Static type checking](#Static-type-checking)
    * [Interactive session: Mypy](#Interactive-session:-Mypy)
* [Continuous Integration](#Continuous-Integration)
  * [Software development life cycle](#Software-development-life-cycle)
  * [Can we automate integration?](#Can-we-automate-integration?)
  * [How does it work?](#How-does-it-work?)
  * [A short note on costs](#A-short-note-on-costs)
  * [A small glossary](#A-small-glossary)
* [GitLab CI/CD](#GitLab-CI/CD)
  * [YAML](#YAML)
  * [Pipeline configuration file](#Pipeline-configuration-file)
    * [Annotated example](#Annotated-example)
    * [Keywords for defining jobs](#Keywords-for-defining-jobs)
    * [Basic example](#Basic-example)
  * [Interactive session](#Interactive-session)
* [Managing Python dependencies](#Managing-Python-dependencies)
* [Homework](#Homework)


## Writing tests

_**"Untested code is broken code."**_  
Martin Aspeli, Philipp von Weitershausen

To keep your sanity while trying to maintain and extend an ever-growing codebase, it is crucially important to not only test your code while you are writing it, but to keep your test cases in your code repository, keep them up-to-date with your codebase and run them whenever you add new code. You should strive to cover _every line of code_ in your codebase.

Testing is a complex subject and it takes a lot of practice. Also, there are various types of tests, primarily:

* _**unit tests**_ test a block of code (typically a function or method) in isolation, thus focusing only on the behavior of that specific code block
* _**integration tests**_ test the behavior of two or more code blocks together
* _**end-to-end tests**_ are a special type of _integration test_ that test the behavior of the entire program

To get you started with testing, here we will **focus on unit tests**, as they are the easiest to write. Having your entire codebase covered by unit tests already goes a very long way in preventing bugs and easing maintenance, especially if you encapsulate and isolate your _code units_ well and minimize _side effects_ (i.e., when a code block is relying on or modifying code outside of its scope).
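To make the notion of a _side effect_ more concrete, here is a minimal sketch. The names `results`, `add_and_store()` and `add()` are made up for illustration and are not part of the course code.

```python
results = []  # state that lives outside of the functions below


def add_and_store(x, y):
    """Add two numbers and append the sum to the module-level list `results`.

    This function has a side effect: it modifies `results`, so what we observe
    after calling it depends on everything that was stored before.
    """
    results.append(x + y)


def add(x, y):
    """Add two numbers and return the sum; nothing outside the function is touched."""
    return x + y


# checking the side-effect-free function only requires looking at its return value
assert add(1, 2) == 3

# checking the other function requires knowing (and resetting) the outside state
results.clear()
add_and_store(1, 2)
assert results == [3]
```

The second function is harder to test in isolation, because every check has to take the current content of `results` into account.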

### `pytest`

[`pytest`](https://docs.pytest.org/) is a widely used package for code testing in Python. Like other packages, it can be installed with 

```
pip install pytest
```

and once installed it can be called with 
```
pytest
```

By default, it collects and runs the test functions (functions whose names start with `test`) found in files called `test_*.py` or `*_test.py` that are located in the current directory or its subdirectories. Which directories and file patterns to search can be changed via command-line parameters of `pytest`. More on customizing `pytest` can be found at https://docs.pytest.org/en/6.2.x/reference.html. For now, we will use the default parameters.

It is a good idea to separate the actual code to be tested from the testing code. Typically, we put the latter in a directory `tests/` in the repository root directory. However, the testing code will need to access the code of the module being tested. The way to do this is to _import_ the module to be tested into the testing code, in a way that does not make assumptions about the directories in which the modules to be tested reside. Given that we know how to package our code, this is simple, as we can install our package by executing the following in the repository root (compare the `EncapsulationPackaging.ipynb` notebook):

```
pip install -e .
```

#### A simple example

Let's now look at a simple example (adapted from https://gist.github.com/bobhsr/4635489). Assume that we have the following code in module `arithmetic/arithmetic.py`:

```python
"""Classes for arithmetics operations."""


class Arithmetic:
    """A python class for basic arithmetic operations for two rational numbers.

    Non-number inputs are attempted to be cast to floats. The behavior for
    passing values that are not (rational) numbers and cannot be easily cast
    to numbers is not well defined.
    """
    def add(self, x, y):
        """Calculate the sum of inputs.

        Args:
            x: Number to be added to `y`.
            y: Number to be added to `x`.

        Returns:
            Sum of inputs.
        """
        return float(x) + float(y)

    def subtract(self, x, y):
        """Calculate the difference between inputs.

        Args:
            x: Number from which `y` is to be subtracted.
            y: Number to be subtracted from `x`.

        Returns:
            Difference between inputs.
        """
        return float(x) - float(y)
```

Let's further assume that we have created a package out of directory `arithmetic/` (by adding `__init__.py`) and installed it by creating a corresponding `setup.py` and installing with `pip install -e .` from the repository root directory.

Now, we want to write code that tests all of the methods in the class `Arithmetic` defined above. As mentioned previously, we will save this code in a `tests/` directory. One way of organizing the tests is to create one module of testing code for each module of code to be tested. Considering the naming conventions that ensure that `pytest` finds the tests, (part of) our project's directory structure will look something like this:

```
├── arithmetic
│   ├── __init__.py
│   └── arithmetic.py
├── setup.py
└── tests
    └── test_arithmetic.py
```

The `test_arithmetic.py` file could look like this:

```python
# imports
import pytest

from arithmetic.arithmetic import Arithmetic  # we import class `Arithmetic` from module `arithmetic` in package `arithmetic`

# create an instance of the Arithmetic class
ar = Arithmetic()


# tests for the `.add()` method
def test_add():
    assert ar.add(1, 2) == 3.0  # we ensure that the addition works as expected for a few cases
    assert ar.add(3, -7) == -4.0
    assert ar.add(-10, -10) == -20.0
    assert ar.add(1, "2") == 3.0  # we also ensure that strings that can be cast to numbers are handled as expected
    assert ar.add("1", 2) == 3.0
    assert ar.add("1", "2") == 3.0
    with pytest.raises(ValueError):  # and we make sure that Python complains if we try to convert letters to numbers
        ar.add(1, "a")
    with pytest.raises(ValueError):
        ar.add("a", 1)
    with pytest.raises(ValueError):
        ar.add("a", "b")


# tests for the `.subtract()` method
# here we are making use of pytest's functionalities to run parametrized tests by creating from a nested list of tuples
# (1) a list `test_input` of input tuples and (2) a list `expected` of expected values
# now let's use those lists to write the tests
@pytest.mark.parametrize(
    "test_input,expected",
    [((1, 2), -1.0), ((3, -7), 10.0), ((-10, -10), 0.0), ((1, "2"), -1.0), (("1", 2), -1.0), (("1", "2"), -1.0)]
)
def test_subtract_param(test_input, expected):  # we need to pass the lists to the test function...
    assert ar.subtract(test_input[0], test_input[1]) == expected


# we can do the same for the tests that raise an error:
@pytest.mark.parametrize(
    "test_input,expected",
    [((1, "a"), ValueError), (("a", 1), ValueError), (("a", "b"), ValueError)]
)
def test_subtract_param_failing(test_input, expected):
    with pytest.raises(expected):
        ar.subtract(test_input[0], test_input[1])
```

The first two lines of code import `pytest` and the `Arithmetic` class, respectively. We then create an instance of the class to be used in our tests. What follows is a block of functions, each designed to test the functionality of one method of the `Arithmetic` class. Specifically, we are using two basic test cases:

1. Asserting a specific result when calling the tested code, with the general syntax:
   ```python
   assert func(x, y) == result
   ```
2. Ensuring that a specific error is raised when calling the tested code, with the general syntax:
   ```python
   with pytest.raises(Error):
       func(x, y)
   ```

When running `pytest` from the repository root directory, all tests will be executed and if we did everything correctly, we will learn that all tests we have set up have passed - yay! :)

> Note that running `pytest` will create a directory `.pytest_cache/` in the repository root directory. Make sure you do _not_ version control it by including it in your `.gitignore` file. If you have automatically created your `.gitignore` file via http://gitignore.io/ and you have selected Python, you will not need to add it manually.

Let's be brave and tweak one of the tests to find out how `pytest` reacts if a test fails. For example, if you do

```python
def test_add():
    with pytest.raises(ValueError):
        ar.add(1, True)
```

`pytest` will fail because no `ValueError` is raised. Can you imagine why not?

Apart from test parametrization, `pytest` offers many more features and "fixtures" (functions that help you set up a test case), such as creating temporary files, checking the screen output etc. Have a look at `pytest`'s [documentation](https://docs.pytest.org/en/6.2.x/contents.html) for more info. But don't be put off by the complexity, you will learn more of it incrementally, when you need it.
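As a small taste of such fixtures, the sketch below uses two of `pytest`'s built-in fixtures: `tmp_path` (a fresh temporary directory, provided as a `pathlib.Path`) and `capsys` (captures what is written to the screen). The helper functions `save_result()` and `report()` are hypothetical and only exist to give the tests something to check.

```python
from pathlib import Path


def save_result(path: Path, value: float) -> None:
    """Write a number to a text file (hypothetical helper)."""
    path.write_text(f"{value}\n")


def report(value: float) -> None:
    """Print a result to the screen (hypothetical helper)."""
    print(f"Result: {value}")


def test_save_result(tmp_path):
    # pytest creates a new, empty temporary directory for this test
    out_file = tmp_path / "result.txt"
    save_result(out_file, 3.0)
    assert out_file.read_text() == "3.0\n"


def test_report(capsys):
    # everything printed inside the test is captured by `capsys`
    report(3.0)
    captured = capsys.readouterr()
    assert captured.out == "Result: 3.0\n"
```

Fixtures are requested simply by naming them as arguments of the test function; `pytest` takes care of creating and cleaning them up.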

There are, however, two important aspects of testing that we would like to highlight.

#### Monkeypatching

As mentioned before, in _unit tests_ we are testing blocks of code in isolation. But what if our code depends on the functionality of third-party code? Should our tests extend to documented (and hopefully tested) behavior of other people's code? For unit tests, the answer is "It depends!". While we should never go as far as writing entire unit tests for other people's code, if our code depends on it, we should usually try to cover with our tests all responses (return values or errors) we may reasonably expect from it. However, we should also make sure that our tests run quickly and ideally do not depend on too many dependencies, e.g., the availability of an external service or database. So what can we do to avoid it?

The answer is _**monkeypatching**_ or _**mocking**_, the practice of overriding code to return a specific, well-defined response.

There are several use cases for monkeypatching (see [here](https://docs.pytest.org/en/6.2.x/monkeypatch.html) for some more), but let's focus on the example mentioned earlier, the dependency on an external service that we are trying to call via HTTP. Consider this example code (adapted from [`pytest`'s documentation](https://docs.pytest.org/en/6.2.x/monkeypatch.html#monkeypatching-returned-objects-building-mock-classes)):

```python
# contents of app.py, a simple example where a JSON response is retrieved from
# a web service available at a specified URL and is serialized into a Python dictionary
import requests


def get_json(url):
    """Takes a URL, and returns the JSON."""
    r = requests.get(url)
    return r.json()
```

So what if the server behind the URL is down when we are running our tests? The tests would likely fail and we might scratch our heads, thinking that something is wrong with our own code, when in fact it was just a server outage. Of course, you should _also_ account in your code for the possibility that such outages occur and test for _that_ behavior, but you will also need to test the expected _normal_ behavior, so in comes _monkeypatching_:

```python
# contents of test_app.py, a simple test for our API retrieval

# import requests for the purposes of monkeypatching
import requests

import pytest

# our app.py that includes the `get_json()` function
# see the previous code block example
import app

# custom class to be the mock return value
# will override the `requests.Response` object returned from `requests.get()`
class MockResponse:

    # `requests.Response` has a `.json()` method that we are relying on
    # our mock `.json()` method will always return a specific, well-defined testing dictionary
    @staticmethod
    def json():
        return {"mock_key": "mock_response"}


# now let's write our test case
def test_get_json(monkeypatch):  # `monkeypatch` is a built-in pytest fixture; naming it as an argument is enough to use it

    # here we are defining a mock method: whatever arguments are passed to it,
    # mock_get() will always return our mocked object `MockResponse`, which only has the .json() method.
    def mock_get(*args, **kwargs):
        return MockResponse()

    # now let's monkeypatch requests.get with mock_get
    # we do this by telling `monkeypatch` to set the `.get` attribute (here: a function)
    # of the `requests` module to the `mock_get` function so that whenever `requests.get()` is called
    # our `mock_get()` function is called, which returns our `MockResponse` object
    monkeypatch.setattr(requests, "get", mock_get)

    # now let's call `app.get_json()`, which contains `requests.get()` (see previous code block)
    # it will use our monkeypatch...
    result = app.get_json("https://fakeurl")
    # ...and so our JSON response returns our pre-defined dictionary!
    assert result["mock_key"] == "mock_response"
```

As we have seen, we set our test up in a way that it isn't actually making that call to the `url` anymore, but rather we simply get the response that we put in. For the simple code block we have tested, this may seem somewhat pointless, because there's basically _nothing else_ but that call to the URL and so why bother testing it if we basically then don't actually make that call and just tell it what to return us instead. We're just getting out what we put in - way to test! But now imagine that there's more in `get_json()`, some processing that actually needs proper testing? In that case, we would be able to test that code independently of the availability of a server at URL `url` or in the absence of an internet connection.

Apart from setting/overriding and deleting class attributes, the `monkeypatch` fixture also has methods to set and delete dictionary items and environment variables. And while it is difficult for beginners to estimate when to use monkeypatching in practice and we will probably not need to make use of it for our coding project, we feel it is important for you to know that overriding external code, data and variables is _possible_ should the need ever arise (and it soon will if you start doing something more complex!).
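A minimal sketch of these other `monkeypatch` methods could look as follows. The environment variable `DATA_DIR`, the function `get_data_dir()` and the dictionary `CONFIG` are assumptions made up for this illustration; `setenv()`, `delenv()` and `setitem()` are methods of the `monkeypatch` fixture.

```python
import os

CONFIG = {"retries": 3}  # hypothetical module-level settings


def get_data_dir():
    """Return the data directory, read from the (hypothetical) DATA_DIR variable."""
    return os.environ.get("DATA_DIR", "/default/data")


def test_get_data_dir_set(monkeypatch):
    # set an environment variable for the duration of this test only
    monkeypatch.setenv("DATA_DIR", "/tmp/test_data")
    assert get_data_dir() == "/tmp/test_data"


def test_get_data_dir_unset(monkeypatch):
    # make sure the variable is absent; do not complain if it was never set
    monkeypatch.delenv("DATA_DIR", raising=False)
    assert get_data_dir() == "/default/data"


def test_config_override(monkeypatch):
    # override a single dictionary item for the duration of this test only
    monkeypatch.setitem(CONFIG, "retries", 0)
    assert CONFIG["retries"] == 0
```

All changes made through `monkeypatch` are undone automatically once the test function has finished, so tests do not influence each other.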

### Code coverage

The last aspect of testing we want to touch upon is the concept of _**code coverage**_, which is defined as the percentage of all code statements that are covered by the entirety of available test cases. Say your code consists of 100 statements, but your test cases never run 30 of these statements; then your _code coverage_ is 70%. As mentioned before, we should be striving to have _all_ statements covered by tests, so in terms of _code coverage_, we are striving for 100% - a very high bar! But again, use/publish untested code at your own peril - sooner or later it's gonna come back at you hard!

There is a nice Python package [`coverage`](https://coverage.readthedocs.io/en/6.0.2/) that conveniently allows you to calculate your code coverage. It is easy to install:

```bash
pip install coverage
```

...and use (call from repository root directory):
```bash
coverage run --source=code_directory/ -m pytest
# where code directory is the _top-level_ directory containing the code to be tested
```

This will calculate the _code coverage_ across all modules inside the `code_directory/` directory and all its subdirectories and write output to a file `.coverage` that is created in the repository root directory.

> Similar to the `.pytest_cache/` directory mentioned above, make sure not to version control the `.coverage` file. And again, if you have automatically created your `.gitignore` file via http://gitignore.io/ and you have selected "Python", you don't need to worry about adding it manually.

You can also specify more than one directory at a time, like so:

```bash
coverage run --source=code_directory_1/,code_directory_2/ -m pytest
```

In order to see the calculated _code coverage_, execute the following:

```bash
coverage report -m
```

The `-m` flag ensures that the lines of code that are not covered by tests are explicitly mentioned in the output, which is very useful to tell you what tests you still need to implement. Luckily for our `arithmetic` example, there are no statements missing (not surprising, because our methods were just single lines of code with no conditionals/branching), so we are at 100% coverage:

```console
Name                       Stmts   Miss  Cover   Missing
--------------------------------------------------------
arithmetic/__init__.py         0      0   100%
arithmetic/arithmetic.py       5      0   100%
--------------------------------------------------------
TOTAL                          5      0   100%
```
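To illustrate how a coverage gap can arise, consider the following sketch with a conditional. The function `safe_divide()` is hypothetical and not part of the `arithmetic` package. If only the first test existed, the `return None` line would never be executed and we would expect it to show up in the `Missing` column of the report.

```python
def safe_divide(x, y):
    """Divide `x` by `y`, returning None for a division by zero (hypothetical example)."""
    if float(y) == 0.0:
        return None  # only executed when `y` is zero
    return float(x) / float(y)


def test_safe_divide():
    # exercises only the "normal" branch of the function
    assert safe_divide(6, 3) == 2.0


def test_safe_divide_by_zero():
    # needed to also execute the `return None` line
    assert safe_divide(1, 0) is None
```

Together, the two tests execute every statement of `safe_divide()`.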

## Static code analysis

Here we learn about tooling that can analyze our code at rest (therefore _static_) and give us valuable feedback for improvement. We will also learn about tools that can help us keep consistency in our codebase.

### Code style

While it may perhaps not sound overly important for small scripts, as soon as projects grow to a certain volume, having a consistent coding style throughout the codebase is a tremendous time saver, as it helps you to find your way through your codebase with ease. But apart from saving time, keeping a certain degree of order in your codebase also helps reduce the risk of introducing bugs. The analogy would be living in a messy vs. a tidy room. You may be fine, because you feel like you are the master of your own chaos. But what if your entire house is messy? It can be cumbersome. But now imagine that other people have to search through that mess and find things in it - clearly, when working on collaborative projects, keeping a uniform coding style is crucial!

Among Python's many (hundreds by now) "Python Enhancement Proposals" (PEP), one of the first, [PEP 8](https://www.python.org/dev/peps/pep-0008/), defines a style guide for Python code. The guide is not very dogmatic and features relatively loose recommendations, conventions and guidelines on, e.g., code layout, naming conventions, commenting and typing. It is essential reading especially for Python beginners.

For an example of a more concrete style guide, check out the [Google Python Style Guide](https://google.github.io/styleguide/pyguide.html). It is a decent choice for your own projects, although it is of course perfectly alright to stick to your own preferences for some or all style choices (it is strongly recommended that you follow PEP 8, though!). The most important thing is that the style of a given codebase is consistent! Therefore, if you are contributing to a collaborative project, it is important that you follow the style in which the existing codebase is written. If no code exists yet, discuss the style to adopt with your fellow contributors.

### Linting

Linting refers to the automated checking of your codebase for style issues and style-related _code smells_. _**Linters**_ are tools that can be run to analyze your codebase, and they are available for most programming languages, domain-specific languages and markup languages. In fact, there are often different _linters_ available for different _aspects_ of different languages, e.g., those that focus on enforcing particular coding style guides, documentation or on functionally assessing your code based on type information (very cool!).

Two of the most popular linters for Python are:
* [Pylint](https://pylint.org/), perhaps the oldest and most widely used one, but also (by default) the strictest
* [Flake8](https://flake8.pycqa.org/), somewhat less strict than Pylint with default settings, but still good enough to achieve decent results

> Note that it is possible (and frequently done) to use _more than one_ linter at a time, even if these linters focus on the same aspects.  
>  
> Also note that if you are using a commonly used code editor, such as [Visual Studio Code](https://code.visualstudio.com/), you can easily configure linters to run automatically every time you save your code, or even continuously. We strongly recommend that you set this up for yourself, as it is a huge time saver and, given that you get immediate feedback if you go wrong, you will learn to stick to a clean coding style much better.

#### Interactive session: Linters

We will be using both the [`flake8`](https://flake8.pycqa.org/) (with the [docstrings extension](https://github.com/PyCQA/flake8-docstrings)) and [`pylint`](https://www.pylint.org/) linters for our collaborative coding project. They are very easy to install:

```bash
# move to your project repository, then execute:
pip install flake8
pip install flake8-docstrings
pip install pylint
```

> Like all other linters, and indeed all tools presented during this session, `flake8` and `pylint` can be configured to your heart's content. However, we will be using the defaults for now.

To have the tools lint your code, run:

```bash
flake8 --docstring-convention google code_directory/ tests/
pylint code_directory/ tests/
```

> The docstrings extension makes sure that not only our code, but also the docstrings are linted!
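To give an idea of the kind of issues reported, here is a small made-up module (the functions `mean_messy()` and `mean()` are hypothetical). The comments name some of the messages we would expect with default settings; the exact set of messages depends on the tool versions and configuration.

```python
import os  # never used: flake8 reports F401, Pylint reports `unused-import`


def mean_messy(values):  # no docstring: flake8-docstrings reports D103
    total=sum(values)  # missing whitespace around `=`: flake8 reports E225
    return total / len(values)


def mean(values):
    """Calculate the arithmetic mean of a non-empty list of numbers."""
    total = sum(values)
    return total / len(values)
```

Both functions behave identically when run; the linters only complain about the style and documentation of the first one (and about the unused import).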

You will likely see quite a lot of issues that these linters find with your code, yes? :)  

_**Do you want to share your Pylint code rating?**_  

As you can imagine, it will be part of your homework to fix all of these issues.

### Automatic code formatting

In addition to linters, there are also configurable tools that automatically reformat your code according to style conventions such as PEP 8. These are typically called _code formatters_.

The most popular code formatter for Python is [Black](https://github.com/psf/black) and we will try it out in a minute.

> While code formatters can be very helpful, we do not necessarily recommend them for beginners, as it is better to put work into developing good habits rather than relying on software. Also, they come with the downside that it is difficult to automate their use, because they cannot be easily integrated into a CI pipeline, given that they actually _change_ the code (something that shouldn't normally happen during CI). Therefore, it may be better to run them _before_ committing your code - something that can be automated via [Git (pre-commit) hooks](https://git-scm.com/book/en/v2/Customizing-Git-Git-Hooks).

#### Interactive session: Black

Let's see [Black](https://github.com/psf/black) in action!

First install the tool:

```shell
pip install black
```

Then run it on your code:

```shell
black code_directory/ tests/
```

To see what Black did to your code, just run a `git diff`. Also, rerun the linter commands from above and you will probably see that at least some of the complaints have disappeared - nice!
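As an illustration of the kind of changes to expect, the sketch below shows a made-up snippet before (in comments) and after formatting. The names `scale` and `labels` are hypothetical, and the "after" version is what we would expect from Black's default settings rather than captured tool output.

```python
# before formatting (as it might have been typed):
#
#     def scale(values,factor = 2):
#         return [ v*factor for v in values ]
#     labels = {'a':1,'b':2}

# after formatting (expected result with default settings):
def scale(values, factor=2):
    return [v * factor for v in values]


labels = {"a": 1, "b": 2}
```

Note that only the layout changes (spacing, quotes, blank lines); the behavior of the code stays the same.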

### Static type checking

We have recently learned how to add type hints/annotations to your code. But we have also mentioned that they are only really useful if we use a static type checker like [`mypy`](http://mypy-lang.org/) to actually analyze the types and tell us if there are any issues with them.

This is indeed very useful, because issues with incompatible types (and Python, being a dynamically typed language, will not complain about these until we run into such issues during runtime) are sometimes hard to spot and debug.
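The following minimal sketch shows the kind of problem meant here. The annotated function `add()` is a made-up example; the second call passes a string where a `float` is annotated, which a static type checker can flag without running the code, while Python itself only notices when the line is executed.

```python
def add(x: float, y: float) -> float:
    """Calculate the sum of two numbers."""
    return x + y


total = add(1.0, 2.0)  # argument types match the annotations

# This call contradicts the annotation of `y`; at runtime it raises a TypeError
# because a float and a string cannot be added.
try:
    add(1.0, "2")
except TypeError as exc:
    print(f"runtime error: {exc}")
```

In a real codebase, such a call may be buried in a rarely executed branch, which is exactly where a static check is most valuable.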

So let's try out `mypy` on our code base as well.

#### Interactive session: Mypy

Let's first install [Mypy](http://mypy-lang.org/):

```shell
pip install mypy
```

Then run it:

```shell
mypy code_directory/  # probably we don't need to run this on our tests!
```

Did it find any issues?

> Note that this will only give you any complaints if you have already added type hints to your code!

## Continuous Integration

As we develop large software packages in a multi-programmer environment, we want to minimize the opportunity to introduce bugs/errors, and the time it takes to fix them when they _do_ occur (and they will!). Here we will learn about _Continuous Integration_, a technique to guard against degradation of your codebase.

### Software development life cycle

Various principles, paradigms and [philosophies of modern software development](https://en.wikipedia.org/wiki/List_of_software_development_philosophies) have been proposed. Several are heavily reliant on the principles of Continuous Integration and Delivery/Deployment, such as _**Agile software development**_, a set of software development guidelines centered around the concept of delivering a minimum viable product as fast as possible, then improving it incrementally. A simplified Agile-based software development life cycle involving continuous integration (testing, documentation, delivery) could look like this:

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;```Analysis```  
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&#8663;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&#8664;  
```Integration```&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;```Design```  
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&#8662;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&#8665;  
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;```Implementation```

You _**analyze**_ your requirements, _**design**_ the code (and the tests!), _**implement**_ it and _**integrate**_ it into the existing codebase (if available). Then start the cycle over again for any additional features, fixes etc.

### Can we automate integration?

Manually checking the integrity of your entire codebase for every iteration of the software development life cycle is boring work. And boring things tend to be forgotten or skipped. You may think: "Oh, but I just added these few lines, I'm _sure_ they will not cause any issues!" And, perhaps, nine times out of ten they won't. But this one time they will, and your codebase degrades. So wouldn't it be great if we could automate the integration and make sure it runs _every time_ the codebase is modified?

Let's assume that we are conscientious coders that write comprehensive tests for every piece of code that we add to our projects. So what we want is that all tests are run automatically for each proposed code change to ensure that (1) the _new_ code behaves as expected and (2) no inconsistencies have been introduced to the _existing_ codebase. The general concept of automating the integration of new code into an existing code base, including testing, static code analysis and any other automatable processes to help keep the codebase in a robust, maintainable state, is known as _**Continuous Integration (CI)**_.

### How does it work?

There are various tools available for your continuous integration and delivery needs, each with their own strengths and weaknesses. Generally, the solutions have you specify all of the steps you would like to automate (running tests, style checks or virtually anything else you can imagine), their sequence, the conditions under which you want to execute them and the environment in which to execute them in a _**continuous integration pipeline (or workflow)**_. Most commonly, this happens in a _declarative_ manner, using a set of keywords to define the sequence, conditions and environments in which commands are executed. The resulting document is then used by the CI engine to set up an isolated environment in which the steps are executed in the defined manner.

Social coding platforms such as [GitLab](https://gitlab.com/) and [GitHub](https://github.com/), as well as other Git servers typically support triggering one or more CI solutions out-of-the-box, i.e., they make it very easy for you to set up running your CI pipelines whenever there is code pushed to your repository or a merge/pull request is opened. And perhaps most importantly, you can **set up your repository in a way that code can _only_ be merged if your CI pipeline passes**!

We will try all of this out further below with GitLab.

### A short note on costs

Running your tests on dedicated infrastructure incurs **costs**. However, companies offering CI solutions typically offer **free plans** which will often be sufficient to support the development of small projects and limited testing (typically you are awarded a number of free minutes per time frame, e.g., at the time of writing, the free GitLab plan comes with 400 minutes per month, the free GitHub plan comes with 2'000 minutes per month).

But these free minutes may quickly run out if your tests get more involved or you want to run your tests in multiple environments. For example, running your test suite against multiple versions of Python, or worse, against multiple Python versions _and_ multiple operating systems (and perhaps multiple database versions?) - the "job matrix" can quickly get very extensive for such combinations!
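A quick back-of-the-envelope calculation shows how fast such a job matrix grows. All values below (the versions, operating systems, database versions and the 5 minutes per job) are assumptions chosen purely for illustration; only the 400 free minutes are taken from the GitLab example above.

```python
from itertools import product

# hypothetical environments we would like to test against
python_versions = ["3.8", "3.9", "3.10"]
operating_systems = ["linux", "macos", "windows"]
database_versions = ["12", "13"]

# every combination of the three lists is one job
job_matrix = list(product(python_versions, operating_systems, database_versions))

minutes_per_job = 5  # assumed average duration of a single job
minutes_per_run = len(job_matrix) * minutes_per_job
runs_per_month = 400 // minutes_per_run  # with 400 free minutes per month

print(f"{len(job_matrix)} jobs, {minutes_per_run} minutes per pipeline run")
print(f"{runs_per_month} full pipeline runs per month")
```

Under these assumptions, 3 x 3 x 2 = 18 jobs take 90 minutes per pipeline run, so the free minutes would be used up after only 4 runs.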

Luckily, CI providers also frequently offer additional free resources for Open Source Software and/or for academic/educational use. If you end up using CI frequently and you realize that you are exceeding your limit for writing scientific software, check whether your provider offers such additional free resources.

Alternatively, your employer/institution may offer bulk licenses or dedicated machines for running CI workloads on local infrastructure. For example, if your group is paying for [**sciCORE**](https://scicore.unibas.ch/) resources and you are eligible to use [sciCORE's GitLab instance](http://git.scicore.unibas.ch/) for your development, you can ask them to set up a _runner_ for executing your continuous integration pipelines. Indeed, for this course, we will make use of such infrastructure: our testing jobs are run on a dedicated testing machine that was set up for us by sciCORE (and for which we pay!).

### A small glossary

When reading up on Continuous Integration, you will often encounter related terms that are often overlapping in meaning to some extent. We would like to briefly discuss and try to distinguish them from one another as best as possible:

* _**Continuous Delivery or Continuous Deployment (CD)**_ refers to techniques that automate rolling out code changes to the consumers, e.g., by preparing releases (bump version, add Git tag, compile package, upload to package repository etc.) or updating a web service (take down old version, deploy new version). CD pipelines are typically run right after CI pipelines (which is why you will often encounter the abbreviation _**CI/CD**_), though often by different teams or even organizations. Note that CI and CD are not always fully distinguishable, as CI and CD pipelines typically contain some of the same components, e.g., both will usually contain instructions to run tests.
* _**Continuous Testing**_ specifically refers to the practice of automating testing of all individual pieces of a codebase ("unit testing") as well as the codebase as a whole ("integration testing" / "end-to-end testing"). As such it is conceptually a subset of both CI and CD. Often, different tests are being run for different pipelines, e.g., long-running tests like end-to-end tests are often omitted from CI pipelines so as to speed up the integration of new code by not blocking developers. To mitigate the resulting higher risk of introducing errors into the codebase, those time-intensive tests are typically run inside of CD pipelines, which often have layered deployment environments to run extensive tests (e.g., development, staging, pre-production) before rolling out changes to consumers ("production environment").
* _**Continuous Documentation**_, similar to _Continuous Testing_, can be regarded as the subset of CI that deals with automating maintenance and quality control measures of documenting code.

## GitLab CI/CD

Given that we are using [sciCORE's GitLab instance](https://git.scicore.unibas.ch/) as our social coding platform of choice, we will be using [GitLab CI/CD](https://docs.gitlab.com/ee/ci/), GitLab's own CI/CD engine and corresponding "language" for defining and running our pipelines. The general process will be the same for other platforms and CI solutions (e.g., [GitHub Actions](https://docs.github.com/en/actions), [Travis CI](https://travis-ci.com/), [CircleCI](https://circleci.com/)).

Like most CI engines, GitLab CI/CD roughly works in the following manner:

1. Create an environment in which to run the code.
2. Get the code from the repository.
3. Run the commands specified in the CI pipeline in the specified order and under the specified conditions.
4. Record the screen output of all commands and return whether the pipeline succeeded (no errors produced) or not.
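
Steps 3 and 4 above can be illustrated with a minimal Python sketch. It is not how GitLab CI/CD is implemented; the function name `run_pipeline` is hypothetical, and creating the environment and fetching the code (steps 1 and 2) are left out. The sketch assumes a POSIX shell is available.

```python
import subprocess


def run_pipeline(commands):
    """Run shell commands in the given order and stop at the first failure.

    Returns a tuple (succeeded, log), where log holds the recorded output
    of every command that was executed.
    """
    log = []
    for command in commands:
        result = subprocess.run(
            command,
            shell=True,           # interpret the command like a shell would
            capture_output=True,  # record stdout and stderr
            text=True,
        )
        log.append(f"$ {command}\n{result.stdout}{result.stderr}")
        if result.returncode != 0:
            # a non-zero exit code marks the whole pipeline as failed
            return False, log
    return True, log


succeeded, log = run_pipeline(['echo "Command 1"', 'echo "Command 2"'])
print("\n".join(log))
print("Pipeline succeeded" if succeeded else "Pipeline failed")
```

The important point is that success is decided by exit codes: any command that exits with a non-zero status fails the pipeline, and later commands are not run.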

From the user, the following is required to activate continuous integration for a specific repository (or a group of repositories):

1. Write a [pipeline configuration file](#Pipeline-configuration-file).
2. Commit and push it to your remote on GitLab.
3. Configure under which conditions the pipeline is triggered.

Before we have a look at how all of this plays out for a very simple example, we first need to make a small detour to have a brief look at the file format that is used to define GitLab CI/CD pipelines.

### YAML

[YAML](https://yaml.org/) (a recursive acronym for "YAML Ain't Markup Language"; originally "Yet Another Markup Language") is a broadly used language for *serialization*, enabling a general definition of key-value pairs. It is designed to be a superset of the popular JSON (JavaScript Object Notation) format (in other words: JSON is valid YAML), but it allows for a relaxed, more human-friendly notation, making it very popular for writing configuration and other files that have to be edited by humans. As such, it is used to define pipelines for several CI and other automation solutions.

YAML files

* use the file extensions `.yaml` or `.yml`
* start with three dashes (`---`) and end with three dots (`...`) to delineate the beginning and end of the file, respectively (optional in many cases)
* are structured based on indentation (use only spaces, _not_ tabs!), with deeper levels in the hierarchy being indicated by deeper indentation


Let's have a look at an example file (taken from [this tutorial](https://www.cloudbees.com/blog/yaml-tutorial-everything-you-need-get-started)):

```yaml
---
 doe: "a deer, a female deer"
 ray: "a drop of golden sun"
 pi: 3.14159
 xmas: true
 french-hens: 3
 calling-birds:
   - huey
   - dewey
   - louie
   - fred
 xmas-fifth-day:
   calling-birds: four
   french-hens: 3
   golden-rings: 5
   partridges:
     count: 1
     location: "a pear tree"
   turtle-doves: two
...
```

The key-value pair structure is apparent, as is the hierarchical nature of the file. For example, `calling-birds` is an array containing 4 elements (each on a separate line, indented with respect to the key and starting with a dash character). When parsed, e.g., in Python, its value will be interpreted as a list:

```python
# hyphens are not allowed in Python names, hence the underscore
calling_birds = ['huey', 'dewey', 'louie', 'fred']
```

In contrast, `xmas-fifth-day` will be interpreted as a nested dictionary:

```python
# hyphens are not allowed in Python names, hence the underscores
xmas_fifth_day = {
    'calling-birds': 'four',
    'french-hens': 3,
    'golden-rings': 5,
    'partridges': {
        'count': 1,
        'location': 'a pear tree',
    },
    'turtle-doves': 'two',
}
```
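
To try out this mapping without installing anything, the following sketch uses the fact, mentioned above, that JSON is valid YAML: it writes part of the example in JSON notation and parses it with Python's built-in `json` module. This is only an illustration; to parse actual YAML notation you would need a third-party package (e.g., PyYAML and its `yaml.safe_load()` function), which is assumed not to be installed here.

```python
import json

# Part of the YAML example from above, written in JSON notation.
# As YAML is designed as a superset of JSON, this text is also valid YAML.
document = """
{
  "doe": "a deer, a female deer",
  "pi": 3.14159,
  "xmas": true,
  "calling-birds": ["huey", "dewey", "louie", "fred"],
  "xmas-fifth-day": {
    "golden-rings": 5,
    "partridges": {"count": 1, "location": "a pear tree"}
  }
}
"""

data = json.loads(document)

# arrays become lists, nested key-value pairs become dictionaries
print(type(data["calling-birds"]).__name__)              # list
print(data["calling-birds"][0])                          # huey
print(data["xmas-fifth-day"]["partridges"]["location"])  # a pear tree
print(data["xmas"])                                      # True
```

Note that the hyphenated keys are perfectly fine as dictionary keys (strings), even though they could not be used as Python variable names.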

### More info

* [YAML specification](https://yaml.org/spec/1.2.2/)
* [YAML tutorial](https://learnxinyminutes.com/docs/yaml/)
* [YAML idiosyncrasies](https://docs.saltproject.io/en/3000/topics/troubleshooting/yaml_idiosyncrasies.html)
* [YAML validator](https://codebeautify.org/yaml-validator)

### Pipeline configuration file

We will now get to writing a very simple pipeline configuration file. **By default, it must be defined in a file named `.gitlab-ci.yml` inside the repository's root directory!**

As you can guess by the file extension `.yml`, the expected file format for GitLab CI/CD configuration files is YAML. But of course you cannot just use any old YAML file. Rather, you need to use a specific set of pre-defined keywords so that GitLab CI/CD is able to understand what you want it to do.

Specifically, the GitLab CI/CD configuration files are structured in terms of _**stages**_ and _**jobs**_. The following table summarizes their uses:

| Component | Execution mode | Use case | Details |
| --- | --- | --- | --- |
| `stage` | sequential | Group one or more (related) jobs | The pipeline is executed stage-by-stage. There are five default stages (`.pre`, `build`, `test`, `deploy`, `.post`) that are executed, stage-by-stage, in the indicated order. With the exception of the `.pre` and `.post` stages, the order can be overridden by the `stages` array. Custom stages can also be defined. By default, jobs not explicitly associated with a particular stage are run in the `test` stage. |
| `job` | parallel | Group one or more (related) tasks | Tasks are basically just Shell commands that are passed to the `script` keyword of a job as an array. Multiple tasks are interpreted/executed sequentially, just like in your Shell, but as long as enough resources are available, jobs associated with one and the same stage are run in parallel (note that free plans for CI solutions often only offer 1 thread to be used at a time). The CI engine progresses to the next stage only when all jobs in a stage finish. |

#### Annotated example

```yaml
# Adapted from https://docs.gitlab.com/ee/ci/pipelines/pipeline_architectures.html#basic-pipelines

image: alpine:3.14.2  # define the environment in which the tests are run
                      # here: a Docker image of Alpine, a very small Linux distribution
                      # note that it is good practice to always include the version of
                      # the image/environment (here: 3.14.2) and not use the `latest` tag

stages:  # define the stages used and the order in which they are executed,
         # i.e., jobs associated with stage `build` will run first, followed
         # by the jobs associated with stage `test`
  - build
  - test

build_a:  # this is the job name; any YAML-compliant keyword is fine
          # except keywords reserved by GitLab CI/CD (e.g., `stages`):
          # see here: https://docs.gitlab.com/ee/ci/yaml/index.html#unavailable-names-for-jobs
  stage: build  # ...and the stage the job will be associated with
  script:  # put your tasks into the script array
    - echo "This job builds something."

build_b:
  stage: build
  script:
    - echo "This job builds something else."
    - echo "It will start at around the same time as build_a."  # this task is executed _after_ the previous one

test_a:
  stage: test
  script:
    - echo "This job tests something."
    - echo "It will only run when all jobs in the build stage are complete."

test_b:
  stage: test
  script:
    - echo "This job tests something else."
    - echo "It will only run when all jobs in the build stage are complete."
    - echo "It will start at about the same time as test_a."
```

To reiterate, _stages_ are run sequentially:

```
                               Time
START |--------------------------------------------------->| END

       <-- build --------------->
                                 <-- test ---------------->
```

Within a given stage, the corresponding _jobs_ are run in parallel:

```
                               Time
START |--------------------------------------------------->| END

       <-- build.build_a ------->
       <-- build.build_b -->
                                 <-- test.test_a ---->
                                 <-- test.test_b --------->
```
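
The scheduling behavior shown in the two diagrams can be imitated with a small Python simulation. This is a sketch only and not GitLab's scheduler: the `PIPELINE` dictionary, the function names and the job durations are made up for illustration, and jobs are simulated by sleeping.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical pipeline: stage name -> {job name: duration in seconds}
PIPELINE = {
    "build": {"build_a": 0.2, "build_b": 0.1},
    "test": {"test_a": 0.1, "test_b": 0.2},
}


def run_job(stage, job, duration, events):
    """Pretend to run a job by sleeping; record its start and end."""
    events.append(("start", stage, job))
    time.sleep(duration)
    events.append(("end", stage, job))


def run_pipeline(pipeline):
    """Run stages one after another and the jobs of each stage in parallel."""
    events = []
    for stage, jobs in pipeline.items():  # dictionaries keep insertion order
        with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
            futures = [
                pool.submit(run_job, stage, job, duration, events)
                for job, duration in jobs.items()
            ]
            for future in futures:
                future.result()  # wait for the job and surface any error
        # only now, with all jobs of the stage finished, does the next one start
    return events


for event in run_pipeline(PIPELINE):
    print(*event)
```

In the printed events, the order of jobs within a stage may vary between runs, but every `build` event appears before the first `test` event.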

#### Keywords for defining jobs

Let's have a closer look at some of the keywords that we can use to define CI jobs (see [here](https://docs.gitlab.com/ee/ci/yaml/index.html#job-keywords) for an exhaustive list):

| Keyword | Description |
| --- | --- |
| `image` | Docker image (`<image>:<version>`, default version `latest`) to be used; default is to use DockerHub, otherwise the registry also has to be specified |
| `stage` | Defines the stage with which the job is associated |
| `before_script` | Set of commands that are executed before a job; useful to set up an environment |
| `script` | Shell script (array of commands) to be executed |
| `after_script` | Set of commands that are executed after a job, even if it fails; useful to tear down an environment |
| `allow_failure` | If set to true, the failure of the job will not cause the pipeline to fail |
| `only` / `except` | Used to control when jobs are/are not added to pipeline, based on branch names |
| `variables` | Define variables to be passed to a job; [more on predefined CI/CD variables](https://docs.gitlab.com/ee/ci/variables/predefined_variables.html) |

> Note that some of the job keywords can also be used in a _global_ context, i.e., outside of a job description (e.g., `image` or `variables`) to set **custom default values** that are applied to all jobs (unless specifically overridden inside a job). More info: https://docs.gitlab.com/ee/ci/yaml/index.html#custom-default-keyword-values
>  
> Also note that there are **global keywords** that can _only_ be used in the global, but not in the job context, e.g., `stages`. More info: https://docs.gitlab.com/ee/ci/yaml/index.html#global-keywords
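
The core idea of default values that jobs can override can be sketched in a few lines of Python. This is a simplified illustration, not GitLab's actual merge logic (which has additional rules); the dictionaries, the job names and the function `effective_config` are hypothetical.

```python
# Hypothetical, simplified stand-ins for a parsed pipeline configuration
defaults = {"image": "python:3.10-slim-buster", "tags": ["docker"]}
jobs = {
    "my_tests": {"script": ['echo "Command 1"']},
    "legacy_tests": {
        "image": "python:3.9-slim-buster",  # overrides the default image
        "script": ['echo "Command 2"'],
    },
}


def effective_config(defaults, job):
    """Fill in default values; values set inside the job take precedence."""
    return {**defaults, **job}


for name, job in jobs.items():
    print(name, "->", effective_config(defaults, job)["image"])
# my_tests -> python:3.10-slim-buster
# legacy_tests -> python:3.9-slim-buster
```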

More info:

* [Official documentation](https://docs.gitlab.com/ee/ci/)
* [Pipeline validator](https://docs.gitlab.com/ee/ci/lint.html)
* [Pipeline editor](https://docs.gitlab.com/ee/ci/pipeline_editor/)
* [Examples](https://docs.gitlab.com/ee/ci/examples/)

#### Basic example

We will not go into more detail on GitLab CI/CD, mostly because you may be using another ecosystem such as GitHub, and would then use another CI solution such as GitHub Actions instead. But we do recommend that you learn more about the particular flavor of CI solution that you have access to. For the scope of this course, it will be enough to just run a couple of quick tests and other simple commands. We can easily dispense with a lot of the extras that the GitLab CI/CD configuration offers until we have more complex use cases, e.g., involving integration tests, tests across multiple different environments (Python/OS versions/browsers), instructions for automated releases/deployments etc.

So let's have a look at a really basic example:

```yaml
default:
  image: python:3.10-slim-buster
  tags:
    - docker

my_tests:
  # Good to put a small description here of what this job does
  before_script:
    - pip install -r requirements.txt
  script:
    - echo "Command 1"
    - echo "Command 2"
```

Et voilà! Here, job `my_tests` in stage `test` (not indicated, hence associated with the default stage, whose name is `test`) is executed. It consists of the _sequential_ execution of commands 1 and 2 in a Python environment (version 3.10, based on the Linux distribution Debian Buster) _after_ all required Python packages have been installed from file `requirements.txt` inside the repository root directory. Obviously, that `pip install` call will fail if there is no such file available.

Note that we are requiring a _Docker_-based GitLab CI/CD _runner_ for the execution of the pipeline. We can tell because we are providing a default _image_ to use for every job (under the global `default` keyword), i.e., a container _image_ that contains the environment in which the jobs will be run. Docker is very convenient for defining such environments easily and we will learn more about it in one of the next sessions. However, it is important to note that GitLab CI/CD also provides runners that are _not_ based on Docker. With these, we may need to make sure that required software (e.g., Python) is already available on the system we run the tests on.

> In the Zavolan lab, we have a dedicated (virtual) machine that runs our test jobs, and we have two GitLab CI/CD runners defined that check for CI/CD pipelines defined in any of our repositories (including the ones for your projects). One runner is Docker-based, the other one (the default runner) is not. Hence, whenever we want a GitLab CI/CD pipeline to be handled by the Docker-based runner, we need to tell GitLab which runner to use. To do so, we specify the `docker` tag. Note that this name is arbitrary and was defined by us when registering the runner. It is **not** a reserved keyword of GitLab and therefore won't just magically work if you set up GitLab CI/CD yourself on gitlab.com or in a sciCORE GitLab repository that is not associated with the Zavolan lab GitLab group. So, **using the `docker` tag _only_ works on projects within our group**!

### Interactive session

#### In your shell

**Go to project repository**

```bash
cd /project/directory
```

**Create feature branch**

```bash
git pull origin main
git checkout -b add_ci_config
```

**Create GitLab configuration file**

```bash
cat << EOF > .gitlab-ci.yml
default:
  tags:
    - docker
  image: python:3.10-slim-buster

my_tests:
  # Good to put a small description here of what this job does
  script:
    - echo "Command 1"
    - echo "Command 2"
EOF
```

**Stage, commit & push**

```bash
git add .gitlab-ci.yml
git commit -m "ci: add CI config"
git push origin add_ci_config
```

#### On GitLab

**CI/CD-related settings/features**

1. Visit the GitLab instance that hosts your project (for this course: https://git.scicore.unibas.ch/)
2. Select the project you are working on
3. CI/CD > Jobs  
   More info: https://docs.gitlab.com/ee/ci/jobs/
4. Explore the CI job by clicking on the status button or the name field
5. Analytics > CI/CD  
   More info: https://docs.gitlab.com/ee/user/analytics/ci_cd_analytics.html
6. Settings > CI/CD  
   More info: https://docs.gitlab.com/ee/ci/pipelines/settings.html
7. Settings > General > Merge requests  
   More info: https://docs.gitlab.com/ee/user/project/settings/#merge-request-settings
8. Set "Pipelines must succeed" in "Merge checks"  

> Now the configured CI/CD pipelines _must_ pass before a merge request can be merged!

## Managing Python dependencies

For the homework, we need to give you a little preview on how to manage dependencies in your Python projects.

As we have already seen, we can install dependencies such as Flake8, Pylint and Black individually with `pip`. However, it is generally better to keep these dependencies listed in requirements files. By convention, we use two different requirements files:
    
- `requirements.txt`: Here, we list all dependencies that our code requires to run. An example may be Biopython, which some of you may use (or possibly _should_ use).
- `requirements_dev.txt`: Here, we list dependencies that are only used for testing and developing, such as Pytest, Flake8, Pylint, Mypy, Black etc.

All you need to do is create these files as normal text files and add your dependencies by name. For example, the contents of `requirements_dev.txt` could read:

```shell
black
flake8
flake8-docstrings
mypy
pylint
```

Then, in order to install all of these dependencies, you can simply run

```shell
pip install -r requirements_dev.txt
```
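
If you want to check which of the listed packages are already installed in your current environment, a small Python sketch like the following may help. It assumes a requirements file that contains plain package names only (no version specifiers or options), and the function names are made up for this example; only `importlib.metadata` from the standard library (Python 3.8+) is used.

```python
from importlib import metadata


def parse_requirements(text):
    """Return package names from requirements text, skipping blanks and comments."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if line:
            names.append(line)
    return names


def missing_packages(names):
    """Return those names for which no installed distribution is found."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


# Same contents as the example `requirements_dev.txt` above
requirements_dev = """\
black
flake8
flake8-docstrings
mypy
pylint
"""

names = parse_requirements(requirements_dev)
print("Listed:", names)
print("Not installed:", missing_packages(names))
```

In practice, you would read the text from the file itself, e.g., with `pathlib.Path("requirements_dev.txt").read_text()`.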

## Homework

1. Read [PEP8](https://www.python.org/dev/peps/pep-0008/) and [Google's Python style guide](https://google.github.io/styleguide/pyguide.html); refresh your knowledge about [docstring conventions](https://www.python.org/dev/peps/pep-0257/) and Google-style docstrings ([examples](https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html)). 

2. Write comprehensive unit tests (covering all statements!) for your code.

3. Add a `requirements_dev.txt` file to the root directory of your repository and add the following:
   * `pytest`
   * `coverage`
   * `flake8`
   * `flake8-docstrings`
   * `mypy`
   * `pylint`

   > When you write your CI/CD pipeline in exercise 5., make sure that you include the command to install the `requirements_dev.txt` rather than (or in addition to, if your code requires external dependencies) `requirements.txt`.

4. Install the packages in the `requirements_dev.txt` file locally with `pip install -r requirements_dev.txt` and execute the following commands:
   * `coverage run --source code_directory -m pytest`
   * `flake8 --docstring-convention google code_directory/ tests/`
   * `pylint code_directory/ tests/`
   * `mypy code_directory/`

   Now fix all issues they reveal! Once done, push and merge the changes.

   > Don't forget to also install your own package with `pip install .` (or `pip install -e .`), else the tests may not be able to find your code. Do this both locally and, later on in step 5., in the `before_script` section of your test job in the CI/CD pipeline configuration file.

5. Add a GitLab CI/CD pipeline configuration file for your repository that runs the commands from exercise 4. in a single job in the default stage (`test`).

   > Please make sure the pipeline succeeds, i.e., all of your unit tests succeed, your code coverage is 100%, and all the linters are happy.

6. Set up your repository such that, in the future, code can only be merged if the CI pipeline successfully completes, i.e., all tests pass and the linters and static type checker do not complain!