# Automating quality checks
<img src='../images/xebia-logo.png' width='300px' align='right' style="padding: 15px">

This notebook shows how to use `ruff` as a code formatter and style checker, and how to automate running multiple quality checks with `pre-commit`.

Let's start with `ruff`. The first step is installing it by running `poetry add -G dev ruff`.

**Question:** Why are you installing it as a `dev` dependency?

## Formatting

To format your project you can run `ruff format .`. If you want to see what would change without applying the changes, add the `--diff` flag.
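
As a rough illustration of what a formatter changes, the sketch below compares a made-up snippet before formatting with the layout `ruff format` would be expected to produce under default settings. The function `add` is hypothetical. Parsing both versions with `ast` suggests they are the same program and only the layout differs.

```python
import ast

# Hypothetical snippet before formatting: cramped spacing, single quotes.
before = """
def add(a,b):
    return {'sum':a+b}
"""

# Roughly what `ruff format` is expected to produce with default settings:
# spaces after commas and around operators, double quotes for strings.
after = """
def add(a, b):
    return {"sum": a + b}
"""

# A formatter only changes layout, so both versions parse to the same syntax tree.
same_program = ast.dump(ast.parse(before)) == ast.dump(ast.parse(after))
print(same_program)
```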

#### <mark>Exercise:</mark> Trying out the formatter

Apply formatting to the project and check what changes `ruff` will make compared to your original code.

You can also experiment and try multiple different styles and see whether `ruff` will change them or not.

## Styling

Apart from formatting, `ruff` can act as a code linter that detects a multitude of potential stylistic issues. `ruff` has a list of over 700 rules, but only some of them are activated by default.

To run the style checker you can run `ruff check .` in your project.
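
To give an idea of what the linter reports, here is a small made-up module with issues that the default rule set would be expected to flag, followed by a cleaned-up version. The function names are hypothetical, and the rule codes in the comments are the ones I would expect `ruff check` to print; your output may differ by version.

```python
# Hypothetical module with issues that ruff's default rules are expected to flag.
import os  # F401: `os` imported but unused


def is_missing(value):
    unused = 42  # F841: local variable assigned but never used
    return value == None  # E711: comparison to None should use `is`


# The same function after addressing the reports.
def is_missing_fixed(value):
    return value is None


print(is_missing(None), is_missing_fixed(None))
```

Both functions behave the same for ordinary values; the linter is pointing at style and likely mistakes rather than at a crash.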

#### <mark>Exercise 1:</mark> Trying out the linter

- Run `ruff check` in your project and fix the stylistic errors it reports.
- Some errors might be automatically fixable by passing the `--fix` flag.

### Configuring `ruff`

You can turn on additional rules in `pyproject.toml`.

```toml
[tool.ruff.lint]
extend-select = [
  "UP", # pyupgrade
  "D",  # pydocstyle
  "N",  # PEP8 names
  "I"  # isort
]
```

Or ignore specific rules with `ignore` under `[tool.ruff.lint]`.
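
As a sketch of what these extra rule groups look for, the comments below show a made-up function as it might be written before enabling them, and the code underneath is a version that should satisfy them. The names `CircleAreas` and `circle_areas` are hypothetical, and the rule codes are the ones I would expect; the `UP` report in particular depends on the Python version your project targets.

```python
"""Area helpers (hypothetical example module)."""

# Before enabling the extra rules, the module might have looked like this:
#
#   from typing import List      # "I": imports not sorted; "UP": prefer built-in `list`
#   import math
#
#   def CircleAreas(radii: List[float]) -> List[float]:   # "N": name should be lowercase
#       return [math.pi * r**2 for r in radii]            # "D": missing docstring

import math


def circle_areas(radii: list[float]) -> list[float]:
    """Return the area of a circle for each radius."""
    return [math.pi * r**2 for r in radii]


print(circle_areas([1.0, 2.0]))
```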

You can also exclude particular directories.

```toml
[tool.ruff]
exclude = [
    ".git",
    ".pyenv",
    ".pytest_cache",
    ".pytype",
    ".ruff_cache",
    ".venv"
]
```

#### <mark>Exercise 2:</mark> Adding rules

Choose some additional rules from the `ruff` documentation (https://docs.astral.sh/ruff/rules/) and explore what effect they have on your project.

<a id='pre'></a>
## Automating checks with pre-commit
It can get annoying to have to run `ruff` every time before we share our code with our colleagues, for example by pushing it to our `git` repository.

That's where `pre-commit` comes in. With `pre-commit` we can configure various checks that run on our code before it is committed to our repository.

We can add pre-commit as a dev dependency and generate a default configuration file.

```
poetry add -G dev pre-commit
poetry run pre-commit sample-config > .pre-commit-config.yaml
```

Then we can install the `pre-commit` git hooks with

```
pre-commit install
```

This will ensure that the pre-commit hooks are run before the code is _actually_ committed.

To revert the installation, you can run:

```
pre-commit uninstall
```

#### <mark> Exercise: </mark> Testing pre-commit

Try now to commit some changes to the repository. 

Read the messages that you get. Some hooks fail; why? 

With the sample configuration, you automatically checked for trailing whitespace, fixed the end of each file (so that it ends with a newline), checked that any YAML files in the repo are parseable, and checked whether any large files were added. The `end-of-file-fixer` hook failed, but it immediately corrected the problem. The `check-added-large-files` hook also failed, because `test.csv` and `train.csv` exceeded the allowed size limit. Want these files to be checked in anyway? Adjust your configuration by removing that check.
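
To make these checks less of a black box, here is a simplified Python sketch of the kind of work the whitespace, end-of-file and large-file hooks do. It is not the hooks' actual implementation: the function names are hypothetical, and the 500 kB limit is an assumed default that you should verify against the hook documentation.

```python
def fix_whitespace(text: str) -> str:
    """Strip trailing whitespace from each line and end the text with one newline."""
    lines = [line.rstrip() for line in text.splitlines()]
    # Drop blank lines at the end so the file finishes with a single newline.
    while lines and lines[-1] == "":
        lines.pop()
    return ("\n".join(lines) + "\n") if lines else ""


def is_too_large(size_in_bytes: int, max_kb: int = 500) -> bool:
    """Report whether a file exceeds the size limit (500 kB is an assumed default)."""
    return size_in_bytes > max_kb * 1024


print(repr(fix_whitespace("x = 1   \ny = 2")))
print(is_too_large(2_000_000))
```

A hook that rewrites a file reports a failure on that run, which is why committing a second time after the automatic fix usually succeeds.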

Have a look at [the documentation](https://pre-commit.com/hooks.html) to see what other checks you can add! For example, `check-toml` to check whether the toml file is parseable. 

### Some notes on pre-commit

Pre-commit is a very widely used tool to run multiple checks within repositories that contain Python packages. It makes it quite convenient to configure what checks to run.

However, as a developer, having pre-commit running on your local machine can become annoying, since you won't be able to commit changes that contain code that doesn't pass the checks. Sometimes you might want to commit WIP changes that don't pass all checks (e.g. you don't have time to fix your code and want to save your WIP).

You can configure and use `pre-commit` without running it automatically on every commit by omitting the `poetry run pre-commit install` command during setup, or by explicitly uninstalling it via `pre-commit uninstall`. To run pre-commit manually you can execute `pre-commit run --all-files`.

Additionally, some `pre-commit` hooks run in their own isolated environments rather than in your package's virtual environment, so the tool versions they use can differ from the ones you installed with poetry.

#### <mark> Exercise: </mark> Adding `ruff` to pre-commit

Add `ruff` to `pre-commit` and check that `ruff check` and `ruff format` run when you run `pre-commit`.

- Have a look at: https://github.com/astral-sh/ruff-pre-commit
- And also investigate how to add a *local* `pre-commit` hook that runs the version of `ruff` installed in your package.
    - *Hint: set the `repo` value to `local` and use the values `id`, `name`, `entry`, `language`, `pass_filenames` and `always_run`.*


<a id='type'></a>
## Type hinting & checking

Type hints make it much easier to statically reason about your code.
Signalling what types are used by your code can serve as documentation, help linters & IDEs, and help catch errors by checking the hints.

Type hinting in Python is not necessarily an all-or-nothing choice: you can gradually add type hints.
A good practice is to at least add type hints to the public functionality of your library.

Let's discuss some examples.

`-> None` tells us that this function returns `None`.

In [None]:
def p() -> None: 
    print('hello')

?p

p()

The function below accepts an argument `names` that should consist of a list with strings in it.

In [None]:
def greet_all(names: list[str]) -> None: 
    for name in names:
        print('Hello ' + name)

?greet_all


greet_all(['Jane', 'Mike'])

Type hints are *hints*.
You can still disregard them:

In [None]:
greet_all(('Jane', 'Mike'))
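
A small sketch of why this works: the interpreter stores hints as metadata on the function and does not check them when the function is called. The function `double` is made up for this example.

```python
def double(x: int) -> int:
    return x * 2


# The hint says `int`, but the interpreter does not check it:
# passing a string simply repeats the string.
print(double("ab"))  # prints abab

# The hints are stored on the function as plain metadata.
print(double.__annotations__)
```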

[Duck typing](https://en.wikipedia.org/wiki/Duck_typing) is supported: you can signal that `names` can be any collection that supports iteration:

In [None]:
from collections.abc import Iterable

def greet_all(names: Iterable[str]) -> None:
    for name in names: 
        print('Hello ' + name)
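
To see what the broader hint allows, the sketch below uses a hypothetical variant called `greetings` that returns the greetings instead of printing them, and calls it with a list, a tuple and a generator. All three are iterables of strings, so a type checker should accept each call.

```python
from collections.abc import Iterable


def greetings(names: Iterable[str]) -> list[str]:
    """Build one greeting per name."""
    return ["Hello " + name for name in names]


print(greetings(["Jane", "Mike"]))  # a list
print(greetings(("Jane", "Mike")))  # a tuple
print(greetings(n.title() for n in ["jane", "mike"]))  # a generator
```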

Even though type hints are *hints* and are not enforced by the Python interpreter, you can enforce them yourself by running a type checker. The most widely used one is `mypy`.
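
As an illustration of the difference between running code and checking it, the made-up function below runs fine with a tuple even though its hint asks for a list. The function name is hypothetical, and the type checker behaviour described in the comment is what I would expect rather than captured output.

```python
def total_length(names: list[str]) -> int:
    return sum(len(name) for name in names)


# Runs without error, because a tuple of strings supports everything the body needs.
print(total_length(("Jane", "Mike")))  # prints 8

# A type checker such as mypy is expected to reject the call above with an
# "incompatible type" error for argument 1, since a tuple is not a `list[str]`.
```

Changing the hint to a broader type, or passing a list, should make the checker accept the call.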

#### <mark> Exercise: </mark> Setting up mypy

Add type hints to modules `data.py` and `features.py` so that mypy doesn't return any errors when you run `poetry run mypy src/`.

You will need to install `mypy` as a development dependency, and install the type stubs for pandas. If you run `mypy` without having installed the stubs, you will get an error telling you how to install them. However, try installing any package you need via poetry instead of via pip.

- Can you check what the `--disallow-untyped-defs` and `--strict` options do?

Also, extend your pre-commit config to include mypy using a local hook!

## Extra: running pre-commit from GitHub Actions

So far, this notebook has covered how to set up automatic quality checks on your local development machine. When collaborating with others, it is best to share the configuration of all these quality checks (i.e. committing the configuration to the repository) to ensure that everyone is running the same quality checks.

Moreover, all modern git-based code-sharing services (git forges such as GitHub, GitLab and Gitea) can run these quality checks on their platforms, which enforces them on all code included in the repositories.

For example, to run pre-commit on all commits to the `main` branch and on all commits to branches with open PRs, you can add the following YAML code to `.github/workflows/pre-commit.yml`.

```yaml
name: pre-commit

on:
  pull_request:
  push:
    branches: [main]

jobs:
  pre-commit:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-python@v4
      with: 
        python-version-file: pyproject.toml
    - uses: abatilo/actions-poetry@v2
    - run: poetry install
    - run: poetry run pre-commit run --all-files
```

You can try it out by pushing your branch to the repository and creating a PR from it to `main`.