<div align="center">
<h1>Pre-commit Guide</h1>
by Hongnan Gao
<br>
</div>

# Pre-commit

## Intuition

Before committing to our local repository, we have many items on our mental to-do list: styling, formatting, testing, and so on. It is easy to forget some of these steps, especially when we are rushing to push a quick fix. To help us manage these steps, we can use pre-commit hooks, which are triggered automatically whenever we try to make a commit.

> Though we can add these checks directly to our **CI/CD** pipeline (e.g. via GitHub Actions), it is significantly faster to validate our commits locally than to push to the remote host, wait for the pipeline to report what needs fixing, and then push yet another fix. The advantage of pre-commit over CI/CD checks is that errors surface early, which avoids repeated fix-up commits on a PR whenever a check fails.

## Installation

We first install the [`pre-commit`](https://pre-commit.com/) package. Recall that we can add it to our `requirements.txt` as well.

```bash
# Install pre-commit
pip install pre-commit
pre-commit install
```

In [11]:
!source venv_reighns/bin/activate; python3 -m pip install -q pre-commit # colab
# pip3 install -q pre-commit                                            # IDE

We need to initialize a `git` repository before running `pre-commit install`. Note that a purely local repository is enough; no remote is required.

In [14]:
!git init

Initialized empty Git repository in /content/reighns/.git/


In [15]:
!source venv_reighns/bin/activate; pre-commit install

pre-commit installed at .git/hooks/pre-commit


## Config

Just as we created the `.flake8` and `pyproject.toml` files in our **Styling Guide**, we also have to create a configuration file for `pre-commit`.

In other words, we need to tell `pre-commit` what kinds of checks to perform prior to committing.

To start, we use the **sample** config file provided by `pre-commit` and add further configuration later.

In [18]:
# Simple config
!source venv_reighns/bin/activate; pre-commit sample-config > .pre-commit-config.yaml

We can see from the default template that some default checks like `trailing-whitespace` are already in place.

In [19]:
!cat .pre-commit-config.yaml

# See https://pre-commit.com for more information
# See https://pre-commit.com/hooks.html for more hooks
repos:
-   repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v3.2.0
    hooks:
    -   id: trailing-whitespace
    -   id: end-of-file-fixer
    -   id: check-yaml
    -   id: check-added-large-files


## Hooks

What are hooks? Git hooks are scripts that Git runs automatically at certain points, such as right before a commit is created. The [documentation](https://git-scm.com/book/en/v2/Customizing-Git-Git-Hooks) gives a general idea of what hooks are.

### Built-in

Inside the sample configuration, we can see that pre-commit has added some default hooks from its own repository. The configuration specifies the location of the repository, the version (`rev`), and the specific hook ids to use. We can read about what these hooks do, and add more, by exploring pre-commit's [built-in hooks](https://github.com/pre-commit/pre-commit-hooks). Many of them also accept additional arguments that we can configure to customize the hook.

> Be sure to explore the many other built-in hooks, because some of them are really useful in our project. For example, `check-merge-conflict` checks for lingering merge conflict strings, and `detect-aws-credentials` catches credentials that we accidentally left exposed in a file, among many others.

We can also exclude certain files from a hook with the optional `exclude` key, which takes a regular expression matched against file paths. There are many other [optional keys](https://pre-commit.com/#pre-commit-configyaml---hooks) we can configure for each hook ID.

```yaml
# Inside .pre-commit-config.yaml
...
-   id: check-yaml
    exclude: "mkdocs.yml"
...
```
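
To make the effect of `exclude` concrete, here is a minimal sketch of the filtering idea. It assumes that `files` and `exclude` are regular expressions searched anywhere in each path, as the pre-commit documentation describes them. The helper `filter_filenames` and the file names are hypothetical and not part of pre-commit's API, and the extension check only stands in for the hook's own file-type filter.

```python
import re


def filter_filenames(filenames, files="", exclude="^$"):
    """Keep names that match `files` and do not match `exclude`.

    Both patterns are regular expressions applied with re.search, so an
    unanchored pattern can match anywhere in the path.
    """
    include_re = re.compile(files)
    exclude_re = re.compile(exclude)
    return [
        name
        for name in filenames
        if include_re.search(name) and not exclude_re.search(name)
    ]


# Hypothetical staged files, for illustration only.
staged = ["mkdocs.yml", "config/app.yaml", "docs/mkdocs.yml", "src/main.py"]

# Stand-in for the file-type filter of a YAML hook such as check-yaml.
yaml_candidates = [name for name in staged if name.endswith((".yml", ".yaml"))]

# Roughly what `exclude: "mkdocs.yml"` does to those candidates.
print(filter_filenames(yaml_candidates, exclude="mkdocs.yml"))
# ['config/app.yaml']
```

Because the pattern is unanchored, `docs/mkdocs.yml` is excluded as well. Under the same assumption, a pattern such as `^mkdocs\.yml$` would exclude only the file at the repository root.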

### Custom

We can also define custom hooks.

For example, if we want to apply formatting checks with Black as a hook, we can leverage Black's pre-commit hook.

```yaml
# Inside .pre-commit-config.yaml
...
-   repo: https://github.com/psf/black
    rev: 20.8b1
    hooks:
    -   id: black
        args: []
        files: .
...
```

This specific hook is defined under a [`.pre-commit-hooks.yaml`](https://github.com/psf/black/blob/main/.pre-commit-hooks.yaml) inside Black's repository, as are other custom hooks under their respective package repositories.


### Local

Finally, we can define local hooks. Recall that in `Makefile` we defined:

```Makefile
# Cleaning
.PHONY: clean
clean:
	find . | grep -E ".ipynb_checkpoints" | xargs rm -rf
```

We want `pre-commit` to run this `clean` target from the `Makefile`. The configuration below does that. Note that we set `entry` to `make` so that `pre-commit` runs `make` as a local system command, and `pass_filenames: false` so that the staged file names are not appended to that command.

```yaml
# Inside .pre-commit-config.yaml
...
- repo: local
  hooks:
    - id: clean
      name: clean
      entry: make
      args: ["clean"]
      language: system
      pass_filenames: false
```
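
If `make` is not available, the same cleanup could be done by a small Python script used as the hook's `entry` instead. The following is only a sketch under that assumption: the function name `remove_checkpoints` is hypothetical, and unlike the `grep` in the `Makefile`, which matches any path containing the pattern, it only removes directories named exactly `.ipynb_checkpoints`.

```python
import os
import shutil
import tempfile


def remove_checkpoints(root):
    """Delete every directory named .ipynb_checkpoints below root.

    Returns the removed paths, relative to root, in sorted order.
    """
    removed = []
    for dirpath, dirnames, _ in os.walk(root):
        if ".ipynb_checkpoints" in dirnames:
            target = os.path.join(dirpath, ".ipynb_checkpoints")
            shutil.rmtree(target)
            # Prune the deleted directory so os.walk does not descend into it.
            dirnames.remove(".ipynb_checkpoints")
            removed.append(os.path.relpath(target, root))
    return sorted(removed)


# Demonstration on a throwaway directory with a hypothetical layout.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "notebooks", ".ipynb_checkpoints"))
    with open(os.path.join(tmp, "notebooks", "eda.ipynb"), "w") as f:
        f.write("{}")
    print(remove_checkpoints(tmp))
```

The demonstration prints the relative path of the one checkpoint directory it removed and leaves `eda.ipynb` untouched.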

## Commit

Our `pre-commit` hooks execute automatically when we run `git commit`, and the output shows whether each hook passed, failed, or was skipped. If any hook fails, the commit does not go through until the problem is resolved.

```text
(venv_test) reighns@HONGNANs-MacBook-Air emergency_forecast % git commit -a
Trim Trailing Whitespace.................................................Passed
Check Yaml...............................................................Passed
Check for added large files..............................................Passed
Check python ast.....................................(no files to check)Skipped
Check JSON...........................................(no files to check)Skipped
Check for merge conflicts................................................Passed
black................................................(no files to check)Skipped
flake8...............................................(no files to check)Skipped
isort................................................(no files to check)Skipped
pyupgrade............................................(no files to check)Skipped
clean....................................................................Passed
```

If some hooks fail, their messages are shown in the output. Most of the time the failing hook has already reformatted the offending files automatically, so we only need to stage those changes and run `git commit` again until all hooks pass. Otherwise, we have to fix the reported files ourselves first.

## Run

Though pre-commit hooks are meant to run before (pre) a commit, we can manually trigger all or individual hooks on all or a set of files.

```bash
# Run
pre-commit run --all-files  # run all hooks on all files
pre-commit run <HOOK_ID> --all-files # run one hook on all files
pre-commit run --files <PATH_TO_FILE>  # run all hooks on a file
pre-commit run <HOOK_ID> --files <PATH_TO_FILE> # run one hook on a file
```
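
When these commands are driven from a script, for example a task runner, the four combinations above can be assembled programmatically. The helper `build_run_command` below is a hypothetical sketch, not part of pre-commit; it only builds the argument list from the options shown above and treats "all files" and "specific files" as mutually exclusive.

```python
def build_run_command(hook_id=None, files=None, all_files=False):
    """Build the argument list for a `pre-commit run` invocation."""
    if all_files and files:
        raise ValueError("choose either all_files or files, not both")
    command = ["pre-commit", "run"]
    if hook_id:
        command.append(hook_id)  # restrict the run to a single hook
    if all_files:
        command.append("--all-files")
    elif files:
        command += ["--files", *files]
    return command


print(build_run_command(all_files=True))
# ['pre-commit', 'run', '--all-files']
print(build_run_command("black", files=["src/main.py"]))
# ['pre-commit', 'run', 'black', '--files', 'src/main.py']
```

The resulting list can be passed to `subprocess.run`; the file path `src/main.py` is only an illustrative placeholder.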

## Update

In our `.pre-commit-config.yaml` configuration file, each repository is pinned to a specific version through its `rev` key, so these pins gradually fall behind the hooks' latest releases. Pre-commit has an `autoupdate` CLI command that updates each `rev` to the newest available version.

```bash
# Autoupdate
pre-commit autoupdate
```

We can also add this command to our Makefile to execute when a development environment is created so everything is up-to-date.

```make
# Makefile
...
.PHONY: install-dev
install-dev:
	python -m pip install -e ".[dev]" --no-cache-dir
	pre-commit install
	pre-commit autoupdate
...
```


## Tips and Cautions

1. Match the versions between your local development environment and the `pre-commit` hooks. For example:

    ```yaml
    -   repo: https://github.com/psf/black
        rev: 22.3.0
        hooks:
        -   id: black
            args: []
            files: .
    ```

    We want to use the `black` hook to check styling and re-format when necessary. The `rev` pinned here should be the same `black` version that is pinned in `requirements.txt` for the local development environment. Here, local development can mean:
    - VSCode's `black` formatter;
    - `Makefile` commands that use `black` to format.

    The intuition is that different versions of `black` may format the same code differently, so mismatched versions can keep undoing each other's changes. The hook versions can be brought up to date as described in the `Update` section; the pin in `requirements.txt` should then be changed to match.

2. When you see the [`(no files to check) Skipped`](https://stackoverflow.com/questions/54697699/how-to-propertly-configure-my-pre-commit-and-pre-push-hooks) message during `pre-commit` checks, it means that none of the **staged files** has a type that the hook checks. For example, the `black` hook reports `no files to check` if the commit contains no `.py` files.

    `pre-commit` will pass a list of [staged files which match `types` / `files`](https://pre-commit.com/#arguments-pattern-in-hooks) to the `entry` listed.

    When first introducing new hooks, you probably want to run `pre-commit run --all-files`, which checks every file in the repository rather than only the staged ones.
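
Returning to the first tip, the version match can be checked mechanically. The sketch below is only an illustration: `hook_rev` and `pinned_version` are hypothetical helpers, the `requirements.txt` content is made up, and the regex-based parsing is a standard-library shortcut that assumes `rev` sits on the line directly after `repo`. A real YAML parser would be more robust.

```python
import re


def hook_rev(config_text, repo_url):
    """Return the `rev` listed for repo_url in a pre-commit config, or None."""
    pattern = re.compile(
        r"repo:\s*" + re.escape(repo_url) + r"\s*\n\s*rev:\s*['\"]?([^\s'\"]+)"
    )
    match = pattern.search(config_text)
    return match.group(1) if match else None


def pinned_version(requirements_text, package):
    """Return the version from a `package==X.Y.Z` line, or None."""
    pattern = re.compile(
        r"^\s*" + re.escape(package) + r"\s*==\s*([^\s;#]+)",
        re.MULTILINE | re.IGNORECASE,
    )
    match = pattern.search(requirements_text)
    return match.group(1) if match else None


config = """\
repos:
-   repo: https://github.com/psf/black
    rev: 22.3.0
    hooks:
    -   id: black
"""
requirements = "black==22.1.0\nflake8==4.0.1\n"  # hypothetical pins

rev = hook_rev(config, "https://github.com/psf/black")
pin = pinned_version(requirements, "black")
if rev != pin:
    print(f"black mismatch: hook rev {rev}, requirements.txt {pin}")
# black mismatch: hook rev 22.3.0, requirements.txt 22.1.0
```

Some hook repositories tag their releases with a leading `v` (the sample config above uses `v3.2.0`), so a comparison for those would need to strip that prefix first.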

## Ignore Pre-commit

Use the `-n` (`--no-verify`) flag with `git commit` to skip the pre-commit hooks for that commit.

> By default, the pre-commit and commit-msg hooks are run. When any of --no-verify or -n is given, these are bypassed. See also githooks[5].

## Workflow

### Workflow in IDE

Assuming a macOS environment:

```bash
pip3 install -q pre-commit                          # install pre-commit in your virtual environment
pre-commit install                                  # requires the directory to be a git repo
pre-commit sample-config > .pre-commit-config.yaml  # create the default config file, then add custom configuration as needed
git add .                                           # stage all files so that the hooks can see them
pre-commit run --all-files                          # run all hooks on all files the first time hooks are introduced
git commit -a                                       # commit; pre-commit automatically runs all hooks
```

## References

- [Official Pre-commit documentation](https://pre-commit.com/)
- [Pre-commit madewithml](https://madewithml.com/courses/mlops/pre-commit/)
- [Git Hooks](https://git-scm.com/book/en/v2/Customizing-Git-Git-Hooks)