# 1. Linting and autoformatting notebooks

> There is a notebook, [02_notebook_to_format.ipynb](./02_notebook_to_format.ipynb), that contains Python PEP8 violations to test with. Follow the instructions below.

## 1.1 Linting notebooks using `flake8`

If you are new to coding (or even experienced!) you might be delighted to know that software exists to help you write cleaner code and keep to PEP8 standards. These tools are called code **linters**.

There are a number of linters you can choose from. Here I make use of **flake8**. I’ve always found this helpful.

Using flake8 with a Jupyter notebook requires another package, `nbqa` ([Quality Assurance for Jupyter Notebooks](https://nbqa.readthedocs.io/en/latest/)). It can be installed using `conda`, `mamba` or `pip`.

> A [virtual environment](binder/environment.yml) is provided that you can use to install the software needed.

For example, to run the linter on this particular notebook I would run the following in the terminal:

```bash
nbqa flake8 02_notebook_to_format.ipynb
```

I get the following output:

```text
02_notebook_to_format.ipynb:cell_1:2:1: F401 'numpy as np' imported but unused
02_notebook_to_format.ipynb:cell_2:1:35: E231 missing whitespace after ','
02_notebook_to_format.ipynb:cell_2:1:74: E231 missing whitespace after ','
02_notebook_to_format.ipynb:cell_2:1:80: E501 line too long (117 > 79 characters)
02_notebook_to_format.ipynb:cell_2:1:118: W291 trailing whitespace
02_notebook_to_format.ipynb:cell_2:4:80: E501 line too long (104 > 79 characters)
02_notebook_to_format.ipynb:cell_2:5:80: E501 line too long (124 > 79 characters)
02_notebook_to_format.ipynb:cell_2:6:1: W293 blank line contains whitespace
02_notebook_to_format.ipynb:cell_2:9:80: E501 line too long (83 > 79 characters)
02_notebook_to_format.ipynb:cell_2:10:1: W293 blank line contains whitespace
02_notebook_to_format.ipynb:cell_2:12:80: E501 line too long (80 > 79 characters)
02_notebook_to_format.ipynb:cell_2:14:1: W293 blank line contains whitespace
02_notebook_to_format.ipynb:cell_2:15:80: E501 line too long (105 > 79 characters)
02_notebook_to_format.ipynb:cell_2:16:1: W293 blank line contains whitespace
02_notebook_to_format.ipynb:cell_2:17:80: E501 line too long (110 > 79 characters)
02_notebook_to_format.ipynb:cell_2:18:1: W293 blank line contains whitespace
02_notebook_to_format.ipynb:cell_2:22:80: E501 line too long (93 > 79 characters)
02_notebook_to_format.ipynb:cell_2:23:1: W293 blank line contains whitespace
02_notebook_to_format.ipynb:cell_2:27:1: W293 blank line contains whitespace
02_notebook_to_format.ipynb:cell_2:31:1: W293 blank line contains whitespace
02_notebook_to_format.ipynb:cell_2:32:80: E501 line too long (421 > 79 characters)
02_notebook_to_format.ipynb:cell_2:33:1: W293 blank line contains whitespace
```

## 1.2 Explaining `flake8` output

You can see the full list of flake8 violations [here](https://flake8.pycqa.org/en/latest/user/error-codes.html).

The first two lines of the output are broken down in the table below.

* To **resolve** failure 1 we would decide if the `numpy` import should be removed. Avoid imports that are not needed.

* Failure 2 is best resolved using `black` to autoformat the notebook.

| **Attribute**   | **Failure 1**                     | **Failure 2**                 |
|-----------------|-----------------------------------|-------------------------------|
| **File name**   | `02_notebook_to_format.ipynb`     | `02_notebook_to_format.ipynb` |
| **Cell**        | Cell 1                            | Cell 2                        |
| **Line**        | 2                                 | 1                             |
| **Column**      | 1                                 | 35                            |
| **Error Code**  | F401                              | E231                          |
| **Description** | 'numpy as np' imported but unused | missing whitespace after ','  |
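
As a hypothetical illustration of the two failures (this is not the code in the notebook, and the `radii` and `areas` names are made up for the example), the short sketch below keeps an import only because it is used, and puts a space after each comma:

```python
# F401 is avoided here because the imported module is actually used below.
import math

# The E231 version would omit the spaces after the commas: [1.0,2.0,3.0]
# Fixed: a single space after each comma.
radii = [1.0, 2.0, 3.0]

areas = [math.pi * r**2 for r in radii]
print(areas)
```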



**Notes:**

* The **column number** is 1-indexed, meaning the first character of the line is at column 1. For failure 2, the error is at the 35th character of the line, which is where flake8 has identified the missing whitespace after a comma.

* To toggle line numbers on in a notebook: select the cell, press ESC, then press Shift-L.
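
To make the structure of an output line concrete, here is a minimal sketch that splits one line into the attributes described above. The regular expression and the `parse_violation` helper are my own illustration, based only on the format of the output shown earlier; they are not part of `nbqa` or `flake8`.

```python
import re

# Assumed format, taken from the output shown earlier:
# <file>:cell_<n>:<line>:<column>: <code> <description>
PATTERN = re.compile(
    r"^(?P<file>.+?):cell_(?P<cell>\d+):(?P<line>\d+):(?P<column>\d+): "
    r"(?P<code>[A-Z]\d+) (?P<description>.*)$"
)


def parse_violation(text):
    """Return the parts of one output line as a dict, or None if no match."""
    match = PATTERN.match(text.strip())
    if match is None:
        return None
    record = match.groupdict()
    # Cell, line and column are numeric, so convert them from strings.
    for key in ("cell", "line", "column"):
        record[key] = int(record[key])
    return record


example = (
    "02_notebook_to_format.ipynb:cell_2:1:35: "
    "E231 missing whitespace after ','"
)
violation = parse_violation(example)
print(violation)
```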

## 1.3 Autoformatting code in notebooks using `black`

To help with PEP8 compliance we can make use of a code formatter. In the `jupyter-tips` environment you can combine `nbqa` with `black`. This will greatly improve the code format, but the result may not fully comply with PEP8, so it is often useful to run a linter afterwards to check.

It can be run as follows in a terminal:

```bash
nbqa black 02_cleaner_notebooks.ipynb
```

**Caveats**

By default `black` uses a line length of 88, which is longer than the 79 characters that flake8 checks for. We can modify the line length parameter as follows:

```bash
nbqa black 02_cleaner_notebooks.ipynb --line-length=79
```

> Note that black will not break strings for you. If you have long strings in functions/classes/cells you will need to handle this manually using standard techniques.
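
One common way to break a long string by hand is implicit concatenation inside parentheses: Python joins adjacent string literals into a single string. The sketch below is illustrative only, and the `message` text is made up for the example.

```python
# Adjacent string literals inside parentheses are joined into one string,
# so each source line can stay within the 79 character limit.
message = (
    "This is a long message that runs well past "
    "the 79 character limit set by PEP8."
)
print(message)
```

Note that the space at the end of the first fragment is needed, because no separator is added when the pieces are joined.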