# <font color='#1EB0E0'>Code quality</font>
<img src='../images/gdd-logo.png' width='300px' align='right' style="padding: 15px">

Writing modular, reusable code is important for code quality.

Code is a means of communication: you use it to communicate with machines, but also with other developers. High-quality code is therefore good communication.

Code of high quality is correct, human readable, consistent, modular and reusable.
This involves fundamentals like code styling, but also concerns naming, code structure and principles like DRY (Don't repeat yourself), the [rule of three](https://en.wikipedia.org/wiki/Rule_of_three_&#40;computer_programming&#41;) and [single responsibility principle](https://en.wikipedia.org/wiki/Single_responsibility_principle).

In this notebook we shall explore methods for ensuring your code is of high quality.

- [Refactoring](#refactor)
- [Formatting](#format)
- [Importing](#import)
- [Styling](#style)
- [Bonus: Pre-commit](#pre)

<a id='refactor'></a>
## Refactoring

The code in `add_features()` produces the correct output, but it's not good code yet.
The function is doing multiple things (checking sex, getting hair type, etc.) and that is [not OK](https://blog.codinghorror.com/curlys-law-do-one-thing/).

### <mark> Exercise: Refactoring</mark>

Move the sub-logic from `add_features()` into the appropriate functions:

 - `check_has_name()`
 - `get_sex()`
 - `get_neutered()`
 - `get_hair_type()`
 - `compute_days_upon_outcome()`    

 The function `check_is_dog()` is already filled in for you.
 All functions take a `Series` (a column in our `DataFrame`) and return a `Series`.

After this exercise `add_features()` should look something like:


 ```python
 def add_features(df):
     df['is_dog'] = check_is_dog(df['animal_type'])
     df['has_name'] = check_has_name(df['name'])
     # ...
     return df
 ```


### <mark> Exercise: Side effects</mark>

It already looks better and more structured, but there are still things that should be improved.

 The function `add_features()` has an unexpected [side effect](https://softwareengineering.stackexchange.com/questions/15269/why-are-side-effects-considered-evil-in-functional-programming): the input `df` gets changed when the function is called.
    
 Generally, you want to avoid this kind of unexpected behaviour. How could you avoid this?

 What would you do to improve these functions further?
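
One possible answer to the side-effect question, sketched with plain Python lists and dictionaries instead of a `DataFrame` so that it runs without pandas. The function names and the `animals` data are hypothetical. With pandas, the analogous step would be to start the function with `df = df.copy()`.

```python
import copy


def add_features_in_place(records):
    """Mutates its argument: the caller's data changes as a side effect."""
    for row in records:
        row["has_name"] = row["name"] is not None
    return records


def add_features(records):
    """Works on a deep copy, so the caller's data is left untouched."""
    records = copy.deepcopy(records)
    for row in records:
        row["has_name"] = row["name"] is not None
    return records


animals = [{"name": "Rex"}, {"name": None}]
with_features = add_features(animals)

print(animals[0])        # {'name': 'Rex'}
print(with_features[0])  # {'name': 'Rex', 'has_name': True}
```

Copying costs some memory, but the caller can rely on the input being unchanged after the call.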

<a id='format'></a>
## Formatting

We'll focus on formatting the code with [black](https://github.com/psf/black), which has become the de facto standard Python formatter in the last few years.

Install black with:
```sh
poetry add black
```

Notice that Poetry has added black as a dependency in [`pyproject.toml`](../pyproject.toml).

### <mark>Exercise: Formatting</mark>
 
Often, you'll just *apply* black and won't spend much time looking at the results. For now, however, it's interesting to see what black would change.

Run the following command in the `animal_shelter` folder. What changes would black make?
```sh
poetry run black --diff --color src/
```

Now that you know what will change, let black format the code. Pick one of these options:
- In VS Code, use 'Format Document' to format the module `features.py`.
- In PyCharm, right-click the file and select 'Reformat Code'.
- Format everything under `src/`, including `features.py`, from the command line with `poetry run black src/`.
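
To get a feeling for the kind of change black makes, here is a small sketch. The `before` snippet is hypothetical, and `after` is roughly what black is expected to produce for it. The check uses the standard library `ast` module to show that only the layout differs: both versions parse to the same syntax tree.

```python
import ast

# Hypothetical snippet with inconsistent spacing and quotes.
before = "x = {  'a':1,'b':2 }\ndef add(a,b) :\n    return a+b\n"

# Roughly what black is expected to produce for it.
after = 'x = {"a": 1, "b": 2}\n\n\ndef add(a, b):\n    return a + b\n'

# Formatting changes layout only: both versions parse to the same tree.
same_tree = ast.dump(ast.parse(before)) == ast.dump(ast.parse(after))
print(same_tree)  # True
```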

<a id='import'></a>
## Importing
    
Next up: our imports may not be in a consistent order.

`isort` is a Python library that sorts the imports in your files. It sorts imports alphabetically and automatically separates them into sections and by type.


Install isort with:
```sh
poetry add isort
```
### <mark>Exercise: Sort our imports</mark>
Run the following command in the `animal_shelter` folder. What changes would isort make?
```sh
poetry run isort --diff --color src/
```
Afterwards, apply the changes by running the command below.
```sh
poetry run isort src/
 ```
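
As an illustration, here is a sketch of a hypothetical module header before and after sorting, assuming isort's default settings. It uses only standard-library modules so that it runs anywhere; with third-party packages, isort would additionally place them in their own section below the standard library.

```python
# Before isort (order as typed):
#   import sys
#   from pathlib import Path
#   import os
#   import json

# After isort, assuming default settings: plain `import x` lines first,
# sorted alphabetically, followed by the `from x import y` lines.
import json
import os
import sys
from pathlib import Path

# Use every import so that a linter would not report them as unused.
info = {"cwd": str(Path(os.getcwd())), "python": sys.version_info[0]}
print(json.dumps(info))
```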

<a id='style'></a>
## Styling

The next step is styling.
Style guides dictate how you should write your code so that everyone uses a single, consistent style.
This facilitates good communication.
There's [PEP8](https://www.python.org/dev/peps/pep-0008/) for Python; [Google's Style Guide](https://google.github.io/styleguide/Rguide.xml) or [Advanced R](http://adv-r.had.co.nz/Style.html) for R; and the official [Guide](https://docs.scala-lang.org/style/) for Scala.

Install flake8 with:
```sh
poetry add flake8
```

### <mark>Exercise: Styling</mark>
    
We have been using `add_features()` in  `features.py` of our Python package `animal_shelter` to add features to our data.
    
However, it doesn't follow the PEP8 standards. Most violations are whitespace problems and variable names, so this should be pretty easy to fix.

Open the project folder in [VS Code](https://code.visualstudio.com/) and set the linter to flake8 (`view > Command Palette > Python: Select Linter > flake8`). If it says that flake8 is not installed, make sure you have selected the correct Python interpreter in VS Code (`view > Command Palette > Python: Select Interpreter > Python 3.9.6 ('animal-shelter')`). Although it is not as convenient, it is also possible to configure flake8 in [PyCharm](https://www.programmerall.com/article/93221446897/).

Then navigate to the file `animal_shelter/features.py`.
 
Hover over the yellow squiggly lines to see what flake8 deems wrong and make corrections.

If you don't have VS Code, change the code in your favorite editor until the following command doesn't return errors:

 ```bash
poetry run flake8 src/animal_shelter/features.py --show-source
 ```


`flake8` reports the style violations in your code. Try to decipher its output and fix the code.
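
To help with reading the output, here is a sketch of a hypothetical function (not taken from `features.py`) before and after cleaning it up. The codes in the comments are the ones flake8 is expected to report for those lines with its default configuration.

```python
# Before: lines that flake8 is expected to flag.
#   def days_to_years(days,days_per_year = 365):   # E231, E251
#       l=[d / days_per_year for d in days]        # E741, E225
#       return l
#
# E231: missing whitespace after ','
# E251: unexpected spaces around keyword / parameter equals
# E741: ambiguous variable name 'l'
# E225: missing whitespace around operator


# After: same behaviour, no warnings expected.
def days_to_years(days, days_per_year=365):
    """Convert a list of ages in days to ages in years."""
    years = [d / days_per_year for d in days]
    return years


print(days_to_years([365, 730]))  # [1.0, 2.0]
```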

<a id='pre'></a>
## BONUS: Pre-commit
It is quite a hassle to run `flake8` and `black` manually every time before we share our code with our colleagues, for example by pushing it to our `git` repo.

That's where `pre-commit` comes in. With `pre-commit` we can configure various checks on our code before our code is committed to our git repo. 

Let's see an example. Make sure your current working directory is the `animal_shelter` folder.

1. Run: 
```bash
git init
```
This creates a new Git repository, necessary for pre-commit. 


2. Run: 
```bash
poetry add pre-commit
```
This installs the package pre-commit in your poetry environment. Don't have poetry configured? `pip install pre-commit` will work as well.


3. Run: 
```bash
poetry run pre-commit sample-config > .pre-commit-config.yaml
```
This creates a simple configuration for pre-commit similar to what you see below. Investigate this config.


4. Run: 
```bash
poetry run pre-commit install
```
This will ensure that when you run `git commit`, before your code is _actually_ committed, the checks (hooks) are run first. 


5. Run: 
```bash
git add *
git commit -m 'my first git commit'
```

Read the messages that you get. Some hooks fail; why? 

**Congratulations!** You created your first pre-commit hooks and successfully ran them.

With your sample configuration, you automatically checked for trailing whitespace, for a newline at the end of each file (`end-of-file-fixer`), whether any YAML files in the repo are parseable, and whether any large files were added. The `end-of-file-fixer` hook failed, but it immediately corrected the files. The `check-added-large-files` hook also failed, because `test.csv` and `train.csv` exceed the allowed size limit. Want these files to be checked in anyway? Adjust your configuration by removing that check.
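
To make the first two checks concrete, here is a rough sketch of what `trailing-whitespace` and `end-of-file-fixer` do to a file's contents. The function names are hypothetical and this is not how the hooks are implemented; it only illustrates the behaviour.

```python
def strip_trailing_whitespace(text: str) -> str:
    """Remove spaces and tabs at the end of every line."""
    return "\n".join(line.rstrip(" \t") for line in text.split("\n"))


def fix_end_of_file(text: str) -> str:
    """Make sure non-empty text ends with exactly one newline."""
    if not text:
        return text
    return text.rstrip("\n") + "\n"


source = "a = 1  \nb = 2"
fixed = fix_end_of_file(strip_trailing_whitespace(source))
print(repr(fixed))  # 'a = 1\nb = 2\n'
```

A hook that rewrites a file this way counts as failed for that commit, which is why you need to `git add` the fixed files and commit again.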

Have a look at [the documentation](https://pre-commit.com/hooks.html) to see what other checks you can add! For example, `check-toml` to check whether the toml file is parseable. 

### <mark> Bonus Exercise 1: Add flake8 to the pre-commit</mark>

You can also extend pre-commit with flake8 and black. Check [the documentation](https://pre-commit.com/hooks.html) again and search for `flake8`. There are a couple of hits, but the most promising one is the one under `https://github.com/PyCQA/flake8`. Extend your `.pre-commit-config.yaml` with the following: 

```yaml
# See https://pre-commit.com for more information
# See https://pre-commit.com/hooks.html for more hooks
repos:
-   repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v3.2.0
    hooks:
    -   id: trailing-whitespace
    -   id: end-of-file-fixer
-   repo: https://github.com/PyCQA/flake8
    rev: 8f9b4931b9a28896fb43edccb23016a7540f5b82
    hooks:
    -   id: flake8
```

`repo` is the repo where flake8 is implemented. `rev` pins the revision of that repo to use; here it is the hash of the most recent commit, so you get the most recent version of the repo. `hooks: - id: flake8` indicates that you want to run flake8 from this repo.

Run: 
```
git add *
git commit -m 'my second git commit'
```
and verify flake8 is executed! 

### <mark> Bonus Exercise 2: Add black to the pre-commit</mark>

1. Find `black` in [the documentation](https://pre-commit.com/hooks.html). Extend your pre-commit config with the repo you find (*hint: the repo name is short for python software foundation*).
 
2. Find the most recent commit by visiting the repo.
 
3. Add the hook. In the [documentation](https://pre-commit.com/hooks.html), there are two options for this repo: `black` and `black-jupyter`. Choose which one you want. 
 
Template:
```yaml
repos:
-   repo: <the black repo> 
    rev: <the most recent commit hash> 
    hooks:
    -   id: <the thing you want to execute - black or black-jupyter> 
```
    
4. Afterwards, run: 
```
git add *
git commit -m 'my third git commit'
```
and verify black is executed! 

    
---
<br>
<br><br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>

<br>
<br>
 
 

   
    
    
    
    
    
    
    
    
    
### <mark>Solution</mark>
```yaml
# See https://pre-commit.com for more information
# See https://pre-commit.com/hooks.html for more hooks
repos:
-   repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v3.2.0
    hooks:
    -   id: trailing-whitespace
    -   id: end-of-file-fixer
    -   id: check-yaml
    -   id: check-added-large-files
-   repo: https://github.com/PyCQA/flake8
    rev: 8f9b4931b9a28896fb43edccb23016a7540f5b82
    hooks:
    -   id: flake8
-   repo: https://github.com/psf/black
    rev: 64c8be01f0cfedc94cb1c9ebd342ea77cafbb78a
    hooks:
    -   id: black
    
``` 