<img src='../images/gdd-logo.png' width='300px' align='right' style="padding: 15px">

# <font color='#1EB0E0'>Supercharge your package</font>

In this assignment we will take our package to the next level:

- [Type hinting & checking](#type)
- [Testing](#test)
- [Logging](#log)
- [Documentation](#docs)
- [Build the package](#build)

In [1]:
%load_ext autoreload
%autoreload 2

<a id='type'></a>
## Type hinting & checking


Type hints make it much easier to statically reason about your code.
Signalling what types are used by your code can serve as documentation, help linters & IDEs, and help catch errors by checking the hints.

Type hinting in Python is not necessarily an all-or-nothing choice: you can add type hints gradually.
A good practice is to at least add type hints to the public functionality of your library.

Let's discuss some examples.

`-> None` tells us that this function returns `None`.

In [1]:
def p() -> None: 
    print('hello')

?p

p()

hello


The function below accepts an argument `names` that should be a list of strings.

In [2]:
def greet_all(names: list[str]) -> None: 
    for name in names:
        print('Hello ' + name)

?greet_all


greet_all(['Jane', 'Mike'])

Hello Jane
Hello Mike


Type hints are *hints*.
You can still disregard them:

In [3]:
greet_all(('Jane', 'Mike'))

Hello Jane
Hello Mike
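
To make this concrete, here is a minimal sketch (assuming Python 3.9 or newer for the `list[str]` syntax). It shows that hints are stored as metadata on the function and are not enforced when the function is called. `typing.get_type_hints` is a standard-library helper; a static checker such as mypy is the tool that would flag the tuple call.

```python
from typing import get_type_hints


def greet_all(names: list[str]) -> None:
    for name in names:
        print('Hello ' + name)


# The hints are plain metadata attached to the function object.
hints = get_type_hints(greet_all)
print(hints['names'])

# Python itself does not check them: passing a tuple runs without complaint.
greet_all(('Jane', 'Mike'))
```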


[Duck typing](https://en.wikipedia.org/wiki/Duck_typing) is supported: you can signal that `names` just needs to be something to iterate over:

In [4]:
from typing import Iterable

def greet_all(names: Iterable[str]) -> None:
    for name in names: 
        print('Hello ' + name)
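
As a rough illustration of what `Iterable[str]` allows, the sketch below passes a list, a tuple and a generator to the same function. `build_greetings` is a hypothetical variant of `greet_all` that returns the greetings so they are easy to inspect. The import comes from `collections.abc` because the `typing.Iterable` alias is deprecated since Python 3.9, although both work.

```python
from collections.abc import Iterable


def build_greetings(names: Iterable[str]) -> list[str]:
    """Hypothetical variant of greet_all that returns the greetings."""
    return ['Hello ' + name for name in names]


# Anything that can be looped over and yields strings is acceptable.
from_list = build_greetings(['Jane', 'Mike'])
from_tuple = build_greetings(('Jane', 'Mike'))
from_generator = build_greetings(name.title() for name in ['jane', 'mike'])

print(from_list == from_tuple == from_generator)
```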

### <mark> Exercise: mypy</mark>

Add type hints to the modules `data.py` and `features.py` so that mypy doesn't return any errors when you run:
 
```sh
poetry run mypy src/ --ignore-missing-imports
```
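
As a hedged illustration of the kind of hints the exercise asks for, here is a sketch of a fully annotated function. The name matches `convert_camel_case` from `data.py`, but the body is an assumption made for this example and is not necessarily how the course package implements it.

```python
import re


def convert_camel_case(name: str) -> str:
    """Convert a CamelCase string to snake_case (assumed implementation)."""
    # Insert an underscore before each capitalised word, e.g. 'CamelCase' -> 'Camel_Case'.
    partly_split = re.sub(r'(.)([A-Z][a-z]+)', r'\1_\2', name)
    # Handle a lowercase letter or digit directly followed by a capital, then lowercase everything.
    return re.sub(r'([a-z0-9])([A-Z])', r'\1_\2', partly_split).lower()


print(convert_camel_case('AnimalType'))
```

With the parameter and return types spelled out, mypy can check every call site of the function, for instance flagging a call that passes an `int`.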

<!-- https://stackoverflow.com/questions/60247157/how-can-i-get-stub-files-for-matplotlib-numpy-scipy-pandas-etc -->

**Bonus**: extend your pre-commit to include mypy!

<a id='test'></a>
## Testing

Tests help you determine if your code does what you expected it to do.

There are different types of tests.
The [most important tests](http://slides.com/treycausey/pydata2015#/) for Data Scientists are:
- **unit tests** that focus on small units of code like functions;
- **integration tests** for whole systems;
- **regression tests** to test if software performs the same after changes.

In addition, you probably want to have systems checking data quality and monitoring if your model is still performing as expected.
Those tests won't be discussed here: we'll only show unit tests.

[Unit testing](https://jeffknupp.com/blog/2013/12/09/improve-your-python-understanding-unit-testing/) is as easy as calling your function and `assert`-ing that the function behaves as expected:

In [22]:
from animal_shelter.data import convert_camel_case

result = convert_camel_case('CamelCase')
expected = 'camel_case'  # TODO: Adjust this to see what happens.

assert result == expected  # Check if it's true!

We `expected` something and compared it to the `result` our function returned. It's as easy as that.

Python unit tests generally go in a folder called `tests/`, in modules whose names start with `test_`.
These modules in turn contain functions whose names start with `test_` and classes whose names start with `Test`.
It's tests all the way down.
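
The naming conventions above could look like this in practice. This is a minimal sketch: the file name `tests/test_text.py` and the function `shout` are hypothetical stand-ins, and in a real project the function under test would be imported from the package instead of being defined in the test module.

```python
# Contents of a hypothetical tests/test_text.py


def shout(text: str) -> str:
    """Stand-in for a function that would normally be imported from the package."""
    return text.upper() + '!'


def test_shout_uppercases():
    # pytest collects functions whose name starts with test_.
    assert shout('hello') == 'HELLO!'


class TestShout:
    """pytest also collects test_ methods on classes whose name starts with Test."""

    def test_empty_string(self):
        assert shout('') == '!'
```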

Our project has a folder called `tests/` and the modules `test_data.py` and `test_features.py` contain unit tests to check the functions that you've made. 
Check them out!

Note that most functions in `test_features.py` don't use `assert`, but use the `pandas` utility function `assert_series_equal()` to check if `Series` are the same.
Many libraries have utility functions to make writing tests easier.
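
A plain `assert result == expected` is not suitable for `Series`, because comparing two `Series` with `==` gives an element-wise result rather than a single `True` or `False`. The sketch below, with made-up data, shows how `assert_series_equal()` from `pandas.testing` handles both a passing and a failing comparison.

```python
import pandas as pd
from pandas.testing import assert_series_equal

# Made-up data, only for illustration.
result = pd.Series(['Dog', 'Cat']).str.lower() == 'dog'
expected = pd.Series([True, False])

# Passes silently when values, index and dtype match.
assert_series_equal(result, expected)

# Raises an AssertionError describing the difference when they do not.
try:
    assert_series_equal(result, pd.Series([True, True]))
    mismatch_detected = False
except AssertionError:
    mismatch_detected = True

print(mismatch_detected)
```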

Run the unit tests using [`pytest`](https://docs.pytest.org/en/latest/):

```bash
$ poetry run pytest tests/
```

You'll get some error messages because `test_is_dog()` has not been implemented yet!

### <mark> Exercise: pytest</mark>

Create a test case to check if `is_dog()` is implemented correctly.

Make sure that `pytest` doesn't return any errors.

**Bonus**: `is_dog` raises an exception if something other than cats or dogs is encountered.

Test that this exception is raised if invalid input is given.
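
One way to test for an exception is the `pytest.raises` context manager. The sketch below uses a simplified stand-in for `is_dog` that works on a single string rather than a `Series`, so it only illustrates the testing pattern and is not the solution for the real function.

```python
import pytest


def is_dog(animal_type: str) -> bool:
    """Simplified stand-in that works on a single string instead of a Series."""
    kind = animal_type.lower()
    if kind not in ('dog', 'cat'):
        raise RuntimeError('Found pets that are not dogs or cats.')
    return kind == 'dog'


def test_is_dog_raises_on_invalid_input():
    # The test passes only if the block raises a RuntimeError with a matching message.
    with pytest.raises(RuntimeError, match='not dogs or cats'):
        is_dog('Pig')
```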

<a id='log'></a>
## Logging

Logging helps you understand what's happening when your code is being run.

A common mistake is to *configure* logging inside a library.
This can cause problems if the application using your library also wants to configure logging.
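
A common way to avoid this is for the library to request a named logger and leave the configuration to the application. The following is a minimal sketch: the logger name `animal_shelter.example` and the function `count_dogs` are hypothetical, chosen only to illustrate the split between library and application.

```python
import logging

# Library side: ask for a named logger, but do not configure handlers or levels.
# In a real module this would usually be logging.getLogger(__name__).
logger = logging.getLogger('animal_shelter.example')
logger.addHandler(logging.NullHandler())


def count_dogs(animal_types: list[str]) -> int:
    """Hypothetical library function that logs instead of printing."""
    unknown = [a for a in animal_types if a.lower() not in ('dog', 'cat')]
    if unknown:
        logger.warning('Found something other than dogs and cats: %s', unknown)
    return sum(a.lower() == 'dog' for a in animal_types)


# Application side: the program that uses the library decides how logs are shown.
if __name__ == '__main__':
    logging.basicConfig(level=logging.INFO)
    count_dogs(['Dog', 'Cat', 'Pig'])
```

Passing the value as a separate argument (`%s`) instead of formatting the string up front means the message is only built when the record is actually emitted.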

### <mark> Exercise: logging</mark>

The function `check_is_dog()` you were provided with has a print statement.
```python
def check_is_dog(animal_type):
    """Check if the animal is a dog, otherwise return False.
    Parameters
    ----------
    animal_type : pandas.Series
        Type of animal
    Returns
    -------
    result : pandas.Series
        Dog or not
    """
    is_cat_dog = animal_type.str.lower().isin(['dog', 'cat'])
    if not is_cat_dog.all():
        print('Found something else but dogs and cats:\n%s',
              animal_type[~is_cat_dog])
        raise RuntimeError("Found pets that are not dogs or cats.")
    is_dog = animal_type.str.lower() == 'dog'
    return is_dog
```

Your task is to replace this print statement and the raised error with a logging call.

Afterwards, run the cells below to check the logging. You can experiment with setting different logging levels.
```python
logger.setLevel(level=logging.CRITICAL)
```

In [51]:
def check_is_dog(animal_type):
    """Check if the animal is a dog, otherwise return False.
    Parameters
    ----------
    animal_type : pandas.Series
        Type of animal
    Returns
    -------
    result : pandas.Series
        Dog or not
    """
    is_cat_dog = animal_type.str.lower().isin(['dog', 'cat'])
    if not is_cat_dog.all():
        logging.error("Found something else but dogs and cats:\n%s", animal_type[~is_cat_dog])
    is_dog = animal_type.str.lower() == 'dog'
    return is_dog

In [52]:
import pandas as pd
#from animal_shelter.features import check_is_dog

# Create some data to test the function logging
cat_dog = pd.Series(['Cat', 'Dog'])
cat_dog_pig = pd.Series(['Cat', 'Dog', 'Pig'])

In [53]:
import logging

# Attach a handler to the root logger so that log messages are displayed
logging.basicConfig(level=logging.CRITICAL)

In [54]:
logger = logging.getLogger()
logger.setLevel(level=logging.INFO)

In [55]:
check_is_dog(cat_dog)

0    False
1     True
dtype: bool

In [56]:
check_is_dog(cat_dog_pig)

ERROR:root:Found something else but dogs and cats:
2    Pig
dtype: object


0    False
1     True
2    False
dtype: bool
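
To see why the level matters, the sketch below checks whether an `ERROR` message would be emitted at two different logger levels. The logger name `level_demo` is hypothetical; `isEnabledFor` is the standard-library method that the logging calls use internally to decide whether to emit a record.

```python
import logging

demo_logger = logging.getLogger('level_demo')  # hypothetical logger name

# At INFO, messages of level INFO and above (including ERROR) get through.
demo_logger.setLevel(logging.INFO)
error_shown_at_info = demo_logger.isEnabledFor(logging.ERROR)

# At CRITICAL, only CRITICAL messages get through, so ERROR is suppressed.
demo_logger.setLevel(logging.CRITICAL)
error_shown_at_critical = demo_logger.isEnabledFor(logging.ERROR)

print(error_shown_at_info, error_shown_at_critical)
```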

<a id='docs'></a>
## Documentation

Documentation will help the users of your code.
Having documentation in your codebase is already good, but we can use Sphinx to make the documentation easier to write.


### <mark>Exercise 1: Sphinx </mark>

Install Sphinx, create a sub-directory `docs` and run `sphinx-quickstart` inside the `docs` directory.

Run:  
```bash
poetry add --dev sphinx
mkdir docs
cd docs
poetry run sphinx-quickstart
```

Create an HTML version of the generated docs by running `make html` inside the docs directory. 

Run:  
```
poetry run make html
```

Open the generated pages in your browser; the HTML pages can be found in `_build/html`.


###  <mark>Exercise 2: Docstrings</mark>

<!-- Add some API documentation using docstrings + autodoc. -->
To automatically include docstrings in the documentation, first edit `docs/conf.py`:

```python
extensions = [
    'sphinx.ext.autodoc',
    'sphinx.ext.napoleon'
]
```

<!-- and uncomment
```python
import os
import sys
sys.path.insert(0, os.path.abspath('../src'))
```
 -->
Then extend the end of the `docs/index.rst` file with:

```
API
---

data module
===========

.. automodule:: animal_shelter.data
   :members:
   :undoc-members:
   :show-inheritance:

features module
===============

.. automodule:: animal_shelter.features
   :members:
   :undoc-members:
   :show-inheritance:
``` 

Recreate the docs by running 
```
poetry run make clean
poetry run make html
```
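
The `sphinx.ext.napoleon` extension lets autodoc understand NumPy-style and Google-style docstrings. As a sketch of what a NumPy-style docstring could look like, the hypothetical function below has a one-line summary followed by `Parameters` and `Returns` sections, each separated by a blank line. `is_adult` is made up for this example and is not part of the package.

```python
def is_adult(age_in_years: float, threshold: float = 1.0) -> bool:
    """Check whether an animal counts as an adult.

    Parameters
    ----------
    age_in_years : float
        Age of the animal in years.
    threshold : float, optional
        Minimum age to count as an adult, by default 1.0.

    Returns
    -------
    bool
        True if the animal is at least ``threshold`` years old.
    """
    return age_in_years >= threshold
```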

###  <mark>Exercise 3: ReadTheDocs theme</mark>

Change the theme to the ReadTheDocs theme. Run:
```bash
poetry add --dev sphinx-rtd-theme
```
Then set the theme in `docs/conf.py`:
```python

html_theme = 'sphinx_rtd_theme'
```

Recreate the docs by running 
```
poetry run make clean
poetry run make html
```

<a id='build'></a>
## Building a Package

Use Poetry to build the package and inspect the contents of the artefacts (the generated files).

```bash
poetry build
```
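
By default `poetry build` writes a source distribution (`.tar.gz`) and a wheel (`.whl`) to the `dist/` directory. A wheel is a zip archive, so both artefacts can be inspected with the standard library. The helper below is a minimal sketch with a hypothetical name, `list_artefact_contents`, not a Poetry feature.

```python
import tarfile
import zipfile
from pathlib import Path


def list_artefact_contents(path: Path) -> list[str]:
    """Return the file names stored in a wheel or source distribution."""
    if path.suffix == '.whl':
        # A wheel is a regular zip archive.
        with zipfile.ZipFile(path) as wheel:
            return wheel.namelist()
    if path.name.endswith('.tar.gz'):
        with tarfile.open(path, 'r:gz') as sdist:
            return sdist.getnames()
    raise ValueError(f'Unsupported artefact type: {path.name}')


dist_dir = Path('dist')
if dist_dir.is_dir():
    for artefact in sorted(dist_dir.iterdir()):
        if artefact.suffix == '.whl' or artefact.name.endswith('.tar.gz'):
            print(artefact.name)
            for member in list_artefact_contents(artefact):
                print('   ', member)
```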