# 03 - Supercharge package

## Testing

Tests help you determine if your code does what you expected it to do.

There are different types of tests.
The [most important tests](http://slides.com/treycausey/pydata2015#/) for Data Scientists are:
- unit tests that focus on small units of code like functions;
- integration tests for whole systems;
- regression tests to test if software performs the same after changes.

In addition, you probably want to have systems checking data quality and monitoring if your model is still performing as expected.
Those tests won't be discussed here: we'll only show unit tests.

[Unit testing](https://jeffknupp.com/blog/2013/12/09/improve-your-python-understanding-unit-testing/) is as easy as calling your function and `assert`-ing that it behaves as expected:

In [None]:
from animal_shelter.data import convert_camel_case

result = convert_camel_case('CamelCase')
expected = 'camel_case'  # TODO: Adjust this to see what happens.

assert result == expected  # Check if it's true!

We `expected` something and compared it to the `result` our function returned. It's as easy as that.

Python unit tests generally go in a folder called `tests/` and contain modules starting with `test_`.
These modules in turn contain functions starting with `test_` and classes starting with `Test`.
It's tests all the way down.
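
As a rough illustration of that layout, a test module could look like the sketch below. The file name `tests/test_example.py` and the helper `to_snake_case()` are hypothetical stand-ins, not the project's actual code. The helper is defined inline only to keep the example self-contained; a real test module would import the function under test instead.

```python
# Hypothetical contents of tests/test_example.py.
import re


def to_snake_case(name: str) -> str:
    """Hypothetical stand-in for the function under test."""
    # Insert an underscore before every capital letter except the first one.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


# A test function: the name starts with `test_`, so pytest collects it.
def test_to_snake_case_converts_camel_case():
    assert to_snake_case("CamelCase") == "camel_case"


# A test class: the name starts with `Test`, its methods start with `test_`.
class TestToSnakeCase:
    def test_leaves_lowercase_untouched(self):
        assert to_snake_case("already") == "already"

    def test_handles_single_word(self):
        assert to_snake_case("Name") == "name"
```

Grouping related tests in a class is optional; plain functions are usually enough for small modules.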

Our project has a folder called `tests/` and the modules `test_data.py` and `test_features.py` contain unit tests to check the functions that you've made. 
Check them out!

Note that most functions in `test_features.py` don't use `assert`, but use the `pandas` utility function `assert_series_equal()` to check if `Series` are the same.
Many libraries have utility functions to make writing tests easier.

Run the unit tests using [`pytest`](https://docs.pytest.org/en/latest/):

```bash
$ poetry run pytest tests/
```

You'll get some error messages because `test_is_dog()` has not been implemented yet!

> #### Exercise - Testing
> 
> Create a test case to check if `is_dog()` is implemented correctly.
> Make sure that `pytest` doesn't return any errors.
>
> Bonus: `is_dog` raises an exception if something other than cats or dogs is encountered.
> Test that this exception is raised if invalid input is given.

## Logging

Logging helps you understand what's happening while your code is being run.

A common mistake is to *configure* logging inside a library.
This can cause problems if the application using your library also wants to configure logging.
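
A minimal sketch of that split, assuming a hypothetical library function `load_rows()` and a hard-coded logger name `my_library` (a real module would typically use `logging.getLogger(__name__)`): the library only *creates* a logger and emits messages, while the application decides whether and how they are shown.

```python
import logging

# Library side: create a module-level logger, but do not configure it.
logger = logging.getLogger("my_library")  # hypothetical name; usually __name__
# Optional: stay silent unless the application configures logging.
logger.addHandler(logging.NullHandler())


def load_rows(rows: list[str]) -> list[str]:
    """Hypothetical library function that reports what it did."""
    logger.info("Read %d rows", len(rows))
    return rows


# Application side: the only place where logging is configured.
if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    load_rows(["a", "b", "c"])
```

Because the library never calls `logging.basicConfig()`, an application that imports it stays free to choose its own level, format and handlers.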

> #### Exercise - Logging
>
> The function `check_is_dog()` has a print statement. Replace it with a logging call. Make sure that your logging level is the right one.

In [1]:
import logging

logging.basicConfig(level=logging.INFO)

In [2]:
from animal_shelter.data import load_data
from animal_shelter.features import check_is_dog

animal_outcomes = load_data('../data/train.csv')
check_is_dog(animal_outcomes["animal_type"])

INFO:animal_shelter.data:Reading data from ../data/train.csv
INFO:animal_shelter.data:Read 26729 rows


0         True
1        False
2         True
3        False
4         True
         ...  
26724    False
26725    False
26726     True
26727    False
26728    False
Name: animal_type, Length: 26729, dtype: bool

## Type hinting & checking


Type hints make it much easier to statically reason about your code.
Signalling what types are used by your code can serve as documentation, help linters & IDEs, and help catch errors by checking the hints.

Type hinting in Python is not necessarily an all-or-nothing choice: you can add type hints gradually.
A good practice is to at least add type hints to the public functionality of your library.

Let's discuss some examples.

`-> None` tells us that this function returns `None`.

In [None]:
def p() -> None: 
    print('hello')

?p

p()

The function below accepts an argument `names` that should consist of a list with strings in it.

In [4]:
def greet_all(names: list[str]) -> None: 
    for name in names:
        print('Hello ' + name)

?greet_all


greet_all(['Jane', 'Mike'])

Hello Jane
Hello Mike


Type hints are *hints*.
You can still disregard them:

In [None]:
greet_all(('Jane', 'Mike'))
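
To make the "hints, not checks" point concrete, here is a small sketch. It uses a variant of `greet_all()` that returns the greetings instead of printing them, purely so the result can be inspected; that change is an assumption made for illustration. The hints end up as metadata on the function, and the interpreter does not enforce them at call time. A static checker such as mypy would flag the tuple call.

```python
def greet_all(names: list[str]) -> list[str]:
    """Variant of the function above that returns the greetings."""
    return ["Hello " + name for name in names]


# The hints are stored as plain metadata on the function object...
print(greet_all.__annotations__)

# ...and Python does not check them when the function is called.
print(greet_all(("Jane", "Mike")))  # a tuple, not a list: still runs
```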

Duck typing is supported: you can signal that `names` just needs to be something to iterate over:

In [None]:
from typing import Iterable

def greet_all(names: Iterable[str]) -> None:
    for name in names: 
        print('Hello ' + name)
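
As a sketch of what the `Iterable[str]` hint allows, the hypothetical helper `collect_greetings()` below accepts a list, a tuple and a generator alike. It imports `Iterable` from `collections.abc`, which can be subscripted on Python 3.9 and later; `typing.Iterable` as used above works as well.

```python
from collections.abc import Iterable


def collect_greetings(names: Iterable[str]) -> list[str]:
    """Hypothetical helper: build one greeting per name."""
    return ["Hello " + name for name in names]


# Anything that can be iterated over satisfies the hint.
print(collect_greetings(["Jane", "Mike"]))                  # list
print(collect_greetings(("Jane", "Mike")))                  # tuple
print(collect_greetings(n.title() for n in ["jane", "mike"]))  # generator
```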

> #### Exercise: mypy
>
> * Add type hints to modules `data.py` and `features.py`
> * Make sure that mypy doesn't return any errors when you run `poetry run mypy src/`

## Documentation

Documentation will help the users of your code.
Having documentation in your codebase is already good, but we can use Sphinx to make the documentation easier to read.
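
Documentation in the codebase usually starts with docstrings. The sketch below shows one possible style, using Sphinx's reStructuredText field lists (`:param:`, `:returns:`, `:raises:`). The function `is_adult()` and its threshold are hypothetical and only serve as an illustration.

```python
def is_adult(age_in_years: float, threshold: float = 1.0) -> bool:
    """Check whether an animal counts as an adult.

    Hypothetical example function; the name and threshold are illustrative.

    :param age_in_years: Age of the animal in years.
    :param threshold: Minimum age (in years) to count as an adult.
    :returns: ``True`` if the animal is at least ``threshold`` years old.
    :raises ValueError: If ``age_in_years`` is negative.
    """
    if age_in_years < 0:
        raise ValueError("age_in_years must be non-negative")
    return age_in_years >= threshold


# The docstring is available at runtime; Sphinx's autodoc extension
# reads the same text when it generates API pages.
print(is_adult.__doc__)
```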


> #### Exercise: Sphinx
> * Install Sphinx, create a sub-directory `docs` and run `sphinx-quickstart` inside the docs directory.
> * Create an HTML version of the generated docs by running `make html` inside the docs directory. Open the generated pages in your browser.
> * Add a new page by creating an additional RST file and add a reference in your table of contents.
> * Change the theme to the ReadTheDocs theme. (Bonus)
> * Add some API documentation using docstrings + autodoc. (Bonus)