# 4. Testing

We now have a fully automated script! 🎉👏🏻🦄

😕 Annoyingly, we still cannot guarantee the results are correct... or that there are no bugs.

The next step is to include **tests**. In fact, testing should be a core part of our development process: all of our **reproducible workflows** are analogous to experimental design in the scientific world.

![science](./assets/the_difference.png)

<small> https://xkcd.com/242/ </small>

There are various approaches to test software:
- **Assertions**: 🦄 == 🦄
- **Exceptions**: (within the code) serve as warnings ⚠️
- **Unit tests**: investigate the behaviour of units of code (e.g. functions)
- **Regression tests**: defend against 🐛
- **Integration tests**: ⚙️ checks that the pieces work together as expected
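
As a minimal sketch of the first two approaches, the hypothetical `mean` function below (it is not part of the wine scripts) raises an exception on bad input, and a plain `assert` statement checks its behaviour:

```python
def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    if len(values) == 0:
        # Exception: warn the caller that the input cannot be processed
        raise ValueError('Cannot compute the mean of an empty list')
    return sum(values) / len(values)


# Assertion: state what we expect; an AssertionError is raised otherwise
assert mean([1, 2, 3]) == 2

# The exception can be caught and handled by the calling code
try:
    mean([])
except ValueError as error:
    print(f'Caught: {error}')
```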

## Exceptions 
Remember when you tried to run `02_visualize-wines.py`? It would not work unless you had created a figures directory beforehand.

We can catch this kind of error by adding this piece of code:
```python
# this snippet needs `import os` at the top of the script
try:
    # try to save the figure
    fig.savefig(fname, bbox_inches='tight')
except OSError:
    # wowza! the directory does not exist, so create it and try again
    os.makedirs('figures')
    print('Creating figures directory')
    fig.savefig(fname, bbox_inches='tight')
```
Now our `runall` should work!!! 🎉🎉

```
$ python -m src.runall-wine-analysis
```
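
The same try, fail, create, retry pattern can be tried out without matplotlib. The sketch below uses a hypothetical helper `save_text` and a temporary directory, so it is only an illustration of the idea and not part of the wine scripts:

```python
import os
import tempfile


def save_text(fname, text):
    """Write text to fname, creating the parent directory if it is missing."""
    try:
        with open(fname, 'w') as handle:
            handle.write(text)
    except OSError:
        # FileNotFoundError (a subclass of OSError) is raised when the
        # directory does not exist: create it and try again
        os.makedirs(os.path.dirname(fname))
        print('Creating output directory')
        with open(fname, 'w') as handle:
            handle.write(text)


with tempfile.TemporaryDirectory() as tmp_dir:
    # 'figures' does not exist inside the temporary directory yet
    target = os.path.join(tmp_dir, 'figures', 'notes.txt')
    save_text(target, 'hello')
    print(os.path.exists(target))
```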

## Unit testing
Open `03_country-subset.py` and add the following function:
    
```python 
def get_mean_price(filename):
    """ function to get the mean price of the wines
    rounded to 4 decimals"""
    wine = pd.read_csv(filename)
    mean_price = wine['price'].mean()
    return round(mean_price, 4)
```

We will also modify `get_country` so that it returns the data frame:
```python
def get_country(filename, country):
    # Load table
    wine = pd.read_csv(filename)

    # Use the country name to subset data
    subset_country = wine[wine['country'] == country ].copy()

    # Constructing the fname
    today = datetime.datetime.today().strftime('%Y-%m-%d')
    fname = f'data/processed/{today}-winemag_{country}.csv'

    # Saving the csv
    subset_country.to_csv(fname)
    print(fname)  # print the fname from here

    return(subset_country)  #returns the data frame
```

Now we are going to create our testing suite. 
To run the tests we are going to use **pytest**.
You can find more information in the following resources:
- Pytest usage examples can be found [here](http://doc.pytest.org/en/latest/usage.html)
- Rules for [test discovery](http://doc.pytest.org/en/latest/goodpractices.html)
- Great [posts about testing and pytest](http://pythontesting.net/start-here/)

Now we can create our tests:
```
$ mkdir tests                     # Create tests directory
$ touch tests/__init__.py         # Help find the test
$ touch tests/test_03_country_subset.py # Create our first test
```
⭐ The names of your test scripts must start with `test_` so that pytest can find them.

Now edit <code>test_03_country_subset.py</code> so that it contains:
``` python
import importlib

country = importlib.import_module('.data.03_country-subset', 'src')

# you might need to change the date so that it matches today's date
processed_data = "data/processed/2018-05-09-winemag_Chile.csv"

def test_get_mean_price():
    mean_price = country.get_mean_price(processed_data)
    assert mean_price == 20.7865
```

And you can run it from the shell using:
```
$ pytest
```
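
The test above depends on a dated file that has to exist on disk. One possible way around this, sketched here under the assumption that a tiny hand-made table is an acceptable stand-in for the real data, is to let the test write its own input. The function is copied into the block only to keep the example self-contained; in the project you would import it as shown above.

```python
import os
import tempfile

import pandas as pd


def get_mean_price(filename):
    """Copy of the function under test: mean wine price rounded to 4 decimals."""
    wine = pd.read_csv(filename)
    mean_price = wine['price'].mean()
    return round(mean_price, 4)


def test_get_mean_price_small_file():
    # Made-up data: three wines whose mean price is easy to check by hand
    wines = pd.DataFrame({'country': ['Chile', 'Chile', 'Chile'],
                          'price': [10.0, 20.0, 30.0]})
    with tempfile.TemporaryDirectory() as tmp_dir:
        fname = os.path.join(tmp_dir, 'wines.csv')
        wines.to_csv(fname, index=False)
        assert get_mean_price(fname) == 20.0
```

Because the function name starts with `test_`, pytest will pick it up like any other test.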

### What if you want all the decimal numbers?

``` python
import importlib
import numpy.testing as npt

country = importlib.import_module('.data.03_country-subset', 'src')

processed_data = "data/processed/2018-05-09-winemag_Chile.csv"

def test_get_mean_price():
    mean_price = country.get_mean_price(processed_data)
    assert mean_price == 20.7865
    npt.assert_allclose(mean_price, 20.787, rtol=0.01)
```

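`numpy.testing.assert_allclose` lets you set a tolerance (here a relative tolerance, `rtol`), so the test passes as long as the value is close enough to the expected one.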
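
A small sketch of why a tolerance helps: exact comparisons of floating point numbers can fail because of rounding, while a relative tolerance states how close is close enough. The numbers below are chosen only for illustration.

```python
import numpy.testing as npt

total = 0.1 + 0.2
# The exact comparison fails because of floating point rounding
assert total != 0.3
# With a (very small) relative tolerance the values are considered equal
npt.assert_allclose(total, 0.3, rtol=1e-9)

# 20.7865 is well within 1% of 20.787, so this passes
npt.assert_allclose(20.7865, 20.787, rtol=0.01)

# 20.7865 is more than 1% away from 21.5, so this raises an AssertionError
try:
    npt.assert_allclose(20.7865, 21.5, rtol=0.01)
except AssertionError:
    print('Values differ by more than 1%')
```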

### What else could go wrong?

What if we created a data set and want to make sure that the interim or raw data has not changed, and therefore that our data frames have not changed either?

```python 
import importlib
import pandas.testing as pdt
import pandas as pd

country = importlib.import_module('.data.03_country-subset', 'src')


interim_data = "data/interim/2018-05-09-winemag_priceGBP.csv"
processed_data = "data/processed/2018-05-09-winemag_Chile.csv"

def test_get_country():
    # call the function
    df = country.get_country(interim_data, 'Chile')
    
    # load my previous dataset
    base = pd.read_csv(processed_data)
    
    # check if I am getting a dataframe
    assert isinstance(df, pd.DataFrame)
    assert isinstance(base, pd.DataFrame)
    
    # check that they are the same dataframes
    pdt.assert_frame_equal(df, base)
```    

Pytest tells us which tests passed and which did not:

```python
 {message}
    [left]:  {left}
    [right]: {right}""".format(obj=obj, message=message, left=left, right=right)

        if diff is not None:
            msg += "\n[diff]: {diff}".format(diff=diff)

>       raise AssertionError(msg)
E       AssertionError: DataFrame are different
E
E       DataFrame shape mismatch
E       [left]:  (4472, 6)
E       [right]: (4472, 7)
```

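The shape mismatch tells us what went wrong: the CSV file has one more column than the data frame returned by `get_country`, because `to_csv` also wrote the index as a column.
Let's fix this: open `03_country-subset.py`, reset the index of `subset_country`, and leave the index out when saving to the CSV file.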

```python
def get_country(filename, country):
    # Load table
    wine = pd.read_csv(filename)

    # Use the country name to subset data
    subset_country = wine[wine['country'] == country ].copy()
    subset_country.reset_index(drop=True, inplace=True) 

    # Constructing the fname
    today = datetime.datetime.today().strftime('%Y-%m-%d')
    fname = f'data/processed/{today}-winemag_{country}.csv'

    # Saving the csv
    subset_country.to_csv(fname, index=False)
    print(fname)  # print the fname from here

    return(subset_country)  #returns the data frame
```
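
The extra column is easy to reproduce on a tiny made-up table. The sketch below writes to an in-memory buffer instead of a file, which is an assumption made only to keep the example self-contained:

```python
import io

import pandas as pd
import pandas.testing as pdt

wine = pd.DataFrame({'country': ['Chile', 'Spain', 'Chile'],
                     'price': [10.0, 20.0, 30.0]})
subset = wine[wine['country'] == 'Chile'].copy()  # keeps the original index: 0 and 2

# Saving with the default settings also writes the index as a column
buffer = io.StringIO()
subset.to_csv(buffer)
buffer.seek(0)
reloaded = pd.read_csv(buffer)
print(subset.shape, reloaded.shape)  # one extra column after reloading

# Resetting the index and saving without it makes the round trip lossless
fixed = subset.reset_index(drop=True)
buffer = io.StringIO()
fixed.to_csv(buffer, index=False)
buffer.seek(0)
pdt.assert_frame_equal(fixed, pd.read_csv(buffer))
```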

### See what we did in the previous steps?

We tested each of the functions in our module...
we did *unit testing*!
Notice something in the test functions we just wrote? They all follow the same pattern:
- Set-up: `mean_price = country.get_mean_price(processed_data)`
- Assertions: `assert mean_price == 20.7865`

Now don't forget to commit your code:
```
$ git add .
$ git commit -m "Add unit test suite"
```

## Past as Truth (regression tests)

Regression tests assume that the past is “correct”. They are great for letting developers know when and how a code base has changed. They are not great for letting anyone know why the change occurred. The change between what the code produces now and what it produced before is called a regression.
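
As a rough sketch of the idea (not how any particular tool implements it), a regression check stores a result once and compares every later run against it. The names `summarise` and `matches_baseline` are hypothetical:

```python
import json
import os
import tempfile


def summarise(prices):
    """Hypothetical analysis step: count and mean of a list of prices."""
    return {'n': len(prices), 'mean': round(sum(prices) / len(prices), 4)}


def matches_baseline(result, baseline_file):
    """Compare result with the stored baseline, creating it on first use."""
    if not os.path.exists(baseline_file):
        with open(baseline_file, 'w') as handle:
            json.dump(result, handle)
        return True
    with open(baseline_file) as handle:
        return json.load(handle) == result


with tempfile.TemporaryDirectory() as tmp_dir:
    baseline = os.path.join(tmp_dir, 'baseline.json')
    # First run: the result is stored and treated as the "truth"
    assert matches_baseline(summarise([10, 20, 30]), baseline)
    # Same input and same code: nothing has changed
    assert matches_baseline(summarise([10, 20, 30]), baseline)
    # A different result is flagged, but the check cannot say why it changed
    assert not matches_baseline(summarise([10, 20, 35]), baseline)
```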

**How many times have you tried to run a script or a notebook you found online, only to realize it is broken?**

Let's do some regression testing on the Jupyter notebook using [nbval](https://github.com/computationalmodelling/nbval).

## Testing Jupyter notebooks with nbval

We first need to understand how a Jupyter notebook is stored.
Everything is saved in a JSON-like format (organised as key-value pairs), including the code, the markdown, and the results.

![json](assets/json.jpg)
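
To make this concrete, the sketch below builds a minimal, made-up notebook with a single code cell and reads back the code together with its stored output. The helper `stored_outputs` is hypothetical; the keys follow the layout of version 4 notebook files.

```python
import json

# A made-up notebook with one code cell and its stored output
notebook_text = json.dumps({
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "source": ["print(1 + 1)"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["2\n"]}
            ],
        }
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
})


def stored_outputs(notebook):
    """Return (code, stored text output) pairs for every code cell."""
    pairs = []
    for cell in notebook["cells"]:
        if cell["cell_type"] == "code":
            code = "".join(cell["source"])
            text = "".join("".join(out.get("text", "")) for out in cell["outputs"])
            pairs.append((code, text))
    return pairs


notebook = json.loads(notebook_text)
print(stored_outputs(notebook))
```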

nbval does a *mock run* of the notebook and compares the outputs stored in the saved notebook with the outputs obtained from the mock run.


Try it in your shell:

```
$ pytest --nbval src/data/00_explore-data.ipynb
```

What would happen if you were to have a cell like this one?
```python
import time
print('This notebook was last run on: ' + time.strftime('%d/%m/%y') + ' at: ' + time.strftime('%H:%M:%S'))
```

Since we have a timestamp in the notebook (`time.strftime('%H:%M:%S')`) and the test is performed at a different time, the test will fail.

We can avoid this by providing a sanitizing file `sanitize.cfg`:
<pre><code>
[regex1]
regex: \d{1,2}/\d{1,2}/\d{2,4}
replace: DATE-STAMP
[regex2]
regex: \d{2}:\d{2}:\d{2}
replace: TIME-STAMP
</code></pre>

This will identify date- and time-stamp-like text in your notebook outputs and replace it before the comparison. Then you can rerun the test using:
<pre><code>
$ pytest --nbval demo.ipynb --sanitize-with sanitize.cfg
</code></pre>
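
To see what the two patterns from the sanitizing file match, the sketch below applies them with Python's `re` module to two made-up outputs. It only mimics the substitution step and is not nbval's own code; the helper `sanitize` is hypothetical.

```python
import re

# Same patterns and replacements as in the sanitizing file
patterns = [
    (r'\d{1,2}/\d{1,2}/\d{2,4}', 'DATE-STAMP'),
    (r'\d{2}:\d{2}:\d{2}', 'TIME-STAMP'),
]


def sanitize(text):
    """Replace everything that looks like a date or a time stamp."""
    for regex, replacement in patterns:
        text = re.sub(regex, replacement, text)
    return text


stored = 'This notebook was last run on: 09/05/18 at: 10:15:30'
fresh = 'This notebook was last run on: 21/11/19 at: 18:02:59'

# The raw outputs differ, the sanitized ones do not
assert stored != fresh
assert sanitize(stored) == sanitize(fresh)
print(sanitize(stored))
```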

If you are only interested in verifying whether your notebooks <strong>are broken or not</strong> (that is, checking for exceptions),
you can use `--nbval-lax`, which runs the notebooks and checks for errors but only compares the outputs of cells with a `#NBVAL_CHECK_OUTPUT` marker comment.
<pre><code>
$ pytest --nbval-lax classify-demo.ipynb
</code></pre>



## TDD and tests first

Have you ever heard about test-driven development (TDD)? It is a commonly used practice in which you write the tests for your scripts before, or at the same time as, the actual code.

Some of the advantages include early bug detection, better test coverage, and generally higher code quality. It also helps you ensure you know what your code is doing at all times and makes it easier to add new features without a major risk of breaking things.
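
A minimal sketch of the tests-first order, using a hypothetical `price_in_gbp` function that is not part of the wine scripts: the test is written first and describes the behaviour we want, and the function is then written to make it pass.

```python
# Step 1: write the test first; it describes the behaviour we want
def test_price_in_gbp():
    assert price_in_gbp(10.0, rate=0.8) == 8.0
    assert price_in_gbp(0.0, rate=0.8) == 0.0


# Step 2: write just enough code to make the test pass
def price_in_gbp(price_usd, rate):
    """Convert a price in USD to GBP using the given exchange rate."""
    return round(price_usd * rate, 2)


# Step 3: run the test (pytest would normally do this for you)
test_price_in_gbp()
```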

# Provenance

Imagine you created a beautiful graph and some results that make your research Nobel-worthy. Of course, you ran the workflow multiple times, making minimal changes every single time. But now, 6 months later, you need that **one** plot for your Nobel!!

We can use the package [recipy](https://github.com/recipy/recipy) to log each run of your code to a database, keeping track of the input files, the output files, and the version of your code. You can then query this database to find out how you actually created `graph.png`.

<div class='warn'>Make sure everything is committed to git before carrying on.</div>
<br>
Add the following line to your `runall-wine-analysis` script:

```python
import recipy
```
Run the script again with `python -m src.runall-wine-analysis`.

You can now track the provenance of your project. 

Try `recipy latest` to see the most recent run and `recipy gui` to browse the logged runs.