# Session 12: Unit tests for analytics

When we write code, we want to make sure that it works as expected. One way to do this is using tests. There are different types of tests:

- **Unit tests**: These tests are used to test individual units of code. They are usually small and test a single function or method.
- **Integration tests**: These tests are used to test how different parts of the code work together.
- **End-to-end tests**: These tests are used to test the entire system.

In this session we will focus on unit tests. We will write tests for the analytics functions that we have implemented in the past.

We will use the `pytest` library to write and run the tests. `pytest` is a popular testing framework for Python. It makes it easy to write simple and scalable tests.

But before anything, we need to understand how to deal with things when they go wrong in Python: errors and exceptions.

## Catching errors and exceptions

Whenever we do something wrong in Python, an error or an exception is raised. For example, if we try to divide by zero, a `ZeroDivisionError` is raised. If we try to access an element in a list that does not exist, an `IndexError` is raised.

In [1]:
my_list = [1, 2, 3, 4, 5]

my_list[5]

IndexError: list index out of range

The `IndexError` is raised when you try to access an index that is out of range.

In this case, the index 5 is out of range for the list `my_list` because the list only has indices 0 to 4. To fix this error, you can either access an index within the range of the list or handle the error using a try-except block. Here is an example of how you can handle the error using a try-except block:

In [2]:
try: # try to execute the code
    my_list[5]
except IndexError as e: # catching the exception and storing it in a variable
    print(f"An IndexError occurred: {e}")
else: # if no exception occurred
    print("No exception occurred")

An IndexError occurred: list index out of range


Python's documentation distinguishes two kinds of problems. A syntax error is detected when the code is parsed, before any of it runs, so it has to be fixed in the source. An exception, such as `IndexError` or `ZeroDivisionError`, is raised while the program is running and can be handled with a try-except block. Despite the `Error` suffix in their names, these built-in classes are exceptions.

In summary, an unhandled exception stops the program and prints a traceback, while a handled exception lets the program continue executing.
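
To see how Python itself classifies these problems, the following sketch inspects the built-in class hierarchy. It relies only on built-ins; the string passed to `compile` is a deliberately broken snippet made up for illustration.

```python
# Built-in "errors" such as ZeroDivisionError and IndexError are ordinary
# exception classes, so they can be caught like any other exception.
print(issubclass(ZeroDivisionError, Exception))  # True
print(issubclass(IndexError, Exception))         # True

# KeyboardInterrupt derives from BaseException, not from Exception.
print(issubclass(KeyboardInterrupt, Exception))      # False
print(issubclass(KeyboardInterrupt, BaseException))  # True

# A SyntaxError is detected when code is compiled, before any statement
# in that code runs.
try:
    compile("x = = 1", "<example>", "exec")
except SyntaxError as e:
    caught = type(e).__name__
print(caught)  # SyntaxError
```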

Ideally, we know what can go wrong in our code and handle those situations. This is where unit tests come in: they help us identify these situations and confirm that they are handled.

## General syntax for catching exceptions

The general syntax for catching exceptions in Python is as follows:

```python
try:
    # code that may raise an exception
except ExceptionType as e:
    # code to handle the exception
```

We can catch several exceptions at once by using a tuple of exception types:

```python
try:
    # code that may raise an exception
except (ExceptionType1, ExceptionType2) as e:
    # code to handle the exception
```
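
As a concrete instance of the tuple form, here is a minimal sketch. The helper name `safe_get` is hypothetical and only serves the illustration.

```python
def safe_get(values, index):
    """Return values[index], or None if the lookup is not possible."""
    try:
        return values[index]
    except (IndexError, TypeError) as e:
        # IndexError: index out of range; TypeError: index is not an integer
        print(f"Could not read index {index!r}: {type(e).__name__}")
        return None


print(safe_get([10, 20, 30], 1))    # 20
print(safe_get([10, 20, 30], 7))    # reports IndexError, then prints None
print(safe_get([10, 20, 30], "a"))  # reports TypeError, then prints None
```

Both exception types share one handler here because the recovery is the same in each case. If they needed different handling, separate `except` clauses would be the clearer choice.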

We can also catch almost all exceptions by using the base class `Exception` (it does not cover `KeyboardInterrupt` or `SystemExit`, which derive from `BaseException`):

```python
try:
    # code that may raise an exception
except Exception as e:
    # code to handle the exception
```

Catching `Exception` handles almost every exception, but it is not recommended because it can hide bugs in the code. It is better to catch the specific exceptions that you know how to handle.
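
The sketch below shows how a broad handler can hide a bug. Both function names are hypothetical, and the typo in the first one is intentional.

```python
def average_broad(values):
    try:
        return sum(values) / len(vals)  # bug: 'vals' is a typo for 'values'
    except Exception:
        # The NameError caused by the typo is swallowed here as well
        return 0.0


def average_specific(values):
    try:
        return sum(values) / len(values)
    except ZeroDivisionError:
        # Only the case we know how to handle: an empty list
        return 0.0


print(average_broad([1, 2, 3]))     # 0.0, silently wrong
print(average_specific([1, 2, 3]))  # 2.0
print(average_specific([]))         # 0.0
```

With the specific handler, the same typo would surface immediately as a `NameError` instead of producing a plausible-looking result.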

## Raising exceptions

You can also raise exceptions yourself using the `raise` statement. This is useful when you want to signal that something is wrong: the normal flow of execution stops at that point unless the exception is handled.

Here is an example of how you can raise an exception in Python:

```python
raise Exception("An error occurred")
```

In [3]:
num = 5
denom = 0

if denom == 0:
    raise Exception("Cannot divide by zero, dummy.")
else:
    num / denom


Exception: Cannot divide by zero, dummy.

In [14]:
num = 5
denom = 0
try:
    if denom == 0:
        raise Exception("Cannot divide by zero, dummy.")
    else:
        print(num / denom)
except Exception as e:
    print(e)

Cannot divide by zero, dummy.
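
In practice it is usually more helpful to raise a specific built-in exception type than the generic `Exception`, because callers can then catch exactly that case. A minimal sketch, where `ratio` is a hypothetical helper name:

```python
def ratio(num, denom):
    """Divide num by denom, rejecting a zero denominator explicitly."""
    if denom == 0:
        # ValueError signals that an argument has an unacceptable value
        raise ValueError("denom must not be zero")
    return num / denom


try:
    ratio(5, 0)
except ValueError as e:
    message = str(e)
    print(f"Invalid input: {message}")

print(ratio(5, 2))  # 2.5
```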


So we have seen how to catch exceptions and how to raise exceptions in Python. Now let's see how we can write unit tests for our analytics functions.

## `assert` statement

Before we start writing tests, let's talk about the `assert` statement. The `assert` statement is used to check if an expression is `True`. If the expression is `False`, an `AssertionError` exception is raised.

In [None]:
assert 1+1 == 2, "1+1 should be 2"  # all good

In [None]:
assert 1+1 == 3, "1+1 should be 2"  # this will raise an AssertionError

AssertionError: 1+1 should be 2

The assert statement has the following syntax:

```python
assert expression, message
```

If the expression is `False`, an `AssertionError` is raised and the message is shown alongside it in the traceback.

We are going to use the `assert` statement to write our tests in this session.
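
Analytics functions often return floating-point numbers, and an exact `==` comparison inside an `assert` can fail because of rounding. The sketch below uses `math.isclose` from the standard library to compare within a tolerance; the variable names are only illustrative.

```python
import math

total = 0.1 + 0.2

# Exact comparison fails because of binary floating-point rounding
print(total == 0.3)  # False

# Comparing within a tolerance passes
assert math.isclose(total, 0.3), "0.1 + 0.2 should be close to 0.3"

# The message of a failed assert travels with the AssertionError
try:
    assert total == 0.3, "exact comparison failed"
except AssertionError as e:
    failure_message = str(e)
print(failure_message)  # exact comparison failed
```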

In [13]:
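num, denom = 5, 4  # illustrative values; with denom = 0 this cell would raise ZeroDivisionError before the assert is checked
assert num / denom == 2.5, 'Dum'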

AssertionError: Dum

## Writing tests

To write tests, we create a new file named `test_<module>.py`, because `pytest` automatically collects files whose names match `test_*.py` or `*_test.py`. For example, if we have a module called `analytics.py`, we create a file called `test_analytics.py`.

In this file, we will write functions that start with `test_`. These functions will contain the tests for the functions that we want to test.

For example, if we have a function called `add` in the `analytics.py` module, we can write a test for it like this:

```python
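from analytics import add  # assumes add is defined in analytics.py


def test_add():
    assert add(1, 2) == 3    # typical case
    assert add(0, 0) == 0    # zeros
    assert add(-1, 1) == 0   # opposite signs cancel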

Within the test function, we use the `assert` statement to check if the function `add` works as expected. It's very easy to make a test pass if you use silly assertions, so make sure to test the function with different inputs and think of corner cases to include in your tests.

Also try to test things that should not happen, like passing a string to a function that expects an integer.
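
To illustrate the difference between a weak test and more useful ones, consider the sketch below. The `add` function is defined inline as a stand-in for the one assumed to live in `analytics.py`, and the wrong-type case uses only the try-except pattern covered so far.

```python
def add(a, b):
    # Stand-in for the add function assumed to live in analytics.py
    return a + b


def test_add_weak():
    # Passes, but would also pass for many broken implementations
    assert add(0, 0) == 0


def test_add_corner_cases():
    assert add(-1, 1) == 0                # opposite signs cancel
    assert add(10**20, 1) == 10**20 + 1   # very large integers
    assert add(2, 3) == add(3, 2)         # order should not matter


def test_add_rejects_mixed_types():
    try:
        add(1, "2")
    except TypeError:
        pass  # expected: int + str is not defined
    else:
        raise AssertionError("add(1, '2') should raise TypeError")


# pytest would discover and run these functions automatically; calling
# them by hand here only shows that they pass.
test_add_weak()
test_add_corner_cases()
test_add_rejects_mixed_types()
```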

## Running tests

As mentioned, we will store our functions in a file called `analytics.py` and our tests in a file called `test_analytics.py`.

To run the tests, we need to execute the following command in the terminal:

```bash
pytest test_analytics.py
```

This command will run all the tests in the `test_analytics.py` file.

## Coverage

When we write tests, we want to make sure that we are testing all the code that we have written. We can use the `coverage` library to check the code coverage of our tests.

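The concept of coverage is simple: it tells us which lines of code are executed when we run the tests and reports that share as a percentage. The goal is to have 100% coverage, which means that all the lines of code are executed at least once. Keep in mind that 100% coverage shows that every line ran, not that every result was checked.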

With this information, we can identify which parts of the code are not being tested and write more tests to cover them.
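
As a small illustration of what a coverage gap looks like, the sketch below has a function with two branches and a test that exercises only one of them. The names `classify` and `test_classify_negative` are hypothetical.

```python
def classify(value):
    """Hypothetical function with two branches."""
    if value < 0:
        return "negative"      # executed by the test below
    return "non-negative"      # never executed by the test below


def test_classify_negative():
    assert classify(-1) == "negative"


test_classify_negative()
```

A coverage report for this code would count the final `return` as missed. Adding a second test that calls `classify` with a non-negative number, for example `classify(3)`, would close the gap. Running `coverage report -m` lists the line numbers of missed statements, which makes such gaps easier to locate.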

To check the coverage of our tests, we need to install the `coverage` library:

```bash
pip install coverage
```

Then we can run the following command to check the coverage:

```bash
coverage run -m pytest test_analytics.py
```

This command will run the tests and check the coverage. To see the coverage report, we can run the following command:

```bash
coverage report
```

With this, we are ready to start writing tests for our analytics functions.

## Exercise 1

Build a function called `stats_report` in the `analytics_1.py` script that receives a list of numbers and returns a dictionary with the following statistics:

- `mean`: The average of the numbers.
- `median`: The median of the numbers.
- `std`: The standard deviation of the numbers.

Then, write tests for this function in the `test_analytics_1.py` script. The `test_analytics_1.py` script should call the `stats_report` function from `analytics_1.py` with different inputs and check if the output is correct.

After building the tests, run them using the `pytest` command and check the coverage using the `coverage` library.

```bash
pytest test_analytics_1.py
coverage run -m pytest test_analytics_1.py
coverage report
coverage html
```

After running pytest, you should see the following output:

```bash
======================================================= test session starts =======================================================
platform darwin -- Python 3.10.13, pytest-8.3.4, pluggy-1.5.0
rootdir: /Users/dgh/Desktop/PDA/pda2/class_material
plugins: Faker-24.4.0, typeguard-4.3.0, anyio-4.2.0
collected 4 items                                                                                                                 

test_analytics_1.py ....                                                                                                    [100%]

======================================================== 4 passed in 0.13s ========================================================
```

That means that 4 tests passed. Since we have 4 tests, we have 100% of tests passing.

But what about the coverage? Let's check it:

```bash
Name                  Stmts   Miss  Cover
-----------------------------------------
analytics_1.py           10      2    80%
test_analytics_1.py      21      1    95%
-----------------------------------------
TOTAL                    31      3    90%
```

Total coverage is 90%, and `analytics_1.py` itself is at 80%: 2 of its 10 statements are never executed by the tests. That means that we are not testing all the lines of code in our scripts. We should write more tests to cover the missing lines.

With the `coverage html` command, we can generate a report in HTML format. This report will show us which lines of code are not being tested. Open the `htmlcov/index.html` file in your browser to see the report.

## Tests that cover for errors

When writing tests, we should also test for errors. We can use `pytest.raises` as a context manager to check that a function raises an exception.

For example, if we have a function called `divide` that divides two numbers, we can write a test like this:

```python
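import pytest

from analytics_1 import divide  # assumes divide is defined in analytics_1.py, as in Exercise 2


def test_divide_by_zero():
    # The test passes only if the code inside the with block raises ZeroDivisionError
    with pytest.raises(ZeroDivisionError):
        divide(1, 0)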
```

In this test, `pytest.raises` checks that the function `divide` raises a `ZeroDivisionError` exception when we try to divide by zero.

If the function raises the exception, the test passes. If the function does not raise the exception, the test fails. Python's `/` operator already raises `ZeroDivisionError` for a zero divisor, so a plain implementation passes this test. For other invalid inputs, we have to raise the exception ourselves in the function to make such a test pass.
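
As a sketch of a case where the function has to raise the exception itself, consider the example below. The `mean` helper and both test names are hypothetical. `pytest.raises` accepts a `match` argument that checks the exception message against a regular expression, and `pytest.approx` compares floats within a tolerance.

```python
import pytest


def mean(values):
    """Hypothetical helper: average of a non-empty list of numbers."""
    if len(values) == 0:
        # Raise explicitly so callers get a clear message
        raise ValueError("values must not be empty")
    return sum(values) / len(values)


def test_mean_of_empty_list_raises():
    # match checks the exception message against a regular expression
    with pytest.raises(ValueError, match="must not be empty"):
        mean([])


def test_mean_of_numbers():
    assert mean([1, 2, 3]) == pytest.approx(2.0)
```

Without the explicit check, `mean([])` would raise `ZeroDivisionError` instead, and the first test would fail. In that sense the test documents which exception the function is expected to raise.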

## Exercise 2

Add a new function called `divide` to the `analytics_1.py` script. This function should receive two numbers and return the result of dividing the first number by the second number.

Then, write tests for this function in the `test_analytics_1.py` script. The `test_analytics_1.py` script should call the `divide` function from `analytics_1.py` with different inputs and check if the output is correct.

## Exercise 3

Create a script that calculates the factorial of a number. The factorial of a number `n` is the product of all positive integers less than or equal to `n`. For example, the factorial of `5` is `5 * 4 * 3 * 2 * 1 = 120`.

Then, create another script with tests for this function. The tests should check if the function calculates the factorial of a number correctly.

Include the following tests:

- Test the factorial of `0`, should return `1`.
- Test the factorial of `1`, should return `1`.
- Test the factorial of `5`, should return `120`.
- Test the factorial of `'a'`, should raise a `TypeError` exception because the input is not an integer.