# Unit Testing in Python

In this notebook, we'll introduce pytest and go one step further with TDD by creating test suites (i.e. groups of tests) and using some more advanced features.

We will be using [pytest](https://docs.pytest.org/en/latest/) together with [unittest](https://docs.python.org/3/library/unittest.html), since they combine nicely for some parts. Before that, though, we need to introduce a few concepts that we haven't explained yet.

## Basic concepts and terminology

### Mocking

In the [previous notebook](1.%20Introduction_to_testing.ipynb#Unit-testing), we explained that one of the basic requirements for a unit test is isolation. However, in practice our classes might depend on other classes, meaning that our test is somehow "coupled" to the actual implementation of the other class.

This is what the concept of *mocking* tries to fix. To keep it short, mocking is about creating objects that simulate the real objects. More info [here](https://stackoverflow.com/questions/2665812/what-is-mocking).

In pure terms, there are different types of these "fake" objects:

- A **Stub** is just an object that has the same methods as the real one, but does nothing at all when you call them.
- A **Spy** is like a stub, but it also keeps track of which methods were called, how many times, with which parameters, etc.
- A **Mock** can be seen as a spy that also lets you define its behavior, meaning that we can decide what value a function returns (and/or under what circumstances). This is what we'll be using in our tests.
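
The three kinds of fake object can be sketched with a hand-written class and `unittest.mock`, the library used later in this notebook. This is only an illustration: `Mailer`, `StubMailer` and the `send` method are hypothetical names that do not exist in the workshop code.

```python
from unittest import mock


class Mailer:
    """Hypothetical real dependency, used only for this illustration."""

    def send(self, to: str, body: str) -> bool:
        raise RuntimeError("this would send a real e-mail")


class StubMailer:
    """Stub: same method as Mailer, but it does nothing."""

    def send(self, to: str, body: str) -> None:
        return None


stub_result = StubMailer().send("a@example.com", "hi")

# Spy: a Mock records every call, so we can inspect them afterwards.
spy = mock.Mock(spec=Mailer)
spy.send("a@example.com", "hi")
spy_calls = spy.send.call_count
spy.send.assert_called_once_with("a@example.com", "hi")

# Mock: on top of recording calls, we decide what the method returns.
mailer = mock.Mock(spec=Mailer)
mailer.send.return_value = True
sent = mailer.send("a@example.com", "hi")
```

In practice the same `Mock` class covers the spy and the mock roles; the distinction is mostly about how you use it in a test.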

### Test suite

Very often, tests are grouped together in what's called a **test suite**. For instance, when testing a class, you probably want to test each method individually, but since all tests relate to the same class, it still makes sense to keep them together in one place. This is what a test suite is about, and in practice, most of the time you will have one test suite per class.

### Other terms

- **setUp** / **tearDown**: Most frameworks implement some special methods that allow you to set up your context BEFORE every test runs, and to clean up AFTER every test runs. Often these are called setUp / tearDown, although in other libraries or languages they might have different names (e.g. "before" / "after").

- **assertion**: An assertion is just a check. Most frameworks offer shortcuts for different assertions, e.g. "assertTrue(x)" is roughly equivalent to "assert x" (it checks that x is truthy, not that it is equal to True).

- **fixture**: We'll see more on this later, but you can see a fixture as some sort of dependency that you want to reuse across tests. This could be a mock, or a real connection to a database, a file, etc... Sometimes you might also read / hear about "data fixtures", which basically refers to data used for testing purposes.

- **(code) coverage**: This is just a ratio that indicates what percentage of your source code is executed by your tests. It's important to note that 100% coverage doesn't mean you're covering all possible test cases, just that your tests execute every single line of code in your project at least once.


## Libraries

As mentioned, we'll be using the [```unittest``` module](https://docs.python.org/3/library/unittest.html) from Python's standard library, and [pytest](https://docs.pytest.org/en/latest/), which simplifies some routine tasks like creating test suites or running your tests, among other features.

Other than these two, we'll also be using [```pyfakefs```](https://pypi.org/project/pyfakefs/), which allows us to test using a fake file system, and [```pytest-cov```](https://pypi.org/project/pytest-cov/), which is a pytest plugin providing coverage reports.

## Structuring your tests

In most Python packages, you will have a structure that somewhat resembles this:

```
/[PROJECT BASE DIR]
│
├── setup.py
└── /mypackage
    ├── __init__.py
    ├── my_class.py
    ├── ...
    ├── another_class.py
    └── /mysubmodule
        └── subclass.py
```

In general, you want to have your tests in the same project, but not in the same "package", so most of the time you will just see a "tests" folder in the project dir. It is not enforced, but the usual convention is to mirror the structure of the package, with one test file per file in the package, suffixed (or prefixed) with **test**. So, the previous project structure could look like this with tests:

```
/[PROJECT BASE DIR]
│
├── setup.py
├── /mypackage
│   ├── __init__.py
│   ├── my_class.py
│   ├── ...
│   ├── another_class.py
│   └── /mysubmodule
│       ├── __init__.py
│       └── subclass.py
└── /tests
    ├── my_class_test.py
    ├── ...
    ├── another_class_test.py
    └── /mysubmodule
        └── subclass_test.py
```

Now, inside each of the ```*_test.py``` files, you will have a test suite (a group of tests). Here you have different options:

- One function per test. This makes it easy to use some extra pytest features, but can make the preparation for each test a bit harder (e.g. if you need to create fake data, fake objects, etc.).
- One class per test suite, with one method per test. This makes the preparation for each test easier, but limits some features (e.g. parametrized tests).

The style you use is mostly a personal choice, and you can actually mix both in a project.

# Getting started with pytest

For the next steps, make sure that you open a terminal and that you activate the virtual environment.

## Running your first test

For most of the examples and exercises from here, we will need to use a terminal. Also, the different examples and files we'll be working on are located inside the "testing_exercises" folder in the root of this project.

For the first example, we have put the ```add_two``` function and its three tests into a single file, "01_intro/first_test.py". The code in that file looks like this:

```python
def add_two(number: int) -> int:
    """This function just adds 2 to any number it receives"""
    return number + 2

def test_add_two():
    number = 1
    expected_result = 3

    result = add_two(number)

    assert result == expected_result, \
        f"The result of adding 2 to {number} should be {expected_result}, but it was {result}!"

def test_add_two_to_3():
    number = 3
    expected_result = 5

    result = add_two(number)

    assert result == expected_result, \
        f"The result of adding 2 to {number} should be {expected_result}, but it was {result}!"

def test_add_two_to_minus_6():
    number = -6
    expected_result = -4

    result = add_two(number)

    assert result == expected_result, \
        f"The result of adding 2 to {number} should be {expected_result}, but it was {result}!"
```

Now, assuming that you are inside the "testing_exercises" folder in the terminal, you can run the tests like this:

```bash
pytest 01_intro/first_test.py
```

And its output should look more or less like this:

```bash
=============================== test session starts ===============================
platform win32 -- Python 3.7.3, pytest-5.1.2, py-1.8.0, pluggy-0.12.0
rootdir: C:\...\software-engineering-workshop\testing_exercises
plugins: pyfakefs-3.6, cov-2.7.1
collected 3 items               

01_intro\first_test.py ...                                                   [100%]

================================ 3 passed in 0.03s ================================
```

You can also pass a full directory with test files, and by default, pytest will look for all available tests inside the current directory (or any of its subdirectories), meaning that if you had run ```pytest 01_intro``` the result would have been the same.

With pytest, you can write your tests inside a class, so that they are logically grouped together, or you can just have them as functions inside the same file. Each has its pros and cons. Using a class gives room for reusing other libraries such as ```unittest```, which provides a ```TestCase``` class where you can easily have your setUp and tearDown methods. On the other hand, some pytest features, such as parametrized tests, do not work on ```unittest.TestCase``` subclasses.

## Parametrized tests

Consider this class:

```python
from typing import Union

Number = Union[int, float]


class Calculator:
    def sum(self, x: Number, y: Number) -> Number:
        return x + y
```

And a few tests we have created for it:

```python
def test_sum_1_and_2():
    x = 1
    y = 2
    expected_result = 3
    
    result = Calculator().sum(x, y)

    assert result == expected_result, \
        f"{x} + {y} is {expected_result}, but the actual result was {result}"

def test_sum_3_and_0():
    x = 3
    y = 0
    expected_result = 3
    
    result = Calculator().sum(x, y)

    assert result == expected_result, \
        f"{x} + {y} is {expected_result}, but the actual result was {result}"

def test_sum_minus_5_and_2():
    ...  # same structure as above, omitted for brevity

def test_sum_minus_5_and_minus_2():
    ...  # same structure as above, omitted for brevity
```

If you think about it, the only difference between the tests is the values of ```x```, ```y``` and ```expected_result```, so it would be handy to have a function to which we can pass these parameters and that would run the test. A first implementation of that could look like this:

```python
def test_sums():
    sum_data = [
        #x, y, expected_result
        (1, 2, 3),
        (3, 0, 3),
        (-5, 2, -3),
        (-5, -2, -7),
    ]
    
    for x, y, expected_result in sum_data:
        result = Calculator().sum(x, y)

        assert result == expected_result, \
            f"{x} + {y} is {expected_result}, but the actual result was {result}"
```

You can run this example with:

```bash
pytest 02_parametrized_test/single_test.py
```

This approach would work, but has some issues:

- Everything runs as a single test (i.e. you don't have one test for each of the input rows).
- As a consequence, if one of the assertions fails (or one of the calls throws an exception, etc.), the loop stops there and the remaining cases are never tested.
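
The second issue can be seen in a small sketch. The function below mimics a single looping test; its name, the `checked` list and the deliberately wrong case `(1, 4, 6)` are only there for illustration.

```python
checked = []  # remembers which cases were actually exercised


def check_sums_in_a_loop(cases):
    """Mimics a single test that loops over all the cases."""
    for x, y, expected_result in cases:
        checked.append((x, y))
        assert x + y == expected_result, f"{x} + {y} should be {expected_result}"


cases = [
    (1, 2, 3),
    (1, 4, 6),    # deliberately wrong expectation
    (3, 0, 3),    # never reached
    (-5, -2, -7), # never reached
]

try:
    check_sums_in_a_loop(cases)
except AssertionError as error:
    failure_message = str(error)
```

Only the first two cases end up in `checked`: the first failing assertion aborts the loop, so nothing is known about the last two cases.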

That's where parametrized tests come in. They look similar to our ```test_sums``` function, but the for loop is factored out of the test. Pytest provides this feature, so our previous test can be rewritten like this:

```python
import pytest

SUM_FIXTURES = [
    (1, 2, 3),
    (1, 5, 6),
    (-5, 4, -1),
]

@pytest.mark.parametrize("x,y,expected_result", SUM_FIXTURES)
def test_sum(x, y, expected_result):
    result = Calculator().sum(x, y)

    assert result == expected_result, \
        f"{x} + {y} is {expected_result}, but the actual result was {result}"
```

Again, you can run this example with:

```bash
pytest 02_parametrized_test/parametrized_test.py
```

The main benefit of this is that in our output, we get one test for each element inside ```SUM_FIXTURES```, so if one of the cases fails, the rest of the cases are still tested and marked as passed / failed accordingly. Let's say we add ```(1, 4, 6)``` to our ```SUM_FIXTURES```. The test for that case will fail, but the rest will still succeed:

```bash
=============================== test session starts ===============================
platform win32 -- Python 3.7.3, pytest-5.1.2, py-1.8.0, pluggy-0.12.0
rootdir: C:\...\software-engineering-workshop\testing_exercises
plugins: pyfakefs-3.6, cov-2.7.1
collected 4 items

02_parametrized_test\parametrized_test.py ..F.                                 [100%]

==================================== FAILURES =====================================
_________________________________ test_sum[1-4-6] _________________________________

x = 1, y = 4, expected_result = 6

    @pytest.mark.parametrize("x,y,expected_result", SUM_FIXTURES)
    def test_sum(x, y, expected_result):
        result = Calculator().sum(x, y)
    
>       assert result == expected_result, \
            f"{x} + {y} is {expected_result}, but the actual result was {result}"
E       AssertionError: 1 + 4 is 6, but the actual result was 5
E       assert 5 == 6

02_parametrized_test\parametrized_test.py:21: AssertionError
=========================== 1 failed, 3 passed in 0.14s ===========================
```

If we had done it in just one test function instead, we would get 1 failed and 0 passed, and we wouldn't know how many of the test cases are actually failing.
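
If the auto-generated test IDs such as `test_sum[1-4-6]` become hard to read, `pytest.param` lets you attach an explicit `id` to each case. The sketch below assumes the same `Calculator` class as above (repeated in simplified form so the block is self-contained); the `SUM_CASES` name and the id strings are made up.

```python
import pytest


class Calculator:
    def sum(self, x, y):
        return x + y


SUM_CASES = [
    pytest.param(1, 2, 3, id="two-positives"),
    pytest.param(3, 0, 3, id="adding-zero"),
    pytest.param(-5, -2, -7, id="two-negatives"),
]


@pytest.mark.parametrize("x,y,expected_result", SUM_CASES)
def test_sum(x, y, expected_result):
    result = Calculator().sum(x, y)

    assert result == expected_result, \
        f"{x} + {y} is {expected_result}, but the actual result was {result}"
```

With this, a failure is reported under a name like `test_sum[two-negatives]` instead of `test_sum[-5--2--7]`.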

## Fixtures (and mocks)

In most real applications, your classes and functions will have dependencies. One of the key parts to keep your tests maintainable is isolation, meaning that if you have class B, that depends on class A, your tests for B should not depend on the real behavior of A. To do that, we will be introducing Mock objects, as well as another important feature of pytest: test fixtures.

### Scenario

To set the context, consider the two classes below:

```python
class TaxCalculator:
    def __init__(self, tva: float = .21):
        self.tva = tva

    def add_taxes(self, amount: float) -> float:
        return amount * (1 + self.tva)

class Bill:
    def __init__(self, tax_calculator: "TaxCalculator"):
        self.tax_calculator = tax_calculator
        self.amount: float = 0.

    def add(self, amount: float):
        self.amount += amount
    
    @property
    def total(self) -> float:
        """Calculates the total amount, including taxes"""
        return self.tax_calculator.add_taxes(self.amount)
```

When testing it, you might think of doing something like this:

```python
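import pytest


def test_total():
    tax_calculator = TaxCalculator(tva=.1)
    bill = Bill(tax_calculator=tax_calculator)
    bill.add(100.0)
    expected_total = 110.0

    # total is a property, so it is accessed without parentheses
    total = bill.total

    # compare floats with a tolerance: 100.0 * 1.1 is not exactly 110.0
    assert total == pytest.approx(expected_total), \
        f"Total was {total}, but it should be {expected_total}"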
```

While this might work, this approach introduces a few problems:
- Our tests are not isolated anymore: if the TaxCalculator implementation changes, our tests might start breaking. When testing the Bill class, we don't want to depend on the logic implemented inside the TaxCalculator class.
- We will have to create the ```TaxCalculator``` for every test where we want to use it. This means that if the way we create a TaxCalculator object changes, we'll have to change all our tests.

To solve the first problem, we'll be using Mock objects: objects that mimic the behaviour of other objects, but that have no actual logic in them.

To solve the second problem, we'll use fixtures.

Again, you can run this example yourself. If you're inside the ```testing_exercises``` folder, just run:

```bash
pytest 03_fixtures/01_bill_calculator.py

=========================== test session starts ===========================
platform win32 -- Python 3.7.3, pytest-5.1.2, py-1.8.0, pluggy-0.12.0
rootdir: C:\...\software-engineering-workshop\testing_exercises
plugins: pyfakefs-3.6, cov-2.7.1
collected 1 item                                                           

03_fixtures\01_bill_calculator.py .                                  [100%]

============================ 1 passed in 0.05s ============================
```

### Mocks

As we just explained, in order to keep our tests for the ```Bill``` class isolated, we need a way to not depend on the ```TaxCalculator``` class. We can do that by using the [```unittest.mock```](https://docs.python.org/3/library/unittest.mock.html) library. With it, you can create objects that mimic other objects and/or functions.

To create a mock, there are two main classes: ```Mock``` and ```MagicMock```. The difference between the two is that ```MagicMock``` is a subclass of ```Mock``` that comes with default implementations of most of Python's magic methods, such as ```__len__``` or ```__iter__```.
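
A quick way to see the difference is to call `len()` on both kinds of mock. This is a minimal sketch; the variable names are arbitrary.

```python
from unittest import mock

magic = mock.MagicMock()
plain = mock.Mock()

# MagicMock pre-configures the magic methods with simple defaults.
magic_length = len(magic)  # default __len__ returns 0
magic_items = list(magic)  # default __iter__ yields nothing

# A plain Mock does not support them unless you configure them yourself.
try:
    len(plain)
except TypeError:
    plain_supports_len = False
else:
    plain_supports_len = True
```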

#### Creating a mock

To create a mock, we just need to instantiate the MagicMock class:

```python
from unittest import mock

tax_calculator = mock.MagicMock()
```

Now that we have a mock, we can arbitrarily decide what calling one of its methods will return. We'll see different ways of defining these return values.

#### Return a constant value

The simplest way is to have a method always return the same value:

```python
tax_calculator.add_taxes.return_value = 110.0
```

With this, anytime the ```add_taxes``` method of our mock is called, ```110.0``` will be returned, no matter the arguments passed to the function.
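
The sketch below shows this "no matter the arguments" behaviour, and one possible way to make the mock stricter with `mock.create_autospec`, which checks calls against the signature of the real method. The simplified `TaxCalculator` here only exists to keep the block self-contained.

```python
from unittest import mock


class TaxCalculator:
    """Simplified copy of the class from the scenario."""

    def add_taxes(self, amount: float) -> float:
        return amount * 1.21


tax_calculator = mock.MagicMock()
tax_calculator.add_taxes.return_value = 110.0

first = tax_calculator.add_taxes(100.0)
second = tax_calculator.add_taxes(5.0)
third = tax_calculator.add_taxes()  # even a call without arguments is accepted

# An autospecced mock rejects calls that the real method would reject.
strict = mock.create_autospec(TaxCalculator, instance=True)
strict.add_taxes.return_value = 110.0
strict_result = strict.add_taxes(100.0)
try:
    strict.add_taxes()  # the required "amount" argument is missing
except TypeError:
    rejected_bad_call = True
else:
    rejected_bad_call = False
```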

#### Return different values in successive calls

Sometimes, you want the method to return value "x" on the first call, "y" on the second call, etc. For this, we can use the ```side_effect``` attribute, assigning a list of values to it:

```python
tax_calculator.add_taxes.side_effect = [100.0, 200.0, 50.0]
```

This will return 100.0 on the first call to ```tax_calculator.add_taxes```, 200.0 on the second one and 50.0 on the third one. **IMPORTANT**: If the method is called a fourth time, a ```StopIteration``` exception will be raised!
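
A minimal sketch of this behaviour, including what happens when the list runs out. It also shows that an exception instance placed in the list is raised instead of returned; the `flaky` mock and its error message are made up for the example.

```python
from unittest import mock

tax_calculator = mock.MagicMock()
tax_calculator.add_taxes.side_effect = [100.0, 200.0, 50.0]

# Each call consumes the next value, regardless of the argument.
results = [tax_calculator.add_taxes(1.0) for _ in range(3)]

try:
    tax_calculator.add_taxes(1.0)  # fourth call: the list is exhausted
except StopIteration:
    exhausted = True
else:
    exhausted = False

# Exceptions in the list are raised when their turn comes.
flaky = mock.MagicMock()
flaky.add_taxes.side_effect = [110.0, ValueError("service unavailable")]
flaky_first = flaky.add_taxes(100.0)
try:
    flaky.add_taxes(100.0)
except ValueError as error:
    flaky_error = str(error)
```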

#### Calculating the return value dynamically

You can also compute the return value with your own function by assigning it to ```side_effect```. The function receives the same arguments as the call to the mock:

```python
def add_taxes_side_effect(amount: float):
    return amount + 10

tax_calculator.add_taxes.side_effect = add_taxes_side_effect
```
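
Put together as a self-contained sketch (the `low` and `high` variable names are arbitrary), the mock now returns a value that depends on its argument, while still recording the calls:

```python
from unittest import mock


def add_taxes_side_effect(amount: float) -> float:
    return amount + 10


tax_calculator = mock.MagicMock()
tax_calculator.add_taxes.side_effect = add_taxes_side_effect

low = tax_calculator.add_taxes(5.0)
high = tax_calculator.add_taxes(100.0)

# The calls are recorded as usual, even though our function produced the results.
recorded_calls = tax_calculator.add_taxes.call_count
```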

#### Asserting calls to a mock method

Finally, it is common practice in unit tests to assert that the mock methods you expect to be called have actually been called, and that they have been called with the right values. There is a whole range of methods for this, for instance:

- ```assert_called``` checks that the mock has been called at least once, no matter with what parameters.
- ```assert_called_once``` checks that it was called once and only once.
- ```assert_any_call``` checks that at least one of the calls to the mock was made with the arguments we pass.

You can find the full list in the [documentation](https://docs.python.org/3/library/unittest.mock.html#unittest.mock.Mock.assert_called).
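
A short sketch of these assertion methods on a mock that has been called twice. A failing assertion raises `AssertionError`, which is what makes the test fail.

```python
from unittest import mock

tax_calculator = mock.MagicMock()
tax_calculator.add_taxes(100.0)
tax_calculator.add_taxes(250.0)

tax_calculator.add_taxes.assert_called()            # called at least once
tax_calculator.add_taxes.assert_any_call(100.0)     # some call used this argument
tax_calculator.add_taxes.assert_called_with(250.0)  # the LAST call used this argument

# assert_called_once fails here because there were two calls.
try:
    tax_calculator.add_taxes.assert_called_once()
except AssertionError:
    called_exactly_once = False
else:
    called_exactly_once = True

call_count = tax_calculator.add_taxes.call_count
```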

#### Rewriting our test with mocks

Now that we have seen how to create mocks, we can rewrite our test like this:

```python
import pytest
from unittest import mock

def test_total():
    tax_calculator = mock.MagicMock()
    bill = Bill(tax_calculator=tax_calculator)
    amount_to_add = 100.0
    amount_after_taxes = 110.0
    bill.add(amount_to_add)
    tax_calculator.add_taxes.return_value = amount_after_taxes
    
    total = bill.total

    assert total == amount_after_taxes, f"Total was {total}, but it should be {amount_after_taxes}"
    tax_calculator.add_taxes.assert_called_once_with(amount_to_add)
```

You can run the test with:

```bash
pytest 03_fixtures/02_bill_calculator_magic_mock.py
```

### Fixtures

The last piece we'll introduce is fixtures. Fixtures allow you to inject dependencies so that you can have full control over them. They include not only mocks but anything else you might want to reuse across tests. You can find the full documentation for fixtures in the [pytest docs](https://docs.pytest.org/en/latest/fixture.html).

#### Defining fixtures

You can create a fixture with the ```pytest.fixture``` decorator:

```python
import pytest
from unittest import mock

@pytest.fixture
def tax_calculator():
    tax_calculator = mock.MagicMock()
    
    return tax_calculator
```

#### Using fixtures

To use fixtures in a test, you simply need to add an argument to your test with the fixture name:

```python
def test_something(tax_calculator):
    # pytest calls the tax_calculator fixture and passes its return value in
    ...
```

#### Available fixtures

Pytest already has some [built-in fixtures](https://pytest.readthedocs.io/en/latest/builtin.html#builtin-fixtures-function-arguments) (e.g. to capture logs, the standard output, etc.), and the pyfakefs library also adds an ```fs``` fixture that you can use in your tests.

If you want to check what fixtures are available for all tests, you can run this:

```bash
pytest --fixtures
```

Now, some files might define their own fixtures. You can check the fixtures available for a specific file by also providing the name of the file:

```bash
pytest --fixtures my_file.py
```

#### Sharing fixtures across multiple tests

Sometimes, you don't want to define a fixture for only one test file, but want one that you can reuse in any test. To do that, you can create a ```conftest.py``` file with the fixture, and any test that's in the same directory (or in a subdirectory) will be able to use the fixture.

#### Fixture scope

A detail that can be important when writing tests is the scope of fixtures. By default, the fixture is instantiated again for every test case that uses it (i.e. the fixture function is called again). This is OK in many cases, but it's probably not what you want if your fixtures take a long time to create (e.g. if you want to create a Spark session); this is where the scope of fixtures comes into play. You can define the scope of your fixture by passing it to the decorator:

```python
import pytest
from unittest import mock

@pytest.fixture(scope="session")
def my_session_fixture():
    return mock.MagicMock()

@pytest.fixture(scope="class")
def my_class_fixture():
    return mock.MagicMock()
```

The possible values for ```scope``` are:

- **function** (default value): Your fixture will be instantiated every time a test uses it.
- **class**: The fixture is instantiated once per test class. So, if you have classes "ATest" and "BTest" that both use it, your fixture will be instantiated twice.
- **module**: The fixture is instantiated once per module (i.e. per test file).
- **package**: In this case, the fixture is instantiated once per package. Note that a package can have multiple modules.
- **session**: This is the "broadest" level. The fixture will be instantiated just once for the whole test run.

More information:
https://docs.pytest.org/en/latest/fixture.html#scope-sharing-a-fixture-instance-across-tests-in-a-class-module-or-session

#### Adding a tax_calculator fixture

Finally, let's add a tax_calculator fixture to our test.
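
The contents of the example file are not reproduced in this notebook, so what follows is only a sketch of what such a test file could look like, not the actual file. It assumes the `Bill` class from the scenario (repeated here so the block is self-contained); the second test and its name are hypothetical.

```python
from unittest import mock

import pytest


class Bill:
    def __init__(self, tax_calculator):
        self.tax_calculator = tax_calculator
        self.amount: float = 0.

    def add(self, amount: float):
        self.amount += amount

    @property
    def total(self) -> float:
        return self.tax_calculator.add_taxes(self.amount)


@pytest.fixture
def tax_calculator():
    # Default (function) scope: this runs once for every test that uses it.
    print("Instantiating tax_calculator fixture")
    return mock.MagicMock()


def test_total(tax_calculator):
    tax_calculator.add_taxes.return_value = 110.0
    bill = Bill(tax_calculator=tax_calculator)
    bill.add(100.0)

    assert bill.total == 110.0
    tax_calculator.add_taxes.assert_called_once_with(100.0)


def test_total_of_empty_bill(tax_calculator):
    tax_calculator.add_taxes.return_value = 0.0
    bill = Bill(tax_calculator=tax_calculator)

    assert bill.total == 0.0
    tax_calculator.add_taxes.assert_called_once_with(0.0)
```

Because each test receives its own fresh mock, both ```assert_called_once_with``` checks can pass independently.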

You can run the example with:

```bash
pytest 03_fixtures/03_bill_calculator_fixtures.py --capture=no
```

There are two tests using the fixture, so you should see the line "Instantiating tax_calculator fixture" twice. There is another example that uses the ```module``` scope, which you can also run:

```bash
pytest 03_fixtures/04_scoped_fixtures.py --capture=no
```

**IMPORTANT**: Be careful with the scope of the fixtures and the assertions. For instance, if you have a fixture with scope "module" and an assertion like "my_fixture.some_method.assert_called_once()" in more than one test, your assertions might fail, since the call counts for the mock are not reset between tests.
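
The pitfall can be reproduced without pytest by sharing one mock between two functions, which is effectively what a module-scoped fixture does. This is a sketch: `first_test`, `second_test` and `shared_calculator` are illustrative names, and calling `reset_mock()` is just one possible way to work around the problem.

```python
from unittest import mock

# Stands in for a fixture with scope="module": one mock shared by both tests.
shared_calculator = mock.MagicMock()


def first_test(calculator):
    calculator.add_taxes(100.0)
    calculator.add_taxes.assert_called_once()


def second_test(calculator):
    calculator.add_taxes(50.0)
    # Also sees the call made in first_test, so the count is 2 on a shared mock.
    calculator.add_taxes.assert_called_once()


first_test(shared_calculator)
try:
    second_test(shared_calculator)
except AssertionError:
    second_test_failed = True
else:
    second_test_failed = False

# Clearing the recorded calls before a test restores the isolation.
shared_calculator.reset_mock()
second_test(shared_calculator)  # passes now
```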