In [14]:
import numpy as np
import pytest
import torch

## Intuition

Tests are a way for us to ensure that something works as intended. We're incentivized to implement tests and discover sources of error as early in the development cycle as possible so that we can reduce [increasing downstream costs](https://assets.deepsource.io/39ed384/images/blog/cost-of-fixing-bugs/chart.jpg) and wasted time. Once we've designed our tests, we can automatically execute them every time we implement a change to our system and continue to build on them over time. In this lesson, we'll learn how to test machine learning code, data and models to construct a system that we can reliably iterate on.

## Types of tests

There are five major types of tests, which are used at different points in the development cycle:

- Unit tests: tests on individual components that each have a single responsibility (ex. function that filters a list).
- Integration tests: tests on the combined functionality of individual components (ex. data processing).
- System tests: tests on the design of a system for expected outputs given inputs (ex. training, inference, etc.).
- Acceptance tests: tests to verify that requirements have been met, usually referred to as User Acceptance Testing (UAT).
- Regression tests: testing errors we've seen before to ensure new changes don't reintroduce them.

## How should we test?

The framework to use when composing tests is the [Arrange Act Assert methodology](http://wiki.c2.com/?ArrangeActAssert).

- Arrange: set up the different inputs to test on.
- Act: apply the inputs on the component we want to test.
- Assert: confirm that we received the expected output.
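
A minimal sketch of the three steps, using a hypothetical `filter_even` function (not part of this lesson's code) as the component under test:

```python
def filter_even(numbers):
    """Return only the even numbers, preserving order (hypothetical example)."""
    return [n for n in numbers if n % 2 == 0]


def test_filter_even():
    # Arrange: set up the inputs and the expected output.
    numbers = [1, 2, 3, 4, 5, 6]
    expected = [2, 4, 6]

    # Act: run the component under test.
    result = filter_even(numbers)

    # Assert: confirm that we received the expected output.
    assert result == expected
```

Keeping the three steps visually separate makes it easier to see what a failing test was trying to verify.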

## What should we be testing for?

An example:

> When arranging our inputs and asserting our expected outputs, what are some aspects of our inputs and outputs that we should be testing for?

- inputs: data types, format, length, edge cases (min/max, small/large, etc.)
- outputs: data types, formats, exceptions, intermediary and final outputs

## Best practices

Regardless of the framework we use, it's important to strongly tie testing into the development process.

- atomic: when creating unit components, we need to ensure that they have a [single responsibility](https://en.wikipedia.org/wiki/Single-responsibility_principle) so that we can easily test them. If not, we'll need to split them into more granular units.

- compose: when we create new components, we want to compose tests to validate their functionality. It's a great way to ensure reliability and catch errors early on.

- regression: we want to account for new errors we come across with a regression test so we can ensure we don't reintroduce the same errors in the future.

- coverage: we want to ensure that 100% of our codebase has been accounted for. This doesn't mean writing a test for every single line of code but rather accounting for every single line (more on this in the coverage section below).

- automate: in the event we forget to run our tests before committing to a repository, we want to auto run tests for every commit. We'll learn how to do this locally using pre-commit hooks and remotely (i.e. main branch) via GitHub actions in subsequent lessons.

## Test-driven development or Otherwise?

[Test-driven development (TDD)](https://en.wikipedia.org/wiki/Test-driven_development) is the process where you write a test before completely writing the functionality to ensure that tests are always written. This is in contrast to writing functionality first and then composing tests afterwards. Here are my thoughts on this:

- good to write tests as we progress, but it's not the representation of correctness.

- initial time should be spent on design before ever getting into the code or tests.

- using a test as guide doesn't mean that our functionality is error free.

Perfect coverage doesn't mean that our application is error free if those tests aren't meaningful and don't encompass the field of possible inputs, intermediates and outputs. Therefore, we should work towards better design and agility when facing errors, quickly resolving them and writing test cases around them to avoid them next time.


## Pytest

We're going to be using [pytest](https://docs.pytest.org/en/stable/) as our testing framework for its powerful builtin features such as parametrization, fixtures, markers, etc.

### Configuration

By convention, we organize our tests under a `tests` directory, and we can use our `pyproject.toml` file to point pytest at any other test paths as well (note that pytest reads `pyproject.toml` only from version 6.0 onwards). Within those directories, pytest by default collects Python files named `test_*.py` or `*_test.py`, but we can configure it to read other file patterns as well.

```toml
# Pytest
[tool.pytest.ini_options]
testpaths = ["tests"]
python_files = "test_*.py"
```

### Assertions

Simple assertion testing example.

In [None]:
from pathlib import Path

# Creating Directories
BASE_DIR = Path("__file__").parent.absolute()

SRC_DIR = Path.joinpath(BASE_DIR, "src")
TEST_DIR = Path.joinpath(BASE_DIR, "tests")
SRC_DIR.mkdir(parents=True, exist_ok=True)
TEST_DIR.mkdir(parents=True, exist_ok=True)

In [None]:
%%writefile {BASE_DIR}/pyproject.toml
# Pytest
[tool.pytest.ini_options]
testpaths = ["tests"]
python_files = "test_*.py"

Writing /content/pyproject.toml


In [None]:
%%writefile {SRC_DIR}/__init__.py
"init file"

Writing /content/src/__init__.py


In [None]:
%%writefile {SRC_DIR}/fruits.py
def is_crisp(fruit):
    if fruit:
        fruit = fruit.lower()
    if fruit in ["apple", "watermelon", "cherries"]:
        return True
    elif fruit in ["orange", "mango", "strawberry"]:
        return False
    else:
        raise ValueError(f"{fruit} not in known list of fruits.")

Writing /content/src/fruits.py


In [None]:
%%writefile {TEST_DIR}/test_fruits.py
import pytest
import sys 
sys.path.append("/content") # append to import properly.
from src.fruits import is_crisp
def test_is_crisp():
    assert is_crisp(fruit="apple") #  or == True
    assert is_crisp(fruit="Apple")
    assert not is_crisp(fruit="orange")
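    # Each call gets its own block: code after the first raising call
    # inside a `pytest.raises` block is never executed.
    with pytest.raises(ValueError):
        is_crisp(fruit=None)
    with pytest.raises(ValueError):
        is_crisp(fruit="pear")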

Writing /content/tests/test_fruits.py


In [None]:
!pytest                                      # all tests
!pytest tests/                               # tests under a directory
!pytest tests/test_fruits.py                 # tests for a single file
!pytest tests/test_fruits.py::test_is_crisp  # tests for a single function

platform linux -- Python 3.7.14, pytest-3.6.4, py-1.11.0, pluggy-0.7.1
rootdir: /content, inifile:
plugins: typeguard-2.7.1
collected 1 item

tests/test_fruits.py .                                                   [100%]

platform linux -- Python 3.7.14, pytest-3.6.4, py-1.11.0, pluggy-0.7.1
rootdir: /content, inifile:
plugins: typeguard-2.7.1
collected 1 item

tests/test_fruits.py .                                                   [100%]

platform linux -- Python 3.7.14, pytest-3.6.4, py-1.11.0, pluggy-0.7.1
rootdir: /content, inifile:
plugins: typeguard-2.7.1
collected 1 item

tests/test_f

### Classes

See [examples from madewithml repo](https://github.com/GokuMohandas/follow/blob/testing/tests/tagifai/test_data.py) to understand better.

### Interfaces

See [madewithml interface section](https://madewithml.com/courses/mlops/testing/#interfaces).

### Parametrize

So far, in our tests, we've had to create individual assert statements to validate different combinations of inputs and expected outputs. However, there's a bit of redundancy here because the inputs always feed into our functions as arguments and the outputs are compared with our expected outputs. To remove this redundancy, pytest has the [`@pytest.mark.parametrize`](https://docs.pytest.org/en/stable/parametrize.html) decorator which allows us to represent our inputs and outputs as parameters.

Let us create a new python file `test_fruits_parametrize.py` to test it out.

In [None]:
%%writefile {TEST_DIR}/test_fruits_parametrize.py
import pytest
import sys 
sys.path.append("/content") # append to import properly.
from src.fruits import is_crisp

@pytest.mark.parametrize(
    "fruit, crisp",
    [
        ("apple", True),
        ("Apple", True),
        ("orange", False),
    ],
)
def test_is_crisp_parametrize(fruit, crisp):
    assert is_crisp(fruit=fruit) == crisp

@pytest.mark.parametrize(
    "fruit, exception",
    [
        ("pear", ValueError),
    ],
)
def test_is_crisp_exceptions(fruit, exception):
    with pytest.raises(exception):
        is_crisp(fruit=fruit)

Overwriting /content/tests/test_fruits_parametrize.py


The line numbers below count from the first `@pytest.mark.parametrize` decorator (line 1).

- [Line 2]: define the names of the parameters in the decorator, e.g. `"fruit, crisp"` (note that this is one string). The names in this string must match the argument names of the test function defined under the decorator.

- [Lines 3-7]: provide a list of combinations of values for the parameters named on line 2.

- [Line 9]: pass the parameter names to the test function.

- [Line 10]: include the necessary assert statements, which are executed for each combination in the list on lines 3-7.

- [Lines 12-20]: exceptions can be parametrized the same way, by passing the expected exception class as a parameter and asserting with `pytest.raises`.

In [None]:
!pytest tests/test_fruits_parametrize.py  # tests for a single file

platform linux -- Python 3.7.13, pytest-3.6.4, py-1.11.0, pluggy-0.7.1
rootdir: /content, inifile:
plugins: typeguard-2.7.1
collected 4 items

tests/test_fruits_parametrize.py ....                                    [100%]



### Fixtures

[What are the benefits of using fixtures?](https://realpython.com/pytest-python-testing/#fixtures-managing-state-and-dependencies)

One obvious benefit is that fixtures remove the redundancy of redefining the same inputs in every test.

In [None]:
import numpy as np 

def add(nums_list):
    return np.sum(nums_list)


def mul(nums_list):
    return np.prod(nums_list)

def test_add():
    nums_list = [1, 2, 3, 4, 5]
    assert add(nums_list) == 15

def test_mul():
    nums_list = [1, 2, 3, 4, 5]
    assert mul(nums_list) == 120

Notice that we defined `nums_list` twice because we want to test different functions with the **same inputs**. To reduce this redundancy, we can define the input once as a fixture:

In [None]:
import pytest

@pytest.fixture
def sample_nums_list():
    nums_list = [1, 2, 3, 4, 5]
    return nums_list 
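
To complete the picture, here is a sketch of how the tests might request the fixture: pytest matches the argument name `sample_nums_list` to the fixture of the same name and passes in its return value. The functions `add` and `mul` are repeated from the cell above so that the block stands alone.

```python
import numpy as np
import pytest


def add(nums_list):
    return np.sum(nums_list)


def mul(nums_list):
    return np.prod(nums_list)


@pytest.fixture
def sample_nums_list():
    # Shared input, defined once and injected into every test that asks for it.
    return [1, 2, 3, 4, 5]


def test_add(sample_nums_list):
    assert add(sample_nums_list) == 15


def test_mul(sample_nums_list):
    assert mul(sample_nums_list) == 120
```

When run under pytest, neither test constructs the list itself; both simply declare the fixture as an argument.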

## Example Walkthrough

> **The example walkthrough assumes you have a basic understanding of pytest, which you can get from madewithml.com.**

We have two functions, `voc2yolo` and its inverse `yolo2voc`. The former takes in a Pascal-VOC style bounding box and **transforms** it to its equivalent YOLO format, while the latter does the opposite.

More concretely, given the ground truth coordinates of 

```python
voc = [98, 345, 420, 462]
```

we want to transform it to the equivalent YOLO coordinates

```python
yolo = [0.4046875, 0.840625, 0.503125, 0.24375]
```

and vice versa. We also assume the height and width of the image to be `480` and `640` respectively.
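
The arithmetic behind these numbers can be checked by hand. The sketch below assumes, as in the rest of this walkthrough, that VOC is `[xmin, ymin, xmax, ymax]` in pixels and that YOLO is `[x_center, y_center, width, height]` normalized by the image size; the variable names are only for illustration.

```python
height, width = 480, 640
xmin, ymin, xmax, ymax = 98, 345, 420, 462

# Normalize the corner coordinates by the image size.
x0, x1 = xmin / width, xmax / width
y0, y1 = ymin / height, ymax / height

# Box width/height and box center, all in normalized units.
box_w, box_h = x1 - x0, y1 - y0
x_center, y_center = x0 + box_w / 2, y0 + box_h / 2

yolo = [x_center, y_center, box_w, box_h]
print(yolo)  # approximately [0.4046875, 0.840625, 0.503125, 0.24375]
```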

In [None]:
import numpy as np

voc = [98, 345, 420, 462]
yolo = [0.4046875, 0.840625, 0.503125, 0.24375]
voc = np.asarray(voc)
yolo = np.asarray(yolo)

Our transformation code consists mainly of `voc2yolo` and `yolo2voc`, plus two utility functions.

In [None]:
from typing import Union

import numpy as np
import torch

BboxType = Union[np.ndarray, torch.Tensor]

def cast_int_to_float(inputs: BboxType) -> BboxType:
    if isinstance(inputs, torch.Tensor):
        return inputs.float()
    return inputs.astype(np.float32)


def clone(inputs: BboxType) -> BboxType:
    if isinstance(inputs, torch.Tensor):
        return inputs.clone()
    return inputs.copy()

def voc2yolo(inputs: BboxType, height: float, width: float) -> BboxType:
    outputs = clone(inputs)
    outputs = cast_int_to_float(outputs)

    outputs[..., [0, 2]] /= width
    outputs[..., [1, 3]] /= height

    outputs[..., 2] -= outputs[..., 0]
    outputs[..., 3] -= outputs[..., 1]

    outputs[..., 0] += outputs[..., 2] / 2
    outputs[..., 1] += outputs[..., 3] / 2

    return outputs


def yolo2voc(inputs: BboxType, height: float, width: float) -> BboxType:
    outputs = clone(inputs)
    outputs = cast_int_to_float(outputs)

    outputs[..., [0, 2]] *= width
    outputs[..., [1, 3]] *= height

    outputs[..., 0] -= outputs[..., 2] / 2
    outputs[..., 1] -= outputs[..., 3] / 2
    outputs[..., 2] += outputs[..., 0]
    outputs[..., 3] += outputs[..., 1]

    return outputs

We can just test the correctness of the transformation (i.e. when I pass in `voc = [98, 345, 420, 462]` to `voc2yolo`, I expect it to output `yolo = [0.4046875, 0.840625, 0.503125, 0.24375]`). 

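Without a testing framework, we can simply do something like:

In [None]:
# np.allclose compares element-wise within a tolerance; comparing `.all()` on
# both sides would only compare two booleans and pass for almost any output.
assert np.allclose(voc2yolo(voc, height=480, width=640), yolo)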

and for `yolo2voc`, we can repeat:

In [None]:
assert np.allclose(yolo2voc(yolo, height=480, width=640), voc)

This may seem fine, but it is very hard to scale up as you add more transformations. Furthermore, functions like these often have implicit assumptions that need to be rigorously tested as well.

For example, I defined `BboxType = Union[np.ndarray, torch.Tensor]` and type hinted our functions' inputs and outputs with it. In particular, when a user passes in a `torch.Tensor`, I expect the output to be a `torch.Tensor` as well. This is important because many operations on a `torch.Tensor` do not carry over to their `np.ndarray` counterparts. ***Our assert statement above does not check this: for all we know, I could maliciously convert our tensor to a numpy array and the assert would still pass.***

Things get a bit more complicated when I also want to check that the output has the same number of dimensions as the input. For example, if I pass in a 3d-array as input, I expect a 3d-array as output. ***Our assert statement above does not check this: for all we know, I could maliciously add a statement that squeezes the first dimension of our input inside the function body and the assert would still pass.***
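
These two implicit assumptions could be checked with something like the sketch below. `scale_bbox` is a hypothetical stand-in for a transform such as `voc2yolo`, and `check_type_and_shape` is an illustrative helper, not part of the original code. Only NumPy is used to keep the block self-contained; the same `type(...)` and `.shape` checks apply to a `torch.Tensor`.

```python
import numpy as np


def scale_bbox(inputs: np.ndarray, height: float, width: float) -> np.ndarray:
    """Hypothetical stand-in for a transform: normalize pixel coordinates."""
    outputs = inputs.astype(np.float32)
    outputs[..., [0, 2]] /= width
    outputs[..., [1, 3]] /= height
    return outputs


def check_type_and_shape(transform, inputs, **kwargs):
    outputs = transform(inputs, **kwargs)
    # The container type must be preserved (ndarray in, ndarray out).
    assert type(outputs) is type(inputs)
    # The number and size of dimensions must be preserved as well.
    assert outputs.shape == inputs.shape
    return outputs


bbox_1d = np.asarray([98, 345, 420, 462])
bbox_3d = bbox_1d[None, None]  # shape (1, 1, 4)

check_type_and_shape(scale_bbox, bbox_1d, height=480, width=640)
check_type_and_shape(scale_bbox, bbox_3d, height=480, width=640)
```

A transform that silently squeezed a dimension or changed the container type would now fail one of the two assertions.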

Before we write the tests, let's first create the necessary folders and scripts.

In [None]:
from pathlib import Path

# Creating Directories
BASE_DIR = Path("__file__").parent.absolute()

SRC_DIR = Path.joinpath(BASE_DIR, "src")
TEST_DIR = Path.joinpath(BASE_DIR, "tests")
SRC_DIR.mkdir(parents=True, exist_ok=True)
TEST_DIR.mkdir(parents=True, exist_ok=True)

In [None]:
%%writefile {BASE_DIR}/pyproject.toml
# Pytest
[tool.pytest.ini_options]
testpaths = ["tests"]
python_files = "test_*.py"

Writing /content/pyproject.toml


In [None]:
%%writefile {SRC_DIR}/__init__.py
"init file"

Writing /content/src/__init__.py


Here we are writing the transformation functions into `src/bbox_utils.py`.

In [None]:
%%writefile {SRC_DIR}/bbox_utils.py
from typing import Union

import numpy as np
import torch

BboxType = Union[np.ndarray, torch.Tensor]


def cast_int_to_float(inputs: BboxType) -> BboxType:
    if isinstance(inputs, torch.Tensor):
        return inputs.float()
    return inputs.astype(np.float32)


def clone(inputs: BboxType) -> BboxType:
    if isinstance(inputs, torch.Tensor):
        return inputs.clone()
    return inputs.copy()

def voc2yolo(inputs: BboxType, height: float, width: float) -> BboxType:
    outputs = clone(inputs)
    outputs = cast_int_to_float(outputs)

    outputs[..., [0, 2]] /= width
    outputs[..., [1, 3]] /= height

    outputs[..., 2] -= outputs[..., 0]
    outputs[..., 3] -= outputs[..., 1]

    outputs[..., 0] += outputs[..., 2] / 2
    outputs[..., 1] += outputs[..., 3] / 2

    return outputs


def yolo2voc(inputs: BboxType, height: float, width: float) -> BboxType:
    outputs = clone(inputs)
    outputs = cast_int_to_float(outputs)

    outputs[..., [0, 2]] *= width
    outputs[..., [1, 3]] *= height

    outputs[..., 0] -= outputs[..., 2] / 2
    outputs[..., 1] -= outputs[..., 3] / 2
    outputs[..., 2] += outputs[..., 0]
    outputs[..., 3] += outputs[..., 1]

    return outputs

Writing /content/src/bbox_utils.py


First, we create some global constants.

In [11]:
HEIGHT, WIDTH = 480, 640

voc = [98, 345, 420, 462]
yolo = [0.4046875, 0.840625, 0.503125, 0.24375]

# GT_BBOXES are the ground truth bboxes, they are equivalent and all stem from [98, 345, 420, 462]
GT_BBOXES = {
    "voc": voc,
    "yolo": yolo,
}

`GT_BBOXES` is a dictionary that maps each bounding box format name to its ground truth. Note that the ground truths are equivalent in the sense that they all stem from the same box `[98, 345, 420, 462]`, which is in the `xmin, ymin, xmax, ymax` format (i.e. the voc format).

### Using parametrize to allow varying input data types

Then, if we want to test the correctness of the transformation, we first need to ensure that our test function can accept two types of input data type, `np.ndarray` and `torch.Tensor`.

So for one transform function `voc2yolo`, we need to test it twice, one for which the input data type is a `np.ndarray`, the other when it's a `torch.Tensor`.

Fortunately, as we have seen in the pytest documentation, the decorator `pytest.mark.parametrize` does just that.

In [13]:
def list2numpy(list_):
    return np.asarray(list_)

def list2torch(list_):
    return torch.tensor(list_)

@pytest.mark.parametrize("convert_type", [list2numpy, list2torch])
def test_voc2yolo(convert_type):
    """Test conversion from VOC to YOLO."""
    from_bbox = convert_type(GT_BBOXES["voc"])
    to_bbox = voc2yolo(from_bbox, height=HEIGHT, width=WIDTH)

    expected_bbox = convert_type(GT_BBOXES["yolo"])

    np.testing.assert_allclose(np.asarray(to_bbox), np.asarray(expected_bbox), atol=1e-4)

- `lines 1-5` consist of two functions, `list2numpy` and `list2torch`, which convert the ground truth bounding box inputs to either `numpy` or `torch` (note that the ground truth is created as a `list` so that the conversion is easy).

- `line 7` defines the `pytest.mark.parametrize` decorator where
    - the *first argument* is a comma-delimited string of parameter names, this string will be the argument names in the function that follows. Here I named it `"convert_type"`;
    - the *second argument* will define what *values* the *first argument* can take on. This argument has type `List[Tuple[Any]]` or `List[Any]`. In our example, our first argument is actually a function `convert_type` which can be either `list2numpy` or `list2torch`.

- `line 8` is our function name `test_voc2yolo` and as the name suggests, it will test whether our conversion of voc to yolo is correct. Note that the argument is named `convert_type`, corresponding exactly to our *first argument* in the decorator.

- `line 10` is where we apply our argument `convert_type`, a function, to the input `GT_BBOXES["voc"] = [98, 345, 420, 462]`. On the first run, pytest passes in `list2numpy`, which converts the `list` to a `np.ndarray`.

- `line 11` will then convert the input using our `voc2yolo` to its yolo equivalent format.

- `line 13` gets the ground truth for yolo. Note I need to convert them into the same type as the input ground truth.

- `line 15` will then check if our converted value `to_bbox` matches the ground truth for yolo `expected_bbox`.

The process does not stop here: since we passed in two values for `convert_type`, pytest runs the test a second time with `list2torch`.

In [38]:
%%writefile {TEST_DIR}/test_bbox_utils.py
import numpy as np
import pytest
import torch
from typing import Union
import sys
sys.path.append("/content") # append to import properly.
from src.bbox_utils import voc2yolo, yolo2voc, clone

HEIGHT, WIDTH = 480, 640

voc = [98, 345, 420, 462]
yolo = [0.4046875, 0.840625, 0.503125, 0.24375]

# GT_BBOXES are the ground truth bboxes, they are equivalent and all stem from [98, 345, 420, 462]
GT_BBOXES = {
    "voc": voc,
    "yolo": yolo,
}

def list2numpy(list_):
    return np.asarray(list_)

def list2torch(list_):
    return torch.tensor(list_)

@pytest.mark.parametrize("convert_type", [list2numpy, list2torch])
def test_voc2yolo(convert_type):
    """Test conversion from VOC to YOLO."""
    from_bbox = convert_type(GT_BBOXES["voc"])
    to_bbox = voc2yolo(from_bbox, height=HEIGHT, width=WIDTH)

    expected_bbox = convert_type(GT_BBOXES["yolo"])
    
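    # Compare element-wise; calling .all() on both sides would only compare two booleans.
    np.testing.assert_allclose(np.asarray(to_bbox), np.asarray(expected_bbox), atol=1e-4)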

Overwriting /content/tests/test_bbox_utils.py


In [39]:
!pytest -v tests/test_bbox_utils.py -s       # tests for a single file

platform linux -- Python 3.7.14, pytest-3.6.4, py-1.11.0, pluggy-0.7.1 -- /usr/bin/python3
cachedir: .pytest_cache
rootdir: /content, inifile:
plugins: typeguard-2.7.1
collected 2 items

tests/test_bbox_utils.py::test_voc2yolo[list2numpy] PASSED
tests/test_bbox_utils.py::test_voc2yolo[list2torch] PASSED



As we can see, the lines

```bash
tests/test_bbox_utils.py::test_voc2yolo[list2numpy] PASSED
tests/test_bbox_utils.py::test_voc2yolo[list2torch] PASSED
```

mean that the test function ran for both the `numpy` and the `torch` input types, and both gave us the expected output!

### Using parametrize to test for dimensions

Our next step is to test that our transform functions can handle different dimensions. Whether the input is a 3d-tensor, or a 10d-array, all of them should work. Of course, our main goal here is still to test the correctness of the transformation, but bear in mind we need to have a separate test to check the consistency of input and output dimensions (i.e. passing in a 2d-array will result in an output of 2d-array).

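Let's say we want to test that the code works for three input ranks, i.e. that it executes without error for inputs in `[1d, 2d, 3d]`. In the test below this is expressed as `num_dims` in `[0, 1, 2]`, the number of extra leading dimensions added to the 1d bounding box.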

This is not trivial as we need to check for 6 different cases, a result of the cartesian product of

```
[list2numpy, list2torch] x [0, 1, 2] = {(list2numpy, 0), (list2numpy, 1), ...}
```

a total of 6 combinations.

We will continue to leverage `pytest`'s parametrize to test all 6 cases.

There will not be much change besides defining an extra utility function `expand_dim`, which expands the input's dimensions according to the `num_dims` argument.

To get the cartesian product, we simply add one more `parametrize` decorator below the `convert_type` one. It takes `num_dims` as its first argument and `[0, 1, 2]` as its second, indicating that we want to test the aforementioned 3 dimensions. Stacking two parametrize decorators makes pytest run every combination of their values, which is exactly what we want.
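
As a sanity check on the two ideas above, the sketch below enumerates the same six combinations with `itertools.product` and shows what indexing with `(None,) * num_dims` does to an array's shape. It uses NumPy only, and the names `converters`, `num_dims_options` and `combos` are just for illustration.

```python
from itertools import product

import numpy as np

converters = ["list2numpy", "list2torch"]
num_dims_options = [0, 1, 2]

# Stacked parametrize decorators generate the cartesian product of their values.
combos = list(product(converters, num_dims_options))
print(len(combos))  # 6

# Indexing with a tuple of None prepends that many axes of length 1.
bbox = np.asarray([98, 345, 420, 462])
for num_dims in num_dims_options:
    print(num_dims, bbox[(None,) * num_dims].shape)
# 0 (4,)
# 1 (1, 4)
# 2 (1, 1, 4)
```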

In [44]:
%%writefile {TEST_DIR}/test_bbox_utils.py
import numpy as np
import pytest
import torch
from typing import Union
import sys
sys.path.append("/content") # append to import properly.
from src.bbox_utils import voc2yolo, yolo2voc, clone

HEIGHT, WIDTH = 480, 640

voc = [98, 345, 420, 462]
yolo = [0.4046875, 0.840625, 0.503125, 0.24375]

# GT_BBOXES are the ground truth bboxes, they are equivalent and all stem from [98, 345, 420, 462]
GT_BBOXES = {
    "voc": voc,
    "yolo": yolo,
}

def list2numpy(list_):
    return np.asarray(list_)

def list2torch(list_):
    return torch.tensor(list_)

def expand_dim(
    bboxes: Union[np.ndarray, torch.Tensor],
    num_dims: int,
) -> Union[np.ndarray, torch.Tensor]:
    """Expand the dimension of bboxes to num_dims.

    Note:
        np.expand_dims will not work for tuple dim numpy < 1.18.0 which
        is not the version in our cicd.
    """
    bboxes = clone(bboxes)
    return bboxes[(None,) * num_dims]

@pytest.mark.parametrize("convert_type", [list2numpy, list2torch])
@pytest.mark.parametrize("num_dims", [0, 1, 2])
def test_voc2yolo(convert_type, num_dims):
    """Test conversion from VOC to YOLO."""
    from_bbox = convert_type(GT_BBOXES["voc"])
    from_bbox = expand_dim(from_bbox, num_dims)

    to_bbox = voc2yolo(from_bbox, height=HEIGHT, width=WIDTH)

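    expected_bbox = expand_dim(convert_type(GT_BBOXES["yolo"]), num_dims)

    # Compare element-wise; calling .all() on both sides would only compare two booleans.
    np.testing.assert_allclose(np.asarray(to_bbox), np.asarray(expected_bbox), atol=1e-4)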

Overwriting /content/tests/test_bbox_utils.py


In [45]:
!pytest -v tests/test_bbox_utils.py -s       # tests for a single file

platform linux -- Python 3.7.14, pytest-3.6.4, py-1.11.0, pluggy-0.7.1 -- /usr/bin/python3
cachedir: .pytest_cache
rootdir: /content, inifile:
plugins: typeguard-2.7.1
collected 6 items

tests/test_bbox_utils.py::test_voc2yolo[0-list2numpy] PASSED
tests/test_bbox_utils.py::test_voc2yolo[0-list2torch] PASSED
tests/test_bbox_utils.py::test_voc2yolo[1-list2numpy] PASSED
tests/test_bbox_utils.py::test_voc2yolo[1-list2torch] PASSED
tests/test_bbox_utils.py::test_voc2yolo[2-list2numpy] PASSED
tests/test_bbox_utils.py::test_voc2yolo[2-list2torch] PASSED



We see that our results passed

```bash
tests/test_bbox_utils.py::test_voc2yolo[0-list2numpy] PASSED
tests/test_bbox_utils.py::test_voc2yolo[0-list2torch] PASSED
tests/test_bbox_utils.py::test_voc2yolo[1-list2numpy] PASSED
tests/test_bbox_utils.py::test_voc2yolo[1-list2torch] PASSED
tests/test_bbox_utils.py::test_voc2yolo[2-list2numpy] PASSED
tests/test_bbox_utils.py::test_voc2yolo[2-list2torch] PASSED
```

Notice that each line indicates the combination: for example, the first line says `test_voc2yolo[0-list2numpy] PASSED`, which means the test ran with `0` extra dimensions and the `numpy` input type.

### Use fixtures 

Before we move on, let's talk a little about fixtures.

The idea of fixtures is that if you have multiple test functions that take in a "fixed" set of inputs (i.e. `GT_BBOXES`), then we should consider using fixtures.

For example, if I were to write a new test `test_yolo2voc`, I'd need to reference `GT_BBOXES` inside that test again.

For that we can write a function `gt_bboxes` that has `pytest.fixture` as decorator. Subsequently, we can pass `gt_bboxes` to any test functions.

The same result can be achieved with our old method: defining a global constant `GT_BBOXES` works as well. However, if you have multiple tests spread across different scripts (files) and they need the same `GT_BBOXES`, it is better to define a fixture for it and place it in `conftest.py` (more on that in the pytest documentation).
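
For the multi-file case, a `conftest.py` might look like the sketch below. pytest discovers fixtures defined in a `conftest.py` automatically, so test modules in the same directory can request `gt_bboxes` without importing it. The file location `tests/conftest.py` and the constant names `VOC` and `YOLO` are assumptions made for this illustration.

```python
# tests/conftest.py (assumed location)
import pytest

VOC = [98, 345, 420, 462]
YOLO = [0.4046875, 0.840625, 0.503125, 0.24375]


@pytest.fixture(scope="session")
def gt_bboxes():
    """Ground-truth boxes shared by every test module under tests/."""
    return {"voc": VOC, "yolo": YOLO}
```

With `scope="session"` the dictionary is built once per test run rather than once per test.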

For our purpose, we will still package `GT_BBOXES` into a fixture and use it as argument for both `test_voc2yolo` and `test_yolo2voc`.

In [48]:
%%writefile {TEST_DIR}/test_bbox_utils.py
import numpy as np
import pytest
import torch
from typing import Union
import sys
sys.path.append("/content") # append to import properly.
from src.bbox_utils import voc2yolo, yolo2voc, clone

HEIGHT, WIDTH = 480, 640

voc = [98, 345, 420, 462]
yolo = [0.4046875, 0.840625, 0.503125, 0.24375]

# GT_BBOXES are the ground truth bboxes, they are equivalent and all stem from [98, 345, 420, 462]
GT_BBOXES = {
    "voc": voc,
    "yolo": yolo,
}

@pytest.fixture(scope="module")
def gt_bboxes():
    return GT_BBOXES

def list2numpy(list_):
    return np.asarray(list_)

def list2torch(list_):
    return torch.tensor(list_)

def expand_dim(
    bboxes: Union[np.ndarray, torch.Tensor],
    num_dims: int,
) -> Union[np.ndarray, torch.Tensor]:
    """Expand the dimension of bboxes to num_dims.

    Note:
        np.expand_dims will not work for tuple dim numpy < 1.18.0 which
        is not the version in our cicd.
    """
    bboxes = clone(bboxes)
    return bboxes[(None,) * num_dims]

@pytest.mark.parametrize("convert_type", [list2numpy, list2torch])
@pytest.mark.parametrize("num_dims", [0, 1, 2])
def test_voc2yolo(gt_bboxes, convert_type, num_dims):
    """Test conversion from VOC to YOLO."""
    from_bbox = convert_type(gt_bboxes["voc"])
    from_bbox = expand_dim(from_bbox, num_dims)

    to_bbox = voc2yolo(from_bbox, height=HEIGHT, width=WIDTH)

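    expected_bbox = expand_dim(convert_type(gt_bboxes["yolo"]), num_dims)

    # Compare element-wise; calling .all() on both sides would only compare two booleans.
    np.testing.assert_allclose(np.asarray(to_bbox), np.asarray(expected_bbox), atol=1e-4)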

@pytest.mark.parametrize("convert_type", [list2numpy, list2torch])
@pytest.mark.parametrize("num_dims", [0, 1, 2])
def test_yolo2voc(gt_bboxes, convert_type, num_dims):
    """Test conversion from YOLO to VOC."""
    from_bbox = convert_type(gt_bboxes["yolo"])
    from_bbox = expand_dim(from_bbox, num_dims)

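    to_bbox = yolo2voc(from_bbox, height=HEIGHT, width=WIDTH)

    expected_bbox = expand_dim(convert_type(gt_bboxes["voc"]), num_dims)

    # Compare element-wise; calling .all() on both sides would only compare two booleans.
    np.testing.assert_allclose(np.asarray(to_bbox), np.asarray(expected_bbox), atol=1e-4)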

Overwriting /content/tests/test_bbox_utils.py


In [49]:
!pytest -v tests/test_bbox_utils.py -s       # tests for a single file

platform linux -- Python 3.7.14, pytest-3.6.4, py-1.11.0, pluggy-0.7.1 -- /usr/bin/python3
cachedir: .pytest_cache
rootdir: /content, inifile:
plugins: typeguard-2.7.1
collected 12 items

tests/test_bbox_utils.py::test_voc2yolo[0-list2numpy] PASSED
tests/test_bbox_utils.py::test_voc2yolo[0-list2torch] PASSED
tests/test_bbox_utils.py::test_voc2yolo[1-list2numpy] PASSED
tests/test_bbox_utils.py::test_voc2yolo[1-list2torch] PASSED
tests/test_bbox_utils.py::test_voc2yolo[2-list2numpy] PASSED
tests/test_bbox_utils.py::test_voc2yolo[2-list2torch] PASSED
tests/test_bbox_utils.py::test_yolo2voc[0-list2numpy] PASSED
tests/test_bbox_utils.py::test_yolo2voc[0-list2torch] PASSED
tests/test_bbox_utils.py::test_yolo2voc[1-list2numpy] PASSED
tests/test_bbox_utils.py::test_yolo2voc[1-list2torch] PASSED
tests/test_bbox_utils.py::test_yolo2voc[2-list2

### Using parametrize for different transforms

Now, I actually have more than 10 transform functions that convert bounding boxes. This means I would have to write more than 10 test functions: `test_voc2yolo`, `test_yolo2voc`, ...

Most of the code inside these test functions is the same.

We can again leverage parametrize and add a parameter `conversion_name` which takes on values such as `["voc2yolo", "yolo2voc"]`. These values identify which conversion/transform to use.

For that, we need to revamp our `GT_BBOXES` in our fixture `gt_bboxes`.

```python
@pytest.fixture(scope="module")
def gt_bboxes():
    """Return a dictionary of ground truth bboxes."""
    return {
        "voc2yolo": [voc, yolo],
        "yolo2voc": [yolo, voc],
    }
```

where, by construction, each key is the exact name of the transformation function. This means that if our transform function is `voc2yolo`, then our key must be `"voc2yolo"`. Each value is a list whose first element is the ground truth of the input (i.e. voc) and whose second element is the ground truth of the output (i.e. yolo).

In [50]:
%%writefile {TEST_DIR}/test_bbox_utils.py
import numpy as np
import pytest
import torch
from typing import Union
import sys
sys.path.append("/content") # append to import properly.
from src.bbox_utils import voc2yolo, yolo2voc, clone

HEIGHT, WIDTH = 480, 640

voc = [98, 345, 420, 462]
yolo = [0.4046875, 0.840625, 0.503125, 0.24375]

# GT_BBOXES are the ground truth bboxes, they are equivalent and all stem from [98, 345, 420, 462]
GT_BBOXES = {
        "voc2yolo": [voc, yolo],
        "yolo2voc": [yolo, voc],
}

@pytest.fixture(scope="module")
def gt_bboxes():
    """Return a dictionary of ground truth bboxes."""
    return GT_BBOXES

def list2numpy(list_):
    return np.asarray(list_)

def list2torch(list_):
    return torch.tensor(list_)

def expand_dim(
    bboxes: Union[np.ndarray, torch.Tensor],
    num_dims: int,
) -> Union[np.ndarray, torch.Tensor]:
    """Expand the dimension of bboxes to num_dims.

    Note:
        np.expand_dims will not work for tuple dim numpy < 1.18.0 which
        is not the version in our cicd.
    """
    bboxes = clone(bboxes)
    return bboxes[(None,) * num_dims]

@pytest.mark.parametrize("convert_type", [list2numpy, list2torch])
@pytest.mark.parametrize("num_dims", [0, 1, 2])
@pytest.mark.parametrize("conversion_name", ["voc2yolo", "yolo2voc"])
def test_correct_transforms(gt_bboxes, convert_type, num_dims, conversion_name):
    """Test conversion."""

    conversion_fn = globals()[conversion_name]

    from_bbox, expected_bbox = gt_bboxes[conversion_name]

    from_bbox = convert_type(from_bbox)
    from_bbox = expand_dim(from_bbox, num_dims)

    to_bbox = conversion_fn(from_bbox, height=HEIGHT, width=WIDTH)

    expected_bbox = expand_dim(convert_type(expected_bbox), num_dims)
    
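    # Compare element-wise; calling .all() on both sides would only compare two booleans.
    np.testing.assert_allclose(np.asarray(to_bbox), np.asarray(expected_bbox), atol=1e-4)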

Overwriting /content/tests/test_bbox_utils.py


In [51]:
!pytest -v tests/test_bbox_utils.py -s       # tests for a single file

platform linux -- Python 3.7.14, pytest-3.6.4, py-1.11.0, pluggy-0.7.1 -- /usr/bin/python3
cachedir: .pytest_cache
rootdir: /content, inifile:
plugins: typeguard-2.7.1
collected 12 items

tests/test_bbox_utils.py::test_correct_transforms[voc2yolo-0-list2numpy] PASSED
tests/test_bbox_utils.py::test_correct_transforms[voc2yolo-0-list2torch] PASSED
tests/test_bbox_utils.py::test_correct_transforms[voc2yolo-1-list2numpy] PASSED
tests/test_bbox_utils.py::test_correct_transforms[voc2yolo-1-list2torch] PASSED
tests/test_bbox_utils.py::test_correct_transforms[voc2yolo-2-list2numpy] PASSED
tests/test_bbox_utils.py::test_correct_transforms[voc2yolo-2-list2torch] PASSED
tests/test_bbox_utils.py::test_correct_transforms[yolo2voc-0-list2numpy] PASSED
tests/test_bbox_utils.py::test_correct_transforms[yolo2voc-0-list2torch] PASSED
tests/test_bbox_utils.py::test_corr

We see that we achieved the same results.
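
The lookup `globals()[conversion_name]` works only because each key matches a function name imported into the test module. A more explicit alternative, sketched below under stated assumptions, is to keep a dictionary from name to callable and ground truths. The list-based `voc2yolo` and `yolo2voc` here are simplified pure-Python stand-ins written for this illustration, not the array versions in `src/bbox_utils.py`, and `CONVERSIONS` is a hypothetical name.

```python
# Simplified stand-ins for the real transforms, kept dependency-free on purpose.
def voc2yolo(bbox, height, width):
    xmin, ymin, xmax, ymax = bbox
    w, h = (xmax - xmin) / width, (ymax - ymin) / height
    return [xmin / width + w / 2, ymin / height + h / 2, w, h]


def yolo2voc(bbox, height, width):
    xc, yc, w, h = bbox
    xmin, ymin = (xc - w / 2) * width, (yc - h / 2) * height
    return [xmin, ymin, xmin + w * width, ymin + h * height]


VOC = [98, 345, 420, 462]
YOLO = [0.4046875, 0.840625, 0.503125, 0.24375]

# Explicit registry: name -> (function, input ground truth, expected output).
CONVERSIONS = {
    "voc2yolo": (voc2yolo, VOC, YOLO),
    "yolo2voc": (yolo2voc, YOLO, VOC),
}

for name, (fn, from_bbox, expected) in CONVERSIONS.items():
    result = fn(from_bbox, height=480, width=640)
    assert all(abs(r - e) < 1e-4 for r, e in zip(result, expected)), name
```

In a pytest test, the keys of such a dictionary could be passed to `@pytest.mark.parametrize` in place of the hard-coded list of names, so that adding a transform means adding one dictionary entry.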