Combinatorial Testing
=====================

Modern software systems are complex and often support many configurations. An application may target several operating systems, run on different kinds of hardware, and render at multiple resolutions, and it can also be in many different internal states.
Each of these parameters can change the system's behavior on its own or in combination with others, so combinations of parameter values must be tested to catch errors that simple, one-parameter-at-a-time tests would miss.

A mobile application is a good example: it can run on many types of phone and in many possible states. An example set of parameters {cite}`combinatorialexample` for a mobile application is:

| Parameter   | Options                       |
|-------------|-------------------------------|
| Orientation | portrait, landscape           |
| OS          | iOS, Android                  |
| Screen Size | 1080x1920, 750x1334, 720x1280 |

Using those parameters, we can calculate the number of possible parameter combinations by multiplying the number of options for each parameter:

```
    Number of configuration combinations = 2 * 2 * 3 = 12 combinations of parameters
```
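The same count can be checked by enumerating the configurations. The snippet below is a minimal sketch that uses only the standard library; the variable names `parameters` and `all_configurations` are chosen here for illustration.

```python
from itertools import product

# Parameters and options from the table above.
parameters = {
    "Orientation": ["portrait", "landscape"],
    "OS": ["iOS", "Android"],
    "Screen Size": ["1080x1920", "750x1334", "720x1280"],
}

# Each exhaustive configuration is one element of the Cartesian product.
all_configurations = list(product(*parameters.values()))

print(len(all_configurations))  # 2 * 2 * 3 = 12
for configuration in all_configurations:
    print(dict(zip(parameters, configuration)))
```

Adding one more parameter with `k` options multiplies the length of `all_configurations` by `k`.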

Even with few parameters and options, this number is already sizable. As the number of parameters grows, the number of possible combinations grows exponentially, which makes exhaustive testing impractical under the time and budget constraints common in software projects.

One way to overcome those limitations is Combinatorial Testing, a software testing method that, given a set of input parameters with discrete values, tests only a selected subset of their combinations. The combinations are generated by systematically covering all t-way interactions between parameters, where "t" is called the *degree of interaction*. The resulting set of combinations is called a *covering array*.

The most popular *degree of interaction* in combinatorial testing is t = 2. It generates combinations in which every pair of values from any two parameters appears together in at least one test. This type of combinatorial testing is called *Pair Wise testing*. Using the previous example, a *covering array* generated by this technique would look like this:

| Orientation | Screen Size | OS      |
|-------------|-------------|---------|
| landscape   | 1080x1920   | iOS     |
| landscape   | 750x1334    | Android |
| landscape   | 720x1280    | Android |
| portrait    | 1080x1920   | Android |
| portrait    | 750x1334    | iOS     |
| portrait    | 720x1280    | iOS     |

As we can see, *Pair Wise testing* needs only 6 combinations, half of the 12 required by the brute-force approach. It lowers the cost of testing because it uses only a subset of the possible parameter combinations while still covering every pairwise interaction.
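To check that six rows really are enough, the sketch below verifies pairwise coverage of the covering array shown above, writing the OS options as iOS and Android as in the parameter table. The helper name `uncovered_pairs` is hypothetical and not part of any library.

```python
from itertools import combinations, product

parameters = {
    "Orientation": ["portrait", "landscape"],
    "Size": ["1080x1920", "750x1334", "720x1280"],
    "OS": ["iOS", "Android"],
}

# The six rows of the covering array, in (Orientation, Size, OS) order.
covering_array = [
    ("landscape", "1080x1920", "iOS"),
    ("landscape", "750x1334", "Android"),
    ("landscape", "720x1280", "Android"),
    ("portrait", "1080x1920", "Android"),
    ("portrait", "750x1334", "iOS"),
    ("portrait", "720x1280", "iOS"),
]

def uncovered_pairs(parameters, rows):
    """Return every (parameter pair, value pair) that no row contains."""
    names = list(parameters)
    missing = []
    for i, j in combinations(range(len(names)), 2):
        seen = {(row[i], row[j]) for row in rows}
        for pair in product(parameters[names[i]], parameters[names[j]]):
            if pair not in seen:
                missing.append(((names[i], names[j]), pair))
    return missing

# An empty list means every pair of values appears in at least one row.
print(uncovered_pairs(parameters, covering_array))
```

Dropping any row from `covering_array` leaves at least one pair uncovered, which is a quick way to see why the array cannot shrink below six rows: Size and OS alone have 3 * 2 = 6 value pairs.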

Combinatorial testing is also simple to apply because it is based on the specification: it is enough to list a system's parameters and their possible values, and a combinatorial testing tool will generate a *covering array* to test the system.

## Choosing a degree of interaction
When constructing a combinatorial test suite, we have to specify the input parameters and their values, but there is one more important setting to think about: the degree of interaction. A degree of interaction t means that we want to test the t-way interactions of our parameters, that is, every combination of values for every group of t parameters.

The value chosen for "t" affects both the computational cost of generating the inputs and the fault coverage achieved. Increasing "t" increases fault coverage, but it also increases the cost of generating and executing the tests, and beyond a certain point a larger "t" yields almost no additional coverage.
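To get a feeling for how the cost grows with "t", the sketch below counts the t-way value combinations that a covering array must contain. It assumes a hypothetical system with 10 parameters of 4 values each; the function name `interactions_to_cover` is made up for this illustration.

```python
from math import comb

def interactions_to_cover(num_parameters, values_per_parameter, t):
    """Count the t-way value combinations a covering array must contain.

    There are C(n, t) ways to pick t parameters, and each pick has
    v ** t combinations of values.
    """
    return comb(num_parameters, t) * values_per_parameter ** t

# Hypothetical system: 10 parameters with 4 values each.
for t in range(1, 5):
    print(t, interactions_to_cover(10, 4, t))
```

For this hypothetical system the count goes from 720 at t = 2 to 53760 at t = 4. A single test covers only `comb(10, t)` of these combinations, and at least `4 ** t` tests are always needed, so the suite size rises quickly with "t".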

So, you would want to choose a degree of interaction that gives you an appropriate level of confidence without making testing too costly. The creators of ACTS, a popular tool for combinatorial testing, have researched the relation between the degree of interaction and software failures. They found that most of the bugs studied were caused by an interaction of at most 6 different parameters {cite}`actscombinatorial`:

```{figure} ../assets/interactions_chart.png
---
name: my-figure
---
Most failures are triggered by one or two parameters interacting, with progressively fewer by 3, 4, or more {cite}`actscombinatorial`.
```

## Hands On: Combinatorial Testing with Covertable and Pytest

Now that you understand what combinatorial testing is, let's see how it is done in practice. This tutorial uses [Covertable](https://github.com/walkframe/covertable/blob/master/python/README.rst), a tool for generating combinatorial parameters, and [Pytest](https://github.com/pytest-dev/pytest), a testing library for Python.

### Motivation
Let's start by looking at the file `code.py`, which contains the code we want to test:

```python
def important_function(pressure, volume, velocity, low_fuel):
    if pressure < 10:
        if volume > 300:
            if velocity == 5:
                do_something_bad()
        elif low_fuel:
            do_something_good()
    else:
        do_something_good()

def do_something_good():
    pass

def do_something_bad():
    raise Exception("A bug happened!")
```


We want to test `important_function`, which takes three integer parameters and a boolean. Looking at the code, we can see that there is a bug (`do_something_bad`) in one of the function's branches. The bug only occurs when three parameters satisfy specific conditions at the same time: `pressure < 10`, `volume > 300`, and `velocity == 5`. The bug is therefore the result of a three-way interaction, because it happens only for a specific combination of those three parameters.
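One way to confirm that the bug depends on all three conditions is to call the function on every combination of a few sample values and record which calls raise. The sketch below repeats the code from `code.py` so that it is self-contained; the sample values are chosen here for illustration only.

```python
from itertools import product

def do_something_good():
    pass

def do_something_bad():
    raise Exception("A bug happened!")

def important_function(pressure, volume, velocity, low_fuel):
    if pressure < 10:
        if volume > 300:
            if velocity == 5:
                do_something_bad()
        elif low_fuel:
            do_something_good()
    else:
        do_something_good()

# Sample values chosen for illustration.
pressures = [5, 10, 15]
volumes = [200, 300, 400]
velocities = [1, 2, 3, 4, 5]
fuel_flags = [True, False]

# Exhaustively try all 3 * 3 * 5 * 2 = 90 combinations and keep the failures.
failing_inputs = []
for args in product(pressures, volumes, velocities, fuel_flags):
    try:
        important_function(*args)
    except Exception:
        failing_inputs.append(args)

print(failing_inputs)
```

Only the inputs with `pressure=5`, `volume=400`, and `velocity=5` fail, regardless of `low_fuel`, so just 2 of the 90 exhaustive combinations expose the bug.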

### Executing test set with Pytest and Covertable
Now that we have looked at the code we want to test, let's see how we can use Covertable and Pytest to generate the combinations of parameters that will expose the bug.

First, we need to install both the pytest and covertable dependencies:

%%bash

pip install pytest
pip install covertable

Now that we have the dependencies installed, let's take a look at the file `test_parameterized.py`, which implements a test using those libraries.

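```python
import pytest
from covertable import make

# Assumes code.py sits next to this test file. Its name shadows the
# standard-library `code` module, so rename the file if this import fails.
from code import important_function

# make() returns one row of values per test case; length=3 requests 3-way coverage.
@pytest.mark.parametrize(["pressure", "volume", "velocity", "low_fuel"],
    make([[5, 10, 15],        # pressure
          [200, 300, 400],    # volume
          [1, 2, 3, 4, 5],    # velocity
          [True, False]],     # low_fuel
         length=3)
)
def test_important_function(pressure, volume, velocity, low_fuel):
    # The test fails if important_function raises for any generated row.
    important_function(pressure, volume, velocity, low_fuel)
```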

As you can see, `pytest` provides the decorator `@pytest.mark.parametrize`, which receives two arguments: a list of parameter names and a list of value lists, one for each test case. The function below the decorator is run once for each value list in the second argument.


In the first argument, we provide the names of the parameters we want to test. In this case, as we want to test `important_function`, we pass `["pressure", "volume", "velocity", "low_fuel"]`.

In the second argument, we pass a list of value lists, each of which is one combination of arguments given to `important_function`. To build it, we use the `covertable` function `make`. This function receives a list containing the possible values of each parameter and returns a list of combinations of them. To be clear, since we passed `[5, 10, 15]` as the first element, those are the possible values for the parameter `pressure`, `[200, 300, 400]` are the possible values for the parameter `volume`, and so on.

`make` also accepts configuration options. Here we pass the keyword argument `length=3`, which is the degree of interaction discussed earlier in this article. Because the bug needs a specific combination of `pressure`, `volume`, and `velocity`, the degree of interaction must be at least 3 to guarantee that the bug is caught. In a real-world scenario, though, we wouldn't know in advance which interaction triggers a bug, so you would choose a degree of interaction that gives you an appropriate level of confidence, as discussed above.
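To make the role of the degree of interaction more concrete, here is a naive greedy generator written with the standard library only. It is a sketch for illustration and is not the algorithm Covertable uses; the names `required_interactions`, `row_interactions`, `greedy_covering_array`, and `hits_bug` are all hypothetical.

```python
from itertools import combinations, product

def required_interactions(factors, t):
    """All t-way interactions, each a frozenset of (column index, value) pairs."""
    required = set()
    for columns in combinations(range(len(factors)), t):
        for values in product(*(factors[c] for c in columns)):
            required.add(frozenset(zip(columns, values)))
    return required

def row_interactions(row, t):
    """The t-way interactions that a single row covers."""
    return {frozenset((c, row[c]) for c in columns)
            for columns in combinations(range(len(row)), t)}

def greedy_covering_array(factors, t):
    """Repeatedly pick the row that covers the most still-uncovered interactions."""
    uncovered = required_interactions(factors, t)
    candidates = list(product(*factors))  # exhaustive rows; fine for small inputs
    rows = []
    while uncovered:
        best = max(candidates,
                   key=lambda row: len(row_interactions(row, t) & uncovered))
        rows.append(best)
        uncovered -= row_interactions(best, t)
    return rows

# Same value lists as in the test file: pressure, volume, velocity, low_fuel.
factors = [[5, 10, 15], [200, 300, 400], [1, 2, 3, 4, 5], [True, False]]

def hits_bug(row):
    pressure, volume, velocity, _ = row
    return pressure < 10 and volume > 300 and velocity == 5

for t in (2, 3):
    rows = greedy_covering_array(factors, t)
    print(t, len(rows), any(hits_bug(row) for row in rows))
```

With t = 3, every combination of `pressure`, `volume`, and `velocity` values must appear in some row, so a row that triggers the bug is guaranteed. With t = 2 there is no such guarantee: the suite is smaller, and whether it happens to contain a failing row depends on the generator.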

Now that we have our test file, we just need to run pytest and see if there's a bug:

%%bash

python -m pytest

When you run this command, pytest reports the parameter combinations used to test the function, and the failing cases reveal the bug triggered by values that satisfy `pressure < 10`, `volume > 300`, and `velocity == 5`.

## References
```{bibliography} ../references.bib
```