# Basic TDD, Python, and py.test introduction dojo - Weather Kata

## About this session
This notebook is intended to facilitate a Dojo on a very simple Kata for beginners to Python and py.test.

**DISCLAIMER**:
This content introduces just enough basic Python to solve the proposed kata, and the resulting approach might not be the most appropriate for many real-world cases.
Some of the code used here could be improved with more adequate tools and libraries. For example, the data munging could be improved a lot with libraries like `numpy` or `pandas`. However, showing how to use these, which are not plain basic Python, is not the goal of this session.

The goal of this session is to show TDD techniques, using pretty much standard and basic Python, explaining just what's required to solve the proposed kata.

This session is driven in many small steps, so the audience is expected to do just what each step requires, without trying to get ahead for extra credit.
This is not a competition or a contest, and there is no prize.

## Requirements
The audience setup should be based on Python 3.x, and it should include `py.test` and `pytest-benchmark`.
Some level of programming knowledge might be required.
Since Python is pretty readable and clear, the code should be easy to understand.

## Details about contents in this document
Some contents in this document, like the two following cells, are not part of the content.
These are tricks and tips to make the execution of the code easier.

Also, some of the cells containing code start with `%%...` lines.
Just omit these when copying code.

In [1]:
%%javascript

IPython.keyboard_manager.command_shortcuts.add_shortcut('9', {
    help: 'Clear all output',               // This text will show up on the help page (CTRL-M h or ESC h)
    handler: function (event) {             // Function that gets invoked
        if (IPython.notebook.mode == 'command') {
            IPython.notebook.clear_all_output();
            return false;
        }
        return true;
    }
});

<IPython.core.display.Javascript object>

In [2]:
%%bash
rm *.py*

# Software testing

Testing is a software development technique used to ensure the software being built meets certain requirements and provides the expected features.
By writing tests, programmers ensure code runs as expected.
In many development methodologies, tests are run whenever code needs to be checked.

When a bug is identified and can be reproduced, it's pretty usual to write a test reproducing it, to ensure future modifications will not cause the bug to reappear.

Tests can also be considered part of the code documentation, allowing contributors and developers to learn how the software can be used.
In some cases, tests might be even better, and usually more up to date, than proper documentation.
Some approaches skip writing this documentation, and even code comments, and the resulting code is still easy to use by just reviewing the tests.

Writing tests can be pretty difficult, since the testing code should not be a copy or version of the working code.
For example, a test for a function adding two numbers should never do the actual addition itself. Instead, it should pass two numbers to the function and expect the number which is the result of the addition:

```python
def addition(number1, number2):
    return number1 + number2

def bad_test_addition():
    assert addition(2,3) == 2 + 3 # Bad test, it contains the actual working code

def test_addition():
    assert addition(2,3) == 5
```

By the way, to extend the testing-related vocabulary, these numbers, i.e. `2`, `3`, and `5`, would be called **fixtures**.
A fixture is a fixed entity, value, or data structure providing static input and expected output for checking that a function works.

Reimplementing the working code inside the test is one of the most common pitfalls of writing tests.
In some cases, using calculations and other software to test the functions can make tests flaky, passing or failing at random, which causes the tests not to be trusted.

A recommended way of writing tests is by applying the AAA pattern:
1. **Arrange**: Create and set up all the elements, fixtures, and anything else required for the function to run.
2. **Act**: Execute the actual function or code being tested.
3. **Assert**: Check that the values obtained from executing the tested software are the expected ones.
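
As a minimal sketch of the AAA pattern, the following test checks a hypothetical `lowest` function, which is made up for illustration and is not part of the kata. Each of the three steps is marked with a comment:

```python
def lowest(values):
    """Hypothetical function under test: return the smallest value."""
    return min(values)


def test_lowest():
    # Arrange: prepare the fixtures, i.e. fixed input and expected output.
    temperatures = [59, 63, 32, 55]
    expected = 32

    # Act: run the code being tested.
    result = lowest(temperatures)

    # Assert: compare the result with the expected fixture.
    assert result == expected
```

Keeping the three steps visually separated makes it easier to see what a test needs, what it exercises, and what it expects.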

# Test Driven Development

Test Driven Development is an agile methodology for building software, based on writing tests first, even before the actual working code exists.
These tests are then run, and they are expected to fail. The code is then created or modified in such a way that a second run of the tests passes, showing the code is enough to satisfy the requirements.

This technique is not always useful when writing real software, but in certain cases it provides some benefits.
As explained when talking about tests as documentation, one of the outcomes of TDD when writing real software is the ability to delay writing actual documentation, since the tests are already there to document how the software works.

Another outcome of doing TDD, instead of writing tests for already working code, is that the tests end up covering most of the required functions and features.
It also tends to help keep complexity away.

When doing TDD, the software engineering process iterates over a cycle made of three steps:
1. **Red**: Developers write or modify tests, adding a new requirement, so that they fail when run.
This execution must actually be done, to ensure the new or changed test is failing.
2. **Green**: Developers write or modify the code itself to satisfy all the tests, making them pass when run.
Of course, this second execution must also actually be done.
And, naturally, all previous tests and assertions should still be passing.
3. **Refactor**: Optionally, when there is some possible optimization or improvement to apply to the code, and only while all current tests are passing (**Green**), developers can refactor the code to apply the improvement.
This third step also implies running the tests, to ensure the improved code still passes them.
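
The cycle above can be sketched in one small file. The `to_celsius` function and its constants are hypothetical, chosen only for illustration; the comments describe the state of the file at each step:

```python
# Red: this test is written first. Run at that point, it fails,
# because `to_celsius` does not exist yet.
def test_to_celsius():
    assert to_celsius(32) == 0
    assert to_celsius(212) == 100


# Green: the simplest implementation making the test pass would be
#     return (fahrenheit - 32) * 5 / 9
#
# Refactor: with the test passing, the magic numbers get names,
# and the test is run once more to check nothing broke.
FREEZING_POINT_F = 32
CELSIUS_PER_FAHRENHEIT = 5 / 9


def to_celsius(fahrenheit):
    return (fahrenheit - FREEZING_POINT_F) * CELSIUS_PER_FAHRENHEIT
```

The test itself does not change between the Green and Refactor steps, which is what makes the refactor safe.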

# Dojos

A coding dojo is a coding session where the audience is challenged to solve a trivial problem using the techniques or tools to be practiced.
The problem is trivial because getting to the solution is not the main goal; the goal is to understand and practice these techniques or tools.

There are many kinds of dojos with very different focuses.
This session will focus on applying TDD, while also taking some peeks at Python and py.test features.

The goals of a coding dojo session are:
1. **Have fun**
2. **Learn something new**
3. **Practice something new**
4. **Share knowledge**

In order to keep the session coherent, time boxing is usually applied to challenges that can be solved iteratively.
These iterations are limited in time, but their results should be easy to reach within that time.
Between iterations, there are usually some discussions on the steps followed, techniques applied, tools used, and problems found.

Also it's pretty common to work in pairs, practicing pair programming.
Pair programming is a coding technique that improves collaboration and communication between team members.

One good way to apply pair programming is to follow this procedure:
1. The first developer writes a test focusing on some new feature or requirement.
This developer must ensure all language requirements are satisfied, i.e. modules and functions exist and the syntax is correct.
Then this developer runs the tests, ensuring they fail, and passes the keyboard to the second developer.
2. The second developer modifies the code under test in order to satisfy the failing test.
Once happy with the code, this developer runs the tests.
If the tests don't pass, the code needs more work.
Otherwise, the developer can choose to refactor, for improvements, or to add one more assertion or one new test, for more requirements.
Again, the new assertion or test must fail when run before the keyboard is passed back to the first developer.
3. Repeat until all features are satisfied.

Sometimes dojos, like this one, are focused on getting in-depth practice or learning.
Also, doing TDD in baby steps when the problem is not very clear or is highly complex helps to keep the code good, clean, and simple.

At the end of the session, it's pretty common to run a short retrospective where everyone can point out aspects that can be improved, or what they enjoyed or learnt.
In these dojos, we appreciate feedback to improve further sessions.

Finally, the challenge proposed, following the martial arts metaphor, is called kata.
As explained before, this is usually relatively simple and easy, but sometimes, it is not.
This kata is simple enough, though.

# Weather data kata

The goal of this exercise is to write a program that reads the `weather.dat` file and prints the day number and the minimum temperature for the day with the lowest minimum temperature in the month covered by the file.
The program should work like this:

    python weather.py
    9 32

The `weather.dat` file contains tabular, space-separated weather measurements for one month in one place.
The file has a header line, followed by an empty line, one data line per day of the month, and a last line with the month's mean values for some of the columns.
Each data line contains the day of the month in the first column, and the minimum temperature for that day in the third column.

The contents look like this:

In [None]:
# %load weather.dat
  Dy MxT   MnT   AvT   HDDay  AvDP 1HrP TPcpn WxType PDir AvSp Dir MxS SkyC MxR MnR AvSLP

   1  88    59    74          53.8       0.00 F       280  9.6 270  17  1.6  93 23 1004.5
   2  79    63    71          46.5       0.00         330  8.7 340  23  3.3  70 28 1004.5
   3  77    55    66          39.6       0.00         350  5.0 350   9  2.8  59 24 1016.8
   4  77    59    68          51.1       0.00         110  9.1 130  12  8.6  62 40 1021.1
   5  90    66    78          68.3       0.00 TFH     220  8.3 260  12  6.9  84 55 1014.4
   6  81    61    71          63.7       0.00 RFH     030  6.2 030  13  9.7  93 60 1012.7
   7  73    57    65          53.0       0.00 RF      050  9.5 050  17  5.3  90 48 1021.8
   8  75    54    65          50.0       0.00 FH      160  4.2 150  10  2.6  93 41 1026.3
   9  86    32*   59       6  61.5       0.00         240  7.6 220  12  6.0  78 46 1018.6
  10  84    64    74          57.5       0.00 F       210  6.6 050   9  3.4  84 40 1019.0
  11  91    59    75          66.3       0.00 H       250  7.1 230  12  2.5  93 45 1012.6
  12  88    73    81          68.7       0.00 RTH     250  8.1 270  21  7.9  94 51 1007.0
  13  70    59    65          55.0       0.00 H       150  3.0 150   8 10.0  83 59 1012.6
  14  61    59    60       5  55.9       0.00 RF      060  6.7 080   9 10.0  93 87 1008.6
  15  64    55    60       5  54.9       0.00 F       040  4.3 200   7  9.6  96 70 1006.1
  16  79    59    69          56.7       0.00 F       250  7.6 240  21  7.8  87 44 1007.0
  17  81    57    69          51.7       0.00 T       260  9.1 270  29* 5.2  90 34 1012.5
  18  82    52    67          52.6       0.00         230  4.0 190  12  5.0  93 34 1021.3
  19  81    61    71          58.9       0.00 H       250  5.2 230  12  5.3  87 44 1028.5
  20  84    57    71          58.9       0.00 FH      150  6.3 160  13  3.6  90 43 1032.5
  21  86    59    73          57.7       0.00 F       240  6.1 250  12  1.0  87 35 1030.7
  22  90    64    77          61.1       0.00 H       250  6.4 230   9  0.2  78 38 1026.4
  23  90    68    79          63.1       0.00 H       240  8.3 230  12  0.2  68 42 1021.3
  24  90    77    84          67.5       0.00 H       350  8.5 010  14  6.9  74 48 1018.2
  25  90    72    81          61.3       0.00         190  4.9 230   9  5.6  81 29 1019.6
  26  97*   64    81          70.4       0.00 H       050  5.1 200  12  4.0 107 45 1014.9
  27  91    72    82          69.7       0.00 RTH     250 12.1 230  17  7.1  90 47 1009.0
  28  84    68    76          65.6       0.00 RTFH    280  7.6 340  16  7.0 100 51 1011.0
  29  88    66    77          59.7       0.00         040  5.4 020   9  5.3  84 33 1020.6
  30  90    45    68          63.6       0.00 H       240  6.0 220  17  4.8 200 41 1022.7
  mo  82.9  60.5  71.7    16  58.8       0.00              6.9          5.3


# Introduction to py.test

While not included in Python's standard library, `py.test` is one of the simplest, most powerful, and most idiomatic testing libraries.

Since it makes use of Python's native `assert` statement, tests are really easy and fast to write.
It also integrates with other testing libraries, like `unittest` from the standard library, or the commonly used `nose`, and others.
So, when applied to an existing project using these, `py.test` is able to detect those tests and run them.
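
As a small illustration of this integration, the following `unittest`-style test case (the `TestUpper` class is a made-up example) could be saved in a file named like `test_*.py`. py.test should collect and run it like any other test, with no changes:

```python
import unittest


class TestUpper(unittest.TestCase):
    """A plain unittest test case, written without any py.test feature."""

    def test_upper(self):
        self.assertEqual("dojo".upper(), "DOJO")
```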

It also includes a lot of goodies, like the `capsys` fixture, which will be used later in this session, or `monkeypatch`, and there are many plugins adding further capabilities and features.

For example, consider this test and function:

In [3]:
%%writefile test_demo.py
def multiply(number1, number2):
    return number1 + number2

def test_multiply():
    result = multiply(5,6)
    assert result == 30

Writing test_demo.py


When we want to run this test, we might use this command:

In [4]:
%%bash
py.test .

platform darwin -- Python 3.5.1, pytest-2.9.1, py-1.4.31, pluggy-0.3.1
benchmark: 3.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=5.00us max_time=1.00s calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /Users/ifosch/src/github.com/BCNDojos/pyDojos/factor-out, inifile: 
plugins: benchmark-3.0.0
collected 1 items

test_demo.py F

__________________________________________________________________________________________________ test_multiply __________________________________________________________________________________________________

    def test_multiply():
        result = multiply(5,6)
>       assert result == 30
E       assert 11 == 30

test_demo.py:6: AssertionError


In this output, py.test shows it collected the file `test_demo.py` and ran the function `test_multiply`, since its name starts with `test_`. It printed an `F` next to the file name, indicating the file contained just one test function, which failed.
It also shows the failing assertion, with the value obtained from the Act part of the test and the value it was expected to be.
It also reports an `AssertionError`, meaning the `assert` statement did not hold.
If the test run had failed for any other reason, e.g. any other kind of error or exception, it would be shown there as well.

In [5]:
%%writefile test_demo.py
def multiply(number1, number2):
    return number1 * number2

def test_multiply():
    result = multiply(5,6)
    assert result == 30

Overwriting test_demo.py


Once the function is fixed, let's run the test again:

In [6]:
%%bash
py.test .

platform darwin -- Python 3.5.1, pytest-2.9.1, py-1.4.31, pluggy-0.3.1
benchmark: 3.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=5.00us max_time=1.00s calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /Users/ifosch/src/github.com/BCNDojos/pyDojos/factor-out, inifile: 
plugins: benchmark-3.0.0
collected 1 items

test_demo.py .



Now the dot next to the file name indicates that the one and only test function found has passed.

In [7]:
%%bash
rm test_demo.py

# Bootstrapping the solution with py.test

To start with the code, following Test Driven Development, the `test_weather.py` file should be created with the following contents:

```python
import weather

def test_process_weather():
    weather.process()
```

In [8]:
%%writefile test_weather.py
import weather

def test_process_weather():
    weather.process()

Writing test_weather.py


Once created, running the tests should break with an error, since there is no `weather` module yet:

In [9]:
%%bash
py.test test_weather.py

platform darwin -- Python 3.5.1, pytest-2.9.1, py-1.4.31, pluggy-0.3.1
benchmark: 3.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=5.00us max_time=1.00s calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /Users/ifosch/src/github.com/BCNDojos/pyDojos/factor-out, inifile: 
plugins: benchmark-3.0.0
collected 0 items / 1 errors

________________________________________________________________________________________ ERROR collecting test_weather.py _________________________________________________________________________________________
test_weather.py:1: in <module>
    import weather
E   ImportError: No module named 'weather'


So the next step is to create a trivial module, which actually does nothing:

```python
def process():
    pass
```

In [10]:
%%writefile weather.py
def process():
    pass

Writing weather.py


With the `weather.py` module created, the test should now pass:

In [11]:
%%bash
py.test test_weather.py

platform darwin -- Python 3.5.1, pytest-2.9.1, py-1.4.31, pluggy-0.3.1
benchmark: 3.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=5.00us max_time=1.00s calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /Users/ifosch/src/github.com/BCNDojos/pyDojos/factor-out, inifile: 
plugins: benchmark-3.0.0
collected 1 items

test_weather.py .



# Files with Python (I)

Opening a file in Python is as easy as calling the `open(filename[, mode[, buffering]])` function.
The mode parameter is a string indicating how the file is to be opened, defaulting to `'r'`, i.e. read only, when omitted.
The buffering parameter is a number indicating which buffering policy should be used when reading or writing the file.
By default, it uses the system default.
On success, `open` returns a file object that can be used to read or write the file, depending on the mode used.

The last two arguments are not relevant to this exercise, but the documentation is pretty extensive on all possible values.

As usual, files need to be closed when the work has finished.
Python, like many other languages and interpreters, closes all files once the execution finishes.
Anyway, it's considered bad practice to keep a file open when it's not needed anymore.
Also, this might cause memory consumption and performance issues.
The way to close a file is to call the `close` method of the file object returned by `open`:

```python
    myfile = open('myfile.dat')
    ...
    myfile.close()
```

File objects also provide a `readlines` method, which reads all the lines, separated by new line characters, and returns them as a list, which is pretty useful for the purpose of this exercise.
Each line is a string which includes the new line character at the end.

So the following code gets all the lines from a file:

```python
    myfile = open('myfile.dat')
    mylines = myfile.readlines()
    myfile.close()
```
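
To see the trailing new line characters, here is a self-contained sketch. It first writes a throwaway file (the `sample.dat` name and its contents are made up for this example) and then reads it back with `readlines`:

```python
import os
import tempfile

# Create a throwaway file so the example does not depend on existing data.
path = os.path.join(tempfile.mkdtemp(), "sample.dat")
myfile = open(path, "w")
myfile.write("first line\nsecond line\n")
myfile.close()

# Read all the lines back; each one keeps its trailing new line character.
myfile = open(path)
mylines = myfile.readlines()
myfile.close()

print(mylines)
```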

# Printing strings with Python

In Python, printing a string is as easy as using the `print` function:

In [12]:
print("Hello")

Hello


Used like this, the `print` function adds a new line character when printing.

When formatting data into a string, the `format` string method can be used.
It accepts parameters which replace each `{}` occurrence in the string, in the same order.
Notice that the string must not contain more `{}` than the number of parameters passed, otherwise an `IndexError` is raised:

In [13]:
name = 'pythonist'
print("Hello {}".format(name))
adjective = "nice"
print("Hello {} {}".format(adjective, name))

Hello pythonist
Hello nice pythonist


# Capturing output with py.test

When using py.test with code that prints something to the output, `capsys` comes in useful.
The `capsys` fixture captures in memory whatever the tested function writes to standard output and standard error. This makes it risky when the output can be very big, and not very useful when the function writes to a file instead.
It is used by declaring it as a parameter of the test function, invoking the tested function, and calling the `capsys.readouterr` method to get both captured outputs:

```python
    def test_demo(capsys):
        print("Hello")
        out, err = capsys.readouterr()
        assert out == "Hello\n"
```

Notice the output includes a new line character.
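
As a further sketch, the test below checks both captured streams for a hypothetical `report` function, which is invented for this example and is not part of the kata:

```python
import sys


def report(day, temperature):
    """Hypothetical function under test: writes to both output streams."""
    print("{} {}".format(day, temperature))
    print("done", file=sys.stderr)


def test_report(capsys):
    report(1, 59)
    # readouterr returns what was written to stdout and stderr so far.
    out, err = capsys.readouterr()
    assert out == "1 59\n"
    assert err == "done\n"
```

Since `capsys` is provided by py.test, `test_report` is meant to be run through py.test rather than called directly.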

# Lists with Python

Lists are Python's array-like data structure. They are created with `[]`, which may enclose the elements:

In [14]:
a = []
b = [1, 2, 3,]
print(a)
print(b)

[]
[1, 2, 3]


In order to get the length of a list, Python provides the `len` function:

In [15]:
print(len(a))
print(len(b))

0
3


Python's lists are indexed starting at 0, and they can be sliced to access specific items as well as sublists:

In [16]:
print(b[0])
print(b[1:2])
print(b[1:])
print(b[::2])
print(b[::-1])

1
[2]
[2, 3]
[1, 3]
[3, 2, 1]


# Strings with Python

Strings in Python are enclosed in `"` or `'`, and can be treated much like lists of characters:

In [17]:
a = ""
b = "Hello"
print(a)
print(b)
print(len(a))
print(len(b))
print(b[::2])


Hello
0
5
Hlo


Python's strings provide some useful methods, like `startswith`:

In [18]:
if b.startswith("H"):
    print(b)

Hello


`split`, which returns a list of substrings, splitting by the string passed as parameter:

In [19]:
print("1-2-3".split("-"))
print("Hello123Bye".split("123"))

['1', '2', '3']
['Hello', 'Bye']


`join`, which concatenates the strings in the list passed as argument, using the string the method is called on as separator:

In [20]:
"-".join(["1", "2", "3"])

'1-2-3'

or `strip`, which removes any of the characters in the string passed as argument, or whitespace such as spaces and new lines by default, from the beginning and end of the string on which it's called:

In [21]:
print(" This is a line starting with a space\n".strip())
print("New line New".strip("New"))
print("New line New".strip("New").strip())

This is a line starting with a space
 line 
line


# First iteration

The first iteration will focus on loading data lines from the file.
The approach chosen is pretty simple: just read the lines from the file and print them all.

**This should take no more than 40 minutes.**

# Regular expressions in Python (I)

Regular expressions are a pattern matching and data extraction technique, in which a string defining a pattern in a specific language is used to process another string by means of an automaton.
This automaton is a mathematical object, a finite-state machine, that consumes its input step by step until it ends in a final state. Concretely, a regular expression engine takes two strings, one with the pattern and another one containing the data, and walks over the second checking it against the first.
The pattern string is what is called a regular expression.

Regular expressions use a set of special characters and constructs to determine what the pattern matches and what it doesn't. Some examples:
* `[a-z]` matches one single character, which can be any lower case letter.
* `[a-zA-Z]*` matches any number (`*`) of occurrences, including none, of any lower or upper case letter.
* `[0-9]+ [a-z]*` matches one or more (`+`) digits, followed by a space and any number of lower case letters.
* `.*` matches any string with any number of characters, even an empty string, as long as it contains no new line characters.

In Python, regular expressions are usually written as raw strings, prefixed with `r`, like `r"[a-z]*"`, so backslashes reach the regular expression engine untouched.
In order to process these, the `re` module is used.
Among others, this module provides two functions:
* `search`, which looks for the pattern anywhere in the string.
* `match`, which matches only when the pattern matches at the beginning of the string.

When there is no match, both return `None`. Otherwise, they return a match object.

In [22]:
import re

a = "this is 1 2 3 4 500"
regexp = r"[0-9 ]+"

match = re.search(regexp, a)
if match:
    print("It matches with search!")
match = re.match(regexp, a)
if match:
    print("It matches with match!")

It matches with search!
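
The example patterns listed earlier in this section can be checked in a similar way. This sketch uses `re.fullmatch`, another function of the `re` module not covered above, which only succeeds when the whole string matches the pattern. The sample strings are made up for illustration:

```python
import re

# Each entry: (pattern, text, whether the whole text is expected to match).
cases = [
    (r"[a-z]", "q", True),
    (r"[a-z]", "Q", False),              # upper case is not in [a-z]
    (r"[a-zA-Z]*", "Dojo", True),
    (r"[0-9]+ [a-z]*", "42 days", True),
    (r"[0-9]+ [a-z]*", " days", False),  # `+` needs at least one digit
    (r".*", "", True),                   # `*` also accepts zero characters
]

for pattern, text, expected in cases:
    matched = re.fullmatch(pattern, text) is not None
    print(pattern, repr(text), matched)
    assert matched == expected
```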


# Second iteration

In this second iteration, the program should be able to skip header and empty lines, printing only data lines.

**This should be easily completed in 20 minutes.**

# Regular expressions in Python (II)

A little bit more on regular expressions:
* `()` captures the part of the string matched by the enclosed pattern as a group, which allows extracting data as a sequence of values.
* `(?P<id>any_regex)` gives the group a name, which allows getting these groups in a key-value structure (called a dictionary in Python).
* `\s` represents any whitespace character, like spaces, tabs, or new lines.

In [23]:
a = "Name: Guido, Country: Netherlands, Year of birth: 1956"
pattern = r"Name: (?P<name>[a-zA-Z]+), Country: (?P<country>[a-zA-Z]+), Year of birth: (?P<year>[0-9]+)"
match = re.match(pattern, a)
if match:
    print("{} from {} was born in {}".format(match.group('name'), match.group('country'), match.group('year')))

Guido from Netherlands was born in 1956
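
Unnamed groups and `\s` can be sketched the same way. The sample string and the group names below are invented for this example; `groups()` returns the captured values as a tuple, and `groupdict()` returns the named ones as a dictionary:

```python
import re

text = "day=12   temp=73"

# Unnamed groups: values are retrieved by position.
match = re.match(r"day=([0-9]+)\s+temp=([0-9]+)", text)
if match:
    positional = match.groups()
    print(positional)

# Named groups: values are retrieved by name.
named_match = re.match(r"day=(?P<day>[0-9]+)\s+temp=(?P<temp>[0-9]+)", text)
if named_match:
    by_name = named_match.groupdict()
    print(by_name)
```

Note that the captured values are strings, so they need converting before being compared as numbers.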


# Third iteration

The third iteration should extract the data using groups in regular expressions, and finish by reducing the number of lines printed to the correct output, i.e. the day and minimum temperature for the day with the lowest minimum temperature.

**This shouldn't take more than 15 minutes.**

# Files with Python (II)

As discussed before, a file should be closed explicitly once it's not needed anymore, to avoid memory and performance issues, especially when the size of the file is unknown. However, there is a way of opening a file that ensures it is closed correctly, using a `with` block:

```python
    with open('myfile') as data_file:
        ...
```

Here, Python will close the file once the `with` block is finished, even if an error is raised inside it.
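
A quick way to verify this behavior is to check the `closed` attribute of the file object after the block. The sketch below uses a throwaway file with made-up contents:

```python
import os
import tempfile

# Throwaway file, so the example does not depend on existing data.
path = os.path.join(tempfile.mkdtemp(), "myfile")

with open(path, "w") as data_file:
    data_file.write("some data\n")

# Outside the block the file object still exists, but it is closed.
print(data_file.closed)

with open(path) as data_file:
    content = data_file.read()

print(repr(content))
```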

# First refactor

In this refactor, the code from the last iteration should be easily modified to be more idiomatic and efficient when opening and closing the file, without causing the tests to fail.

**This should be accomplished in 10 minutes.**

# Files with Python (III)

When reading the lines of a file, using the `readlines` method is discouraged, especially when the length and number of lines are not well known, since it loads all of them in memory at once. So, another way of reading lines is needed, instead of `readlines`:

```python
    for line in data_file:
        ...
```

This makes reading lines much more memory efficient, since the loop iterates over the file line by line, without loading the whole content in memory.
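
Here is a self-contained sketch of this loop, again on a throwaway file with made-up contents. Only one line is held in the `line` variable at any time:

```python
import os
import tempfile

# Throwaway file, so the example does not depend on existing data.
path = os.path.join(tempfile.mkdtemp(), "myfile")
with open(path, "w") as data_file:
    data_file.write("line 1\nline 2\nline 3\n")

lengths = []
with open(path) as data_file:
    for line in data_file:
        # Each line still ends with the new line character; strip removes it.
        lengths.append(len(line.strip()))

print(lengths)
```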

# Second refactor

The way the lines are being read must be refactored to ensure it is memory efficient and faster.

**This should not take more than 10 minutes.**

# Exceptions in Python

Python code can raise exceptions and errors, and these can be handled by using the `try ... except` construct:

In [24]:
try:
    a = 1 / 0
except ZeroDivisionError:
    print("Operation is not valid")

Operation is not valid


Python raises a particular error, `ValueError`, when an argument has the right type but an inappropriate value, for instance when a string that doesn't represent a number is converted to an integer:

In [25]:
my_integer_string = "10"
my_string = "Hello"
print(int(my_integer_string))
print(int(my_string))

10


ValueError: invalid literal for int() with base 10: 'Hello'

In [26]:
try:
    print(int(my_string))
except ValueError:
    print("my_string doesn't represent an integer")

my_string doesn't represent an integer
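
The same construct can be wrapped in a small helper. The `to_int` function below is hypothetical, just one possible way of keeping the error handling in a single place:

```python
def to_int(text, default=None):
    """Hypothetical helper: convert text to an integer, or return default."""
    try:
        return int(text)
    except ValueError:
        # The text does not represent an integer.
        return default


print(to_int("10"))
print(to_int("Hello"))
print(to_int("Hello", default=0))
```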


# Test benchmarking

Benchmarking is a set of techniques and methodologies to provide insight on software performance.
Usually, it consists in running the software several times, timing and/or sizing it, and providing some stats on these metrics.

There is a plugin to benchmark functions in py.test, called pytest-benchmark:

In [27]:
%%writefile test_benchmark.py
import time

def variable_time_function(seconds=0.001):
    time.sleep(seconds)
    return 123

def test_variable_time_function(benchmark):
    result = benchmark(variable_time_function)
    assert result == 123

Writing test_benchmark.py


In [28]:
%%bash
py.test test_benchmark.py

platform darwin -- Python 3.5.1, pytest-2.9.1, py-1.4.31, pluggy-0.3.1
benchmark: 3.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=5.00us max_time=1.00s calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /Users/ifosch/src/github.com/BCNDojos/pyDojos/factor-out, inifile: 
plugins: benchmark-3.0.0
collected 1 items

test_benchmark.py .


Computing stats ...Computing stats ... group 1/1Computing stats ... group 1/1: minComputing stats ... group 1/1: min (1/1)Computing stats ... group 1/1: min (1/1)Computing stats ... group 1/1: maxComputing stats ... group 1/1: max (1/1)Computing stats ... group 1/1: max (1/1)Computing stats ... group 1/1: meanComputing stats ... group 1/1: mean (1/1)Computing stats ... group 1/1: mean (1/1)Computing stats ... group 1/1: medianComputing stats ... group 1/1: median (1/1)Computing stats ... group 1/1: median (1/1)Computing stats ... group 1/1: iqrComputing stats ... group 1/1: iqr (1/1)Co

This py.test run took slightly more time, and its output includes a table with some timing statistics.

`Rounds` shows how many times the function was executed.
`Min`, `Max`, `Mean`, `StdDev`, and `Median` show the corresponding statistics for this set of runs.
`IQR` is the interquartile range, i.e. the distance between the first and third quartiles.
`Outliers` is the number of samples whose execution time was more than one standard deviation away from the mean.
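
To make these two metrics more concrete, the sketch below computes them by hand with the standard `statistics` module, on made-up timings. It assumes Python 3.8 or later, where `statistics.quantiles` is available, and the exact quartile method used by pytest-benchmark may differ:

```python
import statistics

# Hypothetical timings, in milliseconds, for five rounds of a function.
timings = [1.0, 1.1, 1.2, 1.3, 5.0]

mean = statistics.mean(timings)
stddev = statistics.stdev(timings)

# quantiles(..., n=4) returns the three cut points between the quartiles.
q1, median, q3 = statistics.quantiles(timings, n=4)
iqr = q3 - q1

# Samples further than one standard deviation away from the mean.
outliers = [t for t in timings if abs(t - mean) > stddev]

print(iqr)
print(outliers)
```

With these samples, the single slow round is the only outlier, and it also stretches the IQR.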

With very slow functions, the standard statistics are usually enough to see an improvement.
With very fast functions, however, those are not enough. In that case, low values for the IQR and Outliers metrics can point to a real improvement, which is worth having if the function is used frequently.

**WARNING**: The combined usage of `capsys` and `benchmark` is not recommended, since the benchmarked function prints its output on every round and the number of rounds varies, which could make the output assertion invalid.

In [29]:
%%bash
rm test_benchmark.py

# Third refactor

The usage of the `re` module might be overkill for splitting the data in the lines as required for this case.
In this refactor, the py.test benchmark plugin can help to spot possible time optimizations.

Some pitfalls this change might imply are:
* Some numbers in the columns are marked with an `*`.
* Watch out for header and empty lines.

**This should take no more than 15 minutes.**