# Testing in Python

<Brown.TerryN@epa.gov> ORD / CCTE / SCDCD / ADB 2022-06-14

## Testing in general

### Value of testing

The payoff is not always immediate, although it can be in more complex projects that are still evolving.

Invaluable when code is refactored, and code will always be refactored at some point, unless it's abandoned.

Valuable when errors in outputs might not be obvious, but could be serious (e.g. EPA data).

Like git, tests make it easier for developers to experiment with confidence.

### Testing in the development cycle

Ideally tests are fast enough to run when needed, to answer the question "does this work?".

Tests should be run as part of the PR (pull request) review process and before releases of any kind.

Adding tests for new code / functionality should be an implicit part of any ticket, although it is better to state it explicitly as well.

### Unit tests vs. not

Unit tests test small, well-controlled pieces of code, often just a single
small function.

Sometimes this involves creating an artificial environment where inputs,
outputs, and other functions' responses are controlled (stubbing or mocking).
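
As a minimal sketch of stubbing / mocking, the standard library's `unittest.mock.Mock` can stand in for a real dependency. The function `count_rows`, the `db` object with its `query` method, and the table name are all hypothetical, made up for illustration.

```python
from unittest.mock import Mock


def count_rows(db, table):
    """Return the number of rows in `table` (hypothetical function under test)."""
    rows = db.query(f"SELECT * FROM {table}")
    return len(rows)


def test_count_rows():
    # Stand-in for a real database connection; no database is touched.
    fake_db = Mock()
    fake_db.query.return_value = [("a",), ("b",), ("c",)]

    assert count_rows(fake_db, "samples") == 3
    # The mock also records how it was called.
    fake_db.query.assert_called_once_with("SELECT * FROM samples")
```

The test controls exactly what the "database" returns, so a failure can only come from `count_rows` itself.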

Tests that aren't unit tests, *end-to-end* tests or *integration* tests, may
test the output of a pipeline of functions and inputs.

**Unit tests**: precisely identify the piece of code being tested / causing the error.

**End-to-end tests**: tell you that something's wrong. This is really valuable, even if it's less efficient. Knowing something is wrong lets you fix it before deployment, data analysis, or a user report, so it's better than not knowing.

In small to medium-sized projects, the cost of debugging failed end-to-end tests is usually minor.
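
A small sketch of the difference, using a made-up two-step pipeline (`parse`, `total`, and `pipeline` are hypothetical names, not from any real project):

```python
def parse(line):
    """Split a 'name, value' line into (name, float(value))."""
    name, value = line.split(",")
    return name.strip(), float(value)


def total(records):
    """Sum the values of (name, value) records."""
    return sum(value for _, value in records)


def pipeline(lines):
    """Parse every line, then total the values."""
    return total(parse(line) for line in lines)


def test_parse():
    # Unit test: a failure here points straight at parse().
    assert parse("lead, 1.5") == ("lead", 1.5)


def test_total():
    # Unit test: a failure here points straight at total().
    assert total([("a", 1.0), ("b", 2.0)]) == 3.0


def test_pipeline():
    # End-to-end test: a failure says something is wrong,
    # but not whether parse() or total() is to blame.
    assert pipeline(["lead, 1.5", "zinc, 2.5"]) == 4.0
```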

### Test Driven Design

The process of writing failing tests before writing code, then writing code
until the tests pass.
This creates a space to analyze design and integration without getting hung up on implementation.
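
A minimal sketch of the cycle, assuming a hypothetical `slugify` function is the thing being designed. The test is written first and pins down the intended behaviour; the implementation comes second.

```python
# Step 1: write the test first. While `slugify` (a hypothetical function)
# does not exist yet, running this test fails with a NameError.
def test_slugify():
    assert slugify("Hello World") == "hello-world"
    assert slugify("  Testing in Python ") == "testing-in-python"


# Step 2: write just enough code to make the test pass.
def slugify(text):
    """Lower-case `text` and join its words with hyphens."""
    return "-".join(text.lower().split())
```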

## Python testing

### Use pytest, not unittest

 - The Python `unittest` module docs point to the third-party `pytest` module.
 - Other examples of third-party modules commonly preferred over the standard library:
   `lxml` for XML manipulation, `requests` for HTTP interactions,
   `dateutil` for date manipulation.

<https://docs.pytest.org/>

### Testing and notebooks / Jupyter Lab

Python testing frameworks, `pytest` and `unittest`, expect to test functions imported from `.py` files. This doesn't translate directly to a notebook environment, although notebooks can run code from `.py` files that have been tested with `pytest` etc.

### Installation

In [None]:
!pip install pytest pytest-cov pytest-xdist
# extra for Jupyter in a Docker container, ignore
!export PYTHONPATH=.
!export PATH=/home/jovyan/.local/bin:$PATH

### How tests are found

There are several options (see the docs), but one possible layout:

```
project_folder
  myproject
    lib
      utils.py
  tests
    lib
      # scanned because it starts with "test_"
      test_utils.py
        # is a test because it starts with "test_"
        def test_something():
```

### Running pytest, command line params

To run the tests, normally just
```shell
pytest tests
```
but if that fails
```shell
python -m pytest tests
```

Specifying the `tests` folder is not required, but it helps pytest find its configuration (here `tests/pytest.ini`).

The first phase is test "collection", with possible filtering / selection; then the tests are run.

In [3]:
!python -m pytest tests

platform linux -- Python 3.10.5, pytest-7.1.2, pluggy-1.0.0
rootdir: /home/jovyan/repo/pycop_pytest/tests, configfile: pytest.ini
plugins: forked-1.4.0, cov-3.0.0, xdist-2.5.0, anyio-3.6.1
collected 9 items

tests/lib/test_utils.py F..                                              [ 33%]
tests/lib/test_utils_more.py ...sss                                      [100%]

_________________________________ test_invert __________________________________

    def test_invert():
        assert invert((1, 2, 3)) == [3, 2, 1]
>       assert "".join(invert("abcdefghijklmnop")) == "

pytest uses plain `assert` and clever introspection,
so there is no need to learn special functions for making comparisons.
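
As a sketch, a complete test file needs nothing beyond a function and an `assert`. The body of `invert` here is an assumption: the notebook's real implementation is not shown, and this stand-in simply returns the input reversed as a list, which is consistent with the output above.

```python
def invert(seq):
    """Assumed stand-in for the notebook's `invert`: the input, reversed, as a list."""
    return list(reversed(seq))


def test_invert():
    # Plain comparisons; no assertEqual-style helper methods are needed.
    assert invert((1, 2, 3)) == [3, 2, 1]
    assert "".join(invert("abc")) == "cba"
```

When an `assert` like these fails, pytest reports the values on each side of the comparison.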

In [4]:
!python -m pytest -vv tests # more verbose

platform linux -- Python 3.10.5, pytest-7.1.2, pluggy-1.0.0 -- /opt/conda/bin/python
cachedir: .pytest_cache
rootdir: /home/jovyan/repo/pycop_pytest/tests, configfile: pytest.ini
plugins: forked-1.4.0, cov-3.0.0, xdist-2.5.0, anyio-3.6.1
collected 9 items

tests/lib/test_utils.py::test_invert FAILED                              [ 11%]
tests/lib/test_utils.py::test_summarize_db_size PASSED                   [ 22%]
tests/lib/test_utils.py::test_summarize_db_inversion PASSED              [ 33%]
tests/lib/test_utils_more.py::test_more_invert[[1, 2, 3]] PASSED         [ 44%]
tests/lib/test_utils_more.py::test_more_invert[['a', 'b', 'c']] PASSED   [ 55%]
tests/lib/test_utils_more.py::test_more_invert[['one', 'two', 'three']] PASSED [ 66%]
tests/lib/test_utils_more.py::test_more2_invert[forward0-back0] SKIPPED (n

In [5]:
!python -m pytest -vv --lf tests # just the last failed tests

platform linux -- Python 3.10.5, pytest-7.1.2, pluggy-1.0.0 -- /opt/conda/bin/python
cachedir: .pytest_cache
rootdir: /home/jovyan/repo/pycop_pytest/tests, configfile: pytest.ini
plugins: forked-1.4.0, cov-3.0.0, xdist-2.5.0, anyio-3.6.1
collected 1 item
run-last-failure: rerun previous 1 failure (skipped 1 file)

tests/lib/test_utils.py::test_invert FAILED                              [100%]

_________________________________ test_invert __________________________________

    def test_invert():
        assert invert((1, 2, 3)) == [3, 2, 1]
>       assert "".join(invert("abcdefghijklmnop")) == "ponmlkjihg"

### Selecting tests

In [6]:
# select tests with 'test_inv' in their name or path
!python -m pytest -vv -k test_inv tests

platform linux -- Python 3.10.5, pytest-7.1.2, pluggy-1.0.0 -- /opt/conda/bin/python
cachedir: .pytest_cache
rootdir: /home/jovyan/repo/pycop_pytest/tests, configfile: pytest.ini
plugins: forked-1.4.0, cov-3.0.0, xdist-2.5.0, anyio-3.6.1
collected 9 items / 8 deselected / 1 selected

tests/lib/test_utils.py::test_invert FAILED                              [100%]

_________________________________ test_invert __________________________________

    def test_invert():
        assert invert((1, 2, 3)) == [3, 2, 1]
>       assert "".join(invert("abcdefghijklmnop")) == "ponmlkjihg"
E       AssertionError: assert 'ponmlkjihgfedcba'

### Marking tests

Describe (register) marks in `pytest.ini`, then mark tests with the `@pytest.mark.<name>` decorator.
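
A minimal sketch of both halves, using the `very_slow` mark that appears in this notebook. The marker description text and the test `test_big_job` are made up for illustration; the `pytest.ini` part is shown as a comment so the block stays plain Python.

```python
import pytest

# tests/pytest.ini would describe the mark, for example:
#
#   [pytest]
#   markers =
#       very_slow: tests that take a long time to run


@pytest.mark.very_slow
def test_big_job():
    # Placeholder body; a real test would do the slow work here.
    assert sum(range(1000)) == 499500
```

Tests marked this way can then be selected or excluded on the command line with `-m`.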

### Selecting tests by mark

In [7]:
!python -m pytest -vv -m very_slow tests

platform linux -- Python 3.10.5, pytest-7.1.2, pluggy-1.0.0 -- /opt/conda/bin/python
cachedir: .pytest_cache
rootdir: /home/jovyan/repo/pycop_pytest/tests, configfile: pytest.ini
plugins: forked-1.4.0, cov-3.0.0, xdist-2.5.0, anyio-3.6.1
collected 9 items / 8 deselected / 1 selected

tests/lib/test_utils.py::test_summarize_db_inversion PASSED              [100%]



### Special files

`tests/conftest.py` - used for defining fixtures and advanced setup.

`tests/pytest.ini` - used to name markers and set other parameters.

### Fixtures

Things useful to multiple tests for reference, data access, etc.
Fixtures are available to a test if the fixture's name is given
as an argument name for the test function.
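
A sketch of what such a fixture might look like, assuming a small SQLite database as the shared resource. The names `build_db`, `db_path`, the `samples` table, and its contents are hypothetical. `tmp_path_factory` is a built-in pytest fixture.

```python
import sqlite3

import pytest


def build_db(path):
    """Create a tiny SQLite database at `path` (hypothetical test data)."""
    con = sqlite3.connect(path)
    con.execute("CREATE TABLE samples (name TEXT, value REAL)")
    con.executemany(
        "INSERT INTO samples VALUES (?, ?)", [("lead", 1.5), ("zinc", 2.5)]
    )
    con.commit()
    con.close()
    return path


@pytest.fixture(scope="session")
def db_path(tmp_path_factory):
    # Built once per test session, in a temporary folder managed by pytest.
    return build_db(tmp_path_factory.mktemp("tmp_data") / "data.db")


def test_row_count(db_path):
    # pytest sees the argument name `db_path` and passes in the fixture's value.
    con = sqlite3.connect(db_path)
    try:
        assert con.execute("SELECT COUNT(*) FROM samples").fetchone()[0] == 2
    finally:
        con.close()
```

Keeping the set-up work in an ordinary function (`build_db`) and the fixture as a thin wrapper is one design choice; it lets the set-up code be reused outside pytest.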

The temporary files from the last three pytest runs
are retained in `/tmp` for inspection.

In [8]:
!ls -R /tmp/pytest-of-jovyan  # `jovyan` is the username

/tmp/pytest-of-jovyan:
pytest-0  pytest-1  pytest-2  pytest-current

/tmp/pytest-of-jovyan/pytest-0:
tmp_data0  tmp_datacurrent

/tmp/pytest-of-jovyan/pytest-0/tmp_data0:
data.db

/tmp/pytest-of-jovyan/pytest-1:
tmp_data0  tmp_datacurrent

/tmp/pytest-of-jovyan/pytest-1/tmp_data0:
data.db

/tmp/pytest-of-jovyan/pytest-2:
tmp_data0  tmp_datacurrent

/tmp/pytest-of-jovyan/pytest-2/tmp_data0:
data.db


### Parametrize

Feed a bunch of different values through a test.
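
A minimal sketch using `@pytest.mark.parametrize`. The argument names `forward` and `back` are taken from the test IDs in this notebook's output; the test body, the values, and the stand-in `invert` are assumptions.

```python
import pytest


def invert(seq):
    """Assumed stand-in for the notebook's `invert`: the input, reversed, as a list."""
    return list(reversed(seq))


@pytest.mark.parametrize(
    "forward, back",
    [
        ([1, 2, 3], [3, 2, 1]),
        (["a", "b", "c"], ["c", "b", "a"]),
    ],
)
def test_invert_pairs(forward, back):
    # Runs once per (forward, back) pair; each run is reported separately.
    assert invert(forward) == back
```

For list values like these, pytest generates IDs such as `forward0-back0`. Passing `ids=` to `parametrize` is one way to get more readable names.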

In [9]:
# run tests marked `extras`, note obscure param. descriptions
!python -m pytest -vv --my-extras tests

platform linux -- Python 3.10.5, pytest-7.1.2, pluggy-1.0.0 -- /opt/conda/bin/python
cachedir: .pytest_cache
rootdir: /home/jovyan/repo/pycop_pytest/tests, configfile: pytest.ini
plugins: forked-1.4.0, cov-3.0.0, xdist-2.5.0, anyio-3.6.1
collected 9 items

tests/lib/test_utils.py::test_invert FAILED                              [ 11%]
tests/lib/test_utils.py::test_summarize_db_size PASSED                   [ 22%]
tests/lib/test_utils.py::test_summarize_db_inversion PASSED              [ 33%]
tests/lib/test_utils_more.py::test_more_invert[[1, 2, 3]] PASSED         [ 44%]
tests/lib/test_utils_more.py::test_more_invert[['a', 'b', 'c']] PASSED   [ 55%]
tests/lib/test_utils_more.py::test_more_invert[['one', 'two', 'three']] PASSED [ 66%]
tests/lib/test_utils_more.py::test_more2_invert[forward0-back0] PASSED
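
The `--my-extras` option is not built into pytest, so it has to be defined by the project. The notebook's actual implementation is not shown. The sketch below assumes one common pattern: `conftest.py` hooks that add the option and skip tests marked `extras` unless it is given. The help and skip-reason wording are made up.

```python
import pytest


def pytest_addoption(parser):
    # Register the custom command line flag.
    parser.addoption(
        "--my-extras",
        action="store_true",
        default=False,
        help="also run tests marked `extras`",
    )


def pytest_collection_modifyitems(config, items):
    if config.getoption("--my-extras"):
        return  # flag given: leave the `extras` tests alone
    skip_extras = pytest.mark.skip(reason="needs --my-extras to run")
    for item in items:
        if "extras" in item.keywords:
            item.add_marker(skip_extras)
```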

### Test coverage, is all the code being tested?

In [10]:
!python -m pytest --cov=mything --cov-report html tests

platform linux -- Python 3.10.5, pytest-7.1.2, pluggy-1.0.0
rootdir: /home/jovyan/repo/pycop_pytest/tests, configfile: pytest.ini
plugins: forked-1.4.0, cov-3.0.0, xdist-2.5.0, anyio-3.6.1
collected 9 items

tests/lib/test_utils.py F..                                              [ 33%]
tests/lib/test_utils_more.py ...sss                                      [100%]

_________________________________ test_invert __________________________________

    def test_invert():
        assert invert((1, 2, 3)) == [3, 2, 1]
>       assert "".join(invert("abcdefghijklmnop")) == "

### Running tests in parallel

The pytest-parallel plug-in is one option, but [pytest-xdist](https://pytest-xdist.readthedocs.io/en/latest/index.html) is more flexible.
See the docs for controlling processes (for CPU-intensive tests) vs.
threads (for tests that spend time waiting for DB queries and other
input / output (I/O) delays).

In [11]:
!python -m pytest -vv -n 2 tests

platform linux -- Python 3.10.5, pytest-7.1.2, pluggy-1.0.0 -- /opt/conda/bin/python
cachedir: .pytest_cache
rootdir: /home/jovyan/repo/pycop_pytest/tests, configfile: pytest.ini
plugins: forked-1.4.0, cov-3.0.0, xdist-2.5.0, anyio-3.6.1
[gw0] linux Python 3.10.5 cwd: /home/jovyan/repo/pycop_pytest
[gw1] linux Python 3.10.5 cwd: /home/jovyan/repo/pycop_pytest
[gw0] Python 3.10.5 | packaged by conda-forge | (main, Jun 14 2022, 07:04:59) [GCC 10.3.0]
[gw1] Python 3.10.5 | packaged by conda-forge | (main, Jun 14 2022, 07:04:59) [GCC 10.3.0]
gw0 [9] / gw1 [9]
scheduling tests via LoadScheduling

tests/lib/test_utils.py::test_invert
tests/lib/test_utils.py::test_summarize_db_size
[gw1] [ 11%] PASSED tests/lib/test_utils.py::test_summarize_db_size
tests/lib/test_utils_more.py::test_more_invert[[1, 2, 3]]
[gw1] [ 22%] PASSED tests/lib/test_utils_more.py::test_more_invert[[1, 2, 3]]
tests/lib/test_utils_more.py::test_more_invert[['a', 'b', 'c']]