Sina D edited this page May 23, 2021 · 5 revisions

Introduction

Ivadomed has two main types of tests:

  1. Unit Tests

    • Unit tests typically test a small portion of code, such as a single function.
    • They test individual modules/functions in the ivadomed Python API, or isolated functions in CLI scripts (but not the full script).
  2. Functional Tests

    • Functional tests aim to test a specific feature/outcome from the code.
    • Often multiple functions/scripts are involved.
    • We don't care about the implementation of each function, as long as the desired outcome is achieved.
    • The nomenclature is unfortunately confusing: in functional testing, the word function refers to a purpose or outcome, whereas in unit testing it refers to a single programmatic method.
    • We test ivadomed's CLI scripts in full (as if called by a user).

Writing Tests

Tests are run in parallel using the pytest command. The default configuration uses the xdist plugin to perform distributed testing (num_workers == num_vcpu). Tests are grouped by module for test functions and by class for test methods.
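For illustration, these pytest-xdist options could be declared in a configuration fragment like the one below. The flag names come from pytest-xdist (-n auto spawns one worker per available vCPU; --dist loadscope groups tests by module for test functions and by class for test methods); the exact configuration used by ivadomed may differ.

```ini
[pytest]
addopts = -n auto --dist loadscope
```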

General Recommendations

  • Ensure your tests clean up after themselves, so that no new files are left behind. Ideally, the test data directory (testing_data/) should be left in the same state it was in prior to the test.
    • The built-in pytest fixture tmp_path provides a place to write output files and takes care of cleanup automatically.
  • Use logging in tests (don't print).
  • It is a good idea to test both the pass case and the fail case.
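The recommendations above can be sketched as follows. save_report is a hypothetical function standing in for whatever is under test; the tmp_path fixture, logging, and pytest.raises usage are the parts to note.

```python
import logging

import pytest

logger = logging.getLogger(__name__)


def save_report(text, out_path):
    """Hypothetical function under test: writes a report file."""
    if not text:
        raise ValueError("text must not be empty")
    out_path.write_text(text)
    return out_path


def test_save_report_pass(tmp_path):
    # tmp_path is cleaned up automatically, so no files are left behind.
    out = save_report("hello", tmp_path / "report.txt")
    logger.info("report written to %s", out)  # log, don't print
    assert out.read_text() == "hello"


def test_save_report_fail(tmp_path):
    # Test the fail case as well as the pass case.
    with pytest.raises(ValueError):
        save_report("", tmp_path / "report.txt")
```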

Unit Tests

  • Should be located in testing/unit_tests/
  • testing/unit_tests/test_${module}.py
  • Unit tests should aim to be simple and specific.
  • Unit tests should aim to avoid coupling/interdependence.
  • Unit tests should not cause unexpected side effects (modify an input file, create unwanted artefacts on the filesystem, etc).
  • It is preferable to generate input data instead of relying on external input data where possible. This avoids regressions caused by the external data changing without the test being updated.
  • It is a good idea to parameterize tests.
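A minimal sketch of the last two points, with generated input data and pytest.mark.parametrize (normalize is a hypothetical utility, not an ivadomed function):

```python
import pytest


def normalize(values):
    """Hypothetical utility: rescale a list of numbers to [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Inputs are generated inline rather than read from external files,
# so the test cannot silently break when external data changes.
@pytest.mark.parametrize("values,expected", [
    ([0, 5, 10], [0.0, 0.5, 1.0]),
    ([2, 4], [0.0, 1.0]),
])
def test_normalize(values, expected):
    assert normalize(values) == expected
```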

Functional Tests

  • Should be located in testing/functional_tests/
  • testing/functional_tests/test_${script_to_test}.py
  • Should call the script using script_to_test.main(args) instead of subprocess.run. This allows tools like pytest.raises to be used when checking for an expected exception.
  • Shared initialization/setup logic should be implemented in a fixture.
  • Sets of (input_data/expected_results) can be passed to the functional test via parameterization.
  • Should make a best effort at testing the main functionalities provided by the script.
  • Should not call logic from another function/module. That would be a higher-level functional test (TBD where to put?)
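The pattern above can be sketched as follows. The script_to_test stub below is a hypothetical stand-in for a real CLI module (a real test would import it from ivadomed instead); the fixture, parameterized (input_data/expected_results) pairs, and direct main(args) call are the parts to note.

```python
import pytest


class script_to_test:
    """Hypothetical stand-in for a CLI script module with a main(args) entry point."""

    @staticmethod
    def main(args):
        if len(args) < 2 or args[0] != "--input":
            raise ValueError("usage: --input <file>")
        return f"processed {args[1]}"


@pytest.fixture
def make_args():
    # Shared initialization/setup logic lives in a fixture.
    def _make(path):
        return ["--input", path]
    return _make


# Sets of (input_data, expected_results) passed via parameterization.
@pytest.mark.parametrize("input_file,expected", [
    ("a.nii", "processed a.nii"),
    ("b.nii", "processed b.nii"),
])
def test_script_pass(make_args, input_file, expected):
    assert script_to_test.main(make_args(input_file)) == expected


def test_script_fail():
    # Calling main() directly lets pytest.raises catch the exception.
    with pytest.raises(ValueError):
        script_to_test.main([])
```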

Multiprocessing

When testing any of the command line scripts that use multiprocessing, use script_runner to call the module instead of script_to_test.main(args). If you do the latter, the worker processes aren't closed between the tests in your script.

For example, let's say you have a script, mp_script.py:

import multiprocessing as mp
import time

def sleep_for_a_bit(seconds):
    print(f"Sleeping for {seconds} second(s)")
    time.sleep(seconds)
    print("Done sleeping!")
    print(mp.current_process())

def main():
    pool = mp.Pool(processes=2)
    pool.map(func=sleep_for_a_bit, iterable=[1 for i in range(0, 2)])
    pool.close()
    pool.join()  # wait for the workers to finish before returning

if __name__ == "__main__":
    main()

Let's say you want to test this script using script_to_test.main(args):

import mp_script

def test_mp_script_a():
    mp_script.main()


def test_mp_script_b():
    mp_script.main()

If you run this test, you will get something like:

Sleeping for 1 second(s)
Sleeping for 1 second(s)
Done sleeping!
Done sleeping!
<SpawnProcess name='SpawnPoolWorker-1' parent=91203 started daemon>
<SpawnProcess name='SpawnPoolWorker-2' parent=91203 started daemon>
Sleeping for 1 second(s)
Sleeping for 1 second(s)
Done sleeping!
Done sleeping!
<SpawnProcess name='SpawnPoolWorker-4' parent=91203 started daemon>
<SpawnProcess name='SpawnPoolWorker-3' parent=91203 started daemon>

Notice how, between the two separate tests test_mp_script_a and test_mp_script_b, the SpawnPoolWorkers are not reset. In this example the tests still pass, because the script does nothing other than sleep. However, if your script relies on the processes being distinct, this can cause problems, as it does in automate_training.py.

To correctly test, you will need to use script_runner from pytest-console-scripts:

import pytest

@pytest.mark.script_launch_mode('subprocess')
def test_mp_script_a(script_runner):
    ret = script_runner.run('mp_script')
    print(f"{ret.stdout}")
    print(f"{ret.stderr}")
    assert ret.success

@pytest.mark.script_launch_mode('subprocess')
def test_mp_script_b(script_runner):
    ret = script_runner.run('mp_script')
    print(f"{ret.stdout}")
    print(f"{ret.stderr}")
    assert ret.success

If you run this test, you should see something like:

Sleeping for 1 second(s)
Done sleeping!
<SpawnProcess name='SpawnPoolWorker-1' parent=91378 started daemon>
Sleeping for 1 second(s)
Done sleeping!
<SpawnProcess name='SpawnPoolWorker-2' parent=91378 started daemon>

Sleeping for 1 second(s)
Done sleeping!
<SpawnProcess name='SpawnPoolWorker-1' parent=91378 started daemon>
Sleeping for 1 second(s)
Done sleeping!
<SpawnProcess name='SpawnPoolWorker-2' parent=91378 started daemon>

As you can see, the SpawnPoolWorker is not being carried over between tests anymore.