# Tests

Ivadomed has two main types of tests:
## Unit Tests

- Unit tests typically test a small portion of code, such as a single function.
- They test individual modules/functions in the `ivadomed` Python API, or isolated functions in CLI scripts (but not the full script).
## Functional Tests

- Functional tests aim to test a specific feature/outcome of the code.
- Often multiple functions/scripts are involved.
- We don't care about the implementation of each function, as long as the desired outcome is achieved.
- The nomenclature is unfortunately confusing: the word *function* in *functional testing* refers to a purpose or outcome, whereas the word *function* in unit testing refers to a single programmatic method.
- We test `ivadomed`'s CLI scripts in full (as if called by a user).
Tests are run in parallel using the `pytest` command. The default configuration uses the `xdist` plugin to perform distributed testing (`num_workers == num_vcpu`). Tests are grouped by module for test functions and by class for test methods.
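The grouping described above corresponds to xdist's `loadscope` distribution mode. As an illustration only (the project's actual options live in its own pytest configuration, so the exact flags here are an assumption), an equivalent manual invocation would look like:

```
# -n auto: one worker per available vCPU
# --dist loadscope: tests in the same module/class are sent to the same worker
pytest -n auto --dist loadscope
```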
- Ensure your tests clean up after themselves, so that no new files are left behind. Ideally, the test data directory (`testing_data/`) should be left in the same state it was in prior to the test.
  - The built-in pytest fixture `tmp_path` provides a place to output files, and takes care of cleanup automatically.
- Use logging in tests (don't print).
- It is a good idea to test the `fail` case as well as the `pass` case.
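As a minimal sketch of these guidelines (the `save_config` helper is hypothetical, standing in for a function under test):

```python
import json
import logging

logger = logging.getLogger(__name__)


def save_config(config, path):
    # Hypothetical helper standing in for a function under test.
    with open(path, "w") as f:
        json.dump(config, f)


def test_save_config(tmp_path):
    # tmp_path is created and cleaned up by pytest, so the test leaves
    # no files behind in testing_data/ or elsewhere.
    out_file = tmp_path / "config.json"
    save_config({"model": "unet"}, out_file)
    logger.info("Wrote %s", out_file)  # log, don't print
    assert json.loads(out_file.read_text()) == {"model": "unet"}
```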
- Unit tests should be located in `testing/unit_tests/`, following the naming pattern `testing/unit_tests/test_${module}.py`.
- Unit tests should aim to be simple and specific.
- Unit tests should aim to avoid coupling/interdependence.
- Unit tests should not cause unexpected side effects (modifying an input file, creating unwanted artefacts on the filesystem, etc.).
- It is preferable to generate input data instead of relying on external input data where possible. This avoids regressions caused by the external data changing without the test being updated.
- It is a good idea to parameterize tests.
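A sketch of a parameterized unit test following these conventions, with a hypothetical `normalize` utility defined inline so the test does not depend on external data:

```python
import pytest


def normalize(values):
    # Hypothetical utility: rescale values to the range [0, 1].
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


@pytest.mark.parametrize(
    "values, expected",
    [
        ([0, 5, 10], [0.0, 0.5, 1.0]),  # pass case: typical input
        ([2, 4], [0.0, 1.0]),           # pass case: two elements
    ],
)
def test_normalize(values, expected):
    assert normalize(values) == expected


def test_normalize_fail_case():
    # fail case: a constant list has no range, so division by zero is expected
    with pytest.raises(ZeroDivisionError):
        normalize([3, 3])
```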
- Functional tests should be located in `testing/functional_tests/`, following the naming pattern `testing/functional_tests/test_${script_to_test}.py`.
- They should call the script using `script_to_test.main(args)` instead of using `subprocess`/`run`. This allows things like `pytest.raises` to be used when checking for an expected exception.
- Shared initialization/setup logic should be implemented in a fixture.
- Sets of `(input_data, expected_results)` can be passed to the functional test via parameterization.
- They should make a best effort at testing the main functionalities provided by the script.
- They should not call logic from another function/module. That would be a higher-level functional test (TBD where to put?).
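The conventions above can be sketched as follows. The `main` function here is a hypothetical stand-in for an ivadomed CLI script's `main()`; in a real functional test, the script module would be imported instead of defined inline:

```python
import argparse

import pytest


def main(argv=None):
    # Hypothetical stand-in for a CLI script's main(args).
    parser = argparse.ArgumentParser()
    parser.add_argument("--threshold", type=float, required=True)
    args = parser.parse_args(argv)
    if not 0.0 <= args.threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return args.threshold


# (input_data, expected_results) pairs passed via parameterization
@pytest.mark.parametrize("argv, expected", [(["--threshold", "0.5"], 0.5)])
def test_main_pass(argv, expected):
    # Calling main(args) directly, rather than via subprocess...
    assert main(argv) == expected


def test_main_fail():
    # ...makes it possible to catch an expected exception with pytest.raises.
    with pytest.raises(ValueError):
        main(["--threshold", "2.0"])
```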
One thing to note: when testing any of the command line scripts that use `multiprocessing`, you should use `script_runner` to call the module instead of `script_to_test.main(args)`. With the latter, the worker processes aren't closed between the tests in your script.
For example, let's say you have a script, `mp_script.py`:

```python
import multiprocessing as mp
import time


def sleep_for_a_bit(seconds):
    print(f"Sleeping for {seconds} second(s)")
    time.sleep(seconds)
    print("Done sleeping!")
    print(mp.current_process())


def main():
    pool = mp.Pool(processes=2)
    pool.map(func=sleep_for_a_bit, iterable=[1 for i in range(0, 2)])
    pool.close()


if __name__ == "__main__":
    main()
```
Let's say you want to test this script using `script_to_test.main(args)`:

```python
import mp_script


def test_mp_script_a():
    mp_script.main()


def test_mp_script_b():
    mp_script.main()
```
If you run this test, you will get something like:

```
Sleeping for 1 second(s)
Sleeping for 1 second(s)
Done sleeping!
Done sleeping!
<SpawnProcess name='SpawnPoolWorker-1' parent=91203 started daemon>
<SpawnProcess name='SpawnPoolWorker-2' parent=91203 started daemon>
Sleeping for 1 second(s)
Sleeping for 1 second(s)
Done sleeping!
Done sleeping!
<SpawnProcess name='SpawnPoolWorker-4' parent=91203 started daemon>
<SpawnProcess name='SpawnPoolWorker-3' parent=91203 started daemon>
```
Notice how, between the two separate tests `test_mp_script_a` and `test_mp_script_b`, the `SpawnPoolWorker`s aren't reset. In this example, the tests would still pass, because the script isn't doing anything other than sleeping. However, if your script relies on the processes being distinct, this could be a problem, as in `automate_training.py`.
To test this correctly, you will need to use `script_runner` from `pytest-console-scripts`:

```python
import pytest


@pytest.mark.script_launch_mode('subprocess')
def test_mp_script_a(script_runner):
    ret = script_runner.run('mp_script')
    print(f"{ret.stdout}")
    print(f"{ret.stderr}")
    assert ret.success


@pytest.mark.script_launch_mode('subprocess')
def test_mp_script_b(script_runner):
    ret = script_runner.run('mp_script')
    print(f"{ret.stdout}")
    print(f"{ret.stderr}")
    assert ret.success
```
If you run this test, you should see something like:

```
Sleeping for 1 second(s)
Done sleeping!
<SpawnProcess name='SpawnPoolWorker-1' parent=91378 started daemon>
Sleeping for 1 second(s)
Done sleeping!
<SpawnProcess name='SpawnPoolWorker-2' parent=91378 started daemon>
Sleeping for 1 second(s)
Done sleeping!
<SpawnProcess name='SpawnPoolWorker-1' parent=91378 started daemon>
Sleeping for 1 second(s)
Done sleeping!
<SpawnProcess name='SpawnPoolWorker-2' parent=91378 started daemon>
```
As you can see, the `SpawnPoolWorker`s are no longer carried over between tests.