
Testing: Split test string #2789

Closed
Lestropie opened this issue Jan 28, 2024 · 6 comments

Comments

@Lestropie
Member

Currently, there are quite a few tests where a single test string actually contains:

  • One or more commands that generate intermediate data prior to running the test
    (partly to reduce storage space of the test data, partly to make the nature of the test data self-explanatory)
  • The command invocation itself
  • Evaluation of the outputs from the command, typically by comparing to a pre-generated output from an earlier version

Questions primarily for @daljit46:

  1. Can CTest "classify" lines based on which of the three categories each belongs to?
  2. Is there an "industry accepted" way of arranging these?
@daljit46
Member

  1. I think the answer here is no. CTest is just a test launcher; it's not aware of what those tests are doing.
  2. I don't know of any big software projects that implement tests the way we do, although there may be projects like terminal shells that use a similar strategy for part of their testing. Searching for "approval testing", I came across this project: https://github.com/DannyBen/approvals.bash.
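
For context, approval testing essentially means capturing a command's output and comparing it against a previously approved snapshot. A minimal hand-rolled sketch of that pattern, purely illustrative (the approve helper here is hypothetical, not the approvals.bash API):

# Minimal sketch of the approval-testing pattern; the "approve" helper
# is hypothetical and not part of any existing framework.
approve() {
  local name="$1"; shift
  local approved="approvals/${name}.approved.txt"
  local received="approvals/${name}.received.txt"
  mkdir -p approvals
  "$@" > "${received}" 2>&1
  if diff -u "${approved}" "${received}"; then
    rm -f "${received}"    # output matches the approved snapshot
  else
    echo "Approval test '${name}' failed; inspect ${received}" >&2
    return 1
  fi
}

# Example usage: compare mrinfo's text output against a stored snapshot.
approve mrinfo_dwi mrinfo dwi.mif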

@Lestropie
Member Author

Given:

I don't know of any big software projects that implement tests the way we do

is the complexity I'm pointing out a problem that arises from that non-conformity, such that adopting a more conventional approach would naturally resolve it? Or does it appear to you that our "unconventional" testing approach exists out of necessity?

Alternatively, I'm wondering whether your comment is more an observation about our extensive use of regression testing rather than unit testing?

Just picking your brain to see whether there's a logical and efficient way to change our tests across the board.

@daljit46
Member

daljit46 commented Jan 31, 2024

For our use case, I think our tests are fairly suitable, especially since they emulate how an end user would use the tools that we develop.

I think the "peculiarity" of our testing framework is mainly due to the use of bash scripts (which is sort of made necessary due to our commands workflow). I was just pointing out that because of this, I don't know about the existence of guidelines and best practices that are commonly used in this scenario.

Alternatively, I'm wondering whether your comment is more an observation about our extensive use of regression testing rather than unit testing?

On this point, I don't think that unit testing and integration/approval testing are mutually exclusive. In an ideal scenario, we would have both. For example, an obvious limitation of our test suite is that we don't test any GUI code. This is because there is no obvious way of running mrview and verifying an "output". While GUI testing is extremely hard, unit testing would at least allow you to test part of the UI logic without running the entire executable.

Furthermore, unit tests may allow you to test certain edge cases more easily than integration tests (e.g. suppose you want to test a square root calculator implemented using add, sub, mul and div functions; an integration/approval test would not easily be able to exercise a division by zero in div, since you may not know in the general case which input would trigger that path). Of course, more testing requires considerable effort upfront, but it may also lead to less effort in the long term (fewer bugs). Introducing unit tests to the codebase would be fairly easy, and note that we don't have an obligation to write tests for everything: just writing tests for (parts of) new code would be an improvement over the current situation.

Just picking your brain to see whether there's a logical and efficient way to change our tests across the board.

I thought about this when I was shifting the build system to CMake. For example, I think we could delineate between the "types" of operation in a bash command by delegating the running of commands to designated bash functions (similar to the GitHub project I linked above). This would allow for better error messages and a more structured way of writing tests. The downside is that our bash tests would (slightly) distance themselves from how users use MRtrix in practice, so I'm not sure this would be worth it.
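
As a rough sketch of what that delegation might look like (the helper names below are hypothetical, not existing MRtrix testing infrastructure), each of the three categories of operation from the OP could be routed through its own function:

# Hypothetical helpers; names are illustrative only.
prepare() {   # commands that generate intermediate data prior to the test
  "$@" || { echo "test setup failed: $*" >&2; exit 2; }
}
run() {       # the command invocation actually under test
  "$@" || { echo "command under test failed: $*" >&2; exit 1; }
}
check() {     # evaluation of outputs against pre-generated reference data
  "$@" || { echo "output check failed: $*" >&2; exit 1; }
}

# An existing test rewritten against these helpers
# (a setup step, if needed, would be wrapped in "prepare" the same way):
run   dwiextract dwi.mif -bzero tmp.mif
check testing_diff_image tmp.mif dwiextract/out2.mif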

However, I do think there are a few low-hanging fruits that we can pick. One of them is that our tests don't have any description or even a name, so it's very difficult to understand what exactly a test is doing and why a certain output is expected. I think we should at least provide comments for each test (or some other mechanism, such as a custom bash function) so that a reader doesn't need to know the exact mechanics of each command to understand what's going on.
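
Something as small as a hypothetical describe helper would be one way to surface that intent in the CTest log; a sketch only:

# Hypothetical helper: state what the test is checking, so that a failure
# reported in the CTest log is self-explanatory.
describe() {
  echo "### $*" >&2
}

describe "dwiextract -bzero should select only the b=0 volumes"
dwiextract dwi.mif -bzero tmp.mif
testing_diff_image tmp.mif dwiextract/out2.mif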

@Lestropie
Member Author

Yes, the absence of labels for the tests is something I have been thinking about in this context but didn't state in the OP. I've tried adding comments in some of the test bash script files, but particularly now with CTest, that then requires cross-referencing. If, for each executable, we were to create a directory, and put each test of that executable into its own file, then the names of those files could convey the purpose of each test. It would also mean that the three steps bullet-pointed in the OP could be split across multiple lines for each individual test. That seems like a pretty sensible change to me.

The catch is that quite often there will be e.g. an mrconvert call at the commencement of the first test, the output of which many subsequent tests then use. But having such order dependence within a set of tests is bad practice anyway; it was purely about efficiency back when we were growing our set of tests from zero.

So to be clear, what I'm proposing is to change from:

$ cat testing/binaries/tests/dwiextract

dwiextract dwi.mif - | testing_diff_image - dwiextract/out1.mif
dwiextract dwi.mif -bzero - | testing_diff_image - dwiextract/out2.mif

to something like:

$ cat testing/binaries/tests/dwiextract/default

dwiextract dwi.mif tmp.mif
testing_diff_image tmp.mif dwiextract/out1.mif

$ cat testing/binaries/tests/dwiextract/bzero

dwiextract dwi.mif -bzero tmp.mif
testing_diff_image tmp.mif dwiextract/out2.mif

@Lestropie
Member Author

One other factor here is that when CI tests were first introduced, we struggled to get them running within the time frame afforded by the CI service. As such, there are a number of tests where a conversion step is done inline as part of one of the initial tests, and its result is re-used by later tests. Since then, not only have CI resources typically become beefier, but ccache should also reduce the Actions job time required for compilation. Therefore I would want to make it so that no test depends on data from any other test, and also that each test cleans up after itself rather than relying on *tmp* globs.

@Lestropie
Member Author

Rather than each test having to manually specify its own cleanup, we could instead have CTest run each test with a trap that deletes any temporary files regardless of test success or failure.
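
As a sketch of that idea (assuming, purely for illustration, that temporary files follow a tmp* naming convention and that CTest launches each test script through a wrapper such as this):

#!/bin/bash
# Hypothetical wrapper that CTest could use to launch each test script.
# The trap removes any tmp* files on exit, whether the test passes or fails.
cleanup() {
  rm -rf tmp*
}
trap cleanup EXIT

bash "$1"        # run the actual test script
exit $?          # propagate its exit status to CTest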
