Improve test coverage #800
Here are some reflections on this:
Agree with that. But I would first test that reading data from different brands/formats results in the same milestone data in g.part1. So we would test the reading and the generation of the part 1 milestone data separately, in a test covering all possible brands/formats (4-hour recordings are probably enough for that); a sketch of such a brand-comparison test follows at the end of this comment. Once we have verified that all brands can be read and produce the same output, we can proceed with a single set of part 1 milestone data, for which I would use metadata from a GENEActiv file so that LUX variables are also included in the output, which facilitates later testing. The challenge is that we need to store such raw data somehow, and it might be too heavy for an R package. In this separate test for part 1, we should make sure that we include:
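To make the brand-comparison idea concrete, here is a minimal sketch with testthat. The fixture folders raw_bin (GENEActiv) and raw_cwa (Axivity) are hypothetical and would each hold the same 4-hour signal; the output layout assumed here is GGIR's usual output_<folder>/meta/basic/meta_*.RData containing object M. Folder names and the tolerance are illustrative, not a definitive implementation:

```r
library(GGIR)
library(testthat)

test_that("part 1 milestone data is identical across brands", {
  outdir <- tempdir()
  brands <- c("raw_bin", "raw_cwa")  # hypothetical fixture folders, one file each
  metashort <- list()
  for (folder in brands) {
    GGIR(mode = 1, datadir = folder, outputdir = outdir,
         do.report = c(), overwrite = TRUE)
    metafile <- list.files(file.path(outdir, paste0("output_", folder),
                                     "meta", "basic"), full.names = TRUE)
    load(metafile[1])  # loads object M; M$metashort holds the epoch-level metrics
    metashort[[folder]] <- M$metashort
  }
  # Same underlying signal, so the derived metrics should match closely
  expect_equal(metashort[["raw_bin"]]$ENMO, metashort[["raw_cwa"]]$ENMO,
               tolerance = 0.001)
})
```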
After thinking about this, I'm not sure this would improve the test coverage. In a project we usually select a specific GGIR configuration, but the GGIR pipeline is so flexible that using the scripts from different projects would mean reprocessing the part 1 milestone data over and over. A potential solution is to use the default GGIR configuration with each of the 5 strategies in part 2, to test the interaction between the 5 parts of the package (see the sketch below), and then test the extra functionalities separately.
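A minimal sketch of that strategy loop, assuming a small raw recording in a hypothetical testdata folder; everything else stays at GGIR defaults so the test exercises the interaction between parts 1-5 rather than any project-specific settings:

```r
library(GGIR)

# Run the full pipeline once per part-2 strategy (1 to 5), with all
# other parameters left at their defaults.
for (strat in 1:5) {
  outdir <- file.path(tempdir(), paste0("strategy", strat))
  dir.create(outdir, showWarnings = FALSE)
  GGIR(mode = 1:5,
       datadir = "testdata",  # hypothetical small raw recording
       outputdir = outdir,
       strategy = strat,
       overwrite = TRUE)
}
```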
I left out part 1 from my proposal because testing part 1 as a whole on real-life data does not seem feasible: we would have to add large files to the package or download them, and even if we had those large files it would take time to process them. So I think the part 1 functionalities are best tested with specific unit tests that cover each functionality separately; this is what we have been doing and will keep doing. Instead, I would now like to focus on creating a higher-level unit test (integration test) that runs the other parts (2-5) with more realistic study data than our current synthetic data or tiny example files. Even 10 MB of GGIR milestone data is too large for a package, so a possible solution could be to include a numeric data.frame with 100 days' worth of real ENMO, MAD, nonwear, anglez, temperature, and LUX values, rounded to 1 decimal place. This would be based on real data from a variety of studies appended to each other. Next, as part of the test, we could write a function to convert this data.frame into semi-synthetic milestone data files, for example split up as multiple recordings (a sketch of such a converter follows below). Advantages I see:
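As a rough illustration of the converter idea: a sketch assuming a bundled data.frame refdata with columns timestamp, ENMO, MAD, nonwear, anglez, temperature, and LUX at 5-second epochs. The milestone layout used here (object M with $metashort/$metalong saved as meta_*.RData) follows GGIR part 1 in spirit, but the exact fields are assumptions and a real implementation should copy the structure from an actual meta file (for instance, real metalong is stored at the long-epoch resolution, which is omitted here for brevity):

```r
# Split a long reference data.frame into several semi-synthetic
# part-1-style milestone files, one per simulated recording.
make_semisynthetic_milestones <- function(refdata, ndays = 7,
                                          epochsize = 5,
                                          outdir = "meta/basic") {
  dir.create(outdir, recursive = TRUE, showWarnings = FALSE)
  epochs_per_recording <- ndays * 24 * 3600 / epochsize
  nrec <- floor(nrow(refdata) / epochs_per_recording)
  for (i in seq_len(nrec)) {
    rows <- ((i - 1) * epochs_per_recording + 1):(i * epochs_per_recording)
    M <- list(
      metashort = refdata[rows, c("timestamp", "ENMO", "MAD", "anglez", "LUX")],
      metalong = refdata[rows, c("timestamp", "nonwear", "temperature")],
      windowsizes = c(epochsize, 900, 3600)  # short epoch, long epoch, window (sec)
    )
    save(M, file = file.path(outdir, sprintf("meta_recording%02d.RData", i)))
  }
}
```

Splitting by a fixed number of days would keep each semi-synthetic recording small while still covering the variety of studies appended to each other.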
If we create a unit test for, let's say, the Whitehall study, then it will at the very least provide an integrated test of all the functionalities they need, including LUX analysis, their specific sleeplog size, their specific ID format, and their specific way of dealing with missing files. Here I do not want to loop over all possible functionalities GGIR offers, as then we are just doing the same as …
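For illustration, a hedged sketch of what such a study-specific integration test could look like; the fixture folder, sleep log path, and parameter values below are hypothetical stand-ins, not the Whitehall study's actual configuration:

```r
library(GGIR)
library(testthat)

test_that("a Whitehall-style configuration runs end to end", {
  outdir <- tempdir()
  GGIR(mode = 1:5,
       datadir = "whitehall_testdata",           # hypothetical fixture folder
       outputdir = outdir,
       idloc = 2,                                # study-specific ID extraction (illustrative)
       loglocation = "whitehall_sleeplog.csv",   # study-specific sleep log
       do.report = c(2, 4, 5),
       overwrite = TRUE)
  # Only check that the pipeline completes and produces a part-5 report
  results <- file.path(outdir, "output_whitehall_testdata", "results")
  expect_gt(length(list.files(results, pattern = "part5")), 0)
})
```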
The test coverage has gone down a bit over the past year: https://app.codecov.io/gh/wadpac/GGIR
Some functionalities are notoriously difficult to test in GitHub Actions:
Some of the functionalities are probably easy to address:
Rather than trying to improve each of these tests, it may be easier to:
In this way, it may be possible to improve the quality of the continuous integration for GGIR parts 2, 3, 4, 5.