-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tests (via testthat) #8
Comments
The only tests I could really think of that would make sense in a data package might be just checking to make sure the data is correct. In one of the test files, I had testthat check the dimensions of the data, datatype, and variable names. But if this doesn't appear to be a pressing (or existant) problem then maybe we don't need them at all. I fully admit to being ignorant in this area. Writing a lot of tests would take some time but since I'm a bit time-limited I could instead work on the documentation for the other datasets and catalog context name use etc. |
I think checking the integrity of the data is something that the package could in principle do, but if we want to do this, we should think about how. One extreme version would be to check each variable and make sure that it is equal to what's in the data. But this becomes a bit silly -- we would then essentially have to store the data one more time in the tests. Checking the dimensions, types, and names seems more sensible. We could also check a hash of the data. This could be done via https://stat.ethz.ch/R-manual/R-devel/library/tools/html/md5sum.html or https://cran.r-project.org/web/packages/digest/index.html I haven't thought through a proper workflow for this, but I think this would in general be a nice addition. |
Maybe just checking md5 hash would be enough for now and if we decided to check dimensions and names later it can be added easily. Starting with checking the hash for each dataset would also take the least amount of time to set up for the datasets. Line 10-12 are how I did checked the hash last time. https://github.com/wviechtb/metadat/blob/5993e1bc2f09ec057866f1d8c0d8171ab61b692f/tests/testthat/test_data_barone2019.R |
Oh, I didn't see that! (sorry, I did not look through all the test files). Yes, that's it. And yes, that should automatically cover things like dimensions and names anyway, so there isn't really a need to check those separately. |
I assumed it would be easier to write too many tests for one dataset and then just cut out what wasn't needed and let it serve as a template. When I get some time I'll start putting something together unless I hear otherwise. |
I think we can close this now. We have the md5 hash checks in place. |
In #5, @kylehamilton suggested to add tests, so that we can do automated testing via
testthat
. Although I am all for testing, I am not sure we should do this in the context of this package. Instead, I think tests of other packages should be done by those packages themselves. So, I removed the tests that were originally added in #5, but I wanted to open up the discussion here in case there are different opinions.The text was updated successfully, but these errors were encountered: