Implement Httptest to cache test and vignette responses #34
Codecov Report
@@ Coverage Diff @@
## master #34 +/- ##
==========================================
+ Coverage 86.09% 87.16% +1.06%
==========================================
Files 4 4
Lines 187 187
==========================================
+ Hits 161 163 +2
+ Misses 26 24 -2
Alternatives: https://blog.r-hub.io/2020/06/03/vignettes/#how-to-include-a-compute-intensive--authentication-dependent-vignette

@maelle, I imagine I can just close this PR now, right?

Yes!

I’ve been going round in circles on my own but haven’t shared this so
far because of issues with the approach (i.e. the size of the cached
data, see below), but I realised that if I just share the alterations to
the code, it obviously won’t work on GitHub Actions but others could at
least pull the branch and experiment locally to give feedback.
So I decided to write up the issues I’ve been coming up against, along
with some steps and proposed solutions so far, to get some feedback
from you!
The PR relates to #26 and describes using `httptest` to cache test and vignette queries, along with the problems encountered so far:

Using `httptest`

Approach:

Tests

Wrap functions in `with_mock_dir`. Provide a unique folder name that will be used or created under `tests/testthat/`.

Vignettes

Use `start_vignette()` at the start of any vignette to either use previously recorded responses, if they exist, or capture real responses for future use. Use `end_vignette()` at the end.

Result:

HTTP request responses for tests are cached in a named folder in `tests/testthat/`, whereas those for vignettes are cached in the `vignettes/` folder.

Issues
1. Paths to the created files are too long, triggering an error about “non-portable file paths” during checks. Need to add a request preprocessing function as described in the `faq` vignette.
2. Size of files can be too large and add significantly to the size of the repo. Currently the `tests/testthat/` folder is 37.8 MB whereas the vignettes folder is a whopping 1.5 GB! To run tests with this framework on GitHub Actions, though, all these files would need to be committed to git. For this reason, although I’ve committed the code alterations I’ve made to incorporate `httptest`, I am ignoring the actual cached data folders for the time being until a solution is decided.
3. Doesn’t work out of the box for WFS! According to the docs, once the tests have been run once, they should run successfully the next time, even without an internet connection. However, when run without internet, some tests appear to fail or throw warnings (but not all).
categorical filters work
Warning during test run:
layer_attributes_summarise works
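For reference, the record-then-replay behaviour that issue 3 relies on can be sketched as below. This is only a minimal illustration: the folder name `wfs-demo` and the URL are placeholders rather than names from this PR, and the test merely checks that a response object comes back.

```r
library(testthat)
library(httptest)

# First run (with internet): the real response is recorded under
# tests/testthat/wfs-demo/. Later runs: the recorded response is replayed.
# "wfs-demo" and the URL below are placeholders for illustration only.
with_mock_dir("wfs-demo", {
  test_that("a cached response is returned", {
    resp <- httr::GET("https://example.org/wfs?request=GetCapabilities")
    expect_s3_class(resp, "response")
  })
})

# In a vignette, the equivalent calls go in the first and last chunks:
# start_vignette("my-vignette")  # placeholder name; replays or records
# ... vignette code that makes requests ...
# end_vignette()
```

If the folder already exists, no request should leave the machine, which is why the tests are expected to pass offline.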
Solutions [WIP]
Follow the instructions to implement an appropriate `set_requester` function (will tackle last).
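A rough sketch of what such a requester could look like, following the pattern in the `httptest` FAQ. The base URL and the short `wfs/` prefix are placeholders, and where the call should live (for example a testthat setup file) is an assumption still to be checked for this package.

```r
library(httptest)

# Replace a long, repeated URL prefix with a short one so that recorded mock
# files end up with shorter paths. The URL and "wfs/" are placeholders.
shorten_wfs_path <- function(request) {
  gsub_request(request, "https\\://example.org/geoserver/", "wfs/")
}

# Register the function so it is applied to every request before the mock
# file path is derived from it.
set_requester(shorten_wfs_path)
```

The same function would need to be active both when recording and when replaying, otherwise the recorded paths would not match the requested ones.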
One option to tackle the size of cached data is to refactor our tests and vignettes (vignettes especially) to request smaller layers. To that effect, I wrote a short script to query each layer of each service (but time out and return `NA` after 5 secs for efficiency) and return the size of each layer. You can find both the script & the result csv in the `attic/` dir. Using that info, we can focus on the smaller layers in our examples, tests and vignettes.
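The timeout-and-`NA` idea can be sketched in base R as follows. This is not the script in `attic/`: `size_or_na()` and the stand-in fetch functions are hypothetical, and the size here is the in-memory size reported by `object.size()`, which may differ from how the actual script measures a layer.

```r
# Evaluate `fetch()` with a time limit and return the size of its result in
# bytes; return NA if it errors or runs longer than `timeout` seconds.
size_or_na <- function(fetch, timeout = 5) {
  setTimeLimit(elapsed = timeout, transient = TRUE)
  # Always lift the limit again, whether the call succeeded or not.
  on.exit(setTimeLimit(elapsed = Inf, transient = FALSE), add = TRUE)
  tryCatch(
    as.numeric(utils::object.size(fetch())),
    error = function(e) NA_real_
  )
}

# Stand-ins for real layer requests, one function per layer.
fetchers <- list(
  small_layer = function() seq_len(10),
  slow_layer = function() {
    for (i in 1:100) Sys.sleep(0.1)  # takes about 10 seconds if not interrupted
    seq_len(10)
  }
)

# A 1 second limit keeps the demo quick; the write-up above uses 5 seconds.
sizes <- vapply(fetchers, size_or_na, numeric(1), timeout = 1)
layer_sizes <- data.frame(layer = names(sizes), size_bytes = unname(sizes))
```

In real use each function would request one layer from one service, and `layer_sizes` could be written out with `write.csv()` and sorted to pick the smallest layers.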
From a similar question in the rOpenSci slack (where this super useful resource was shared 😉: https://blog.r-hub.io/2020/05/29/distribute-data/#data-outside-of-your-packag ), a suggestion was to store large cached data externally from the package repo and pull it in during testing. That obviously requires internet and effort to implement, as not only do we need the data but we also need a mechanism for periodically updating the cached data. Setting this up, at first thought, feels like it would require duplication of code somewhere else, introducing the need to keep two copies of it synced, so it would need some thought as to how to do this successfully.
`httptest` is the right way to go. It feels like most of the warnings / errors generated with no internet are the result of the interaction of the `sf` package with the requests once they are received. Indeed, most of the tests throw warnings, not actual errors, so they actually pass.

When there IS an internet connection (which there will always be on GitHub Actions), they all pass, and more quickly than on `master`, so I’m assuming the cached data is being used, with some time still needed to process the cached response into `sf` at test runtime.
Testing on `http-tests` with cache & internet (2 fewer tests):

Testing on `master` without cache (tests fail because of a change in the CRS definition on the server, corrected in this PR):
So not sure how much of a problem this is. It’s just a bit
unsatisfactory as technically these should now work without internet.
Overall just wanted to update you and make the issues a bit more visible
and see if anyone had any feedback or suggestions.