{{ message }}

# ropensci / software-review Public

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

# wateRinfo package#255

Closed
11 of 19 tasks
opened this issue Oct 2, 2018 · 28 comments
Closed
11 of 19 tasks

# wateRinfo package#255

opened this issue Oct 2, 2018 · 28 comments
Assignees
Labels

### Summary

• What does this package do? (explain in 50 words or less):

wateRinfo facilitates the access to a variety of environmental water-related data about Flanders (Belgium) available on waterinfo.be, a website managed by the Flemish Environmental Agency.

• Paste the full DESCRIPTION file inside a code block below:
Package: wateRinfo
Title: Download Time Series Data from Waterinfo.be
Version: 0.2.0
Description: The wateRinfo package provides an R interface to the data available
at waterinfo.be, the data portal provided by the Flemish Environmental
Agency. The package provides R functions to check the available stations for
a given variable, check the available variable for a given station and
download time series using the database identifiers as well as station
names.
Authors@R: c(
person("Stijn", "Van Hoey", role = c("aut", "cre"), email = "stijn.vanhoey@inbo.be", comment = c(ORCID = "0000-0001-6413-3185")),
person("Willem", "Maetens", role = "ctb", email = "w.maetens@vmm.be"),
person("Peter", "Desmet", role = "ctb", email = "peter.desmet@inbo.be", comment = c(ORCID = "0000-0002-8442-8025"))
)
URL: https://github.com/inbo/wateRinfo, https://inbo.github.io/wateRinfo
BugReports: https://github.com/inbo/wateRinfo/issues
Depends:
R (>= 2.10)
Imports:
dplyr,
httr,
jsonlite,
openssl,
lubridate (>= 1.6.0),
rlang,
utils
Suggests:
covr,
ggplot2,
knitr,
rmarkdown,
testthat
LazyData: true
Encoding: UTF-8
VignetteBuilder: knitr
RoxygenNote: 6.1.0.9000

• URL for the package (the development repository, not a stylized html page):

https://github.com/inbo/wateRinfo

• Please indicate which category or categories from our package fit policies this package falls under *and why(? (e.g., data retrieval, reproducibility. If you are unsure, we suggest you make a pre-submission inquiry.):

data retrieval: because the package provides R access to environmental data (water level and tidal information, discharge data, water quality variables, meteorological variables...) provided by the Flemish Environmental Agency. Without the package, people need to download data manually using the web interface or using the available API calls. The web interface, although evolving, does not support downloads for a list of stations and variables leading to lots of clicks to download data. The API documentation is limited and only available in Dutch. The package overcomes these limitations, supporting downloads of multiple variables and stations more efficiently.

•   Who is the target audience and what are scientific applications of this package?

Anyone interested in using the water-related data from waterinfo.be, for example hydrologists and ecologists. Scientific applications are ranging from environmental prediction and risk assessment modelling studies (e.g hydrological flood models) to ecological studies requiring environmental data.

To our knowledge, there is no other R package to retrieve data from waterinfo.be. Although some loose scripts have been circulating among researchers from different institutes, this package aims to support a collaborative/community effort.

•   If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.

### Requirements

Confirm each of the following by checking the box. This package:

• does not violate the Terms of Service of any service it interacts with.
• has a CRAN and OSI accepted license.
• contains a README with instructions for installing the development version.
• includes documentation with examples for all functions.
• contains a vignette with examples of its essential functions and uses.
• has a test suite.
• has continuous integration, including reporting of test coverage, using services such as Travis CI, Coveralls and/or CodeCov.
• I agree to abide by ROpenSci's Code of Conduct during the review process and in maintaining my package should it be accepted.

#### Publication options

• Do you intend for this package to go on CRAN?
• Do you wish to automatically submit to the Journal of Open Source Software? If so:
• The package has an obvious research application according to JOSS's definition.
• The package contains a paper.md matching JOSS's requirements with a high-level description in the package root or in inst/.
• The package is deposited in a long-term repository with the DOI:
• (Do not submit your package separately to JOSS)
• Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:
• The package is novel and will be of interest to the broad readership of the journal.
• The manuscript describing the package is no longer than 3000 words.
• You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see MEE's Policy on Publishing Code)
• (Scope: Do consider MEE's Aims and Scope for your manuscript. We make no guarantee that your manuscript will be within MEE scope.)
• (Although not required, we strongly recommend having a full manuscript prepared when you submit here.)
• (Please do not submit your package separately to Methods in Ecology and Evolution)

### Detail

• Does R CMD check (or devtools::check()) succeed? Paste and describe any errors or warnings:

• Does the package conform to rOpenSci packaging guidelines? Please describe any exceptions:

• NEWS file - there is one now, but only started with version 0.1.1 and using the ropensci structure since 0.2.0
• Package name: Unfortunately, the name contains capital letters (wateRinfo), but this is how the package is already in use for almost a year
• The main download function names are setup as get_** which does not use the object_verb() recommendation. As these are the main functions, we would keep the naming as such.
• We did not use the @family tag as this provides full control on the order in between different sections. Considering the limited set of functionalities, the management of the pkgdown yml file is still feasible.
• We do run the documentation website locally before deployment.
• The current set of untested functionalities consists of interactions (mainly exceptions) with the waterinfo.be API which are hard to replicate.
• If this is a resubmission following rejection, please explain the change in circumstances:

• If possible, please provide recommendations of reviewers - those with experience with similar packages and/or likely users of your package - and their GitHub user names:

### sckott commented Oct 2, 2018

 thanks for your submission @stijnvanhoey - editors are discussing and we'll get back to you soon

### sckott commented Oct 3, 2018

 @stijnvanhoey et al. Do other regions of Belgium have water data available? If so, can the pkg be generalized to use those other data?

### stijnvanhoey commented Oct 3, 2018

 Good point. I'll try to provide an overview of the information I'm aware of. Notice that we have for water-related issues different governmental levels: national level and regional levels, i.e. Flanders, Walloon and Brussels). Rivers are monitored on regional (and subregional) level: The main data source for meteorological data (rainfall, evapotranspiration,...) on the national level is the Royal Meteorological Institute of Belgium, but they do not provide open data for the moment. For the Walloon region, there is no data on the open data platform, but there data on discharge on their aqualim website. The latter presents data from the last 30 days online and a form to download data as an excel-sheet when providing contact details etc. For water quality, there are two websites: aquapol and aquaphyc providing water quality data for a number of stations. I do not have a direct idea how these could be incorporated (no REST or something alike services)? For Brussels, the is no water data available on the open data platform. Their main river is Zenne, see the map. I do know of a project belini with a monitoring component, but the data seems not available as open data or a webservice. The Flemish environmental agency (VMM) also provides data about water quality, but they told us their water quality system is currently in redevelopment (with other endpoints), so rather something to take into scope when the new system will be in place. There is a database/webportal on soil and groundwater in Flanders, called DOV, which is, among other subsoil data (e.g. geological data), the aggregator for groundwater time series. There is a current initiative to increase the number of data contributors to DOV and a Python package for data access to DOV is currently in development as well, https://github.com/dov-vlaanderen/pydov. As the groundwater level data is thematically (and technically) linked to the other data sources of DOV (geological interpretations, boreholes,...) we consider it out of scope for the wateRinfo package. Notice that we opt for GBIF to publish species occurrences related to water, e.g. fish, so this data is covered by the rgbif package. As such, to my knowledge, there is no direct other source of data to incorporate. I'm adding @WillemMaetens (VMM), @pietercolpaert (open knowledge), @Sachagobeyn, @bartpannemans, @pjhaest to the discussion, maybe they know about other initiatives?

### Sachagobeyn commented Oct 4, 2018

 Dear all, With respect to water quality data hosted by the VMM (Flanders): as I am currently aware, they are not available open source. They can be requested via the VMM website, however from my own experience, requests are often declined. An alternative for this type of information is the European Union Open Data Portal, in which ecological water quality, and a number of physico-chemical water quality variables are reported. For the latter, it should noted that data are only available for a few years. In addition, the datasets are not 'callable' as they are in waterinfo (zips hosted on the website).

### sckott commented Oct 5, 2018

 Thanks very much @stijnvanhoey and @Sachagobeyn for the thorough overview. It sounds like there are no other data sources at the moment, so we can move forward with this submission. assigning an editor now

### karthik commented Oct 6, 2018

 👋 @stijnvanhoey. I’ll be editing your submission. Stay tuned for more instructions from me shortly.

### Editor checks:

• Fit: The package meets criteria for fit and overlap
• Automated tests: Package has a testing suite and is tested via Travis-CI or another CI service.
• License: The package has a CRAN or OSI accepted license
• Repository: The repository link resolves correctly
• Archive (JOSS only, may be post-review): The repository DOI resolves correctly
• Version (JOSS only, may be post-review): Does the release version given match the GitHub release (v1.0.0)?

👋 @stijnvanhoey. Apologies for the delay in assigning a reviewer (just returning from travels). Laura has kindly agreed to review. Given that we have a week off in November, I have given her a month to review this submission. I have also posted some results from the goodpractice check (from the goodpractice package). Please address these issues as they will also come up in the review.

---
── GP wateRinfo ────────────────────────────────────────────────────────────────

It is good practice to

✖ avoid long code lines, it is bad for readability. Also,
many people prefer editor windows that are about 80 characters
wide. Try make your lines shorter than 80 characters

R/token.R:44:1
tests/testthat/test-token_usage.R:75:1
tests/testthat/test-waterinfo_call.R:35:1

✖ fix this R CMD check ERROR: Running examples in
‘wateRinfo-Ex.R’ failed The error most likely occurred in: > ###
Name: get_timeseries_tsid > ### Title: Download timeseries data
from waterinfo.be > ### Aliases: get_timeseries_tsid > > ### **
Examples > > get_timeseries_tsid("35055042", from = "2017-01-01",
to = "2017-01-02") Timestamp Value Quality Code 1 2016-12-31
23:00:00 0.00 130 2 2016-12-31 23:15:00 0.00 130 3 2016-12-31
23:30:00 0.00 130 4 2016-12-31 23:45:00 0.00 130 5 2017-01-01
00:00:00 0.00 130 6 2017-01-01 00:15:00 0.00 130 7
...
Error: Waterinfo API request failed [500] Waterinfo error:
InvalidParameterValue Waterinfo return message: Datasource
parameter not found in config. Execution halted
✖ checking tests ... ERROR Running the tests in
‘tests/testthat.R’ failed. Last 13 lines of output: >
library(testthat) > library(wateRinfo) > > test_check("wateRinfo")
── 1. Failure: non existing tsid to API (@test-waterinfo_call.R#21)
─────────── call_waterinfo(query) threw an error with unexpected
message. Expected match: "Waterinfo API request
failed.*InvalidParameterValue" Actual message: "API did not return
json - " ══ testthat results
═══════════════════════════════════════════════════════════ OK: 104
SKIPPED: 4 FAILED: 1 1. Failure: non existing tsid to API
(@test-waterinfo_call.R#21) Error: testthat unit tests failed
Execution halted
✖ fix this R CMD check WARNING: LaTeX errors when creating
PDF version. This typically indicates Rd problems.
✖ fix this R CMD check ERROR: Re-running with no
redirection of stdout/stderr. Hmm ... looks like a package You may
want to clean up by 'rm -Rf /tmp/RtmpHVX5D6/Rd2pdf9fb1c6e9df4'
─────────────────────────────────────────────────────────────────────────


Reviewers: @ldecicco-USGS
Due date: Nov 29, 2018

### stijnvanhoey commented Nov 7, 2018

 Thanks for the information. With respect to the issues with the example and unit tests: Apparently, the unique time series identifiers have changed on the waterinfo.be side for a subset of stations. I was expecting them to be stable, but they are not. I'm doing the best I can to fix these issues asap.

### stijnvanhoey commented Nov 7, 2018 • edited

 New version fix-source-info branch (will be merged soon), has following output for the gp command: ── GP wateRinfo ───────────────────────────────────────────────────────────────────────────────────────────────────────── It is good practice to ✖ write unit tests for all functions, and all package code in general. 87% of code lines are covered by test cases. R/call_waterinfo.R:58:NA R/call_waterinfo.R:59:NA R/call_waterinfo.R:60:NA R/call_waterinfo.R:61:NA R/call_waterinfo.R:62:NA ... and 49 more lines ✖ fix this R CMD check WARNING: LaTeX errors when creating PDF version. This typically indicates Rd problems. ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────  One of the elements missing in the unit testing is the XML-parsing of a specific type of error from the serverside at waterinfo.be (eventhough requesting json). We encountered the issue earlier, but it is hard to reproduce it systematically. Hence, we do not know exactly how to properly handle this. Advice is welcome.

### stijnvanhoey commented Nov 7, 2018

 With respect to the Tex errors, I'm not completely aware on how to tackle this. I tried to execute devtools check with the manual creation option: devtools::check(document = FALSE, cleanup = TRUE, manual = TRUE)  I get the warning: ❯ checking PDF version of manual ... WARNING LaTeX errors when creating PDF version. This typically indicates Rd problems.  However, when I do check the logs and output in the wateRinfo.Rcheck folder, the wateRinfo-manual.pdf is there and the Rdlatex.log is (I just subset some potentially relevant sections): Hmm ... looks like a package This is pdfTeX, Version 3.14159265-2.6-1.40.17 (TeX Live 2016/Debian) (preloaded format=pdflatex) restricted \write18 enabled. entering extended mode (./Rd2.tex LaTeX2e <2016/03/31> Babel <3.9r> and hyphenation patterns for 5 language(s) loaded. ... No file Rd2.aux. ... No file Rd2.toc. ... Overfull \hbox (15.18042pt too wide) in paragraph at lines 370--370 ... Output written on Rd2.pdf (12 pages, 90183 bytes). Transcript written on Rd2.log. Saving output to ‘wateRinfo-manual.pdf’ ... Done You may want to clean up by 'rm -rf /tmp/RtmpeypCkA/Rd2pdf508051cc1987'  And the output looks like: wateRinfo-manual.pdf Another file, wateRinfo-Ex.pdf is also created, but document contains no pages. Any input on how to tackle this issue is certainly welcome.

mentioned this issue Nov 7, 2018

## Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

• As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

#### Documentation

The package includes all the following forms of documentation:

• A statement of need clearly stating problems the software is designed to solve and its target audience in README
• Installation instructions: for the development version of package and any non-standard dependencies in README
• Vignette(s) demonstrating major functionality that runs successfully locally
• Function Documentation: for all exported functions in R help
• Examples for all exported functions in R Help that run successfully locally
• Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).
##### For packages co-submitting to JOSS

I don't think this is the case???

The package contains a paper.md matching JOSS's requirements with:

• A short summary describing the high-level functionality of the software
• Authors: A list of authors with their affiliations
• A statement of need clearly stating problems the software is designed to solve and its target audience.
• References: with DOIs for all those that have one (e.g. papers, datasets, software).

#### Functionality

• Installation: Installation succeeds as documented.
• Functionality: Any functional claims of the software been confirmed.
• Performance: Any performance claims of the software been confirmed.
• Automated tests: Unit tests cover essential functions of the package
and a reasonable range of inputs and conditions. All tests pass on the local machine.
• Packaging guidelines: The package conforms to the rOpenSci packaging guidelines

#### Final approval (post-review)

• The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 3

Holy cow, nicely done! Followed the recommendations to the letter. A very minor suggestions...none of them are mandatory changes:

Maybe you could add a description to the get_token help file that states up front that you do not need to get a token right away. I was just randomly walking through the examples and tried that one towards the beginning. I then went back to the README to realize I didn't need to do it.

Seems like this package would be a prime candidate for language translation. The little bit I looked into this awhile back, I think this SO post could get you started:
https://stackoverflow.com/questions/29093673/how-to-translate-package-content

I wonder if supported_frequencies() would make more sense to output a vector?

There are a lot of Factors returned, but a character vector now-and-then. Factors work great for the plots, maybe not so great for other purposes. I'd double-check that the columns that are coming back as Factors make sense to be factors.

resolve_datasource...is this a function the general user will use? If not, maybe make it an internal function.

It might be useful to the user to offer a way to see or save the URL/GET query of the retrieved data. I'm not sure the best way, one could just be a message when retrieving, one way could be to add it to the attribute of the data.frame. (for the user, they may need to use it to cite the data, for you, you might need to use it when troubleshooting problems)

All-in-all, nice job!

### peterdesmet commented Nov 28, 2018

 @ldecicco-USGS thank you so much for your review, this is really useful input.

mentioned this issue Nov 28, 2018

### stijnvanhoey commented Nov 30, 2018 • edited

 @ldecicco-USGS thanks for your time and effort, it is really appreciated and provided very useful input. I have made changes and responded to your comments/suggestions below, quoting the commit the change was made in. Maybe you could add a description to the get_token help file that states up front that you do not need to get a token right away. I was just randomly walking through the examples and tried that one towards the beginning. I then went back to the README to realize I didn't need to do it. Response: Adapted both the Roxygen header of the get_token function as well as the Readme.md file, ropensci/wateRinfo@1be6f64 and ropensci/wateRinfo@8fc0cc1 Seems like this package would be a prime candidate for language translation. The little bit I looked into this awhile back, I think this SO post could get you started: https://stackoverflow.com/questions/29093673/how-to-translate-package-content Response: As we are not controlling the server-side error messages neither the dutch fields embedded in the waterinfo.be website, we would have to do translations against a moving target that we do not control. Something to keep in mind if documentation of the API service improves, but we would keep this currently out of scope. I wonder if supported_frequencies() would make more sense to output a vector? Response: Indeed, as such users will be able to iterate over these frequencies or subset them. Adapted (and adapted test), ropensci/wateRinfo@6023ca1 There are a lot of Factors returned, but a character vector now-and-then. Factors work great for the plots, maybe not so great for other purposes. I'd double-check that the columns that are coming back as Factors make sense to be factors. Response: Checked this and the main issue was the get_variables function, returning all factors instead of characters. I would keep it to characters by default as suggested. Adapted (and added test), ropensci/wateRinfo@aefecbe and ropensci/wateRinfo@65e0441 resolve_datasource... is this a function the general user will use? If not, maybe make it an internal function. Response: Available to the user, as it iw useful to have a check or control on the so-called datasource (1 or 4 for the moment) themselves. As such, we keep it public available to enable users to do this check. It might be useful to the user to offer a way to see or save the URL/GET query of the retrieved data. I'm not sure the best way, one could just be a message when retrieving, one way could be to add it to the attribute of the data.frame. (for the user, they may need to use it to cite the data, for you, you might need to use it when troubleshooting problems) Response: Great suggestion! I found a possible solution to add the URL of the response as a comment attribute to the returned data.frame. As such we can actually check the URL later... ropensci/wateRinfo@26d48c6, ropensci/wateRinfo@6594e10 Updated roxygen documentation ropensci/wateRinfo@2139ffd and tests ropensci/wateRinfo@c6e04b5 as well. As I used sapply during development of the new tests, goodpractice package warned me against it and I replaced by vapply, ropensci/wateRinfo@6184839 I hope these adaptations incorporates your suggestions? Any suggestions are welcome.

### ldecicco-USGS commented Dec 3, 2018

 Sounds great to me!

### stijnvanhoey commented Dec 6, 2018

 Thanks!

### karthik commented Dec 6, 2018

 Congrats @stijnvanhoey, your submission has been approved! 🎉 Thank you for submitting and @ldecicco-USGS for thorough and timely review. I'm very pleased with this package and find it very easy to use and well put together. To-dos: Transfer the repo to the rOpenSci organization under "Settings" in your repo. I have invited you to a team that should allow you to do so. You'll be made admin once you do. Add the rOpenSci footer to the bottom of your README [![ropensci_footer](https://ropensci.org/public_images/ropensci_footer.png)](https://ropensci.org)  Fix any links in badges for CI and coverage to point to the ropensci URL. (We'll turn on the services on our end as needed) Welcome aboard! We'd also love a blog post about your package, either a short-form intro to it (https://ropensci.org/technotes/) or long-form post with more narrative about its development. ((https://ropensci.org/blog/). If you are, @stefaniebutland will be in touch about content and timing.

### stijnvanhoey commented Dec 7, 2018 • edited

 Thanks for the approval. Great news. I'll to the transfer as soon as possible. @stefaniebutland, I've written a post/story about using the wateRinfo package to download tidal data to study eel behaviour. It is not a short-form intro or a narrative about the development, but would it also be an option as blog post? (notice, the current version is a draft, so text-wise not polished; but the storyline is there already)

### stijnvanhoey commented Dec 7, 2018

 @karthik can you please send the invite again, I can't find any invite to the organisation. Thanks

### karthik commented Dec 7, 2018

 @stijnvanhoey I resent the invite. Once you accept, I will give you admin on it (can't do it till you accept).

### peterdesmet commented Dec 7, 2018 • edited

 👋 could you invite me (@peterdesmet) as well? Other task (for me): Redirect current pkgdown website at https://inbo.github.io/wateRinfo to what I assume will be https://ropensci.github.io/wateRinfo

### stijnvanhoey commented Dec 8, 2018

 @karthik thanks, I'm member now.

### stijnvanhoey commented Dec 10, 2018

 @karthik I transferred the repo, but apparently I should have added the wateRinfo team during transfer. Sorry to ask you again, could you provide the waterinfo team admin access to the repo? Thanks

### stefaniebutland commented Dec 10, 2018

 @stijnvanhoey Your draft post will be great for the rOpenSci blog! Such a good weaving of practical and very engaging story with package use case. That video is incredible. We can publish on Jan 15, 2019. Please submit your draft as a pull request by Jan 8 according to the instructions here: https://github.com/ropensci/roweb2#contributing-a-blog-post. That will give me an opportunity to review prior to publication. Looking forward to this one!

### karthik commented Dec 10, 2018

 @stijnvanhoey Your team should have full admin access on the repo now.

mentioned this issue Dec 11, 2018

### stijnvanhoey commented Dec 11, 2018

 @stefaniebutland Great! We'll make sure to have a version submitted by Jan 8. @karthik thanks for all the help, a PR is ready, just a single check on the coveralls link, ropensci/wateRinfo#47 Any other admin we need to do or can I/you close this issue?

### karthik commented Dec 20, 2018

 @stijnvanhoey All good. Closing now. 🎉

closed this Dec 20, 2018

### stijnvanhoey commented Jan 8, 2019

 @stefaniebutland, we are running into deadline trouble with the blog post. The current version is available here, but we still need to have some internal rewriting and review. Would it still be ok to submit the PR on Monday? Our excuses for the rather late notice.

### stefaniebutland commented Jan 8, 2019

 Not a problem @stijnvanhoey. We're very happy to host your post when it is ready. If you submit a pull request on Monday I'll do my best to do a final review asap after it's in. My responses might be delayed, however, because I'm involved in a workshop next week. The post is in very good shape so I don't expect it will need many changes after my review. If you're not able to submit on Monday, then submit when ready and we can assign a different publication date. rOpenSci will tweet about the post and you are welcome to suggest any tweet-content, preferred images or people who should be tagged and I will work to incorporate them. Thank you for getting in touch!

to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet