New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bomrang #121

Closed
adamhsparks opened this Issue Jun 3, 2017 · 38 comments

Comments

Projects
None yet
7 participants
@adamhsparks
Copy link
Member

adamhsparks commented Jun 3, 2017

Summary

  • What does this package do? (explain in 50 words or less):

bomrang provides functions to interface with Australian Government Bureau of Meteorology (BOM) data, fetching data and returning a tidy data frame of précis forecasts, current weather data from stations or ag information bulletins.

  • Paste the full DESCRIPTION file inside a code block below:
Package: bomrang
Type: Package
Title: Fetch Australian Government Bureau of Meteorology Data
Version: 0.0.4-1
Date: 2017-06-05
Authors@R: c(person("Adam", "Sparks", role = c("aut", "cre"),
    email = "adamhsparks@gmail.com"),
    person("Hugh", "Parsonage", role = "aut",
    email = "hugh.parsonage@gmail.com"),
    person("Keith", "Pembleton", role = "aut",
    email = "keith.pembleton@usq.edu.au"))
Description: Provides functions to interface with Australian Government Bureau
    of Meteorology (BOM) data, fetching data and returning a tidy data frame of
    précis forecasts, current weather data from stations or agriculture 
    bulletin data.
URL: https://github.com/ToowoombaTrio/bomrang
BugReports: https://github.com/ToowoombaTrio/bomrang/issues
License: MIT + file LICENSE
Depends:
    R (>= 3.2.0)
Imports:
    curl,
    data.table,
    dplyr (>= 0.7.0),
    foreign,
    httr,
    magrittr,
    readr,
    jsonlite,
    tidyr,
    tools,
    utils,
    xml2
Encoding: UTF-8
LazyData: true
Suggests:
    covr,
    testthat,
    knitr,
    rmarkdown
RoxygenNote: 6.0.1
NeedsCompilation: no
ByteCompile: TRUE
VignetteBuilder: knitr
  • URL for the package (the development repository, not a stylized html page):
    https://github.com/ToowoombaTrio/bomrang

  • Please indicate which category or categories from our package fit policies this package falls under and why? (e.g., data retrieval, reproducibility. If you are unsure, we suggest you make a pre-submission inquiry.): data retrieval

  • Who is the target audience? Scientists that work with meteorological data from BOM or anyone interested in working with BOM data in R or other applications more easily than the formats BOM provides to the public

  • Are there other R packages that accomplish the same thing? If so, what is different about yours?

There are a few other R packages that I've found that provide access BOM data or ways to import it into R.

  • BOMdataRipper, https://github.com/johnDorian/BOMdataRipper, provides the most similar functionality for BOM data as bomrang but is not yet on CRAN and development seems sporadic. It only fetches station data, it does not provide the ag bulletins or précis forecasts like bomrang. It uses the older XML and RCurl packages, whereas bomrang uses the newer xml2 and curl packages to take advantage of more efficient R packages. Additionally, BOMdataRipper provides functionality for visualising the data. bomrang takes a much more UNIX like approach only doing one thing, downloading data and serving it to the end-user in an R data.frame object.

  • bomdata, https://github.com/mbertolacci/bomdata, this package provides only rainfall data and site metadata only in a SQLite database.

  • ReadAxfBOM, https://github.com/MarkusLoew/ReadAxfBOM, this package reads Axf files from BOM but does not automate the downloading, only importing from local
    files and does not offer forecasts or ag bulletins.

  • bomr, https://github.com/njtierney/bomr, this package is the basis for much of bomrang. It was set up for the #AUUnconf2016, which I attended. We didn't end up with a package but a vignette that is referenced in the bomrang README file. Several of the ideas presented here could be incorporated into bomrang.

The rOpenSci packages GSODR and rnoaa provide similar functionality for differing data sets. There is some overlap with data in GSODR and rnoaa (GHCN data), but the audience and end use will probably differ in most cases.

Lastly, bomrang is just a cool name. Not that I can take credit for it or that it affects the package functionality, but I'll just point it out anyway.

Requirements

Confirm each of the following by checking the box. This package:

  • does not violate the Terms of Service of any service it interacts with.
  • has a CRAN and OSI accepted license.
  • contains a README with instructions for installing the development version.
  • includes documentation with examples for all functions.
  • contains a vignette with examples of its essential functions and uses.
  • has a test suite.
  • has continuous integration, including reporting of test coverage, using services such as Travis CI, Coeveralls and/or CodeCov.
  • I agree to abide by ROpenSci's Code of Conduct during the review process and in maintaining my package should it be accepted.

Publication options

  • Do you intend for this package to go on CRAN?
  • Do you wish to automatically submit to the Journal of Open Source Software? If so:
    • The package contains a paper.md with a high-level description in the package root or in inst/.
    • The package is deposited in a long-term repository with the DOI: http://doi.org/10.5281/zenodo.598301
    • (Do not submit your package separately to JOSS)

Detail

  • Does R CMD check (or devtools::check()) succeed? Paste and describe any errors or warnings: None

  • Does the package conform to rOpenSci packaging guidelines? Please describe any exceptions: Yes, no exceptions

  • If this is a resubmission following rejection, please explain the change in circumstances:

  • If possible, please provide recommendations of reviewers - those with experience with similar packages and/or likely users of your package - and their GitHub user names:

@adamhsparks

This comment has been minimized.

Copy link
Member Author

adamhsparks commented Jun 6, 2017

We're aware that jsonlite is preferred over rjson by rOpenSci and are addressing that.

Right now I've unchecked the box for meeting rOpenSci packaging guidelines.

ropensci/bomrang#19

@adamhsparks

This comment has been minimized.

Copy link
Member Author

adamhsparks commented Jun 7, 2017

We've addressed this, bomerang now uses jsonlite

ropensci/bomrang@fdee7bd

@karthik

This comment has been minimized.

Copy link
Member

karthik commented Jun 19, 2017

@adamhsparks Apologies for the delay but I am just getting ready to process your submission. I will follow up with some initial feedback and reviewer assignments shortly.

@karthik

This comment has been minimized.

Copy link
Member

karthik commented Jun 21, 2017

Editor checks:

  • Fit: The package meets criteria for fit and overlap
  • Automated tests: Package has a testing suite and is tested via Travis-CI or another CI service.
  • License: The package has a CRAN or OSI accepted license
  • Repository: The repository link resolves correctly
  • Archive (JOSS only, may be post-review): The repository DOI resolves correctly
  • Version (JOSS only, may be post-review): Does the release version given match the GitHub release (v1.0.0)?

Editor comments

Thanks for your submission, @adamhsparks! I'm currently looking for reviewers. 🙏

In the meantime here is the output of goodpractice::gp() that can help you and the reviewers. Note that reaching 100% code coverage isn't compulsory, the output just encourages you to write more of them.

It is good practice to

  ✖ write unit tests for all functions, and all package code
    in general. 22% of code lines are covered by test cases.

    R/get_ag_bulletin.R:44:NA
    R/get_ag_bulletin.R:46:NA
    R/get_ag_bulletin.R:49:NA
    R/get_ag_bulletin.R:52:NA
    R/get_ag_bulletin.R:53:NA
    ... and 553 more lines

  ✖ omit "Date" in DESCRIPTION. It is not required and it
    gets invalid quite often. A build date will be added to the package
    when you perform `R CMD build` on it.
  ✖ avoid long code lines, it is bad for readability. Also,
    many people prefer editor windows that are about 80 characters
    wide. Try make your lines shorter than 80 characters

    R/get_ag_bulletin.R:126:1
    R/get_current_weather.R:121:1
    R/get_current_weather.R:129:1
    R/get_precis_forecast.R:124:1
    R/internal_functions.R:7:1
    ... and 10 more lines

  ✖ not import packages as a whole, as this can cause name
    clashes between the imported packages. Instead, import only the
    specific functions you need.
  ✖ fix this R CMD check NOTE: Namespace in Imports field not
    imported from: ‘tools’ All declared Imports should be used.
@karthik

This comment has been minimized.

Copy link
Member

karthik commented Jun 21, 2017

Assigning @mpadge as reviewer 1 (Please get your review in before July 12th)

@karthik

This comment has been minimized.

Copy link
Member

karthik commented Jun 21, 2017

👋 @geanders! Will you be able to also review this submission?

@geanders

This comment has been minimized.

Copy link
Collaborator

geanders commented Jun 21, 2017

@karthik

This comment has been minimized.

Copy link
Member

karthik commented Jun 21, 2017

Assigning @geanders as reviewer 2.
Please get your review in before July 12th (and let me know if you will be delayed for any reason). 🙏

@adamhsparks

This comment has been minimized.

Copy link
Member Author

adamhsparks commented Jun 22, 2017

Hi @karthik, thanks for your suggestions. A few comments and some questions for discussion.

  • I've removed the "Date" field from DESCRIPTION

  • I've removed tools and it passes checks, but when I had checked this locally and on WinBuilder I didn't get any NOTES, i.e. tools not being used? How did you find that?

  • Coverage is better than it appears using gp() since it's not possible to run the tests on CRAN to download files, Travis takes ~20 minutes to run, but coverage is 99% according to covr.io

  • What is the thought on long lines? I've reduced line lengths everywhere in code as a general practice except where it's a comment for the user. Should these be broken just to fit the code window with indentation? Change indentation? Something else?

  • I've also imported only the functions necessary for bomrang to work, not as a whole

@mpadge

This comment has been minimized.

Copy link
Contributor

mpadge commented Jun 24, 2017

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need clearly stating problems the software is designed to solve and its target audience in README
  • Installation instructions: for the development version of package and any non-standard dependencies in README
  • Vignette(s) demonstrating major functionality that runs successfully locally
  • Function Documentation: for all exported functions in R help
  • Examples for all exported functions in R Help that run successfully locally
  • Community guidelines including contribution guidelines in the README or CONTRIBUTING, and URL, Maintainer and BugReports fields in DESCRIPTION

No contribbution guidelines; otherwise all okay

Paper (for packages co-submitting to JOSS)

The package contains a paper.md with:

  • A short summary describing the high-level functionality of the software
  • Authors: A list of authors with their affiliations
  • A statement of need clearly stating problems the software is designed to solve and its target audience.
  • References: with DOIs for all those that have one (e.g. papers, datasets, software).

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software been confirmed.
  • Performance: Any performance claims of the software been confirmed.
  • Automated tests: Unit tests cover essential functions of the package
    and a reasonable range of inputs and conditions. All tests pass on the local machine.
  • Packaging guidelines: The package conforms to the rOpenSci packaging guidelines

I note only that four levels of versioning might be overkill - current version is 0.0.4-1. Up to the authors, but with just 3 levels you'd now be up to 0.0.11 (or maybe 0.2.x), which still allows an approximately infinite future before version numbers get too high, and makes it easier for others to track version numbers.

Final approval (post-review)

  • The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 6


Review Comments

This package certainly provides a unique and demonstrably important functionality of allowing ready access to data from the Australian Bureau of Meteorology (BoM). It's very cleanly coded and the authors are to be commended for ensuring the tidyness of all data.frame objects returned from the primary functions.

Note that as a former meteorologist for the BoM turned producer of R data packages, this idea has long been on my to-do/wish list. Accordingly, please do not interpret the following comments as overt criticisms, rather as reflections of my desire to see this package become the best it can! I'm merely a bit envious that you guys got around to it before i did, and came up with such a great package name for this task.


Loading package

No problems installing from a local tarball, but problems arose after installing using devtools::install_all()

devtools::install_github("ToowoombaTrio/bomrang",lib="~/R/x86_64-pc-linux-gnu-library/3.4")

This installs okay, but then:

update_forecast_locations()
# Overwriting existing database
# Error in bzfile(file, "wb") : cannot open the connection
# In addition: Warning message:
# In bzfile(file, "wb") :
#   cannot open bzip2-ed file '<local_dirs>/bomrang/inst/data/AAC_codes.rda',
#   probable reason 'No such file or directory'

This arises largely because the file paths for internal data are not specified correctly. The following instances exemplify:

  1. The line (#34) destfile = paste0(tempdir(), "AAC_codes.dbf") is wrong because it won't insert the necessary system delimiter. All args should simply be passed to file.path, and not paste0. At it stands, this creates the .dbf file one directory up from tempdir(), with a name that is a mash of the last part of tempdir() plus the desired file name, so the function works but not really as intended. Critically, it creates a file outside the tempdir() of the current env, so will fail on any systems on which users do not have permission to do that!

  2. Same for the subsequent lines (#39-40)

    AAC_codes <-
         foreign::read.dbf(paste0(tempdir(), "AAC_codes.dbf"), as.is = TRUE)
    

    This successfully reads back the file, but not from tempdir(), rather from one dir up.

  3. Same again for L#47-48

     path <- file.path(file.path(pkg, "data"), paste0("AAC_codes.rda"))
    

    this should just be

     path <- file.path(file.path(pkg, "data"), "AAC_codes.rda")
    
  4. But Step#3 won't work in a failsafe way because, for example, using devtools::load_all() generates

    system.file(package = "bomrang")
    # [1] "<local>/bomrang/inst"
    

    and

    list.files(system.file(package = "bomrang"))
    # [1] "CITATION" "paper"
    

    this does not contain the expected /data directory. Installing tarball or using devtools::install_github() returns the expected result (that is, the full pkg with data/ directory).

  5. Same issues will also affect update_station_locations()

All of this ought to be able to be resolved by simply using robust commands to access the internal package data, such as:

system.file ("data", "AAC_codes.rda", package = "bomrang")

This just works as is for all of the installation methods described above, but also leads to another suggestion

package data

In line with the general principle of exporting as little as possible for package functionality, i suspect it would be better to put all of this data in inst/data so it's not lazy-loaded with the package. Will the users really need access to "AAC_codes" or "JSONurl_latlon_by_station_name"? Putting these in inst/data will make the package considerably tidier.

Also note that "JSONurl_latlon_by_station_name" only just passes lint tests for variable names which should be <= the precisely 30 chars this is. Could it be shorter?


package functionality

All functions perform as described and intended, and the data.frame tidying is exemplary. Given the amount of work the Toowoomba Trio (also a great name, by the way!) put into tidying, I wonder whether it might not be worth while returning all data.frame objects as tibble objects? This would offer a couple of distinct advantages:

  1. Many functions, such as get_precis_forecast(), return very large dfs. Naively examining these quickly fills the screen with junk, and requires a doubling of much code in the vignette such as,
    QLD_forecast <- get_precis_forecast(state = "QLD")
    head(QLD_forecast)
    
    If it were a tibble, it would suffice to simply have,
    get_precis_forecast(state = "QLD")
    
  2. At the moment,
    class (stations_site_list)
    # [1] "tbl_df"     "tbl"        "data.frame"
    
    yet all other objects are data.frame objects. I suspect this particular data is worth keeping in data/ (rather than inst/data), yet having one tibble and data.frame objects for everything else is perhaps inconsistent (even untidy, one might assert).

Happy to leave the decision of whether to do this to the trio.


Package functions

get_precis_forecast

Given the nice way you've used agrep for partial matching of get_current_weather(), why not use that for state names? I should be able to enter just one letter for all states except NSW and NT which would require two.

vic <- get_precis_forecast(state = "VIC") # okay
vic <- get_precis_forecast(state = "vic") # okay
vic <- get_precis_forecast(state = "vi") # Not okay
vic <- get_precis_forecast(state = "victoria") # Not okay - really!!?

get_current_weather

w <- get_current_weather("castle")
# Warning message:
# In get_current_weather("castle") :
#   Multiple stations match station_name. Using
#         station_name = 'NEWCASTLE NOBBYS SIGNAL STATION AWS'
# did you mean:
#         station_name = 'WILLIAMTOWN (NEWCASTLE  RADAR)NEWCASTLE
# UNIVERSITYCASTLEMAINE PRISONCASTERTON'?

Note that the recommended station_name values are not separated by spaces, which they should be - please fix. Then note that no warning is generated for

w <- get_current_weather("castlemaine")
w <- get_current_weather("castlema")

yet one is nevertheless generated for the unambiguous

w <- get_current_weather("castlem")

Then note that,

w <- get_current_weather("castlema")
w$name [1]
# [1] "Castlemaine"

and yet the warning told me station_name = 'CASTLEMAINE PRISON'. I think there needs to be more consistency in handling the names of observation stations and what they are referred to in the data.frame objects, and personally suspect that the full names ought to be used at all times - so this ought to be "CASTLEMAINE PRISON" at all times. This is likely to be particularly important because many stations within a given town/city/area have changed location over time, and second denominators often serve to clarify which station is being used. ("CASTLEMAINE PRISON" is, for example, relatively new, with previous observations at "CASTLEMAINE" actually taken at a different location.)

Now note the nomenclature for the names of weather stations,

names (stations_site_list) # it's called "name"
names (get_current_weather("castle"))
# it's called "name", but the `Warning` refers to `station_name`
names (get_precis_forecast(state = "vic")) # it's called "location"

The variable names which hold these stations also ought to be consistently named.

Digging a bit further

w <- get_current_weather("castlema")
cbind (w$lat, w$lon)
#       [,1]  [,2]
# [1,] -37.1 144.2
# [2,] -37.1 144.2
# [3,] -37.1 144.2

and yet,

w <- get_current_weather(latlon = c (-37, 144))
# Using station_name = 'CASTLEMAINE PRISON', at latitude = -37.0809, longitude =
#    144.2392

Why are the locations rounded in the data.frame? Surely it would be better to keep the full values?

On a positive note: The conversion to local user times is fantastic and likely to be highly useful and informative - great idea!

get_ag_bulletin

ag_bulletin <- get_ag_bulletin(state = "tassie")

Oh yeah, that doesn't work. Okay, try this,

ag_bulletin <- get_ag_bulletin(state = "tasmania")
# Error in get_ag_bulletin(state = "tasmania") :
#   TASMANIAis not recognised as a valid state or territory

What?! But actually, I just reiterate that point to show again a missing white space ("TASMANIAis") that needs to be fixed. Okay,

ag_bulletin <- get_ag_bulletin(state = "tas")
head (ag_bulletin)

Now these locations are called "station", so we've now had locations variously referred to in four different ways:

  1. "name"
  2. "station_name"
  3. "location"
  4. "station"

Please fix!

Finally, the following code re-iterates some of the points raised above:

agb <- get_ag_bulletin(state = "vic")
agb [which (agb$station == "Kyabram"), ]
#        obs_time_local        obs_time_utc time_zone  site dist station start
# 5 2017-06-24 09:00:00 2017-06-23 23:00:00       EST 80091   80 Kyabram  1964
#    end state     lat      lon elev bar_ht   WMO r  tn   tx twd ev tg sn t5 t10
# 5 2017   VIC -36.335 145.0638  105     NA 95833 0 0.5 12.7  NA NA NA NA NA  NA
#   t20 t50 t1m wr
# 5  NA  NA  NA NA
get_current_weather (latlon = c (-36.3, 145)) [1, ]
# Using station_name = 'KYABRAM', at latitude = -36.335, longitude = 145.0638
#   sort_order   wmo    name history_product local_date_time local_date_time_full
# 1          0 95833 Kyabram        IDV60801      24/10:30pm  2017-06-24 22:30:00
#          aifstime_utc   lat   lon apparent_t cloud cloud_base_m cloud_oktas
# 1 2017-06-24 12:30:00 -36.3 145.1        5.1     -           NA          NA
#   cloud_type cloud_type_id delta_t gust_kmh gust_kt air_temp dewpt press
# 1          -            NA     0.7        0       0      6.3   4.8    NA
#   press_msl press_qnh press_tend rain_trace rel_hum sea_state swell_dir_worded
# 1        NA        NA          -          0      90         -                -
#       swell_height swell_period vis_km weather wind_dir wind_spd_kmh wind_spd_kt
#   1           NA           NA      -       -     CALM            0           0
  1. The warning "Using station_name = " is misleading because it's here called "name"
  2. The lat and lon values from get_ag_bulletin() are full precision, yet again truncated in get_current_weather()

lint and goodpractise notes

I note here that the goodpractise notes regarding long lines mostly refer to lines such as this (from get_ag_bulletin():

stop(
  "\nThe server with the bulletin files is not responding. Please retry again later.\n"
))

This would be better written,

stop(paste0("\nThe server with the bulletin files is not responding.",
            "Please retry again later.\n"))

This is neater and easier to read, and will not raise any lint notes. There are many other instances of lines of code being written like this (from update_forecast_locations()):

AAC_codes <-
foreign::read.dbf(paste0(tempdir(), "AAC_codes.dbf"), as.is = TRUE)

which would be more neatly formatted as,

AAC_codes <- foreign::read.dbf(paste0(tempdir(), "AAC_codes.dbf"),
                               as.is = TRUE)

(Although of course that line is wrong for reasons mentioned above.) And note that you can also use #nolint at the end of problematic lines to avoid lint/goodpractise notes that really are unavoidable.


A more general concern

bomrang claims to return "current weather data", yet omits the 9am and 3pm state bulletins. These are the major daily data output of the BoM (more recent incorporation of hydrological monitoring notwithstanding) and serve as the authoritative historical repository for Australian climate data. I really do believe that the bomrang package ought to offer these data as well, as discussed in this bomrang issue, in response to which the authors have indicated their willingness to incorporate these data within the package.

Finally, I would expect some kind of message on package load describing the source of the data and acknowledging BoM. (Not blowing my own trumpet here, it's just one i know, but see the equivalent for 'OpenStreetMap' data.)

A subsequent edit: Just a personal opinion, but I think the vignette would be better presented as a single document with the current 3 additional ones simply incorporated as tabular appendices to the first. I think this would even help general interpretability of the otherwise clean and succinct main vignette.

@adamhsparks

This comment has been minimized.

Copy link
Member Author

adamhsparks commented Jun 24, 2017

Thanks for a very thorough review @mpadge

I'll open your suggestions as issues and start working on addressing them.

A few thoughts I had reading your review.

  • I'd had a message on loading regarding BoM, but later dropped it after reading discussions on some of the mail lists expressing a general distaste for them.

  • Yesterday I started moving my packages over to using file.path in place of paste0 and you've just given me my first demonstrated reason why it's not just faster to operate but necessary in some cases, thank you.

  • I debated on returning a tibble. A discussion on Twitter had support for both returning a tibble and a simple data.frame as bomrang now does. In the end, based on the discussion, I decided to return a data.frame because new users might not be familiar with how to convert back to a data.frame but someone wanting a tibble will certainly know how to create one from the bomrang output. I'm not strongly convinced either way I suppose and can be convinced that a tibble is the appropriate format.

@adamhsparks

This comment has been minimized.

Copy link
Member Author

adamhsparks commented Jun 24, 2017

Regarding station and location names.

With the précis forecast, my understanding was that these are for "locations" not for a given station but rather a forecast for a named location usually a post office or something of that nature. If this is incorrect, I'll rename the field to stations. Perhaps with your BoM experience you can help clarify this for us, @mpadge?

@mpadge

This comment has been minimized.

Copy link
Contributor

mpadge commented Jun 25, 2017

Oh yeah, I had kinda forgotten that. Precis forecasts are actually first made for "districts" (not exactly sure of that terminology), then disaggregated to select "towns". Example here. The verbal forecast gets written for the district, then numbers for each select town in a defined district. For example, if you click on all "Northern Country" towns, you'll see that some get their own slightly different verbal forecast, while others just inherit the generic "Northern Country Area" forecast.

What does this mean for your nomenclature? I guess you ought to use "towns" for the precis forecasts, after first ensuring that all states do refer to them as towns. It would also be useful to explain these nuances in the function doc.

@geanders

This comment has been minimized.

Copy link
Collaborator

geanders commented Jul 13, 2017

@geanders

This comment has been minimized.

Copy link
Collaborator

geanders commented Jul 14, 2017

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need clearly stating problems the software is designed to solve and its target audience in README
  • Installation instructions: for the development version of package and any non-standard dependencies in README
  • Vignette(s) demonstrating major functionality that runs successfully locally
  • Function Documentation: for all exported functions in R help
  • Examples for all exported functions in R Help that run successfully locally
  • Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R). (This package has this in CONDUCT.md, referenced in the README)
Paper (for packages co-submitting to JOSS)

The package contains a paper.md with:

  • A short summary describing the high-level functionality of the software
  • Authors: A list of authors with their affiliations
  • A statement of need clearly stating problems the software is designed to solve and its target audience.
  • References: with DOIs for all those that have one (e.g. papers, datasets, software).

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software been confirmed.
  • Performance: Any performance claims of the software been confirmed.
  • Automated tests: Unit tests cover essential functions of the package
    and a reasonable range of inputs and conditions. All tests pass on the local machine.
  • Packaging guidelines: The package conforms to the rOpenSci packaging guidelines

Final approval (post-review)

  • The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 3


@geanders

This comment has been minimized.

Copy link
Collaborator

geanders commented Jul 14, 2017

Review Comments

I think this is a great package, which will potentially be very useful to some researchers, with a very, very cool name. There are some ideas in here that are really cool, like the ability to update the data that shipped with the package through the two update_* functions. The documentation is very helpful, particularly the warning messages, which typically provide enough information to help a new or infrequent R user to troubleshot as they use the package. I did not find any critical problems in the code beyond those pointed out by the first reviewer, but I did have some minor suggestions throughout.

Code

  • I'm not sure if this has the potential to cause any problems for some users, but it might be worthwhile to try to revise the code to use some of the recommended techniques for standard evaluation when programming with tidyverse packages rather than getting around CRAN notes by defining column names in dataframes as NULL within functions (i.e., all the spots noted now with # CRAN NOTE avoidance). I know that the underscore alternative dplyr functions can do that (mutate_ rather than mutate), but I think they've been deprecated now. I think the latest best practices are outlined here: http://dplyr.tidyverse.org/articles/programming.html
  • As noted by the first reviewer, JSONurl_latlon_by_station_name is a very long variable name. It may be worth renaming this dataset to something shorter.
  • Some of the package dependencies have gone through a number of big changes since they were first released. It may be worthwhile adding minimum version requirements in the DESCRIPTION file, in case some users haven't updated packages in a while.
  • Some of the code could probably be refactored to be much shorter. One example is the series of checks for whether certain columns are in the data frame within the .get_obs function. Another example is the code to create xmlbulletin based on the state chosen in a couple of the functions, which could probably be shortened a good bit using the case_when function from dplyr. Although, in that second specific case, given that the number of states should stay the same in the future, I guess there probably shouldn't be any maintanance problems resulting from the current approach for that piece of code.
  • A couple of the functions (cook and .get_obs) are defined within other functions. Is this for some scoping-related reason? Otherwise, it might make the code a bit more readable to define them outside of other functions.
  • In some places, the code switches between tidyverse and base R for data cleaning. It may be easier to read / maintain the code if it sticks to tidyverse instead. (Again, not a critical issue, just something to consider to make maintenance easier.)
  • I really like %notin% in the internal functions and will be using that idea in my daily life (usually I do !(... %in% ...), which gets annoying).
  • For get_precis_forecast, return(tidy_df) seems to be repeated. I couldn't tell why (maybe just a typo?).

Documentation

  • Throughout, I found the documentation very helpful. I was particularly impressed with the efforts taken to make the warning and error messages clear and useful to users.
  • Some of the warning messages need a space added when they're pasted with variables (I think the other reviewer also noted this).
  • I agree with the other reviewer's suggestion to combine the vignettes into one. Also, I would recommend evaluating more of the code in the vignette, so that users can see what the output looks like.
  • The processing of raw data was well documented through use of .Rmd files in the data-raw folder.
  • It might be nice to include a map of current station locations in the vignette, so users can see the typical coverage of weather stations.
  • I thought the datasets in the package were also well-documented.
  • For the latlon argument, it would be helpful to add what conventions the user should use for latitude and longitude (for example, are units decimal degrees? Should southern latitudes be negative, or is that assumed?)
  • It's very nice that some functions try to find a match to a station name if the user didn't get the station name exactly right (and how it gives a message about other possible matches in that case).
  • There's only one "big idea" in the package that I would suggest increasing documentation on. The update_* functions are a cool idea, but I think it would be useful to explain a bit more what's going on with these functions in the vignette. In particular, it would be good if users understand exactly where (in their computer's file system) these updates are being saved. Also, it may be helpful for reproducibility to add some comments on the vignette about how this functionality might result in users with the same version of the package may end up with different results from running the same code (i.e., if one has updated the data at some point and one has not or updated at a different time).
@adamhsparks

This comment has been minimized.

Copy link
Member Author

adamhsparks commented Jul 15, 2017

Thanks @geanders!

Good thoughts and comments. I like your suggestions here and we'll work on getting them incorporated.

@karthik

This comment has been minimized.

Copy link
Member

karthik commented Jul 26, 2017

@adamhsparks Quick update, I'll need a couple of days to read through the comments and look over the changes before providing you with an update. Once I complete that check, I'll check in with Brooke and Mark for their final signoff.

In the meantime you can go ahead and add a ropensci review badge to your README. It will update once this review is complete.

[![](https://badges.ropensci.org/121_status.svg)](https://github.com/ropensci/onboarding/issues/121)
@adamhsparks

This comment has been minimized.

Copy link
Member Author

adamhsparks commented Jul 27, 2017

@karthik

This comment has been minimized.

Copy link
Member

karthik commented Jul 28, 2017

Thanks @adamhsparks!
Please ping when it's ready for me to review further.

@adamhsparks

This comment has been minimized.

Copy link
Member Author

adamhsparks commented Aug 13, 2017

@karthik, we're getting there. We've been busy (with this and other things) and we just need to get one more vignette added and then summarise what we've done.

In the meantime, I decided to add some new functionality to the package that the reviewers might want to look at before the final decision. Everything (except tests and documentation, naturally) is in this file for retrieving satellite imagery from BoM - https://github.com/ToowoombaTrio/bomrang/blob/master/R/get_satellite_imagery.R

I need to add a few more tests for this to get our code coverage back up, but I've tested it on my Mac and Linux VM, so far so good.

@karthik

This comment has been minimized.

Copy link
Member

karthik commented Aug 13, 2017

Thanks @adamhsparks 🙏
@geanders @mpadge Could you please take a quick look at the updates when you get a chance?

adamhsparks added a commit to ropensci/bomrang that referenced this issue Aug 17, 2017

Add output to vignette
Fixes,
> Also, I would recommend evaluating more of the code in the vignette,
so that users can see what the output looks like.

From ropensci/software-review#121
@adamhsparks

This comment has been minimized.

Copy link
Member Author

adamhsparks commented Aug 18, 2017

Hi all,
Here is a changelog of the updates/changes made to bomrang in response to the reviews.

Code

Updates to Code in General

  • The package has been linted,

    • line lengths are <80 chars,
    • best practice naming conventions are followed (where possible)
  • Onload a message regarding the copyright and data source,

                Data (c) Australian Government Bureau of Meteorology,,
                Creative Commons (CC) Attribution 3.0 licence or,
                Public Access Licence (PAL) as appropriate.,
                See http://www.bom.gov.au/other/copyright.shtml

is displayed

  • All of the internal files are now loaded from "inst/extdata" and not exposed to the end user and now use
    load(system.file("extdata", "*.rda", package = "bomrang")) to load the files.

  • agrep is now used in all functions where the user enters state or Australia values to query BoM data

  • best practices for programming with dplyr 0.7 using rlang are now employed, which reduces the need for the # CRAN NOTE avoidance

  • The DESCRIPTION file now states minimum package versions for packages that are undergoing rapid development

  • Code has been refactored to be shorter, e.g., xml_bulletin_url in
    get_ag_bulletion()

Updates to get_current_weather()

  • "JSONurl_latlon_by_station_name" has been shortened to
    "JSONurl_site_list".

  • Recommended station_name values are separated by spaces in
    get_current_weather()

  • Better consistency in handling names of observation stations for
    get_current_weather()

  • Station names and location names are more consistent in the supplied data and returned data frames.

  • Lat/Lon values for get_current_weather() results are now reported using the values from the internal database, which has a higher degree of accuracy. The json file values are rounded while the values from the stations list has four decimal places

Updates to get_ag_bulletin()

  • .get_obs() has been moved out of the .parse_bulletin() function for easier reading/maintenance

get_precis_forecast()

  • fixed a repeat of return(tidy_df) in get_precis_forecast()

Additions to code

  • Added new get_satellite_imagery() function to download GeoTIFF files from BoM ftp server

Updates to Documentation

  • the bomrang vignette now contains instructions for use along with appendices that document the data fields and units, rather than separate vingettes

  • ramifications of updating station lists are now stated clearly in the vignette and help files for applicable functions

  • a map of BoM stations is included in an appendix of the bomrang vingette

  • Lat/Lon values are specified to be in decimal degrees in get_current_weather() help and vignette

  • Added new use-case vignette

Updates to the JOSS paper

  • Edited for clarity and added new functionality for satellite imagery. This will need to be edited once more before submission to include the 9 & 3 state bulletins, which @mpadge will be adding before we submit the paper to JOSS and package to CRAN
@mpadge

This comment has been minimized.

Copy link
Contributor

mpadge commented Aug 25, 2017

@karthik @adamhsparks For the record at this point in the proceedings:

  • The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

I've added the new functionality that @adamhsparks mentioned above, but it lacks full documentation at the moment. I'm totally happy for the package to be accepted, and suggest that the subsequent JOSS submission could merely be postponed ever so slightly until a more comprehensively documented version can be officially released and doi-ed. An initial CRAN submission with just a lightweight description of the new data should be no problem, and can simply be updated asap afterwards. Sound good?

@adamhsparks

This comment has been minimized.

Copy link
Member Author

adamhsparks commented Aug 25, 2017

I'm in no hurry. We can wait until we've got it fully documented to release it on CRAN and submit to JOSS.

It's there on GitHub for anyone that wants to otherwise use it and we might find more bugs in it. I've also a few other issues open that could be nice to address before releasing on CRAN.

Thank you, @mpadge for adding the new functionality. The package is really coming together nicely now.

@geanders

This comment has been minimized.

Copy link
Collaborator

geanders commented Aug 28, 2017

@karthik @adamhsparks The package authors did a really excellent job of responding to the comments and suggestions I had. I think the final package is really, really nice. I checked my final approval in the box in my original review.

@karthik

This comment has been minimized.

Copy link
Member

karthik commented Aug 28, 2017

Fantastic, thank you @mpadge and @geanders!
🙏 🙏 🙏

@karthik

This comment has been minimized.

Copy link
Member

karthik commented Aug 28, 2017

Hi @adamhsparks,
I'm pleased to accept the bomrang package into the ropensci suite. I have also created a team for this package on ropensci's github org. Please accept that team invitation request and proceed with transferring your repo. Once it's transferred, I'll make you full admin and you'll be able to continue updating the package.

Badges:

Please add these two badges to your repo:

[![](https://badges.ropensci.org/121_status.svg)](https://github.com/ropensci/onboarding/issues/121)

and the footer:

[![](http://ropensci.org/public_images/github_footer.png)](http://ropensci.org)

Once the transfer is complete you'll need to re-activate the CI services (I will help with this bit).

@adamhsparks

This comment has been minimized.

Copy link
Member Author

adamhsparks commented Aug 28, 2017

Thanks Karthik.

I've moved the repository over, have added the footer and updated the links to the repository in the Description, etc. Just waiting on admin status so I can update the package's website link now.

@karthik

This comment has been minimized.

Copy link
Member

karthik commented Aug 28, 2017

@adamhsparks done!

@karthik

This comment has been minimized.

Copy link
Member

karthik commented Aug 28, 2017

Hi @adamhsparks

Just a last few items to check off!

Since you've asked for a JOSS submission, please generate a new release with the up-to-date Zenodo DOI.

Then please initiate a submission at http://joss.theoj.org/papers/new and suggest me as the editor. I will watch for it and take care of it at the other end.

Before you update the DOI, can you please add a little bit more in terms of research applications for the paper?

I see:

The data have applications in applied meteorology and agricultural meteorology and agricultural and environmental modelling.

Also, satelite is misspelled in the paper.

@adamhsparks

This comment has been minimized.

Copy link
Member Author

adamhsparks commented Aug 29, 2017

Hi @karthik,
It'll be a bit before the JOSS submission as @mpadge is not able to work on this right at the moment and there is still documentation left to add for the get_weather_bulletin() functionality before the package is fully ready for submission to CRAN and the paper still needs his ORCID. So it'll be a bit, but I'll add some more detail and correct the spelling in the meantime.

thanks!

@adamhsparks

This comment has been minimized.

Copy link
Member Author

adamhsparks commented Sep 20, 2017

Hi @karthik,
I've submitted the latest version, 0.0.7, of bomrang to JOSS this morning and selected you as the editor. @mpadge has finalised the documentation and added a bit more functionality to handle typos in weather bulletins along with his ORCID and affiliation in the paper.

I've also done spellchecking and other minor fixes here and there to make it nicer and added information and references to the paper.

Here's the DOI for today's release: DOI

Thanks for your patience.

@sckott

This comment has been minimized.

Copy link
Member

sckott commented Oct 2, 2017

so can this be closed then?

@adamhsparks

This comment has been minimized.

Copy link
Member Author

adamhsparks commented Oct 2, 2017

I believe so, unless you need anything else from me that I'm unaware of?

@karthik

This comment has been minimized.

Copy link
Member

karthik commented Oct 2, 2017

Yes this can be closed.

@karthik karthik closed this Oct 2, 2017

@karthik

This comment has been minimized.

Copy link
Member

karthik commented Oct 2, 2017

It hasn't shown up on JOSS yet (the system has been having some issues lately) but I'll make sure to handle it (and pass it on to the EIC) as soon as I see it. 🙏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment