Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rfema: access data through the FEMA API. #484

Closed
10 of 26 tasks
dylan-turner25 opened this issue Nov 9, 2021 · 50 comments
Closed
10 of 26 tasks

rfema: access data through the FEMA API. #484

dylan-turner25 opened this issue Nov 9, 2021 · 50 comments
Assignees

Comments

@dylan-turner25
Copy link

dylan-turner25 commented Nov 9, 2021

Date accepted: 2022-01-23
Reviewers:
Submitting Author Name: Dylan Turner
Submitting Author Github Handle: @dylan-turner25
Repository: https://github.com/dylan-turner25/rfema
Version submitted: 0.0.0.9000
Submission type: Standard
Editor: @emilyriederer
Reviewers: @fmichonneau, @fawda123

Due date for @fmichonneau: 2021-12-10

Due date for @fawda123: 2021-12-14
Archive: TBD
Version accepted: TBD
Language: en

  • Paste the full DESCRIPTION file inside a code block below:
Package: rfema
Title: Access the openFEMA API
Version: 0.0.0.9000
Authors@R: 
    person(given = "Dylan",
           family = "Turner",
           role = c("aut", "cre"),
           email = "dylan.turner3@outlook.com",
           comment = c(ORCID = "000-0002-0915-7384"))
Description: `rfema` allows users to access The Federal Emergency Management Agency's (FEMA) publicly available data through their API. The package provides a set of functions to easily navigate and access data from the National Flood Insurance Program along with FEMA's various disaster aid programs, including the Hazard Mitigation Grant Program, the Public Assistance Grant Program, and the Individual Assistance Grant Program.
License: MIT + file LICENSE
Encoding: UTF-8
LazyData: true
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.1.2
Imports: 
    dplyr,
    httr,
    memoise,
    plyr,
    rvest,
    utils
Suggests: 
    rmarkdown,
    knitr,
    testthat (>= 3.0.0),
    covr,
    vcr (>= 0.6.0)
Config/testthat/edition: 3
VignetteBuilder: knitr
URL: https://github.com/dylan-turner25/rfema
BugReports: https://github.com/dylan-turner25/rfema/issues

Scope

  • Please indicate which category or categories from our package fit policies this package falls under: (Please check an appropriate box below. If you are unsure, we suggest you make a pre-submission inquiry.):

    • data retrieval
    • data extraction
    • data munging
    • data deposition
    • workflow automation
    • version control
    • citation management and bibliometrics
    • scientific software wrappers
    • field and lab reproducibility tools
    • database software bindings
    • geospatial data
    • text analysis
  • Explain how and why the package falls under these categories (briefly, 1-2 sentences):

rfema allows users to access The Federal Emergency Management Agency's (FEMA) publicly available data through the open FEMA API. The package provides a set of functions to easily navigate and access all data sets provided by FEMA, including (but not limited to) data from the National Flood Insurance Program and FEMA's various disaster aid programs.

  • Who is the target audience and what are scientific applications of this package?
    rfema is intended to be used by researchers that require the use of data that is available through the FEMA API, but may not have familiarity in working with APIs.

  • Are there other R packages that accomplish the same thing? If so, how does yours differ or meet our criteria for best-in-category?

I am unaware of any other packages that offer similar functionality.

N/A

  • If you made a pre-submission inquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.

Technical checks

Confirm each of the following by checking the box.

This package:

Publication options

  • Do you intend for this package to go on CRAN?

  • Do you intend for this package to go on Bioconductor?

  • Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:

MEE Options
  • The package is novel and will be of interest to the broad readership of the journal.
  • The manuscript describing the package is no longer than 3000 words.
  • You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see MEE's Policy on Publishing Code)
  • (Scope: Do consider MEE's Aims and Scope for your manuscript. We make no guarantee that your manuscript will be within MEE scope.)
  • (Although not required, we strongly recommend having a full manuscript prepared when you submit here.)
  • (Please do not submit your package separately to Methods in Ecology and Evolution)

Code of conduct

@ropensci-review-bot
Copy link
Collaborator

Thanks for submitting to rOpenSci, our editors and @ropensci-review-bot will reply soon. Type @ropensci-review-bot help for help.

@ropensci-review-bot
Copy link
Collaborator

🚀

The following problem was found in your submission template:

  • 'author1' variable must be GitHub hanle only ('@myhandle')
    Editors: Please ensure these problems with the submission template are rectified. Package checks have been started regardless.

👋

@ropensci-review-bot
Copy link
Collaborator

ropensci-review-bot commented Nov 9, 2021

Checks for rfema (v0.0.0.9000)

git hash: dcc9bd2c

  • ✔️ Package name is available
  • ✖️ does not have a 'CITATION' file.
  • ✔️ has a 'codemeta.json' file.
  • ✖️ does not have a 'contributing' file.
  • ✔️ uses 'roxygen2'.
  • ✔️ 'DESCRIPTION' has a URL field.
  • ✔️ 'DESCRIPTION' has a BugReports field.
  • ✖️ Package has no HTML vignettes
  • ✖️ These functions do not have examples: [bulk_dl.Rd].
  • ✔️ Package has continuous integration checks.
  • ✔️ Package coverage is 90.6%.
  • ✔️ R CMD check found no errors.
  • ✔️ R CMD check found no warnings.

Important: All failing checks above must be addressed prior to proceeding

Package License: MIT + file LICENSE


1. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has:

  • code in R (100% in 7 files) and
  • 1 authors
  • 1 vignette
  • no internal data file
  • 6 imported packages
  • 9 exported functions (median 13 lines of code)
  • 1 non-exported function in R (median 54 lines of code)

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages
The following terminology is used:

  • loc = "Lines of Code"
  • fn = "function"
  • exp/not_exp = exported / not exported

The final measure (fn_call_network_size) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

measure value percentile noteworthy
files_R 7 41.4
files_vignettes 1 64.8
files_tests 10 88.6
loc_R 247 23.5
loc_tests 1483 89.3
num_vignettes 1 60.7
n_fns_r 10 6.8
n_fns_r_exported 9 38.9
n_fns_r_not_exported 1 0.0 TRUE
n_fns_per_file_r 1 13.9
num_params_per_fn 4 54.3
loc_per_fn_r 16 65.0
loc_per_fn_r_exp 13 32.2
loc_per_fn_r_not_exp 54 96.0 TRUE
rel_whitespace_R 40 49.6
rel_whitespace_tests 4 82.3
doclines_per_fn_exp 18 10.1
doclines_per_fn_not_exp 0 0.0 TRUE
fn_call_network_size 0 0.0 TRUE

1a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package


2. goodpractice and other checks

Details of goodpractice and other checks (click to open)

3a. Continuous Integration Badges

github
github

GitHub Workflow Results

name conclusion sha date
R-CMD-check success dcc9bd 2021-11-09

3b. goodpractice results

R CMD check with rcmdcheck

R CMD check generated the following note:

  1. checking dependencies in R code ... NOTE
    Namespaces in Imports field not imported from:
    ‘dplyr’ ‘httr’ ‘plyr’ ‘rvest’
    All declared Imports should be used.

R CMD check generated the following check_fail:

  1. rcmdcheck_imports_not_imported_from

Test coverage with covr

Package coverage: 90.56

Cyclocomplexity with cyclocomp

No functions have cyclocomplexity >= 15

Static code analyses with lintr

lintr found the following 75 potential issues:

message number of times
Avoid 1:length(...) expressions, use seq_len. 1
Avoid using sapply, consider vapply instead, that's type safe 3
Lines should not be more than 80 characters. 71


Package Versions

package version
pkgstats 0.0.2.72
pkgcheck 0.0.2.107


Editor-in-Chief Instructions:

Processing may not proceed until the items marked with ✖️ have been resolved.

@ropensci-review-bot
Copy link
Collaborator

I'm sorry @dylan-turner25, I'm afraid I can't do that. That's something only editors are allowed to do.

@dylan-turner25
Copy link
Author

The most recent commit has attempted to address all of the failed initial checks.

@ldecicco-USGS
Copy link

@ropensci-review-bot check package

@ropensci-review-bot
Copy link
Collaborator

Thanks, about to send the query.

@ropensci-review-bot
Copy link
Collaborator

🚀

The following problem was found in your submission template:

  • 'author1' variable must be GitHub hanle only ('@myhandle')
    Editors: Please ensure these problems with the submission template are rectified. Package checks have been started regardless.

👋

@ropensci-review-bot
Copy link
Collaborator

Checks for rfema (v0.0.0.9000)

git hash: 568c9783

  • ✔️ Package name is available
  • ✔️ has a 'CITATION' file.
  • ✔️ has a 'codemeta.json' file.
  • ✔️ has a 'contributing' file.
  • ✔️ uses 'roxygen2'.
  • ✔️ 'DESCRIPTION' has a URL field.
  • ✔️ 'DESCRIPTION' has a BugReports field.
  • ✔️ Package has at least one HTML vignette
  • ✖️ These functions do not have examples: [bulk_dl.Rd].
  • ✔️ Package has continuous integration checks.
  • ✔️ Package coverage is 90.6%.
  • ✔️ R CMD check found no errors.
  • ✔️ R CMD check found no warnings.

Important: All failing checks above must be addressed prior to proceeding

Package License: MIT + file LICENSE


1. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has:

  • code in R (100% in 7 files) and
  • 1 authors
  • 1 vignette
  • no internal data file
  • 6 imported packages
  • 9 exported functions (median 13 lines of code)
  • 1 non-exported function in R (median 54 lines of code)

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages
The following terminology is used:

  • loc = "Lines of Code"
  • fn = "function"
  • exp/not_exp = exported / not exported

The final measure (fn_call_network_size) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

measure value percentile noteworthy
files_R 7 41.4
files_vignettes 0 0.0 TRUE
files_tests 10 88.6
loc_R 247 23.5
loc_tests 1483 89.3
num_vignettes 1 60.7
n_fns_r 10 6.8
n_fns_r_exported 9 38.9
n_fns_r_not_exported 1 0.0 TRUE
n_fns_per_file_r 1 13.9
num_params_per_fn 4 54.3
loc_per_fn_r 16 65.0
loc_per_fn_r_exp 13 32.2
loc_per_fn_r_not_exp 54 96.0 TRUE
rel_whitespace_R 40 49.6
rel_whitespace_tests 4 82.3
doclines_per_fn_exp 18 10.1
doclines_per_fn_not_exp 0 0.0 TRUE
fn_call_network_size 0 0.0 TRUE

1a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package


2. goodpractice and other checks

Details of goodpractice and other checks (click to open)

3a. Continuous Integration Badges

github
github

GitHub Workflow Results

name conclusion sha date
R-CMD-check success 568c97 2021-11-10

3b. goodpractice results

R CMD check with rcmdcheck

R CMD check generated the following note:

  1. checking dependencies in R code ... NOTE
    Namespaces in Imports field not imported from:
    ‘dplyr’ ‘httr’ ‘plyr’ ‘rvest’
    All declared Imports should be used.

R CMD check generated the following check_fail:

  1. rcmdcheck_imports_not_imported_from

Test coverage with covr

Package coverage: 90.56

Cyclocomplexity with cyclocomp

No functions have cyclocomplexity >= 15

Static code analyses with lintr

lintr found the following 75 potential issues:

message number of times
Avoid 1:length(...) expressions, use seq_len. 1
Avoid using sapply, consider vapply instead, that's type safe 3
Lines should not be more than 80 characters. 71


Package Versions

package version
pkgstats 0.0.2.72
pkgcheck 0.0.2.107


Editor-in-Chief Instructions:

Processing may not proceed until the items marked with ✖️ have been resolved.

@ldecicco-USGS
Copy link

@ropensci-review-bot assign @emilyriederer as editor

@ropensci-review-bot
Copy link
Collaborator

Assigned! @emilyriederer is now the editor

@emilyriederer
Copy link

Hi @dylan-turner25 ! 👋 It's nice to meet you, and I'm excited to be working with you on this package. I'm going to start seeking out good reviewers for this package. In the meantime, I wanted to direct your attention to a few items from the automated checks:

  • lintr suggests you have a lot of lines >80 characters long. For readability, it might be helpful to shorten. You can run lintr::lint_package() locally in RStudio and R will point you to specific lines to reconsider. This doesn't have to be at zero, but if there are opportunities to add spacing and improve readability, that would be great.
  • Consider adding some documentation to your one internal function (it looks like there may be none now). You can suppress .Rd generation with the #' @noRd tag , but it will help reviewers understand the code
  • Please try to remove the one note being generated by R CMD check. The Imports section of the R Packages book discusses a few good options for defining your imports. In short, I think there's inconsistencies in what packages you are importing (via the NAMESPACE), which packages you are acknowledging in the roxygen docs (with import or importFrom), and how you are referencing those functions inside your code (with or without ::)

Thanks again !

@dylan-turner25
Copy link
Author

@emilyriederer , thank you for the initial comments. The latest commit for the package has attempted to address each of these.

@emilyriederer
Copy link

Thanks @dylan-turner25 ! I'll take another look. I've also started reaching out to prospective reviewers.

@emilyriederer
Copy link

@ropensci-review-bot check package

@ropensci-review-bot
Copy link
Collaborator

Thanks, about to send the query.

@ropensci-review-bot
Copy link
Collaborator

🚀

The following problem was found in your submission template:

  • 'author1' variable must be GitHub hanle only ('@myhandle')
    Editors: Please ensure these problems with the submission template are rectified. Package checks have been started regardless.

👋

@ropensci-review-bot
Copy link
Collaborator

Checks for rfema (v0.0.0.9000)

git hash: f42a4d3c

  • ✔️ Package name is available
  • ✔️ has a 'CITATION' file.
  • ✔️ has a 'codemeta.json' file.
  • ✔️ has a 'contributing' file.
  • ✔️ uses 'roxygen2'.
  • ✔️ 'DESCRIPTION' has a URL field.
  • ✔️ 'DESCRIPTION' has a BugReports field.
  • ✔️ Package has at least one HTML vignette
  • ✔️ All functions have examples.
  • ✔️ Package has continuous integration checks.
  • ✔️ Package coverage is 85.5%.
  • ✔️ R CMD check found no errors.
  • ✔️ R CMD check found no warnings.

Package License: MIT + file LICENSE


1. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has:

  • code in R (100% in 7 files) and
  • 1 authors
  • 1 vignette
  • no internal data file
  • 5 imported packages
  • 9 exported functions (median 13 lines of code)
  • 1 non-exported function in R (median 62 lines of code)

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages
The following terminology is used:

  • loc = "Lines of Code"
  • fn = "function"
  • exp/not_exp = exported / not exported

The final measure (fn_call_network_size) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

measure value percentile noteworthy
files_R 7 41.4
files_vignettes 0 0.0 TRUE
files_tests 10 88.6
loc_R 298 28.3
loc_tests 1483 89.3
num_vignettes 1 60.7
n_fns_r 10 6.8
n_fns_r_exported 9 38.9
n_fns_r_not_exported 1 0.0 TRUE
n_fns_per_file_r 1 13.9
num_params_per_fn 4 54.3
loc_per_fn_r 16 65.0
loc_per_fn_r_exp 13 32.2
loc_per_fn_r_not_exp 62 97.0 TRUE
rel_whitespace_R 34 49.9
rel_whitespace_tests 4 82.3
doclines_per_fn_exp 19 11.9
doclines_per_fn_not_exp 0 0.0 TRUE
fn_call_network_size 0 0.0 TRUE

1a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package


2. goodpractice and other checks

Details of goodpractice and other checks (click to open)

3a. Continuous Integration Badges

github
github

GitHub Workflow Results

name conclusion sha date
R-CMD-check success f42a4d 2021-11-15

3b. goodpractice results

R CMD check with rcmdcheck

R CMD check generated the following check_fail:

  1. no_import_package_as_a_whole

Test coverage with covr

Package coverage: 85.53

Cyclocomplexity with cyclocomp

No functions have cyclocomplexity >= 15

Static code analyses with lintr

lintr found the following 25 potential issues:

message number of times
Avoid using sapply, consider vapply instead, that's type safe 3
Lines should not be more than 80 characters. 22


Package Versions

package version
pkgstats 0.0.2.72
pkgcheck 0.0.2.107


Editor-in-Chief Instructions:

This package is in top shape and may be passed on to a handling editor

@emilyriederer
Copy link

emilyriederer commented Nov 18, 2021

Thanks for the update @dylan-turner25 ! Adding the first reviewer now and still in search of a second. When I change the tag, the bot will give instructions for adding an "Under Review" tag to your README, so please do that!

@emilyriederer
Copy link

@ropensci-review-bot @fmichonneau as reviewer

@emilyriederer
Copy link

@ropensci-review-bot seeking reviewers

@ropensci-review-bot
Copy link
Collaborator

Please add this badge to the README of your package repository:

[![Status at rOpenSci Software Peer Review](https://badges.ropensci.org/484_status.svg)](https://github.com/ropensci/software-review/issues/484)

Furthermore, if your package does not have a NEWS.md file yet, please create one to capture the changes made during the review process. See https://devguide.ropensci.org/releasing.html#news

@emilyriederer
Copy link

@ropensci-review-bot add @fmichonneau as reviewer

@ropensci-review-bot
Copy link
Collaborator

@fmichonneau added to the reviewers list. Review due date is 2021-12-10. Thanks @fmichonneau for accepting to review! Please refer to our reviewer guide.

@emilyriederer
Copy link

@dylan-turner25 - please meet your reviewers @fawda123 and @fmichonneau ! Thanks so much to them both for volunteering their time. I'm excited to hear their thoughts on this package.

@emilyriederer
Copy link

Hi @fawda123 and @fmichonneau ! I hope you both are doing well. This is just a friendly reminder that the target review deadline for rfema is about a week away. Let us know if you have any questions or concerns. Otherwise, looking forward to seeing all of your feedback!

@fawda123
Copy link

fawda123 commented Dec 12, 2021

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • Briefly describe any working relationship you have (had) with the package authors.
  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (if you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need: clearly stating problems the software is designed to solve and its target audience in README

    Very nice statement of need, but see my comments below. I'm also wondering if you could narrow the target audience a bit. Is this intended mostly for academics/researchers or more for general public interest (e.g., downloading flood maps, etc.)? Clearing this up from the beginning could be helpful.

  • Installation instructions: for the development version of package and any non-standard dependencies in README

  • Vignette(s): demonstrating major functionality that runs successfully locally]

    You have a very nice README file that explains some of the core rationale for the package, how to install, etc. I did not know this package had a vignette until the end of the README file, so consider moving that up to the beginning if users want to dive right into that to learn about the package. The README vignette link also points to an HTML file on GitHub, which is not a readable format. After I installed the package and I ran the following in my R console, it indicated no vignettes were found. I honestly don't know how to rectify, but this may not be a problem once it's hosted on ROpenSci. Regardless, I could only access the vignette by forking the source repo and inspecting the .Rmd file.

    vignette(package = 'rfema')
    #> no vignettes found
    
  • Function Documentation: for all exported functions

    The description for bulk_dl is missing text, i.e., "For large files, the function"... does what?

    Also, there are several functions in helpers.R that seem to be just that, used internally by the package in the core functions. Do these need to be exported, i.e., would a user ever have a need to call these functions by themselves?

  • Examples: (that run successfully locally) for all exported functions

    Many of the examples in the help files return a lot of printed text in the console that is problematic for two reasons. First, it complicates interpretation when first time users are trying to understand function output. Second, this will not render well on the ROpenSci reference pages when the package is finally onboarded. Consider how you can print the example output in a more user-friendly style. For example, in the fema_data_sets example, you just call the function as is:

    fema_data_sets()
    

    Something like this would produce a cleaner output for the help files.

    dat <- fema_data_sets()
    dim(dat)
    head(dat[, c(1:3)])
    

    Better yet, consider converting all output as tibble objects since there is already a nice print method that handles these issues. I've made a more detailed comment about that below.

  • Community guidelines: including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).

    There should be links in the README to the CONTRIBUTING file. I'd also suggest including an issues template with usethis::use_tidy_issue_template().

Functionality

  • Installation: Installation succeeds as documented.

  • Functionality: Any functional claims of the software been confirmed.

  • Performance: Any performance claims of the software been confirmed.

    I really appreciate you give the user an option of continuing API requests that may take some time (i.e., the ask_before_call argument in open_fema). I would imagine users may also wonder how long a request will take before they make the decision and I think it would be useful to provide some approximation in the printed text on the console. I know this varies between platforms and ISPs, but is there some simple rule of thumb you could use to approximate request time, e.g., 1 second per 1000 records, then scale this up when you estimate number of requests? So instead of saying something like "it will take 500 individual API calls" you could also say "it will take approximately 8.33 minutes". I don't know, maybe this is a ridiculous idea.

  • Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.

    Tests did not pass locally, but I do note that the CI on GitHub Actions for the main repo has no problems. I also was able to run the tests going through each test file in the tests/testthat folder by excluding the vcr functionality, so it's probably an issue on my end with the latter. I just point this out in case others had the same problem.

  • Packaging guidelines: The package conforms to the rOpenSci packaging guidelines.

Estimated hours spent reviewing: 5

  • Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.

Review Comments

Hi @dylan-turner25, thanks for the opportunity to review the rfema package. I don't typically work with these kinds of data, so I've adopted the "naive user" approach where my first introduction to the package was to download and load it, run through some examples, and interpret the output. My general comments relate to this experience of exploring the package for the first time with some preconceived idea of what I think it does.

Looking at the README, you make a nice case for why you developed the package. I particularly like that you showed a side-by-side comparison of how a query would work with and without rfema. It's a good selling point, although it certainly applies to any R package that taps into an API. Newer R users will find this appealing, but more seasoned users will probably wonder if there are other advantages to the package, or more importantly, are there any limitations using your package vs going the home-cooked httr route. Any time you build a custom workflow (e.g., a set of functions to handle API requests), something else is often sacrificed (e.g., absolute control over API calls using httr). Does your package offer any enhanced functionality that benefits users if they're sacrificing control over API requests with httr? This is kind of an annoying question, but I ask it mostly because the convenience of making API requests may not be a selling point for all.

For example, it looks like most of the function calls return date/time columns as characters, but consider having those columns formatted as POSIX objects to save the user time (maybe even formatted to the system timezone). This would be a nice added feature of your package that one would not get if building the API calls from scratch. One could readily plot time series data if the date/time columns were already formatted.

The code for the example comparison that runs open_fema returns a nice printed output to the console showing iteration progress. However, some of the content doesn't make sense - what does the FALSE text mean? Also, it may not be important to print all the text, just 1 of 3, 2 of 3, etc. and maybe only include iterations completed: at the top of the loop so as not to clutter the console output each time a new iteration of the loop is initiated.

As an aside, I really like that the FEMA API does not require a key. I know this is completely out of your hands, but getting connected to an API is often a critical hurdle that can prevent others from using what very well may be a very helpful package. So, this is definitely a good-selling point because the package works out of the box without having to talk to a third-party. Perhaps emphasize this a bit more in the README/vignette, i.e., most other APIs require a "complicated" token exchange process, etc.

I think one of the major issues a naive user would have is understanding the datasets available from the API. I really appreciate that you've included a function to view what's available (fema_data_sets), but there still is no easy way to understand what's included in each dataset aside from parsing the data frame (i.e., the kable calls truncate the description). It seems the most accessible place to understand what's available is on the FEMA data sets web page here. It would be good to include this link in the README file when you talk about the fema_data_sets() function (it's at the top of the README, but good to mention again), vignette, and help file for fema_data_sets() so users know where to look for a more accessible description.

Also going through the vignette, it seems like one limitation of the package is that the data needs to be downloaded first to identify values that are queryable. There's one point in the vignette where you state that the parameter_values function "returns the unique values of a variable contained in the first 1000 observations of a data set". One advantage of having an API is the ability to know these parameters before downloading the data, where the latter can often be quite large and downloading an entire dataset can be impractical. Are there API requests that can be made to retrieve queryable paramaters, rather than having a function in the package that downloads the data first to identify these parameters?

Another limitation is that you've hard-coded the download from which the queryable parameters are identified to the first 1000 rows, so this very likely does not represent the entire suite of parameters in a column of the entire dataset. If the API only provides the ability to download the data, then there's no way around the issue, but you should state this in the vignette so users are aware of the limitation. Personally, I would just download all of the data for a table I'm interested in, then query later in R (using dplyr for example). If I've misunderstood how these functions work, then disregard.

Finally, I really like the examples at the end of the vignette. These gave me a good sense of how I could use this package to address some more general questions. I suspect many users will find this section of the vignette helpful. One minor concern about the examples is I would not use kable output to show results since this format is more for making easy to read HTML tables in a report-style document where the need to show source code is not important. A user of the package would not use kable in their scripting workflow and instead would be interpreting output in the console directly. Showing results in the README and vignette w/ kable makes the output less accessible from an exploratory data analysis workflow that may be more comfortable for many folks. It also creates unnecessary code in your vignette that can distract from the more important code that shows how to use your functions. If the output from open_fema is formatted as a tibble (ideally inside the package functions), the print method will already take care of the table overflows in the rendered HTML.

Coding issues

Minor issue, but on the README file, please update the badge link for the RCMD checks to GitHub Actions. It currently goes to a .svg file for the ijtiff package.

Use this:

[![R-CMD-check](https://github.com/dylan-turner25/rfema/workflows/R-CMD-check/badge.svg)](https://github.com/dylan-turner25/rfema/actions)

Instead of this:

![R-CMD-check](https://github.com/ropensci/ijtiff/workflows/R-CMD-check/badge.svg)

I've also noticed a few typos on the README, vignette, and function docs. Please make sure to proofread since I'm not sure the automated package checks have caught everything.

Other than that, the package structure seems pretty solid. The testing infrastructure on GitHub Actions is very comprehensive (covers multiple platforms and R versions) and I did not even notice a single ERROR, WARNING, or NOTE for any of the checks. Code coverage at ~85% is pretty good too. Nice work!

Cheers,
Marcus

@emilyriederer
Copy link

Thank you for the thoughtful review, @fawda123 !

@fmichonneau , do you have a sense when your review may be ready? Thank you!

@fmichonneau
Copy link
Member

Package Review

  • Briefly describe any working relationship you have (had) with the package authors.
  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (if you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need: clearly stating problems the software is designed to solve and its target audience in README

The need for the package is well justified. When reading quickly the README the first time, I was slightly confused by the first block of code that demonstrates what it would be to not have the package to work with this data. I suggest adding a comment in the block of text to clarify this: ropensci/rfema#3

  • Installation instructions: for the development version of package and any non-standard dependencies in README

The installation instructions are straightforward. I suggest you recommend your users to use the remotes package instead of devtools to install your package from GitHub: ropensci/rfema#4

  • Vignette(s): demonstrating major functionality that runs successfully locally

The vignette is clear and comprehensive. However, the source of the vignette file is in the .Rbuildignore file which means that it's not included in the package/available from the R console (it is however available from the pkgdown generated site). If your intention is to pre-compute the vignette so it's not generated when the package gets built, you can refer to Joeren's blog post.

  • Function Documentation: for all exported functions

It's only something to worry about after the approval, but it'd be useful to add the URL for the pkgdown-generated site in the GitHub repo info.

I also notice that T or F is used instead of TRUE or FALSE in the function declarations. I encourage you to use the long-form in the documentation.

  • Examples: (that run successfully locally) for all exported functions

If you plan to submit your package to CRAN, I recommend you wrap your examples in your function help pages with \dontrun{} to avoid failures in case the API is down when CRAN performs their daily checks.

  • Community guidelines: including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).

The link to the code of conduct from the CONTRIBUTING file is broken.

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software been confirmed.
  • Performance: Any performance claims of the software been confirmed.
  • Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
  • Packaging guidelines: The package conforms to the rOpenSci packaging guidelines.

Estimated hours spent reviewing: 5

  • Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.

Review Comments

Thanks for the opportunity to review your package @dylan-turner25. I really don't know much about this domain so it is difficult for me to come up with good questions to try to explore the package and the FEMA API on my own. However, your vignette and the examples give a good indication of how to use the package and interact with the API. I especially appreciated the extra examples you provide in the vignette. I think the package's interface gives the appropriate building blocks to interact with an API that provides access to a lot of relatively complex data.

I agree with everything that @fawda123 said in his excellent and comprehensive review (I read his review only after interacting with the package for a little bit). He expressed elequantly the main things I found about the package so I'm not going to repeat them. I do encourage you to proofread carefully your README, vignette, and help files. The devtools::spell_check() function could help with that.

I noticed that you use memoise on your internal package functions. The memoise README notes that for packages, the memoization should be done in the .onLoad function (see the last paragraph of the "Details" section).

I submitted a few pull requests on your package for suggestions to improve/simplify the code. Don't hesitate to let me know if you have questions:

Great work on this package and on making public data easily accessible in the R ecosystem!

@emilyriederer
Copy link

Thanks @fmichonneau for the great review and even PRs! That's really nice.

@dylan-turner25 , the ball is back in your court now! Please take some time to review the feedback from our reviewers and incorporate changes. We typically aim for this step to take ~2 weeks. Feel free to use this thread to ask reviewers any questions about their feedback that you have along the way.

@emilyriederer
Copy link

@ropensci-review-bot submit review #484 (comment) time 5

@ropensci-review-bot
Copy link
Collaborator

Logged review for fmichonneau (hours: 5)

@emilyriederer
Copy link

@ropensci-review-bot submit review #484 (comment) time 5

@ropensci-review-bot
Copy link
Collaborator

Logged review for fawda123 (hours: 5)

@dylan-turner25
Copy link
Author

@fmichonneau and @fawda123, thank you both for taking time to review this package. Peer reviewing can be a thankless job, but know that your time spent on this is much appreciated. I think all the suggestions made are good ones and will start working on implementing them.

@dylan-turner25
Copy link
Author

@emilyriederer, @fawda123 , and @fmichonneau, I have incorporated all the feedback I received and documented all the changes made in the NEWS.md file. I'll note that I'm still kind of perplexed on the best practices for making vignettes available. My understanding is a package down site will document everything once the package is hosted with the other rOpensci packages. When I generate the package down site for this package, everything seems to be working as intended with all documentation and vignettes made available. I also made a copy of my vignette as a .md file so it can be viewed directly on Github (the README links directly to this .md file). Let me know if some other method of vignette distribution is a better choice.

Thanks again for all your feedback,
Dylan

@fmichonneau
Copy link
Member

Generally, you write your vignette as an Rmd file in the vignettes/ directory.

When you build your package (using R CMD build from your terminal or devtools::build() from your R console), then R will take your vignette, and put the rendered version in the archive file. When your users are going to install your package, this rendered vignette will be available to them. Currently, you have the path to your vignette in your .Rbuildignore and so this file is excluded at build time and the vignette is not available to your users if they try to locate the vignette directly on their systems using browseVignettes("rfema"). The vignette will also not be listed on CRAN if you plan to submit your package there.

The pkgdown package doesn't look at .Rbuildignore file and therefore builds the vignette and makes it available on the generated site. That's why you see it there.

If you leave it as it is, your vignette would only be available from the pkgdown site associated with your package but will not be included in the R package that your users would install on their system.

@dylan-turner25
Copy link
Author

@fmichonneau , thank you for the explanation. I just pushed a new commit that removed the path to my vignette's .Rmd file from .Rbuildignore

@emilyriederer
Copy link

Hey @dylan-turner25 - thanks for working through the review feedback! @fmichonneau and @fawda123 , if you could please take on final look and confirm with the approval template, that would be great!

@fmichonneau
Copy link
Member

It's all good on my end! Great work @dylan-turner25!

Reviewer Response

Final approval (post-review)

  • The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

@emilyriederer
Copy link

Hey @fawda123 ! Just checking in - if you could please take a look at @dylan-turner25 's updates and let us know if we have your final approval, that would be great!

@fawda123
Copy link

Hi there, apologies for the delay. The revisions look fantastic, I especially like the addition of the time_iterations() function that estimates the API request time. Very useful changes. I've included one more PR with some very minor edits to the README.

Everything looks good on my end, nice work @dylan-turner25!

Reviewer Response

Final approval (post-review)

  • The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 6

@emilyriederer
Copy link

@ropensci-review-bot approve rfema

@ropensci-review-bot
Copy link
Collaborator

Approved! Thanks @dylan-turner25 for submitting and @fmichonneau, @fawda123 for your reviews! 😁

To-dos:

  • Transfer the repo to rOpenSci's "ropensci" GitHub organization under "Settings" in your repo. I have invited you to a team that should allow you to do so.
  • After transfer write a comment @ropensci-review-bot finalize transfer of <package-name> where <package-name> is the repo/package name. This will give you admin access back.
  • Fix all links to the GitHub repo to point to the repo under the ropensci organization.
  • Delete your current code of conduct file if you had one since rOpenSci's default one will apply, see https://devguide.ropensci.org/collaboration.html#coc-file
  • If you already had a pkgdown website and are ok relying only on rOpenSci central docs building and branding,
    • deactivate the automatic deployment you might have set up
    • remove styling tweaks from your pkgdown config but keep that config file
    • replace the whole current pkgdown website with a redirecting page
    • replace your package docs URL with https://docs.ropensci.org/package_name
    • In addition, in your DESCRIPTION file, include the docs link in the URL field alongside the link to the GitHub repository, e.g.: URL: https://docs.ropensci.org/foobar (website) https://github.com/ropensci/foobar
  • Fix any links in badges for CI and coverage to point to the new repository URL.
  • Increment the package version to reflect the changes you made during review. In NEWS.md, add a heading for the new version and one bullet for each user-facing change, and each developer-facing change that you think is relevant.
  • We're starting to roll out software metadata files to all rOpenSci packages via the Codemeta initiative, see https://docs.ropensci.org/codemetar/for how to include it in your package, after installing the package - should be easy as running codemetar::write_codemeta() in the root of your package.
  • You can add this installation method to your package README install.packages("<package-name>", repos = "https://ropensci.r-universe.dev") thanks to R-universe.

Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"-type contributors in the Authors@R field (with their consent).

Welcome aboard! We'd love to host a post about your package - either a short introduction to it with an example for a technical audience or a longer post with some narrative about its development or something you learned, and an example of its use for a broader readership. If you are interested, consult the blog guide, and tag @stefaniebutland in your reply. She will get in touch about timing and can answer any questions.

We maintain an online book with our best practice and tips, this chapter starts the 3d section that's about guidance for after onboarding (with advice on releases, package marketing, GitHub grooming); the guide also feature CRAN gotchas. Please tell us what could be improved.

Last but not least, you can volunteer as a reviewer via filling a short form.

@dylan-turner25
Copy link
Author

@ropensci-review-bot finalize transfer of rfema

@ropensci-review-bot
Copy link
Collaborator

Transfer completed. The rfema team is now owner of the repository

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants