New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fingertipsR #168

Closed
sebastian-fox opened this Issue Dec 8, 2017 · 23 comments

Comments

Projects
None yet
6 participants
@sebastian-fox
Member

sebastian-fox commented Dec 8, 2017

Summary

  • What does this package do? (explain in 50 words or less):
    The package provides an interface to the API for the Fingertips website, a site developed by Public Health England. It enables users to import data on over 2,000 indicators of public health in England at different levels of geography.

  • Paste the full DESCRIPTION file inside a code block below:

Package: fingertipsR
Type: Package
Version: 0.1.3.9000
Title: Fingertips Data for Public Health
Description: Fingertips (<http://fingertips.phe.org.uk/>) contains data for many indicators of public health in England. The underlying data is now more easily accessible by making use of the API.
Depends: R (>= 2.10)
Authors@R: c(person("Sebastian", "Fox", , "sebastian.fox@phe.gov.uk", c("aut", "cre")),
    person("Julian", "Flowers", , "julian.flowers@phe.gov.uk", c("aut","ctb")))
URL: https://fingertips.phe.org.uk
BugReports: https://github.com/PublicHealthEngland/fingertipsR/issues
Imports:
    dplyr,
    jsonlite,
    data.table,
    httr,
    purrr,
    DT,
    miniUI,
    shiny,
    shinycssloaders,
    readr
Suggests:
    knitr,
    rmarkdown,
    testthat,
    ggplot2,
    pander,
    curl
VignetteBuilder: knitr
License: GPL-3
Encoding: UTF-8
LazyData: true
RoxygenNote: 6.0.1
  • URL for the package (the development repository, not a stylized html page):

https://github.com/PublicHealthEngland/fingertipsR

  • Please indicate which category or categories from our package fit policies this package falls under *and why(? (e.g., data retrieval, reproducibility. If you are unsure, we suggest you make a pre-submission inquiry.):

Data retrieval because it provides users with functions to import data for over 2,000 indicators of public health in England from the Fingertips website.

  •   Who is the target audience and what are scientific applications of this package?  

The target audience is anyone with an interest in public health. This includes researchers, public and private sector workers and the general public. The website gets almost one million hits in a year. The majority of users are in the public sector who need to make decisions that affect the public's health. This website helps inform those decisions. Increasingly, the website is being accessed by academic institutions. The fingertipsR package is already on CRAN and it is being discussed in some institutions as a future resource for teaching materials.

No

  •   If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.

NA

Requirements

Confirm each of the following by checking the box. This package:

  • does not violate the Terms of Service of any service it interacts with.
  • has a CRAN and OSI accepted license.
  • contains a README with instructions for installing the development version.
  • includes documentation with examples for all functions.
  • contains a vignette with examples of its essential functions and uses.
  • has a test suite.
  • has continuous integration, including reporting of test coverage, using services such as Travis CI, Coeveralls and/or CodeCov.
  • I agree to abide by ROpenSci's Code of Conduct during the review process and in maintaining my package should it be accepted.

Publication options

  • [Already on CRAN] Do you intend for this package to go on CRAN?
  • [No] Do you wish to automatically submit to the Journal of Open Source Software? If so:
    • The package has an obvious research application according to JOSS's definition.
    • The package contains a paper.md matching JOSS's requirements with a high-level description in the package root or in inst/.
    • The package is deposited in a long-term repository with the DOI:
    • (Do not submit your package separately to JOSS)
  • [No] Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:
    • The package is novel and will be of interest to the broad readership of the journal.
    • The manuscript describing the package is no longer than 3000 words.
    • You intend to archive the code for the package in a long-term repository which meets the requirements of the journal.
    • (Please do not submit your package separately to Methods in Ecology and Evolution)

Detail

  • Does R CMD check (or devtools::check()) succeed? Paste and describe any errors or warnings:

  • Does the package conform to rOpenSci packaging guidelines? Please describe any exceptions:

  • If this is a resubmission following rejection, please explain the change in circumstances:

  • If possible, please provide recommendations of reviewers - those with experience with similar packages and/or likely users of your package - and their GitHub user names:

@karthik karthik self-assigned this Dec 8, 2017

@karthik

This comment has been minimized.

Member

karthik commented Dec 8, 2017

👋 sebastian-fox
Thank you for your submission to rOpenSci's onboarding. I've tagged your submission as a presubmission while I do some preliminary checks before I assign an editor.

@karthik karthik assigned noamross and unassigned karthik Dec 9, 2017

@karthik karthik added the package label Dec 9, 2017

@noamross

This comment has been minimized.

Collaborator

noamross commented Dec 10, 2017

Editor checks:

  • Fit: The package meets criteria for fit and overlap
  • Automated tests: Package has a testing suite and is tested via Travis-CI or another CI service.
  • License: The package has a CRAN or OSI accepted license
  • Repository: The repository link resolves correctly
  • Archive (JOSS only, may be post-review): The repository DOI resolves correctly
  • Version (JOSS only, may be post-review): Does the release version given match the GitHub release (v1.0.0)?

Editor comments

Thank you for your submission, @sebastian-fox ! I see you are running the package on Travis already, but we do require a CI badge in your README as well as a continuous code coverage service (generally codecov or coveralls) and a badge from that, as well, so we can see package status at a glance. I'm seeking reviewers and will assign them once you've made this update.

Here is output from goodpractice::gp(). I see nothing sufficient to hold up assigning reviewers, but 58% test coverage is a bit on the low side, and the rest should be handled by the end of review. Things like line length and whole-package imports are up to the judgement of reviewers and this serves a reference for them and you.

── GP fingertipsR ────────────────────────────────────

It is good practice to

  ✖ write unit tests for all functions, and all package code in general. 58% of code lines
    are covered by test cases.

    R/area_types.R:36:NA
    R/area_types.R:93:NA
    R/area_types.R:94:NA
    R/area_types.R:95:NA
    R/area_types.R:96:NA
    ... and 181 more lines

  ✖ use '<-' for assignment instead of '='. '<-' is the standard, and R users and
    developers are used it and it is easier to read your code for them if you use '<-'.

    R/select_indicators.R:48:33

  ✖ avoid long code lines, it is bad for readability. Also, many people prefer editor
    windows that are about 80 characters wide. Try make your lines shorter than 80 characters

    R/area_types.R:36:1
    R/area_types.R:61:1
    R/area_types.R:66:1
    R/area_types.R:69:1
    R/deprivation_decile.R:38:1
    ... and 103 more lines

  ✖ avoid sapply(), it is not type safe. It might return a vector, or a list, depending on
    the input data. Consider using vapply() instead.

    R/fingertips_data.R:187:33
    R/fingertips_data.R:188:48
    tests/testthat/test-deprivation.R:21:22
    tests/testthat/test-deprivation.R:22:22
    tests/testthat/test-deprivation.R:23:22
    ... and 1 more lines

  ✖ avoid 1:length(...), 1:nrow(...), 1:ncol(...), 1:NROW(...) and 1:NCOL(...)
    expressions. They are error prone and result 1:0 if the expression on the right hand side is
    zero. Use seq_len() or seq_along() instead.

    R/retrieve_data.R:10:19

  ✖ not import packages as a whole, as this can cause name clashes between the imported
    packages. Instead, import only the specific functions you need.
──────────────────────────────────────────────

Reviewers: @carlganz, @nacnudus
Due date: 2018-01-15

@sebastian-fox

This comment has been minimized.

Member

sebastian-fox commented Dec 15, 2017

Thanks for taking a look at this. I have added the badges as requested. I have also improved the coverage of the package.

I've also updated some of the code based on the goodpractice feedback

@noamross

This comment has been minimized.

Collaborator

noamross commented Dec 18, 2017

Reviewers are @carlganz and @nacnudus. In light of the end-of-year holidays, due date is 2018-01-15.

@noamross

This comment has been minimized.

Collaborator

noamross commented Dec 18, 2017

@sebastian-fox You can add our Peer Review badge to your README. It will read "in-progress" and update after acceptance:

[![](https://badges.ropensci.org/168_status.svg)](https://github.com/ropensci/onboarding/issues/168)
@nacnudus

This comment has been minimized.

nacnudus commented Dec 29, 2017

Hi @sebastian-fox @noamross, here's my review.

Package Review

By Duncan Garmonsway @nacnudus

  • As the reviewer I confirm that there are no conflicts of interest for me to review this work.

Documentation

The package includes all the following forms of documentation:

  • A statement of need clearly stating problems the software is designed to solve and its target audience in README
  • Installation instructions: for the development version of package and any non-standard dependencies in README
  • Vignette(s) demonstrating major functionality that runs successfully locally
  • Function Documentation: for all exported functions in R help
  • Examples for all exported functions in R Help that run successfully locally
  • Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software been confirmed.
  • Performance: Any performance claims of the software been confirmed (no
    claims made)
  • Automated tests: Unit tests cover essential functions of the package
    and a reasonable range of inputs and conditions. All tests pass on the local machine.
  • Packaging guidelines: The package conforms to the rOpenSci packaging guidelines

Final approval (post-review)

  • The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing:


Review Comments

The fingertipsR package provides programmatic access to data published by Public Health England on the Fingertips website. This is useful in its own right, but especially because the downloads offered by the website appear to be exclusively as spreadsheets.

The package has some fairly heavy dependencies (both dplyr and data.table), but makes the most of shiny and related packages to provide a searchable index of datasets.

The vignettes demonstrate common usage and are easy to follow, and data structure is self-explanatory once it has been retrieved.

README

The introductory statement explains that the package is for interacting with a data tool called 'Fingertips'. It would be helpful if it also explained what the Fingertips data tool is.

The first installation instruction requires the devtools package. Please could you also give instructions that only use base R, e.g. https://stackoverflow.com/a/1474125/937932. Please can you also add an instruction for installing from CRAN, and make it clear that there is a difference, e.g. 'stable' vs 'development'.

Please can the example in the README be described in full. The current text is generic:

This is a basic example which shows you how to solve a common problem:

The instruction for browsing vignettes only works if the vignettes have been installed, which is not the default. If using devtools, then the call should be

devtools::install_github("PublicHealthEngland/fingertipsR",
                         build_vignettes = TRUE,
                         dependencies = "suggests")

Life Expectancy vignette

I read this vignette before reading any of the function documentation, so some of my points are answered in the docs (but it would still be helpful not to have to go to the docs for the answers).

The .md file committed to GitHub is out of date.

There seems to be a line missing

We need to begin by calling the fingertipsR package:

Should it be followed by

library(fingertipsR)

?

In the code chunk under 'IndicatorID', the comment beginning # Because the same indicators ... would be easier to read (and render more nicely as html!) if it were wrapped to lines of 80 characters as below. It also mentions 'profiles' without defining what profiles are. That is defined on the Fingertips website, but it would be helpful to repeat some information here for users who have come straight here. They might come here first if they are maintaining code written by others. From the packaging_guide:

  • If your package connects to a data source or online service, or wraps other software, consider that your package README may be the first point of entry for users. It should provide enough information for users to understand the nature of the data, service, or software, and provide links to other relevant data and documentation. For instance, a README should not merely read, "Provides access to GooberDB," but also include, "..., an online repository of Goober sightings in South America. More information about GooberDB, and documentation of database structure and metadata can be found at link.

If filtering the output of indicators() is a common workflow (code chunk below), then it might help users if there was a built-in function to return one record per IndicatorID.

inds <- indicators()
life_expectancy <- inds[grepl("life expectancy", tolower(inds$IndicatorName)),]

# Because the same indicators are used in multiple profiles, there are many
# repeated indicators in this table (some with varying IndicatorName but same
# IndicatorID)

# This returns a record for each IndicatorID
life_expectancy <-
  unique(life_expectancy[duplicated(life_expectancy$IndicatorID) == FALSE,
                        c("IndicatorID", "IndicatorName")])
knitr::kable(life_expectancy, row.names = FALSE)

areaTypes()$AreaTypeID is numeric vector, but only contains integers. Would you consider using an integer vector instead. Similarly areaTypes()$ParentAreaTypeID

Please could you explain the relationship between AreaTypeID and ParentAreaTypeID. It is another thing that is probably explained on the Fingertips website, but that would be convenient to know at this point in the vignette.

Typo: though it change should be though it changes.

The quoted paragraph below begs the question, are there any other defaults, and are they guessable? What's the rationale? Also, what do 101, 102 and 6 translate to?

ParentAreaTypeID is 6 by default for the fingertips_data() function for AreaTypeID of 102 (though it change if different AreaTypeIDs are entered), so we can stick with that in this example.

Typo in the docs for deprivation_decile(). Outputs a data frame allocating deprivation decile to area code base on should be based on.

The deprivation section interrupts the workflow from finding the IDs to downloading the data with fingertips_data(). It would flow better to finish downloading the data before seeking the deprivation deciles.

It also took me a while to understand what deprivation_decile() does. It downloads a dataset that is already available in Fingertips the usual way, filters it for the given IDs, renames a column nd recalculaties the decile column. It would be helpful to explain somewhere why the deprivation decile data gets this special treatment.

Should the error message below be "Year must be 2010, 2011, 2012 or 2015"?

    if (!(Year %in% c(2010, 2011, 2012, 2015))) {
        stop("Year must be either 2010 or 2015")
    }

The style of the following if statement could be improved by simplifying.

    if (!(Year %in% c(2010, 2011, 2012, 2015))) {
        stop("Year must be either 2010 or 2015")
    }
    else if (Year %in% c(2010, 2011, 2012)) {
        IndicatorID <- 338
    }
    else if (Year == 2015) {
        IndicatorID <- 91872
    }
    if (AreaTypeID == 101) {
        if (Year %in% c(2010, 2015)) {
            AreaFilter <- "District & UA"
        }
        else if (Year %in% c(2011, 2012)) {
            stop("Year must be either 2010 or 2015 for AreaTypeID of 101 or 102")
        }
    }
    else if (AreaTypeID == 102) {
        if (Year %in% c(2010, 2015)) {
            AreaFilter <- "County & UA"
        }
        else if (Year %in% c(2011, 2012)) {
            stop("Year must be either 2010 or 2015 for AreaTypeID of 101 or 102")
        }
    }
    else if (AreaTypeID == 7) {
        AreaFilter <- "GP"
    }
    else {
        stop("AreaTypeID must be either 101 (Local authority districts and Unitary Authorities), 102 (Counties and Unitary Authorities) or 7 (General Practice).")
    }

Here is a shorter version that I find easier to follow.

    if (!(Year %in% c(2010, 2011, 2012, 2015))) {
        stop("Year must be either 2010 or 2015")
    }
    if (!(AreaTypeID %in% c(101, 102, 7))) {
        stop("AreaTypeID must be either 101 (Local authority districts and Unitary Authorities), 102 (Counties and Unitary Authorities) or 7 (General Practice).")
    }
    if ((AreaTypeID %in% c(101, 102)) && !(Year %in% c(2010, 2015))) {
        stop("Year must be either 2010 or 2015 for AreaTypeID of 101 or 102")
    }
    if (Year %in% c(2010, 2011, 2012)) IndicatorID <- 338
    if (Year == 2015) IndicatorID <- 91872
    if (AreaTypeID == 101) AreaFilter <- "District & UA"
    if (AreaTypeID == 102) AreaFilter <- "County & UA"
    if (AreaTypeID == 7) AreaFilter <- "GP"

The output of pander::pandoc.table() doesn't work in the HTML rendering of the vignette. It gives the following output, which is markdown syntax but in a code block instead of being rendered as HTML.

##
##
## |  &nbsp;  | IndicatorID |              IndicatorName              | ParentCode |
## |:--------:|:-----------:|:---------------------------------------:|:----------:|
## | **5827** |    90362    | 0.1i - Healthy life expectancy at birth | E12000005  |
## | **5828** |    90362    | 0.1i - Healthy life expectancy at birth | E12000006  |
## | **5829** |    90362    | 0.1i - Healthy life expectancy at birth | E12000008  |
## | **5830** |    90362    | 0.1i - Healthy life expectancy at birth | E12000005  |
## | **5831** |    90362    | 0.1i - Healthy life expectancy at birth | E12000008  |
## | **5832** |    90362    | 0.1i - Healthy life expectancy at birth | E12000005  |
...

Interactively selecting indicators vignette

Does the word 'significantly' in this sentence imply that a statistical significance test has been applied? If not then less loaded term may be better, e.g. 'notably', or an explanation of how the indicators are chosen.

highlighting the areas for each indicator that are deteriorating and are already significantly worse than the England (or parent) value

library(fingertipsR)
library(ggplot2)
inds <- select_indicators()

Have you considered wrapping the geom_tile() plot as a built-in plot function? If so, it would be worth refining the design. Points might be easier to read than tiles.

Function documentation

In general it would be helpful to make it clear that the named government departments are in the UK.

area_types
category_types
deprivation_decile
fingertips_data
fingertips_redred
indicator_areatypes
indicator_metadata
indicators
profiles
select_indicators

area_types()

Needs an example that uses the second argument AreaTypeID. This would be a good place to explain how area types relate to parent area types, or at least link to an explanation.

category_types

Please could it return its result visibly (i.e. category_types() in the console should print its output to the console).

Should have an example, even if it seems trivial.

deprivation_decile

The restriction on the Year argument to the years 2010 and 2015 isn't reflected in the code, which also accepts 2011 and 2012.

It would be helpful to add a Details section to explain what this function actually does (downloads a dataset the usual way, filters it, renames and recalculates a couple of columns). Presumably there is a good reason not to just download the deprivation data oneself?

fingertips_data

The different IDs are a bit overwhelming. Is there a way to explain how they relate to one another? Perhaps a drawing would help.

Would you reconsider the stringsAsFactors argument? I am not sure that it is useful for strings to be converted to factors in this case. If it is useful, then it would be more useful (and less dangerous) if they were converted to a standard sets of factor levels that were guaranteed to included levels that might be missing from the data. For example, if an area were missing from a particular dataset, it should still have a factor level, so that the data could safely be joined to another dataset of the same areas.

What do the defaults 6, 101 and 102 translate to?

Typo in the examples: trailing '#.

     fingdata <- fingertips_data(DomainID = doms)#'

The matter of polarity could be explained more. I think it's something to do with whether a bigger number is better or worse, but I don't see why that leads to duplicates in the example.

fingertips_redred

Does the word 'significantly' imply a statistical test is performed? If not, a less loaded term would be better, e.g. 'notably', or 'quite a bit'.

If Comparator = "Parent", does the spot test still relate to "England" as per the text?

The @return heading is not being rendered correctly.

indicator_areatypes

This function loads a local dataset. Is it not available from the API? Otherwise it would make more sense (to me!) to explort the dataset than to wrap it in this function. Though I can see that it is consistent with the other functions that return datasets (and that accept arguments).

indicator_metadata

Good.

indicators

Is the title correct? "Profiles".

profiles

Good.

select_indicators

Nice function :)

Community guidelines / Code of Conduct

Please could you add guidelines or a code of conduct in the README or a CONTRIBUTING file or a CODE_OF_CONDUCT.md file, and a Maintainer: tag in the DESCRIPTION file. See the
packaging_guide.

We recommend that you use a code of conduct such as the Contributor Covenant in developing your package. You can document your code of conduct in a CODE_OF_CONDUCT.md or CONDUCT.md file in the package root directory, and linking to this file from the README.md file. devtools::use_code_of_conduct() will add the Contributor Covenant template to your package.

NEWS

Please follow the packaging guide advice on putting CRAN release dates in the NEWS file, and tagging the release on GitHub.

DESCRIPTION

If the UK Government owns the copyright on this package then that should be mentioned in the DESCRIPTION.

Tests

Please could more tests be added? Some lines of code are untested so far, as the following code will show. This is fair enough for the shiny function select_indicators(), but would be straightforward to hit 100% of the other functions. The select_indicators() function can be excluded from coverage statistics by using the comments # nocov start and # nocov end.

x <- covr::package_coverage()
covr::zero_coverage(x)

goodpractice

There are a lot of long lines, and I am one of those people who likes to fit other windows on the screen besides the editor ;) Sometimes it's unavoiadable because of the API calls, but in most cases function arguments could flow onto another line, or inline sequences of piped functions could be wrapped in their own function.

General comments

It would be well worth asking a copy editor, or someone of that ilk, to edit the docs and vignettes. It just needs a bit of tidying up to make it easier to read, e.g. changing Other data extract functions to Other data extraction functions.

fingertips_redred(), indicator_metadata() return tibbles, which is very nice. Would you consider also returning tibbles from the other functions? area_types(), category_types(), indicator_areatypes(), deprivation_decile(), fingertips_data(), areatypes_by_indicators, indicators() and profiles()?

area_types and areatypes are both used in function names. Maybe not worth changing now because backwards compatibility is important and the package is already widely used.

Please can you remove the commented-out code in retrieve_data.R.

Not a question about the package, but for Public Health England -- why isn't http://fingertips.phe.org.uk/ on a gov.uk domain?

I hope this review is helpful. Please ask any questions you have.

Time spent reviewing

6 hours

@carlganz

This comment has been minimized.

carlganz commented Jan 3, 2018

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need clearly stating problems the software is designed to solve and its target audience in README
    * Readme should probably include a more detailed statement of need.
  • Installation instructions: for the development version of package and any non-standard dependencies in README
    * As @nacnudus stated, devtools::install_github should probably include build vignette option.
  • Vignette(s) demonstrating major functionality that runs successfully locally
  • Function Documentation: for all exported functions in R help
    * Missing documentation for indicators2. What is this function?
    * select_indicator should probably have a note about the long loading time
    * Typo https://github.com/PublicHealthEngland/fingertipsR/blob/78c09694d36e7bcda0dd71b05017f556020b5404/R/enhancements.R#L5
  • Examples for all exported functions in R Help that run successfully locally
    * Missing examples for category_types, indicator_areatypes; also category_types doesn't work
    * Typo https://github.com/PublicHealthEngland/fingertipsR/blob/master/R/fingertips_data.R#L34
  • Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).
    * Again echoing @nacnudus, please include contributing guidelines, and the copyright holder in the description

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software been confirmed.
  • Performance: Any performance claims of the software been confirmed.
  • Automated tests: Unit tests cover essential functions of the package
    and a reasonable range of inputs and conditions. All tests pass on the local machine.
    • Echoing @nacnudus, please use code coverage. There seem to be enough tests, but without code coverage it is difficult to tell.
  • Packaging guidelines: The package conforms to the rOpenSci packaging guidelines

Final approval (post-review)

  • The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 5


Review Comments

The fingerTipsR package does an excellent job providing easy access to a substantial amount of data. It even provides a nifty Shiny gadget for interactive selection of data. I think there are a few areas where the package can be improved, but overall the package is high quality.

Documentation

The readme is a little sparse. Please add a statement of need. A more detailed description of what the Fingertips data source is would be nice. How is the data collected? What is scope of dataset? Ideally, a person who has never heard of the Fingertips dataset should be able to read the readme and then have a basic understanding of what the dataset is.

Vignettes do a good job explaining how to use the package. Echoing @nacnudus flipping the "Deprivation" and "Extracting the data" sections from the "Life expectancy by deprivation" vignette would make the document flow better.

The indicator2 function has no documentation.

category_types and indicator_areatypes do not have examples.

For functions that take a long time to run, it would be wise to add a note to documentation saying so.

Functionality

The Fingertips dataset is pretty substantial. This makes certain functionality pretty slow. There are a few ways the code could be optimized. These are simply suggestions on how to improve the code. I will still recommend approving this package even if you don't incorporate these changes.

  • Remove data.table dependency

data.table::rbindlist is functionally equivalent to dplyr::bind_rows. Since dplyr is already imported may as well use bind_rows

  • Remove purrr::map_df(rbind)

This is also functionally equivalent to dplyr::bind_rows, but is much slower. This would also remove the purrr dependency, but I'm about to recommend using purrr for something else.

  • Don't build data structures in for loop

@noamross has a good write up about why this idiom is memory inefficient in section 4 here

Here is one example in the package: https://github.com/PublicHealthEngland/fingertipsR/blob/master/R/indicators.R#L42

There are a few different ways of dealing with this. The purrr based solution would look something like this:

df <- DomainId %>% map_dfr(function(dom) {
          dfRaw <- paste0(path,"indicator_metadata/by_group_id?group_ids=",dom) %>%
                        GET %>%
                        content("text") %>%
                        fromJSON(flatten = TRUE)
                if (length(dfRaw) != 0){
                        dfRaw <- unlist(dfRaw, recursive = FALSE)
                        dfIDs <- dfRaw[grepl("IID", names(dfRaw))]
                        names(dfIDs) <- gsub(".IID","",names(dfIDs))
                        dfDescription <- unlist(dfRaw[grepl("Descriptive", names(dfRaw))],
                                                recursive = FALSE)
                        dfDescription <- dfDescription[grepl("NameLong", names(dfDescription))]
                        names(dfDescription) <- gsub(".Descriptive.NameLong","",names(dfDescription))
                        commonNames <- intersect(names(dfIDs), names(dfDescription))
                        dfIDs <- dfIDs[commonNames]
                        dfDescription <- dfDescription[commonNames]
                        
                        data.frame(IndicatorID = unlist(dfIDs),
                                         IndicatorName = unlist(dfDescription),
                                         DomainID = dom,
                                         row.names=NULL)
                }
})
        

This could also be done with lapply.

  • Fix category_types

Doesn't return anything.

Hopefully this review was helpful.

Kind Regards,

Carl Ganz

@noamross

This comment has been minimized.

Collaborator

noamross commented Jan 3, 2018

Thank you for your thorough reviews, @nacnudus and @carlganz! @sebastian-fox, at this point we generally aim for a response to reviewers within two weeks. This can be fully implementing recommended changes or just a written response of the changes you plan to make and a time frame (example).

I would re-emphasize @nacnudus's comment on thinking about the README and other documentation as first points of entry for users. We try to follow this mentality throughout package documentation. I also strongly suggest you follow @carlganz's recommendations on performance and reducing dependencies.

@sebastian-fox

This comment has been minimized.

Member

sebastian-fox commented Jan 12, 2018

Hi @noamross , @carlganz and @nacnudus ,

Firstly, a massive thank you for taking your time to look at this package. I can see you have given it an extremely thorough review and I've learnt a lot from your comments, so thank you.

I'll respond to question first, then what I have/haven't addressed:

Question

Why isn't Fingertips on a .gov.uk website? I'm afraid I don't know the answer (I'm not in any of the teams that would be involved in those discussions). My guess is that it is a legacy issue. PHE was formed relatively recently and Fingertips was popular before then, so there are probably challenges to move it across.

Addressed

General
  • All typos have been corrected
  • stopped dependency on purrr and data.table (thanks for the bind_rows tip - I've not used that before)
  • replace all "for loops" with lapply functions except in retrieve_data.R file as that's a little more complex (in my eyes at least) as I don't currently have the time. I've added it as an issue
  • all "significants" are now preceded by "statistically" for clarity
  • code coverage should now be around 100% (it says I'm not testing 1 line when I follow the covr instructions @nacnudus provided, but I'm sure I am testing it!)
  • Package tagged on GitHub
  • attempted to shorten lines where possible (and where I've spotted them)
  • commented out fields have been removed
  • tibbles are now returned instead of data.frames
  • functions that take time to run now have a comment in the description with that warning
README
  • Improved explanation of the Fingertips website, its scope, who it is aimed at, what it contains
  • Improved package installation instructions
  • Code of conduct included
  • Example has been improved and the text explaining it should be a little clearer
DESCRIPTION
  • Maintainer: added
  • Copyright added
NEWS
  • CRAN release dates added
LE Vignette
  • All feedback from both reviewers has been incorporated except the advice not to display one of the tables using pander::pandoc.table(). I chose this because I felt this was the clearest way of displaying this dataset with so many fields. I wanted to avoid left-right scrolling, but that was just my style. Any other suggestions are welcomed though
Interactively selecting indicators vignette
  • geom_tile changed to geom_point - nice suggestion. I'm not keen to provide a function for that chart at the moment. Perhaps if the package evolves to start displaying standard charts
area_types
  • added an AreaTypeID example
  • added reference to the Life Expectancy vignette for further explanation of how areas can map to each other
category_types
  • example added
  • now returns something! (whoops)
deprivation_decile
  • Details section added
  • The years/error messages have been updated - well spotted
fingertips_data
  • I couldn't see the defaults 6 and 101 there to explain them - apologies if I've overlooked them
  • stringsAsFactors removed
  • Fuller explanation of the IDs in the Life Expectancy vignette. I hope this makes things a little clearer.
  • Polarity explained in Details section
fingertips_redred
  • @return renders correctly now
  • Text corrected based on different comparator type
indicator_areatypes
  • function updated to pull data from API (this wasn't available on the API until recently)
  • example added
indicators
  • title corrected
select_indicators
  • thanks for positive feedback. It was a colleague's idea, I'll pass that onto him :)
indicator2
  • I can't find this function. I've used "Find in files" but can't see it anywhere. @carlganz - could you point me towards it? It shouldn't exist so I need to delete it.
@sebastian-fox

This comment has been minimized.

Member

sebastian-fox commented Jan 12, 2018

I should have mentioned - all the changes have been made to v0.1.3.9000. I should submit the package to CRAN again soon because of the change in the indicator_areatypes() function now pulling data from the API. I'll wait for any further feedback from you before doing so.

@nacnudus

This comment has been minimized.

nacnudus commented Jan 13, 2018

@sebastian-fox thanks for making all those changes -- that's a lot of work! There are a couple of small leftovers, but in general I think this is ready to go now.

  • Coverage: The one line that isn't covered (159 of R/fingertips_data.R) is one that I don't think will ever be executed, because an earlier, equivalent test in the same function will fail first. It could be tidied up, but I don't think it's a blocker for publication.
  • The Travis badge in the README should link to https://travis-ci.org/PublicHealthEngland/fingertipsR to take people who click it to the test results, rather than to the image.
  • The new README blurb is very nice. It has one typo ;) accesibility should be accessibility.
  • The CRAN installation code would render correctly if wrapped as a chunk, rather than as inline code, e.g.
    ```{r CRAN-install, eval=FALSE}
    install.packages("fingertipsR")
    ```
  • Your use of pandoc in the vignette: fair enough!
  • deprivation_decile() the test tests/testthat/test-deprivation.R needs updating too, and then the build will succeed. Line 32 should be
        expect_error(deprivation_decile(Year = 2014), "Year must be either 2010, 2011, 2012 or 2015")
  • The 101, 102 and 7 defaults have now been explained (perhaps 6 was changed
    to 7?). Anyway, it makes much more sense to me now, thanks.
@sebastian-fox

This comment has been minimized.

Member

sebastian-fox commented Jan 14, 2018

Hi @nacnudus ,

Thanks for the further comments - all really helpful again.
I've made all the changes now:

  • removed line 159 from fingertips_data() as it was being tested on line 91
  • updated link to travis
  • corrected accessibility typo
  • fixed the CRAN install README chunk to render properly
  • corrected the deprivation_decile() test

And this time I've checked that the package builds.

Thanks again

@noamross

This comment has been minimized.

Collaborator

noamross commented Jan 15, 2018

Thanks for your revisions and response, @sebastian-fox and @nacnudus. @carlganz, let us know whether the changes address your comments.

@nacnudus

This comment has been minimized.

nacnudus commented Jan 15, 2018

Thanks @sebastian-fox, that's everything from me.

Final approval (post-review)

The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

@carlganz

This comment has been minimized.

carlganz commented Jan 17, 2018

Final approval (post-review)
The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Congrats @sebastian-fox

@noamross

This comment has been minimized.

Collaborator

noamross commented Jan 18, 2018

Approved! 🎉 Thank you for submitting and @nacnudus and @carlganz for thorough and timely reviews. To-dos:

  • Transfer the repo to the rOpenSci organization under "Settings" in your repo. I have invited you to a team that should allow you to do so. You'll be made admin once you do.

  • Add the rOpenSci footer to the bottom of your README

[![ropensci_footer](https://ropensci.org/public_images/ropensci_footer.png)](https://ropensci.org)

  • Fix any links in badges for CI and coverage to point to the ropensci URL. (We'll turn on the services on our end as needed)

Welcome aboard! We'd also love a blog post about your package, either a short-form intro to it (https://ropensci.org/tech-notes/) or long-form post with more narrative about its development. ((https://ropensci.org/blog/). If you are, @stefaniebutland will be in touch about content and timing.

@stefaniebutland

This comment has been minimized.

stefaniebutland commented Jan 18, 2018

Hello @sebastian-fox. I'm rOpenSci's Community Manager. Happy to answer any questions you might have about contributing a post for the rOpenSci blog.

@sebastian-fox

This comment has been minimized.

Member

sebastian-fox commented Jan 18, 2018

Thanks all - I can see a lot of work goes into this from your end so this is really appreciated.

I will transfer the package over as soon as I can. I'm don't have admin access currently to my repo so I can't see the settings option. I need to negotiate admin rights, so hopefully that won't take too long.

Thanks again

@sebastian-fox

This comment has been minimized.

Member

sebastian-fox commented Jan 19, 2018

Hi @noamross - I've completed your checklist. Thanks for your help. I'll start thinking about blogs :)

@noamross

This comment has been minimized.

Collaborator

noamross commented Jan 19, 2018

Great! Welcome aboard! I will close this issue, you'll find instructions on submitting a blog post over at https://github.com/ropensci/roweb2#contributing-a-blog-post .

@noamross noamross closed this Jan 19, 2018

@sebastian-fox

This comment has been minimized.

Member

sebastian-fox commented Apr 20, 2018

Hello, just wondering how to get the rOpenSci badge status updated from "Under review" to "peer reviewed" for fingertipsR on the README file?

@karthik karthik changed the title from fingertipsR to tcgughkclrticbfingertipsR Apr 20, 2018

@karthik karthik changed the title from tcgughkclrticbfingertipsR to fingertipsR Apr 20, 2018

@noamross

This comment has been minimized.

Collaborator

noamross commented Apr 20, 2018

We see it as "Peer Reviewed". You may need to clear your browser cache.

@sebastian-fox

This comment has been minimized.

Member

sebastian-fox commented Apr 23, 2018

Thanks - that did the trick :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment