
[REVIEW]: Welcome to the tidyverse #1686

@whedon

Description

Submitting author: @hadley (Hadley Wickham)
Repository: https://github.com/tidyverse/tidyverse
Version: v1.3.0
Editor: @karthik
Reviewers: @ldecicco-USGS, @jeffreyhanson
Archive: 10.5281/zenodo.3547813

Status


Status badge code:

HTML: <a href="https://joss.theoj.org/papers/3457fa44beec99c5585e1edb601cee61"><img src="https://joss.theoj.org/papers/3457fa44beec99c5585e1edb601cee61/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/3457fa44beec99c5585e1edb601cee61/status.svg)](https://joss.theoj.org/papers/3457fa44beec99c5585e1edb601cee61)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@ldecicco-USGS & @jeffreyhanson, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. If you have any questions or concerns, please let @karthik know.

Please try to complete your review in the next two weeks.

Review checklist for @ldecicco-USGS

Conflict of interest

Code of Conduct

General checks

  • Repository: Is the source code for this software available at the repository URL?
  • License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • Contribution and authorship: Has the submitting author (@hadley) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

  • Installation: Does installation proceed as outlined in the documentation?
  • Functionality: Have the functional claims of the software been confirmed?
  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems)?
  • Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
  • Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support?

Software paper

  • Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • State of the field: Do the authors describe how this software compares to other commonly-used packages?
  • Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
  • References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

Review checklist for @jeffreyhanson

Conflict of interest

Code of Conduct

General checks

If I understand correctly, my role is to perform these checks for each of the twenty tidyverse packages as well as the tidyverse R package itself. I have therefore provided information for each checklist item for each package below. Please let me know if I have misunderstood anything.

  • Repository: Is the source code for this software available at the repository URL?
    • tidyverse
  • License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
    • broom
    • dplyr
    • feather
    • forcats
    • ggplot2
    • haven
    • hms
    • httr
    • jsonlite
    • lubridate
    • modelr
    • purrr
    • readr
    • rvest
    • readxl
    • stringr
    • tibble
    • tidyr
    • tidyverse
    • xml2
  • Contribution and authorship: Has the submitting author (@hadley) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
    • tidyverse

Functionality

  • Installation: Does installation proceed as outlined in the documentation?
    • broom
    • dplyr
    • feather
    • forcats
    • ggplot2
    • haven
    • hms
    • httr
    • jsonlite
    • lubridate
    • modelr
    • purrr
    • readr
    • rvest
    • readxl
    • stringr
    • tibble
    • tidyr
    • tidyverse
    • xml2
  • Functionality: Have the functional claims of the software been confirmed?
    • broom
    • dplyr
    • feather
    • forcats
    • ggplot2
    • haven
    • hms
    • httr
    • jsonlite
    • lubridate
    • modelr
    • purrr
    • readr
    • rvest
    • readxl
    • stringr
    • tibble
    • tidyr
    • tidyverse
    • xml2
  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)
    • broom
    • dplyr
    • feather
    • forcats
    • ggplot2
    • haven
    • hms
    • httr
    • jsonlite
    • lubridate
    • modelr
    • purrr
    • readr
    • rvest
    • readxl
    • stringr
    • tibble
    • tidyr
    • tidyverse
    • xml2

Documentation

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
    • broom
    • dplyr
    • feather
    • forcats
    • ggplot2
    • haven
    • hms
    • httr
    • jsonlite
    • lubridate
    • modelr
    • purrr
    • readr
    • rvest
    • readxl
    • stringr
    • tibble
    • tidyr
    • tidyverse
    • xml2
  • Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
    • broom
    • dplyr
    • feather
    • forcats
    • ggplot2
    • haven
    • hms
    • httr
    • jsonlite
    • lubridate
    • modelr
    • purrr
    • readr
    • rvest
    • readxl
    • stringr
    • tibble
    • tidyr
    • tidyverse
    • xml2
  • Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems)?
    • broom
    • dplyr
    • feather (the write_feather documentation---which shares a help page with read_feather---is missing an example that uses the write_feather function)
    • forcats
    • ggplot2
    • haven
    • hms
    • httr
    • jsonlite
    • lubridate
    • modelr
    • purrr
    • readr
    • rvest
    • readxl
    • stringr
    • tibble
    • tidyr
    • tidyverse
    • xml2
  • Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
    • broom
    • dplyr
    • feather
    • forcats
    • ggplot2
    • haven
    • hms
    • httr
    • jsonlite
    • lubridate
    • modelr
    • purrr
    • readr
    • rvest
    • readxl
    • stringr
    • tibble
    • tidyr
    • tidyverse
    • xml2
  • Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
    • Although all packages contain automated tests, some do not appear to have relatively high coverage (i.e. greater than 80%). This might be due to issues with using continuous integration services (e.g. limited run time, disk space, or bandwidth). In case it is helpful, I have listed below some functions in tidyverse packages that might benefit from unit tests.
    • broom
    • dplyr
    • feather (coverage not reported but contains tests for main functions)
    • forcats
    • ggplot2
    • haven
    • hms
    • httr (48% coverage): it might be useful to add tests for the functions: cache_info, rerequest, httr_options, curl_docs, set_config, reset_config, with_config, cookies, set_cookies, httr_dr, set_envvar, get_envvar, BROWSE, oauth_endpoints (endpoints for linkedin, twitter, vimeo, facebook, github, and azure), progress, use_proxy, http_type, user_agent, verbose, write_disk, write_memory, write_stream.
    • jsonlite (63% coverage): it might be useful to add tests for the functions: stream_in, stream_out (also verifies internal apply_by_pages and stream_out_page functions), serializeJSON (also verifies pack and fixNativeSymbol), flatten, loadpkg, rbind_pages, unbox.
    • lubridate (72% coverage): it might be useful to add tests for the functions: fit_to_timeline, pretty_dates (with seconds, minutes, hours, and months for coverage of internal pretty_sec, pretty_min, pretty_hour, pretty_month functions), semester, dst, Date.
    • modelr
    • purrr
    • readr
    • rvest (43% coverage): it might be useful to add tests for functions for forms (set_values, submit_form, google_form), sessions (html_session, jump_to, follow_link, session_history, back, *.session methods), and encoding (guess_encoding, repair_encoding).
    • readxl
    • stringr
    • tibble
    • tidyr
    • tidyverse (58% coverage): it might be useful to add tests for tidyverse_deps and tidyverse_logo.
    • xml2
  • Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support?
    I have checked the CODE_OF_CONDUCT, DESCRIPTION, README, .github/*.md and vignette files for these three pieces of information, and listed below which information might be missing for each tidyverse package. The left check box corresponds to (1), the middle check box to (2), and the right check box to (3). Please let me know if I have missed anything.
    • [x] [x] broom
    • [x] [x] dplyr
    • [x] [ ] feather
    • [x] [x] forcats
    • [x] [x] ggplot2
    • [x] [x] haven
    • [x] [x] hms
    • [x] [ ] httr
    • [x] [ ] jsonlite
    • [x] [x] lubridate
    • [x] [x] modelr
    • [x] [x] purrr
    • [x] [x] readr
    • [x] [x] rvest
    • [x] [x] readxl
    • [x] [x] stringr (information available via files in tidyverse/.github when opening a new issue)
    • [x] [x] tibble
    • [x] [x] tidyr
    • [x] [x] tidyverse
    • [x] [ ] xml2
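As a concrete illustration of the Automated tests suggestions above, unit tests for the currently untested tidyverse functions could look something like the sketch below. This is only a sketch: it assumes the testthat framework, and the expected column names and print behaviour are my assumptions, not verified against the package internals.

```r
# Sketch only: possible unit tests for tidyverse_deps() and tidyverse_logo().
# The column names checked below ("package", "cran") are assumptions about
# the structure of the tidyverse_deps() return value.
library(testthat)
library(tidyverse)

test_that("tidyverse_deps returns a data frame of package requirements", {
  deps <- tidyverse_deps()
  expect_s3_class(deps, "data.frame")
  expect_true(all(c("package", "cran") %in% names(deps)))
})

test_that("tidyverse_logo produces printable output", {
  expect_output(print(tidyverse_logo()))
})
```

Similar small tests for the other packages listed above would likely raise coverage without requiring much CI time.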

Software paper

  • Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • State of the field: Do the authors describe how this software compares to other commonly-used packages?
  • Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
  • References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?
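On the feather point noted under Example usage above: a minimal round-trip example of the kind the shared write_feather/read_feather help page could include might look like the following. This is a sketch using a temporary file, not taken from the package documentation.

```r
# Sketch of a write_feather example (currently only read_feather is
# exemplified in the shared help page). mtcars is a built-in dataset.
library(feather)

path <- tempfile(fileext = ".feather")
write_feather(mtcars, path)   # write a data frame to a feather file
df <- read_feather(path)      # read it back as a tibble
# values round-trip; row names are not preserved by the feather format
all.equal(as.data.frame(df), mtcars, check.attributes = FALSE)
```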