Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Code Review #9

Open
ablack3 opened this issue Dec 8, 2022 · 0 comments
Open

Code Review #9

ablack3 opened this issue Dec 8, 2022 · 0 comments

Comments

@ablack3
Copy link

ablack3 commented Dec 8, 2022

Overview

Overall the package is really useful and the code looks good to me. I made some suggestions for documentation updates and a few implementation notes.

Documentation

Overall the documentation is good. I've made some suggestions below but these are minor suggestions that I think might help improve the user experience.

The readme describes basic usage well. However it might be worth putting the information in a vignette as well with some more context (e.g. what are "permitted packages"?)

The package title and description in the DESCRIPTION file are a bit vague: "Tool Suite to Investigate Other Packages". Maybe the title should be something like "Tool Suite to Investigate R package dependencies".

Some of the prose in the readme could be improved. For example here is my suggested edit of one of the paragraphs.

original

Function use in all .R files can be investigated using the sumariseFunctionUse function. It assumes the function is runs inside an R-project and will automatically look through the .R files in the /R folder. Functions are grouped by the package they are imported from. When no package can be identified are grouped under unknown package. Usually base functions are not used as base::mean(). Therefore the function goes through all the base functions to bin them under the base package rather than unknown.

edit

Function use in all .R files can be investigated using the summariseFunctionUse function. sumariseFunctionUse is intended to be called from inside an R-project and will automatically look through the .R files in the /R folder. Functions used in .R files are are grouped together by the package they are imported from. Functions whose package cannot be identified are grouped together under the unknown label.

Make sure to spell check the package using spelling::spell_check_package(). Once all corrections are made you can update the package wordlist (words that should be added to the dictionary) with spelling::update_wordlist().

It looks like DependencyReviewer imports a few packages that need to be whitelisted.
image

Function descriptions should describe what they return or what side-effect they have. The documentation for checkDependencies should say something about its side effect of printing a report of dependencies that have not been approved.
image

Some arguments like in_package (in screenshot below) need a description. in_package is used in two exported functions so you could centralize the documentation for this argument. I also think you might be able to handle both relative and absolute paths in the files argument with normalizePath() and possibly deprecate the argument altogether. That just a thought though.

image

Argument names mix camel case and snake case. I wouldn't change that at this point. CDMConnector also does that. In Darwin we seem to be mixing the two styles.

Consider not exporting lower level functions like funsUsedInLine. Actually it looks like you already made this change in the main branch.

getGraphData documentation needs to be filled out a bit more. What is the input and output. What graph is being produced?
image

darwinLintScore takes in a linting function. However that linting function needs to return a very specific data format. It's not clear what other lint functions could be used with this function or if the user would want to have the flexibility to pass in any lint function. There should be some more description about what lint functions are available and what the requirements for a lint function that is accepted by darwinLintScore are.

Consider creating a package website with pkgdown and hosting it on github.

Implementation

Namespace and exports look good. For GGally consider only importing the one function you need (i.e. @importFrom GGally ggnet2) to silence the message "Registered S3 method overwritten by 'GGally': method from +.gg ggplot2" that is printed when the package is loaded.

The package lacks testing and I'd suggest that improving test coverage should be a focus on the next sprint for DependencyReviewer. However I can see how it is tricky to test these functions.

Below is alternative option for the bind_rows(lapply()) pattern that uses tidyr::unnest.

library(dplyr, warn.conflicts = F) # needed for %>% 
not_permitted <- c("dplyr", "pak")

example <-
  dplyr::bind_rows(
    lapply(
      X = not_permitted,
      FUN = function(pkg) {
        dplyr::tibble(pak::pkg_search(pkg)) %>%
          dplyr::filter(.data$package == pkg) %>%
          dplyr::select(.data$package, version, date, .data$downloads_last_month,
                        license, url)
      }))
#> Warning: Use of .data in tidyselect expressions was deprecated in tidyselect 1.2.0.
#> ℹ Please use `"package"` instead of `.data$package`
#> Warning: Use of .data in tidyselect expressions was deprecated in tidyselect 1.2.0.
#> ℹ Please use `"downloads_last_month"` instead of `.data$downloads_last_month`
example
#> # A tibble: 2 × 6
#>   package version    date                downloads_last_month license      url  
#>   <chr>   <pckg_vrs> <dttm>                             <int> <chr>        <chr>
#> 1 dplyr   1.0.10     2022-09-01 08:20:06              1873114 MIT + file … http…
#> 2 pak     0.3.1      2022-09-08 20:30:02                39420 GPL-3        http…

# could be rewritten as

example <-
  dplyr::tibble(pkg = not_permitted) %>% 
  dplyr::mutate(pkg_search  = purrr::map(pkg, pak::pkg_search)) %>% 
  tidyr::unnest(pkg_search) %>% 
  dplyr::filter(.data$package == .data$pkg) %>%
  dplyr::select(package, version, date, downloads_last_month,
                license, url)

example
#> # A tibble: 2 × 6
#>   package version    date                downloads_last_month license      url  
#>   <chr>   <pckg_vrs> <dttm>                             <int> <chr>        <chr>
#> 1 dplyr   1.0.10     2022-09-01 08:20:06              1873114 MIT + file … http…
#> 2 pak     0.3.1      2022-09-08 20:30:02                39420 GPL-3        http…

Also note that you no longer need to use .data in dplyr::select. (https://tidyselect.r-lib.org/news/index.html#lifecycle-changes-1-2-0)

One comment about using "sapply": sapply is not type stable meaning that it sometimes simplifies it's output to a vector and sometimes it will return a list. From what I've heard it is safer to use vapply or lapply and explicitly specify the type of output.

# for example this is guaranteed to return a character vector
tidyversePackages <- vapply(
  X = tidyverse::tidyverse_packages(include_self = TRUE),
  FUN = function(pkg) {
    as.character(utils::packageVersion(pkg))
  },
  FUN.VALUE = character(1L)
)

When you call sapply for a side effect and not the return value (example, example) I think it does not matter since the return value is discarded.

Shiny app

This app is very nice. My only suggestion is that on the function review tab the selection of rows in the table should have some kind of effect. Currently it seems that selecting rows in the function use table does not have any effect on the app. I'm thinking that if the user selects a row then they should be taken to the corresponding row in the code viewer. Also users should only be allowed to select on row at a time.
You might also consider increasing the width of the checkBoxGroup input on the package review page to conserve the vertical screen real estate.
image

The shiny app code looks good to me. Overall its a great tool!

Miscellaneous

In

"Error was caught during the linting of your package. The package
might be to large to lint all together. Use: darwinLintFile(fileName)"

"to" should be "too".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant