-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Upgrade package based on dplyr 1.1.0 release #190
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well done, quick fix and nicely implemented. I have a few minor questions, notes. I have commented in line.
Code Coverage
All tests pass, but code coverage is insufficient, I'm not suggesting this is fixed in the scope of this PR. Several functions are untested. This should be addressed in a future PR.
NAMESPACE
devtools::check()
results in a warning for me:
Consider adding
importFrom("utils", "data")
to your NAMESPACE file.
.data
is imported from dplyr. Where do we use utils::data
?
Style
Currently we mix tidy evaluation with standard evaluation (passing column names both as strings and as symbols). It would be more neat to stick to either or.
@@ -292,7 +293,7 @@ get_record_table <- function(package = NULL, | |||
dplyr::filter(.data$delta.time.secs == max(.data$delta.time.secs) & | |||
.data$row_number == max(.data$row_number)) %>% | |||
dplyr::ungroup() %>% | |||
dplyr::select(-.data$row_number) | |||
dplyr::select(-"row_number") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style: You mix standard and non standard evalutation (tidy evaluation), in this case dplyr::select(-row_number)
would be tidy evaluation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a note, doesn't break anything ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #190 (comment) for more details. It doesn't break anything but it generates a note:
Undefined global functions or variables:
row_number
@@ -99,7 +99,7 @@ read_camtrap_dp <- function(file = NULL, | |||
obs_col_names <- names(observations) | |||
if (all(c("X22", "X23", "X24") %in% names(observations))) { | |||
observations <- observations %>% | |||
dplyr::rename(speed = X22, radius = X23, angle = X24) | |||
dplyr::rename(speed = "X22", radius = "X23", angle = "X24") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is not covered by test-read_camtrap_dp.R
, there are more cases of insufficient coverage. So we can postpone this to a separate pull request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a fast fix (see #185). These fields will disappear in new version of camtrap-dp. To test this part I think we have to add a new data package to inst/extdata
and I thought it was not really worth.
Great review, @PietrH, thanks a lot. Before I get to specific comments, I would like to discuss with you an important point which comes quite often in your review and it's all due to the new version of dplyr, I am afraid. As I mentioned in my second bullet point above, this is how to solve R CMD check notes: https://dplyr.tidyverse.org/articles/programming.html#eliminating-r-cmd-check-notes. This means using SE evaluation for tidyselect functions (select, rename, ...) and continuing using I tested all ways to work with columns in a data.frame within a dummy package I have just created, (dummyr) with jsut one dummy function So, if we use https://github.com/damianooldoni/dummyr/blob/88a5725d9ce3ea461af513263ae5bc4dd2b7f674/R/check_tidyselect.R#L14-L19 in the dummyr package, we get a R CMD CHECK note for undefined global variables: ❯ checking R code for possible problems ... NOTE
check_tidyselect: no visible binding for global variable 'col_b'
check_tidyselect: no visible binding for global variable 'col_a'
check_tidyselect: no visible binding for global variable 'col_c'
Undefined global functions or variables:
col_a col_b col_c If we use https://github.com/damianooldoni/dummyr/blob/3f507f6027291a28a86776c2ff1c4d3469bbf687/R/check_tidyselect.R#L14-L19, we are using SE for ❯ checking R code for possible problems ... NOTE
check_tidyselect: no visible binding for global variable 'col_b'
check_tidyselect: no visible binding for global variable 'col_c'
Undefined global functions or variables:
col_b col_c And using SE in ❯ checking R code for possible problems ... NOTE
check_tidyselect: no visible binding for global variable 'col_b'
Undefined global functions or variables:
col_b This doesn't surprise me as I learned few days ago that Finale step, as shown in the blogpost of dplyr I mentioned above, we need to add The point is that using NSE in @PietrH: Do you agree on this? |
Answer about the question in NAMESPACE. Actually we are not using |
@damianooldoni interesting summary in #190 (comment) Do I understand correctly that you cannot use quotes column names in
|
It enables compatibility with R 3.6 Co-Authored-By: Pieter Huybrechts <48065851+PietrH@users.noreply.github.com>
Hi @peterdesmet ✋ I forgot indeed to make an example with SE (= using quotes) in I like your list and I will copy paste it here below giving some extra comments:
It doesn't give errors back, "only" the annoying note about "undefined global variables". But I think you meant this.
This gives a warning as
Indeed, see above. But note also that Standard Evaluation (= quotes) it doesn't work for other functions such as
Indeed. As noted above it's mort than just ConclusionPersonally I don't find the solution proposed by the blogpost and implemented in this PR a great solution and I hope tidyverse (and/or R) will come to a better one. Still, there is no better solution at the moment as all other solutions fail or generate warnings/notes at R CMD check level. |
As per the blogpost you are following best practice. Like you I am looking forward to a different solution proposed by the community. So for the time being the mix of NSE and SE is just something we'll have to live with. |
Indeed @PietrH. Completely agree. |
One more small note: we should consider having an R-CMD Check version running R 3.6 to catch things like the new R4.0 I can make an issue if you like |
I was already thinking about this. But indeed, I wanted to close this PR first. So, making a issue is good: see #194 |
This PR aims to follow the changes in dplyr 1.1.0.
dplyr::na_if()
for replacing 0 with NA over all the data.frame (see bullet point referring to this function in https://dplyr.tidyverse.org/news/index.html#vctrs-1-1-0 and Rewritena_if()
using vctrs tidyverse/dplyr#6329).data
in tidyselect expressions (select()
,rename()
,relocate()
) as it generates warnings. Use"colname"
syntax instead. In this way no notes about "no visible binding for global variable" arise. See https://dplyr.tidyverse.org/articles/programming.html#eliminating-r-cmd-check-notes.multiple = "all"
in aleft_join()
to avoid warning (we expect multiple matches of course). See https://www.tidyverse.org/blog/2023/01/dplyr-1-1-0-joins/#multiple-matchesDESCRIPTION
by explicit version 1.1.0 of dplyr dependency (064bfee), adding @PietrH as contributor to the list of authors (955162b) and bumping version number (fb1ac9b)