Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

generic count() #5633

Merged
merged 7 commits into from
Dec 7, 2020
Merged

generic count() #5633

merged 7 commits into from
Dec 7, 2020

Conversation

romainfrancois
Copy link
Member

closes #5538

Questions:

  1. I guess it would be better to define count.data.frame instead of count.default() so that we could simplify this:
  # Ensure grouping is transient
  if (is.data.frame(x)) {
    out <- dplyr_reconstruct(out, x)
  }
  out

but then we get:

Error (test-count-tally.r:65:3): works with dbplyr
Error: no applicable method for 'count' applied to an object of class "c('tbl_SQLiteConnection', 'tbl_dbi', 'tbl_sql', 'tbl_lazy', 'tbl')"
Backtrace:
 1. db %>% count(x) %>% as_tibble() test-count-tally.r:65:2
 3. dplyr::count(., x)

so perhaps we can release with .default first, update dbplyr to include a count.tbl_dbi() and then switch it to count.data.frame().

  1. is the signature ok ? function(x, ..., wt = NULL, sort = FALSE, name = NULL, .drop = group_by_drop_default(x)) I'm not sure we can change it because of the early ...

  2. Do we need a generic tally() too ? This seems to suggest it: count and tally methods for grouped tsibbles would be helpful tidyverts/tsibble#232

@davidtedfordholt
Copy link
Contributor

A generic tally() would certainly be helpful, as well.

@romainfrancois
Copy link
Member Author

Same issue with tally can only define tally.default until dbplyr implements methods.

Copy link
Member

@hadley hadley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should also make tally() generic.

Can we use count.data.frame() and also provide count.tbl_sql() that is conditionally registered on load only if dbplyr doesn't have a count method? That might be too complicated, but I think it's worth spending an hour or so investigating.

R/count-tally.R Outdated
@@ -62,7 +62,11 @@
#' df %>% add_count(gender, wt = runs)
#' df %>% add_tally(wt = runs)
count <- function(x, ..., wt = NULL, sort = FALSE, name = NULL, .drop = group_by_drop_default(x)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think .dropshouldn't be in the generic because it only applies to grouped_df? The others seem reasonable, especially when I think about what dbplyr::count() would do.

@romainfrancois
Copy link
Member Author

That seems to do the trick and keep the existing behavior. Should this however have a slightly different implementation for count.tbl_sql so that:

  • it would not have .drop
  • count.data.frame() is simplified (i.e. remove the if(is.data.frame(x)) thing

(I've only tested it with a dbplyr that does not have the methods, and without dbplyr): not tested with a dbplyr that has the methods.

R/zzz.r Outdated Show resolved Hide resolved
Copy link
Member

@hadley hadley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine to me. We can remove the workarounds and similar .data.frame after next dbplyr.

@romainfrancois romainfrancois merged commit 9e29304 into master Dec 7, 2020
@romainfrancois romainfrancois deleted the count_generic branch December 7, 2020 09:10
@romainfrancois romainfrancois added this to the 1.0.3 milestone Dec 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make count generic
3 participants