@edoardomichielon commented on Jun 20, 2018, 11:10 AM UTC:
When I use a connection to a sqlite on disk, the verb n_distinct() returns an error if there are two or more columns. Same code, if I select just one column or collect data before counting record, it works properly.
# remove objects
rm(list = ls())
# require packages
require(dplyr)
require(dbplyr)
require(RSQLite)
# create a sqlite db and connect to it
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
# copy data into sqlite
copy_to(con, mtcars)
# point to table
my_tbl <- tbl(con, "mtcars")
# n_distinct with one column (THIS WORKS)
my_tbl %>% group_by(gear) %>% summarise(n_distinct(mpg))
# if I use the local (or collected) data it is ok (THIS WORKS)
my_tbl %>% collect() %>% group_by(gear) %>% summarise(n_distinct(mpg, cyl))
# n_distinct with two columns (THIS DOES NOT WORKS)
my_tbl %>% group_by(gear) %>% summarise(n_distinct(mpg, cyl))
## Error in result_create(conn@ptr, statement) :
## wrong number of arguments to function COUNT()
The code is correctly translated into Sql
# Show query
my_tbl %>% group_by(gear) %>% summarise(n_distinct(mpg, cyl)) %>% show_query()
## SELECT `gear`, COUNT(DISTINCT `mpg`, `cyl`) AS `n_distinct(mpg, cyl)`
## FROM `mtcars`
## GROUP BY `gear`
This issue was moved by batpigandme from tidyverse/dplyr/issues/3687.
@edoardomichielon commented on Jun 20, 2018, 11:10 AM UTC:
When I use a connection to a sqlite on disk, the verb n_distinct() returns an error if there are two or more columns. Same code, if I select just one column or collect data before counting record, it works properly.
The code is correctly translated into Sql
This issue was moved by batpigandme from tidyverse/dplyr/issues/3687.