n_distinct() on sqlite with more columns returns an error #101

Closed
move bot opened this issue Jun 20, 2018 · 1 comment
@move move bot commented Jun 20, 2018

@edoardomichielon commented on Jun 20, 2018, 11:10 AM UTC:


When I use a connection to an SQLite database on disk, n_distinct() returns an error if it is given two or more columns. The same code works properly if I pass just one column or collect the data before counting records.

# remove objects
rm(list =  ls())

# require packages
require(dplyr)
require(dbplyr)
require(RSQLite)

# create a sqlite db and connect to it
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")

# copy data into sqlite
copy_to(con, mtcars)

# point to table 
my_tbl <- tbl(con, "mtcars")

# n_distinct with one column (THIS WORKS)
my_tbl %>% group_by(gear) %>% summarise(n_distinct(mpg))

# if I use the local (or collected) data it is ok (THIS WORKS)
my_tbl %>% collect() %>% group_by(gear) %>% summarise(n_distinct(mpg, cyl))

# n_distinct with two columns (THIS DOES NOT WORK)
my_tbl %>% group_by(gear) %>% summarise(n_distinct(mpg, cyl))

## Error in result_create(conn@ptr, statement) : 
## wrong number of arguments to function COUNT()

The code is translated into SQL as written, but SQLite does not accept COUNT(DISTINCT ...) with more than one column:

# Show query
my_tbl %>% group_by(gear) %>% summarise(n_distinct(mpg, cyl)) %>% show_query()

## SELECT `gear`, COUNT(DISTINCT `mpg`, `cyl`) AS `n_distinct(mpg, cyl)`
## FROM `mtcars`
## GROUP BY `gear`

This issue was moved by batpigandme from tidyverse/dplyr/issues/3687.

@hadley hadley commented Jan 2, 2019

Minimal reprex:

library(dplyr, warn.conflicts = FALSE)

mf <- dbplyr::memdb_frame(x = c(1, 1), y = c(2, 2), z = 1:2)
mf %>% group_by(x) %>% summarise(n_distinct(y, z))
#> Error in result_create(conn@ptr, statement): wrong number of arguments to function COUNT()

Created on 2019-01-02 by the reprex package (v0.2.1)
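
Until a fix is available, one possible workaround is to deduplicate the column combinations first and then count rows per group, so that no multi-column COUNT(DISTINCT ...) is generated. The sketch below reuses the data from the reprex above; the names `workaround` and `local` are only illustrative, and it assumes there are no missing values in the counted columns (NULL handling can differ between SQL and local n_distinct()). It is not necessarily how the package itself resolved the issue.

```r
library(dplyr, warn.conflicts = FALSE)

# Same data as the minimal reprex
mf <- dbplyr::memdb_frame(x = c(1, 1), y = c(2, 2), z = 1:2)

# Deduplicate the (x, y, z) combinations in the database, then count the
# remaining rows per group. This avoids COUNT(DISTINCT y, z) entirely.
workaround <- mf %>%
  distinct(x, y, z) %>%
  group_by(x) %>%
  summarise(n = n()) %>%
  collect()

# Reference result computed locally after collect()
local <- mf %>%
  collect() %>%
  group_by(x) %>%
  summarise(n = n_distinct(y, z))
```

Both `workaround` and `local` should report the same count for each value of `x`. Appending show_query() instead of collect() to the first pipeline shows the SQL that is sent to SQLite.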

@hadley hadley added this to the v1.4.0 milestone Jan 10, 2019
@hadley hadley closed this in 1b58b68 Jan 10, 2019