Skip to content

n_distinct() on sqlite with more columns returns an error #101

@ghost

Description

@edoardomichielon commented on Jun 20, 2018, 11:10 AM UTC:


When I use a connection to a sqlite on disk, the verb n_distinct() returns an error if there are two or more columns. Same code, if I select just one column or collect data before counting record, it works properly.

# remove objects
rm(list =  ls())

# require packages
require(dplyr)
require(dbplyr)
require(RSQLite)

# create a sqlite db and connect to it
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")

# copy data into sqlite
copy_to(con, mtcars)

# point to table 
my_tbl <- tbl(con, "mtcars")

# n_distinct with one column (THIS WORKS)
my_tbl %>% group_by(gear) %>% summarise(n_distinct(mpg))

# if I use the local (or collected) data it is ok (THIS WORKS)
my_tbl %>% collect() %>% group_by(gear) %>% summarise(n_distinct(mpg, cyl))

# n_distinct with two columns (THIS DOES NOT WORKS)
my_tbl %>% group_by(gear) %>% summarise(n_distinct(mpg, cyl))

## Error in result_create(conn@ptr, statement) : 
## wrong number of arguments to function COUNT()

The code is correctly translated into Sql

# Show query
my_tbl %>% group_by(gear) %>% summarise(n_distinct(mpg, cyl)) %>% show_query()

## SELECT `gear`, COUNT(DISTINCT `mpg`, `cyl`) AS `n_distinct(mpg, cyl)`
## FROM `mtcars`
## GROUP BY `gear`

This issue was moved by batpigandme from tidyverse/dplyr/issues/3687.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugan unexpected problem or unintended behaviordplyr verbs 🤖Translation of dplyr verbs to SQL

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions