n_distinct gives unexpected result for identical lists #5115

Closed
karldw opened this issue Apr 15, 2020 · 1 comment

@karldw
Contributor

karldw commented Apr 15, 2020

I noticed that n_distinct(mylist) gives a different value than length(unique(mylist)) when mylist is a list whose elements are identical. This might be a bug?

I think this is a duplicate of #3699 and therefore also #2355, but those discussions are closed and marked as fixed.

library(dplyr)
(mylist <- c(list(1L), list(1L)))
#> [[1]]
#> [1] 1
#> 
#> [[2]]
#> [1] 1

identical(mylist[1], mylist[2])
#> [1] TRUE
length(unique(mylist))
#> [1] 1
n_distinct(mylist)
#> [1] 2

# The behavior is the same if we're using a list column in a tibble.
tibble(l = mylist) %>% mutate(n = n_distinct(l))
#> # A tibble: 2 x 2
#>   l             n
#>   <list>    <int>
#> 1 <int [1]>     2
#> 2 <int [1]>     2

Using dplyr v0.8.5, rlang 0.4.5, and vctrs 0.2.4.
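
For anyone stuck on an affected version, the length(unique(mylist)) call from the report above can be wrapped as a stopgap. This is only a minimal sketch in base R: count_distinct_list is a hypothetical helper name, not a dplyr function, and it assumes that value-based comparison of list elements is what is wanted.

```r
# Hypothetical helper (not part of dplyr): count the distinct elements of a
# list by value, using base R only.
count_distinct_list <- function(x) {
  length(unique(x))
}

mylist <- c(list(1L), list(1L))
count_distinct_list(mylist)
#> [1] 1

# Elements with different values are still counted separately.
count_distinct_list(list(1L, 1L, "a"))
#> [1] 2
```

The same call should also work inside mutate() on a list column, since it only needs the column as a plain list.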

@romainfrancois
Member

This appears to be fixed in the development version, soon to be released as 1.0.0.

library(dplyr, warn.conflicts = FALSE)
(mylist <- c(list(1L), list(1L)))
#> [[1]]
#> [1] 1
#> 
#> [[2]]
#> [1] 1
identical(mylist[1], mylist[2])
#> [1] TRUE
length(unique(mylist))
#> [1] 1
n_distinct(mylist)
#> [1] 1

# The behavior is the same if we're using a list column in a tibble.
tibble(l = mylist) %>% 
  mutate(n = n_distinct(l))
#> # A tibble: 2 x 2
#>   l             n
#>   <list>    <int>
#> 1 <int [1]>     1
#> 2 <int [1]>     1

Created on 2020-04-16 by the reprex package (v0.3.0)
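
Code that has to run against both old and new dplyr could guard on the installed version. The following is a rough sketch, assuming the fix ships in 1.0.0 as the reply above suggests; n_distinct_compat is a hypothetical wrapper name, not something dplyr provides.

```r
# Hypothetical wrapper (the name is an assumption, not a dplyr function).
# Uses dplyr::n_distinct() only when dplyr >= 1.0.0 is installed, and
# otherwise falls back to the base R count from the original report.
n_distinct_compat <- function(x) {
  has_fixed_dplyr <- requireNamespace("dplyr", quietly = TRUE) &&
    utils::packageVersion("dplyr") >= "1.0.0"
  if (has_fixed_dplyr) {
    dplyr::n_distinct(x)
  } else {
    length(unique(x))
  }
}

n_distinct_compat(c(list(1L), list(1L)))
```

The fallback branch also runs when dplyr is not installed at all, so the wrapper has no hard dependency on it.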
