length(unique(x)) and n_distinct(x) return different answers for some lists #2222

Closed
jgabry opened this issue Oct 31, 2016 · 5 comments

@jgabry

jgabry commented Oct 31, 2016

The documentation for n_distinct says that it's an "equivalent of length(unique(x))"
but it gives a different answer than length(unique(x)) if x is a list with multiple (identical) elements.

library(dplyr)  # I checked versions 0.5.0 and 0.5.0.9000
x <- list(1,1)
length(unique(x)) # result: 1
n_distinct(x) # result: 2

Is this a bug or is there just a missing caveat in the doc?

@krlmlr
Member

krlmlr commented Nov 7, 2016

This is because values in list columns are compared by reference, so n_distinct() treats them as different unless they really point to the same object:

a <- 1
n_distinct(list(a,a))
## [1] 1
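
For comparison, here is a minimal base R sketch of counting by value, which is what length(unique(x)) does for lists. The helper name n_distinct_by_value is hypothetical and not part of dplyr; it only wraps base R and may serve as a workaround for the dplyr 0.5.0 behavior described in this thread.

```r
# Hypothetical helper (not part of dplyr): count distinct list elements
# by value. base::unique() on a list compares element contents, so two
# separately created but equal elements count once.
n_distinct_by_value <- function(x) {
  length(unique(x))
}

x <- list(1, 1)   # two separately created, equal elements
a <- 1
y <- list(a, a)   # the same object used twice

n_distinct_by_value(x)              # 1
n_distinct_by_value(y)              # 1
n_distinct_by_value(list(1, 2, 1))  # 2
```

Unlike the by-reference comparison, this gives the same count whether or not the equal elements share an underlying object.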

Would you like to contribute documentation?

@jgabry
Author

jgabry commented Nov 7, 2016

Thanks for the explanation. That makes sense. I'd be happy to submit a PR
that clarifies this in the doc and adds an example demonstrating. Can
hopefully do this today.

@hadley
Member

hadley commented Jan 31, 2017

I think this is more of a bug than something we need to document

@jgabry
Author

jgabry commented Feb 1, 2017 via email

@hadley
Member

hadley commented Feb 2, 2017

Now part of #2355

@hadley closed this as completed Feb 2, 2017
@lock bot locked as resolved and limited conversation to collaborators Jun 8, 2018