group_by fails with numerical variables with NA's #401

Closed
emiliotorres opened this issue Apr 20, 2014 · 1 comment

@emiliotorres emiliotorres commented Apr 20, 2014

Dear Sir,
There appears to be a bug in group_by() when the grouping variable is numeric and contains several missing values: each NA is placed in its own group instead of all NAs being collected into a single group.
Best regards,
Emilio

library(dplyr)
packageVersion("dplyr") # 0.1.3.0.99

x <- as.numeric(c(NA,NA,NA,10:1,10:1))
w <- c(20,30,40,1:10,1:10)*10

n_distinct(x) # 11 OK
data.frame(x=x,w=w) %>% group_by(x) %>% summarise(n=n()) # Wrong: 13 groups, NA appears three times with n = 1 each

## Source: local data frame [13 x 2]

##     x n
## 1   1 2
## 2   2 2
## 3   3 2
## 4   4 2
## 5   5 2
## 6   6 2
## 7   7 2
## 8   8 2
## 9   9 2
## 10 10 2
## 11 NA 1
## 12 NA 1
## 13 NA 1
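
For comparison, here is a minimal base-R sketch of the result one would expect from the grouping above: 11 groups, with a single NA group of size 3. It uses only `table()` and does not depend on dplyr; the names `counts` and `expected` are illustrative and are not part of the original report.

```r
# Same input vector as in the report
x <- as.numeric(c(NA, NA, NA, 10:1, 10:1))

# Count occurrences of each distinct value, keeping NA as its own level
counts <- table(x, useNA = "ifany")

# Reshape into the two-column layout that summarise(n = n()) would produce
# (as.numeric() turns the NA name back into a numeric NA)
expected <- data.frame(
  x = as.numeric(names(counts)),
  n = as.vector(counts)
)

nrow(expected)                   # 11, matching n_distinct(x)
expected$n[is.na(expected$x)]    # 3: all missing values fall into one group
```

Comparing `expected` against the `group_by()` output makes the discrepancy explicit: the non-missing groups agree, and only the NA rows differ.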
@hadley hadley added this to the v0.2 milestone Apr 23, 2014
@hadley hadley added the bug label Apr 23, 2014

@hadley hadley commented Apr 23, 2014

@romainfrancois can you take a look please?

@lock lock bot locked as resolved and limited conversation to collaborators Jun 10, 2018