Type error when all values in group are NA when using group_by + mutate + ifelse #1463

Closed
andrewheiss opened this issue Oct 20, 2015 · 3 comments

@andrewheiss andrewheiss commented Oct 20, 2015

This is probably related to a bunch of other issues (#958, #1432, #1381, for example) that produce "incompatible types" errors. When a mutate(x = ifelse(...)) statement runs a logical test after a group has been defined, and every value of the tested variable in one of the groups is NA, dplyr complains about variable types. This does not happen outside of groups:

example.df <- data_frame(group_id = rep(1:5, each=10),
                         year = rep(2001:2010, times=5),
                         x1 = rep(c(rnorm(9), NA), times=5)) %>%
  bind_rows(data_frame(group_id = 6, year = 2001:2010, x1 = NA))

# This works:
example.df %>%
  mutate(x2 = ifelse(x1 > 1, 1, 0))

# This doesn't:
example.df %>%
  group_by(group_id) %>%
  mutate(x2 = ifelse(x1 > 1, 1, 0))
# Yields: "Error: incompatible types, expecting a numeric vector"

The best workaround for now is to either avoid mutate(ifelse()) calls inside groups, or to explicitly wrap ifelse statements with as.numeric, like so:

# This works:
example.df %>%
  group_by(group_id) %>%
  mutate(x2 = as.numeric(ifelse(x1 > 1, 1, 0)))

@hadley hadley commented Oct 20, 2015

Minimal reprex:

library(dplyr)

df <- data_frame(
  x = c(1, 2),
  y = c(1, NA)
)

df %>% 
  group_by(x) %>%
  mutate(z = ifelse(y > 1, 1, 2))

The root of the problem is this:

typeof(ifelse(NA, 1, 2))
# logical

when it should be double (numeric), since both 1 and 2 are doubles.
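A quick base R check, as a sketch of why only the grouped call fails: `ifelse()` starts from the logical test vector and only picks up the type of `yes` or `no` at positions where the test is TRUE or FALSE. A chunk whose test is entirely NA therefore stays logical, while other chunks come back as double. The names `all_na` and `mixed` are illustrative only.

```r
# Test is entirely NA: neither `yes` nor `no` is ever filled in.
all_na <- ifelse(NA > 1, 1, 2)

# At least one FALSE: the result takes on the type of `no`.
mixed <- ifelse(c(0, NA) > 1, 1, 2)

typeof(all_na)  # "logical"
typeof(mixed)   # "double"
```

Evaluated group by group, the first result disagrees in type with the second, which matches the "incompatible types" error reported above.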

@hadley hadley added this to the 0.5 milestone Oct 21, 2015

@hadley hadley commented Oct 21, 2015

@romainfrancois one simple fix for this would be to follow the same coercion rules as bind_rows(), i.e. NA is always silently converted up. We should look at this as part of #594
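To illustrate the idea of converting NA "up", here is a minimal base R sketch. It is not dplyr's implementation: `promote_all_na` is a hypothetical helper that treats an all-NA logical chunk as untyped and casts it to the storage type of the first typed chunk.

```r
# Hypothetical helper, not part of dplyr.
promote_all_na <- function(chunks) {
  untyped <- vapply(chunks,
                    function(x) is.logical(x) && all(is.na(x)),
                    logical(1))
  typed <- chunks[!untyped]
  if (length(typed) == 0) return(chunks)  # nothing to promote to
  target <- typeof(typed[[1]])
  lapply(seq_along(chunks), function(i) {
    x <- chunks[[i]]
    if (untyped[i]) storage.mode(x) <- target  # e.g. logical NA -> NA_real_
    x
  })
}

# Per-group results as in the reprex: one double chunk, one all-NA chunk.
chunks <- list(ifelse(c(2, 0) > 1, 1, 2), ifelse(NA > 1, 1, 2))
promoted <- promote_all_na(chunks)
vapply(promoted, typeof, character(1))
```

After promotion both chunks share a type, so they could be combined into a single column without a type conflict.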

@jerryfuyu0104 jerryfuyu0104 commented Jul 4, 2017

I'm using dplyr 0.5.0. The following code, which builds the data with data.frame, yields "Error: incompatible types, expecting a numeric vector":
dat <- data.frame(a = c("c", "c", "c", "d", "d"), b = c("c", "c", "c", "d", "d"))
dat %>% group_by(a) %>% mutate(is_c = ifelse(b == "c", a, "a"))
The same code works correctly when the data is built with data_frame:
dat <- data_frame(a = c("c", "c", "c", "d", "d"), b = c("c", "c", "c", "d", "d"))
dat %>% group_by(a) %>% mutate(is_c = ifelse(b == "c", a, "a"))
I'm confused about this.
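A likely explanation, offered as an assumption rather than a confirmed diagnosis: at the time, data.frame() converted strings to factors by default, whereas data_frame() keeps them as character. With a factor `a`, `ifelse()` returns the integer factor codes for a group where the test is all TRUE and a character vector for a group where it is all FALSE, so the per-group types differ. The base R sketch below mimics the two groups by hand.

```r
# Factor columns, as data.frame() produced with stringsAsFactors = TRUE.
a <- factor(c("c", "c", "c", "d", "d"))
b <- a

# Group "c": test is all TRUE, so the result holds the factor's integer codes.
type_c <- typeof(ifelse(b[1:3] == "c", a[1:3], "a"))

# Group "d": test is all FALSE, so the result holds the character "a".
type_d <- typeof(ifelse(b[4:5] == "c", a[4:5], "a"))

c(type_c, type_d)  # "integer" "character"
```

If that is the cause, creating the data frame with stringsAsFactors = FALSE, or wrapping `a` in as.character() inside the ifelse() call, should give both groups the same type.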

@lock lock bot locked as resolved and limited conversation to collaborators Jun 7, 2018