case_when formula not working as expected #2927

Closed
gorkang opened this issue Jun 28, 2017 · 1 comment

gorkang commented Jun 28, 2017

When case_when is given a condition such as Age == 25 & Brochure == "New", it does not handle NA values the way filter(Age == 25 & Brochure == "New") does.

This is not expected (at least I didn't expect it) and can cause problems when there are NAs and the replacement value is an arithmetic operation.

A minimal reprex follows.

library(tidyverse)

Age = c(25, 25, NA)
Brochure = c("New", "Old", "New")
PPV_Cond1 = c(NA, 1, NA)
PPV_Cond2 = c(1, NA, 1)
DF = data.frame(Age, Brochure, PPV_Cond1, PPV_Cond2)
DF
#>   Age Brochure PPV_Cond1 PPV_Cond2
#> 1  25      New        NA         1
#> 2  25      Old         1        NA
#> 3  NA      New        NA         1

# One row with Brochure == "New" has NA in Age. If we filter using Age == 25 & Brochure == "New",
# that row is dropped, as expected
DF %>% filter(Age == 25 & Brochure == "New")
#>   Age Brochure PPV_Cond1 PPV_Cond2
#> 1  25      New        NA         1

# With the same condition inside case_when, the NA in Age raises an error
DF %>% mutate(Variable =
                case_when(Age == 25 & Brochure == "Old" ~ PPV_Cond1 - 50,
                          Age == 25 & Brochure == "New" ~ PPV_Cond2 - 50))
#> Error in mutate_impl(.data, dots): Evaluation error: NAs are not allowed in subscripted assignments.

# If we add !is.na(Age) to the second case_when condition, everything works as expected
DF %>% mutate(Variable =
                case_when(Age == 25 & Brochure == "Old" ~ PPV_Cond1 - 50,
                          !is.na(Age) & Age == 25 & Brochure == "New" ~ PPV_Cond2 - 50))
#>   Age Brochure PPV_Cond1 PPV_Cond2 Variable
#> 1  25      New        NA         1      -49
#> 2  25      Old         1        NA      -49
#> 3  NA      New        NA         1       NA
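
For readers wondering why only the second condition needs the guard, here is a small base R sketch of the two condition vectors from the reprex. The names cond_new and cond_old are made up for illustration, and which() is used only as a stand-in for the "keep rows where the condition is TRUE" behaviour that filter shows above.

```r
# Same data as in the reprex
Age      <- c(25, 25, NA)
Brochure <- c("New", "Old", "New")

# NA & TRUE is NA, so the third element stays NA
cond_new <- Age == 25 & Brochure == "New"   # TRUE FALSE NA

# NA & FALSE is FALSE, so this condition contains no NA at all
cond_old <- Age == 25 & Brochure == "Old"   # FALSE TRUE FALSE

# Keeping only the positions that are TRUE silently drops the NA position
which(cond_new)                             # 1
```

So the "Old" condition never produces an NA for this data, while the "New" condition does, which would explain why guarding only the second formula is enough here.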

krlmlr (Member) commented Jul 12, 2017

Thanks, confirmed. Simpler reprex:

dplyr::case_when(c(1:3, NA) == 1 ~ 1:4)
#> Error in x[i] <- val[i]: NAs are not allowed in subscripted assignments
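
The error message names an assignment of the form x[i] <- val[i]. The sketch below reproduces that in base R only, on the assumption that the result vector is filled this way with the logical condition as the index (this is inferred from the message, not checked against the dplyr source). The names cond, val, x, res and keep are illustrative.

```r
cond <- c(1:3, NA) == 1          # TRUE FALSE FALSE NA
val  <- 1:4
x    <- rep(NA_integer_, 4)

# val[cond] has length 2 (1, NA); assigning more than one value through
# an index that contains NA is rejected by base R
res <- tryCatch({
  x[cond] <- val[cond]
  "ok"
}, error = function(e) conditionMessage(e))
res

# Possible workaround: treat NA in the condition as FALSE before indexing
keep <- !is.na(cond) & cond      # TRUE FALSE FALSE FALSE
x[keep] <- val[keep]
x                                # 1 NA NA NA
```

This is the same idea as adding !is.na(Age) to the condition in the original report, applied to the index vector directly.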

krlmlr added this to the 0.7.3 milestone Aug 16, 2017
krlmlr closed this in #3046 Aug 23, 2017
krlmlr added a commit to krlmlr/dplyr that referenced this issue Aug 23, 2017
lock bot locked as resolved and limited conversation to collaborators Jun 7, 2018