Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

if_else throws false must be a vector, not NULL. error for dplyr 1.1.0 #6730

Closed
chrarnold opened this issue Feb 16, 2023 · 2 comments
Closed

Comments

@chrarnold
Copy link

I posted an issue yesterday (#6728) but it was closed immediately after the first reply, which I felt was unjustified because my response and argument is not being replied to anymore because the issue is closed. I therefore would like to reopen it here again for a discussion, and I am hoping for a more fruitful solution.

Between dplyr 1.0.0 and 1.1.0, an undocumented (unless I oversaw it) and breaking change has been included into if_else that I find not very useful personally and which forced me and others to go back to base R ifelse, and I wanted to learn what was the motivation for it. In essence, it seems "The true and false arguments are always evaluated, regardless of whether their values are seen in the result, which causes an error in dplyr 1.1.0 when NULL is involved. I'd appreciate a discussion why this is a useful change, because it worked differently and "as expected" until dplyr 1.0.0. Thank you

@DavisVaughan
Copy link
Member

if_else() was fairly broken / unstable with NULL inputs in 1.0.10

dplyr::if_else(TRUE, NULL, NULL)
#> NULL

dplyr::if_else(TRUE, 1, NULL)
#> [1] 1

dplyr::if_else(TRUE, NULL, 1)
#> Error in `dplyr::if_else()`:
#> ! `false` must be NULL, not a double vector.

In 1.1.0, these all error consistently:

dplyr::if_else(TRUE, NULL, NULL)
#> Error in `dplyr::if_else()`:
#> ! `true` must be a vector, not `NULL`.

dplyr::if_else(TRUE, 1, NULL)
#> Error in `dplyr::if_else()`:
#> ! `false` must be a vector, not `NULL`.

dplyr::if_else(TRUE, NULL, 1)
#> Error in `dplyr::if_else()`:
#> ! `true` must be a vector, not `NULL`.

I don't think we ever intended for true or false to be NULL. We definitely never documented them as such, and didn't have any tests for it. Here is the documentation of true and false for 1.0.10, which I think is fairly clear about the fact that they should be vectors of the same type:

Values to use for TRUE and FALSE values of condition. They must be either the same length as condition, or length 1. They must also be the same type: if_else() checks that they have the same type and same class. All other attributes are taken from true.

Also note that ifelse() only works in your case because you only have TRUE values. If you happen to hit a FALSE value, you'd get an error

# Works
ifelse(TRUE, 1, NULL)
#> [1] 1

# Doesn't work
ifelse(FALSE, 1, NULL)
#> Warning in rep(no, length.out = len): 'x' is NULL so the result will be NULL
#> Error in ans[npos] <- rep(no, length.out = len)[npos]: replacement has length zero

ifelse(c(TRUE, FALSE), 1, NULL)
#> Warning in rep(no, length.out = len): 'x' is NULL so the result will be NULL
#> Error in ans[npos] <- rep(no, length.out = len)[npos]: replacement has length zero

This makes ifelse()'s output stability dependent on condition, which is very painful from a programmatic perspective (i.e. it works sometimes but not others with inputs that look nearly identical). We avoid this in if_else() by doing up front checks of vectorness, types, and sizes, at the cost of laziness.

I've added a retrospective NEWS bullet about NULL inputs to if_else() here 0db2cb2, and I tweaked the if_else() docs to explicitly mention vector inputs here 68ae922.

I think all of this together is sufficient justification for this change, but I appreciate that you followed up to get more clarification

@chrarnold
Copy link
Author

Thanks a lot for the detailed reply, this is what I tried to achieve - a discussion on why, things are more clear now, and I understand the reasoning of why things have been changed. In the contexts I used it so far, even when NULL was involved, I didnt come across all these edge cases, it basically worked as I intuitively expected, therefore my surprise now with the new(er) release.

Thanks for your effort and retrospective change, I think this will be useful in the future for those who wonder as well.
Cheers,
Christian

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants