filter() and conditions of different length #2605
Comments
Yeah, and it's surprising given that the first entry is checked. |
Well, base |
Postponed, because it's better to include this in a bigger release IMO. |
This is the same as
|
This happens on the R side with

    #' @export
    filter.tbl_df <- function(.data, ...) {
      dots <- quos(...)
      if (any(have_name(dots))) {
        bad <- dots[have_name(dots)]
        bad_eq_ops(bad, "must not be named, do you need `==`?")
      } else if (is_empty(dots)) {
        return(.data)
      }
      # All conditions are merged into a single quosure before reaching C++
      quo <- all_exprs(!!!dots, .vectorised = TRUE)
      filter_impl(.data, quo)
    }

Too late to do anything about this on the C++ side, because the expression is legit then. |
We need to change the R -> C++ interface to fix this. |
|
iirc expressions used to be passed down to the C++ side, but the code was consequently more complicated. |
I'd suggest shelving this until after #2311. |
#2311 is done as far as |
I think that's the best way to proceed: pass down quosures and evaluate them one by one. There's room for optimization: If the first predicate is |
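A rough sketch of what "evaluate them one by one" could look like, in base R only. `filter_checked()` is a hypothetical helper written for illustration, not dplyr's implementation, and it uses plain expressions instead of quosures. It assumes a predicate may have length 1 or one value per row, and it names the position of the first predicate that breaks that rule.

```r
# Hypothetical helper, not part of dplyr: check each predicate separately.
filter_checked <- function(data, ...) {
  preds <- eval(substitute(alist(...)))  # capture predicates unevaluated
  n <- nrow(data)
  keep <- rep(TRUE, n)
  for (i in seq_along(preds)) {
    # Evaluate predicate i with the columns of `data` in scope
    res <- eval(preds[[i]], data, parent.frame())
    if (!is.logical(res)) {
      stop(sprintf("Predicate %d must be logical.", i), call. = FALSE)
    }
    # Assumed rule: only length 1 or length nrow(data) is accepted
    if (length(res) != n && length(res) != 1L) {
      stop(
        sprintf("Predicate %d has length %d, expected %d or 1.",
                i, length(res), n),
        call. = FALSE
      )
    }
    keep <- keep & res
  }
  # which() drops rows where the combined condition is NA
  data[which(keep), , drop = FALSE]
}

df <- data.frame(x = 1:6)
ok <- filter_checked(df, x > 2, x %% 2 == 0)
err <- tryCatch(
  filter_checked(df, x > 2, c(TRUE, FALSE)),
  error = function(e) conditionMessage(e)
)
```

Here `ok` keeps the rows where `x` is 4 or 6, and `err` holds the message pointing at the second predicate. Because each result is inspected before it is combined, no expression analysis is needed.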
You can't really do that without analyzing the expressions and this is dangerous. When I'm done with #3526 perhaps we'll have a new hybrid handler for things we typically see in filter, |
I just meant looking at the value after computing the first predicate, not looking at expressions. Why would you want to analyze the expressions? |
The only thing that could be interesting is if all the values are |
As discussed:

    tbl %>%
      filter(pred1(...), lag(col1) < col1)

and

    tbl %>%
      filter(pred1(...)) %>%
      filter(lag(col1) < col1)

are not equivalent. |
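To make the difference concrete, here is a small base R sketch with made-up data. `lag_vec()` is a hypothetical stand-in for `dplyr::lag()`, and `keep` stands in for `pred1(...)`; neither comes from the thread.

```r
# Hypothetical stand-in for dplyr::lag(): shift by one, pad with NA.
lag_vec <- function(x) c(NA, x[-length(x)])

tbl <- data.frame(col1 = c(1, 5, 2, 3))
keep <- tbl$col1 != 5  # plays the role of pred1(...)

# Single filter: the lag is computed over every row of the original table.
one_step <- tbl[which(keep & lag_vec(tbl$col1) < tbl$col1), , drop = FALSE]

# Chained filters: the lag only sees rows that survived the first filter.
step1 <- tbl[keep, , drop = FALSE]
two_step <- step1[which(lag_vec(step1$col1) < step1$col1), , drop = FALSE]
```

With this data `one_step` keeps only the row with `col1 == 3`, while `two_step` keeps the rows with `col1` equal to 2 and 3, because removing the 5 changes what each remaining row is compared against.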
Let's shelve this for now. I think this is probably best seen as a special case of applying the tidyverse recycling rules to all binary operators. Because if we implement with for And then if we apply the tidyverse recycling rules, should we also apply the new tidyverse coercion rules? |
Would it be too pretentious to override `&`?

    `&` <- function(e1, e2) {
      if (length(e1) != length(e2)) {
        # Only length-1 operands may be recycled; `&&` so that a scalar on
        # either side is still accepted.
        if (length(e1) != 1 && length(e2) != 1) stop("Recycling")
      }
      base::`&`(e1, e2)
    }
    TRUE & FALSE
    #> [1] FALSE
    c(TRUE, FALSE) & c(FALSE, TRUE, FALSE)
    #> Error in c(TRUE, FALSE) & c(FALSE, TRUE, FALSE): Recycling

Created on 2018-11-07 by the reprex package (v0.2.1.9000) |
Or can we implement something like |
We have yet to formalize what "tidy recycling" is, it probably would have to depend on |
We should tackle this at the same time as #3937 |
Now tracking at tidyverse/funs#33 — my sense is that it would be better to have some way to activate this globally so you don't end up with different results inside and outside of dplyr functions. |
filter() currently works by appending the conditions to each other using &. This is unsafe: in the following example I'd rather see an error than any result:
Fixing this also allows reporting error positions for offending expressions; look for expect_error in test-filter.r.
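A minimal sketch of the kind of silent recycling described above, in base R and with hypothetical data rather than the example from the report. It only shows how `&` behaves when two conditions have different lengths; it does not call dplyr.

```r
# Hypothetical data, not the example from the original report.
df <- data.frame(x = 1:6)

cond_full  <- df$x > 2        # length 6, one value per row
cond_short <- c(TRUE, FALSE)  # length 2, most likely a mistake

# Base `&` recycles the shorter vector, and because 6 is a multiple of 2
# there is not even a warning.
combined <- cond_full & cond_short

# Rows are kept or dropped according to the recycled pattern.
result <- df[combined, , drop = FALSE]
```

`combined` has length 6 and `result` keeps the rows where `x` is 3 or 5, a result that looks plausible even though the second condition never had one value per row.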