filter() does not accept scalar values as logical expression #217

Closed
holgerbrandl opened this Issue Jan 27, 2014 · 7 comments

holgerbrandl commented Jan 27, 2014

Example

filter(mtcars, T)

gives an error
Error: unknown column : T

Or a slightly different error for:

filter(mtcars, TRUE)

Error: incompatible expression in filter

What I actually want to do is filter whole groups using a scalar logical expression per group. The corresponding ddply code would be

ddply(mtcars, .(cyl), function(x) subset(x, min(mpg)>20))

which keeps only those cylinder-count groups in which every car gets more than 20 mpg.

subset() also allows scalar expressions, but it cannot apply them per group.

Currently I have to force a vectorized logical expression to convince filter() to work

filter(group_by(mtcars, cyl),  min(mpg)>rep(20, length(mpg)))

but it looks odd.

Member

romainfrancois commented Jan 27, 2014

I currently get:

> filter(group_by(mtcars, cyl),  min(mpg)>20 )
Error: incorrect length (1), expecting: 14

@hadley should this give the same as

>  filter(group_by(mtcars, cyl),  rep( min(mpg)>20, length(mpg) ) )
Source: local data frame [11 x 11]
Groups: cyl

    mpg cyl  disp  hp drat    wt  qsec vs am gear carb
1  22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
2  24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
3  22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
4  32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
5  30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
6  33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
7  21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
8  27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
9  26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
10 30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
11 21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2

Member

hadley commented Jan 27, 2014

@romainfrancois yes, if the output is length one, it should apply to the entire group.
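
The rule described here (a length-one result applies to the entire group) can be sketched in base R. This is only an illustration of the intended semantics, not dplyr's implementation; `keep_groups()` is a hypothetical helper name invented for this example.

```r
# Base R sketch of "a length-one result applies to the whole group".
# keep_groups() is a hypothetical helper, not part of dplyr.
keep_groups <- function(df, group, predicate) {
  pieces <- split(df, df[[group]])
  kept <- lapply(pieces, function(piece) {
    keep <- predicate(piece)
    # Recycle a scalar TRUE/FALSE to one value per row of the group.
    if (length(keep) == 1L) keep <- rep(keep, nrow(piece))
    stopifnot(length(keep) == nrow(piece))
    piece[keep, , drop = FALSE]
  })
  # unname() keeps the original row names instead of prefixing the group key.
  do.call(rbind, unname(kept))
}

result <- keep_groups(mtcars, "cyl", function(g) min(g$mpg) > 20)
nrow(result)        # 11
unique(result$cyl)  # 4
```

Only the 4-cylinder group has a minimum mpg above 20, so the result matches the 11 rows shown in the rep() workaround above.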

ghost assigned romainfrancois Jan 27, 2014

Member

romainfrancois commented Jan 27, 2014

Thanks. I'll pick it up then.

Member

romainfrancois commented Jan 27, 2014

It is mostly there. However, we have this test case:

test_that("filter gives useful error message when given incorrect input", {
  expect_error( filter(tbl_df(mtcars), x ), "unknown column" )
  expect_error( filter(tbl_df(mtcars), TRUE), "incompatible expression in filter" )
})

which sort of suggests that filter(., TRUE) is invalid. Is it?

Member

hadley commented Jan 27, 2014

That seems like it should be ok to me. Given the spacing around the parens, I think you probably wrote those tests ;)

holgerbrandl commented Jan 27, 2014

Thanks a ton. Amazing how fast you did it. I've just installed it via devtools and filter(mtcars, TRUE) and my actual use-cases work nicely now. However (just fyi), filter(mtcars, T) still dies with an "Error: unknown column : T". T seems to be different from TRUE in filter().

Thanks for the quick fix again.

romainfrancois added a commit that referenced this issue Jan 28, 2014

Member

romainfrancois commented Jan 28, 2014

I've added some more special casing for this. You should now be able to do filter(mtcars, T).
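
For readers wondering why T and TRUE behaved differently: in R, TRUE is a reserved logical constant, while T is an ordinary variable that is bound to TRUE by default. In an unevaluated expression, T is therefore a symbol, which is plausibly why filter() tried to look it up as a column name. A minimal base R illustration (the variable names `expr_T` and `expr_TRUE` are just for this example):

```r
# T is a symbol that happens to hold TRUE; TRUE is a logical constant.
expr_T    <- quote(T)
expr_TRUE <- quote(TRUE)

is.name(expr_T)                # TRUE: a symbol, so it looks like a column reference
is.logical(expr_TRUE)          # TRUE: already a logical value, nothing to look up
identical(eval(expr_T), TRUE)  # TRUE once the symbol is evaluated (unless T was reassigned)
```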

lock bot locked as resolved and limited conversation to collaborators Jun 11, 2018
