dplyr redefining summarise changes behavior of plyr #645

Closed
deaneckles opened this issue Sep 29, 2014 · 9 comments

@deaneckles

References to just-created variables in summarise no longer work in some cases, presumably because of non-standard evaluation.

In particular, run the following code with either just library(plyr) or library(plyr); library(dplyr).
The former works fine; the latter fails with "Error: object 'r' not found".

tmp <- expand.grid(a = 1:3, b = 0:1, i = 1:10)
ddply(
  tmp, .(a), summarise, 
  r = sum(b),
  n = length(b),
  p = prop.test(r, n, p = 0.05)$conf.int[1]
  )

This seems to be caused by the index [1]. Without this (and replacing $conf.int with $p.value, so that there is only one value), this works fine with both sets of libraries.
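For reference, here is a minimal base R sketch of what the working case computes, with no dependency on plyr or dplyr. It assumes the intent is one row per value of a, with r and n computed first and then reused for p. The name expected is only illustrative, and suppressWarnings is there because prop.test may warn about the chi-squared approximation when expected counts are small.

```r
tmp <- expand.grid(a = 1:3, b = 0:1, i = 1:10)

# Compute r, n and p sequentially within each group of `a`,
# so that `p` can refer to the just-created `r` and `n`.
expected <- do.call(rbind, lapply(split(tmp, tmp$a), function(d) {
  r <- sum(d$b)
  n <- length(d$b)
  p <- suppressWarnings(prop.test(r, n, p = 0.05)$conf.int[1])
  data.frame(a = d$a[1], r = r, n = n, p = p)
}))
rownames(expected) <- NULL

print(expected)
```

Comparing a plyr or dplyr result against a table like this is one way to confirm that a given combination of loaded packages still produces the intended summary.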

@hadley
Member

hadley commented Sep 29, 2014

Do you have both plyr and dplyr loaded? Attempting to use them in this way is likely to lead to problems.

@deaneckles
Author

Yes, this happens when both are loaded (in the suggested order). Because of #29, I figured that making them compatible is a goal.

I guess the best solution to this problem is to use plyr::summarise explicitly.
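A minimal sketch of that workaround, assuming plyr is installed. Both functions are called with an explicit plyr:: prefix, so the result should not depend on whether dplyr is attached afterwards and masks summarise. The name res is only illustrative, and suppressWarnings is there because prop.test may warn about the chi-squared approximation.

```r
tmp <- expand.grid(a = 1:3, b = 0:1, i = 1:10)

# Explicit namespaces: plyr's summarise is used even if dplyr is
# attached later and masks the unqualified name `summarise`.
res <- suppressWarnings(
  plyr::ddply(tmp, "a", plyr::summarise,
    r = sum(b),
    n = length(b),
    p = prop.test(r, n, p = 0.05)$conf.int[1]
  )
)

print(res)
```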

@hadley
Member

hadley commented Sep 29, 2014

Ah, got it - I'll take a look.

@hadley hadley added the bug an unexpected problem or unintended behavior label Sep 30, 2014
@hadley hadley added this to the 0.3 milestone Sep 30, 2014
@hadley hadley self-assigned this Sep 30, 2014
@hadley
Member

hadley commented Sep 30, 2014

Ok, a slightly improved reproducible example:

tmp <- expand.grid(a = 1:3, b = 0:1, i = 1:10)

# FAILS
plyr::ddply(tmp, "a", dplyr::summarise, 
  r = sum(b),
  n = length(b),
  p = prop.test(r, n, p = 0.05)$conf.int[1]
)

# WORKS
plyr::ddply(tmp, "a", dplyr::summarise, 
  r = sum(b),
  n = length(b),
  p = prop.test(r, n, p = 0.05)$p.value
)

@hadley
Member

hadley commented Sep 30, 2014

Ok, turns out this is unrelated to plyr:

library(dplyr)

# Same data as in the earlier examples
tmp <- expand.grid(a = 1:3, b = 0:1, i = 1:10)

# FAILS
tmp %>% group_by(a) %>%
  summarise(
    r = sum(b),
    n = length(b),
    p = prop.test(r, n, p = 0.05)$conf.int[1]
  )

# WORKS
tmp %>% group_by(a) %>%
  summarise(
    r = sum(b),
    n = length(b),
    p = prop.test(r, n, p = 0.05)$p.value
  )

@hadley
Member

hadley commented Sep 30, 2014

@deaneckles did this work in dplyr 0.2?

@hadley hadley assigned romainfrancois and unassigned hadley Sep 30, 2014
@hadley
Member

hadley commented Sep 30, 2014

@romainfrancois can you please take a look? Looks like a hybrid eval problem.

@romainfrancois
Member

Sure. I think it is more or less the same problem as in #421

@romainfrancois
Member

I think I fixed that class of problems: expressions of the general form foo(bar = yada)$bah(bling = 2), i.e. calls appearing on either the left or the right of $.

The previous implementation was trying to do too much too early.
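To make "calls appearing either on the left or right of $" concrete, the following base R sketch inspects how R itself parses such expressions. It only illustrates the shape of the language objects involved. It does not reproduce dplyr's hybrid evaluator, and the variable names are illustrative.

```r
# The failing expression from the thread, captured unevaluated.
expr <- quote(prop.test(r, n, p = 0.05)$conf.int[1])

# The outermost call is `[`; its first argument is the `$` call.
outer_fn <- as.character(expr[[1]])
dollar_call <- expr[[2]]
dollar_fn <- as.character(dollar_call[[1]])

# The left-hand side of `$` is itself a call (to prop.test),
# not a plain column name.
lhs <- dollar_call[[2]]
lhs_is_call <- is.call(lhs)
lhs_fn <- as.character(lhs[[1]])

# The general form from the comment above: here the function position
# of the outer call is itself a `$` call.
general <- quote(foo(bar = yada)$bah(bling = 2))
general_head <- general[[1]]
head_fn <- as.character(general_head[[1]])

print(c(outer_fn, dollar_fn, lhs_fn, head_fn))
```

Any code that walks such an expression to find column references has to recurse through these nested calls before deciding what r and n refer to.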

@lock lock bot locked as resolved and limited conversation to collaborators Jun 10, 2018