dplyr behaves differently if plyr is loaded #347

Closed
vsbuffalo opened this issue Mar 21, 2014 · 13 comments

Comments


@vsbuffalo vsbuffalo commented Mar 21, 2014

Something I noticed today (after a bit of hair pulling) :-)

> library(dplyr)
> iris %.% group_by(Species) %.% summarize(p=mean(Petal.Length))
Source: local data frame [3 x 2]

     Species     p
1     setosa 1.462
2 versicolor 4.260
3  virginica 5.552
> library(plyr)
> iris %.% group_by(Species) %.% summarize(p=mean(Petal.Length))
      p
1 3.758

My sessionInfo() is below. I'd be happy to take a look at debugging this and supplying a patch if that would help, but it may be something quite simple. Thanks for dplyr, I love it!

> sessionInfo()
R version 3.0.3 (2014-03-06)
Platform: x86_64-apple-darwin13.1.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] plyr_1.8.1     dplyr_0.1.3    devtools_1.4.1

loaded via a namespace (and not attached):
 [1] assertthat_0.1 digest_0.6.4   evaluate_0.5.1 httr_0.2       memoise_0.1
 [6] parallel_3.0.3 Rcpp_0.11.1    RCurl_1.95-4.1 stringr_0.6.2  tools_3.0.3
[11] whisker_0.3-2
Author

@vsbuffalo vsbuffalo commented Mar 21, 2014

Whoops, I just realized my search for issues relating to plyr ignored the first page of issues. This issue (#29) has come up before, but maybe these particular side effects were not known.

Author

@vsbuffalo vsbuffalo commented Mar 21, 2014

Also, this looks like purely a namespace clash between plyr's summarize and dplyr's summarize:

> iris %.% group_by(Species) %.% dplyr:::summarize(p=mean(Petal.Length))

Source: local data frame [3 x 2]

     Species     p
1     setosa 1.462
2 versicolor 4.260
3  virginica 5.552
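
A minimal sketch of the same workaround without the pipe, assuming a dplyr version that exports group_by() and summarise(). Because both functions are exported, `::` is enough here; `:::` is only needed for unexported objects. The variable names are illustrative, and the call is guarded so the snippet does nothing if dplyr is not installed.

```r
# Call dplyr's functions through the namespace, so whatever is attached
# on the search path (plyr included) cannot change which one is used.
if (requireNamespace("dplyr", quietly = TRUE)) {
  by_species  <- dplyr::group_by(iris, Species)
  petal_means <- dplyr::summarise(by_species, p = mean(Petal.Length))
  print(petal_means)  # one row per Species, whether or not plyr is attached
}
```

Qualifying the call costs some typing but makes the result independent of the order in which library() was called.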


@HarlanH HarlanH commented Mar 21, 2014

I think if you load plyr first, and then dplyr, the masking will work the way you want, in that you'll still be able to use plyr's unique functions (rename), but dplyr won't break. This said, shouldn't dplyr specify the namespace internally to prevent this behavior?
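
The load-order effect can be sketched with base R alone. The two lists below are toy stand-ins for the packages (the names `toy:plyr` and `toy:dplyr` are made up for illustration); whatever is attached last sits earlier on the search path and wins the name lookup.

```r
# Two toy "packages" that both define a function called summarize.
toy_plyr  <- list(summarize = function(...) "toy plyr version")
toy_dplyr <- list(summarize = function(...) "toy dplyr version")

# Attach the plyr stand-in first, then the dplyr stand-in.
attach(toy_plyr,  name = "toy:plyr",  warn.conflicts = FALSE)
attach(toy_dplyr, name = "toy:dplyr", warn.conflicts = FALSE)

# The most recently attached entry is searched first.
resolved <- summarize()
print(resolved)

# Clean up the search path.
detach("toy:dplyr", character.only = TRUE)
detach("toy:plyr",  character.only = TRUE)
```

Swapping the two attach() calls makes the other definition win, which is the situation reported at the top of this issue.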

Member

@romainfrancois romainfrancois commented Apr 2, 2014

This really is a duplicate of #29. This is a generic conflict between what is exported from namespaces. If you have this in a package that imports dplyr, that should not be a problem.

Unless you start importing both dplyr and plyr, in which case you probably have to choose what you import from plyr and dplyr.

Maybe we could modify %.% so that it prefers functions from dplyr over functions from plyr but that might complicate things.

Author

@vsbuffalo vsbuffalo commented Apr 30, 2014

I'm sorry to reopen this, but I think this issue needs more attention.

I was aware of this issue, and it just silently bit me in code in which one library I import imports plyr, and my code imports dplyr. I had forgotten that this other package imports plyr, and the functionality of my code broke because of something far upstream from my development. I have a hard time thinking that fixing this — even if fixing this is a load warning — shouldn't be a top priority.

@hadley hadley reopened this May 1, 2014
Member

@hadley hadley commented May 1, 2014

A thought occurred to me: this is hard to fix, but it should be possible to at least warn people about the problem by setting up a package load hook in dplyr. If plyr is loaded after dplyr, issue a warning.
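
A rough sketch of what such a hook could look like, using base R's setHook() and packageEvent(). This is not the code from the commit that closed the issue; the function name and the message text are made up for illustration, and inside a package the registration would normally happen in a startup function rather than at top level.

```r
# Hypothetical hook: warn only when dplyr is already on the search path.
warn_plyr_after_dplyr <- function(...) {
  if ("package:dplyr" %in% search()) {
    warning(
      "plyr was attached after dplyr, so plyr functions such as ",
      "summarize() now mask the dplyr versions. ",
      "Consider attaching plyr first, then dplyr.",
      call. = FALSE
    )
  }
  invisible(NULL)
}

# Register the function to run whenever plyr is attached in this session.
setHook(packageEvent("plyr", "attach"), warn_plyr_after_dplyr)
```

The hook fires on library(plyr), which matches the case described above: a package that merely imports plyr does not attach it and would not trigger the warning.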

@hadley hadley closed this in 912b46b May 1, 2014
Member

@hadley hadley commented May 1, 2014

@vsbuffalo does this warn in the situation you encountered? If not, can you please provide more details?

Author

@vsbuffalo vsbuffalo commented May 1, 2014

It does! Thanks so much Hadley — much appreciated!!


@mattwg mattwg commented Jun 4, 2015

I know this is an old thread, but I still get caught out even with the warning. I silence the library() calls so that they don't spill warnings onto the screen when I run them in an RMarkdown document. When I forget that both packages are loaded, I get the summarise from plyr even when loading dplyr second. It would be very helpful to put all the *_ply() functions into dplyr, since I guess they are what most folks use from plyr, in particular the m_ply() functions. Is there a way to do the equivalent of mdply() in dplyr?
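
When startup messages are silenced, one way to notice the masking is to ask R which search-path entry a name currently resolves to. A minimal sketch using utils::find(); the helper name `resolves_to` is made up for illustration.

```r
# Return the first search-path entry that defines a function with this name,
# or NA if no attached entry defines one.
resolves_to <- function(fn_name) {
  hits <- utils::find(fn_name, mode = "function")
  if (length(hits) == 0) NA_character_ else hits[[1]]
}

# Example with a base function; calling resolves_to("summarise") in a
# session with both packages attached shows which one is being used.
print(resolves_to("mean"))
```

A check like this at the top of a document turns a silent mix-up into something visible without re-enabling the startup messages.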


@svannoy svannoy commented Jul 24, 2015

I've also been getting bit by this, partly for the same reason as mattwg above (trying to keep things quiet in RMarkdown), and also because of the multiple packages that load plyr.

I've decided now to always qualify summarize and mutate with dplyr::summarize and dplyr::mutate. Is there any danger or disadvantage (other than extra typing) to doing this?

Member

@hadley hadley commented Jul 24, 2015

@svannoy it's only a problem when you do library(plyr), not when other packages import plyr functions for their own purposes


@svannoy svannoy commented Jul 24, 2015

Thanks, much easier to get rid of those!




@Alectoria Alectoria commented Aug 17, 2015

+1 for merging plyr and dplyr or renaming their functions. I run multiple scripts in a session that import plyr and/or dplyr and am constantly running into conflicts (warnings are easily overlooked when scripts are processed). I have started to explicitly detach both of them at the end of a script, but it would be nicer to have one conflict-free solution.
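
The detach-at-the-end approach could be wrapped in a small helper so that it is safe to call whether or not a package is attached. A minimal sketch in base R; the helper name `detach_if_attached` is made up for illustration.

```r
# Detach a package from the search path if it is currently attached.
# Returns TRUE if something was detached, FALSE otherwise.
detach_if_attached <- function(pkg) {
  entry <- paste0("package:", pkg)
  if (entry %in% search()) {
    detach(entry, character.only = TRUE)
    TRUE
  } else {
    FALSE
  }
}

# At the end of a script:
detach_if_attached("plyr")
detach_if_attached("dplyr")
```

This only removes the packages from the search path; their namespaces stay loaded, so other packages that import them keep working.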

krlmlr pushed a commit to krlmlr/dplyr that referenced this issue Mar 2, 2016
jl-costa added a commit to jl-costa/error_bars2 that referenced this issue Mar 19, 2017
- The function has been renamed to error_bars2() to avoid any confusion

- Rewrote the section on creating df.summary. Instead of a pipe operator with data.frame(), I'm using the data.table package for faster processing. 

- Added a check at the beginning to verify that the two df columns are labeled correctly and are in the correct order. With the previous version I was being allowed to enter a df with this structure: (y,x), but the function was hardcoded to take df[, 1] as the x column. 

- Related to the previous point, I modified the 'x' object assignment by making it take the column labeled 'x', and not the column with index 1. This shouldn't be necessary with the additional check that I just outlined in the previous point, but it still makes for more robust code IMO.

- There are known issues when mixing plyr and dplyr (example: tidyverse/dplyr#347), and I noticed that with the count() function when assigning df.summary$ynum. I added dplyr:: to ensure that the correct package is used to call that function. Otherwise the $n column won't be found, as count() from dplyr behaves differently and returns a different output.
@lock lock bot locked as resolved and limited conversation to collaborators Jun 9, 2018
7 participants