strange issue using pipes and qplot #62
Comments
|
This is indeed weird! In debug mode, evaluating the rhs of that expression works fine, but including the (and using @hadley do you know why this happens? |
|
Here's a minimal example that replicates the behaviour without magrittr: changing the last line to makes it work. I guess in the magrittr case it is a pretty easy fix, but I'm not sure whether it is the best place to do it. Perhaps... |
|
Actually, you don't even need |
|
That fix in

    value <- withVisible(function_list[[k]](value))
    if (value[["visible"]]) value[["value"]] else invisible(value[["value"]])

with

    ans <- withVisible(function_list[[k]](value))
    if (ans[["visible"]]) ans[["value"]] else invisible(ans[["value"]])

unnecessarily allocates the memory twice, which may be a problem for big objects, but what about

    with(withVisible(function_list[[k]](value)),
         if (visible) value else invisible(value))

? Would that have some memory overhead as well? Or some other shady side-effect of using |
|
@casallas why do you think there are two allocations happening there? I can only see one (depending on which branch of the if is taken) |
|
I guess a major drawback in the |
|
@hadley I meant that in the proposed

    ans <- withVisible(function_list[[k]](value))
    if (ans[["visible"]]) ans[["value"]] else invisible(ans[["value"]])

you would have both |
Fixes tidyverse#62
|
OK never mind R is smarter than I thought, the memory doesn't seem to be allocated twice in any of the variants, and they seem to take roughly the same time—at least based on my rudimentary benchmark. Sorry for the noise. |
|
Maybe a recursive approach for freduce is nicer altogether... |
|
A recursive version is a little aesthetically nicer, but marginally slower. Here's an implementation:

    freduce <- function(value, function_list)
    {
      k <- length(function_list)
      if (k == 1L) {
        result <- withVisible(function_list[[1L]](value))
        if (result[["visible"]])
          result[["value"]]
        else
          invisible(result[["value"]])
      } else {
        Recall(function_list[[1L]](value), function_list[-1L])
      }
    }
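To make the visibility handling concrete, here is a minimal sketch that exercises the recursive `freduce` posted above. The stage functions `add_one` and `quiet_double` are hypothetical helpers invented for this illustration, not part of magrittr; `withVisible()` is base R and is used here only to inspect whether the final result would auto-print.

```r
# Recursive freduce as posted in this thread, repeated so the sketch runs on its own.
freduce <- function(value, function_list) {
  k <- length(function_list)
  if (k == 1L) {
    result <- withVisible(function_list[[1L]](value))
    if (result[["visible"]]) result[["value"]] else invisible(result[["value"]])
  } else {
    Recall(function_list[[1L]](value), function_list[-1L])
  }
}

# Hypothetical stages (assumptions for illustration only):
add_one      <- function(x) x + 1            # returns its value visibly
quiet_double <- function(x) invisible(x * 2) # returns its value invisibly

# Capture value and visibility of the whole reduction.
visible_result <- withVisible(freduce(1, list(add_one, add_one)))
hidden_result  <- withVisible(freduce(1, list(add_one, quiet_double)))

# The visibility of the last stage should carry through to the caller.
print(visible_result$visible)
print(hidden_result$visible)
print(hidden_result$value)
```

If the last stage returns invisibly (as a `print` method typically does), the reduction as a whole should stay invisible, which is the behaviour the fix is after.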
|
That one looks great! I didn't see any performance issues, and it also takes care of the case where

    rnorm(1000) %>% qplot %>% print

which wasn't possible with the iterative solution |
|
(FWIW |
|
I don't believe you ;-) |
- Code by @smbache - Fixes tidyverse#62
|
[This was closed some time ago, but my question is really related to the recursive version.] Do you have an estimate or just a guess what this means for memory allocation along the pipe? To elaborate, without promises, the recursive version allocates all Now, we of course have promises, so quite often this does not happen, and simply a promise is pushed through the pipe, and With the for version, there are no promises, but there is no nesting either, so no memory accumulation along the pipe. [@casallas is this what you meant by allocating the memory twice?] |
|
No, I'm not sure how costly it is. Recursion is surely less efficient, but the cost is linear in the number of pipes (and almost always negligible). However, there were examples where the loop version failed for subtle reasons. I'm sure @kmillar and crew will get the recursion overhead down ;) Did you experience trouble with the recursive version? |
|
@smbache No, I haven't seen trouble, so, yes, this is theoretical. I am not worried about the recursion per se, that's fine with me. I am "worried" about having a Think about a typical use case of putting a (largish) data frame into a pipe, and manipulating and transforming it at each step. If you write it without pipes:

    result <- manipulate1(DF)
    result <- manipulate2(result)
    result <- manipulate3(result)
    ...

then there is only one copy of However if you write it with pipes, then, because of the recursion, there will be a separate Of course this is usually not true because of the promises. If there are no side effects, R is just passing

So my question is: how often can we push a promise to the end of the pipe? In other words, I am fairly sure that by introducing side effects, I can make R store the data as many times as there are stages in the pipe, so I can "break" it. But does it really happen in real use cases? Do people have side effects in the pipe stages?

I guess it might also be a good idea to promote pipey code without side effects. E.g. if the manual, readmes, vignettes, etc. have code with side effects, maybe we can reconsider them. |
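The difference between pushing a promise through nested calls and evaluating stage by stage can be observed directly. The sketch below is only an illustration of R's lazy argument evaluation, not a measurement of magrittr's memory use; `make_stage`, `entry_log`, and the three stage functions are hypothetical names introduced here, and the nesting is written out by hand rather than going through `freduce`.

```r
# Environment used as a mutable log of the order in which stage bodies start.
entry_log <- new.env()
entry_log$order <- character()

# Hypothetical stage factory: each stage records its name on entry (a side
# effect that does not touch `x`), and only then uses `x`, which forces the
# pending promise for the previous stage.
make_stage <- function(name) {
  force(name)
  function(x) {
    entry_log$order <- c(entry_log$order, name)
    x + 1
  }
}

stage_a <- make_stage("a")
stage_b <- make_stage("b")
stage_c <- make_stage("c")

# Nested form: the outermost stage is entered first, holding an unevaluated
# promise for its input until `x` is used.
nested_result <- stage_c(stage_b(stage_a(0)))
nested_order  <- entry_log$order

# Loop form: each stage finishes before the next one starts.
entry_log$order <- character()
loop_result <- 0
for (stage in list(stage_a, stage_b, stage_c)) loop_result <- stage(loop_result)
loop_order <- entry_log$order

print(nested_order)  # "c" "b" "a"
print(loop_order)    # "a" "b" "c"
```

In the nested form, the frames of the outer stages are already on the stack while the inner stages compute, which is the situation the question about memory accumulation is concerned with.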
casallas commented Dec 30, 2014
All of the following calls
Yield the following error
Strangely, however, the following works
but that breaks the pipe magic... something else that works is
debug(qplot), then calling str(x) once debugging, which makes me think there may be some sort of lazy evaluation happening (perhaps in ggplot2?) which doesn't play nicely with the . placeholder?
I can confirm that this wasn't an issue in magrittr 1.1.0