Add the syntax only for side effect #30
Comments
A practical case is setting the plot partition before plotting a linear model:
mtcars %>>%
  (~ par(mfrow=c(2,2))) %>>% # only for side effect (par() returns the arg list)
  (lm(mpg ~ cyl + wt, data = .)) %>>%
  plot()
|
Other examples:
mtcars %>>%
  (~ par(mfrow=c(1,2))) %>>%
  (~ plot(mpg ~ cyl, data = .)) %>>%
  (~ plot(mpg ~ wt, data = .)) %>>%
  (lm(mpg ~ cyl + wt, data = .)) %>>%
  summary() %>>%
  (coefficients)
Pipe(mtcars)$
  .(~ par(mfrow=c(1,2)))$
  .(~ plot(mpg ~ cyl, data = .))$
  .(~ plot(mpg ~ wt, data = .))$
  .(lm(mpg ~ cyl + wt, data = .))$
  summary()$
  .(coefficients)
Do you think it is useful? |
Is the question whether to have this functionality at all or what syntax is best? I will very clearly demonstrate my ignorance here, but just to make sure I am clear this would accomplish the objective of the magrittr Although I use it rarely, it is very nice to have in those rare use cases even beyond logging. I will try to work up the examples where I find it handy and see how this syntax looks. Also, as I work through many examples, see how sticky it is? I am assuming that deprecation of lambda #31 will be mandatory to prevent confusion. |
Yes, it is like magrittr's magrittr introduces a new operator to do this and more operators to do other things. At early times, I saw only one or two operators in magrittr, and now I see 5 or 6. Instead of introducing new operators, I would like to carefully introduce new syntax that is not confusing and not easily abused. The feature has been committed to branch 0.4. Would you please try it and give some suggestions? Thanks a lot! |
I updated to the newest |
Thanks! If you think there is better syntax, please let me know. If the feature costs more than the value it brings, it would not go to |
Figured I would borrow some code from a package that uses
|
Another use similar to logging/documenting that I had not considered would be to test in the pipeline with |
Here's a pseudo computing example :)
Pipe(1:3)$
  .(~ cat("connect",length(.),"elements with 2 more\n"))$
  .(~ Sys.sleep(1))$
  c(4,5)$
  .(~ cat("calculating mean\n"))$
  .(~ Sys.sleep(1))$
  mean()
|
hardest thing for me so far has been |
It's sad that
Neither does
And for
|
In the syntax I designed, |
btw, I like the computing example... I found the lattice plot in the
and then if I have it right as a
|
Another step-by-step plotting example:
m <- data.frame(x=1:100,y=rnorm(100))
par(mfrow=c(2,2))
Pipe(m)$
  .(~ plot(y ~ x, data = .))$
  transform(z = y^2)$
  .(~ plot(y ~ z, data = .))$
  transform(w = (y + z))$
  .(~ plot(y ~ w, data = .))$
  transform(q = sin(x)+cos(y))$
  .(~ plot(y ~ q, data = .))
|
I consider the main use of this feature is to
The |
lambda conflict (which I think you have decided to deprecate/eliminate) really is the only side effect of the side effect that I have thought of |
I cannot think of much more to throw at it than this monstrosity replicating a post I had done previously.
|
another use similar to logging would be to write a file with results for reproducibility. |
more plotting examples
a little different look at it with a focus on
|
A mix for all features:
library(pipeR)
mtcars %>>%
  (~ cat("data:",ncol(.),"columns\n")) %>>%
  subset(mpg >= quantile(mpg, 0.05) & mpg <= quantile(mpg,0.95)) %>>%
  ( lm(mpg ~ cyl + disp + wt + factor(vs), data = .) ) %>>%
  summary() %>>%
  (coefficients) %>>%
  ((coe) ~ cat("coefficients:",class(coe),"\n")) %>>%
  ((coe) ~ print(coe)) %>>%
  (coe ~ coe[-1,1]) %>>%
  barplot(main = "coefficients")
I think the I'm considering taking the syntax of the following:
The syntax looks more uniform and makes more sense to me. And luckily it can be parsed in the desired way.
What do you think? |
It's very interesting that my expression analyzer directly supports the syntax of
The following code will run without having to change any code:
mtcars %>>%
  (~ cat("data:",ncol(.),"columns\n")) %>>%
  subset(mpg >= quantile(mpg, 0.05) & mpg <= quantile(mpg,0.95)) %>>%
  ( lm(mpg ~ cyl + disp + wt + factor(vs), data = .) ) %>>%
  summary() %>>%
  (coefficients) %>>%
  (~ coe ~ cat("coefficients:",class(coe),"\n")) %>>%
  (~ coe ~ print(coe)) %>>%
  (coe ~ coe[-1,1]) %>>%
  barplot(main = "coefficients")
|
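For anyone wondering why `(~ coe ~ expr)` parses in a usable way, here is a quick base-R check that does not need pipeR. It only shows what R's parser produces for these formulas. How pipeR's own expression analyzer inspects them is not shown in this thread, and the variable names below are just for illustration.

```r
# Base R only; does not require pipeR.
one_sided <- quote(~ print(.))
nested    <- quote(~ coe ~ print(coe))

length(one_sided)   # 2: `~` called with a single argument
length(nested)      # 3: the outer `~` is two-sided
nested[[2]]         # ~coe, so the left side is itself a one-sided formula
nested[[3]]         # print(coe)
```

Because unary and binary `~` share the same precedence and associate to the left, the leading `~` stays attached to the symbol, which is what makes it usable as a "side effect" marker in front of a named lambda.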
I was wondering why this would be treated as a "side effect". I prefer to look at it directly as the final return value of the whole pipe expression (A %>>% fun). Since the default return value of a pipe expression is the rhs, why not define another operator for this "returning lhs" requirement, which I think may leave the pipe itself clearer? For example:
|
A more comprehensive example could be like this:
I think this should be more clear than:
|
Thanks @yanlinlin82 for your opinion. You just pointed out the core problem in this issue: more operators or more syntax? Let's see the example with
mtcars %<<%
  (cat("data:",ncol(.),"columns\n")) %>>%
  subset(mpg >= quantile(mpg, 0.05) & mpg <= quantile(mpg,0.95)) %>>%
  (lm(mpg ~ cyl + disp + wt + factor(vs), data = .)) %>>%
  summary() %>>%
  (coefficients) %<<%
  (coe ~ cat("coefficients:",class(coe),"\n") ) %<<%
  (coe ~ print(coe)) %>>%
  (coe ~ coe[-1,1]) %>>%
  barplot(main = "coefficients")
I feel I must scan the code very carefully to understand which line is forward piping and which line is only a side effect. In this line-by-line example, only when I look back and find which operator is used can I be sure whether it is a side effect or not. Nor can I quickly find the input of the "normal" lines without carefully looking back through the code. I think the same problem exists with magrittr's
mtcars %T>%
  (l(. ~ cat("data:",ncol(.),"columns\n"))) %>%
  subset(mpg >= quantile(mpg, 0.05) & mpg <= quantile(mpg,0.95)) %>%
  lm(mpg ~ cyl + disp + wt + factor(vs), data = .) %>%
  summary() %$%
  coefficients %T>%
  (l(. ~ cat("coefficients:",class(.),"\n"))) %>%
  print %>%
  (l(coe ~ coe[-1,1])) %>%
  barplot(main = "coefficients")
Do you feel you can quickly understand which object is piped to where and quickly pick out the important "really-doing-stuff" lines? Frankly speaking, I can't, because a little operator is too small to distinguish, and in line-by-line piping the operator that determines how the next step works must be written on the previous line. Look at the new syntax where one wants to do some logging between pipes:
library(pipeR)
mtcars %>>%
  (~ cat("data:",ncol(.),"columns\n")) %>>%
  subset(mpg >= quantile(mpg, 0.05) & mpg <= quantile(mpg,0.95)) %>>%
  ( lm(mpg ~ cyl + disp + wt + factor(vs), data = .) ) %>>%
  summary() %>>%
  (coefficients) %>>%
  (~ coe ~ cat("coefficients:",class(coe),"\n")) %>>%
  (~ coe ~ print(coe)) %>>%
  (coe ~ coe[-1,1]) %>>%
  barplot(main = "coefficients")
I feel rather clear when I simply take a glimpse at the code if I know That's why I feel there are too many operators and they are hard to distinguish at first glimpse. But with syntax, it should be much easier to understand the code at first glimpse. That's why I make A typical case is that one uses this feature rarely rather than heavily. Therefore, it should be like
Pipe(mtcars)$
  .(~ cat("data:",ncol(.),"columns\n"))$
  subset(mpg >= quantile(mpg, 0.05) & mpg <= quantile(mpg,0.95))$
  .(lm(mpg ~ cyl + disp + wt + factor(vs), data = .))$
  summary()
Just take a glimpse at the code, and it should be easy to find all lines that start with If you want to find out the input of a normal line, it should be pretty easy if you only look at the head of each line and look back until a line that does not start with
|
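To make the operator alternative concrete, here is a minimal base-R sketch of a "return the left-hand side" operator. The name `%tee%` and the helper `log_size` are hypothetical, chosen only for illustration; this is not pipeR's or magrittr's implementation.

```r
# Hypothetical operator: call f on lhs for its side effect,
# then return lhs unchanged so it can flow to the next step.
`%tee%` <- function(lhs, f) {
  f(lhs)
  lhs
}

# Hypothetical logging helper used as the side effect.
log_size <- function(x) cat("elements:", length(x), "\n")

result <- 1:3 %tee% log_size   # logs the length, returns 1:3
mean(result)                   # the original input flows on: 2
```

The operator does the job, but the marker that says "this step is only a side effect" sits at the end of the previous line rather than at the head of the step itself, which is the readability concern raised above.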
I agree I also vote against |
I finally see your opinion. You are using "side effect" syntax to ignore |
By the way, it just occurred to me: will a pipe always have a main stream, with or without other branches? That is to say, if a data set is to be processed by different procedures simultaneously, and we want them all in one pipe, then we need to arbitrarily make one procedure primary and the others branches. Is this right? For example:
Then it could be written like this:
|
@yanlinlin82 That's a very interesting insight! I have not yet thought much about "branching" in a pipeline. It looks quite interesting. For example,
m <- data.frame(x=1:10)
par(mfrow=c(2,2))
m %>>%
  (~ . %>>% transform(y=x) %>>% plot(type="l")) %>>%
  (~ . %>>% transform(y=x^2) %>>% plot(type="l")) %>>%
  (~ . %>>% transform(y=sin(x/2)) %>>% plot(type="l")) %>>%
  (~ . %>>% transform(y=cos(x/2)) %>>% plot(type="l"))
which has four branches to manipulate one piece of data :) |
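The branching idea can also be sketched in base R without pipeR. The `branch` helper below is hypothetical (it is not a pipeR function): it hands the same input to several functions, discards what they return, and gives the input back, which is the same contract as chaining several `(~ ...)` steps.

```r
# Hypothetical helper: run each branch on the same input for its side effect,
# ignore the branch results, and return the input unchanged.
branch <- function(x, ...) {
  for (f in list(...)) f(x)
  x
}

m <- data.frame(x = 1:10)
out <- branch(
  m,
  function(d) cat("rows:", nrow(d), "\n"),
  function(d) cat("sum of x:", sum(d$x), "\n")
)
identical(out, m)   # TRUE
```

In this sketch every branch is a side branch and the input itself is the main stream, so no branch has to be promoted arbitrarily.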
Consider the following syntax:
or
where (~ expr) or ((x) ~ expr) indicates that the output of this will be ignored and the input will be returned, thus only for side effect (only one side is stressed in the formula; it also looks like expr is evaluated as a side branch).
Note that all syntax in () automatically applies to .() in Pipe, therefore,
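As a rough illustration of the rule described above (ignore the output, return the input), here is a small base-R sketch. `pipe_step` is a hypothetical function written for this note, not pipeR's implementation, and it only handles the plain `(~ expr)` form with `.` as the input, not the named-lambda forms.

```r
# Hypothetical single pipeline step, base R only.
# If the step is a one-sided formula (~ expr), evaluate expr with `.` bound
# to the input and return the input; otherwise return the evaluated result.
pipe_step <- function(x, expr, env = parent.frame()) {
  expr <- substitute(expr)
  is_side_effect <- is.call(expr) &&
    identical(expr[[1L]], as.name("~")) &&
    length(expr) == 2L
  if (is_side_effect) {
    eval(expr[[2L]], list(. = x), env)   # result deliberately discarded
    x
  } else {
    eval(expr, list(. = x), env)
  }
}

a <- pipe_step(1:3, ~ cat("n =", length(.), "\n"))  # logs, returns 1:3
b <- pipe_step(1:3, sum(.))                         # returns 6
```

The point of the sketch is that the decision is made by looking at the step itself, so a reader can apply the same rule by glancing at the head of each line.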