Add the syntax only for side effect #30

Closed
renkun-ken opened this Issue Aug 18, 2014 · 28 comments

renkun-ken commented Aug 18, 2014

Consider the following syntax:

x %>>% (~ expr)         # evaluate expr with . = x and return x
x %>>% ((m) ~ expr)     # evaluate expr with m = x and return x
mtcars %>>%
  (~ cat("Number of columns:",ncol(.),"\n")) %>>%
  (mpg) %>>%
  summary
Number of columns: 11 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  10.40   15.42   19.20   20.09   22.80   33.90 

or

mtcars %>>%
  ((x) ~ cat("Number of columns:",ncol(x),"\n")) %>>%
  (mpg) %>>%
  summary
Number of columns: 11 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  10.40   15.42   19.20   20.09   22.80   33.90 

Here (~ expr) or ((x) ~ expr) means that the value of expr is ignored and the input is returned unchanged, so the step is evaluated only for its side effect. (Only one side of the formula is emphasized, and expr reads like a side branch of the pipeline.)

Note that all syntax in () automatically applies to .() in Pipe, therefore,

Pipe(mtcars)$
  .(~ cat("Number of columns:",ncol(.),"\n"))$
  .(mpg)$
  summary()
Number of columns: 11 
$value : summaryDefault table 
------
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  10.40   15.42   19.20   20.09   22.80   33.90 
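
For readers who want to try the intended semantics without installing the development branch, here is a minimal base-R sketch. The helper name `side_effect` is hypothetical and is not part of pipeR; it only mimics what `x %>>% (~ expr)` and `x %>>% ((m) ~ expr)` are described as doing above: evaluate `expr` with the symbol bound to `x`, discard the result, and return `x`.

```r
# Hypothetical helper, not pipeR's implementation: evaluate `expr` with
# `symbol` bound to `x`, ignore the result, and return `x` unchanged.
side_effect <- function(x, expr, symbol = ".") {
  env <- new.env(parent = parent.frame())
  assign(symbol, x, envir = env)
  eval(substitute(expr), env)
  x
}

# Like: mtcars %>>% (~ cat("Number of columns:", ncol(.), "\n"))
out1 <- side_effect(mtcars, cat("Number of columns:", ncol(.), "\n"))

# Like: mtcars %>>% ((m) ~ cat("Number of rows:", nrow(m), "\n"))
out2 <- side_effect(mtcars, cat("Number of rows:", nrow(m), "\n"), symbol = "m")

# The input comes back untouched, so a pipeline could continue from it.
summary(out1$mpg)
```

Because the helper returns its input, the next step sees `mtcars` rather than the `NULL` that `cat()` returns.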

@renkun-ken renkun-ken self-assigned this Aug 18, 2014

renkun-ken Aug 18, 2014

Owner

A practical case is setting the plot layout before plotting a linear model:

mtcars %>>%
  (~ par(mfrow=c(2,2))) %>>%    # only for side effect (par() returns the arg list)
  (lm(mpg ~ cyl + wt, data = .)) %>>%
  plot()

renkun-ken Aug 18, 2014

Owner

Other examples:

mtcars %>>%
  (~ par(mfrow=c(1,2))) %>>%
  (~ plot(mpg ~ cyl, data = .)) %>>%
  (~ plot(mpg ~ wt, data = .)) %>>%
  (lm(mpg ~ cyl + wt, data = .)) %>>%
  summary() %>>%
  (coefficients)

Pipe(mtcars)$
  .(~ par(mfrow=c(1,2)))$
  .(~ plot(mpg ~ cyl, data = .))$
  .(~ plot(mpg ~ wt, data = .))$
  .(lm(mpg ~ cyl + wt, data = .))$
  summary()$
  .(coefficients)

Do you think it is useful?
Do you think it looks ambiguous even if you know the rule and what it means?

@timelyportfolio @ramnathv @yanlinlin82

timelyportfolio Aug 18, 2014

Is the question whether to have this functionality at all or what syntax is best?

I will very clearly demonstrate my ignorance here, but just to make sure I am clear: would this accomplish the same objective as the magrittr %T>% tee operator? I looked quickly for an equivalent in F#, but could not find any readily available discussions or examples. Are there parallels in F# or other languages where we could borrow the syntax?

Although I use it rarely, it is very nice to have in those rare use cases, even beyond logging. I will try to work up the examples where I find it handy and see how this syntax looks and, as I work through more of them, how well it sticks.

I am assuming that deprecation of lambda #31 will be mandatory to prevent confusion.

renkun-ken added a commit that referenced this issue Aug 18, 2014

renkun-ken Aug 18, 2014

Owner

Yes, it is like magrittr's %T>% operator for side effects, and as far as I know it is not borrowed from F# or any other language. It avoids breaking the pipe when we want a side effect in between steps, which I sometimes find helpful.

magrittr introduces a new operator for this and further operators for other things. Early on I saw only one or two operators in magrittr, and now I see 5 or 6. Instead of introducing new operators, I would like to carefully introduce new syntax that is not confusing and not easily abused.

The feature has been committed to branch 0.4. Would you please try it and give some suggestions? Thanks a lot!

timelyportfolio Aug 18, 2014

I updated to the newest 0.4 and will test. In the past, I have used this mostly with reference classes (R5).

renkun-ken Aug 18, 2014

Owner

Thanks! If you think there is a better syntax, please let me know. If the feature costs more than the value it brings, it will not go to master.

timelyportfolio Aug 18, 2014

Figured I would borrow some code from a package that uses reference classes so I arbitrarily chose lme4. Here is a small snippet where I try to overuse the side effect functionality.

#think this will be useful for reference classes (R5)
install.packages('lme4')
library(lme4)

#borrow from lme4 vignette to test side effects operator
#str(sleepstudy)
#fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
sleepstudy %>>% 
  ( ~ str(.) ) %>>%  #note ( ~ str ) does not print str but still passes through
  #found this in the .Rnw but code is not in final vignette output  
  (~ 
     print(lattice::xyplot(Reaction ~ Days | Subject, ., aspect = "xy",
                    layout = c(9, 2), type = c("g", "p", "r"),
                    index.cond = function(x, y) coef(lm(y ~ x))[2],
                    xlab = "Days of sleep deprivation",
                    ylab = "Average reaction time (ms)",
                    as.table = TRUE))
  ) %>>%
  { lmer( Reaction ~ Days + ( Days | Subject ), . ) } %>>%
  ( ~assign( "fm1", ., envir = .GlobalEnv ) )


#the hard way to accomplish the fm1 above
# formula module
#   parsedFormula <- lFormula(formula = Reaction ~ Days + (Days|Subject),
#                                data = sleepstudy)
# 
#   # objective function module
#   devianceFunction <- do.call(mkLmerDevfun, parsedFormula)
# 
#   # optimization module
#   optimizerOutput <- optimizeLmer(devianceFunction)
# 
#   # output module
#   mkMerMod( rho = environment(devianceFunction),
#                opt = optimizerOutput,
#                reTrms = parsedFormula$reTrms,
#                fr = parsedFormula$fr)

#probably not a likely candidate for pipelining but do it nevertheless
#don't know enough yet about lme4 design to recode yet
sleepstudy %>>%
  ( ~ print("# formula module")) %>>%
  { 
    lFormula (
      formula = Reaction ~ Days + (Days|Subject)
      , data = .
    )
  } %>>%
  ( ~ assign( "parsedFormula", ., envir = .GlobalEnv ) ) %>>%
  ( ~ cat( "test parsedFormula$frame == fm1@frame" ) )%>>%
  ( ~ testthat::is_identical_to(fm1@frame,parsedFormula$fr) ) %>>%
  ( ~ print( "optimization module" ) ) %>>%
  { do.call( mkLmerDevfun, . ) } %>>%
  ( ~ assign( "devianceFunction", ., envir = .GlobalEnv ) ) %>>%
  ( ~ print( "output module" ) ) %>>%
  optimizeLmer %>>%
  {
    mkMerMod (
      rho = environment( devianceFunction )
      ,opt = .
      ,reTrms = parsedFormula$reTrms
      ,fr = parsedFormula$fr
    )
  }

timelyportfolio Aug 18, 2014

Another use similar to logging/documenting that I had not considered would be to test in the pipeline with testthat or rtype.

renkun-ken Aug 18, 2014

Owner

Here's a pseudo computing example :)

Pipe(1:3)$
    .(~ cat("connect",length(.),"elements with 2 more\n"))$
    .(~ Sys.sleep(1))$
    c(4,5)$
    .(~ cat("calculating mean\n"))$
    .(~ Sys.sleep(1))$
    mean()

timelyportfolio Aug 18, 2014

hardest thing for me so far has been ~ inside of () rather than ~(), but the ( ~ ) makes more sense to me. just harder to type for some reason (probably muscle memory).

renkun-ken Aug 18, 2014

Owner

Unfortunately, ~(expr) is parsed in a way that breaks the evaluation order, so it cannot be chained (in R, ~ binds more loosely than %>>%, so it captures everything to its right):

> as.list(quote(a %>>% ~(x) %>>% y()))
[[1]]
`%>>%`

[[2]]
a

[[3]]
~(x) %>>% y()

~ expr does not work either:

> as.list(quote(a %>>% ~x %>>% y()))
[[1]]
`%>>%`

[[2]]
a

[[3]]
~x %>>% y()

(~expr), by contrast, parses as intended:

> as.list(quote(a %>>% (~x) %>>% y()))
[[1]]
`%>>%`

[[2]]
a %>>% (~x)

[[3]]
y()
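
The three parses above follow from R's operator precedence: `~` binds more loosely than any `%any%` operator, so an unparenthesized `~x` swallows the rest of the chain. Here is a small base-R check of that. Only `quote()` is used, so `%>>%` does not need to be defined, and the helper `outer_rhs` is a hypothetical name used only for illustration.

```r
# Return the right-hand operand of the outermost call in a quoted expression.
outer_rhs <- function(e) e[[3]]

bare   <- quote(a %>>% ~x %>>% y())
parens <- quote(a %>>% (~x) %>>% y())

# Without parentheses, the rhs of the first pipe is a formula wrapping
# the rest of the chain.
bare_is_formula <- identical(outer_rhs(bare)[[1]], as.name("~"))   # TRUE

# With parentheses, the outermost call is the last pipe and its rhs is y().
parens_rhs_is_y <- identical(outer_rhs(parens), quote(y()))        # TRUE

# The side-effect step stays a separate operand of the inner pipe.
inner_is_step <- identical(parens[[2]][[3]], quote((~x)))          # TRUE
```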

renkun-ken Aug 18, 2014

Owner

In the syntax I designed, () is the feature hub: it supports more than one feature, depending on the inner syntax. This may cause some confusion, but so far it looks unlikely that someone would use a feature by mistake without knowing it.

timelyportfolio Aug 18, 2014

btw, I like the computing example...

I found the lattice plot in the lme4 vignette by digging in the .Rnw source file, so I added to the example above, but thought it would be good to paste separately to see how it looks in isolation.

sleepstudy %>>% 
  ( ~ str(.) ) %>>%  #note ( ~ str ) does not print str but still passes through
  #found this in the .Rnw but code is not in final vignette output  
  (~ 
    print(lattice::xyplot(Reaction ~ Days | Subject, ., aspect = "xy",
                    layout = c(9, 2), type = c("g", "p", "r"),
                    index.cond = function(x, y) coef(lm(y ~ x))[2],
                    xlab = "Days of sleep deprivation",
                    ylab = "Average reaction time (ms)",
                    as.table = TRUE))
  ) %>>%
  { lmer( Reaction ~ Days + ( Days | Subject ), . ) } %>>%
  ( ~ assign( "fm1", ., envir = .GlobalEnv ) ) %>>%
  #test some nested calls with the profile from vignette conclusion
  ( ~ profile( . ) %>>% { print(lattice::splom(.) ) } )

and then if I have it right as a Pipe.

Pipe(sleepstudy)$
  .( ~ str(.) )$
  .( ~ 
     print(lattice::xyplot(Reaction ~ Days | Subject, ., aspect = "xy",
                           layout = c(9, 2), type = c("g", "p", "r"),
                           index.cond = function(x, y) coef(lm(y ~ x))[2],
                           xlab = "Days of sleep deprivation",
                           ylab = "Average reaction time (ms)",
                           as.table = TRUE))
  )$
  .( lmer( Reaction ~ Days + ( Days | Subject ), . ) )$
  .( ~ assign( "fm1", ., envir = .GlobalEnv ) )$
  .( ~ profile( . ) %>>% { print(lattice::splom(.)) } )[]

renkun-ken Aug 18, 2014

Owner

Another step-by-step plotting example:

m <- data.frame(x=1:100,y=rnorm(100))
par(mfrow=c(2,2))
Pipe(m)$
  .(~ plot(y ~ x, data = .))$
  transform(z = y^2)$
  .(~ plot(y ~ z, data = .))$
  transform(w = (y + z))$
  .(~ plot(y ~ w, data = .))$
  transform(q = sin(x)+cos(y))$
  .(~ plot(y ~ q, data = .))

renkun-ken Aug 18, 2014

Owner

I consider the main uses of this feature to be:

  • avoid breaking the pipe when I suddenly want to do something between two pipes
  • do some logging
  • show some intermediate results (numbers, plots, etc.)

The (~ expr) syntax seems easy to distinguish from non-side-effect use and unlikely to be used by mistake. I can't imagine a user typing this syntax without knowing what it means.

timelyportfolio Aug 18, 2014

lambda conflict (which I think you have decided to deprecate/eliminate) really is the only side effect of the side effect that I have thought of

timelyportfolio Aug 18, 2014

I cannot think of much more to throw at it than this monstrosity replicating a post I had done previously.

# from timelyportfolio lme4 error bar post
# http://timelyportfolio.github.io/rCharts_errorbar/ucla_melogit.html
"http://www.ats.ucla.edu/stat/data/hdp.csv" %>>%
  read.csv %>>%
  within( {
    Married <- factor(Married, levels = 0:1, labels = c("no", "yes"))
    DID <- factor(DID)
    HID <- factor(HID)
  } ) %>>%
  {
    glmer(remission ~ Age + LengthofStay + FamilyHx + IL6 + CRP +
          CancerStage + Experience + (1 | DID) + (1 | HID),
          data = ., family = binomial, nAGQ=1)
  } %>>%  # show the dotplot as a reference
  (~
     print(lattice::dotplot(
       ranef(., which = "DID", postVar = TRUE),
       scales = list(y = list(alternating = 0))
     ))
  ) %>>%
  { ranef(object  = ., which = "DID", postVar = TRUE)$DID } %>>%
  {
    data.frame(
      "id" = rownames(.),  #this will be our x
      "intercept" = .[,1],            #this will be our y
      "se" = as.numeric(attr( ., "postVar" ))  #this will be our se
    )
  } %>>%  #had not thought of this use to add library
  (~ library(rCharts) ) %>>%  
  #rCharts good ref class reference for side effect helpfulness
  {
    setRefClass(
      "rChartsError"
      ,contains="rCharts"
      ,methods=list(
        initialize = function(){
          callSuper()
        }
        ,getPayload = function(chartId){
          list(chartParams = toJSON2(params), chartId = chartId, lib = basename(lib), liburl = LIB$url)
        }
      )
    )$new() %>>%
        (~ .$setLib("http://timelyportfolio.github.io/rCharts_errorbar") ) %>>%
        (~ .$setTemplate (
          script = "http://timelyportfolio.github.io/rCharts_errorbar/layouts/chart.html"
          ,chartDiv = "<div></div>"
        ) ) %>>%
        (~ .$set(
          data = get(".",parent.env(environment())),  #ugly but don't know better way
          height = 500,
          width = 1000,
          margin = list(top = 10, bottom = 10, right = 50, left = 100),
          x = "id",
          y = "intercept",
          radius = 2,
          sort = list( var = "intercept" ),
          whiskers = "#!function(d){return [d.intercept - 1.96 * d.se, d.intercept + 1.96 * d.se]}!#",
          tooltipLabels = c("id","intercept","se") 
        ))
  }

timelyportfolio Aug 18, 2014

another use similar to logging would be to write a file with results for reproducibility.

timelyportfolio Aug 18, 2014

more plotting examples

pdf("test.pdf")
data.frame( x = 1:10, y = 1:10 ) %>>%
  ( ~ plot( x = .[,"x"], y = .[,"y"], type = "b" ) ) %>>%
  ( ~ library(latticeExtra) ) %>>%
  ( ~ xyplot( y ~ x, data = ., type = c("p","l") ) %>>% print %>>%  ( ~ asTheEconomist(.) %>>% print ) ) %>>%
  ( ~ library(ggplot2) ) %>>% 
  ( ~ ggplot( ., aes( x = x, y = y) )  %>>% + geom_line() %>>% + geom_point() %>>% print )
dev.off()

a little different look at it with a focus on ggplot2

data.frame( x = 1:10, y = 1:10 ) %>>%
    ggplot( aes(x=x,y=y) ) %>>%
    ((g1) ~ print( g1  + geom_point()) ) %>>%
    ( ~ print( . + geom_line() )) %>>%
    str

renkun-ken Aug 19, 2014

Owner

A mix for all features:

library(pipeR)
mtcars %>>%
  (~ cat("data:",ncol(.),"columns\n")) %>>%
  subset(mpg >= quantile(mpg, 0.05) & mpg <= quantile(mpg,0.95)) %>>%
  ( lm(mpg ~ cyl + disp + wt + factor(vs), data = .) ) %>>%
  summary() %>>%
  (coefficients) %>>%
  ((coe) ~ cat("coefficients:",class(coe),"\n")) %>>%
  ((coe) ~ print(coe)) %>>%
  (coe ~ coe[-1,1]) %>>%
  barplot(main = "coefficients")

I think the ((x) ~ expr) form is not clearly distinguishable and is not obviously a side effect.

I'm considering adopting the following syntax instead:

  • x %>>% ( ~ expr) for side effect with . = x
  • x %>>% ( ~ p ~ expr ) for side effect with p = x

This syntax looks more uniform and makes more sense to me. And luckily it parses in the desired way.


> as.list(quote(~ x ~ x + 1))
[[1]]
`~`

[[2]]
~x

[[3]]
x + 1

What do you think?

renkun-ken Aug 19, 2014

Owner

Interestingly, my expression analyzer already supports the ~ x ~ expr syntax directly. In fact, any formula whose lhs has length 2 is treated as a side-effect expression, with the 2nd element of the lhs taken as the symbol:

  • (x) parses as `(`, x
  • ~x parses as `~`, x
  • f(x) parses as f, x

The following code runs without any change to the implementation:

mtcars %>>%
  (~ cat("data:",ncol(.),"columns\n")) %>>%
  subset(mpg >= quantile(mpg, 0.05) & mpg <= quantile(mpg,0.95)) %>>%
  ( lm(mpg ~ cyl + disp + wt + factor(vs), data = .) ) %>>%
  summary() %>>%
  (coefficients) %>>%
  (~ coe ~ cat("coefficients:",class(coe),"\n")) %>>%
  (~ coe ~ print(coe)) %>>%
  (coe ~ coe[-1,1]) %>>%
  barplot(main = "coefficients")
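
The rule described above (an lhs of length 2 whose second element is the symbol) can be checked in base R. The function `side_effect_symbol` below is a hypothetical illustration of that rule under my reading of the comment. It is not pipeR's actual analyzer.

```r
# Hypothetical illustration, not pipeR's code: given a quoted two-sided
# formula, return the side-effect symbol if the lhs is a call of length 2
# whose second element is a symbol; otherwise return NULL.
side_effect_symbol <- function(f) {
  stopifnot(is.call(f), identical(f[[1]], as.name("~")), length(f) == 3)
  lhs <- f[[2]]
  if (is.call(lhs) && length(lhs) == 2 && is.symbol(lhs[[2]])) lhs[[2]] else NULL
}

s1 <- side_effect_symbol(quote((coe) ~ print(coe)))   # lhs is `(`, coe
s2 <- side_effect_symbol(quote(~ coe ~ print(coe)))   # lhs is `~`, coe
s3 <- side_effect_symbol(quote(f(coe) ~ print(coe)))  # lhs is f, coe
s4 <- side_effect_symbol(quote(coe ~ coe[-1, 1]))     # lhs is a bare symbol
```

The first three forms all yield the symbol `coe`, while the plain lambda `coe ~ coe[-1, 1]` yields `NULL`, which is why ((x) ~ expr) and (~ x ~ expr) can share one code path.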

yanlinlin82 Aug 19, 2014

I was wondering why this should be treated as a "side effect".

I prefer to see it simply as a question of what the whole pipe expression (A %>>% fun) returns. Since a pipe expression returns the value of the rhs by default, why not define another operator for the "return lhs" case? I think that would keep the pipe itself clearer.

For example:

1:10 %>>% mean # return mean(1:10)
1:10 %<<% mean # calculate mean(1:10) but only return lhs, i.e. 1:10
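
To make the comparison concrete, here is a rough base-R sketch of the proposed operator. Both definitions are simplified stand-ins written for this example: this `%>>%` only accepts a function on the right and is not pipeR's operator, and `%<<%` is the hypothetical operator suggested above.

```r
# Simplified stand-in for the pipe: apply a function to the lhs.
`%>>%` <- function(lhs, f) f(lhs)

# Hypothetical "return lhs" operator: call f for its side effect, keep lhs.
`%<<%` <- function(lhs, f) {
  f(lhs)
  lhs
}

res_pipe <- 1:10 %>>% mean   # value of mean(1:10)
res_tee  <- 1:10 %<<% mean   # mean is computed and discarded; 1:10 is returned

# All %any% operators share one precedence level and associate left to
# right, so this reads as (1:10 %<<% print) %>>% mean.
res_chain <- 1:10 %<<% print %>>% mean
```

With an operator, the side effect is marked at the pipe itself. With (~ expr), it is marked inside the step, and that difference is the trade-off being discussed.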

yanlinlin82 Aug 19, 2014

A more comprehensive example could be like this:

x <- 1:10 # First I have a data set
print(x %>>% plot) # Plot the data set, and the whole pipe expression returns NULL
x %<<% plot %>>% mean # What if I want to calculate mean() while plotting it

I think this is clearer than:

x %>>% (~ plot) %>>% mean

In the latter case I have to understand the pipe expression first, and only then find out that it is a "side effect".

renkun-ken Aug 19, 2014

Owner

Thanks @yanlinlin82 for your opinion. You just pointed out the core problem in this issue: more operators or more syntax?

Let's look at the example with %<<% as the side-effect operator, or simply with magrittr's %T>%.

mtcars %<<%
  (cat("data:",ncol(.),"columns\n")) %>>%
  subset(mpg >= quantile(mpg, 0.05) & mpg <= quantile(mpg,0.95)) %>>%
  (lm(mpg ~ cyl + disp + wt + factor(vs), data = .)) %>>%
  summary() %>>%
  (coefficients) %<<%
  (coe ~ cat("coefficients:",class(coe),"\n") ) %<<%
  (coe ~ print(coe)) %>>%
  (coe ~ coe[-1,1]) %>>%
  barplot(main = "coefficients")

I feel I must scan the code very carefully to tell which lines are forward piping and which are only side effects. In this line-by-line layout, I can be sure whether a line is a side effect only by looking back at the operator that ends the previous line. Nor can I quickly find the input of a "normal" line without carefully reading backwards through the code.

I think the same problem exists with magrittr's %T>%:

mtcars %T>%
  (l(. ~ cat("data:",ncol(.),"columns\n"))) %>%
  subset(mpg >= quantile(mpg, 0.05) & mpg <= quantile(mpg,0.95)) %>%
  lm(mpg ~ cyl + disp + wt + factor(vs), data = .) %>%
  summary() %$%
  coefficients %T>%
  (l(. ~ cat("coefficients:",class(.),"\n"))) %>%
  print %>%
  (l(coe ~ coe[-1,1])) %>%
  barplot(main = "coefficients")

Can you quickly see which object is piped where, and pick out the important lines that really do the work? Frankly, I can't: an operator is too small to stand out, and in line-by-line piping the operator that determines how a step is piped has to be written at the end of the previous line.

Look at the new syntax where one wants to do some logging between pipes:

library(pipeR)
mtcars %>>%
  (~ cat("data:",ncol(.),"columns\n")) %>>%
  subset(mpg >= quantile(mpg, 0.05) & mpg <= quantile(mpg,0.95)) %>>%
  ( lm(mpg ~ cyl + disp + wt + factor(vs), data = .) ) %>>%
  summary() %>>%
  (coefficients) %>>%
  (~ coe ~ cat("coefficients:",class(coe),"\n")) %>>%
  (~ coe ~ print(coe)) %>>%
  (coe ~ coe[-1,1]) %>>%
  barplot(main = "coefficients")

Once I know that (~ expr) or (~ x ~ expr) indicates a side effect (the formula has only one side), the code is clear to me at a glance. I no longer have to care about the operator, and so never have to look back, because there is only one.
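
The intended behaviour of (~ expr), as described at the top of this issue (evaluate expr with . = x and return x), can be mimicked in base R. This is a minimal sketch under that description only; `side_effect` is a made-up helper and is not how pipeR implements it.

```r
# Hypothetical helper mimicking x %>>% (~ expr): evaluate expr with
# . bound to x, discard the result, and return x unchanged.
side_effect <- function(x, expr) {
  eval(substitute(expr), list(. = x), parent.frame())
  x
}

res <- side_effect(mtcars, cat("data:", ncol(.), "columns\n"))
identical(res, mtcars)
```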

That's why I feel that several operators are too many and too hard to tell apart at first glance, whereas with syntax the code should be much easier to understand at first glance. It is also why I make () more special: it signals that something special happens, and it is visible on the line itself rather than in an operator at the end of the previous line.

Typically one uses this feature only occasionally rather than heavily, so the code would look more like

Pipe(mtcars)$
  .(~ cat("data:",ncol(.),"columns\n"))$
  subset(mpg >= quantile(mpg, 0.05) & mpg <= quantile(mpg,0.95))$
  .(lm(mpg ~ cyl + disp + wt + factor(vs), data = .))$
  summary()

Glance at the code and it should be easy to find every line that starts with (~ (or .(~ with Pipe). To understand the code quickly, just ignore those lines and see what is being done and piped. If the code used more operators instead, I believe you could not scan or understand it as quickly, because you would have to look carefully at the little symbol at the end of each line.

To find the input of a normal line, look only at the start of each preceding line and go back until you reach one that does not start with (~: the output of that line is the input you are looking for.
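
The look-back rule can be written down as a few lines of base R operating on the source text. This is only an illustrative sketch: `find_input_line` and the sample pipeline are made up, and it assumes one step per line with %>>% at the end of each line.

```r
# Hypothetical helper: given the lines of a pipeline and the index i of a
# normal step, walk back over side-effect lines (those starting with "(~")
# and return the index of the line whose output feeds step i.
find_input_line <- function(lines, i) {
  j <- i - 1
  while (j >= 1 && startsWith(trimws(lines[j]), "(~")) j <- j - 1
  j
}

pipeline <- c(
  "mtcars %>>%",
  "  (~ cat('data:', ncol(.), 'columns\\n')) %>>%",
  "  summary() %>>%",
  "  (~ s ~ print(class(s))) %>>%",
  "  (~ s ~ print(s)) %>>%",
  "  print()"
)

find_input_line(pipeline, 3)  # line that feeds summary()
find_input_line(pipeline, 6)  # line that feeds the final print()
```

The helper returns 0 when there is no earlier non-side-effect line.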

timelyportfolio Aug 19, 2014

I agree: x %>>% ( ~ p ~ expr ) for a side effect with p = x is clearer to me.

I also vote against %<<%.

yanlinlin82 Aug 19, 2014

I finally see your point. You are using the "side effect" syntax so that branch steps can be ignored, which makes the main pipe stream easy to find. In that case I admit it is better than introducing another operator.

renkun-ken added a commit that referenced this issue Aug 19, 2014

renkun-ken added a commit that referenced this issue Aug 19, 2014

renkun-ken added a commit that referenced this issue Aug 19, 2014

yanlinlin82 Aug 19, 2014

By the way, it just occurred to me to ask: will a pipe always have a main stream, with or without other branches? That is to say, if a data set is to be processed by several procedures at once and we want them all in one pipe, then we need to arbitrarily make one procedure the primary one and the others branches. Is this right?

For example:

x <- c(... some data ...)
proc1: foo1A(x); foo1B(x); foo1C(x); ...
proc2: foo2A(x); foo2B(x); foo2C(x); ...
proc3: foo3A(x); foo3B(x); foo3C(x); ...

Then it could be written like this:

x %>>%
(~ foo1A %>>% foo1B %>>% foo1C %>>% ...) %>>%
(~ foo2A %>>% foo2B %>>% foo2C %>>% ...) %>>%
foo3A %>>% foo3B %>>% foo3C ...

renkun-ken Aug 19, 2014

Owner

@yanlinlin82 That's a very interesting insight! I have not yet thought much about "branching" in a pipeline. It looks quite interesting. For example,

m <- data.frame(x=1:10)
par(mfrow=c(2,2))
m %>>%
  (~ . %>>% transform(y=x) %>>% plot(type="l")) %>>%
  (~ . %>>% transform(y=x^2) %>>% plot(type="l")) %>>%
  (~ . %>>% transform(y=sin(x/2)) %>>% plot(type="l")) %>>%
  (~ . %>>% transform(y=cos(x/2)) %>>% plot(type="l"))

which has four branches to manipulate one piece of data :)
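
The idea that every branch receives the same input while the main stream carries on unchanged can also be sketched without pipeR. `with_branches` is a made-up helper for illustration, and the plots are replaced by cat() so the example needs no graphics device.

```r
# Hypothetical helper: run each branch on the same input for its side
# effect, then return the input so the main stream can continue.
with_branches <- function(x, ...) {
  for (branch in list(...)) branch(x)
  x
}

m <- data.frame(x = 1:10)
out <- with_branches(
  m,
  function(d) cat("max of x:  ", max(transform(d, y = x)$y), "\n"),
  function(d) cat("max of x^2:", max(transform(d, y = x^2)$y), "\n")
)
identical(out, m)
```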

@renkun-ken renkun-ken closed this Aug 22, 2014

@renkun-ken renkun-ken added this to the 0.4-2 milestone Sep 12, 2014
