Consider syntax for assigning intermediate value to symbol #37

Closed
renkun-ken opened this issue Aug 28, 2014 · 9 comments

@renkun-ken
Owner

It is common to want an intermediate result assigned to a symbol in the current environment (often the global environment) for later use. This is clearly a side effect, since it changes the current environment.

Currently there is no easy syntax for this kind of assignment; the only option is to call assign() manually, like this:

mtcars %>>%
  subset(mpg <= mean(mpg)) %>>%
  (~ assign("x", ., envir = .GlobalEnv)) %>>%
  plot

The code works, but it is only convenient for the global environment or some other named environment. For a local environment, it does not work with parent.frame().
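As a rough illustration of what assigning into a local environment involves, here is a minimal base R sketch. The helper assign_here() is hypothetical and not part of pipeR; the idea is only that the function doing the assignment resolves its caller's frame itself, instead of the user passing an environment by hand.

```r
# Hypothetical helper, not part of pipeR: assign `value` to `name` in the
# environment the helper was called from, then pass `value` through unchanged.
assign_here <- function(value, name, envir = parent.frame()) {
  force(envir)  # resolve the caller's frame before doing anything else
  assign(name, value, envir = envir)
  invisible(value)
}

f <- function() {
  out <- assign_here(1:3, "local_x")
  # `local_x` should now live in f's own frame, and the value is passed through
  exists("local_x", envir = environment(), inherits = FALSE) &&
    identical(out, 1:3)
}

f()
```

A pipe operator could take the same approach internally, which is what a dedicated assignment syntax would allow.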

Consider a syntax, derived from the side-effect syntax, that performs the assignment.

@renkun-ken renkun-ken self-assigned this Aug 28, 2014
@renkun-ken
Owner Author

A draft syntax looks like this:

x %>>% (~ symbol)
x %>>% (~ f(.) ~ symbol)
x %>>% (~ x ~ f(x) ~ symbol)

It is best described as: start with ~ for a side effect and end with a symbol for assignment.

An example:
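To make the rule concrete, here is a small base R sketch of how such an expression could be recognised. The helper assign_target() is hypothetical and is not how pipeR necessarily implements it; it only checks that an expression starts with ~ and ends with a symbol.

```r
# Hypothetical helpers for illustration only.
is_tilde <- function(e) is.call(e) && identical(e[[1L]], as.name("~"))

# Return the name of the assignment target if `expr` starts with `~` and
# ends with a symbol; otherwise return NULL.
assign_target <- function(expr) {
  if (!is_tilde(expr)) return(NULL)
  # Walk down the last operand of each `~` call until something else is reached
  while (is_tilde(expr)) expr <- expr[[length(expr)]]
  if (is.symbol(expr)) as.character(expr) else NULL
}

assign_target(quote(~ symbol))              # target is "symbol"
assign_target(quote(~ f(.) ~ symbol))       # target is "symbol"
assign_target(quote(~ x ~ f(x) ~ symbol))   # target is "symbol"
assign_target(quote(~ f(.)))                # NULL: plain side effect, no target
```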

mtcars %>>%
  subset(mpg <= mean(mpg)) %>>%
  (~ smtcars) %>>%
  (~ dim(.) ~ dim_mtcars) %>>%
  subset(select = c(mpg, wt, qsec)) %>>%
  lm(formula = mpg ~ .) %>>%
  summary %>>%
  (~ summ) %>>%
  (coefficients)
              Estimate Std. Error   t value     Pr(>|t|)
(Intercept) 17.0183914  5.1411954  3.310201 0.0047583237
wt          -2.9781345  0.6044032 -4.927397 0.0001823504
qsec         0.6033051  0.3053237  1.975953 0.0668509086

Inspect the environment after evaluating the code above.

> ls.str()
dim_mtcars :  int [1:2] 18 11
smtcars : 'data.frame': 18 obs. of  11 variables:
 $ mpg : num  18.7 18.1 14.3 19.2 17.8 16.4 17.3 15.2 10.4 10.4 ...
 $ cyl : num  8 6 8 6 6 8 8 8 8 8 ...
 $ disp: num  360 225 360 168 168 ...
 $ hp  : num  175 105 245 123 123 180 180 180 205 215 ...
 $ drat: num  3.15 2.76 3.21 3.92 3.92 3.07 3.07 3.07 2.93 3 ...
 $ wt  : num  3.44 3.46 3.57 3.44 3.44 ...
 $ qsec: num  17 20.2 15.8 18.3 18.9 ...
 $ vs  : num  0 1 0 1 1 0 0 0 0 0 ...
 $ am  : num  0 0 0 0 0 0 0 0 0 0 ...
 $ gear: num  3 3 3 4 4 3 3 3 3 3 ...
 $ carb: num  2 1 4 4 4 3 3 3 4 4 ...
summ : List of 11
 $ call         : language lm(formula = mpg ~ ., data = .)
 $ terms        :Classes 'terms', 'formula' length 3 mpg ~ wt + qsec
 $ residuals    : Named num [1:18] 1.658 -0.813 -1.643 1.386 -0.376 ...
 $ coefficients : num [1:3, 1:4] 17.018 -2.978 0.603 5.141 0.604 ...
 $ aliased      : Named logi [1:3] FALSE FALSE FALSE
 $ sigma        : num 1.79
 $ df           : int [1:3] 3 15 3
 $ r.squared    : num 0.623
 $ adj.r.squared: num 0.573
 $ fstatistic   : Named num [1:3] 12.4 2 15
 $ cov.unscaled : num [1:3, 1:3] 8.217 -0.178 -0.437 -0.178 0.114 ...

@renkun-ken
Owner Author

Since every form uses (~ ...), the ~ operator can be viewed in this context as a branching operator: it indicates that the expression following it is a side effect. It either branches the left-hand-side value to an expression (side-effect evaluation) or branches it to a symbol (assignment). Evaluating a bare symbol for its side effect is pointless, because it has none, so reusing that form for assignment should neither cause extra confusion nor take away anything that was possible without this feature.

renkun-ken added a commit that referenced this issue Aug 28, 2014
renkun-ken added a commit that referenced this issue Aug 30, 2014
@renkun-ken
Owner Author

Consider the = syntax suggested by @yanlinlin82.
See #38.

@renkun-ken
Owner Author

The following code adopts the = syntax.

mtcars %>>%
  subset(mpg <= mean(mpg)) %>>%
  (~ smtcars) %>>%   # side-effect assign
  (~ dim_mtcars = dim(.)) %>>%   # side-effect assign
  subset(select = c(mpg, wt, qsec)) %>>%
  lm(formula = mpg ~ .) %>>%
  (sum_lm = summary(.)) %>>%   # eval and assign
  (coefficients)

@timelyportfolio

Definitely prefer this. I think this is much clearer, more intuitive, and more readable.

@renkun-ken
Owner Author

I think so too. Thanks @yanlinlin82 for the great suggestion. I'll implement it on the feature/assign branch soon and see how it works.

renkun-ken added a commit that referenced this issue Aug 31, 2014
@renkun-ken
Owner Author

The latest commit on feature/assign uses a symbolic call to perform the assignment, which allows the following usage:

> z <- list()
> 1:10 %>>% (~ z$a = length(.)) %>>% mean
[1] 5.5
> z
$a
[1] 10

That is, the assignment no longer calls assign() but builds a symbolic call to perform it. The expression on the left-hand side of = therefore does not have to be a symbol, which allows usage like names(a) = ....
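The general idea of assigning through a symbolic call can be sketched in base R as follows. The helper assign_symbolic() is hypothetical and is not the code in the commit; it only shows why building a call lets the left-hand side be any valid assignment target.

```r
# Hypothetical helper, not pipeR's actual implementation: build the call
# `lhs <- value` and evaluate it in the caller's environment.
assign_symbolic <- function(lhs, value, envir = parent.frame()) {
  force(envir)  # resolve the caller's frame up front
  # Wrapping `value` in quote() keeps it from being evaluated a second time
  # if it happens to be a language object.
  eval(call("<-", lhs, call("quote", value)), envir)
  invisible(value)
}

z <- list()
assign_symbolic(quote(z$a), 10L)                    # same effect as z$a <- 10L
v <- 1:3
assign_symbolic(quote(names(v)), c("a", "b", "c"))  # names(v) <- c("a", "b", "c")
```

By contrast, assign("z$a", 10L) would create a variable literally named `z$a` rather than modifying the list z.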

@yanlinlin82

That is more powerful!

@renkun-ken
Owner Author

In v0.5, <- and -> will no longer be interpreted as lambda expressions and will be allowed to perform assignment in a pipeline, which makes the code even more readable in some cases.

@renkun-ken renkun-ken reopened this Sep 12, 2014
@renkun-ken renkun-ken added this to the 0.4-3 milestone Sep 12, 2014