New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Friendlier compose() #366

Closed
egnha opened this Issue Aug 9, 2017 · 21 comments

Comments

Projects
None yet
3 participants
@egnha
Copy link
Contributor

egnha commented Aug 9, 2017

It would be nice to have a "friendlier" version of compose() that:

  • sets the formals to that of the first called function
  • is more consistent with tidyverse semantics (i.e., supports splicing and formula syntax)
  • has a more informative print method
  • works as a drop-in replacement

To satisfy these desiderata (the second of which I presume is planned), I have something like the following in mind, which is a minor tweaking of the current implementation:

library(rlang)

compose <- function(...) {
  fs <- lapply(dots_list(...), as_function)
  n <- length(fs)
  
  last <- as_closure(fs[[n]])
  `__call_last` <- function() {
    call_last <- mut_node_car(sys.call(-1), last)
    eval_bare(call_last, parent.frame(2))
  }
  `__rest` <- rev(fs[-n])
  
  set_attrs(
    `formals<-`(
      value = formals(last),
      function() {
        out <- `__call_last`()
        for (f in `__rest`)
          out <- f(out)
        out
      }
    ),
    class = "composite_function"
  )
}

decompose <- function(x) {
  if (!inherits(x, "composite_function"))
    abort("Not a composite function")
    
  environment(x)$fs
}

print.composite_function <- function(x, ...) {
  cat("<composite_function>\n")
  cat("(Listed in calling order)\n")
  
  for (f in rev(decompose(x))) {
    cat("\n")
    print(f)
  }
  
  invisible(x)
}

(The method for formals setting is the same as in #349.)

Example:

foo <- compose(log, ~abs(.) + 1)

foo
#> <composite_function>
#> (Listed in calling order)
#> 
#> function (..., .x = ..1, .y = ..2, . = ..1) 
#> abs(.) + 1
#> 
#> function (x, base = exp(1))  .Primitive("log")

args(foo)
#> function (..., .x = ..1, .y = ..2, . = ..1) 
#> NULL

decompose(foo)
#> [[1]]
#> function (x, base = exp(1))  .Primitive("log")
#> 
#> [[2]]
#> function (..., .x = ..1, .y = ..2, . = ..1) 
#> abs(.x) + 1

egnha added a commit to egnha/purrr that referenced this issue Aug 11, 2017

@lionel-

This comment has been minimized.

Copy link
Member

lionel- commented Aug 11, 2017

Invoking mut_node_car() in __call_last() is problematic, because a function in the composition could inspect the call stack. Instead, we should copy sys.call(-1):

yes if you're going to modify a call from the stack you should always rlang::duplicate() it.

@egnha

This comment has been minimized.

Copy link
Contributor

egnha commented Aug 11, 2017

I see, thanks for the tip!

If I've used duplicate() correctly, then I think the following should now be safe.

compose <- function(...) {
  fs <- lapply(dots_list(...), rlang::as_function)
  n <- length(fs)

  last <- as_closure(fs[[n]])
  `__call_last` <- function() {
    call <- duplicate(sys.call(-1))
    eval_bare(mut_node_car(call, last), parent.frame(2))
  }
  `__rest` <- rev(fs[-n])

  set_attrs(
    `formals<-`(
      value = formals(last),
      function() {
        out <- `__call_last`()
        for (f in `__rest`)
          out <- f(out)
        out
      }
    ),
    class = "composite_function"
  )
}

(Not 100% sure there isn't some "pathologically" impure function that could spoil this ... )

@egnha

This comment has been minimized.

Copy link
Contributor

egnha commented Aug 11, 2017

The proposed compose(), like purrr's current one, implicitly assumes referential transparency of the functions composed (since the function evaluations, aside from the the first, don't happen in the calling frame). Would be helpful for the doc to mention that (perhaps under “Details”?).

@lionel-

This comment has been minimized.

Copy link
Member

lionel- commented Aug 11, 2017

That's a general assumptions of function operators, not sure what is the right place to document this. Also introspective and even reflexive functions might still work, there's just no guarantees.

By the way could you use more intermediary results in your code? They help the reader, especially with well chosen names. Also don't call formals<- with prefix form ;)

@egnha

This comment has been minimized.

Copy link
Contributor

egnha commented Aug 11, 2017

I was trying to keep the formals adjacent to the function declaration, but I do agree that it's unconventional. Consider it gone. ;)

Does this read better? Any further suggestions?

compose <- function(...) {
  fs <- lapply(dots_list(...), rlang::as_function)
  n <- length(fs)

  last <- as_closure(fs[[n]])
  `__call_last` <- function() {
    call <- duplicate(sys.call(-1))
    eval_bare(mut_node_car(call, last), parent.frame(2))
  }
  `__rest` <- rev(fs[-n])

  composite <- function() {
    out <- `__call_last`()
    for (f in `__rest`)
      out <- f(out)
    out
  }
  formals(composite) <- formals(last)
  class(composite) <- "composite_function"

  composite
}
@lionel-

This comment has been minimized.

Copy link
Member

lionel- commented Aug 11, 2017

You could use the fn particle for functions as in rlang. Also I would write the call mutation like this:

mut_node_car(call, last)
eval_bare(call, parent.frame(2))

to make it clearer that some side effects are going on.

I wonder if there's a better way to forward the arguments. The capture of the function call + evaluation in the parent frame seems brittle to me.

@egnha

This comment has been minimized.

Copy link
Contributor

egnha commented Aug 11, 2017

Thanks for those suggestions. Makes sense. I'll incorporate them.

This pass-arguments-then-call-in-parent-frame pattern is not uncommon for these kinds of functional operators. It does feel to me, too, like a bit of a contortion for something so simple. If there's a more robust, or succinct, way to do it, I would certainly be interested in that.

I trust that you are right when you say that the current method is potentially brittle, but I don't quite understand how ... :/

@egnha

This comment has been minimized.

Copy link
Contributor

egnha commented Aug 11, 2017

By the way, another change—improvement?—to compose() that might be worth considering, as well, would be to allow compose() itself to be more composable, by avoiding the accumulation of nested calls (composition being associative).

Say, something like this: instead of

fs <- lapply(dots_list(...), rlang::as_function)

one could try unpacking any nested compositions

fs <- flatten_fns(...)

where

flatten_fns <- function(...) {
  fns <- lapply(dots_list(...), as_fn_decomposed)
  flatten(fns)
}
as_fn_decomposed <- function(x) {
  if (inherits(x, "composite_function"))
    decompose(x)
  else
    rlang::as_function(x)
}

This won't unpack compositions beyond the first level, however. (Would need to decompose recursively.)

@lionel-

This comment has been minimized.

Copy link
Member

lionel- commented Aug 11, 2017

I like the forwarding of formals but I think there's not much gain from a better print method here. If we really needed it it would probably be better to use expr_interp() on the returned function so that its source code appears like this:

compose(log, ~abs(.) + 1)
compose(log, ~abs(.) + 1)
#> function(x, base = exp(1))
#> {
#>     x <- .Primitive("log")(x = x, base = base)
#>     x <- (function (..., .x = ..1, .y = ..2, . = ..1)
#>       abs(.) + 1)(x)
#>
#>     x
#> }

A better source could be produced by specifying names to functions and storing them in the enclosure of the composite function.

@lionel-

This comment has been minimized.

Copy link
Member

lionel- commented Aug 11, 2017

One more thought, there is no need to duplicate the whole call so you can use shallow = TRUE. Or this would be my preference: new_language(last, node_cdr(call)).

You only need to duplicate the parents of the node you're mutating, and here call is the root of the call tree.

@egnha

This comment has been minimized.

Copy link
Contributor

egnha commented Aug 12, 2017

I wasn't aware of new_language(), thanks for pointing that out.

Incorporating your recommendations, compose() now looks like this:

compose <- function(...) {
  fns <- flatten_fn_list(...)
  n <- length(fns)

  fn_last <- as_closure(fns[[n]])
  `__call_fn_last` <- function() {
    call <- new_language(fn_last, node_cdr(sys.call(-1)))
    eval_bare(call, parent.frame(2))
  }
  `__fns_rest` <- rev(fns[-n])

  fn_comp <- function() {
    out <- `__call_fn_last`()
    for (f in `__fns_rest`)
      out <- f(out)
    out
  }
  formals(fn_comp) <- formals(fn_last)
  class(fn_comp) <- "composite_function"

  fn_comp
}

with the small addition of flattening of nested compositions (same as above, hopefully with better names):

flatten_fn_list <- function(...) {
  fns <- lapply(dots_list(...), as_decomposed_function)
  unlist(fns)
}

as_decomposed_function <- function(x) {
  if (inherits(x, "composite_function"))
    decompose(x)
  else
    rlang::as_function(x)
}

(Contrary to my previous claim, the flattening happens all the way down.)

@egnha

This comment has been minimized.

Copy link
Contributor

egnha commented Aug 12, 2017

With flattening, the following all yield the same function.

compose(log, abs, sin, `+`)
compose(log, abs, sin, compose(`+`))
compose(log, abs, compose(sin,  compose(`+`)))
compose(log, compose(abs, compose(sin,  compose(`+`))))

In particular, the call modifier `__call_fn_last`() is only invoked once.

@egnha

This comment has been minimized.

Copy link
Contributor

egnha commented Aug 12, 2017

Your suggested print method is an interesting idea; it certainly more accurately reflects the underlying action than just listing the functions.

Another thought: If there are a lot of functions, or if their bodies are long, some kind of output truncation could be useful.

In any case, I think some kind of specialized print method is helpful for compositions, because otherwise you'd just see this:

print.default(compose(log, abs, sin, `+`))
#> function (.x, .y) 
#> {
#>   out <- `__call_fn_last`()
#>   for (f in `__fns_rest`) out <- f(out)
#>   out
#> }
#> <environment: 0x7fccae013e98>
#>   attr(,"class")
#> [1] "composite_function"
@lionel-

This comment has been minimized.

Copy link
Member

lionel- commented Aug 12, 2017

I think we should inline the function literal so you would see the source. And optionally a symbol if the function was supplied with a name. This way we don't need a print method and s3 class.

@lionel-

This comment has been minimized.

Copy link
Member

lionel- commented Aug 12, 2017

I also don't think we need a complicated decomposition / recomposition logic just for printing, I'd rather just have the user manipulate a list of functions beforehand.

@egnha

This comment has been minimized.

Copy link
Contributor

egnha commented Aug 12, 2017

If the decomposition logic were just for printing, then I agree, that'd be overkill.

But what about situations where you'd want to reuse certain parts of a composition/pipeline? Wouldn't decompose() be convenient for that?

For example (probably not the best):

serialize_as_df <- compose(as.data.frame, some_operation)
serialize_as_json <- compose(to_json, decompose(serialize_as_df)[-1])

The point is, the user might not have (direct) access to some_operation() (e.g., a non-exported function), or imagine some_operation standing for a list of functions. Of course, the user could create serialize_as_json without decompose(). But it would require the user to exploit an implementation detail of compose().

@egnha

This comment has been minimized.

Copy link
Contributor

egnha commented Aug 12, 2017

Currently the S3 class is being used for printing and for decomposing (morally, decompose() is generic). It would indeed be nice if this dual functionality could be maintained without relying on a class. But I don't see an easy way to do that ... :/

@lionel-

This comment has been minimized.

Copy link
Member

lionel- commented Aug 14, 2017

It would indeed be nice if this dual functionality could be maintained without relying on a class. But I don't see an easy way to do that ... :/

You could store the list of functions in a sentinel value in the closure env, e.g. __purrr_composed_fns.

The S3 class would allow handling of magrittr functional sequences.

@egnha

This comment has been minimized.

Copy link
Contributor

egnha commented Aug 14, 2017

Ah, now I think I understand what you're getting at. If I understood correctly, we should now have something like this (ignoring the printing issue):

compose <- function(...) {
  `__purrr_composed_fns` <- flatten_fn_list(...)
  n <- length(`__purrr_composed_fns`)

  fn_last <- as_closure(`__purrr_composed_fns`[[n]])
  `__call_fn_last` <- function() {
    call <- new_language(fn_last, node_cdr(sys.call(-1)))
    eval_bare(call, parent.frame(2))
  }
  `__fns_rest` <- rev(`__purrr_composed_fns`[-n])

  fn_comp <- function() {
    out <- `__call_fn_last`()
    for (f in `__fns_rest`)
      out <- f(out)
    out
  }
  formals(fn_comp) <- formals(fn_last)

  fn_comp
}

flatten_fn_list <- function(...) {
  fns <- lapply(dots_list(...), as_fn_decomposition)
  unlist(fns)
}

as_fn_decomposition <- function(x) {
  decompose(x) %||% rlang::as_function(x)
}

decompose <- function(x) {
  environment(x)$`__purrr_composed_fns`
}
@egnha

This comment has been minimized.

Copy link
Contributor

egnha commented Nov 6, 2017

Initially, I had defined an ad hoc composite_function class to enable a print method that would show the component functions of a composition.

Now that rlang has an fn class for streamlined printing of functions, could we use that here as well? That is, we rewrite the srcref attribute for the value of compose() to show the components, and give it class fn.

@lionel-

This comment has been minimized.

Copy link
Member

lionel- commented Nov 6, 2017

I don't think you should use srcref attributes for printing, they are for step-debugging.

Edit: I now think srcref attrs are for printing as well.

The compose() function might become more important with flowery transducers. It'd indeed be nice to have composed fns print nicely. Not sure about the right approach to do it though.

@hadley hadley added the feature label Feb 5, 2018

@hadley hadley added the adverb 📚 label May 5, 2018

lionel- added a commit to lionel-/lowliner that referenced this issue Dec 5, 2018

lionel- added a commit to lionel-/lowliner that referenced this issue Dec 6, 2018

@lionel- lionel- closed this in #594 Dec 6, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment