Anonymous functions in pmap #203

Closed
1danjordan opened this issue Jun 15, 2016 · 10 comments
@1danjordan
Contributor

The documentation for pmap is bundled in with map2 and doesn't include any pmap-specific examples. This makes it unclear how to access the lists in .f, and after a lot of investigation I'm still stumped. In map2 the variables are simply .x and .y, and in the deprecated map3 they were .x, .y and .z. The example below works for the first two lists, but I have no idea how to access the third list.

a  <- list(name1 = 1, name2 = 1, name3 = 1)
b  <- list(name1 = 1, name2 = 10, name3 = 100)
c  <- list(name1 = 5, name2 = 50, name3 = 500)
pmap(list(a, b, c), ~ .x + .y)
$name1
[1] 2

$name2
[1] 11

$name3
[1] 101

pmap(list(a,b, c), ~ .x + .y + .z)
Error in .f(.l[[c(1L, i)]], .l[[c(2L, i)]], .l[[c(3L, i)]], ...) : 
  object '.z' not found

Obviously this pattern could not extend to a list of length n, but I can't work out what the convention could be. If this is me missing something blatant or pmap is not designed to work in this way, then I apologise!
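
For readers arriving later: purrr's formula shorthand documents ..1, ..2, ..3 and so on for positional arguments beyond the second. The following is a minimal sketch, assuming a purrr version whose formula lambdas support this (it may not hold for the version current when this issue was opened); `res` is just an illustrative name.

```r
library(purrr)

a <- list(name1 = 1, name2 = 1, name3 = 1)
b <- list(name1 = 1, name2 = 10, name3 = 100)
c <- list(name1 = 5, name2 = 50, name3 = 500)

# ..1, ..2 and ..3 refer to the i-th element of the first, second and third list
res <- pmap(list(a, b, c), ~ ..1 + ..2 + ..3)
str(res)
```

This pattern extends to a list of length n, although for more than a handful of inputs a named function is usually easier to read.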

@jennybc
Member

jennybc commented Jun 15, 2016

The function should work on the tuple that comes from selecting element i from each of the lists. I almost always do this with data frames, in which case the function should operate on one row.

In your case, you could specify sum to just "add everything up".

pmap(list(a, b, c), sum)
#> $name1
#> [1] 7
#> 
#> $name2
#> [1] 61
#> 
#> $name3
#> [1] 601

Or you can use argument matching by position or name (in which case it's really important to name your input list!).

pmap(list(a, b, c), function(x, y, z) x^2 + exp(y) + (z - 1))
#> $name1
#> [1] 7.718282
#> 
#> $name2
#> [1] 22076.47
#> 
#> $name3
#> [1] 2.688117e+43

f <- function(c, b, a) c - b - a
pmap(dplyr::lst(a, b, c), f)
#> $name1
#> [1] 3
#> 
#> $name2
#> [1] 39
#> 
#> $name3
#> [1] 399
pmap(dplyr::lst(b, c, a), f)
#> $name1
#> [1] 3
#> 
#> $name2
#> [1] 39
#> 
#> $name3
#> [1] 399

One gotcha is that you have to use all members of the tuple. I'm not really sure why that is necessary.

pmap(dplyr::lst(a, b, c), function(x, y) x^2 + exp(y))
#> Error in .f(a = .l[[c(1L, i)]], b = .l[[c(2L, i)]], c = .l[[c(3L, i)]], : unused arguments (a = .l[[c(1, i)]], b = .l[[c(2, i)]], c = .l[[c(3, i)]])
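
The error arises because the input list is named, so its elements are passed as a, b and c, and function(x, y) has no arguments by those names. One possible workaround, sketched here on the assumption that unused elements can simply be discarded, is to name the arguments you need and let `...` absorb the rest. The names `inputs` and `res` are illustrative only.

```r
library(purrr)

a <- list(name1 = 1, name2 = 1, name3 = 1)
b <- list(name1 = 1, name2 = 10, name3 = 100)
c <- list(name1 = 5, name2 = 50, name3 = 500)

# Name the inputs so they are matched to the function's arguments by name
inputs <- list(a = a, b = b, c = c)

# `...` swallows the element `c`, which the function body does not use
res <- pmap(inputs, function(a, b, ...) a^2 + exp(b))
res$name1
```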

@1danjordan
Contributor Author

Thanks a million, you've totally cleared that up for me. I suppose then that the shorthand ~ doesn't work here because arguments must be explicit. I was hoping something like this might work

pmap(tibble::lst(a,b,c), ~ a / (b + c))

but I suppose as_function doesn't work this way. I think I automatically assumed pmap worked something like dplyr::mutate.
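
Something close to the mutate-like style can be approximated by collecting the named arguments with list(...) inside the formula and evaluating the expression against them using with(). This is only a sketch of one idiom, not a documented purrr feature; it assumes the input list is named and that the formula lambda forwards named arguments through `...`.

```r
library(purrr)

a <- list(name1 = 1, name2 = 1, name3 = 1)
b <- list(name1 = 1, name2 = 10, name3 = 100)
c <- list(name1 = 5, name2 = 50, name3 = 500)

# Inside with(), a, b and c are the i-th elements, not the full lists
res <- pmap(list(a = a, b = b, c = c), ~ with(list(...), a / (b + c)))
res$name1
```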

@csrvermaak

csrvermaak commented May 12, 2017

Could extracting from the .l argument not perhaps work?

pmap(list(a,b, c), ~ .l$a + .l$b + .l$c)

or more generic, if there is a reason why .l$name won't work:

pmap(list(a,b, c), ~ .l[[1]] + .l[[2]] + .l[[3]])

@1danjordan
Contributor Author

1danjordan commented May 12, 2017

The anonymous function is not evaluated in an environment where .l[[1]] + .l[[2]] + .l[[3]] exists or makes any sense. purrr's functions are wrappers around C code, as the source of purrr::pmap shows:

> purrr::pmap
function (.l, .f, ...) 
{
    .f <- as_function(.f, ...)
    .Call(pmap_impl, environment(), ".l", ".f", "list")
}
<environment: namespace:purrr>

So you can see here, pmap just turns the 'anonymous function' .f into a real R function and passes the environment, containing the list .l and the function .f, to pmap_impl, a C function. In there, .l is a symbolic expression (SEXP), not an R object that is subsettable.
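
A small sketch may make this concrete. It assumes a purrr version that exports as_mapper() (the successor to the as_function call shown above) and only relies on positional access; `f` is an illustrative name.

```r
library(purrr)

# as_mapper() turns a formula into an ordinary R function
f <- as_mapper(~ ..1 + ..2 + ..3)

# The function receives one element from each list as positional arguments.
# Nothing named `.l` is passed to it, so `.l` is not visible in the formula
# body unless one happens to exist where the formula was written.
is.function(f)
f(1, 10, 100)
```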

@lionel-
Member

lionel- commented May 12, 2017

This could simply be:

pmap <- function(.l, .f, ...) {
  .f <- rlang::as_function(.f, ...)
  .f <- rlang::env_bury(.f, .l = .l)
  .Call(pmap_impl, environment(), ".l", ".f", "list")
}

pmap(list(a = 1, b = 2, c = 3), ~ .l$a + .l$b + .l$c)
#> [[1]]
#> [1] 6

@lionel-
Member

lionel- commented May 12, 2017

but should probably only be applied to formulas otherwise this could have surprising effects

@csrvermaak

csrvermaak commented May 12, 2017

@dandermotj - thanks for clearing that up, I understand now why it won't work (given the current wrapper function.)

@lionel- that's a pretty incredible piece of code! Is a similarly "improved" wrapper function something that could be released in future (perhaps with a relevant check for a ~ formula)? It would make pmap more intuitive, and more in line with map and map2 and their .x, .y notation. (IMO)

@Chr96er

Chr96er commented Jun 11, 2020

pmap <- function(.l, .f, ...) {
  .f <- rlang::as_function(.f, ...)
  .f <- rlang::env_bury(.f, .l = .l)
  .Call(pmap_impl, environment(), ".l", ".f", "list")
}

pmap(list(a = 1, b = 2, c = 3), ~ .l$a + .l$b + .l$c)

This doesn't do the job for me as soon as I replace the 1-dim list with a data.frame (or any other 2-dim structure):

pmap_gh <- function(.l, .f, ...) {
  .f <- rlang::as_function(.f, ...)
  .f <- rlang::env_bury(.f, .l = .l)
  .Call(purrr:::pmap_impl, environment(), ".l", ".f", "list")
}

# unexpected behaviour
pmap_gh(list(a = 1:2, b = 2:3, c = 3:4), ~ .l$a + .l$b + .l$c)
#> [[1]]
#> [1] 6 9
#>
#> [[2]]
#> [1] 6 9

# expected behaviour
pmap(list(a = 1:2, b = 2:3, c = 3:4), function(a, b, c) a + b + c)
#> [[1]]
#> [1] 6
#>
#> [[2]]
#> [1] 9

Unless I'm missing something, I would like to open a new issue with a feature request to expose all arguments to the function without explicitly specifying them. I don't mind providing them for 3 arguments as above, but from my point of view pmap is amazing for row-wise data.frame operations (often an anti-pattern, but it has its applications), and I hate providing 40 arguments. All of them are accessed dynamically in the function using dynget; otherwise I would just provide the necessary columns and ....
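
The "unexpected" result above follows from .l being the whole input rather than one row: .l$a + .l$b + .l$c is the vectorised sum c(6, 9), and it is returned once per iteration. One way to reach every column of a row without listing 40 arguments is a function that takes only `...` and collects it into a named list. This is a sketch under that assumption; `df`, `row_sum` and `cols` are hypothetical names, with `cols` standing in for column names chosen at run time.

```r
library(purrr)

df <- data.frame(a = 1:2, b = 2:3, c = 3:4)

row_sum <- function(...) {
  row <- list(...)          # named list holding the values of one row
  cols <- c("a", "b", "c")  # hypothetical: column names picked dynamically
  sum(unlist(row[cols]))    # index the row by name instead of by argument
}

res <- pmap(df, row_sum)
str(res)
```

Because the row arrives as an ordinary named list, iterating over a vector of column names needs neither get nor dynget.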

edit: I know I can use pmap within a data.table (and, I would assume, mutate) context, where you don't need to explicitly provide the arguments but can easily access the columns. However, I haven't managed to access the columns dynamically using get or dynget, e.g. by iterating over a vector of column names (get works in a data.table context, but apparently not when additionally used within a function).

@lionel-
Member

lionel- commented Jun 11, 2020

You can use slide() to map over data frames rowwise:

slider::slide(as.data.frame(x), ~ .x$a + .x$b + .x$c)
#> [[1]]
#> [1] 6
#>
#> [[2]]
#> [1] 9

We might provide a similar feature in purrr. No need to open an issue though.

@Chr96er

Chr96er commented Jun 12, 2020

Excellent, works like a charm with the examples I had in mind.

Cheers
