accumulate() issues with list output of one row #809

Closed
erictleung opened this issue Dec 5, 2020 · 2 comments · Fixed by #909
Labels
bug an unexpected problem or unintended behavior reduce 🔨 vctrs ♣️

Comments

@erictleung

This is an odd "bug" I've encountered. It is not unexpected given the functions used, so I'm asking both whether there is a better way to do this and whether the functionality should be updated. If this is beyond the scope of purrr, feel free to close.

I wish to accumulate a running list of unique values into a list-column cell, but when accumulate() runs on a group with a single row, it returns a bare value instead of a list. That value is incompatible with the output for the other groups, which are lists.

library(purrr)
library(dplyr)

running_unique <- function(x, y) { unique(c(x, y)) }

# Example simple function use
running_unique(1, c(1, 12, 3))
#> [1]  1 12  3

# Example using accumulate
accumulate(1:3, running_unique)
#> [[1]]
#> [1] 1
#> 
#> [[2]]
#> [1] 1 2
#> 
#> [[3]]
#> [1] 1 2 3
accumulate(1, running_unique)
#> [1] 1


df <- tribble(
  ~a, ~b,
  1, 1,
  2, 1,
  2, 2
)
df
#> # A tibble: 3 x 2
#>       a     b
#>   <dbl> <dbl>
#> 1     1     1
#> 2     2     1
#> 3     2     2

df %>%
  group_by(a) %>%
  mutate(accum = accumulate(b, running_unique))
#> Error: Problem with `mutate()` input `accum`.
#> x Input `accum` must return compatible vectors across groups
#> i Input `accum` is `accumulate(b, running_unique)`.
#> i Result type for group 1 (a = 1): <double>.
#> i Result type for group 2 (a = 2): <list>.

Created on 2020-12-04 by the reprex package (v0.3.0)

If we just focus on groups with more than one row, then all is well. Notice that the first row still works even though its accumulated value has length one.

df %>%
  filter(a == 2) %>%
  group_by(a) %>%
  mutate(accum = accumulate(b, running_unique))
#> # A tibble: 2 x 3
#> # Groups:   a [1]
#>       a     b accum    
#>   <dbl> <dbl> <list>   
#> 1     2     1 <dbl [1]>
#> 2     2     2 <dbl [2]>
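
One possible workaround, sketched here on the assumption that a list column is wanted for every group, is to wrap the call in as.list() so that single-row groups also produce a list. The data frame, running_unique(), and the accum column are taken from the example above; nothing here changes accumulate() itself.

```r
library(purrr)
library(dplyr)

running_unique <- function(x, y) unique(c(x, y))

df <- tribble(
  ~a, ~b,
  1, 1,
  2, 1,
  2, 2
)

# as.list() turns the simplified length-one result for a = 1 back into a
# list and leaves the result for a = 2, which is already a list, unchanged.
res <- df %>%
  group_by(a) %>%
  mutate(accum = as.list(accumulate(b, running_unique))) %>%
  ungroup()

str(res$accum)
```

Every group then contributes a list, so mutate() no longer sees incompatible types across groups.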
@dabsingh

dabsingh commented Feb 11, 2021

I faced the same issue with purrr::accumulate. It does not seem to be type stable when the accumulation function returns a single value at every step. The same issue occurs with base::Reduce(fn, x, accumulate = TRUE).

fn <- function(x,y) unique(c(x,y))
purrr::accumulate(list("A", "A", "A"), fn)
#> [1] "A" "A" "A"
# The expected result is a list:
#> [[1]]
#> [1] "A"
#> [[2]]
#> [1] "A"
#> [[3]]
#> [1] "A"

I worked around it by:
purrr::accumulate(list("A", "A", "A"), fn) %>% as.list()
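
For the base R side, a minimal sketch of an alternative: Reduce() has a simplify argument, and setting it to FALSE should keep the accumulated results as a list. This assumes an R version whose Reduce() includes that argument; fn is the function defined above.

```r
fn <- function(x, y) unique(c(x, y))

# simplify = FALSE asks Reduce() not to unlist results that all have length one.
out <- Reduce(fn, list("A", "A", "A"), accumulate = TRUE, simplify = FALSE)

str(out)
```

The result is a list of three elements, each "A", which matches the expected output described above.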

@hadley
Member

hadley commented Aug 24, 2022

Somewhat more minimal reprex:

library(purrr)

str(accumulate(1:3, union))
#> List of 3
#>  $ : int 1
#>  $ : int [1:2] 1 2
#>  $ : int [1:3] 1 2 3
str(accumulate(1:2, union))
#> List of 2
#>  $ : int 1
#>  $ : int [1:2] 1 2
str(accumulate(1, union))
#>  num 1

Created on 2022-08-24 by the reprex package (v2.0.1)

This is caused by accumulate()'s automatic simplification, which collapses the result to an atomic vector whenever every element has length one:

library(purrr)

str(accumulate(1:3, `+`))
#>  int [1:3] 1 3 6
str(accumulate(1:2, `+`))
#>  int [1:2] 1 3
str(accumulate(1L, `+`))
#>  int 1

Created on 2022-08-24 by the reprex package (v2.0.1)

So we probably need to offer some way to opt out.
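
To make the simplification rule and a possible opt-out concrete, here is a rough sketch. accumulate_sketch() and its simplify argument are hypothetical names used only for illustration; this is not purrr's implementation or its actual interface.

```r
# Hypothetical re-implementation for illustration only; not purrr's code.
accumulate_sketch <- function(x, f, simplify = TRUE) {
  out <- vector("list", length(x))
  if (length(x) == 0L) return(out)

  acc <- x[[1]]
  out[[1]] <- acc
  for (i in seq_along(x)[-1]) {
    acc <- f(acc, x[[i]])
    out[[i]] <- acc
  }

  # Collapse to an atomic vector only when every step has length one.
  if (simplify && all(lengths(out) == 1L)) {
    out <- unlist(out, recursive = FALSE)
  }
  out
}

str(accumulate_sketch(1, union))                    # collapsed to a bare vector
str(accumulate_sketch(1, union, simplify = FALSE))  # stays a one-element list
str(accumulate_sketch(1:3, union))                  # mixed lengths, stays a list
```

With a single input the only element has length one, so the default path simplifies; passing simplify = FALSE in this sketch keeps the list shape regardless of input length.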
