Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apply vctrs principles to map() and modify() #894

Merged
merged 27 commits into from
Sep 17, 2022
Merged

Apply vctrs principles to map() and modify() #894

merged 27 commits into from
Sep 17, 2022

Conversation

hadley
Copy link
Member

@hadley hadley commented Aug 27, 2022

Fixes #435. Fixes #928.

For the purposes of discussion. If we agree that these semantics are what we want, need to:

  • Add news bullet
  • Implement map2_vec()
  • Implement pmap_vec()
  • Update modify

@hadley hadley mentioned this pull request Aug 27, 2022
2 tasks
@hadley hadley changed the title Implement map_vec() Implement map_vec() Aug 27, 2022
Copy link
Member

@lionel- lionel- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another problem to solve is to generically map over classed lists like list_of() with full restoration behaviour. I think that's out of scope for map_vec() but could live in another function, perhaps list_map()?

R/map.R Outdated Show resolved Hide resolved
tests/testthat/test-map.R Show resolved Hide resolved
R/map.R Outdated Show resolved Hide resolved
R/map.R Outdated Show resolved Hide resolved
@hadley
Copy link
Member Author

hadley commented Aug 29, 2022

Updated the implementation (and tests) based on your comments. Do you think this looks good enough to finish off map2_vec() etc?

Copy link
Member

@lionel- lionel- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Do we want to document this and other _vec variants in a different topic marked as experimental?

R/map.R Outdated Show resolved Hide resolved
R/map.R Outdated Show resolved Hide resolved
R/map.R Outdated Show resolved Hide resolved
@hadley
Copy link
Member Author

hadley commented Sep 14, 2022

This will fail until it uses #942, but the implementation is now very simple/clear.

You might wonder why map_vec() gets its own function while map_c() does not. I think there are three reasons:

  • map_vec() is a real map in that map_vec(x, f)[[i]] == f(x[[i]])
  • We consider simplification to be an internal detail, and its done automatically in accumulate() and list_transpose()
  • We could one day implement this more efficiently, especially when .ptype is supplied.

@hadley hadley changed the title Implement map_vec() Apply vctrs principles to map() and modify() Sep 15, 2022
@hadley
Copy link
Member Author

hadley commented Sep 15, 2022

Sorry that this has become such a monstrous PR, but you should be able to mostly ignore the changes to the map2 and pmap tests since I felt obliged to update them to at least cover a few of the key invariants.

The most important stuff to hit:

  • Are we still confident in map_vec() now that the implementation is so simple?
  • Have I reasoned through modify() and friends correctly?

R/map2.R Show resolved Hide resolved
R/modify.R Show resolved Hide resolved
R/modify.R Outdated
vec_restore(out, .x)
} else if (vec_is(.x)) {
map_vec(.x, .f, ..., .ptype = .x)
} else if (is.null(.x) || is.list(.x)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit strange to go through that path for .x = NULL. Maybe branch into return(NULL)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had this originally, but it also felt weird — it is nice that NULL just falls out as a special case of a zero length list.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I also prefer the explicit is.null(x) branch

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to leave it with list because (a) it's not that import and (b) it needs the same recycling behaviour as a zero-length list in modify2().

.x <- .x[rep(1L, length(out))]
if (vec_is_list(.x) || is.data.frame(.x)) {
out <- map2(vec_proxy(.x), .y, .f, ...)
vec_restore(out, .x)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here regarding df restoration.

if (vec_is_list(.x) || is.data.frame(.x)) {
out <- vec_proxy(.x)
out[.where] <- map(out[.where], .f, ...)
vec_restore(out, .x)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

df-restoration

R/pmap.R Outdated Show resolved Hide resolved
R/modify.R Outdated
.x[] <- map(.x, .f, ...)
.x
} else {
cli::cli_abort("Don't know how to modify {.obj_type_friendly {.x}}.")
Copy link
Member

@lionel- lionel- Sep 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use a "can't" form here?

! Can't modify a function.

Actually a "must" form works well I think:

! `.x` must be a vector, list, or data frame, not a function.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think I should just ignore the fact it's a generic? It is unlikely that people have provided methods. Do you think I verify that there aren't methods on CRAN then remove the genericity?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm yes I think that is a good idea.

Copy link
Member Author

@hadley hadley Sep 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are five packages with S3method(modify, in their namespace (https://cs.github.com/?scopeName=All+repos&scope=&q=org%3Acran+path%3A%2F%5ENAMESPACE%2F+%2FS3method%5C%28modify%2C%2F)

  • heemod — own modify generic
  • timbr — purrr generic
  • purrr — duh
  • tfarima — own generic
  • yamlet — own generic

timbr's modify.forest looks considerably more complex than our implementations, so it's probably ok for it to be its own function: https://github.com/UchidaMizuki/timbr/blob/main/R/purrr.R. Issue at UchidaMizuki/timbr#1

R/map.R Outdated
Comment on lines 49 to 50
#' * `_vec()` return an atomic or S3 vector, that is guaranteed to be
#' simpler than list.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is guaranteed to be simpler than list.

Well, yes, but also: 😬

> map_vec(list(list(1:3), list(5)), identity)
[[1]]
[1] 1 2 3

[[2]]
[1] 5

R/map2.R Show resolved Hide resolved
R/modify.R Show resolved Hide resolved
R/modify.R Outdated
vec_restore(out, .x)
} else if (vec_is(.x)) {
map_vec(.x, .f, ..., .ptype = .x)
} else if (is.null(.x) || is.list(.x)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I also prefer the explicit is.null(x) branch

R/modify.R Show resolved Hide resolved
R/modify.R Show resolved Hide resolved
if (vec_is_list(.x) || is.data.frame(.x)) {
out <- map2(vec_proxy(.x), .y, .f, ...)
vec_restore(out, .x)
} else if (vec_is(.x)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is simultaneously technically correct and also makes me sad that we generate the duplicated column names

modify2(data.frame(x = 1), 1:3, \(x, y) x)
#>   x x x
#> 1 1 1 1

tests/testthat/_snaps/map.md Show resolved Hide resolved
)
test_that("don't evaluate symbolic objects (#428)", {
map2(exprs(1 + 2), NA, ~ expect_identical(.x, quote(1 + 2)))
walk2(exprs(1 + 2), NA, ~ expect_identical(.x, quote(1 + 2)))
})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a map2_vec() test about enforcing .ptype

expect_identical(pmap_int(named_list, identity), named(int()))
expect_identical(pmap_dbl(named_list, identity), named(dbl()))
expect_identical(pmap_chr(named_list, identity), named(chr()))
pwalk(list(exprs(1 + 2)), ~ expect_identical(.x, quote(1 + 2)))
})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a pmap_vec() test about enforcing .ptype (which would catch what Lionel pointed out before about missing the ptype arg)

@hadley hadley merged commit db7bd02 into main Sep 17, 2022
@hadley hadley deleted the map-vec branch September 17, 2022 13:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use some vctrs principle in modify() and friends map_vec
3 participants