Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pluck() converts zero-length objects to NULL #480

Closed
VincentGuyader opened this issue Mar 27, 2018 · 14 comments · Fixed by #893
Closed

pluck() converts zero-length objects to NULL #480

VincentGuyader opened this issue Mar 27, 2018 · 14 comments · Fixed by #893
Labels
bug an unexpected problem or unintended behavior pluck 🍐
Milestone

Comments

@VincentGuyader
Copy link

HI,

Hello I noticed something that seems strange to me. Dont know if it's a bug, but I dont understand the result.
map("result") return somthing wrong (to me), but map(~.x$result) is OK

this is a minimal example :

library(purrr)
> Warning: le package 'purrr' a été compilé avec la version R 3.4.4
list(list(result=data.frame(),error=1)) %>% map("result") 
> [[1]]
> NULL   # WHY ?
list(list(result=data.frame(),error=1)) %>%  map(~`$`(.x,"result"))
> [[1]]
> data frame with 0 columns and 0 rows
list(list(result=data.frame(),error=1)) %>%  map(~.x$result)
> [[1]]
> data frame with 0 columns and 0 rows

Any idea why ?

Regards

> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=French_France.1252  LC_CTYPE=French_France.1252   
[3] LC_MONETARY=French_France.1252 LC_NUMERIC=C                  
[5] LC_TIME=French_France.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] shiny_1.0.5     bindrcpp_0.2    forcats_0.3.0   stringr_1.3.0   dplyr_0.7.4    
 [6] purrr_0.2.4     readr_1.1.1     tidyr_0.8.0     tibble_1.4.2    ggplot2_2.2.1  
[11] tidyverse_1.2.1 tictoc_1.0      cleanser_0.1   

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.16      lubridate_1.7.2   lattice_0.20-35   rprojroot_1.3-2  
 [5] assertthat_0.2.0  digest_0.6.15     psych_1.7.8       mime_0.5         
 [9] R6_2.2.2          cellranger_1.1.0  plyr_1.8.4        backports_1.1.2  
[13] reprex_0.1.2      evaluate_0.10.1   httr_1.3.1        pillar_1.2.1     
[17] rlang_0.2.0       settings_0.2.4    lazyeval_0.2.1    readxl_1.0.0     
[21] miniUI_0.1.1      rstudioapi_0.7    callr_2.0.2       whisker_0.3-2    
[25] R.utils_2.6.0     R.oo_1.21.0       rmarkdown_1.9     devtools_1.13.5  
[29] shinyjs_1.0       foreign_0.8-69    munsell_0.4.3     broom_0.4.3      
[33] validate_0.2.0    compiler_3.4.3    httpuv_1.3.6.2    modelr_0.1.1     
[37] pkgconfig_2.0.1   mnormt_1.5-5      htmltools_0.3.6   crayon_1.3.4     
[41] withr_2.1.2       R.methodsS3_1.7.1 grid_3.4.3        nlme_3.1-131     
[45] jsonlite_1.5      xtable_1.8-2      gtable_0.2.0      magrittr_1.5     
[49] scales_0.5.0      debugme_1.1.0     cli_1.0.0         stringi_1.1.7    
[53] reshape2_1.4.3    xml2_1.2.0        tools_3.4.3       glue_1.2.0       
[57] hms_0.4.2         parallel_3.4.3    yaml_2.1.18       colorspace_1.3-2 
[61] simplemagic_0.1.0 rvest_0.3.2       memoise_1.1.0     knitr_1.20       
[65] bindr_0.1.1       haven_1.1.1 
@jennybc
Copy link
Member

jennybc commented Mar 27, 2018

I think this a bug in the C code called by pluck(), which is ultimately called by your first expression:

list(list(result=data.frame(),error=1)) %>% map("result") 

Your 0 row, 0 column data frame has length 0 here and therefore gets the default value of NULL.

return (Rf_length(x) == 0) ? missing : x;

All your other approaches use a different method to extract the element named "result". If your data frame has at least one column, map("result")does what you expect.

library(purrr)
list(list(result=data.frame(1),error=1)) %>% map("result") 
#> [[1]]
#>   X1
#> 1  1
list(list(result=data.frame(X1 = character()),error=1)) %>% map("result") 
#> [[1]]
#> [1] X1
#> <0 rows> (or 0-length row.names)
as_mapper("result")
#> function (x, ...) 
#> pluck(x, list("result"), .default = NULL)
#> <environment: 0x7fc673890298>

Created on 2018-03-27 by the reprex package (v0.2.0).

@hadley
Copy link
Member

hadley commented Mar 28, 2018

I'm not sure whether this is a bug or not - pluck() consistently considers a zero-length component in the same way as an absent component.

@jennybc
Copy link
Member

jennybc commented Mar 28, 2018

Ok yes maybe 'bug" is wrong word. I think the surprise here comes from the fact that a user probably doesn't contemplate whether map(x, "result") is achieved via pluck(.x, list("result")) or .x[["result"]]. The latter is more what I would expect.

I really hadn't internalized this before:

library(purrr)
x <- list(result = character())
pluck(x, list("result"))
#> NULL
x[["result"]]
#> character(0)

Created on 2018-03-28 by the reprex package (v0.2.0).

@hadley
Copy link
Member

hadley commented Mar 28, 2018

Yeah, I think we need better pluck docs and then make it more clear that's what map() uses.

@jennybc
Copy link
Member

jennybc commented Mar 28, 2018

It does seem sort of troubling that pluck() won't allow you to retain, say, character(0) vs NULL inside some workflow. Yet I don't have any current compelling example where this is harmful.

@VincentGuyader
Copy link
Author

VincentGuyader commented Mar 28, 2018

Hi,

This may not be a bug but in my understanding map("result") is "just" a nicer way to extract the content of all "result" element. It is rather disturbing that map modifies the content of the list.

For me map(~.x$result) should have the same behavior as map("result")

Regards

@hadley
Copy link
Member

hadley commented Mar 28, 2018

@VincentGuyader map(x, "result") is a short cut for map(x, pluck, "result"), and pluck deliberately treats all zero-length, NULL, and absent elements in exactly the same way.

@hadley
Copy link
Member

hadley commented Apr 2, 2018

Ok, now I get it:

library(purrr)
x <- list(
  list(x = letters[1:3]),
  list(x = character()),
  list(x = "z")
)
map(x, "x")
map(x, list("x", 1))

The second result should be NULL in the second case, but not the first.

@hadley
Copy link
Member

hadley commented Apr 2, 2018

Most minimal reprex:

purrr::pluck(list(numeric()), 1)
#> NULL

This should return numeric() not NULL.

@hadley hadley changed the title data.frame() become NULL by passing through purrr::map("result") pluck() coerces zero-length objects to NULL Apr 2, 2018
@hadley hadley changed the title pluck() coerces zero-length objects to NULL pluck() converts zero-length objects to NULL Apr 2, 2018
@hadley hadley added bug an unexpected problem or unintended behavior pluck 🍐 labels May 5, 2018
@lionel-
Copy link
Member

lionel- commented Nov 28, 2018

The reason for treating empty as absent is for the atomic map variants:

test_that(".default replaces elements with length 0", {
  x <- list(
    list(a = 1),
    list(a = NULL),
    list(a = numeric())
  )
  expect_equal(map_dbl(x, "a", .default = NA), c(1, NA, NA))
})

It makes sense to treat both NULL and empty as absent in that context.

I wonder if this should just be an option to pluck()? But then map_xxx() would need to somehow signal to pluck that they're plucking in non recursive context. Even if pluck had such an argument, how can we pass it through? Should typed maps call as_mapper(.f, ..., .newoption = TRUE)?

Maybe the cleanest way to solve this is to also take .default in map_xxx() and apply the default from there. This also has the advantage that .default will work with any mapped functions, not just pluckers. pluck() can still have the .default argument, it won't be passed through dots because of argument matching. Then pluck's .default could apply only on absent elements rather than all empty elements.

@lionel-
Copy link
Member

lionel- commented Nov 28, 2018

In other words, emptyness is absence in atomic context, but not in recursive context.

@hadley
Copy link
Member

hadley commented Nov 28, 2018

So it's probably somehow related to r-lib/vctrs#141

@lionel- lionel- added this to the vctrs milestone Dec 3, 2018
@PirateGrunt
Copy link

If it matters, I think this is the same issue that I raised on the RStudio community: https://community.rstudio.com/t/possibly-inconsistent-treatment-of-extractor-in-purr-map/35093. Another example (and feeble attempts at understanding) there.

@hadley
Copy link
Member

hadley commented Aug 27, 2022

I do think zero-length vectors and NULL should be treated differently, and this is a very simple fix. (Note to self: it's in the pluck-0 branch). I'll need to check revdeps to make sure this doesn't break existing code, but I'm reasonably confident we'll be ok.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug an unexpected problem or unintended behavior pluck 🍐
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants