Should unnest_longer() have keep_empty? #1339

Closed
DavisVaughan opened this issue Mar 28, 2022 · 2 comments · Fixed by #1442
Labels
feature (a feature request or enhancement); rectangling 🗄️ (converting deeply nested lists into tidy data frames)

@DavisVaughan
Member

From https://community.rstudio.com/t/unnest-longer-drops-lists-rows-with-character-0/132748

Original example:

library(tidyverse)

my_df <- tibble(
  txt = c(
    "chestnut, pear, kiwi, peanut",
    "grapes, banana"
  )
)

#Extract all nuts
my_df <- my_df %>% 
  mutate(nuts = str_extract_all(txt, regex("\\w*nut\\w*"))) %>% 
  mutate(index = row_number(), .before=1)

#Row index 2 has nuts <chr [0]>
my_df
#> # A tibble: 2 × 3
#>   index txt                          nuts     
#>   <int> <chr>                        <list>   
#> 1     1 chestnut, pear, kiwi, peanut <chr [2]>
#> 2     2 grapes, banana               <chr [0]>

#unnest
my_df_long <- my_df %>% 
  unnest_longer(nuts, values_to = "nuts_long")

#Row index 2 is now missing
my_df_long
#> # A tibble: 2 × 3
#>   index txt                          nuts_long
#>   <int> <chr>                        <chr>    
#> 1     1 chestnut, pear, kiwi, peanut chestnut 
#> 2     1 chestnut, pear, kiwi, peanut peanut

Created on 2022-03-28 by the reprex package (v2.0.1)

Minimal reprex:

library(tidyverse)

df <- tibble(
  x = list("a", character())
)
df
#> # A tibble: 2 × 1
#>   x        
#>   <list>   
#> 1 <chr [1]>
#> 2 <chr [0]>

unnest_longer(df, x)
#> # A tibble: 1 × 1
#>   x    
#>   <chr>
#> 1 a

unnest(df, x, keep_empty = TRUE)
#> # A tibble: 2 × 1
#>   x    
#>   <chr>
#> 1 a    
#> 2 <NA>

Created on 2022-03-28 by the reprex package (v2.0.1)

It may be as simple as passing keep_empty through to the unchop() call in unnest_longer(), but I'd need to think about it critically to make sure.
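
For context, here is a minimal sketch of what the existing keep_empty argument of unchop() does on the same data as the minimal reprex. It calls unchop() directly and is not the internals of unnest_longer(); the names `dropped` and `kept` are only illustrative.

```r
library(tidyr)

df <- tibble(
  x = list("a", character())
)

# Default (keep_empty = FALSE): the row holding the empty vector is dropped
dropped <- unchop(df, x)

# keep_empty = TRUE: the empty vector becomes a single missing value,
# so the row survives
kept <- unchop(df, x, keep_empty = TRUE)

nrow(dropped)
kept$x
```

If unnest_longer() forwarded the argument unchanged, it would presumably produce the same two rows for this input. That is an assumption about a possible fix, not a description of the merged change.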

DavisVaughan added the "feature" (a feature request or enhancement) and "rectangling 🗄️" (converting deeply nested lists into tidy data frames) labels on Mar 28, 2022
@hadley
Member

hadley commented Jun 11, 2022

I just noticed this absence while writing for R4DS, so I definitely think we should add keep_empty support.

@hadley
Member

hadley commented Aug 10, 2022

It's worth noting the behaviour is different with NULL:

library(tidyr)

df <- tibble(
  x1 = list("a", character()),
  x2 = list("a", NULL)
)
df |> unnest_longer(x1)
#> # A tibble: 1 × 2
#>   x1    x2       
#>   <chr> <list>   
#> 1 a     <chr [1]>
df |> unnest_longer(x2)
#> # A tibble: 2 × 2
#>   x1        x2   
#>   <list>    <chr>
#> 1 <chr [1]> a    
#> 2 <chr [0]> <NA>

Created on 2022-08-10 by the reprex package (v2.0.1)

This is somewhat related to the strict argument to unnest_wider(), because the motivation for handling NULL in this way comes from JSON.
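
Since the result depends on whether an element is NULL or a zero-length vector, one way to avoid relying on either behaviour is to replace every empty element with an explicit NA before unnesting. The sketch below assumes a character list-column. `fill_empty()` is a hypothetical helper written for this example, not a tidyr function.

```r
library(tidyr)

df <- tibble(
  x = list("a", character())
)

# Hypothetical helper (not part of tidyr): swap NULL and zero-length
# elements for a single NA so that every element has length >= 1
fill_empty <- function(col) {
  lapply(col, function(el) if (length(el) == 0) NA_character_ else el)
}

df$x <- fill_empty(df$x)

# Every element now has length 1, so no row is dropped
out <- unnest_longer(df, x)
out$x
```

This keeps one row per original row without depending on how unnest_longer() treats empty elements, at the cost of hard-coding the column type in the helper.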
