Allow for NULL in list column when unnesting? #436

Closed
kendonB opened this issue Mar 9, 2018 · 5 comments

@kendonB
commented Mar 9, 2018

library(tidyverse)
tibble(list = list(NULL, tibble(x = 1))) %>% 
  unnest()
#> Error: Each column must either be a list of vectors or a list of data frames [list]

I would have expected:

#> # A tibble: 2 x 1
#>       x
#>   <dbl>
#> 1 NA   
#> 2  1.00
@markdly

Contributor

commented Mar 25, 2018

I've been thinking about this issue too. To me, it feels like this fits in with the discussion happening over at #358 ...

@billdenney

Contributor

commented Jun 27, 2018

I have a use case for this:

I have multiple datasets in a clinical trial. Some have one or more rows for each subject; others may have zero rows for a given subject. Specifically, lab measures from blood concentrations have at least one measure for each subject, while adverse events (aka side effects) have zero or more rows per subject.

The number of rows in the data is the number of observations, which matters for adverse events: imputing an empty row would cause problems in many downstream processing steps because counting adverse events would become more complex.

What I want to do is make nested datasets for both, merge them by subject number, and be able to unnest either individually later. As an example:

library(tidyverse)
#> -- Attaching packages ---------------------------------- tidyverse 1.2.1 --
#> v ggplot2 2.2.1     v purrr   0.2.5
#> v tibble  1.4.2     v dplyr   0.7.5
#> v tidyr   0.8.1     v stringr 1.3.1
#> v readr   1.1.1     v forcats 0.3.0
#> Warning: package 'tidyr' was built under R version 3.4.4
#> Warning: package 'purrr' was built under R version 3.4.4
#> Warning: package 'dplyr' was built under R version 3.4.4
#> Warning: package 'stringr' was built under R version 3.4.4
#> -- Conflicts ------------------------------------- tidyverse_conflicts() --
#> x dplyr::filter() masks stats::filter()
#> x dplyr::lag()    masks stats::lag()
d_adverse <-
  data.frame(SUBJID = 1, AE = "nausea") %>%
  as_tibble() %>%
  nest(-SUBJID, .key = "adverse")
d_lab <-
  data.frame(SUBJID = 1:2, labname = "cholesterol") %>%
  as_tibble() %>%
  nest(-SUBJID, .key = "lab")
#> Warning: package 'bindrcpp' was built under R version 3.4.4
d_total <- full_join(d_adverse, d_lab)
#> Joining, by = "SUBJID"
d_total %>%                           
  select(-lab) %>%                      
  unnest()                              
#> Error: Each column must either be a list of vectors or a list of data frames [adverse]
@markdly

Contributor

commented Jun 28, 2018

Hi @billdenney, if you need a temporary workaround, perhaps modifying the adverse list column to replace NULL with an empty tibble could help. Unnesting then returns the original adverse event counts:

library(tidyverse) 
d_total %>% 
  mutate(adverse = map_if(adverse, is.null, ~ tibble())) %>% 
  select(-lab) %>%                      
  unnest()  
#> # A tibble: 1 x 2
#>   SUBJID AE    
#>    <dbl> <fct> 
#> 1      1 nausea
@hadley

Member

commented Jan 4, 2019

Supporting NULL values seems reasonable to me.

@hadley closed this in 64ee16b on Mar 7, 2019
@hadley

Member

commented Mar 7, 2019

Fixed with a quick hack; will hopefully naturally fall out when I rewrite unnest() to use vctrs.
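
For readers landing here later, a minimal sketch of how NULL elements behave after the vctrs-based rewrite, assuming tidyr 1.0.0 or later (where, as far as I know, `unnest()` takes an explicit column argument and gained `keep_empty`). The `id` column and the names `df`, `dropped`, and `kept` are made up for illustration and are not from the thread:

```r
library(tidyr)
library(tibble)

# Same shape as the original report, plus a hypothetical id column
# so the dropped row is easy to spot.
df <- tibble(id = 1:2, data = list(NULL, tibble(x = 1)))

# Default: a row whose list element is NULL (size 0) is dropped.
dropped <- unnest(df, data)

# keep_empty = TRUE keeps that row and fills the unnested columns with NA,
# which matches the output the original report expected.
kept <- unnest(df, data, keep_empty = TRUE)
```

The default fits the adverse-events use case above, where a subject with no events should contribute zero rows; `keep_empty = TRUE` fits the case where every input row must survive.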
