Closed
Labels
feature (a feature request or enhancement), rectangling 🗄️ (converting deeply nested lists into tidy data frames)
Description
Copying notes from lingering file on my desktop:
# Some interesting challenges from Jenny
# https://github.com/tidyverse/googledrive/blob/519452fc4d3257354079324e4afede777604848f/data-raw/discovery-doc-prep.R#L118-L124
library(tidyverse)
df <- tibble(
  g = c("a", "a", "b"),
  x = c(1, 3, 5),
  y = c("x", "y", "z")
)
df
# One missing case: a list-column of scalars, e.g. z = list(1, 2, 3)
# There you just want to simplify the list-column to a vector
# rows and cols change; current nest behaviour
# df %>% nest(x, y, .key = "data")
nest_both <- tribble(
  ~g, ~data,
  "a", tibble(x = c(1, 3), y = c("x", "y")),
  "b", tibble(x = 5, y = "z")
)
nest_both
# rows change; cols don't
# get list of vectors
# df %>% nest_rows(x, y)
nest_rows <- tribble(
  ~g, ~x, ~y,
  "a", c(1, 3), c("x", "y"),
  "b", 5, "z"
)
# cols change; rows don't
# use lists, not tibbles to convey intent.
# df %>% nest_cols(x, y, .key = "data")
nest_cols <- tribble(
  ~g, ~data,
  "a", list(x = 1, y = "x"),
  "a", list(x = 3, y = "y"),
  "b", list(x = 5, y = "z")
)
nest_cols
# unnest ------------------------------------------------------------------
# All of these should be able to automatically determine the
# unnested direction: data frame = both; named vector = col;
# unnamed vector = row; anything else or mix = error.
unnest(nest_both, data)
unnest(nest_cols, data)
unnest(nest_rows, x, y)
# Lengths must be consistent within a row (otherwise would have to cross?)
# proposed: nest_rows %>% unnest_row()
nest_rows %>% unnest(x, y)
# bind_rows() handles name/type consistency
# proposed: nest_cols %>% unnest_col()
nest_cols %>%
  mutate(data = data %>% map(as_tibble)) %>%
  unnest(data)
# What happens if we try to do the "wrong" direction?
nest_rows %>% unnest_col()
nest_cols %>% unnest_row()
# needs column names
# can you supply multiple columns? (yes, but how to supply names? provide numbers by default?)
# can you provide maximum number? need to handle potential raggedness
# (this is starting to feel like separate())
nest_rows %>% unnest_col()
# needs option to capture names
# how to manage types of data col? here would be mix of character and integer
# use purrr::simplify? uses unlist() but guarantees length will be ok
nest_cols %>% unnest(data)
nest_cols %>% unnest(.id = "name") # not picking up name
# would simplify to integer or die trying?
nest_cols %>% unnest(.id = "name", .type = "integer")
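
The direction rule in the unnest notes (data frame = both; named vector = col; unnamed vector = row; anything else or mix = error) could be sketched in base R roughly as follows. `guess_unnest_direction()` is a hypothetical helper name, not a tidyr function, and treating a partially named vector as "col" is an assumption made only for illustration.

```r
# Hypothetical helper (not part of tidyr): classify a list-column by the
# rule in the notes: data frame = "both", named vector = "col",
# unnamed vector = "row", anything else or a mix = error.
guess_unnest_direction <- function(col) {
  stopifnot(is.list(col))

  classify <- function(el) {
    if (is.data.frame(el)) {
      "both"
    } else if (is.vector(el) && !is.null(names(el))) {
      # Assumption: any names at all count as "named"
      "col"
    } else if (is.vector(el)) {
      "row"
    } else {
      "other"
    }
  }

  kinds <- unique(vapply(col, classify, character(1)))
  if (length(kinds) != 1 || identical(kinds, "other")) {
    stop("Can't determine unnest direction: elements are mixed or unsupported.")
  }
  kinds
}

# Plain-list stand-ins for the list-columns in nest_both, nest_cols, nest_rows
both_col <- list(data.frame(x = c(1, 3)), data.frame(x = 5))
cols_col <- list(list(x = 1, y = "x"), list(x = 3, y = "y"))
rows_col <- list(c(1, 3), 5)

guess_unnest_direction(both_col)
guess_unnest_direction(cols_col)
guess_unnest_direction(rows_col)
```

Checking the whole column before choosing a direction keeps the "mix = error" case explicit, rather than letting the first element decide.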