Let's introduce unnest_wider() in TidierData.jl! #119

Closed
atantos opened this issue Aug 23, 2023 · 1 comment

atantos commented Aug 23, 2023

Hey there!

In R's tidyverse, the unnest_wider() function provides a convenient way to spread the contents of a list-column (a column whose elements are arrays or lists of values) across multiple new columns. Consider a DataFrame named test and the result we would like to obtain:

test = DataFrame(a = [1,2], b = [["c","d"],["e", "f"]])

# result
result = DataFrame(a = [1,2], b_1 = ["c" , "e"], b_2 = ["d", "f"])

With R's tidyverse, we would write:

> library(tidyverse)

> data <- tibble(
  a = c(1,2),
  b = list(c("c","d"), c("e", "f"))
)

> data_wide <- data %>% 
  unnest_wider(b, names_sep = "_")

> data_wide
# A tibble: 2 × 3
      a b_1   b_2  
  <dbl> <chr> <chr>
1     1 c     d    
2     2 e     f    

A similar result can be achieved in Julia with the DataFrames.jl package, in a more Julia-idiomatic way. First, we define a function, split_uniformly(), that turns each element of the column into a NamedTuple. Then we pass this function to transform() through the DataFrames.jl minilanguage, using AsTable as the destination so that the NamedTuple fields become columns. Note that the new columns are named b1 and b2 here, rather than b_1 and b_2:

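julia> using DataFrames, Chain  # Chain.jl provides @chain

julia> test = DataFrame(a = [1,2], b = [["c","d"],["e", "f"]])

julia> function split_uniformly(v)
    # Assumes every element of v has the same length as the first one.
    n = length(first(v))
    # Build one NamedTuple per row, with fields b1, ..., bn.
    [NamedTuple(Symbol.("b", 1:n) .=> Tuple(amem))
     for amem in v]
end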

julia> test_wide = @chain test begin
    transform(:b => split_uniformly => AsTable)
    select(Not(:b))
end
2×3 DataFrame
 Row │ a      b1      b2     
     │ Int64  String  String 
─────┼───────────────────────
   1 │     1  c       d
   2 │     2  e       f
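The pipeline above hard-codes the prefix b and yields b1 and b2. As a rough sketch of how a names_sep-style option might look, the helper below builds the column names from the source column name and a separator. The name unnest_wider_sketch and its names_sep keyword are assumptions made for illustration; they are not part of TidierData.jl or DataFrames.jl. The sketch assumes every element of the column has the same length, and it appends the new columns at the end of the table rather than at the position of the original column.

```julia
using DataFrames

# Hypothetical helper for illustration only; not an existing TidierData.jl
# or DataFrames.jl function.
function unnest_wider_sketch(df::AbstractDataFrame, col::Symbol;
                             names_sep::AbstractString = "_")
    values = df[!, col]
    n = length(first(values))
    # Refuse ragged input instead of silently dropping or padding elements.
    all(v -> length(v) == n, values) ||
        throw(ArgumentError("all elements of :$col must have length $n"))

    # Copy every column except `col`, then add one new column per position.
    out = select(df, Not(col))
    for i in 1:n
        out[!, Symbol(col, names_sep, i)] = [v[i] for v in values]
    end
    return out
end

test = DataFrame(a = [1, 2], b = [["c", "d"], ["e", "f"]])
test_wide = unnest_wider_sketch(test, :b)   # columns: a, b_1, b_2
```

Calling unnest_wider_sketch(test, :b; names_sep = "") would instead reproduce the b1 and b2 names from the pipeline above.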

atantos commented Aug 23, 2023

A more appropriate place to open this issue is TidierData.jl.

atantos closed this as completed Aug 23, 2023