-
Notifications
You must be signed in to change notification settings - Fork 373
Closed
Labels
Milestone
Description
I have a data frame that contains multiple groups of columns and I want to stack them up. If there were only one group, then the regular stack function would have worked. But, I have two groups as shown in the following MWE:
df = DataFrame(x=1:3, y1=4:6, y2=7:9, z1=10:12, z2=13:15)So the data looks like this:
│ Row │ x │ y1 │ y2 │ z1 │ z2 │
│ │ Int64 │ Int64 │ Int64 │ Int64 │ Int64 │
├─────┼───────┼───────┼───────┼───────┼───────┤
│ 1 │ 1 │ 4 │ 7 │ 10 │ 13 │
│ 2 │ 2 │ 5 │ 8 │ 11 │ 14 │
│ 3 │ 3 │ 6 │ 9 │ 12 │ 15 │
And, I need to transform it to something like this.
│ Row │ x │ order │ y │ z │
│ │ Int64 │ Int64 │ Int64 │ Int64 │
├─────┼───────┼───────┼───────┼───────┤
│ 1 │ 1 │ 1 │ 4 │ 10 │
│ 2 │ 1 │ 2 │ 7 │ 13 │
│ 3 │ 2 │ 1 │ 5 │ 11 │
│ 4 │ 2 │ 2 │ 8 │ 13 │
│ 5 │ 3 │ 1 │ 6 │ 12 │
│ 6 │ 3 │ 2 │ 9 │ 15 │
My current solution is somewhat manual:
df1 = select(df, :x, :x => (_->1) => :order, :y1 => :y, :z1 => :z)
df2 = select(df, :x, :x => (_->2) => :order, :y2 => :y, :z2 => :z)
vcat(df1, df2)Is it possible to make it more generic since it's really a stacking function?
Just an idea:
stack(df, [[:y1, :z1] => 1, [:y2, :z2] => 2]; variable_name = :order, value_names = [:y, :z])The values 1 and 2 above are really group names for the columns. In this case, the group is called order.