Closed
Labels
feature: a feature request or enhancement
pivoting ♻️: pivot rectangular data to different "shapes"
Description
I am increasingly confronted with messy spreadsheets that have duplicate column names and need to be tidied up, as for example in this SO question.
tidyr::gather simply refuses to do it (it throws an error), while reshape::melt and reshape2::melt return the wrong numbers without any warning. The data.table version of melt works as intended, but only when the input is a data.table rather than a data.frame.
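Since the silent failures are the dangerous case, a cheap guard is to check for duplicated names before reshaping. The sketch below uses base R only; `check_unique_names` and the `messy` data frame are hypothetical names introduced for illustration, not part of tidyr, reshape2, or data.table.

```r
# Hypothetical helper: stop early if a data frame has duplicated column names,
# instead of letting a reshaping function silently drop or misread columns.
check_unique_names <- function(df) {
  dup <- unique(names(df)[duplicated(names(df))])
  if (length(dup) > 0) {
    stop("Duplicated column names: ", paste(dup, collapse = ", "), call. = FALSE)
  }
  invisible(df)
}

# Illustrative data with one repeated name.
messy <- data.frame(a = 1, b = 2, c = 3)
names(messy) <- c("x", "y", "x")

# Capture the error message rather than aborting the script.
res <- tryCatch(check_unique_names(messy), error = function(e) conditionMessage(e))
res
```

Calling the guard first at least turns a wrong result into a visible error.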
Here is a minimal reprex:
# Reprex
d <- data.frame(Group = c("A", "B"),
rbind(c(0, 0, 5, 5),
c(0, 0, 10, 10)))
colnames(d) <- c("Group", "Var1", "Var2", "Var1", "Var2")
# Dataframe with duplicate column names --> quite frequent situation in messy spreadsheets...
d
#>   Group Var1 Var2 Var1 Var2
#> 1     A    0    0    5    5
#> 2     B    0    0   10   10
# With tidyr we get an error message: definitely better than getting the
# wrong numbers...
tidyr::gather(d,,,-1)
#> Error: Can't bind data because some arguments have the same name
# With reshape and reshape2: wrong results (0 everywhere, the 5 and 10 values have disappeared)
reshape::melt(d, id.vars = 1)
#>   Group variable value
#> 1     A     Var1     0
#> 2     B     Var1     0
#> 3     A     Var2     0
#> 4     B     Var2     0
#> 5     A     Var1     0
#> 6     B     Var1     0
#> 7     A     Var2     0
#> 8     B     Var2     0
reshape2::melt(d, id.vars = 1)
#>   Group variable value
#> 1     A     Var1     0
#> 2     B     Var1     0
#> 3     A     Var2     0
#> 4     B     Var2     0
# data.table::melt fails similarly when we work on a data.frame
# but provides exactly the intended result if we work on a data.table
data.table::melt(d, id.vars = 1)
#>   Group variable value
#> 1     A     Var1     0
#> 2     B     Var1     0
#> 3     A     Var2     0
#> 4     B     Var2     0
data.table::melt(data.table::as.data.table(d), id.vars = 1)
#>    Group variable value
#> 1:     A     Var1     0
#> 2:     B     Var1     0
#> 3:     A     Var2     0
#> 4:     B     Var2     0
#> 5:     A     Var1     5
#> 6:     B     Var1    10
#> 7:     A     Var2     5
#> 8:     B     Var2    10
# base::stack provides the right values, but good luck with the other columns ...
stack(d[,-1])
#>   values    ind
#> 1      0   Var1
#> 2      0   Var1
#> 3      0   Var2
#> 4      0   Var2
#> 5      5 Var1.1
#> 6     10 Var1.1
#> 7      5 Var2.1
#> 8     10 Var2.1

Created on 2018-06-25 by the reprex package (v0.2.0).
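For reference, the result I am after can be sketched in base R by addressing the measure columns by position instead of by name, so duplicated names never get a chance to collide. `melt_by_position` is a hypothetical helper written for this illustration, not a function from any of the packages above, and it assumes all measure columns share a compatible type.

```r
# Hypothetical helper: melt a data frame column by column, using positions
# so that duplicated column names are kept as-is in the `variable` column.
melt_by_position <- function(df, id_cols = 1) {
  measure_cols <- setdiff(seq_along(df), id_cols)
  pieces <- lapply(measure_cols, function(j) {
    data.frame(
      df[id_cols],               # id columns, repeated for each measure column
      variable = names(df)[j],   # original (possibly duplicated) name
      value = df[[j]],           # values fetched by position, not by name
      stringsAsFactors = FALSE
    )
  })
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

# Same data as in the reprex.
d <- data.frame(Group = c("A", "B"),
                rbind(c(0, 0, 5, 5),
                      c(0, 0, 10, 10)))
colnames(d) <- c("Group", "Var1", "Var2", "Var1", "Var2")

long <- melt_by_position(d, id_cols = 1)
long
```

On the reprex data this keeps the 5 and 10 values and the `Group` column, in the same row order as the data.table output above.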