rbindlist() idcol returns garbage ids for lists containing unequal-length vectors #3785

Closed
shrektan opened this issue Aug 21, 2019 · 0 comments · Fixed by #3786

@shrektan
Member

I think it's a bug. The id is assigned based on the length of each sub-element's first vector. It should instead be based on the maximum length among all the vectors of that sub-element.
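
To make the expected behaviour concrete, here is a minimal base-R sketch of the id vector I would expect idcol to produce. `expected_id` is a hypothetical helper written for illustration only; it is not part of data.table and makes no claim about how rbindlist() computes ids internally. It assumes each item contributes as many rows as its longest vector.

```r
# Hypothetical helper (not part of data.table): the ids one would expect
# from idcol when every item expands to the length of its longest vector.
expected_id <- function(l) {
  # Rows contributed by each item = length of its longest vector.
  # max(0L, ...) keeps the result an integer even for an empty item.
  n_rows <- vapply(l, function(item) max(0L, lengths(item)), integer(1))
  rep(seq_along(l), times = n_rows)
}

# Same shape as the failing case below, but with only 3 items
items <- lapply(1:3, function(i) list(i, 1:2, 2:3))
ids <- expected_id(items)
print(ids)
#> [1] 1 1 2 2 3 3
```

Each id appears twice because every item's longest vector has length 2, regardless of the length-1 vector in first position.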

library(data.table)
x <- 1:1000

# TAG should read 1, 1, 2, 2, ...; notice the garbage in the last few TAGs
out1 <- lapply(x, function(.) {
  list(., 1:2, 2:3)
})
out1 <- rbindlist(out1, idcol = 'TAG')
out1
#>       TAG   V1 V2 V3
#>    1:   1    1  1  2
#>    2:   2    1  2  3
#>    3:   3    2  1  2
#>    4:   4    2  2  3
#>    5:   5    3  1  2
#>   ---               
#> 1996:   2  998  2  3
#> 1997:  30  999  1  2
#> 1998:  17  999  2  3
#> 1999:   4 1000  1  2
#> 2000:  20 1000  2  3

# with data.table items there is no problem, because the column lengths are unified (recycled) first
out2 <- lapply(x, function(.) {
  data.table(., 1:2, 2:3)
})
out2 <- rbindlist(out2, idcol = 'TAG')
out2
#>        TAG    . V2 V3
#>    1:    1    1  1  2
#>    2:    1    1  2  3
#>    3:    2    2  1  2
#>    4:    2    2  2  3
#>    5:    3    3  1  2
#>   ---                
#> 1996:  998  998  2  3
#> 1997:  999  999  1  2
#> 1998:  999  999  2  3
#> 1999: 1000 1000  1  2
#> 2000: 1000 1000  2  3

# putting the unequal-length vector last is no problem either
out3 <- lapply(x, function(.) {
  list(1:2, 2:3, .)
})
out3 <- rbindlist(out3, idcol = 'TAG')
out3
#>        TAG V1 V2   V3
#>    1:    1  1  2    1
#>    2:    1  2  3    1
#>    3:    2  1  2    2
#>    4:    2  2  3    2
#>    5:    3  1  2    3
#>   ---                
#> 1996:  998  2  3  998
#> 1997:  999  1  2  999
#> 1998:  999  2  3  999
#> 1999: 1000  1  2 1000
#> 2000: 1000  2  3 1000

Created on 2019-08-21 by the reprex package (v0.2.1)
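
Based on the second case above, a possible workaround until this is fixed is to convert each item to a data.table before binding, so that the vector lengths are unified per item first. This is only a sketch on a 3-item version of the data; I am assuming `as.data.table()` recycles the length-1 vector the same way `data.table()` does in the reprex.

```r
library(data.table)

# Same shape as the failing case, reduced to 3 items for readability
items <- lapply(1:3, function(i) list(i, 1:2, 2:3))

# Convert every plain list to a data.table first, so all columns within
# an item already have equal length when rbindlist() assigns the ids
fixed <- rbindlist(lapply(items, as.data.table), idcol = "TAG")
print(fixed$TAG)
```

The extra `lapply()` costs one conversion per item, which may matter for very long lists.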
