Adding list column to one row data table #3626

Closed
jakob-r opened this issue Jun 4, 2019 · 7 comments · Fixed by #3925

Comments

@jakob-r commented Jun 4, 2019

I cannot add a list column to a data.table with one row:

library(data.table)
dt = data.table(a = 1)
list_column = list(list(a = 1, b = 2))
# does not work:
dt$b = list_column
Error in set(x, j = name, value = value) : 
  Supplied 2 items to be assigned to 1 items of column 'b'. If you wish to 'recycle' the RHS please use rep() to make this intent clear to readers of your code.
dt[, b := list_column]
Error in `[.data.table`(dt, , `:=`(b, list_column)) : 
  Supplied 2 items to be assigned to 1 items of column 'b'. If you wish to 'recycle' the RHS please use rep() to make this intent clear to readers of your code.

The following workaround works and produces the desired result:

dtb = data.table(b = list_column)
cbind(dt, dtb)
       a      b
   <num> <list>
1:     1 <list>

@MichaelChirico (Member) commented Jun 4, 2019

could you confirm you are on the most recent master (share sessionInfo())?

a related bug was fixed recently

@jangorecki (Member) commented Jun 4, 2019

If you are using the devel version from our devel drat repo, please provide the output of:

data.table:::.git()

@jakob-r (Author) commented Jun 4, 2019

Sorry, I always tend to forget. But yes, I just installed the latest version from GitHub.

> sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Arch Linux

Matrix products: default
BLAS:   /usr/lib/libblas.so.3.8.0
LAPACK: /usr/lib/liblapack.so.3.8.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] colorout_1.2-1    usethis_1.5.0     devtools_2.0.2    data.table_1.12.3

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.1        ps_1.3.0          prettyunits_1.0.2 rprojroot_1.3-2  
 [5] digest_0.6.19     crayon_1.3.4      withr_2.1.2       assertthat_0.2.1 
 [9] R6_2.4.0          backports_1.1.4   magrittr_1.5      rlang_0.3.4      
[13] cli_1.1.0         fs_1.3.1          remotes_2.0.4     testthat_2.1.1   
[17] callr_3.2.0       desc_1.2.0        tools_3.6.0       glue_1.3.1       
[21] compiler_3.6.0    pkgload_1.0.2     parallel_3.6.0    processx_3.3.1   
[25] pkgbuild_1.0.3    sessioninfo_1.1.1 memoise_1.1.0    

And the git version

> utils::packageDescription("data.table")
Package: data.table
Version: 1.12.3
...
ByteCompile: TRUE
RemoteType: github
RemoteHost: api.github.com
RemoteRepo: data.table
RemoteUsername: Rdatatable
RemoteRef: master
RemoteSha: 421f672f72c8e8749a71cda121925e8f5be84242
GithubRepo: data.table
GithubUsername: Rdatatable
GithubRef: master
GithubSHA1: 421f672f72c8e8749a71cda121925e8f5be84242
NeedsCompilation: yes
Packaged: 2019-06-04 09:58:30 UTC; ...
...
Built: R 3.6.0; x86_64-pc-linux-gnu; 2019-06-04 09:58:33 UTC; unix

@jangorecki (Member) commented Jun 4, 2019

Adding an extra list() call helps, as always in such cases:

library(data.table)
dt = data.table(a = 1)
list_column = list(list(a = 1, b = 2))
dt$b = list(list_column)
dt

@jakob-r (Author) commented Jun 4, 2019

So would you say it's a bug, or should I always add list columns with an extra list()?

@jangorecki (Member) commented Jun 4, 2019

Yes, it is a bug that it doesn't work for a 1-row data.table while it does work for a data.table with 2+ rows.
Thank you for reporting.

@jangorecki changed the title from "Bug: Adding list column to one row data table" to "Adding list column to one row data table" on Jun 4, 2019
@jangorecki added the bug label on Jun 4, 2019
@jangorecki added this to the 1.12.4 milestone on Jun 4, 2019

@jangorecki (Member) commented Jul 30, 2019

I am not sure if we can deal with it nicely.
Because an internal helper wraps the RHS (the value to assign) in an extra list, we end up with some ambiguous edge cases.

d = data.table

# ordinary syntax
d(a=1L)[, newcol := list(2L)][]
# because of helper we can drop `list` in RHS, and we still get the same outcome
d(a=1L)[, newcol := 2L][]
# same is true for 2+ rows DT, note that RHS is recycled
d(a=1:2)[, newcol := list(2L)][]
d(a=1:2)[, newcol := 2L][]

# problem starts where we want to have a list column to be used in RHS
# adding extra `list` to both calls will make outcome different
d(a=1L)[, newcol := list(list(2L))][]  ## list column
d(a=1L)[, newcol := list(2L)][]        ## int column
# it is the same for 1 row and 2+ rows DT

# when RHS list column has 1+ elements, it is interpreted as multiple columns to assign
# thus error reported by Jakob
d(a=1L)[, newcol := list(list(2L, 3L))][]  ## error
d(a=1L)[, newcol := list(2L, 3L)][]        ## error
d(a=1:2)[, newcol := list(list(2L, 3L))][]  ## list column
d(a=1:2)[, newcol := list(2L, 3L)][]        ## list column

I don't know if there is a way to reliably identify when the user wants a list column, or to just use a syntax that doesn't require our internal list-wrapping helper.
Best practice would be to expect the user to always use an extra list() when working with a list column on the RHS, but that would be a breaking change for data.tables with 2+ rows, where this currently works.
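
The ambiguity described above can be sketched in base R without data.table. This is only an illustration under stated assumptions: wrap_rhs() is a hypothetical stand-in for the internal helper, not data.table's actual code, and it assumes the rule "wrap a non-list RHS in list(); pass a list RHS through unchanged".

```r
# Hypothetical helper (NOT data.table's internal code): wrap a non-list RHS
# in list() so that each element of the result stands for one column.
wrap_rhs <- function(value) {
  if (is.list(value)) value else list(value)
}

# Atomic RHS: gets wrapped, so it is unambiguously one column with one value.
str(wrap_rhs(2L))

# List RHS: passed through unchanged. The helper sees two elements and has no
# way to know whether the user meant "one list column with two cells".
rhs <- wrap_rhs(list(2L, 3L))
length(rhs)

# An explicit outer list() removes the ambiguity: one element (one column)
# whose value is itself a list with two cells.
rhs <- wrap_rhs(list(list(2L, 3L)))
length(rhs)
length(rhs[[1L]])
```

Under this assumed rule, only the explicitly wrapped form carries enough information to tell a list column apart from several values, which is why always writing the extra list() is the unambiguous option.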
