[Request] Allow NULL columns in data.table() #2305

Closed
franknarf1 opened this issue Aug 16, 2017 · 1 comment · Fixed by #3455

@franknarf1 franknarf1 commented Aug 16, 2017

This could be either a question or a request. The question would be: "Are NULL columns allowed in a data.table?" The request:

I was looking at #2303 and noticed this error:

library(data.table)
data.table(character(0), NULL)
# Error in data.table(character(0), NULL) : column or argument 2 is NULL

It would arguably be nice for data.table() to allow the construction of a table like this.

as.data.table.list allows it with no error:

DT = as.data.table(list(y = character(0), x = NULL))

NULL is idiomatic as a column value for grouped operations and (mostly) rbindlist:

data.table(g = 1:2)[, if (.GRP == 1L) Sys.Date(), by=g]
rbind(DT, list(y = "a", x = 12))

# also works, albeit with a warning
data.table(g = 1:2)[, .(v = c("yo", "zo"), z = if (.GRP == 1L) Sys.Date()), by=g] 
# fails, with an informative error message
rbind(DT, list(y = "a", x = Sys.Date())) 

And finally, the "wildcard" behavior of NULL in grouped operations and rbindlist is built upon in packages like vetr (see the "alike" vignette):

library(vetr)
alike(DT, data.table(y = "a", x = "b")) # TRUE
alike(DT, data.table(y = "a", x = 12)) # TRUE

I may be overlooking some arguments against it. The main ones that come to mind are:

  • There is no way to add such a NULL column by reference.
  • Does this have any meaning or use for tables with rows?
  • There are bugs (like the one I linked at the top) that go away if NULL columns aren't a thing (?).

If it's decided that data.tables shouldn't have NULL columns (which would also be fine), then I guess as.data.table.list(x) may need to change, either to throw an error or to drop the NULL elements of x even when max(lengths(x)) == 0.
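
For illustration, the "drop the NULL elements" option could look roughly like the following in base R. This is only a sketch: drop_null_columns is a hypothetical helper name, not a data.table function, and it says nothing about how as.data.table.list is actually implemented.

```r
# Hypothetical helper (not part of data.table): remove NULL elements
# from a list before it is converted to a table.
drop_null_columns <- function(x) {
  # vapply returns one logical per element; keep the non-NULL ones
  x[!vapply(x, is.null, logical(1))]
}

x <- list(y = character(0), x = NULL)
cleaned <- drop_null_columns(x)
names(cleaned)   # "y"
```

The filter does not depend on the lengths of the remaining elements, so it behaves the same whether or not max(lengths(x)) == 0.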

@franknarf1 franknarf1 changed the title [Request] Allow NULL data.table columns in data.table() [Request] Allow NULL columns in data.table() Aug 16, 2017
@mattdowle mattdowle added this to the 1.12.2 milestone Mar 14, 2019

@mattdowle mattdowle commented Mar 20, 2019

Yep: data.tables should not have NULL columns. Every column needs to have a type.
I have added some catches and tests. The things you mention, like the grouping example, should still work (I checked that it's tested). If not, please raise a new issue.

mattdowle added a commit that referenced this issue Mar 20, 2019