rbindlist should deal with NULL values #1871

Closed
renkun-ken opened this issue Oct 9, 2016 · 4 comments · Fixed by #3455
Comments

renkun-ken (Member) commented Oct 9, 2016

I'm working with a big JSON string from the web containing a list of lists, each with the same set of scalar-valued fields. However, some items have NULL values, which makes rbindlist refuse to work.

A simple example is

p <- list(list(a = 1, b = 2, x = NULL), list(a = 2, b = 3, x = 10))
rbindlist(p)

This produces the following error

Error in rbindlist(p) : 
  Column 3 of item 1 is length 0, inconsistent with first column of that item which is length 1. rbind/rbindlist doesn't recycle as it already expects each item to be a uniform list, data.frame or data.table

which is quite understandable since NULL is zero-length like numeric(0). For me, it would be nice to have an argument or something to regard such zero-length entries as NA so that I don't have to manually iterate over all items and set them to missing values.

franknarf1 (Contributor) commented Oct 9, 2016

I don't know. With a NULL there, maybe the user means that they want it to be read as a list column, like

p2 <- list(list(a = 1, b = 2, x = list(NULL)), list(a = 2, b = 3, x = list(10)))
rbindlist(p2)

   a b    x
1: 1 2 NULL
2: 2 3   10

I think if replacement by NA_something_ is your preferred interpretation for NULL in this context, best to just make a wrapper with it.
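
A minimal sketch of such a wrapper, assuming that every NULL (or otherwise zero-length) field should be read as NA. The name `rbindlist_null_as_na` is hypothetical and not part of data.table; since this issue is marked as fixed by #3455, newer versions may not need it:

```r
library(data.table)

# Hypothetical helper (not a data.table function): replace every
# zero-length field, including NULL, with NA before binding.
rbindlist_null_as_na <- function(items, ...) {
  cleaned <- lapply(items, function(item) {
    lapply(item, function(field) if (length(field) == 0L) NA else field)
  })
  rbindlist(cleaned, ...)
}

p <- list(list(a = 1, b = 2, x = NULL), list(a = 2, b = 3, x = 10))
res <- rbindlist_null_as_na(p)
print(res)
```

The plain logical `NA` is used so that the column type is decided by the non-missing values in the other items. This still iterates over all items in R, so it may be slow for millions of records.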

renkun-ken (Member, Author) commented Oct 9, 2016

The problem is that the source JSON is provided by service websites like the following:

[
  { "a": 1, "b": 2 },
  { "a": null, "b": 3 },
  { "a": 2, "b": null },
  // ... millions of records
]

MichaelChirico (Member) commented Oct 9, 2016

Is there no way to have your JSON reader interpret null as NA?

renkun-ken (Member, Author) commented Oct 9, 2016

In my case, I can fetch the JSON string and use jsonlite::fromJSON with simplifyDataFrame = TRUE, and then the data is automatically simplified to a data.frame.
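
A small sketch of that approach, assuming the payload is a JSON array of records; the inline `json` string is made up for illustration and stands in for the fetched data:

```r
library(jsonlite)
library(data.table)

# Illustrative payload: an array of records where some fields are null.
json <- '[{"a": 1, "b": 2}, {"a": null, "b": 3}, {"a": 2, "b": null}]'

# With simplifyDataFrame = TRUE the records are collapsed into a
# data.frame, and JSON null values end up as NA.
df <- fromJSON(json, simplifyDataFrame = TRUE)

# Convert to a data.table for further work.
dt <- as.data.table(df)
print(dt)
```

This moves the NULL handling into the JSON reader, so rbindlist is not needed at all for this case.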
