Timeouts reading JSON? #62
Comments
Just identified the source of the error yesterday, and the dev branch contains a potential patch. Would you be able to install off the current dev branch and test it out? I just ran a local test and was successful on the above examples. Current dev has a lot of changes besides this bug fix and is not yet stable. I may push a bug-fix release to CRAN to fix this issue. |
The dev branch seems to have a lot going on, and our Linux server is missing many of the new required libraries (V8, GDAL, GEOS). I was unable to get rgdal to build (missing proj_api.h) even after installing what seem to be the appropriate dev libraries. Would it be possible to back-port the fix to the main branch? |
Yeah, I’ll throw out a quick patch to the main branch. |
Haven't thrown up the patch yet. I think this bug is a little different than I thought. Still planning on a patch release; just want to make sure I actually squash the issue. |
I'm having the same problem with Medicare.gov datasets, such as: I get the same error message natecobb posted. I'm a novice to R, and to programming for that matter, so it's highly possible I'm missing the solution somewhere, but is this something I can fix, or should I start looking for alternative input methods? |
As a workaround you can pull the data directly as CSV, ie:
|
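As a rough sketch of that workaround (assuming the Chicago dataset URL from this thread), the only change is swapping the `.json` suffix for `.csv` before calling `read.socrata()`:

```r
# Sketch of the CSV workaround: replace the .json suffix with .csv and
# request the same resource as CSV. Only the sub() call is exercised
# here; the read.socrata() call needs network access and RSocrata.
json_url <- "https://data.cityofchicago.org/resource/xzkq-xp2w.json"
csv_url  <- sub("\\.json$", ".csv", json_url)

csv_url
# "https://data.cityofchicago.org/resource/xzkq-xp2w.csv"

# With RSocrata installed and network access:
# library(RSocrata)
# df <- read.socrata(csv_url)
```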
@cityofchicago @tomschenkjr does this have anything to do with N/As in the data? RJSONIO looks to require binding those with sapply. Here's a fragment from a Stack Overflow answer:

```r
json_file <- fromJSON(json_file)
json_file <- lapply(json_file, function(x) {
```

http://stackoverflow.com/questions/16947643/getting-imported-json-data-into-a-data-frame-in-r

It looks like when you export to .csv, like @natecobb is doing, it has some default way of including the N/As. |
Actually, when there is a missing value, the JSON just doesn't return the field. That is:

```json
{"name":"Doe, Jane","group":"Green","age (y)":30,"height (cm)":170,"wieght (kg)":70.1,"score":500},
{"name":"Smith, Joan","group":"Yellow","age (y)":41,"height (cm)":169,"wieght (kg)":60},
```

So the parsed records are vectors of different lengths. That will need to be reconciled before binding everything. |
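One way that reconciliation could look (a sketch, not the package's actual fix): take the union of field names across records, pad each record with NA for any field it lacks, and only then bind. The two records below are mocked stand-ins for parsed JSON (e.g. from `RJSONIO::fromJSON`):

```r
# Mocked parsed-JSON records; "score" is absent from the second one,
# mirroring how Socrata omits fields with missing values.
records <- list(
  list(name = "Doe, Jane",   group = "Green",  score = 500),
  list(name = "Smith, Joan", group = "Yellow")
)

# Union of all field names seen across records
all_fields <- unique(unlist(lapply(records, names)))

# Pad each record with NA for missing fields, then bind rows
padded <- lapply(records, function(rec) {
  rec[setdiff(all_fields, names(rec))] <- NA
  as.data.frame(rec[all_fields], stringsAsFactors = FALSE)
})
result <- do.call(rbind, padded)

result$score  # 500 for the complete record, NA for the incomplete one
```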
I think the problem emerges here:

```r
while (nrow(page) > 0) { # more to come maybe?
    query <- paste(validUrl, if(is.null(parsedUrl$query)) {'?'} else {"&"}, '$offset=', nrow(result), sep='')
    response <- getResponse(query, email, password)
    page <- getContentAsDataFrame(response)
    result <- rbind(result, page) # accumulate
}
```

In my test, the JSON result only has 1 row, so the offset is only incremented by 1 per request... which means that any large data set would time out. |
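One way the paging could advance predictably (a sketch under my own assumptions, not the package's actual fix) is to request an explicit `$limit` and step the offset by that page size rather than by however many rows happened to come back. `getPage()` below is a hypothetical stand-in for the HTTP fetch, paging over a local mock data set:

```r
# Mock data set of 25 rows standing in for a Socrata resource
mock_data <- data.frame(id = 1:25)

# Hypothetical stand-in for the HTTP fetch: returns rows
# (offset, offset + limit] of the mock data
getPage <- function(offset, limit) {
  rows <- seq_len(nrow(mock_data))
  keep <- rows > offset & rows <= offset + limit
  mock_data[keep, , drop = FALSE]
}

limit  <- 10
offset <- 0
result <- NULL
repeat {
  page <- getPage(offset, limit)
  if (nrow(page) == 0) break      # no more rows to fetch
  result <- rbind(result, page)   # accumulate
  offset <- offset + limit        # advance by the page size
}

nrow(result)  # 25
```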
I'm trying to use RSocrata to pull data from the CDC, ie:
http://dev.socrata.com/foundry/#/chronicdata.cdc.gov/ksds-npd6
or
http://www.cdc.gov/cdi/
I am unable to load any of the data sets I tried, instead ultimately timing out with an error in curl_fetch():
Other example URLs that fail:
```r
> read.socrata("https://data.cityofchicago.org/resource/xzkq-xp2w.json?$limit=500")
> read.socrata("https://sandbox.demo.socrata.com/resource/6cpn-3h7n.json")
```
Changing the json suffix to csv eliminates the timeout, but I assume that there are other ramifications of changing the returned data model.
The error occurs with the current version on CRAN on both Mac and Linux (Ubuntu 12.04); I tested a couple of weeks ago using the master branch from GitHub and got the same error. It also occurs with or without an application token. It's unclear to me whether this is a duplicate of other problems that have been reported, although I do see a mention of this error occurring randomly in #56