Failing CRAN checks - fix by 4/21 to keep on CRAN #166

nicklucius · 2019-04-14T17:01:16Z

See the errors here: https://cran.r-project.org/web/checks/check_results_RSocrata.html

tomschenkjr · 2019-04-15T05:24:08Z

Wow! Where did this come from? Any idea why this popped up?

nicklucius · 2019-04-15T12:41:18Z

At this point I’m not sure! I just saw the email and haven’t looked closely yet.

geneorama · 2019-04-15T14:48:31Z

I didn't have to look far to find the problem. The very first test in tests\testthat\test-all.R is failing:

test_that("read Socrata CSV is compatible with posixify", {
  df <- read.socrata('http://soda.demo.socrata.com/resource/4334-bgaj.csv')
  dt <- posixify("09/14/2012 10:38:01 PM")
  expect_equal(dt, df$Datetime[1])  ## Check that download matches test
})

At first glance, it appears that Socrata has changed the column names in the CSV version to be consistent with the JSON version, and it also appears that they've made the date-time formats the same:

> readLines("http://soda.demo.socrata.com/resource/4334-bgaj.csv?$WHERE=earthquake_id='10555601'", n=2)
[1] "\"datetime\",\"depth\",\"earthquake_id\",\"location\",\"magnitude\",\"number_of_stations\",\"region\",\"source\",\"version\""
[2] "\"2012-09-10T13:16:13.000\",\"11.60\",\"10555601\",\"(63.1085, -151.4938)\",\"1.1\",\"10\",\"Central Alaska\",\"ak\",\"2\""  
> readLines("http://soda.demo.socrata.com/resource/9szf-fbd4.json?$WHERE=earthquake_id='10555601'", n=2)
[1] "[{\"datetime\":\"2012-09-10T13:16:13.000\",\"depth\":\"11.60\",\"earthquake_id\":\"10555601\",\"location\":{\"type\":\"Point\",\"coordinates\":[-151.4938,63.1085]},\"magnitude\":\"1.1\",\"number_of_stations\":\"10\",\"region\":\"Central Alaska\",\"source\":\"ak\",\"version\":\"2\"}]"

Normally the date-times would not be in the same format, would they? Both now come back as 2012-09-10T13:16:13.000.
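For anyone who wants to check both date-time styles side by side, here is a minimal base R sketch. The helper name parse_datetime_either is hypothetical and is not RSocrata's posixify(); the UTC time zone is an assumption, and %p (AM/PM) parsing depends on an English-style locale.

```r
# Hypothetical helper (not part of RSocrata): try the ISO 8601 style first,
# then fall back to the older "MM/DD/YYYY hh:mm:ss AM/PM" style.
parse_datetime_either <- function(x, tz = "UTC") {
  x <- as.character(x)
  # e.g. "2012-09-10T13:16:13.000"; %OS accepts fractional seconds on input
  iso <- as.POSIXct(x, format = "%Y-%m-%dT%H:%M:%OS", tz = tz)
  # e.g. "09/14/2012 10:38:01 PM"; %p is locale dependent
  old <- as.POSIXct(x, format = "%m/%d/%Y %I:%M:%S %p", tz = tz)
  # Fill in only the entries the ISO format could not parse
  missing_iso <- is.na(iso)
  iso[missing_iso] <- old[missing_iso]
  iso
}

new_style <- parse_datetime_either("2012-09-10T13:16:13.000")
old_style <- parse_datetime_either("09/14/2012 10:38:01 PM")
```

Anything that matches neither format stays NA, so a further format change would show up as missing values rather than as a silently wrong time.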

geneorama · 2019-04-15T16:19:08Z

I should have said "I didn't have to look far to find the first problem", because that was just the first test.

Looking at the next test, "read Socrata CSV as default", the problem seems to be that the column types have changed. We're expecting this:

> c("character", "character", "character", "POSIXct", "numeric", 
+                  "numeric", "integer", "character", "character")
[1] "character" "character" "character" "POSIXct"   "numeric"   "numeric"   "integer"   "character"
[9] "character"

but we're getting this:

> unname(sapply(sapply(df, class),`[`, 1))
[1] "POSIXct"   "numeric"   "character" "character" "numeric"   "integer"   "character" "character"
[9] "character"

The new classes look right based on the data in

df <- read.socrata('https://soda.demo.socrata.com/resource/4334-bgaj.csv')
> head(df)
             datetime depth earthquake_id             location magnitude number_of_stations
1 2012-09-14 22:38:01   7.6      00388610 (41.1085, -117.6135)       2.7                 15
2 2012-09-14 22:14:45  10.6      15215753  (34.525, -118.1527)       1.5                 35
3 2012-09-14 22:14:21   0.0      71842370 (38.8023, -122.7685)       1.4                 21
4 2012-09-14 22:10:19   8.2      00388609 (36.9447, -117.6778)       1.5                 29
5 2012-09-14 22:06:11   6.4      00388607 (36.9417, -117.6903)       1.7                 29
6 2012-09-14 21:28:55  20.0      12258012  (19.7859, -64.0849)       3.1                  6
                       region source version
1                      Nevada     nn       9
2         Southern California     ci       0
3         Northern California     nc       0
4          Central California     nn       9
5          Central California     nn       9
6 north of the Virgin Islands     pr       0

I'm inclined to update the reference column classes to reflect the correct column types.
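One way to make such a check less sensitive to field order is to compare classes by column name instead of by position. The sketch below is only an illustration under assumptions: it uses a hand-built one-row stand-in for the data frame shown above (no network call), and the expected classes are the ones printed in the "getting this" output.

```r
# Stand-in for the downloaded data (assumption: mirrors the head(df) output above)
df <- data.frame(
  datetime = as.POSIXct("2012-09-14 22:38:01", tz = "UTC"),
  depth = 7.6,
  earthquake_id = "00388610",
  location = "(41.1085, -117.6135)",
  magnitude = 2.7,
  number_of_stations = 15L,
  region = "Nevada",
  source = "nn",
  version = "9",
  stringsAsFactors = FALSE
)

# First class of each column, as a character vector named by column
actual <- vapply(df, function(col) class(col)[1], character(1))

# Reference classes keyed by column name rather than by position
expected <- c(
  datetime = "POSIXct", depth = "numeric", earthquake_id = "character",
  location = "character", magnitude = "numeric", number_of_stations = "integer",
  region = "character", source = "character", version = "character"
)

# Reordered columns still pass; a renamed or retyped column does not
classes_match <- identical(actual[names(expected)], expected)
```

With this shape, a pure reordering of fields on the API side would leave classes_match TRUE, while a renamed field would produce an NA lookup and fail the comparison.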

The decisions on how JSON was parsed in RSocrata go back to several other issues:

Issue "Error in rbind(deparse.level, ...)" #19
Issue "Timeouts reading JSON?" #62
Issue "Fixes #19 JSON download error" #102, which was also addressed in "Closes #107 - Convert logical fields to character for JSON API" #108 and "fixes #15 - provide default sort order if not supplied in URL" #109

nicklucius · 2019-04-18T22:35:15Z

Thanks, @geneorama!

I pushed a few changes to fix the remaining errors:

bcc9164
35e16bd

It looks like this was all caused by two apparent changes to the Socrata API:

1. Field name format
2. Field order

I believe I've seen a recent ticket with Socrata about the field order from the API not matching the order in the dataset. Could item 2 above be related to this? In any case, I think at this point we can do a PR to master and a new release.

geneorama · 2019-04-19T22:13:05Z

@nicklucius thank you for the additional fixes. It's strange, I don't think those tests were failing when I put in my changes the other day. I wonder if they're making more changes on the back end. Also, I too wonder if this is related to the ticket I opened.

Hot patch to fix #166

nicklucius added a commit that referenced this issue Apr 18, 2019

Update Version (#166) to reflect 3rd build of hotfix

ea868a1

geneorama closed this as completed in 464c009 Apr 22, 2019

geneorama added a commit that referenced this issue Apr 22, 2019

Merge pull request #168 from Chicago/issue166

d8a7663

Hot patch to fix #166

geneorama mentioned this issue Apr 26, 2019

Failing CRAN checks - fix by 5/8 to keep on CRAN #169

Closed
