flattening object fields
chelm committed May 6, 2015
2 parents 4f9b54b + 0ef5e52 commit fd5a555
Showing 8 changed files with 428 additions and 74 deletions.
12 changes: 12 additions & 0 deletions .travis.yml
@@ -0,0 +1,12 @@
language: node_js

node_js:
- '0.12'
- '0.10'
sudo: false # Enable docker-based containers
cache:
directories: # Cache dependencies
- node_modules

script:
- npm test
17 changes: 17 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,17 @@
# Change Log
All notable changes to this project will be documented in this file.
This project adheres to [Semantic Versioning](http://semver.org/).

## [0.1.1] - 2015-05-06
### Added
* Flattening all object based properties
* ensuring that each feature contains each field

## [0.1.0] - 2015-04-21
### Changed
* This project now uses `standard` as its code formatting
* Keeping a legit changelog
* Added tape testing with sinon stubs in the controller tests

[0.1.1]: https://github.com/Esri/koop/releases/compare/v0.1.0...v0.1.1
[0.1.0]: https://github.com/Esri/koop/releases/tag/v0.1.0
43 changes: 25 additions & 18 deletions README.md
@@ -1,43 +1,50 @@
## Socrata Provider for [Koop](https://github.com/Esri/koop)
-----------

This provider makes it possible to access [Socrata's JSON API](http://dev.socrata.com/docs/formats/json.html) as either GeoJSON or an Esri FeatureService. This is particularly useful for making maps and doing analysis on the web.

## Installation
## Install

To install/use this provider you first need a working installation of [Koop](https://github.com/Esri/koop). Then from within the koop directory you'll need to run the following:

```
npm install https://github.com/chelm/koop-socrata/tarball/master
```
```
npm install https://github.com/koopjs/koop-socrata/tarball/master
```

## Register Socrata Hosts

Once this provider's been installed you need to "register" a particular instance of Socrate with your Koop instance. To do this you make `POST` request to the `/socrata` endpoint like so:
Once this provider's been installed you need to "register" a particular instance of Socrata with your Koop instance. To do this you make a `POST` request to the `/socrata` endpoint like so:

```
curl --data "host=https://data.nola.gov&id=nola" localhost:1337/socrata
```
```
curl --data "host=https://data.nola.gov&id=nola" localhost:1337/socrata
```
*for Windows users, download cURL from http://curl.haxx.se/download.html or use a tool of your choice to generate the POST request*

What you'll need for that request to work is an ID and a the URL of the Socrata instance. The ID is what you'll use to reference datasets that come from Socrata in Koop.
What you'll need for that request to work is an ID and the URL of the Socrata instance. The ID is what you'll use to reference datasets that come from Socrata in Koop.

To make sure this works you can visit http://localhost:1337/socrata, where you should see all of the registered hosts.

## Access Socrata Data

To access a dataset hosted in Socrata you'll need a "resource id" from Socrata. Datasets in Socrata can be accessed as raw JSON like this:
To access a dataset hosted in Socrata you'll need a "Resource ID" from Socrata. Datasets in Socrata can be accessed as raw JSON like this:

* [https://data.nola.gov/Geographic-Reference/NOLA-Short-Term-Rentals-Map/psp3-bvzw](https://data.nola.gov/Geographic-Reference/NOLA-Short-Term-Rentals-Map/psp3-bvzw) translates into -> https://data.nola.gov/resource/psp3-bvzw.json
* [https://data.nola.gov/Health-Education-and-Social-Services/NOLA-Grocery-Stores/fwm6-d78i](https://data.nola.gov/Health-Education-and-Social-Services/NOLA-Grocery-Stores/fwm6-d78i) translates into -> https://data.nola.gov/resource/fwm6-d78i.json

And then the ID `psp3-bvzw` can be referenced in Koop like so:
And then the ID `fwm6-d78i` can be referenced in Koop like so:

[http://koop.dc.esri.com/socrata/nola/psp3-bvzw](http://koop.dc.esri.com/socrata/nola/psp3-bvzw)
http://koop.dc.esri.com/socrata/nola/fwm6-d78i

If your Socrata data has more than one location column, you can specify the desired location column in the http request like this:

https://path_to_koop/socrata/socrataProvider/dataSetID!spatialColumn
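
The `!` separator can be handled with a plain string split, which is roughly what this commit does in `models/Socrata.js`. The following is a minimal sketch, not the provider's code: the helper name `splitItemId` and the column name `location_1` are hypothetical.

```javascript
// Hypothetical helper: split "dataSetID!spatialColumn" into its two parts.
function splitItemId (id) {
  var bang = id.indexOf('!')
  if (bang === -1) {
    // No column given: the whole string is the Socrata resource id.
    return { resourceId: id, locationField: undefined }
  }
  return {
    resourceId: id.substring(0, bang),
    locationField: id.substring(bang + 1)
  }
}

// 'location_1' is an assumed column name used only for illustration.
console.log(splitItemId('fwm6-d78i!location_1'))
```

When no `!` is present, `locationField` stays undefined and the caller can fall back to detecting the location column from the response headers.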

## Handle Large Datasets

The Socrata API defaults to 1000 results per request, but can be set to return up to 50,000. Koop will page through large datasets to capture all the points. To change the number of results per request, modify the `limit` variable in the `socrata.getResource` function in models/Socrata.js.
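
To make the paging arithmetic concrete, here is a small sketch of how the page URLs could be derived from a row count and a page size. It assumes the same `$order`, `$limit`, and `$offset` query parameters that the commit uses. The helper name `pageUrls` is hypothetical, and unlike the provider it also emits an explicit `$offset=0` for the first page.

```javascript
// Hypothetical helper: list the URLs needed to fetch every row of a dataset.
function pageUrls (host, resourceId, count, limit) {
  var base = host + '/resource/' + resourceId + '.json?$order=:id&$limit=' + limit
  // A partially filled last page still needs its own request.
  var totalPages = Math.ceil(count / limit)
  var urls = []
  for (var p = 0; p < totalPages; p++) {
    urls.push(base + '&$offset=' + (p * limit))
  }
  return urls
}

// 2500 rows at 1000 rows per request need three requests.
console.log(pageUrls('https://data.nola.gov', 'fwm6-d78i', 2500, 1000))
```

Ordering by `:id` keeps the row order stable between requests, so consecutive offsets neither skip nor repeat rows.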

## Examples

Here's a few examples of data hosted in Socrata and accessed via Koop
Here are a few examples of data hosted in Socrata and accessed via Koop.

* GeoJSON [http://koop.dc.esri.com/socrata/nola/psp3-bvzw](http://koop.dc.esri.com/socrata/nola/psp3-bvzw)
* FeatureService [http://koop.dc.esri.com/socrata/nola/psp3-bvzw/FeatureServer/0]
* All of the publicly registered Socrata instances [http://koop.dc.esri.com/socrata](http://koop.dc.esri.com/socrata)
* GeoJSON: http://koop.dc.esri.com/socrata/nola/fwm6-d78i
* FeatureService: http://koop.dc.esri.com/socrata/nola/fwm6-d78i/FeatureServer/0
* All publicly registered Socrata instances: http://koop.dc.esri.com/socrata
1 change: 1 addition & 0 deletions controller/index.js
@@ -136,6 +136,7 @@ var Controller = function (Socrata, BaseController) {
res.send(err, 500)
} else {
// Get the item
req.query.limit = 10000000
Socrata.getResource(data.host, req.params.id, req.params.item, req.query, function (error, geojson) {
if (error) {
res.send(error, 500)
219 changes: 165 additions & 54 deletions models/Socrata.js
@@ -39,104 +39,215 @@ var Socrata = function (koop) {
}

socrata.socrata_path = '/resource/'
socrata.socrata_view_path = '/resource/'

// get the service and get the item
socrata.getResource = function (host, hostId, id, options, callback) {
var fields, types,
type = 'Socrata',
key = id
var type = 'Socrata',
key = id,
locFieldName,
urlid,
paging = false,
limit = 1000

// test id for '!' character indicating presence of a column name and handle
if (id.indexOf('!') !== -1) {
locFieldName = id.substring(id.indexOf('!') + 1, id.length)
urlid = id.substring(0, id.indexOf('!'))
} else {
urlid = id
}

// attempt to load from cache, if error perform new request and get first page
koop.Cache.get(type, key, options, function (err, entry) {
if (err) {
var url = host + socrata.socrata_path + id + '.json'
var meta_url = host + socrata.socrata_view_path + id + '.json'
// dmf: have to make a request to the views endpoint in order to get metadata
var name

socrata.request(meta_url, function (err, data, response) {
var url = host + socrata.socrata_path + urlid + '.json?$order=:id&$limit=' + limit
socrata.request(url, function (err, data, response) {
if (err) {
callback(err, null)
} else {
try {
name = JSON.parse(data.body).name
} catch(e) {
callback(e, null)
// test to see if paging will be needed later
if (Object.keys(JSON.parse(data.body)).length === limit) {
paging = true
}
}
socrata.request(url, function (err, data, response) {
if (err) {
callback(err, null)
} else {
try {
types = JSON.parse(data.headers['x-soda2-types'])
fields = JSON.parse(data.headers['x-soda2-fields'])
var locationField
// get name of location field
try {
var locationField
if (locFieldName) {
locationField = locFieldName
} else {
var types = JSON.parse(data.headers['x-soda2-types'])
var fields = JSON.parse(data.headers['x-soda2-fields'])
types.forEach(function (t, i) {
if (t === 'location') {
locationField = fields[i]
}
})
}

socrata.toGeojson(JSON.parse(data.body), locationField, function (err, geojson) {
// parse first page to geoJSON and insert
socrata.toGeojson(JSON.parse(data.body), locationField, fields, function (err, geojson) {
if (err) {
return callback(err)
}
geojson.updated_at = new Date(data.headers['last-modified']).getTime()
geojson.name = id
geojson.host = {
id: hostId,
url: host
}
koop.Cache.insert(type, key, geojson, 0, function (err, success) {
if (err) {
return callback(err)
}
geojson.updated_at = new Date(data.headers['last-modified']).getTime()
geojson.name = name || id
geojson.host = {
id: hostId,
url: host
}
koop.Cache.insert(type, key, geojson, 0, function (err, success) {
if (err) {
return callback(err)
if (success) {
// check to see if paging is needed
if (paging === false) {
callback(null, [geojson])
} else {
// create GeoJSON return object
var retGeoJSON = geojson
// determine count of table and needed pages
var count, pages,
pagesComplete = 0,
countUrl = host + socrata.socrata_path + urlid + '.json?$select=count(*)'
request.get(countUrl, function (err, data, response) {
if (err) {
return callback(err)
}
count = parseInt(JSON.parse(data.body)[0].count, 10)
if ((count / limit) % 1 === 0) {
pages = (count / limit - 1)
} else {
pages = Math.floor(count / limit)
}
// page through data
for (var p = 1; p <= pages; p++) {
var pUrl = host + socrata.socrata_path + urlid + '.json?$order=:id&$limit=' + limit + '&$offset=' + (p * limit)
request.get(pUrl, function (err, data, response) {
if (err) {
return callback(err)
}
// parse pages to GeoJSON and insert partial
socrata.toGeojson(JSON.parse(data.body), locationField, fields, function (err, geojson) {
if (err) {
return callback(err)
}
geojson.updated_at = new Date(data.headers['last-modified']).getTime()
geojson.name = id
geojson.host = {
id: hostId,
url: host
}
koop.Cache.insertPartial(type, key, geojson, 0, function (err, success) {
if (err) {
return callback(err)
}
if (success) {
// append geojson to return object
geojson.features.forEach(function (f) {
retGeoJSON.features.push(f)
})
// update pages completed and check for completion of pages
pagesComplete++
checkDone()
}
})
})
})
}

// function to check completion of pages
var checkDone = function () {
if (pagesComplete === pages) {
callback(null, [retGeoJSON])
} else {

}
}
})
}
callback(null, [geojson])
})
}
})
} catch(e) {
if (koop && koop.log) {
koop.log.error('Unable to parse response %s', url)
}
callback(e, null)
}
})
} catch (e) {
koop.log.error('Unable to parse response %s', url)
callback(e, null)
}
})
}
})
} else {
callback(null, entry)
}
})
}

socrata.toGeojson = function (json, locationField, callback) {
socrata.toGeojson = function (json, locationField, fields, callback) {
if (!json || !json.length) {
callback('Error converting data to geojson', null)
} else {
var geojson = {type: 'FeatureCollection', features: []}
var geojsonFeature
var geojson = { type: 'FeatureCollection', features: [] }
var geojsonFeature,
newFields = []
json.forEach(function (feature, i) {
geojsonFeature = {type: 'Feature', geometry: {}, id: i + 1}
var lat, lon
geojsonFeature = { type: 'Feature', geometry: {}, id: i + 1 }

// make sure each feature has each property and flatten objects
fields.forEach(function (f) {
if (f.substring(0, 1) !== ':') {
if (typeof feature[f] === 'object') {
for (var v in feature[f]) {
var newAttr = f + '_' + v
feature[newAttr] = feature[f][v]
newFields.push(newAttr)
}
delete feature[f]
}
}
})

if (feature && locationField) {
if (feature[locationField] && feature[locationField].latitude && feature[locationField].longitude) {
geojsonFeature.geometry.coordinates = [parseFloat(feature[locationField].longitude), parseFloat(feature[locationField].latitude)]
lon = parseFloat(feature[locationField].longitude)
lat = parseFloat(feature[locationField].latitude)
if ((lon < -180 || lon > 180) || (lat < -90 || lat > 90)) {
geojsonFeature.geometry = null
geojsonFeature.properties = feature
geojson.features.push(geojsonFeature)
} else {
geojsonFeature.geometry.coordinates = [lon, lat]
geojsonFeature.geometry.type = 'Point'
delete feature.location
geojsonFeature.properties = feature
geojson.features.push(geojsonFeature)
}
} else if (feature && feature.latitude && feature.longitude) {
geojsonFeature.geometry.coordinates = [parseFloat(feature.longitude), parseFloat(feature.latitude)]
geojsonFeature.geometry.type = 'Point'
geojsonFeature.properties = feature
geojson.features.push(geojsonFeature)
lon = parseFloat(feature.longitude)
lat = parseFloat(feature.latitude)
if ((lon < -180 || lon > 180) || (lat < -90 || lat > 90)) {
geojsonFeature.geometry = null
geojsonFeature.properties = feature
geojson.features.push(geojsonFeature)
} else {
geojsonFeature.geometry.coordinates = [lon, lat]
geojsonFeature.geometry.type = 'Point'
geojsonFeature.properties = feature
geojson.features.push(geojsonFeature)
}
} else {
geojsonFeature.geometry = null
geojsonFeature.properties = feature
geojson.features.push(geojsonFeature)
}
})
// 2nd loop over the data to ensure all new fields are present
if (newFields && newFields.length) {
geojson.features.forEach(function (feature) {
newFields.forEach(function (field) {
if (!feature.properties[field]) {
feature.properties[field] = null
}
})
})
}
callback(null, geojson)
}
}
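
The flattening and field-filling steps in `toGeojson` can be illustrated in isolation. The sketch below is not the provider's implementation: `flattenRow` and `normalizeRows` are hypothetical names, and the sample rows are invented for illustration. It flattens one level of nested objects into `parent_child` keys, then gives every row the same set of keys.

```javascript
// Hypothetical helper: flatten one level of nested objects into "parent_child" keys.
function flattenRow (row) {
  var flat = {}
  Object.keys(row).forEach(function (key) {
    var value = row[key]
    // typeof null is 'object', so null is excluded explicitly.
    if (value !== null && typeof value === 'object' && !Array.isArray(value)) {
      Object.keys(value).forEach(function (sub) {
        flat[key + '_' + sub] = value[sub]
      })
    } else {
      flat[key] = value
    }
  })
  return flat
}

// Hypothetical helper: give every row the same keys, filling gaps with null.
function normalizeRows (rows) {
  var flatRows = rows.map(flattenRow)
  var allKeys = {}
  flatRows.forEach(function (row) {
    Object.keys(row).forEach(function (key) { allKeys[key] = true })
  })
  flatRows.forEach(function (row) {
    Object.keys(allKeys).forEach(function (key) {
      // Test for presence rather than truthiness so 0 and '' survive.
      if (!(key in row)) row[key] = null
    })
  })
  return flatRows
}

// Invented sample rows: one with a nested location, one without.
console.log(normalizeRows([
  { name: 'a', location: { latitude: '29.9', longitude: '-90.1' } },
  { name: 'b', count: 0 }
]))
```

This version builds new objects instead of mutating the input rows, which makes it easier to read the geometry from the original nested field before the flattened copy replaces it.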
@@ -161,7 +272,7 @@ var Socrata = function (koop) {
locationField = fields[i]
}
})
socrata.toGeojson(JSON.parse(data.body), locationField, function (error, geojson) {
socrata.toGeojson(JSON.parse(data.body), locationField, fields, function (error, geojson) {
geojson.updated_at = new Date(data.headers['last-modified']).getTime()
geojson.name = data.name || key
geojson.host = data.host