Add plugin to read spatial data directly from csv #902

Closed
artemp opened this issue Oct 11, 2011 · 3 comments

@artemp
Member

artemp commented Oct 11, 2011

A simple text-based format is needed for easily displaying points (or WKT-encoded geometries).

Three main use cases:

  1. A novice user wants to create data from scratch or geocode some tabular data. It's small and already in a spreadsheet, so why not just allow them to read it and render it directly, avoiding a conversion step? This user could then push their data into Google Docs or even version it in GitHub, so multiple users could collaborate on the simple file and the rendering could be updated live. Mapnik could gracefully skip invalid rows (verbosely), and errors could then be corrected at the source rather than just in the conversion step (to some database).

  2. A lot of APIs dump data as CSV. These same APIs should support JSON, but until Mapnik adds a fast, native GeoJSON plugin, a fast native CSV plugin can suffice for optimized rendering of small data chunks (by caching in memory at first load).

  3. Massive government data is already in CSV with lon/lat. A user wants to be able to look at it before trying to make more sense of it, and it is so big that normal spreadsheet or conversion tools fall down. We can efficiently render the bits of it that seem valid to enable better data exploration.
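The "skip invalid rows, verbosely" idea from the first use case could look roughly like the sketch below. It uses Python's standard `csv` module rather than Mapnik itself, and the function name `read_points`, the default column names `lon`/`lat`, and the range check are all assumptions made for illustration, not part of the proposal.

```python
import csv
import io
import sys


def read_points(text, lon_field="lon", lat_field="lat", log=sys.stderr):
    """Parse CSV text into point features; report bad rows instead of failing.

    Returns (features, skipped_count). Hypothetical helper, not a Mapnik API.
    """
    features = []
    skipped = 0
    reader = csv.DictReader(io.StringIO(text))
    for row in reader:
        try:
            # Missing cells come back as None (TypeError), bad text as ValueError.
            lon = float(row[lon_field])
            lat = float(row[lat_field])
            # Assumed validity rule: plain WGS84 degree ranges (also rejects NaN).
            if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
                raise ValueError("coordinate out of range")
        except (KeyError, TypeError, ValueError) as exc:
            skipped += 1
            # reader.line_num is the source line just read, so the user can
            # fix the problem in the original file.
            print(f"skipping line {reader.line_num}: {exc!r}", file=log)
            continue
        features.append({"lon": lon, "lat": lat, "attrs": dict(row)})
    return features, skipped
```

Called on a small file with two bad rows, this keeps the valid rows and writes one message per skipped line to `log`, which is the kind of feedback that would let errors be corrected at the source.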

And specifically re GeoJSON: it is great, but:

  • is currently only supported in Mapnik through the OGR driver
  • gets slow quickly due to the lack of fast parsing and indexing in the OGR driver
  • can be passed as a string, but that triggers unneeded overhead in OGR to detect it as JSON
  • JSON is not as easy to edit by hand as plain text (main issue)

So, a native CSV (i.e. tabular) plugin:

  • would have no external dependencies
  • could be more viable than using ogr csv plugin (no need for vrt, actual type detection, faster rendering)
  • would be useful for writing tests (could remove json usage)
  • aggressive up-front parsing and feature caching could enable faster rendering speeds than any other mapnik datasource approach (for reasonable size datasets).
  • could support any geometry type using wkt column or just simple points by auto-detecting long/lat columns
  • should be simple to write and maintain
  • could easily support being used inline in stylesheets
  • csv is easily edited by novice users
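As a rough illustration of the "WKT column or auto-detected long/lat columns" point, header detection plus one up-front parse could be sketched as follows. The accepted header names and the helper names `detect_geometry_columns` and `load_features` are hypothetical; the issue does not say which column names a plugin would recognize.

```python
import csv
import io

# Hypothetical header aliases, chosen only for this sketch.
WKT_NAMES = {"wkt", "geom", "geometry"}
LON_NAMES = {"lon", "long", "lng", "longitude", "x"}
LAT_NAMES = {"lat", "latitude", "y"}


def detect_geometry_columns(header):
    """Return ('wkt', name), ('point', lon_name, lat_name), or None."""
    wkt = lon = lat = None
    for name in header:
        key = name.strip().lower()
        if key in WKT_NAMES and wkt is None:
            wkt = name
        elif key in LON_NAMES and lon is None:
            lon = name
        elif key in LAT_NAMES and lat is None:
            lat = name
    if wkt is not None:
        return ("wkt", wkt)  # a WKT column can carry any geometry type
    if lon is not None and lat is not None:
        return ("point", lon, lat)
    return None


def load_features(text):
    """Parse the whole file once and keep the rows in memory."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader, [])
    kind = detect_geometry_columns(header)
    if kind is None:
        raise ValueError("no WKT or lon/lat columns found")
    rows = [dict(zip(header, row)) for row in reader]
    return kind, rows
```

Preferring a WKT column over lon/lat when both are present is a design choice of this sketch only. Holding every row in a list mirrors the up-front parsing and caching idea, which is why it only suits reasonably sized datasets.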

Cons are:

  • CSV is not standard and can vary in format/newlines (but boost::escaped_list_separator/boost::spirit can be leveraged)
  • Type detection is expensive and tricky to do perfectly (just look at sqlite)
  • Supporting CSV well is going to lessen some users' desire to move to better formats
  • Slippery slope: can I join this data to the csv?
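To make the type detection concern concrete, here is a minimal sketch of per-column inference in Python. It is one naive strategy (try `int`, then `float`, else fall back to `str`), not what any Mapnik plugin does, and the helper names `infer_type` and `infer_columns` are hypothetical.

```python
def infer_type(values):
    """Return int, float, or str: the narrowest type every non-empty cell parses as."""
    candidates = [int, float]
    seen = False
    for value in values:
        text = value.strip()
        if not text:
            continue  # empty cells do not vote
        seen = True
        for cast in list(candidates):
            try:
                cast(text)
            except ValueError:
                candidates.remove(cast)
        if not candidates:
            return str  # something was neither int nor float
    return candidates[0] if seen else str


def infer_columns(header, rows):
    """Map each column name to its inferred type.

    Assumes rectangular rows; every cell of every column is scanned,
    which is where the cost comes from.
    """
    columns = list(zip(*rows)) if rows else [()] * len(header)
    return {name: infer_type(col) for name, col in zip(header, columns)}
```

Even this small version shows both halves of the concern: it has to touch every cell, and it still guesses wrong in common cases. For example, a column of postal codes such as `02134` parses cleanly as `int` and would lose its leading zero.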
@springmeyer
Member

work underway on this at https://github.com/springmeyer/mapnik/tree/csv_plugin

@springmeyer
Member

now a branch here, and a pull request is queued up (#912) for merging after the 2.0.1 release.

@springmeyer
Member

merged into master in c97c4c9, closing!
