New package to normalize spatial data for web plotting. #15

Open
bhaskarvk opened this issue Aug 21, 2017 · 30 comments

Comments

@bhaskarvk

bhaskarvk commented Aug 21, 2017

This is a bit of future planning, but here is the main idea.
Currently there is code in the leaflet package that extracts data from sp and sf objects and converts it into a data frame, which is then serialized to JSON and passed to the JavaScript side.
This code is fairly generic and does not depend on anything leaflet-specific. It makes a lot of sense to take this code out and make it a package of its own. That way we can build other web plotting R packages that wrap, say, d3.geo, mapboxGL or Cesium, and reuse a major chunk of the code that takes data from spatial objects and passes it to JavaScript.

I have had some discussions about this with @jcheng5, who agrees that this is a good idea. There are some questions I have for the r-spatial community.

a) Do you think this is a good idea ?
b) If so then do you think it makes sense for this proposed package to live in r-spatial repo ?
c) If b) is 'yes', what sort of licensing and copyright arrangement do we need in place between RStudio and r-spatial ?

cc @tim-salabim @edzer

@tim-salabim
Member

tim-salabim commented Aug 22, 2017

@bhaskarvk

a) I think this is a great idea
b) I don't see why this is not suited for r-spatial
c) I am no expert on licensing and all that is related, thus I don't really feel qualified to provide a definite answer on this one. I would love to get some more input here.

Two comments/questions from my side (that I can think of right now):

  • I am not overly familiar with geojsonio but isn't that what this package is about?
  • Should we aim at implementing this in C++ via Rcpp in order to handle large data performantly? Especially with regard to potentially implementing webgl rendering.

@bhaskarvk
Author

@tim-salabim geojsonio is for reading/writing geo/topo JSONs from the file-system. What I am proposing is a common package that will take any spatial R object (sp, sf, geo/topo JSONs either as lists, char strings or R objects) and make it available to an htmlwidget in a consistent manner.
That way we can easily make new web GIS plotting pkgs that wrap mapboxGL / Cesium / OpenLayers etc.

It may well be that this package will rely on sp/sf/geojson/geojsonio packages to read the data but what differentiates it is the consistent manner in which it makes this spatial data available to the widget side.

So then you can have code that can look like

leaflet() %>% 
  addPolygons(some<sp|sf>polygon-data,...)

# OR

mapboxGL() %>%
  addPolygons(some<sp|sf>polygon-data,...)

# OR

openLayers() %>%
  addPolygons(some<sp|sf>polygon-data,...)

w/o having to duplicate the code which reads the spatial objects in individual leaflet/mapboxGL/Cesium packages.
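A minimal sketch of that idea in base R, using S3 dispatch so that several input types end up in one common shape. The generic `normalise_coords()` and its output columns (`lng`, `lat`) are hypothetical names chosen for illustration, not code from leaflet or any existing package, and real methods would target sp and sf classes rather than the plain matrix and data frame used here.

```r
# Hypothetical generic: every supported input type gets one method,
# and every method returns the same plain data frame.
normalise_coords <- function(x, ...) UseMethod("normalise_coords")

# Coordinates held in a two-column matrix (lng, lat)
normalise_coords.matrix <- function(x, ...) {
  data.frame(lng = x[, 1], lat = x[, 2])
}

# Coordinates held in a data frame with arbitrary column names
normalise_coords.data.frame <- function(x, lng = "lng", lat = "lat", ...) {
  data.frame(lng = x[[lng]], lat = x[[lat]])
}

m  <- cbind(c(145.01, 145.02), c(-37.83, -37.84))
df <- data.frame(lon = c(145.01, 145.02), latitude = c(-37.83, -37.84))

from_matrix <- normalise_coords(m)
from_df     <- normalise_coords(df, lng = "lon", lat = "latitude")

# Both inputs now have the same structure, so any widget wrapper
# could serialize them the same way.
identical(from_matrix, from_df)
```

With this layout, each plotting package would only need to call the generic and serialize the result, rather than carry its own sp/sf extraction code.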

@tim-salabim
Member

relevant rstudio/leaflet#452

@edzer
Member

edzer commented Aug 25, 2017

For the part (sf -> data.frame, sp -> data.frame), I think it makes more sense to have this as part of the sf and sp APIs, i.e. inside the packages.

For instance, rstudio/leaflet#452 does not happen when leaflet uses st_coordinates on the sfc object, instead of calling do.call(rbind, sfc) which wrongly assumes sfc is not an empty list.
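The empty-list pitfall can be reproduced in base R without sf or leaflet. The lists below are hypothetical stand-ins for the coordinate matrices inside a geometry column, and `bind_coords()` is an illustrative helper, not a function from either package.

```r
# Two coordinate matrices, as a stand-in for a non-empty geometry list
coords <- list(
  matrix(c(0, 0, 1, 1), ncol = 2, byrow = TRUE),
  matrix(c(2, 2, 3, 3), ncol = 2, byrow = TRUE)
)
empty <- list()

bound       <- do.call(rbind, coords)  # a 4 x 2 matrix
bound_empty <- do.call(rbind, empty)   # NULL, because rbind() with no arguments returns NULL

# Hypothetical guard: always return a matrix, even for empty input
bind_coords <- function(x) {
  if (length(x) == 0L) {
    return(matrix(numeric(0), ncol = 2))
  }
  do.call(rbind, x)
}

dim(bind_coords(empty))
```

Code that later indexes columns of the result works on the 0 x 2 matrix but fails on `NULL`, which is why a dedicated accessor that handles the empty case is the safer choice.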

@bhaskarvk
Author

Yes that's an acceptable solution as well. I just think it belongs outside of leaflet.

@edzer
Member

edzer commented Aug 25, 2017

Fair enough. Which functions in leaflet does this concern?

@mdsumner
Member

mdsumner commented Sep 1, 2017

I've been working on this in spbabel and in a superseded form of that in https://github.com/mdsumner/sc

The discussion and rationale there is my best overview of the landscape, but I've learnt quite a lot more since those were written.

The form must be relational, composed of multiple data frames - that's the only way to store all the types that are needed, and it's the only way to store topology at all (you can't do that with nesting, even if you nest indexes you still need a common pool table for those indexes to refer to).
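A small base R illustration of what a relational form could look like: an object table, a pool of unique vertices, and a link table that orders vertex IDs within each object. The table and column names are made up for this sketch and are not the actual structures used in spbabel or sc.

```r
# Two unit squares that share an edge, as closed coordinate rings
rings <- list(
  a = data.frame(x = c(0, 1, 1, 0, 0), y = c(0, 0, 1, 1, 0)),
  b = data.frame(x = c(1, 2, 2, 1, 1), y = c(0, 0, 1, 1, 0))
)

# All coordinates in path order, tagged with the object they belong to
coords <- data.frame(
  object = rep(names(rings), vapply(rings, nrow, integer(1))),
  x = unlist(lapply(rings, `[[`, "x"), use.names = FALSE),
  y = unlist(lapply(rings, `[[`, "y"), use.names = FALSE)
)

# Table 1: one row per object (attributes live here)
object <- data.frame(object = names(rings),
                     label = c("left square", "right square"))

# Table 2: the common vertex pool, de-duplicated in x/y
vertex <- unique(coords[c("x", "y")])
vertex$vertex_id <- seq_len(nrow(vertex))

# Table 3: the link table, one row per coordinate instance
key <- function(d) paste(d$x, d$y)
link <- data.frame(
  object = coords$object,
  vertex_id = vertex$vertex_id[match(key(coords), key(vertex))]
)

# Topology falls out of the shared pool: vertices used by both squares
shared <- intersect(link$vertex_id[link$object == "a"],
                    link$vertex_id[link$object == "b"])
```

The original coordinates can be recovered by indexing `vertex` with `link$vertex_id`, while the shared edge is visible only because both objects refer to the same rows of the vertex pool.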

It's desperately needed that we have a common agreed form for these data in R, and I think specific packages should all contain decompositions to a generic form for their specific types. I've learnt enough in those and related projects for progressing my work but I'm very happy to pursue this in a more general form that transcends all and any specific implementations currently in use. The OSM work by rOpenSci has similar challenges, and osmdata in particular is an important use-case.

@mpadge this is part of the general problem we've been talking about :)

Finally, I'm absolutely delighted to hear this is seen as important and I'm extremely happy to help in any way I can. This is essential for the R community to move forward on, and I look forward to seeing how my explorations will fit into this, thanks @bhaskarvk !

@bhaskarvk
Author

@edzer All the code in leaflet's R/normalize*.R files is what I was thinking.

@SymbolixAU

I might be going off on a tangent to the theme of this discussion, but what are r-spatial's and the community's thoughts on using encoded polylines to represent geometries?

Whenever I plot a map in googleway I always encode my spatial objects first, as it reduces the size of the object being plotted (and encoded polylines are natively supported in the Google Maps API).
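For readers unfamiliar with the format, here is a base R sketch of the encoded polyline algorithm as published by Google: coordinates are rounded to 5 decimal places, stored as deltas from the previous point, zig-zag encoded, and written out in 5-bit chunks as printable ASCII. The function names `encode_value()` and `encode_polyline()` are made up for this sketch; it is not the implementation used in googleway or spatialdatatable, and it uses plain arithmetic rather than bitwise operators to sidestep R's handling of negative integers in bit shifts.

```r
# Encode one signed integer (a delta in units of 1e-5 degrees)
encode_value <- function(v) {
  # Zig-zag step: equivalent to (v << 1), bit-inverted when v is negative
  v <- if (v < 0) -2 * v - 1 else 2 * v
  codes <- numeric(0)
  # Emit 5-bit chunks, lowest first; 32 (0x20) flags "more chunks follow"
  while (v >= 32) {
    codes <- c(codes, (v %% 32) + 32 + 63)
    v <- v %/% 32
  }
  codes <- c(codes, v + 63)
  intToUtf8(codes)
}

# Encode a path given as latitude and longitude vectors (lat comes first)
encode_polyline <- function(lat, lon) {
  dlat <- diff(c(0, round(lat * 1e5)))
  dlon <- diff(c(0, round(lon * 1e5)))
  pieces <- mapply(function(a, b) paste0(encode_value(a), encode_value(b)),
                   dlat, dlon)
  paste0(pieces, collapse = "")
}

# The three-point example from Google's documentation of the format
encoded <- encode_polyline(lat = c(38.5, 40.7, 43.252),
                           lon = c(-120.2, -120.95, -126.453))
encoded
# [1] "_p~iF~ps|U_ulLnnqC_mqNvxq`@"
```

Three coordinate pairs become a 27-character string, which is where the size reduction comes from. The rounding to 5 decimal places is also the source of the precision loss.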

I've been playing about with a spatialdatatable package to do the encoding. I don't know how far I'm going to take this package, but if there's appetite to include it in r-spatial then I'll carry on.

@edzer
Member

edzer commented Oct 18, 2017

I see this as similar to s2 cells and geohash: dedicated optimizations for cases where you can afford some rounding and bandwidth is an issue. This one aims at communicating with the Google Maps stack.

Your package does this, as well as the integration with data.table. Does it make sense to somehow separate that?

If you believe it will attract a larger user community, we could move the package here.

@SymbolixAU

SymbolixAU commented Oct 18, 2017

I think separating the encoding/normalising is probably a good idea and would be a better fit for this 'new package' (whatever it turns out to be). And I think there will definitely be a better way of doing the encoding from sf - to polyline than my nested lapply's. I also started to look into boost and CGAL, but haven't progressed with it.

The reason I started writing spatialdatatable was to speed up the geosphere calculations, and also make them naturally usable inside data.table[ ] syntax.

@SymbolixAU

SymbolixAU commented Dec 3, 2017

I've made a start by creating googlePolylines to handle the encoding and decoding of (primarily) sf objects into encoded polylines. As mentioned, the encoded lines reduce precision, but can speed up plotting.

I've seen plugins for leaflet to use these polylines too so there may be some opportunity for integration.

@SymbolixAU

Thanks for the reminder @tim-salabim !

Given my recent updates to mapdeck I think I've got a solid base of code to make this 'normalised data' package, so I'm happy to get this going.

anyone got a good suggestion for a package name?

@SymbolixAU

SymbolixAU commented Oct 28, 2018

@mdsumner
Member

@SymbolixAU can I suggest you take a look at silicate - the binary branch - there are two key functions, BINARY and SC.

  • BINARY - builds a simple structure of object and vertex table, with edges of each object nested in edge_
  • SC - builds a fully labelled-entity structure of object, edge and vertex

(object is feature in sf terms, but more general - we can have mesh types and other non SF forms)

The first is not topological (no vertex de-dupe) and cannot survive vertex subsetting without remapping the indexes. The second is topological (unique in x/y by default), with unique IDs for object, edge and vertex - so it can be arbitrarily ordered and passed through other systems.

This has festered a bit, and my anglr package needs an update with the new SC/BINARY structure, but I'm hoping we can find common ground here. These forms admit conversion to other formats pretty easily, and there are verbs for extracting the entities sc_coord, sc_path, sc_vertex etc. so most of the format-specific details can go into methods for those.

@SymbolixAU

yes. And I really want to start working with your structures to see what they are all about. We definitely need to get it all integrated.

@mdsumner
Member

mdsumner commented Nov 1, 2018

all right, sorry for the dead horse flogging - spatialwidget doesn't look like what I thought you were talking about - trying to get a bearing on how you see things. 👍

@SymbolixAU

I'm going to add some concrete R examples to the spatialwidget package to hopefully make the design / rationale clear :)

@SymbolixAU

SymbolixAU commented Nov 1, 2018

Going to commit something this evening, but, for starters, this is what I'm aiming for.

You pass it an sf object, tell it which columns of sf are the colours/opacities/whatever (or you can specify specific values), and it returns a list with 2 JSON objects. These JSON objects can then be parsed by an htmlwidget

spatial_line(mapdeck::roads[1:5, ], stroke_colour = "FQID", stroke_opacity = 3, stroke_width = 3)

$data
[1] "[{\"type\":\"Feature\",\"properties\":{\"stroke_colour\":\"#FDE72503\",\"stroke_width\":3.0},\"geometry\":{\"geometry\":{\"type\":\"LineString\",\"coordinates\":[[145.014291,-37.830458],[145.014345,-37.830574],[145.01449,-37.830703],[145.01599,-37.831484],[145.016479,-37.831699],[145.016813,-37.83175],[145.01712,-37.831742],[145.0175,-37.831667],[145.017843,-37.831559],[145.018349,-37.83138],[145.018603,-37.83133],[145.018901,-37.831301],[145.019136,-37.831301],[145.01943,-37.831333],[145.019733,-37.831377],[145.020195,-37.831462],[145.020546,-37.831544],[145.020641,-37.83159],[145.020748,-37.83159],[145.020993,-37.831664]]}}},{\"type\":\"Feature\",\"properties\":{\"stroke_colour\":\"#44015403\",\"stroke_width\":3.0},\"geometry\":{\"geometry\":{\"type\":\"LineString\",\"coordinates\":[[145.015016,-37.830832],[145.015561,-37.831125],[145.016285,-37.831463],[145.016368,-37.8315],[145.016499,-37.831547],[145.016588,-37.831572],[145.01668,-37.831593],[145.01675,-37.831604],[145.016892,-37.83162],[145.016963,-37.831623],[145.017059,-37.831623],[145.017154,-37.831617],[145.017295,-37.831599],[145.017388,-37.831581],[145.017523,-37.831544],[145.018165,-37.831324],[145.018339,-37.831275],[145.018482,-37.831245],[145.018627,-37.831223],[145.01881,-37.831206],[145.018958,-37.831202],[145.019142,-37.831209],[145.019325,-37.831227],[145.019505,-37.831259],[145.020901,-37.831554],[145.020956,-37.83157]]}}},{\"type\":\"Feature\",\"properties\":{\"stroke_colour\":\"#FDE72503\",\"stroke_width\":3.0},\"geometry\":{\"geometry\":{\"type\":\"LineString\",\"coordinates\":[[145.020116,-37.830563],[145.019885,-37.830572],[145.019502,-37.83069],[145.01935,-37.8307],[145.019104,-37.830655],[145.01582199999999,-37.829909],[145.013658,-37.829467],[145.013556,-37.82946],[145.013446,-37.829437],[145.013344,-37.829403],[145.013174,-37.829359],[145.01303,-37.829346],[145.012949,-37.829349],[145.012915,-37.8294],[145.01289,-37.829551],[145.012699,-37.82969]]}}},{\"type\":\"Feature\",\"properties\":{\"stroke_colour\":\"#23898D03\",\"stroke_width\":3.0},\"geometry\":{\"geometry\":{\"type\":\"LineString\",\"coordinates\":[[145.013367,-37.82957],[145.013578,-37.82958],[145.014053,-37.829673],[145.014522,-37.829757],[145.015338,-37.829902],[145.016323,-37.830123],[145.017672,-37.830471],[145.019195,-37.830872]]}}},{\"type\":\"Feature\",\"properties\":{\"stroke_colour\":\"#20928C03\",\"stroke_width\":3.0},\"geometry\":{\"geometry\":{\"type\":\"LineString\",\"coordinates\":[[145.019266,-37.831062],[145.014738,-37.830149],[145.014392,-37.830096],[145.014048,-37.830059]]}}}]"
attr(,"class")
[1] "json"

$legend
[1] "{\"stroke_colour\":{\"colour\":[\"#44015403\",\"#3B528B03\",\"#21908C03\",\"#5DC96303\",\"#FDE72503\"],\"variable\":[\"1347.00\",\"2389.25\",\"3431.50\",\"4473.75\",\"5516.00\"],\"colourType\":[\"stroke_colour\"],\"type\":[\"gradient\"],\"title\":[\"FQID\"],\"css\":[\"\"]}}"
attr(,"class")
[1] "json"

and it can convert all ~18k rows in about 100 milliseconds

nrow(mapdeck::roads)
# [1] 18286

system.time({
  lst <- spatial_line(mapdeck::roads, stroke_colour = "FQID", stroke_opacity = 3, stroke_width = 3)
})
# user  system elapsed 
# 0.084   0.010   0.100 

@mdsumner
Member

mdsumner commented Nov 1, 2018

Do you mean for the aes()-like mapping for that? I assume the conversion is straightforward (geojsonsf c++ ...).

I've toyed with aes(), though now I think group_by and select with specially named attributes, via rlang, is probably the way to go? So I go

mapdeck::roads[1:5, ] %>% spatial_line(geometry = geometry, stroke_colour = FQID, stroke_opacity = 3, stroke_width = 3)

and under the hood what happens is like

mapdeck::roads[1:5, ] %>% transmute(geometry = geometry, stroke_colour = FQID, stroke_width = 3)

But, lazily and without actually creating a new sf object - using rlang. Is that on the right track?

@SymbolixAU

SymbolixAU commented Nov 1, 2018

I think under the hood it's along those lines, yes. With a little bit extra wrangling to create colours from the variables (and also a summary palette for a legend), and finally the geojson step.

The idea is the output of spatial_line() feeds directly to javascript through the various invoke_method() calls in:

So internally, each of those addPolylines(), add_polyline(), add_path() will have a function body similar to

add_new_polyline <- function(sf, ... ) {
  ## a bit of internal stuff for each implementation
  js <- spatial_line(...)
  invoke_method( ..., js , ... )
}

for comparison

sf <- mapdeck::roads

library(microbenchmark)

microbenchmark(
  leaflet = {
    leaflet::leaflet() %>%
      leaflet::addPolylines(data = sf)
  },
  googleway = {
    googleway::google_map(key = "abc") %>%
      googleway::add_polylines(data = sf)
  }, 
  spatialwidget = {
    spatial_line(mapdeck::roads, stroke_colour = "FQID", stroke_opacity = 3, stroke_width = 3)
  },
  times = 5
)

# Unit: milliseconds
#          expr        min         lq      mean     median       uq       max neval
#       leaflet 5643.93449 5701.28007 5877.8871 5724.51738 5765.045 6554.6590     5
#     googleway 2568.89439 2578.94522 2651.6126 2614.33833 2704.747 2791.1383     5
# spatialwidget   92.20373   96.02003  107.3444   98.26774  103.182  147.0484     5

@mdsumner
Member

mdsumner commented Nov 1, 2018

This is very nice and helpful. Can I ask about the dots passed on to invoke_method: is that some way of keeping track at the R and js levels? (Or just some magic?)

@SymbolixAU

Merely laziness on my part here, to indicate there are other arguments in those functions :)

@tim-salabim
Member

tim-salabim commented Nov 1, 2018

They are just additional arguments passed from R to the js method that is being invoked.
See e.g. here for the R side and correspondingly here for what the js binding (the method) receives.

@mdsumner
Member

mdsumner commented Nov 1, 2018

oh phew, thanks!

@SymbolixAU

I've added three R functions, widget_point(), widget_line() and widget_polygon() which you can use directly, and I've updated the README and merged all the dev into master.

I think this gives more concrete examples of what I'm aiming for.

I'm going to use this spatialwidget library in mapdeck and googleway, so that's been the primary focus of my design, but if there's anything leaflet/mapview would benefit from let me know.

@tim-salabim
Member

Cool! I will play with it in the near future. At the moment still focussing on leaflet.glify performance and usability enhancements.

@SymbolixAU

To keep this thread updated, I'm planning on submitting spatialwidget to CRAN in a week or so

@harryprince

harryprince commented Jan 13, 2019

I propose a detailed comparison between leafgl, deckgl and mapdeck to figure out which is the best solution when we need to plot large-scale point data. I hope it would save time for SQL monkeys like me.

r-spatial/leafgl#11

@tim-salabim
Member

This discussion is partly related to #13. Hence, I'd be inclined to leave it open for now.

I still haven't gotten around to playing with spatialwidget, but my feeling is that it is the closest we have come to a normalised spatial data package.
