Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add landuse kind to roads #136

Closed
nvkelso opened this issue Aug 17, 2015 · 20 comments
Closed

Add landuse kind to roads #136

nvkelso opened this issue Aug 17, 2015 · 20 comments
Assignees

Comments

@nvkelso
Copy link
Member

nvkelso commented Aug 17, 2015

When a road intersects a landuse polygon, split the road up and tag the segment inside the landuse polygon with the polygon's landuse kind property.

Light grey roads are hard to see crossing a park, and white casings that look good over generic earth background pop too much over green parks.

@bcamper
Copy link
Collaborator

bcamper commented Aug 17, 2015

We'd do this server-side by splitting the lines? Sounds cool.

@nvkelso
Copy link
Member Author

nvkelso commented Aug 17, 2015

Yes, there'd need to be an extra (PostGIS) processing step.

@zerebubuth
Copy link
Member

When splitting a road up into parts which intersect and parts which don't, should the feature ID of the two parts be the same, or different? If different, how should we allocate new IDs -- do they have to be unique?

@rmarianski
Copy link
Member

I'll defer to @nvkelso, but my understanding is that we should use the same feature id.

@nvkelso
Copy link
Member Author

nvkelso commented Sep 2, 2015

Let's keep the same OSM feature ID for the new feature parts as the original feature. Because of tile boundaries, people already have to deal with multiple features in view with the same ID, and if someone needs to isolate / optimize for the bits they can create a effective ID by indexing on ID and kind?

@rmarianski
Copy link
Member

@zerebubuth All looks good to me ⚡. I put in a couple of comments/questions on this pr: tilezen/TileStache#37. Circle complained about a couple of pep8 issues on tilezen/tilequeue#22, but that should be easy to fix.

We might also want to apply different "cutting" strategies too. In particular, the intercutting with the buildings layer might have some other logic. But maybe we should just address that when working on that issue (#142) and not worry about it now.

Do you think that the n^2 intersections approach might be "fast enough"? I'm wondering if we can get away with this approach for the time being, or if we'll need to worry about that sooner rather than later. Might be worth measuring the impact, ideally on a tile that has lots of landuse and road features.

Btw, do you have a pr for the config updates to queries.yaml in vector-datasource?

@zerebubuth
Copy link
Member

@rmarianski Thanks for the review! I'll fix/answer those things now.

I'm not sure I quite follow what we'd want to do differently with the buildings layer. Do you mean we might want to "cut" against only one background polygon, so as not to actually split the feature?

I really don't know whether the O(N^2) will be fast enough - my gut feeling is that it'll be okay on most tiles, but there's always bound to be a few nasty corner cases. The question is whether it's worth mitigating the corner cases by introducing an indexing step which might slow down the average case.
I'll take some measurements and see if I can quantify what the overhead is, so that we can make an informed decision.

The changes to the queries.yaml are just:

post_process:
  TileStache.Goodies.VecTiles.transform.intercut:
    base_layer: roads
    cutting_layer: landuse
    attribute: kind
    target_attribute: landuse_kind

I figured I'd wait until the PR lands in tilequeue/tileserver and submit the config changes as a separate one.

@zerebubuth
Copy link
Member

I'll take some measurements and see if I can quantify what the overhead is, so that we can make an informed decision.

On a very small sample of tiles between zooms 10-16, the median increase in processing time was 67%, 95th percentile 717%. The most slowdown was 20x. And it's clearly getting much worse by zoom:

zoom    mean slowdown
10      7.1939042374
11      4.6927452014
12      6.3343295777
13      2.2770619587
14      1.6341794795
15      1.6826758032
16      1.3669686753

I'll try implementing some of the indexing / unioning ideas to see if we can get that down to an acceptable number.

@zerebubuth
Copy link
Member

Turns out union on all the geometries was a bad idea. I tried it on the outlier from the last run, hoping to get it down from 189s, and it's still running after 5 mins. I'll try indexing instead.

@rmarianski
Copy link
Member

Turns out union on all the geometries was a bad idea. I tried it on the outlier from the last run, hoping to get it down from 189s, and it's still running after 5 mins. I'll try indexing instead.

I suppose you'd have to also apply an n^2 approach here too to find which features overlapped, yes? And if more than 2 features overlapped, I think shapely has an optimization for that case (assuming the code just called union repeatedly)

http://toblerity.org/shapely/manual.html#shapely.ops.cascaded_union

I wonder if it's worth considering combining approaches too. Use the index to detect overlapping features, and then union those together.
(I'm harping on unions because I think we'll have to figure something like that out for merging water features in #140)

@zerebubuth
Copy link
Member

Yup, sadly I'd already used cascaded_union (to be precise, its successor unary_union). My guess is that for the tile I was testing with (10/511/340, which basically contains all of London), there's not much overlap and it just ends up building a horribly complex multipolygon.

So far, indexing is looking much better. Although it has meant pulling in an extra dependency on the rtree Python library, which requires libspatialindex, so there have been some changes to the requirements / Vagrant setup.

@nvkelso
Copy link
Member Author

nvkelso commented Sep 3, 2015

Newer shapely includes a version of rtree if you want less imports.

On Sep 3, 2015, at 07:55, Matt Amos notifications@github.com wrote:

Yup, sadly I'd already used cascaded_union (to be precise, its successor unary_union). My guess is that for the tile I was testing with (10/511/340, which basically contains all of London), there's not much overlap and it just ends up building a horribly complex multipolygon.

So far, indexing is looking much better. Although it has meant pulling in an extra dependency on the rtree Python library, which requires libspatialindex, so there have been some changes to the requirements / Vagrant setup.


Reply to this email directly or view it on GitHub.

@rmarianski
Copy link
Member

Newer shapely includes a version of rtree if you want less imports.

Right now we're pegged to an older version of shapely, for no other reason than repeatable builds. I've been meaning to update the versions of all our dependencies, and this sounds like a good reason to do so.

@nvkelso
Copy link
Member Author

nvkelso commented Sep 3, 2015

@zerebubuth
Copy link
Member

Thanks! I've re-written it to use shapely's wrapper of geos' STRtree (which turned out to be slightly faster than the rtree package too), but it's still pretty slow:

zoom    slowdown
10      1.4770874841
11      1.4717270406
12      1.9300250016
13      1.4621002199
14      1.3356416201
15      1.4768507732
16      1.3341983048

That's a whole lot better than the previous one at low zooms, but still seems to be around 50% slower than not using the algorithm at all. Is that an acceptable price to pay?

@zerebubuth
Copy link
Member

Oh, and I should mention; STRtree in shapely is since 1.4, and we're on 1.4.3, so hopefully we can keep the reproducible builds and get indexing support 😄

@nvkelso
Copy link
Member Author

nvkelso commented Sep 3, 2015

I suspect the slowdown is exacerbated by lots of tiny polygon geometries.

We're currently filtering to ensure a minimum size of 1 screen pixel here:
https://github.com/mapzen/vector-datasource/blob/master/queries/landuse.jinja2#L55

    {% if zoom == 9 %}
    AND way_area::bigint > 409600
    {% elif zoom == 10 %}
    AND way_area::bigint > 102400
    {% elif zoom == 11 %}
    AND way_area::bigint > 25600
    {% elif zoom == 12 %}
    AND way_area::bigint > 6400
    {% elif zoom == 13 %}
    AND way_area::bigint > 1600
    {% elif zoom == 14 %}
    AND way_area::bigint > 400
    {% elif zoom == 15 %}
    AND way_area::bigint > 100
    {% endif %}

But in the stylesheet we're being more aggressive:

    landuse:
        data: { source: osm }
        filter: 
            any:
                # limit show smaller landuse areas to higher zooms
                - { $zoom: { min: 9 },  area: { min: 10000000 } }
                - { $zoom: { min: 10 }, area: { min: 8000000 } }
                - { $zoom: { min: 11 }, area: { min: 2000000 } }
                - { $zoom: { min: 12 }, area: { min: 500000 } }
                - { $zoom: { min: 13 }, area: { min: 100000 } }
                - { $zoom: { min: 14 }, area: { min: 50000 } }
                - { $zoom: { min: 15 }, area: { min: 20000 } }
                - { $zoom: { min: 15 }, area: { min: 2000 } }
                - { $zoom: { min: 16 } }    

We're trying to keep the vector tiles more or less true to what's in raw OSM (so I'm not suggesting using the values from the stylesheet in the server filter), but I wonder if pruning a little bit more out (2 or 3 sq pixels worth of area instead of 1 sq px would get us a significant speed increase?

@rmarianski
Copy link
Member

We're trying to keep the vector tiles more or less true to what's in raw OSM (so I'm not suggesting using the values from the stylesheet in the server filter), but I wonder if pruning a little bit more out (2 or 3 sq pixels worth of area instead of 1 sq px would get us a significant speed increase?

I think it depends on how much gets filtered out, and I would guess that it would have more of an effect at the lower zooms. It would also reduce the tile sizes too (proportional to how much gets filtered out).

@zerebubuth zerebubuth assigned rmarianski and unassigned zerebubuth Sep 8, 2015
@nvkelso nvkelso added in review and removed verify labels Sep 8, 2015
@nvkelso nvkelso assigned zerebubuth and unassigned rmarianski Sep 8, 2015
@rmarianski
Copy link
Member

Pull requests look good to me: 👍

@zerebubuth
Copy link
Member

All merged, thanks!

zerebubuth added a commit that referenced this issue Sep 9, 2015
This will enable the code, already merged, for intercut and apply the landuse polygon's kind tag as "landuse_kind" on the roads. The roads will be split at landuse polygon boundaries.

Refs #136.
@nvkelso nvkelso modified the milestone: Compilation 2.0 Sep 10, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants