Simplify the early stages of creating a `StreetNetwork` from OSM data #127

BudgieInWA · 2022-11-23T16:57:47Z

The start of main osm_to_street_network function is a complicated tangle of stateful modifications of the StreetNetwork with plenty of comments hinting at dependencies between seemingly unrelated steps. The delegation of responsibilities and relationship between StreetNetwork::blank, extract_osm, split_up_roads and clip_map is not clear, with mutations happening to the StreetNetwork all over the place.

osm2streets/streets_reader/src/lib.rs

Lines 102 to 115 in 3d64f99

    
           let mut streets = StreetNetwork::blank(); 
        
           // Note that DrivingSide is still incorrect. It'll be set in extract_osm, before Road::new 
        
           // happens in split_ways. 
        
           streets.config = opts.map_config.clone(); 
        
           if let Some(ref pts) = clip_pts { 
        
               let gps_bounds = GPSBounds::from(pts.clone()); 
        
               streets.boundary_polygon = Ring::new(gps_bounds.convert(pts))?.into_polygon(); 
        
               streets.gps_bounds = gps_bounds; 
        
           } 
        
           let extract = extract_osm(&mut streets, osm_xml_input, clip_pts, &opts, timer)?; 
        
           let split_output = split_ways::split_up_roads(&mut streets, extract, timer); 
        
           clip::clip_map(&mut streets, timer)?;

I think we should improve the public APIs of StreetNetwork and OsmExtract so that they help explain the ordering constraints and delegation of responsibilities. I'll share my ideas about that here, which requires a diversion into clipping and the StreetNetwork to OSM mapping.

Clipping the Map

Clipping is an interesting problem because the interpretation of the map features is influenced by nearby features, that's the whole reason osm2streets needs to exist. I think that StreetNetwork ought to treat its features fundamentally as areas, not centerpoints and centerlines. (That's why I'm advocating for geometry calculations to move up earlier in the process.) If that is the case, then precisely clipping a StreetNetwork has to involve cutting Roads and Intersections in half. This is not something that we should support at the representation level, because the representation still needs to use points and centerlines as reference for the distinct features (intersections and roads) that make up the areas.

In fact, it is an unavoidable problem that we can't accurately represent anything that is close to the edge of the raw data we have. We should acknowledge that fact and build it into our understanding and the API. There should be a buffer border of some distance around the area that we have data for that we consider to be nonsense output. Probably about half the width of a very wide road.

When doing the OSM to Streets conversion, I think we should trim the raw OSM to the boundary polygon in OsmExtract, before adding it into StreetNetwork. The trimming operation is better defined in the domain of the OSM data itself, because the OSM data is points and lines. There should be no clipping operation done on StreetsNetwork representation data, instead we could clip the end-result geometry precisely to the inside edge of the buffer boundary, or output that boundary and let the consumer do it.

Mapping Between `StreetNetwork` and OSM

A major usecase for StreetNetwork is empowering tools for authoring OSM data, so it will be useful to be able to get from some StreetNetwork feature back to OSM feature that defines it. From our recent discussions, it seems like it will be useful for transformations to have access back to the defining OSM too. That way they can use whatever weird properties of the OSM data they want, without us having to represent it first class within the StreetNetwork features.

The takeaway here is that StreetNetwork should store a full copy of the original OSM data (at least until the user wants to discard it) and maintain a mapping. Doing so makes it easier to reason about adding in additional details at later steps, like use_barrier_nodes and use_crossing_nodes does currently.

The Refactor

Ok, enough waffling about motivations, my proposal is to use private methods and "constructor" methods to delineate the responsibilities, and use public methods to communicate what is safe to do without worrying about internal data dependencies.

Move the concept of a clip boundary into OsmExtract, with the invariant that it only stores data within the boundary. Don't be scared about discarding data just outside boundary, we already have no idea what features we can't see beyond the bounds of the raw data we do have, so lets be clear about our cut-off point.
Implement OsmExtract::new(doc: osm::Document, boundary: Option<Vec<LonLat>>) -> Self which doesn't depend on anything StreetNetwork. It can clip the raw OSM features as it reads them from doc.
Implement StreetNetwork::from_osm(osm: OsmExtract, config: MapConfig) -> Self which does split_up_roads internally, and whatever else is necessary to return a StreetNetwork that is ready to have its public API called.
Make sort_roads and update_movements private again, and work towards all transformations doing their work using the public StreetNetwork API, instead of direct mutation.

Because StreetNetwork::insert_road exists the implication is that it handles some internal data dependency so the caller doesn't have to. split_up_roads should be private to StreetNetwork because it wants to be more efficient than using insert_road and is willing to understand the data dependency in order to do it safely. Transformations should use the public API because they should be non-vital and safe. The only danger of skipping them or calling them in the wrong order should be the missed opportunity for an easy improvement.

The idea is that an interactive editor could use StreetNetwork::new(bounds: GPSBounds, config: MapConfig) to create a blank map for the user, and use insert_intersection and insert_road to build up a map based on user input. Then transformations would be buttons available to "tidy up" the map. They can be clicked in any order as many times as the user likes.

The text was updated successfully, but these errors were encountered:

dabreegster · 2022-11-23T17:18:41Z

Thanks for writing this up! I'm totally in favor.

The trimming operation is better defined in the domain of the OSM data itself, because the OSM data is points and lines.

I really like this idea! I think I've tried to get external tools to do that clipping before, but never gotten exactly the right thing working. Currently we tend to assume something like complete_ways strategy (https://osmcode.org/osmium-tool/manual.html) is passed in as input.

One consequence of rethinking the logic at the OSM level is treating very long roads or railways equivalently. Currently we special case railways that start and end outside the boundary but pass somewhere inside. No reason we shouldn't treat roads the same. (Old traffic sim assumptions leaking through were the original reason.)

The takeaway here is that StreetNetwork should store a full copy of the original OSM data

In https://github.com/a-b-street/osm2streets/tree/osm_tags I'm getting rid of osm_tags on Road. Separately a caller could hold onto a Document, which maps way IDs to Tags. A Road tracks multiple way IDs. From a code structure perspective, does that still seem like what we want to do?

I agree this would very nicely clean up current hacks around crossing_nodes. Right now a very downstream caller outside of this library makes use of that data, but they could do the association with their own road representation themselves. Eventually osm2streets will natively handle crossing and barrier points in some other way.

The idea is that an interactive editor...

That's sort of the intention behind the raw map editor, but in its current form, it's overly complex because of the old way that IDs used to be managed. The refactor you describe will make building this kind of app much, much easier.

BudgieInWA · 2022-11-24T04:50:26Z

I think I've tried to get external tools to do that clipping before, but never gotten exactly the right thing working. Currently we tend to assume something like complete_ways strategy (osmcode.org/osmium-tool/manual.html) is passed in as input.

Yeah, the complete_ways strategy is how I imaging raw OSM data coming in because that's how the data download in JOSM works. All ways are complete and some stick out of the bounding box. The UI then draws the downloaded area on the map (showing "out of bounds" as yellow hatches), so the user can understand both 1) the extent of the objects they are editing and 2) where there may be unseen features nearby or connected to the loaded objects.

Because the OSM data model allows splitting ways wherever the mapper wants to, I think any "out of bounds" data should be ignored consistently, so it would be really cool for OsmExtract to clip strictly to the boundary polygon, truncating ways that cross over the boundary by inserting dummy nodes right at the boundary. It would want to keep areas as closed loops by doing a boolean intersection I guess... and maybe the clipped result is less useful for areas, but I feel like clipping our linear highway features in this way is still the best compromise.

OsmExtract might need to allow multiple clipped objects per id for ways that leave the boundary then come back in, or otherwise represent which sections are outside the boundary.

One consequence of rethinking the logic at the OSM level is treating very long roads or railways equivalently. Currently we special case railways that start and end outside the boundary but pass somewhere inside. No reason we shouldn't treat roads the same. (Old traffic sim assumptions leaking through were the original reason.)

Yes, I'd like to see the clipping be exhaustive at the OSM level, doing proper intersection checks for every line segment, though moving the clipping logic out of StreetNetwork land and into OSM land will be an improvement, even if the implementation is a rudimentary to begin with.

The takeaway here is that StreetNetwork should store a full copy of the original OSM data

In osm_tags I'm getting rid of osm_tags on Road. Separately a caller could hold onto a Document, which maps way IDs to Tags. A Road tracks multiple way IDs. From a code structure perspective, does that still seem like what we want to do?

I like the idea of OsmExtract being the ergonomic collection of raw data that the caller would refer to. Maybe the StreetNetwork (optionally) stores the extract itself and maintains a two-way mapping as it makes edits. That is, Roads and Intersections etc. point back to the OSM IDs and OsmExtract points to StreetNetwork IDs. StreetNetwork would be responsible for keeping this mapping up to date as it modifies its representation. It depends if node_id_to_road_id and node_id_to_intersection_id are useful long term.

The idea is that an interactive editor...

That's sort of the intention behind the raw map editor, but in its current form, it's overly complex because of the old way that IDs used to be managed. The refactor you describe will make building this kind of app much, much easier.

Even before we get there, thinking about how StreetNetwork can encapsulate the complexities behind those sorts of operations is how I make progress on understanding those complexities :D

dabreegster · 2022-11-24T09:10:20Z

It would want to keep areas as closed loops by doing a boolean intersection I guess

We have some support for that today. Areas get more complex, because they're often relations with some of the members missing in the complete-ways extract.

osm2streets/streets_reader/src/osm_reader/geom.rs

Line 28 in 117e8fb

    
           /// Take a bunch of partial PolyLines and attempt to glue them together into a single Ring. If the

attempts to use the boundary to glue things together. It's used for areas at https://github.com/a-b-street/abstreet/blob/ede907e23971427413b25982ef2c22e7c1309c6a/convert_osm/src/extract.rs#L149. Currently imperfect (a-b-street/abstreet#32) and I think hard to get right. So just focusing on non-area ways for the sake of this bug makes sense to me.

allow multiple clipped objects per id for ways that leave the boundary then come back in, or otherwise represent which sections are outside the boundary.

This is the case that

osm2streets/streets_reader/src/clip.rs

Line 33 in 117e8fb

    
           // Look for intersections outside the map. If they happen to be connected to multiple roads,

tries to handle. From a traffic sim perspective, some very weird patterns can happen for agents going to/from those map edge intersections. It's actually a connected road in reality, but the connection happens just outside the boundary. Not sure there's a better answer than doing this duplication though.

StreetNetwork would be responsible for keeping this mapping up to date as it modifies its representation

Hmm, maintaining a two-way mapping sounds tougher. Right now each Road and Intersection points back to a list of OSM IDs. We can always go build the reverse mapping (many-to-many I think...) by looking at the whole StreetNetwork. If we wanted to cheaply maintain a two-way mapping, we'd have to decide where to store it and plumb it around. If it lives in StreetNetwork, the borrow checker might fight us if we have an &mut method on Road or Intersection changing IDs. But maybe we never tend to do that in practice.

…ds into MapConfig. #127

dabreegster · 2022-12-12T13:49:15Z

I had a go at starting this idea, because our current haphazard handling of map edge geometry is complicating the geometry refactor. A few problems...

Tests: We don't have an explicit boundary for any of the tests. I think we need to go back and record one, because we otherwise assume a bounding box covering everything in the OSM input. It'd be good to do this anyway, so we can make obnoxious cases like quad_intersection stop stretching super far because of the old behavior about not clipping light rail. StreetExplorer could remember + generate a GeoJSON with the bbox passed to Overpass, and add a button to download it.

Implement OsmExtract::new(doc: osm::Document, boundary: Option<Vec>) -> Self which doesn't depend on anything StreetNetwork. It can clip the raw OSM features as it reads them from doc.

Currently OsmExtract has callback methods like handle_way. This is so a consumer like A/B Street can handle non-road stuff still. This is making me debate where the OSM-level clipping should happen. Maybe handle_way should take the Way, mutate it by clipping if needed, and return Option<Way> to the caller. If None, it was a road and has been handled. If not, the caller can use it, and additionally it's guaranteed to be clipped.

This gets a bit tricky with areas. We already have glue_multipolygon and a few other things in this repo anyway, only used by A/B Street. Maybe we move some of the area clipping logic from A/B Street back here. Eventually osm_reader + Document could be a separate Rust library, totally orthogonal to osm2streets concerns. There are a few OSM-reading crates out there, but I don't think any attempt clipping. Which is maybe kind of reasonable, because possibly the right thing to do depends on semantics deeper than the OSM data model. (A way can represent both an area and a linestring-like thing, and the clipping behavior for each is a bit different.)

I'm leaning towards a different API... after reading a Document, we optionally clip it. Nothing to do with OsmExtract -- it would already receive a clipped document. That avoids the funny handle_way API.

I think the complication of clipping at the document level is introducing geometry that doesn't come from OSM. Way already has both nodes and pts, so I think this'll be OK.

…127

Also has the side effect of getting rid of handling for tiny roundabouts, because it's hard to reason about...

dabreegster · 2022-12-12T15:04:59Z

Something started in a branch; it's much simpler. Two more questions raised:

Is it time to get rid of OriginalRoad? Can we just remember a Road's osm_way_ids? Is it ever useful to track which segments they came from, or could we just stitch this together from the two endpoints osm_node_ids?
If we don't provide a clip and infer one from the OSM data, we'll never detect map edges. That's... kind of expected. Should we always require a boundary as input maybe, to simplify the API? I'll browse through all my use cases and see if there's always one readily at hand.

I'll open a few PRs to start pushing this through.

them. #127

…s GeoJSON. #127

…reetExplorer. #127

Also plumb through the boundary when actually calling osm2streets.

…127

Also has the side effect of getting rid of handling for tiny roundabouts, because it's hard to reason about...

them. #127

…s GeoJSON. #127

…reetExplorer. #127

Also plumb through the boundary when actually calling osm2streets.

…127

Also has the side effect of getting rid of handling for tiny roundabouts. 3 tests break, with clipped roads appearing on the wrong side of the boundary.

…f the PolyLine are in-bounds. #127

…127

Also has the side effect of getting rid of handling for tiny roundabouts. 3 tests break, with clipped roads appearing on the wrong side of the boundary.

…f the PolyLine are in-bounds. #127

…127

Also has the side effect of getting rid of handling for tiny roundabouts. 3 tests break, with clipped roads appearing on the wrong side of the boundary.

…f the PolyLine are in-bounds. #127

The reality is that roads can have multiple way IDs and intersections multiple node IDs. The idea of remembering an OSM segment (way, node -> node) doesn't handle all cases. Any callers trying to do this need to deal with the real complexity of multiple OSM Ids. Also get rid of merge_osm_ways, an old debugging hack for A/B Street. If we want something like this again, we can set internal_junction_road earlier. Bunch of test diffs from picking up OSM IDs now.

dabreegster · 2022-12-23T00:04:00Z

I had to back away a bit more from the idea of clipping an OSM document being blind to the types of things being clipped. When loop roads like https://www.openstreetmap.org/way/128464171 are split by a boundary, it was breaking previously. The previous way was also throwing out some geometry for large multipolygon relations that crossed the boundary.

We can revisit generic clipping again, but we need osm2areas or something with tests first. :) A/B Street importer is my proxy for that in the meantime.

- traffic signals should never wind up on map edges - multipolygons and OSM document-level clipping are still strange

…ing. #127

dabreegster added a commit that referenced this issue Nov 26, 2022

Remove A/B Street Options from this library. Move a few relevant fiel…

575b5af

…ds into MapConfig. #127

This was referenced Nov 28, 2022

Scrub A/B Street fields and half-baked logic #137

Merged

Parse placement tags and shift roads accordingly #139

Merged

dabreegster mentioned this issue Dec 9, 2022

Intersection geometry refactor #136

Open

10 tasks

dabreegster added a commit that referenced this issue Dec 12, 2022

Rearrange some osm_reader code, putting methods on Document. #127

cbba086

dabreegster added a commit that referenced this issue Dec 12, 2022

Implement clip on the OSM-level Document, instead of a StreetNetwork. #…

e641c1f

…127

dabreegster added a commit that referenced this issue Dec 12, 2022

Overhaul StreetNetwork road splitting + clipping. #127

35fcbd0

Also has the side effect of getting rid of handling for tiny roundabouts, because it's hard to reason about...

dabreegster added a commit that referenced this issue Dec 12, 2022

Use leaflet-geoman plugin to draw custom boundary polygons and import

b4ef4c7

them. #127

dabreegster added a commit that referenced this issue Dec 12, 2022

Display the boundary in StreetExplorer and let the user download it a…

687bd9b

…s GeoJSON. #127

dabreegster added a commit that referenced this issue Dec 12, 2022

Manually create clipping boundaries for all test cases. Display in St…

db64e89

…reetExplorer. #127

dabreegster added a commit that referenced this issue Dec 12, 2022

Regenerate tests with the boundary, and accept all changes. #127

857e2bb

Also plumb through the boundary when actually calling osm2streets.

dabreegster mentioned this issue Dec 12, 2022

Explicitly use clipping boundaries in StreetExplorer and tests #151

Merged

dabreegster added a commit that referenced this issue Dec 13, 2022

Rearrange some osm_reader code, putting methods on Document. #127

7adf63c

dabreegster added a commit that referenced this issue Dec 13, 2022

Implement clip on the OSM-level Document, instead of a StreetNetwork. #…

b60626d

…127

dabreegster added a commit that referenced this issue Dec 13, 2022

Overhaul StreetNetwork road splitting + clipping. #127

ff74cad

Also has the side effect of getting rid of handling for tiny roundabouts, because it's hard to reason about...

dabreegster added a commit that referenced this issue Dec 13, 2022

Overhaul StreetNetwork road splitting + clipping. #127

4d457fe

Also has the side effect of getting rid of handling for tiny roundabouts, because it's hard to reason about...

dabreegster added a commit that referenced this issue Dec 14, 2022

Use leaflet-geoman plugin to draw custom boundary polygons and import

8cd1f7b

them. #127

dabreegster added a commit that referenced this issue Dec 14, 2022

Display the boundary in StreetExplorer and let the user download it a…

43563fc

…s GeoJSON. #127

dabreegster added a commit that referenced this issue Dec 14, 2022

Manually create clipping boundaries for all test cases. Display in St…

f32c4ab

…reetExplorer. #127

dabreegster added a commit that referenced this issue Dec 14, 2022

Regenerate tests with the boundary, and accept all changes. #127

6898bd6

Also plumb through the boundary when actually calling osm2streets.

dabreegster added a commit that referenced this issue Dec 15, 2022

Rearrange some osm_reader code, putting methods on Document. #127

0ff05ec

dabreegster added a commit that referenced this issue Dec 15, 2022

Implement clip on the OSM-level Document, instead of a StreetNetwork. #…

b8ddc34

…127

dabreegster added a commit that referenced this issue Dec 15, 2022

Overhaul StreetNetwork road splitting + clipping. #127

0667575

Also has the side effect of getting rid of handling for tiny roundabouts. 3 tests break, with clipped roads appearing on the wrong side of the boundary.

dabreegster added a commit that referenced this issue Dec 15, 2022

Rewrite clip_polyline_to_ring and be more careful about which parts o…

e4e811d

…f the PolyLine are in-bounds. #127

dabreegster mentioned this issue Dec 15, 2022

Overhaul clipping #154

Merged

dabreegster added a commit that referenced this issue Dec 16, 2022

Rearrange some osm_reader code, putting methods on Document. #127

1e0128f

dabreegster added a commit that referenced this issue Dec 16, 2022

Implement clip on the OSM-level Document, instead of a StreetNetwork. #…

2e57d85

…127

dabreegster added a commit that referenced this issue Dec 16, 2022

Overhaul StreetNetwork road splitting + clipping. #127

e0c6d41

Also has the side effect of getting rid of handling for tiny roundabouts. 3 tests break, with clipped roads appearing on the wrong side of the boundary.

dabreegster added a commit that referenced this issue Dec 16, 2022

Rewrite clip_polyline_to_ring and be more careful about which parts o…

acdee70

…f the PolyLine are in-bounds. #127

dabreegster added a commit that referenced this issue Dec 16, 2022

Rearrange some osm_reader code, putting methods on Document. #127

c19b7f0

dabreegster added a commit that referenced this issue Dec 16, 2022

Implement clip on the OSM-level Document, instead of a StreetNetwork. #…

a7639bf

…127

dabreegster added a commit that referenced this issue Dec 16, 2022

Overhaul StreetNetwork road splitting + clipping. #127

2c41dd9

Also has the side effect of getting rid of handling for tiny roundabouts. 3 tests break, with clipped roads appearing on the wrong side of the boundary.

dabreegster added a commit that referenced this issue Dec 16, 2022

Rewrite clip_polyline_to_ring and be more careful about which parts o…

2007bf2

…f the PolyLine are in-bounds. #127

dabreegster added a commit that referenced this issue Dec 23, 2022

Handle some clipping edge cases. #127

ac5623a

dabreegster added a commit that referenced this issue Dec 23, 2022

More workarounds for clipping, found while upgrading A/B Street. #127

ca1f718

- traffic signals should never wind up on map edges - multipolygons and OSM document-level clipping are still strange

dabreegster added a commit that referenced this issue Dec 23, 2022

Be a bit more rigorous with dangling multipolygon members after clipp…

b169d11

…ing. #127

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Simplify the early stages of creating a `StreetNetwork` from OSM data #127

Simplify the early stages of creating a `StreetNetwork` from OSM data #127

BudgieInWA commented Nov 23, 2022

dabreegster commented Nov 23, 2022

BudgieInWA commented Nov 24, 2022

dabreegster commented Nov 24, 2022

dabreegster commented Dec 12, 2022

dabreegster commented Dec 12, 2022

dabreegster commented Dec 23, 2022

Simplify the early stages of creating a StreetNetwork from OSM data #127

Simplify the early stages of creating a StreetNetwork from OSM data #127

Comments

BudgieInWA commented Nov 23, 2022

Clipping the Map

Mapping Between StreetNetwork and OSM

The Refactor

dabreegster commented Nov 23, 2022

BudgieInWA commented Nov 24, 2022

dabreegster commented Nov 24, 2022

dabreegster commented Dec 12, 2022

dabreegster commented Dec 12, 2022

dabreegster commented Dec 23, 2022

Simplify the early stages of creating a `StreetNetwork` from OSM data #127

Simplify the early stages of creating a `StreetNetwork` from OSM data #127

Mapping Between `StreetNetwork` and OSM