Create edge flags per edge instead of per way #2457

easbar · 2021-11-03T22:44:51Z

So far we create edge flags (EncodingManager/FlagEncoder#handleWayTags) for each OSM way, even though a way might be split into multiple edges. This works, because the flags are the same for each edge the way is split into. However, once we consider elevation (like #2456 ) or node tags (like traffic signals for example) this will no longer work. We already have EncodingManager/AbstractFlagEncoder#applyWayTags for this reason, but this is only used for the flag encoders, not the encoded values so far. We also determine the custom areas and countries per way and not per edge. This works as long as ways stop at country borders, but for smaller custom areas this can already break. Imagine a longer way that crosses multiple custom areas but has to be associated with just one. Also estimated_distance is rather imprecise when it is calculated per way, but applied for each edge.

GraphHopper does not really use the concept of an OSM way, except during import of course. So the only reason we calculate the flags per way not per edge is performance. Running the import for Germany with this PR (where the edge flags are calculated for each edge) was around 5-10% slower than master. This can of course depend on which encoded values are calculated and how many custom areas there are. However, in case there are very many custom areas I would also assume the higher precision is wanted or even necessary anyway.

We could start separating encoded values that need to be calculated per edge from those that are calculated per way, or add something like TagParser#applyWayTags (like here: 92dae9e), but considering the small speed up to me it seems like this might not really be worth it. In case the custom area index lookup was the bottleneck there is still room to speed up the area lookup (use rasterization).

I'll try to check the performance of this for a planet-wide import with many encoded values next.

…ficial tags for each edge

easbar · 2021-11-06T06:55:38Z

I got a few more numbers:

map	encoded_values	edge flags	artificial tags	import time
Germany	default	per way	per way	151-157s
Germany	all	per way	per way	170s
Germany	default	per segment	per segment	159s
Germany	all	per segment	per segment	180s
Germany	default	per segment	disabled	150s
Germany	all	per segment	disabled	170s

encoded_values=all means I disabled all encoded values listed in config-example.yml (except mtb_rating because it was broken (#2458)). encoded_values=default means the graph.encoded_values setting was empty (road_class,road_class_link,road_environment,max_speed,roundabout,road_access).

The artificial way tags are custom areas/countries mostly, but also date parsing and estimated distance.

We see that adding more encoded values increases the import time by about 9-12% when we calculate everything per way (170s vs. ~155s) and around 12% when we calculate everything per segment (180s vs 159s). Calculating everything per segment results in only around 6% slow down even when all encoded values are used. If less encoded values are used the slow down is even less pronounced.

Calculating the artificial way tags per segment is also only around 6% slower than not calculating them at all (this is for the default config with country lookups, but without any other custom areas!).

easbar · 2021-11-08T14:39:26Z

Some results when using the planet file (car only, no elevation data)

map	encoded_values	edge flags	artificial tags	import time
planet	default	per way	per way	5243s (1h27min)
planet	default	per segment	per segment	5724s (1h35min)

The difference was 8 minutes or 9%. @karussell (or anyone else) WDYT? I feel like this speed up is not worth the complication. Separating handleWayTags (=edge flags per way) and applyWayTags (=edge flags per segment) might be ok, but assigning custom areas based on the center of the OSM way is ugly (even though it probably works for countries in most cases). Another possibility might be splitting ways at custom area borders, but that seems more complicated to me and maybe it does not even perform any better.

karussell · 2021-11-09T09:43:14Z

core/src/main/java/com/graphhopper/reader/osm/OSMReader.java

        GHPoint estimatedCenter = null;
+        // todonow: we have to remove the tags again, because there can be multiple edges per way, but what if this was
+        //          not an artificial tag?
+        way.removeTag("estimated_distance");


maybe we start using a gh: prefix?

Yes, this is a bit ugly. The cleanest thing we can do is probably using a fresh copy of the ReaderWay for each edge? And yes prefixing the tags to distinguish them from standard OSM tags also seems useful.

Why do we even "abuse" tags for this? Wouldn't it be better to have a clear distinction between OSM tags and our own metadata? e.g way.setMetadata()

Good point :D

But modifying the way seems wrong even when we use a prefix or way.setMetadata(), because the artificial tags are only valid for the current edge and the way object will be used for multiple edges. Should we store the artificial 'tags' in a separate map and pass this to handleWayTags? So for the EncodingManager, TagParsers and FlagEncoder we would have something like:

public IntsRef handleWayTags(ReaderWay way, Map<String, Object> edgeTags, AcceptWay acceptWay, IntsRef relationFlags)

?! Or should we use the gh: prefix and copy the way object for each edge?

I consider functions with more than 4 arguments to be code smell, to be honest. But I'm not sure if we should move all of them into a single wrapper class though. We could still make a distinction between "raw" data(ReaderWay) and processed data. Without having touched relation handling in GH: are the relationFlags associated with a single way or is this some shared datastructure?

Relation flags just represent the condensed information belonging to all the relations associated to a way. The only reason we use the IntsRef/flags for this information is saving memory during import. You can think of them as data that was already extracted from the source OSM data. If we used a wrapper class, why would you prefer separating the ReaderWay from the others?

Not really a strong preference. I was just pointing out that we could do this if we want to clearly separate between raw OSM data and the results of GH parsing/processing.

I also wonder if we could postpone the signature change (even if it was the right thing to do). We will have to change the signature for #2465 and also the node tag handling might change again. Currently the only artificial/edge tag we use that actually does occur in OSM data (according to taginfo) is country and we already ignore/overwrite this in current master.

Ok, let's come back to this later.

karussell · 2021-11-09T09:46:46Z

(even though it probably works for countries in most cases).

My assumption here was that if properly mapped this would work for all cases :) but keeping this is a bit dirty.

I would say let's make it simpler and optimize speed later if really required and major benefits (>30%)

easbar · 2021-11-10T06:20:28Z

My assumption here was that if properly mapped this would work for all cases :)

Apart from mapping errors yes. But the real problem are custom areas.

I would say let's make it simpler and optimize speed later if really required and major benefits (>30%)

I agree.

cosmin-petrescu · 2021-11-12T07:10:27Z

core/src/main/java/com/graphhopper/reader/osm/OSMReader.java

        // Estimate length of ways containing a route tag e.g. for ferry speed calculation
-        double firstLat = first.getLat(), firstLon = first.getLon();
-        double lastLat = last.getLat(), lastLon = last.getLon();
+        double firstLat = pointList.getLat(0), firstLon = pointList.getLon(0);


Sadly, because we calculate this estimated center we may choose another country (see https://discuss.graphhopper.com/t/country-of-way-is-wrong-on-road-near-border-with-curvature/6908). Wouldn't be a better idea to pick up a middle point (or close to is) as:

double middleLat = pointList.getLat(pointList.size() / 2), middleLon = pointList.getLon(pointList.size() / 2);

This way we make sure the selected point in on the edge/way.

I opened #2467 for this

easbar added 12 commits November 2, 2021 07:10

Set artificial way tags and edge flags per segment not per way

b8bb421

Do date parsing per way

e45c19f

Merge branch 'master' into artificial_tags_per_segment

faadedf

Merge branch 'master' into artificial_tags_per_segment

8f3090a

Fix lat/lon

199559c

Move data parsing back to where edge flags are calculated, clear arti…

5f6b74f

…ficial tags for each edge

test: disable artificial tags entirely

804e713

test: add lots of encoded values

fe0063f

test: enable artificial way tags again

069f56c

hiking_rating -> hike_rating, remove broken mtb_rating

090e385

And again without artificial tags

e681e59

go back to master, but keep the encoded values

ff91f37

easbar added 3 commits November 7, 2021 08:09

per edge again, but including preprocessrelation

8a197c0

Merge branch 'master' into artificial_tags_per_segment

5b7e8fc

go back to default evs

64c4562

karussell reviewed Nov 9, 2021

View reviewed changes

cosmin-petrescu reviewed Nov 12, 2021

View reviewed changes

Merge branch 'master' into artificial_tags_per_segment

c3ba0fb

easbar marked this pull request as ready for review November 17, 2021 11:39

easbar merged commit 69b412d into master Nov 18, 2021

easbar deleted the artificial_tags_per_segment branch November 18, 2021 18:17

easbar mentioned this pull request Nov 18, 2021

Use a point along the edge geometry to determine the country/custom_areas #2467

Closed

easbar added a commit that referenced this pull request Nov 18, 2021

Update comments and changelog (for #2457)

b015ac8

easbar added this to the 5.0 milestone Nov 18, 2021

This was referenced Nov 19, 2021

Remove AbstractFlagEncoder#applyWayTags #2469

Closed

Use a point along the edge geometry to determine areas #2472

Merged

This was referenced Mar 1, 2022

FerrySpeedCalculator refactoring #2526

Open

add applyWayTags to TagParser #2527

Closed

Clean up FerrySpeedCalculator, use exact edge distance #2528

Merged

easbar mentioned this pull request Mar 15, 2022

Improve handling of short ferries without duration tag #2539

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Create edge flags per edge instead of per way #2457

Create edge flags per edge instead of per way #2457

easbar commented Nov 3, 2021 •

edited

Loading

easbar commented Nov 6, 2021

easbar commented Nov 8, 2021

karussell Nov 9, 2021

easbar Nov 10, 2021

otbutz Nov 10, 2021

easbar Nov 10, 2021

easbar Nov 17, 2021

otbutz Nov 18, 2021

easbar Nov 18, 2021

otbutz Nov 18, 2021

easbar Nov 18, 2021

easbar Nov 18, 2021

karussell commented Nov 9, 2021 •

edited

Loading

easbar commented Nov 10, 2021

cosmin-petrescu Nov 12, 2021 •

edited

Loading

easbar Nov 18, 2021

Create edge flags per edge instead of per way #2457

Create edge flags per edge instead of per way #2457

Conversation

easbar commented Nov 3, 2021 • edited Loading

easbar commented Nov 6, 2021

easbar commented Nov 8, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

karussell commented Nov 9, 2021 • edited Loading

easbar commented Nov 10, 2021

cosmin-petrescu Nov 12, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

easbar commented Nov 3, 2021 •

edited

Loading

karussell commented Nov 9, 2021 •

edited

Loading

cosmin-petrescu Nov 12, 2021 •

edited

Loading