Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Edge rates not used in matching #4785

Open
Nate-Wessel opened this issue Jan 8, 2018 · 14 comments
Open

Edge rates not used in matching #4785

Nate-Wessel opened this issue Jan 8, 2018 · 14 comments
Labels

Comments

@Nate-Wessel
Copy link

I've recently been playing with the new relation support in profiles, trying to get my map-matching results to favour transit routes. I had been having trouble seeing any evidence that the edge rates I was setting had any influence on the matching results, so for testing purposes I set rates and speeds on all streets to the same value, but multiplied rates on streets with transit routes by 30x. I started getting slightly different results, but still not the results I expected.

The route service now seems to show the effects of this for my test case, giving me some weird results that clearly follow known transit lines very closely rather than taking a shortest or quickest path. When I use the matching service however, I don't see a similar effect. I have a few test cases I'm looking at (GPS traces from transit vehicles) that hadn't been matching to their routes properly, but which looked like they could with a little nudge. However the map-matching result is seemingly unchanged regardless of the way I set rates on edges.

Does the matching service even support custom edge weights? If so, do I have to enable this feature somehow, or what else could I be missing? I'm using the profile API version four, and developed a rudimentary profile based on the current car.jua profile.

@Nate-Wessel
Copy link
Author

Nate-Wessel commented Jan 8, 2018

It seems that speed also has no effect on matching, whether timestamps are included or not. I get exactly the same matches even with ridiculous speeds. It only effects the duration and weight given in the response. Duration and weight by the way give identical values in every case I've looked at.

@TheMarex
Copy link
Member

The rate only affects the matching algorithm indirectly. The rate obviously changes the route with minimal cost between two coordinates. However the way the match plugin scores these routes to be the most probable one does not involve the rate, but only the distance/shape of the route.

I would recommend to maybe make a special profile that only contains the transit lanes to constrain it directly.

@Nate-Wessel
Copy link
Author

Thanks for your reply.

I could do that, but it really misses the point of doing the map matching. To use only the routes would imply that I know for certain where the vehicles are going, which just isn't the case. Sometimes they make detours, sometimes they're deadheading to/from garages, etc. and for what I'm doing it's pretty important that I pick up those details. I really just want to give a slight preference to known routes.

I know I don't have a common use-case here, but I would suggest that alternative distance metrics should be included in the map matching algorithm in future releases. Certainly plain old distance is fundamental to choosing match candidates within a radius of a GPS point, but why would only literal distance be considered when we have to arrange a plausible trip linking up all of the input coordinates? Given that the match API accepts a vector of timestamps as input, I would expect speed at least to play a definite role in selecting a plausible route. So for example, if there are two parallel routes of roughly the same physical distance, one faster, one slower, shouldn't the timestamps of the input coordinates help decide the case? And then if things were set up to use either distance and/or time as a distance metric, certainly rate could be used as well.

It seems like that might be pretty straightforward to implement as it could just be switching one cost metric (distance) for another (time, rate).

@danpat
Copy link
Member

danpat commented Jan 12, 2018

It seems like that might be pretty straightforward to implement as it could just be switching one cost metric (distance) for another (time, rate).

@Nate-Wessel patches are always welcome!

@Nate-Wessel
Copy link
Author

I'd be up for having a go at this, though I'd appreciate a little input up front to make sure I'm actually on the right track. (ha!)

It seems from the algorithm described by Newsom and Krumm (2009) that the assessment of match probability is based entirely on a comparison of the straight-line distance between GPS points and the network distance of a match candidate. For them, a network distance not much greater than the straight-line distance indicates a better match. The only way they use time in that paper is as a reasonableness check for speeds on matched route candidates. Krumm has a conference paper from a couple years earlier though where he describes essentially the same algorithm but using a time metric rather than distance, and he provides an equation for the time-based transition probabilities.

He says in the later paper that the distance-based probabilities were found to work better, but I imagine the time-based ones would perhaps work better for lower sampling frequencies and especially where routes are circuitous (as they often are in my case). The longer the time between points, the less I would expect that network distance should look like straight-line distance.

For rate-based transition probabilities, we would basically want to prefer the route with the lowest weight (distance/rate). I'm not sure at the moment how transition probabilities could be calculated in this case... we may need to reestimate a Beta parameter based on a sample dataset. .. and set the minimum weight by whatever the lowest observed weight is rather than by having something e.g. straight-line distance as a baseline. I'd have to think about this one some more.

From looking at the car.lua profile, it seems like we already have a mechanism for selecting among these three options (distance, time, weight) for routing in the weight_name property. My intuition was that this already effected matching. I assume we could just access this property when selecting the transition probabilities to use for the matching algorithm. I'm also assuming that it's not so hard to access weight or time rather than distance if we already have a list of edges as match candidates. But I haven't programmed in C++ before, so I'm not entirely sure what I'm looking at yet.

@TheMarex
Copy link
Member

TheMarex commented Jan 19, 2018

So the reason why we preferred the method of using distance over duration is motivated by the application we wrote the map matching for. The map matching originally was developed to match real vehicle traces to the road network to extract speed/traffic information for each segment.
We knew the duration value computed by our heuristics in the Lua profile was wrong, that is why we wanted to collect the empirical data. By using the heuristic duration we would bias our traffic model to only include traces that already confirm the model we have.

That said, if you have a good empirical model, I would expect the duration to yield better results. I suspect that Newson & Krumm used a duration model that was similarly flawed as our: Based on heuristics using the maximum speed (a freeflow model). One major problem will be to figure out a good transition probability estimator, since the variance on duration is not fixed over the whole road network.
Imagine a duration model that uses average day time speeds as base. Some residential roads will always be empty, no matter if its day or night time. So you can expect the variance on speed on that roads to be basically zero. Now compare that to a major street. The day time average will be much bigger than the night time average, so there is a big variance.
Without knowing the variance of each street the candidate route passes over it is hard to compute the actual probability. Though, it should be possible to estimate an overall variance, the question is just how good this model would be.

From looking at the car.lua profile, it seems like we already have a mechanism for selecting among these three options (distance, time, weight) for routing in the weight_name property. My intuition was that this already effected matching. I assume we could just access this property when selecting the transition probabilities to use for the matching algorithm. I'm also assuming that it's not so hard to access weight or time rather than distance if we already have a list of edges as match candidates. But I haven't programmed in C++ before, so I'm not entirely sure what I'm looking at yet.

I think we could implement this feature without having the need to change the weighting. The routing should always just use the weight, what this refers to can be configured in the profile. We could add an option to the match plugin that determines how the transition probability is computed.

  1. Using the distance of a route.
  2. Using the duration of a route.

It should not be too hard to nicely abstract that away behind an option that could be changed on a request-by-request basis, maybe use the duration by default if you pass in timestamps.

@Nate-Wessel
Copy link
Author

That's a good point about the non-constant time variability; perhaps the time-based matching will take a little more thought. I haven't really taken a good look at the formula proposed in that conference paper I mentioned.. I just saw it was there, but I would suspect it doesn't include per-edge travel time distributions. I think what I really need for my case though is weight-based probabilities.

The routing should always just use the weight, what this refers to can be configured in the profile.

What do you mean by "the routing"? Perhaps I should illustrate a specific case where I'm having some trouble:
problematic matching screenshot

The dashed line is my input GPS trace, and the orange highlight is the returned match geometry. I know that this particular transit route should continue on Golfair Blvd and turn north at the light. The match given here detours on side streets instead. In the profile I used to build the graph, I've given every edge the same exact speed, but multiplied the rate where there is a transit route in the OSM data as there is here on Golfair. I know there aren't any other problems in the data, because the correct route often is given, at least when the input coordinates are a little less ambiguous than in this particular case. I've also used OSRM tileviewer to check that I am actually assigning values to the correct edges. I only know how to look at speed through this tool at the moment though, so I cannot directly confirm that rate is actually being assigned to edges. The way_function from my profile is below:

function process_way(profile, way, result, relations)
	railway = way:get_value_by_key('railway')
	highway = way:get_value_by_key('highway')
	route = way:get_value_by_key('route')
	-- perform a quick initial check and abort if the way is
	-- obviously not routable.
	-- highway or route or railway tags must exist
	if (not highway) and (not route) and (not railway) then
		return
	end
	-- now check for any access tags
	for i, access_tag in ipairs(profile.access_tags_to_check) do
		-- does this access tag exist and have an acceptable value?
		local access_value = way:get_value_by_key(access_tag)
		if access_value and not profile.access_values_to_accept[access_value] then 
			-- not a barrier to transit
			print('denied way with access_tag',access_tag,'=',access_value)
			return 
		end -- if acceptable access value
	end -- for access tags to check
	-- set the default travel mode
	result.forward_mode = mode.driving
	result.backward_mode = mode.driving
	-- set default speeds
	result.forward_speed = 20
	result.backward_speed = 20
	-- set default rates based on speed
	result.forward_rate = result.forward_speed / 3.6
	result.backward_rate = result.backward_speed / 3.6
	-- HANDLE ANY RELATION DATA
	-- get a list of relations
	local rel_id_list = relations:get_relations(way)
	-- iterate over them
	for i, rel_id in ipairs(rel_id_list) do
		-- get the relation object
		local rel = relations:relation(rel_id)
		-- find the type of relation
		local reltype = rel:get_value_by_key("type")
		-- is the relation a route?relation
		if reltype == 'route' then
			-- is the route a transit route?
			local route = rel:get_value_by_key("route")
			local route_name = rel:get_value_by_key("name")
			if route and ( route == 'bus' or route == 'tram' ) then
				-- let's be sure we've done something
				local street_name = way:get_value_by_key("name")
				print('route',route_name,'on',street_name)
				-- adjust the weight (rate)
				result.forward_rate = result.forward_rate * 5
				result.backward_rate = result.backward_rate * 5
				-- now DON'T re-multiply for ways with multiple routes
				return
			end -- if transit route
		end -- if route
	end -- for relation
end -- way function

I ultimately just want a slight preference for matches that are on-route to correct cases like the above.

@Nate-Wessel
Copy link
Author

I spent a bit more time looking into this issue and found that, in fact, my profile is simply not applying rates to the edges at all. I created a separate issue for this: #4837

@Nate-Wessel
Copy link
Author

It seems that the rate was being set by my profile, however the matching results are definitely yielding, as in the example above, geometries that favour minimal distance over minimal weight.

@qpkorr
Copy link

qpkorr commented Aug 5, 2018

Hi! I found this issue whilst seeking the best method to tweak my profile to more heavily preference motorways - I'm new to OSRM and even understanding your process_way function is a challenge. I'm keen to also know the minimum time the trip should take (within the speed limits on each road segment) - so I was thinking it's the motorway rate I should be altering - but reading this - I'm wondering if it's actually possible to achieve my goal. I was just wondering if you had any comments - and if you found any solutions to altering the matching behaviour? Thanks!

@Nate-Wessel
Copy link
Author

I never did find a way to solve this issue to my satisfaction. To my knowledge, matching is still done based on distance only and rate/speed is not considered.

@Nate-Wessel Nate-Wessel changed the title Edge rates not used in matching? Edge rates not used in matching Aug 6, 2018
@qpkorr
Copy link

qpkorr commented Aug 6, 2018

Thanks for the response. I dipped my toes in the lua code and added the following additional function in the Sequence of handlers in the process_way function:

function adjust_rate_to_favour_motorways(profile,way,result,data)
  -- print what we're doing to see if this code is getting called
  local name = way:get_value_by_key('name') or "UNKNOWN"
  local highway = way:get_value_by_key('highway') or "UNKNOWN"
  print(name .. ":" .. highway)
  
  -- Change rates for non-motorways only
  if highway ~= "motorway" then
    result.forward_rate = result.forward_rate * 0.1
    result.backward_rate = result.backward_rate * 0.1
  end
end

It definitely seemed to affect the routing, favouring motorways as intended. I added it as the second-last handler in the car profile (ah - with a couple of other minor changes to it, but I doubt they were relevant).

However - my understanding had been that this should affect the routing but not the travel times (duration) as the speed should be unaffected - but in practice, on routes where the route description didn't change, the times still increased (around 25%, including those that included only motorways, and thus shouldn't have been affected by this handler).

Have I misunderstood something - eg am I modifying something other than "edge rates"? I'm very new to this as I say. The last commit of osrm-backend I've got is from Wed Jun 6 13:50:48 2018 +0200, if that's relevant.

@qpkorr
Copy link

qpkorr commented Aug 7, 2018

Hmm - more hunting and I found the reason for the change in my times (my fault, my comparison times had speed_reduction changed). More to the point - I realized I'm only looking at routing, not matching - and as you say, the former is indeed being influenced by rate modifications. Apologies for the irrelevant distraction!

Copy link

github-actions bot commented Jul 8, 2024

This issue seems to be stale. It will be closed in 30 days if no further activity occurs.

@github-actions github-actions bot added the Stale label Jul 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants