Introducing A Centroid/Converge/Rendezvous/Meet API #2734

kevinkreiser · 2020-12-14T21:35:37Z

What is this?

This past weekend I was thinking about a problem. That problem was specifically:

Given a bunch of people at different locations, what is the optimal location for those people to meet?

My first thoughts on such an algorithm were essentially that classical algorithms won't work because:

They require the destination to be known and use it as the stopping criterion, but here we only know the origin locations. Isochrones are an exception, but they use a maximum distance/time as the stopping criterion
We need to track multiple independent paths concurrently. We do this with the matrix, but it's very inefficient

My second thought was well what even is a "best" meeting place?

In the plane, or geometrically speaking, I immediately thought of the centroid, which minimizes the sum of squared distances to all input points (the point that minimizes the plain sum of distances is the geometric median)
Here our metric is actually path cost rather than distance. That is, we want to find a location, reachable from all input locations, for which the sum of all paths' costs to that location is minimal
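
To make the objective above concrete, here is a minimal sketch that assumes the path cost from every origin to every candidate location has already been computed. The function name `best_meeting_candidate` and the cost table are hypothetical and are not part of this PR; the real algorithm never materializes such a table.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// costs[i][j] is the path cost from origin i to candidate location j.
// Returns the index of the candidate whose summed cost over all origins is smallest.
// Hypothetical helper for illustration only.
std::size_t best_meeting_candidate(const std::vector<std::vector<double>>& costs) {
  assert(!costs.empty() && !costs.front().empty());
  const std::size_t candidates = costs.front().size();
  std::size_t best = 0;
  double best_sum = std::numeric_limits<double>::infinity();
  for (std::size_t j = 0; j < candidates; ++j) {
    double sum = 0.0;
    for (const auto& row : costs) {
      assert(row.size() == candidates); // every origin needs a cost to every candidate
      sum += row[j];
    }
    if (sum < best_sum) {
      best_sum = sum;
      best = j;
    }
  }
  return best;
}
```

With three origins and three candidates, the candidate with the lowest column sum wins, even if it is not the cheapest for any single origin.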

Once I had formulated the problem properly it was time to consider how the algorithm should work. For that I came up with a few requirements:

Convergence happens when all origins find their shortest paths to (settle) the same edge in the graph
We need an efficient way to do path multiplexing rather than relying on tracking all of the paths in their own different queues
We need an API over HTTP
We need a demo UI

Problem 1: Efficient Path Multiplexing

The existing implementation of the priority queue (combo of edgelabel, edgestatus and double bucket queue) doesn't currently support tracking paths from multiple locations at once. In bidirectional a* and in cost matrix for example, where we need this kind of implementation, we instantiate two queues for each origin/destination pair. This has some drawbacks:

Allocations happen across all copies of these data structures
The algorithms have to jump around in memory as we access each separate copy

So I wanted to see if I could find a way to mark entries in the queue in such a way that I could tell which path expansion (location) they came from and differentiate them. That would let us use a single queue for many path expansions at the same time (multiplexing).
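
The idea of multiplexing can be sketched on a toy graph. This is not the Valhalla implementation: it settles nodes instead of edges, uses `std::priority_queue` in place of the double bucket queue, and the name `first_common_settled_node` is made up for illustration. Each queue entry carries the path id of the origin it belongs to, so one queue serves every expansion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <queue>
#include <tuple>
#include <utility>
#include <vector>

// graph[n] lists (neighbor, cost) pairs for node n
using Graph = std::vector<std::vector<std::pair<int, double>>>;

// Runs Dijkstra from all origins through ONE queue. Returns the first node that
// every origin has settled, or -1 if the expansions never meet.
int first_common_settled_node(const Graph& graph, const std::vector<int>& origins) {
  assert(!origins.empty() && origins.size() <= 64);
  using Entry = std::tuple<double, int, std::uint8_t>; // cost, node, path id
  std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> queue;
  // settled[n] has bit p set once the expansion with path id p has settled node n
  std::vector<std::uint64_t> settled(graph.size(), 0);
  const std::uint64_t all = origins.size() == 64 ? ~std::uint64_t{0}
                                                 : (std::uint64_t{1} << origins.size()) - 1;
  for (std::size_t p = 0; p < origins.size(); ++p)
    queue.emplace(0.0, origins[p], static_cast<std::uint8_t>(p));

  while (!queue.empty()) {
    const Entry top = queue.top();
    queue.pop();
    const double cost = std::get<0>(top);
    const std::size_t node = static_cast<std::size_t>(std::get<1>(top));
    const std::uint8_t path_id = std::get<2>(top);
    const std::uint64_t bit = std::uint64_t{1} << path_id;
    if (settled[node] & bit)
      continue; // stale entry, this path already settled the node at lower cost
    settled[node] |= bit;
    if (settled[node] == all)
      return static_cast<int>(node);
    for (const auto& next : graph[node]) {
      if (!(settled[static_cast<std::size_t>(next.first)] & bit))
        queue.emplace(cost + next.second, next.first, path_id);
    }
  }
  return -1;
}
```

On a five-node line graph with unit costs and origins at both ends, the two expansions meet at the middle node.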

To do this I was able to mark both the edge labels in the labelset and the edge statuses (index into the labelset) with a path id/index/color to differentiate which location a particular path was tracking. I was able to use the 7 spare bits in the tile id of the edge status and free up some bits in the edge label. Double bucket queue didn't need any changes because it works with labelset indices directly.
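
A rough sketch of the bit packing described above. The layout is an assumption made for illustration (a 32-bit key with the tile/level id in the low 25 bits and the path id in the 7 bits above it); the names `make_status_key`, `tile_of` and `path_of` are hypothetical and do not mirror the actual edgestatus code.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout: low 25 bits hold the tile/level id, high 7 bits hold the path id.
constexpr std::uint32_t kTileBits = 25;
constexpr std::uint32_t kTileMask = (std::uint32_t{1} << kTileBits) - 1;
constexpr std::uint8_t kMaxPathId = 127; // largest value that fits in 7 bits

// The path id defaults to 0, so callers that track a single path are unaffected.
inline std::uint32_t make_status_key(std::uint32_t tile_level_id, std::uint8_t path_id = 0) {
  assert(tile_level_id <= kTileMask);
  assert(path_id <= kMaxPathId);
  return tile_level_id | (static_cast<std::uint32_t>(path_id) << kTileBits);
}

inline std::uint32_t tile_of(std::uint32_t key) {
  return key & kTileMask;
}

inline std::uint8_t path_of(std::uint32_t key) {
  return static_cast<std::uint8_t>(key >> kTileBits);
}
```

Because the default path id of 0 leaves the key equal to the plain tile/level id, existing single-path callers keep working, which matches the "optional" behaviour described in this PR.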

After those changes everything worked as normal without any changes to the other algorithms since the path id/index/color is optional. But what I want to try out is to switch to using just one queue in bidirectional a* to see if I can get a performance boost from not having to jump back and forth to 2 different memory locations. I'm considering PRing just this change separately but I'll get to that a bit later!

Problem 2: Core Algorithm

The algorithm itself is quite simple. Remember that we need to come up with a destination, which means we can't use any of the directed search algorithms (a* et al) that rely on knowing the destination. It would be nice if we could because they have better performance than Dijkstra's, but the way they get that performance is precisely by using the goal heuristic to coax the path expansion toward the goal. In our case we cannot pick a goal that would generate an admissible heuristic, which means that's off the table for us. But there are other tricks we can do to cull the search space. I'll get to those in the future work section.

So it's Dijkstra's for us. No problem. Remember the main objective for convergence: we need to return the first edge in the graph to which all locations have found a shortest path. What that means is that, as we pop edges off the queue (i.e. find the shortest path from an origin to that edge), we need to track which other origins have also found paths to this edge. To do that I created a small struct which holds two 64-bit masks. Each bit in the mask represents an origin that has either found or not found its shortest path to this edge. When I get the callback from Dijkstra's that we've settled an edge, I check which path/origin it was for and I flip that bit on my tracking struct. If the bit I just flipped was the last one that needed flipping, then we have converged. Once that happens Dijkstra's stops and we call FormPaths, which recovers the edges of the path for each individual origin location (the result looks like an alternate routes result, i.e. it has multiple routes in the output).
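
The mask bookkeeping can be sketched as follows. This is a simplified stand-in rather than the PathIntersection class from the PR: the name `ConvergenceTracker`, its interface, and the choice to store the still-outstanding bits are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Tracks, for one candidate edge, which locations have not yet settled it.
// Two 64-bit words give room for up to 128 locations.
struct ConvergenceTracker {
  explicit ConvergenceTracker(std::uint8_t location_count) : remaining_{0, 0} {
    assert(location_count >= 1 && location_count <= 128);
    // start with one outstanding bit per location
    for (std::uint8_t i = 0; i < location_count; ++i)
      remaining_[i / 64] |= std::uint64_t{1} << (i % 64);
  }

  // Clears the bit for path_id and returns true once no bits are outstanding,
  // i.e. every location has found its shortest path to this edge.
  bool settle(std::uint8_t path_id) {
    assert(path_id < 128);
    remaining_[path_id / 64] &= ~(std::uint64_t{1} << (path_id % 64));
    return remaining_[0] == 0 && remaining_[1] == 0;
  }

  std::uint64_t remaining_[2];
};
```

Settling the same location twice is harmless, and the call that clears the last outstanding bit is the one that reports convergence.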

Problem 3: HTTP API

More cool stuff to elaborate on here but the great news is that the input to the API already looks exactly like a normal /route request in that you need 2 or more locations. So I quickly added a new action to the request called centroid and focused on the output. The good news there is that output looks exactly like a route with alternates=n except now n is the total number of input locations. Another thing I did here was modify the valhalla route serializer to support alternates (THANK GOD). The current implementation has something like {"trip":{... your route here ...}}. This made it quite clunky to add alternates but I found a reasonable way: {"trip":{... your route here ...}, "alternates":[{... your alternate here ...}, ...]}. I can PR this separately as well.

Demo:

I cracked open vim and quickly hacked together a leaflet demo to show off this API. You simply click the locations on the map you want to use as your input locations and then press the button at the bottom to fire off the request. The green dots are origins and the red dot is the destination. PR in the demos repo is here: valhalla/demos#234

Future Work:

There are a number of things to do to make this work practical and useful. Some make sense to add to the API; others should be saved for other projects that make use of it. I'll list off a few:

Things that make sense to add:

We can make the process more efficient in the general case by pruning the Dijkstra's expansion when it leaves some bounding box of the input locations. In extreme cases this could cause the path finding to fail; however, we could fall back to no bounding box on a retry.
Allow the caller to specify a max road class to allow a meeting on. You don't want to meet someone on a limited access highway, that could be rough 😄
If we let the algorithm run a bit longer we could find multiple meeting locations. We'd have to come up with criteria for acceptance of alternate meeting locations but a quick first one could be distance based: no alternate meeting locations within x meters of each other as they are too similar.
A way to penalize areas in the graph that you wouldn't want to meet at. We have hard avoids which would work but maybe soft avoids are better?
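
As a sketch of the first item in this list, the pruning test could be a padded bounding box around the input locations. None of this exists in the PR: the types `LatLng` and `BoundingBox`, the function `padded_bounds` and the padding value are all hypothetical, and the sketch ignores boxes that cross the antimeridian.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct LatLng {
  double lat;
  double lng;
};

struct BoundingBox {
  double min_lat, min_lng, max_lat, max_lng;
  // true when the point lies inside or on the border of the box
  bool contains(const LatLng& p) const {
    return p.lat >= min_lat && p.lat <= max_lat && p.lng >= min_lng && p.lng <= max_lng;
  }
};

// Smallest box around all locations, grown by pad_degrees on every side.
BoundingBox padded_bounds(const std::vector<LatLng>& locations, double pad_degrees) {
  assert(!locations.empty());
  BoundingBox box{locations[0].lat, locations[0].lng, locations[0].lat, locations[0].lng};
  for (const auto& p : locations) {
    box.min_lat = std::min(box.min_lat, p.lat);
    box.min_lng = std::min(box.min_lng, p.lng);
    box.max_lat = std::max(box.max_lat, p.lat);
    box.max_lng = std::max(box.max_lng, p.lng);
  }
  box.min_lat -= pad_degrees;
  box.min_lng -= pad_degrees;
  box.max_lat += pad_degrees;
  box.max_lng += pad_degrees;
  return box;
}
```

An expansion would skip any edge whose end point fails `contains`. If no convergence point is found that way, the caller could rerun the search with pruning disabled, which is the retry described in the list.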

Things that make sense for users of the API to add themselves:

Seems like the results of this API could be intersected with a POI database and that could be used to give back more relevant results. Like if you knew a person wanted to eat and the lowest cost meeting place had no restaurants nearby but the first alternate had 10 restaurants then maybe you suggest the alternate

Apart from that there are a lot of nice TODOs listed in the code which I'm not really worried about tackling in this iteration. I think it's quite alright to offer this service as a fun little beta API for people to try out!

Unit Testing

This PR still needs unit tests, I'm working on those.

kevinkreiser · 2020-12-14T22:44:19Z

bench/thor/routes.cc

@@ -276,7 +276,7 @@ static void BM_Sif_Allowed(benchmark::State& state) {
  // auto pred = sif::EdgeLabel(0, tgt_edge_id, edge, costs, 1.0, 1.0,
  // sif::TravelMode::kDrive,10,sif::Cost());
  auto pred = sif::EdgeLabel();
-  int restriction_idx;
+  uint8_t restriction_idx;


you'll see changes similar to this throughout. basically we didn't need 2^32 values to represent the restriction index for a restriction at a particular edge. if we ever find an edge that even has 256 restrictions i'll eat my hat 😄

curious what the benefit is vs. the potential for overflow (though rare do you now have to do some bounds checking anywhere?)

yeah, inside the function that actually checks for restrictions (in dynamic cost), if the index is larger than 254 we can't tell what restriction was there in the route. note that this doesn't mean that we won't adhere to the restriction, it just means that the serializer doesn't know that a restriction was there.
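
a minimal sketch of that narrowing, assuming 255 is reserved as the "unknown" marker. the names `kNoRestrictionIdx` and `narrow_restriction_index` are hypothetical and are not the identifiers used in dynamic cost.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sentinel: indices 0..254 are stored as-is, anything larger
// collapses to 255, meaning "a restriction may exist but we can't say which".
constexpr std::uint8_t kNoRestrictionIdx = 255;

inline std::uint8_t narrow_restriction_index(std::uint32_t idx) {
  return idx < kNoRestrictionIdx ? static_cast<std::uint8_t>(idx) : kNoRestrictionIdx;
}
```

the restriction is still enforced during expansion in this scheme; only the index reported downstream is lost when it does not fit.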

kevinkreiser · 2020-12-14T22:45:19Z

src/thor/centroid.cc

+ * @param reader   provides access to graph primitives
+ * @return the constructed location
+ */
+valhalla::Location make_centroid(const valhalla::baldr::GraphId& edge_id,


use an edge to make a destination location for the route

kevinkreiser · 2020-12-14T22:47:12Z

src/thor/centroid.cc

+namespace thor {
+
+// constructor
+PathIntersection::PathIntersection(uint64_t edge_id, uint64_t opp_id, uint8_t location_count)


this class tracks potential convergence points. it uses masks to figure out which locations have found shortest paths to this edge. i had actually forgotten that i don't need to track both the edge id and its opposing edge id, since we only use the smaller of the 2 edge ids to track that particular meeting point. ~~so this is another TODO, we can remove the opp_id and save some ram.~~ <-- Done!

kevinkreiser · 2020-12-14T22:48:23Z

src/thor/centroid.cc

+
+// this is fired when the edge in the label has been settled (shortest path found) so we need to check
+// our intersections and add or update them
+thor::ExpansionRecommendation Centroid::ShouldExpand(baldr::GraphReader& reader,


this is the meat of the algorithm. here we get informed that a specific location settled a specific edge, we check if we are already tracking it and we flip the bit corresponding to the input location that settled it. if we flipped the last outstanding bit in the mask, then it's over and we found a least cost convergence point

kevinkreiser · 2020-12-14T22:48:52Z

src/thor/centroid.cc

+// walk edge labels to form paths for each location to the centroid
+template <typename label_container_t>
+std::vector<std::vector<PathInfo>>
+Centroid::FormPaths(const google::protobuf::RepeatedPtrField<valhalla::Location>& locations,


nothing interesting here, just loop over the locations and recover their paths individually

kevinkreiser · 2020-12-14T22:49:59Z

src/thor/dijkstras.cc

@@ -41,7 +41,8 @@ namespace thor {

 // Default constructor
 Dijkstras::Dijkstras()
-    : access_mode_(kAutoAccess), mode_(TravelMode::kDrive), adjacencylist_(nullptr) {
+    : access_mode_(kAutoAccess), mode_(TravelMode::kDrive), adjacencylist_(nullptr),
+      multipath_(false) {


the main changes to dijkstras are a boolean to flag that we want to do path multiplexing (this means assigning an id to each location's initial edges in the labelset) as well as actually passing those ids to the different functions on the labelset/edgestatus

kevinkreiser · 2020-12-14T22:51:26Z

src/thor/route_action.cc

@@ -205,6 +205,37 @@ std::string thor_worker_t::expansion(Api& request) {
  return rapidjson::to_string(dom, 5);
 }

+void thor_worker_t::centroid(Api& request) {


this is the main place where new service/api related work was needed. here we do the same thing that a regular route request does but we call into the new algorithm and build a leg for each individual route that came out of it.

kevinkreiser · 2020-12-14T22:51:52Z

src/thor/worker.cc

@@ -49,6 +49,16 @@ const std::unordered_map<std::string, float> kMaxDistances = {
 constexpr float kDistanceScale = 10.f;
 constexpr double kMilePerMeter = 0.000621371;

+std::string serialize_to_pbf(Api& request) {


moved this into the anonymous namespace

kevinkreiser · 2020-12-14T22:52:05Z

src/thor/worker.cc

@@ -142,7 +142,7 @@ thor_worker_t::work(const std::list<zmq::message_t>& job,
      }
      case Options::isochrone:
        result = to_response(isochrones(request), info, request);
-        denominator = options.sources_size() * options.targets_size();


noticed this bug and fixed it

kevinkreiser · 2020-12-14T22:52:48Z

src/tyr/route_serializer_valhalla.cc

-  tyr::route_references(trip_json, api.trip().routes(0), api.options());
-  auto json = json::map({{"trip", trip_json}});
+  auto json = json::map({});
+  auto alternates = json::array({});


this is where i added support for alternates in the valhalla route response format, pretty straightforward actually!

kevinkreiser · 2020-12-14T22:54:40Z

valhalla/sif/edgelabel.h

-        cost_(0, 0), sortcost_(0), distance_(0), transition_cost_(0, 0) {
+        origin_(0), toll_(0), not_thru_(0), deadend_(0), on_complex_rest_(0), path_id_(0),
+        restriction_idx_(0), cost_(0, 0), sortcost_(0), distance_(0), transition_cost_(0, 0) {
+    assert(path_id_ <= baldr::kMaxMultiPathId);


two main things i did here were add support for specifying a path id (optionally) and make the restriction index mandatory.

kevinkreiser · 2020-12-14T22:57:01Z

valhalla/thor/edgestatus.h

+   * @param  set         Label set for this directed edge.
+   * @param  index       Index of the edge label.
+   * @param  tile        Graph tile of the directed edge.
+   * @param  path_id     Identifies which path the edge status belongs to when tracking multiple paths


basically all the functions here now have an optional path_id which gets bitwise or'd into the spare bits of the tile/level id so we can use the same edgestatus to track up to 127 paths

dnesbitt61 · 2020-12-16T13:46:27Z

May want to add a maximum distance between locations as a protection against long-running requests. One location in MD and one in CO took 90 seconds on my laptop. For now this seems to be a nice feature for short distances (and I would argue that, for a "meetup", most requests would likely be over short distances anyway).

nilsnolde · 2020-12-17T17:33:05Z

Just trying it out myself, love the idea. A little feedback from a user's POV:

more of a style thing I guess: it would be nice to find all the routes in alternates, including the first trip
~~name suggestion gravity? IMO centroid is a little too strongly rooted with geographic centroid and gravity comes closer to the concept~~ that's actually bs.. it's rather centroid than gravity..

+1 for maximum distance config, took around 3-5 seconds (depending on options) for total 200 km of trips.

Couldn't help but quickly implement it in the QGIS plugin to try it out;) (also returns the meeting point with total distance/duration) You can install from here if you wanna try: https://qgisrepo.gis-ops.com

kevinkreiser · 2020-12-30T20:15:46Z

TODO:

service limits on distance and number of locations
unit tests
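
A distance limit like the first TODO item could be checked before any expansion starts. The sketch below is only an illustration: the function names and any threshold passed in are hypothetical, it uses the haversine great-circle distance on a spherical earth, and it says nothing about how the service limits were actually implemented in this PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct LatLng {
  double lat;
  double lng;
};

// Great-circle distance in meters using the haversine formula (spherical earth).
double haversine_m(const LatLng& a, const LatLng& b) {
  constexpr double kPi = 3.14159265358979323846;
  constexpr double kEarthRadiusM = 6371000.0;
  const double to_rad = kPi / 180.0;
  const double dlat = (b.lat - a.lat) * to_rad;
  const double dlng = (b.lng - a.lng) * to_rad;
  const double h = std::sin(dlat / 2) * std::sin(dlat / 2) +
                   std::cos(a.lat * to_rad) * std::cos(b.lat * to_rad) *
                       std::sin(dlng / 2) * std::sin(dlng / 2);
  return 2.0 * kEarthRadiusM * std::asin(std::min(1.0, std::sqrt(h)));
}

// True when every pair of locations is at most max_meters apart.
bool within_max_separation(const std::vector<LatLng>& locations, double max_meters) {
  for (std::size_t i = 0; i < locations.size(); ++i)
    for (std::size_t j = i + 1; j < locations.size(); ++j)
      if (haversine_m(locations[i], locations[j]) > max_meters)
        return false;
  return true;
}
```

A request pairing a location in Maryland with one in Colorado would be rejected by a limit of a few hundred kilometers, while a typical city-scale meetup would pass.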

kevinkreiser · 2021-01-25T13:54:42Z

Since I initially PRd this a lot has changed in the repo. I've resolved all the conflicts but I did do two large quality of life things in the tests:

i removed all inline configs from all tests and made them use a test::build_config method. this helps with the next time we make a configuration change, we only have to do it in one place instead of 50
i collapsed all the main gurka route/match/locate functions into 2 generic functions to which you pass the action you want to call. this way we can add new apis to test easily rather than duplicating the same boilerplate over and over

I also added a minimal centroid unit test.

dgearhart

🚢

kevinkreiser added 9 commits December 11, 2020 13:20

optionally allow use of 7bit path_index to track multiple paths in th…

b4544ce

…e same edgeset

pair back edgelabels restriction index to a reasonable max of 254

ee64157

use 7 spare bits in edge label to mark the label as belonging to a sp…

82c7213

…ecific expansion. this allows us to use the same label set for multiple simultaneous path expansions

first sketch of the primary algorithm. not tested

c0fe5c1

initialize best connection and reverse the path, squash me with the p…

a095e5f

…revious commit

halve the cost on the last edge assuming we'll meet in the middle. sq…

2b5da75

…uash this with the 2 previous commits

add the API layer and squash some simple mistakes with bit twiddling …

0376a61

…and typos. it works!

fix summary info to track the right route

a273435

more TODO notes

ae0111a

kevinkreiser requested review from danpaz, purew, dnesbitt61 and danpat December 14, 2020 21:38

kevinkreiser added 5 commits December 14, 2020 18:03

cleanup, dont need to store opposing id in potential convergence point

229077a

Merge branch 'master' into kk_centroid

b3195cf

changelog

84e75ec

Merge remote-tracking branch 'origin/kk_centroid' into kk_centroid

70a5b1f

clang tidy linting

7689dea

kevinkreiser added 2 commits December 14, 2020 20:59

linting mistakes

ff2fe62

refactor dijkstras interface to specify expansion type. support all t…

facd3da

…ypes in centroid

dnesbitt61 previously approved these changes Dec 15, 2020

Merge branch 'master' into kk_centroid

ccfbb60

kevinkreiser dismissed dnesbitt61’s stale review via ccfbb60 December 30, 2020 20:14

kevinkreiser added 11 commits December 30, 2020 15:16

lint

70428c6

properly merged master this time

cb5c0d4

lint

4e0e8d4

Merge remote-tracking branch 'origin/master' into kk_centroid

e7aa52c

add service limits for centroid service. merge all unit test config b…

cac0b77

…uilding into single function making config changes less of a hassle

resolve conflicts

d11aaef

add centroid to actor interface and python bindings

8a5f1d7

refactor gurka API helper functions to be more generic

52aa891

Merge remote-tracking branch 'origin/master' into kk_centroid

7330a32

add minimal centroid unit test

ff7e482

lint

3ff27f8

dnesbitt61 approved these changes Jan 25, 2021

kevinkreiser merged commit d7a4e78 into master Jan 25, 2021

dgearhart approved these changes Jan 25, 2021

kevinkreiser mentioned this pull request Jan 27, 2021

Fix Recently Added Valhalla Alternates Serialization #2811

Merged

nilsnolde mentioned this pull request Jul 22, 2021

Centroid action missing in api docs #3226

Open

kevinkreiser deleted the kk_centroid branch January 5, 2022 03:40

kevinkreiser mentioned this pull request Sep 13, 2022

we should stop expanding if we hit our reach target #2728

Closed
