Skip to content

RFC102: TSP and Generating paths

Stephen Woodbridge edited this page Nov 8, 2013 · 1 revision

Overview

I have a use case where I need to select N locations from a map, Optimize the order of the points based on TSP and then generate the resulting route. This can be done today with a lot of work and custom coding but it is extremely slow and inefficient. A similar use case is that of the VRP problem, where we need to generate a large distance matrix, solve the VRP problem then generate the resulting routes.

Why is this important?

Both of these are common use cases that users and clients run into. We have the tools to cobble together a solution but it is not easy. And our best solution is slow because of inefficiencies of loading edges multiple times and building graphs multiple times and iterating over the solution.

How does it work?

Today we have to perform the following steps:

  1. take a list points and find the nearest vertex in an edge table
  2. build a distance matrix using the list of vertex ids and an edge table
  3. solve TSP or VRP problem using the distance matrix
  4. compute the path between each adjacent vertex ids in the TSP order list or the list of VRP solutions

What are some of the problems?

In each of the steps above:

  1. I have written a simple plpgsql procedure pgr_pointsToVids() that handles this. This is probably OK.
  2. This requires multiple requests to something like pgr_kdijkstra() for each vertex id in the list. So we have to read the edges into a C array, pass it to boost to build a graph, solve the graph, return the results. And this has to be done for each row. I have created pgr_vidsToDMatrix() that only load the edges into a C array once and then calls boost N times. Ideally we would like to build the graph once and then reuse that N times.
  3. This is really outside the scope of the problem, these consume the data and present a solution that we then must act upon.
  4. Now that we have an ordered set of vertex ids, we now need to reload the edges and compute the path again for each pair of nodes in the solution. If we could save the data in step 2. and then reuse it that might aide in performance.
  5. The more we integrate multiple functions into application level code for performance, we loss modularity and simplicity for testing and validation.

Potential solutions

implement fast routing

Like RFC101: Adding Support for HwyHierarchies to pgRouting

I think this is the long term path that we should shoot for, but it will not solve all problems and having a parallel implementation based on the existing tools will be needed by some people.

Store intermediate results in a temp table in step 2. for reuse in step 4.

Here is the problem work flow:

  1. points -> vertex ids
  2. using kdijkstra for each vertex id, create a record in a temp table like:
    • source, target, cost, array[eid, eid, ...]
    • compute dmatrix from this temp table
    • implement a manytomany_kdijkstra_path() function to compute this
  3. solve TSP or VRP using the dmatrix
  4. extract the routes from the temp table based on [source, target] needed in the solution
  5. return the result
  6. drop the temp table

Push everything into C code and save results in memory

This is conceptually similar to the above by instead of using a temp table, everything is done in memory. This will allow for greater optimization at the cost of complexity. And the code is not generally reusable without a lot of additional design and thought.

I think I have convinced myself the this is not the way to go.

Clone this wiki locally