Separate concerns in Path class #1727

easbar · 2019-09-13T08:47:49Z

This issue is very similar to #1110, but since that one addresses so many things I am opening a new one here.

The Path class really does too many thing and it leads to too much coupling between not necessarily related things. I took a look and so far I think Path should be split into three parts:

Path as the result type of the routing algorithms. This should be a data-only class consisting of weight/distance/time/edges/nodes (and maybe a debugString). How this data/result is obtained should be entirely up to the algorithms.
The path extraction (producing a result type Path from one (or two) SPTEntrys) that is currently done within Path (and leads to the Path class having multiple subclasses as well, making things even more complicated) is really a part of the routing algorithm(s) and therefore should be moved into the algorithm classes. It might be worth to move it into some Helper class that the algorithms can use instead of using inheritance.
Any kind of path post processing (extracting EdgeIteratorStates, instructions details etc.) should be moved into a separate class as well, maybe this can be achieved using some static methods.

The text was updated successfully, but these errors were encountered:

michaz · 2019-09-14T13:36:01Z

100%

michaz · 2019-09-14T13:39:55Z

Just in case you start on this: May I add something to the wish list?

While looking into CH-Alternatives, I noticed that it can also be helpful to navigate the CH-Path itself. The real path through the CH-augmented graph. With un-unpacked shortcuts and all. Because I may want to analyze it without fully unpacking it, e.g. binary-search for a certain travelled distance by recursively going down.

I think that's just a matter of doing less, namely not hide the non-unpacked Path. Just wanted to note. Of course, it could also go on Path4CH as special methods. But I'm pretty much for "data first", think about objects and such later.

easbar · 2019-09-14T14:16:47Z

e.g. binary-search for a certain travelled distance by recursively going down.

that's a particularly bad example because we just removed distances from CH shortcuts: #1719. Or maybe you want something like unpack one shortcut but not another ? Anyway, I think I get it, you want for example the list of edge/shortcut ids before the shortcuts are unpacked ? Its good you say this, because at one point I was asking this myself (but could not really come up with a use-case for it). Will keep it this in mind.

michaz · 2019-09-14T14:21:28Z

you want for example the list of edge/shortcut ids before the shortcuts are unpacked

Yes.

we just removed distances from CH shortcuts

Oh, that's fine. Yes, it was just an example. I think not even the real one.

easbar · 2019-09-17T11:05:15Z

Refactoring Path is quite a task and after thinking about and looking at it for a while I think we definitely should try to do it in small steps, because there are many factors involved. For the beginning I found its best to remove the path extraction stuff -> #1730. Not sure if this already helps with the (not)-unpacking you had in mind, but maybe this can be also improved further afterwards.

michaz · 2020-08-12T16:23:48Z

Almost done I think. Remaining issues I'm having:

The additional properties (description, maybe also weight, distance, time) seem a bit random on this class as it is now.
The start/end-node properties obfuscate a little the (very different) reasons why they are there:
- start-node is there because otherwise the path isn't well-defined by only the list of edge-ids
- end-node is there because a single node and no edge is also a Path.
It should be immutable instead of making property setters package-private (I already made them public -- since this is a basic data type that is the input to other procedures (Instructions, PathDetails, ..) I need to be able to create instances freely.
Looks like the self-rolled push-iterator/visitor thing isn't really necessary -- don't see anything that can't be replaced by a regular Iterator and mean the same thing.

michaz · 2020-08-12T16:30:02Z

The fundamental ugliness (to me) is that, to process it as a data type, you basically have to pattern-match:

if "not found", then do something
else if single-node, then get node number and do something with it
otherwise, iterate over edges

And that isn't immediately obvious.

otbutz · 2020-08-13T07:10:22Z

Sounds like Path should be an abstract class than? e.g the list of edges doesn't make any sense for the first two use cases

michaz · 2020-08-13T07:55:14Z

Nah, list of edges + list of nodes with the invariant that they form a path is okay as representation. So if this is a Graph:

1 -- 2 -- 3 -- 4 -- 5
   e1   e2   e3   e4

this is a Path:

[], []

So is this:

[3], []

and this:

[3,2], [e2]

and this:

[2,3], [e2]

My question is more how best to consume / pattern-match this.
If the first thing I see is the prominent forAllEdges method, I think "great, let's just iterate over all edges, because every EdgeIteratorState gives me fromNode, edgeID and toNode" -- and then I forget that there are two types of Path with no edges: Empty paths and one-node paths.

Maybe just give up on this, let it just be a pure tuple of edge ids and node ids, remove the back-pointer to Graph, and let the consumer figure out what to do with it?

michaz · 2020-08-13T08:04:14Z

Sounds like Path should be an abstract class than?

Your solution would be a solution to that, of course.

I like it because it's ~~almost~~ like this, but not sure if it is the most "normal" thing to do in Java.

michaz · 2020-08-13T08:11:52Z

Let's remember the history of this class: It originally served all of these purposes:

Be the return type of the entire routing request, on the application level. (Later replaced by PathWrapper aka ResponsePath)
Be the intermediate thing the routing algorithm holds on to while routing
Be an abstraction where a path through contraction hierarchies really also is the path through the original graph it represents
Be a builder for itself

We've come a long way since then.

otbutz · 2020-08-13T08:20:33Z

If it's going to be immutable, i'd go for the abstract class approach as it's the most natural one from the OOP perspective:

AbstractPath
EmptyPath(could even be a static instance)
NodePath
Path

otbutz · 2020-08-13T08:22:18Z

A simple marker interface would also be an option if the AbstractPath boilds down to a "shared nothing"

easbar · 2020-08-13T12:39:11Z

Another issue I am having with the Path class is performance related: When the routing algorithms calculate a Path they iterate over all edges to calculate the accumulated distance and time. But often we later do this iteration again to calculate instructions, path details etc. So in my opinion calculating time and distance is an additional task on top of calculating the 'path' just like instructions and path details. It might be better to reduce Path to a list of edge keys (so the directions of the edges are already included) and calculating anything else would be done in a post-processing step (and ideally traversing the edges would only happen once). Maybe we need the list of nodes as well (not sure).

The additional properties (description, maybe also weight, distance, time) seem a bit random on this class as it is now

Yes, as far as I am concerned they can be removed and added to something that is built from a Path.

start-node is there because otherwise the path isn't well-defined by only the list of edge-ids

Maybe this problem is solved if we make Path a list of edge keys instead of edge ids?

if "not found", then do something

Hm, but checking for Path.isFound() does not seem to be too ugly to me. As a return type of the routing algorithms it seems very reasonable to me that there is a notion of a 'not found' path and one has to check for this. We could also go for Optional<Path> or use null if you ask me.

else if single-node, then get node number and do something with it

Is this really such a problem? In cases where one wants to iterate over all edges and the edge list is empty it makes sense that we iterate over an empty list? And if one wants to iterate over the nodes of a path or get the start/end node this can be done by storing the start/end nodes with Path.

tldr: I have not tried it yet, but I think Path (the result type of the routing algorithms) ideally should be (close to):

/**
 * Result type used by the routing algorithms. Additional information about the shortest path like time, distance, details and instructions can be calculated in a post-processing step
 */
public class Path {
  private IntArrayList edgeKeys;
  private int startNode;
  private int endNode;
  private double weight;
}

easbar · 2020-09-29T09:52:32Z

there are two types of Path with no edges: Empty paths and one-node paths.

But don't you have to do a minimum amount of pattern matching anyway when you check if the path is 'valid' (points are connected) or not (connection not found)? And if you do (and the path is valid) you already know that all paths with no edges are one-node paths?

I would like to stop 'automatically' calculating the distance and time when running the algorithms, because:

It is inefficient to get the edge iterator states to calculate time/distance while 'unpacking' the path and then do it all again when calculating points/nodes, path details and instructions.
It makes testing harder. When checking the correctness of different algorithms it would be sufficient to check the returned weights and the list of edge-ids (or keys, or nodes). Calculating the time and difference can be done completely independent from finding the optimal path and IMO should be done in a post-processing step (that can be tested in isolation).

Ideally, I would even like to support running algo.calcPath in 'weight-only' mode where we do not need to 'unpack' the path or backtrace the SPT parent pointers at all.

Can we agree on this:

/**
 * Result type used by the routing algorithms. Additional information about the shortest path like time, distance, details and instructions can be calculated in a post-processing step
 */
public class Path {
  private IntArrayList edgeKeys;
  private int startNode;
  private int endNode;
  private double weight;
}

? Note that the Path contains the edge keys here, and I think there should be a graph.getEdgeIteratorState(int edgeKey). Do we need the node IDs as well? It probably does not hurt? The isFound() method should be easy to implement by simply checking for infinite weight?

easbar added the refactoring label Sep 13, 2019

easbar mentioned this issue Sep 13, 2019

Alternative routes with node-based CH #1722

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Separate concerns in Path class #1727

Separate concerns in Path class #1727

easbar commented Sep 13, 2019

michaz commented Sep 14, 2019

michaz commented Sep 14, 2019 •

edited

Loading

easbar commented Sep 14, 2019

michaz commented Sep 14, 2019 •

edited

Loading

easbar commented Sep 17, 2019 •

edited

Loading

michaz commented Aug 12, 2020

michaz commented Aug 12, 2020 •

edited

Loading

otbutz commented Aug 13, 2020

michaz commented Aug 13, 2020 •

edited

Loading

michaz commented Aug 13, 2020 •

edited

Loading

michaz commented Aug 13, 2020 •

edited

Loading

otbutz commented Aug 13, 2020

otbutz commented Aug 13, 2020

easbar commented Aug 13, 2020

easbar commented Sep 29, 2020

Separate concerns in Path class #1727

Separate concerns in Path class #1727

Comments

easbar commented Sep 13, 2019

michaz commented Sep 14, 2019

michaz commented Sep 14, 2019 • edited Loading

easbar commented Sep 14, 2019

michaz commented Sep 14, 2019 • edited Loading

easbar commented Sep 17, 2019 • edited Loading

michaz commented Aug 12, 2020

michaz commented Aug 12, 2020 • edited Loading

otbutz commented Aug 13, 2020

michaz commented Aug 13, 2020 • edited Loading

michaz commented Aug 13, 2020 • edited Loading

michaz commented Aug 13, 2020 • edited Loading

otbutz commented Aug 13, 2020

otbutz commented Aug 13, 2020

easbar commented Aug 13, 2020

easbar commented Sep 29, 2020

michaz commented Sep 14, 2019 •

edited

Loading

michaz commented Sep 14, 2019 •

edited

Loading

easbar commented Sep 17, 2019 •

edited

Loading

michaz commented Aug 12, 2020 •

edited

Loading

michaz commented Aug 13, 2020 •

edited

Loading

michaz commented Aug 13, 2020 •

edited

Loading

michaz commented Aug 13, 2020 •

edited

Loading