
Shortest path weight calculated by LM deviates from the one calculated by Dijkstra. #1557

Closed
easbar opened this issue Feb 28, 2019 · 17 comments



easbar commented Feb 28, 2019

Following the discussion in the forum, I tried running random queries with ALT vs. Dijkstra and found cases where there is a difference, see here: 1f34cd6

However, there do not seem to be any deviations between AStarBidirection and Dijkstra, so I am not sure whether the problems are due to the stopping criterion, which is the same for ALT and AStarBidirection.


easbar commented Feb 28, 2019

I was wondering if the buildRandomGraph method has a problem, but modifying the distances should be ok, since they are only ever increased:

distance += random.nextDouble() * distance * 0.01;
Also the method uses Helper.DIST_PLANE to calculate the distances (should be ok because this is what the WeightApproximators use as well).

@karussell

> so not sure if the problems are due to the stopping criterion which is the same for ALT and AStarBidirection

The suggestion in the forum was that we have to use a different stopping criterion for ALT, so your observation is an indication that there is a problem with it for ALT.

> but modifying the distances should be ok, since they are only increased?

Yes, increasing the weight shouldn't be a problem for A*

> Also the method uses Helper.DIST_PLANE to calculate the distances (should be ok because this is what the WeightApproximators use as well).

Yes, should also be okayish for "smaller" distances.


easbar commented Feb 28, 2019

> The suggestion in the forum was that we have to use a different stopping criterion for ALT, so your observation is an indication there is a problem with it for ALT

Are the active landmarks fixed per query, or can they change during the query (onHeuristicChange)? This could have implications for the stopping criterion. Then again, I used numLandmarks=numActiveLandmarks=4 in the tests, so there is nothing to change.

The slides mentioned in the forum do not say that anything particular needs to be done for ALT, even though the stopping criterion is explained in the ALT chapter (though more generally, for bidirectional AStar).

@karussell karussell changed the title Shortest path weight calculated by ALT deviates from the one calculated by Dijkstra. Shortest path weight calculated by LM deviates from the one calculated by Dijkstra. Feb 28, 2019
@karussell

(renamed title from ALT to LM as the similarity to alternative routes is a bit ugly ;) Please revert if bad)

> Are the active landmarks fixed per query or can they change during the query (onHeuristicChange)?

They are fixed.

> The slides mentioned in the forum do not say there needs to be done anything particular for ALT

Ah, yes. Sorry, I remembered it incorrectly.

There is a problem with virtual nodes, see #1532

@karussell

Another problem could be the precision or rounding issues that I overlooked when implementing this.


easbar commented Feb 28, 2019

Ok, I think we can rule out #1532, because the QueryGraph is not involved in the test.

@karussell

Yes, at least for the "shortest" test you did, not necessarily for the problem discussed in the forum.


aoles commented Mar 6, 2019

I think the issue is actually related to the limited accuracy of the stored landmark distances: the weights computed from them differ from the values one would get if the actual distances were used.

TL;DR

In order to guarantee that the shortest path is found, it is necessary to account for the limited precision of the stored landmark distances by modifying the A* search stopping condition: subtract the landmark scaling factor from both the forward and reverse heap weights, such that the condition reads

currFrom.weight + currTo.weight - 2 * factor >= bestPath.getWeight()

Full explanation

The bidirectional A* search stop condition is given by

    d(s,v) + pf(v) + d(w,t) + pr(w) >= μ

where d(s,v) and d(w,t) are the distances from the source and to the target for the vertices on top of the heaps (forward and reverse), pf and pr are the corresponding potentials, and μ is the length of the best s-t path seen so far. The potentials are defined as

    pf(v) = (πf(v) - πr(v)) / 2
    pr(v) = (πr(v) - πf(v)) / 2

where πf(v) and πr(v) are the estimates of d(v, t) and d(s, v), respectively. These are given by the landmarks:

    πf(v) = max over landmarks L of max( d(v,L) - d(t,L), d(L,t) - d(L,v) )

and similarly for πr. However, the real landmark distances are stored with limited precision. In particular, they are scaled by a factor F, and their fractional part ε gets truncated, such that the value actually used in the calculations is

    d'(v,L) = F * floor( d(v,L) / F ) = d(v,L) - ε,   with 0 <= ε < F

Therefore, the computed functions are not the original potentials anymore, but rather their approximated counterparts πf' and πr', built from the truncated distances d' instead of the true distances d.

As evident from the equations above, the computed value πf' differs from the original potential πf by some δ:

    πf'(v) = πf(v) + δ,   with -F < δ < F

The same reasoning applies to πr, from which it follows that

    pf'(v) <= pf(v) + F

and similarly for pr'.

Now, the approximated potentials p' are part of the heap weights, and as such they influence the algorithm's stopping condition. If the approximated value is lower than the true one, there is "only" a performance issue, because the search might take longer than necessary, i.e. continue even though the shortest path has already been found. However, when it is the other way around, such that p' is higher than p, the stopping condition might get fulfilled too early, causing the search to finish prematurely, before the optimal path has been found. In the worst case, both pf' and pr' might end up overestimated by a value of up to F. So, in order to guarantee that the shortest path is found, it is necessary to account for the lost accuracy by offsetting each of them by -F. This leads to the modified stopping condition

    d(s,v) + pf'(v) + d(w,t) + pr'(w) - 2F >= μ
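The precision loss can be sketched with a tiny self-contained example (the factor and the distances are made up for illustration, this is not GraphHopper code):

```java
// Landmark distances are stored as multiples of a scaling factor F, so an
// estimate computed from two truncated distances can exceed the true
// estimate, but never by F or more.
public class TruncationSketch {
    static final double F = 8.0; // hypothetical scaling factor

    // value actually used in calculations: truncated to a multiple of F
    static double stored(double d) {
        return F * Math.floor(d / F);
    }

    public static void main(String[] args) {
        double dVL = 100.0; // true distance from node v to landmark L
        double dTL = 47.0;  // true distance from target t to landmark L

        double piTrue = dVL - dTL;                   // true estimate: 53.0
        double piStored = stored(dVL) - stored(dTL); // 96.0 - 40.0 = 56.0

        System.out.println(piStored > piTrue);     // true: overestimation
        System.out.println(piStored - piTrue < F); // true: error stays below F
    }
}
```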


easbar commented Mar 8, 2019

Thanks a lot for your explanation! Do you have a suggestion for how to obtain the factor in AStarBidirection#finished? I tried briefly with a fixed value, but I had to use a rather large number to make the tests I mentioned above pass. Did you manage to fix the problems mentioned in the forum with this approach?

aoles commented Mar 8, 2019

Thank you for getting back to me!

In order to obtain the scaling factor, I've added public double LMApproximator.getFactor(), which I then call in AStarBidirection.initTo to set the private field AStarBidirection.approximatorOffset like so:

if (weightApprox.getApproximation() instanceof LMApproximator)
  approximatorOffset = 2 * ((LMApproximator) weightApprox.getApproximation()).getFactor();

The corresponding stop condition reads

currFrom.weight + currTo.weight - approximatorOffset >= bestPath.getWeight();

I can confirm that this hack fixes all the issues I described in the forum.


aoles commented May 28, 2019

The fix described above only works in the old approach, where the from/to weights were stored independently. The new approach introduced in #1376, which stores only one of the two weights, say from, and only the difference to the other one, to, seems to have a serious flaw when the difference overflows the negative DELTA_MIN. This happens when to < from and to - from < DELTA_MIN. The reconstructed weight to' = from + DELTA_MIN will then be higher than the original to value, violating the landmark condition that the weights must never overestimate the distance. On the other hand, an overflow of the positive DELTA_MAX is not an issue, because then the distance is just underestimated.

In fact, when I reserve more space for the delta storage by setting FROM_WEIGHT_BITS = 16, all of the previously failing single-graph tests in CompareAlgoVsDijkstraTest pass. However, the randomGraph() test still fails after a few iterations.

Unlike in the former approach of storing absolute weights, where the precision loss could be compensated for by an offset term in the routing algorithm's stop condition, I don't see an easy way to ensure the correctness of the algorithm when a possibly capped difference is stored.
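The overflow can be illustrated with a small, self-contained sketch (the range limits and weights are made up, this is not the actual GraphHopper storage code):

```java
// Store only `from` plus a clamped difference `to - from`; when the real
// difference is below DELTA_MIN, the reconstructed `to` weight overestimates.
public class DeltaOverflowSketch {
    static final int DELTA_MIN = -(1 << 13);    // hypothetical range limits
    static final int DELTA_MAX = (1 << 13) - 1;

    static int clampDelta(int from, int to) {
        return Math.max(DELTA_MIN, Math.min(DELTA_MAX, to - from));
    }

    public static void main(String[] args) {
        int from = 20000;
        int to = 1000; // to < from and to - from = -19000 < DELTA_MIN

        int reconstructed = from + clampDelta(from, to); // 20000 - 8192
        System.out.println(reconstructed);               // 11808
        System.out.println(reconstructed > to);          // true: estimate too high

        // an overflow of DELTA_MAX only underestimates, which is harmless:
        int reconstructed2 = 1000 + clampDelta(1000, 20000); // 1000 + 8191
        System.out.println(reconstructed2 < 20000);          // true
    }
}
```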

@sfendrich

So the problem is that, if DELTA_MIN is underflowed, we store a value that is too large and the A* heuristic becomes incorrect, right?
One possible solution would be to remember (from the landmark computation) the worst value (WORST) affected by the condition to < from and to - from < DELTA_MIN. Whenever DELTA_MIN appears in the heuristic computation, it could be replaced by WORST, so the true distance is underestimated and the heuristic should be correct.
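A sketch of this idea, under the assumption that the landmark computation can track the most negative real difference (all names and values here are hypothetical):

```java
// When the clamped delta hits the DELTA_MIN sentinel, substitute the worst
// (most negative) real difference seen while computing the landmarks, so the
// reconstructed weight underestimates the true `to` value instead of
// overestimating it.
public class WorstDeltaSketch {
    static final int DELTA_MIN = -(1 << 13);

    public static void main(String[] args) {
        int worst = -19000; // hypothetical worst overflowed difference (WORST)

        int from = 20000;
        int to = 1500; // to - from = -18500 < DELTA_MIN, so the delta is capped

        int storedDelta = Math.max(DELTA_MIN, to - from); // capped at -8192
        int delta = (storedDelta == DELTA_MIN) ? worst : storedDelta;

        int reconstructed = from + delta;        // 20000 - 19000 = 1000
        System.out.println(reconstructed <= to); // true: safe underestimate
    }
}
```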

In addition, the discussion of #1376 mentions that a bit combination of 16/16 worked better than 18/14. If so, I wonder what the benefit of the 16/16 bit split is over the former solution of storing from and to in two shorts. Can we be certain that the performance gain of the newer solution is not influenced by A* stopping too early due to the precision loss?


karussell commented May 29, 2019

Thanks @aoles for finding this issue. This is indeed tricky to fix. Maybe it can be done with a special value, like the one we already have for "infinity".

> the discussion of #1376 mentions that a bit-combination of 16/16 worked better than 18/14
>
> I wonder what is the benefit of the 16/16 bit over the former solution of storing from and to in two shorts

@sfendrich see this comment from @baumboi:

> With PRECISION = 16 we have the same accuracy and factor we had in the old system with two shorts, but fewer weightings maxed out (which means more precision on long distances).
> With PRECISION = 18 we'd get better precision on short and normal distances, but have at least as many weights maxed out as before (probably even more, because of the quite small size of the delta values).

So, if the delta solution were better because of the problem @aoles mentioned, then we would get even better performance for 18/14, IMO.


michaz commented Jan 15, 2020

Hey @aoles, thank you so much (again) for your analysis. I'm cautiously optimistic that we have this under control as of #1749.

(It turned out there is another reason why we need the "factor" in the stopping criterion: even if the approximator always rounded down, it would still violate the triangle inequality, with a similar result (we terminate early) and the same solution.)


aoles commented Jan 16, 2020

You are most welcome @michaz , thanks a lot for the update!

Awesome work! Could you maybe briefly summarize what the other issue is and why there is still need for an offset in the stopping condition even with an always underestimating approximator?

Cheers,
Andrzej


michaz commented Jan 19, 2020

> Could you maybe briefly summarize what the other issue is and why there is still need for an offset in the stopping condition even with an always underestimating approximator?

It's approximately this: for the stopping criterion to work, it is sufficient that the approximator only underestimates, but only if, additionally, the top heap weight increases monotonically. This can fail to happen if the potential function between u and v decreases by more than the edge weight of (u,v). And this can happen at least when (u,v) is shorter than the "factor", no matter how we round.

In that case, the A*-augmented edge weight (the real edge weight minus the difference of the potential function) is negative, which means that the stopping criterion, which just looks at the top heap weight, doesn't work anymore: the node we still need to scan isn't even on the heap yet. The solution is the same: require the top heap weight to be higher by the maximum amount it can decrease, i.e. by the "factor".
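A numeric sketch of that effect (illustrative values, with potentials rounded to multiples of a hypothetical factor F):

```java
// Even when the true potentials satisfy the triangle inequality
// p(u) - p(v) <= w(u,v), rounding them to multiples of F can make the
// A*-augmented edge weight w(u,v) - p'(u) + p'(v) negative for an edge
// shorter than F, so the top-of-heap weight is no longer monotone.
public class AugmentedWeightSketch {
    static final double F = 8.0; // hypothetical factor

    static double rounded(double p) {
        return F * Math.floor(p / F);
    }

    public static void main(String[] args) {
        double w = 2.0;              // edge (u, v), shorter than F
        double pU = 56.0, pV = 55.0; // true potentials: pU - pV = 1 <= w

        double augmentedTrue = w - pU + pV;                     // 1.0, fine
        double augmentedRounded = w - rounded(pU) + rounded(pV); // 2 - 56 + 48

        System.out.println(augmentedTrue >= 0);   // true
        System.out.println(augmentedRounded < 0); // true: negative weight
    }
}
```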


michaz commented Jan 20, 2020

Fixed by #1749.
