New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[A-star] Added expansion pruning via cutoff if cutoff is provided #7073
[A-star] Added expansion pruning via cutoff if cutoff is provided #7073
Conversation
Should have label type: API since it adds an optional param to an exposed function. |
…the Unknown node label type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems reasonable to me. It's unfortunate that there isn't a literature for it. But you are right that the description in the original paper along with the added documentation pointing out that inadmissible heuristics can lead to wrong results makes it clearer what should happen.
I've suggested some additional tests of the heuristic and how it could interact with the cutoff. I hope they make sense to you. Please edit/remove/tamper as you think it should be best set up.
We don't include typing yet, so could you remove that in the function signature for cutoff and in the code when defining explored and enqueued?
We now have started adding "keyword-only" forcing by asking that new keywords be declared keyword-only. (We're working on transition to having most keywords be keyword only -- but that's a different transition.) So, could you add , *
before the cutoff parameter?
Thanks!!
I would probably split the tests into two;
Sure - One of the mypy checks failed, so I thought it was required.
I wasn't aware of that syntax tbh, but sure, let me do that.
Likewise - we use networkx quite extensively, always a pleasure to contribute back. :) |
Co-authored-by: Dan Schult <dschult@colgate.edu>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is ready.
One more typing line to remove (see below.
Co-authored-by: Dan Schult <dschult@colgate.edu>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @anders-rydbirk , this is an interesting addition! I took the liberty of pushing a very minor docstring formatting update but other than that LGTM. It'd be interesting to see what impact this has on applications of A*
Hi @rossbar Thanks, appreciate it. I can share a few preliminary results as I've cheated a bit and tried out this implementation in parallel to making this PR (mostly due to us being on python 3.9, so it might take a couple of weeks before we can actually start using this addition to networkx). BeforeAfterThe cutoff we are using is Approximately the same load API load over the same time period offset by 1 week. We still see some spikes, but it's quite a significant impact on q99 response time nonetheless. The remaining spikes we see are related to categorical data issues, where the impact is a lot more clear to us now compared to before. Any chance you know when it might be released? :) |
Not concretely, but given the recent past I would estimate the next minor release would be sometime in early 2024 |
…tworkx#7073) * Added expansion pruning via cutoff if cutoff is provided Co-authored-by: Dan Schult <dschult@colgate.edu> Co-authored-by: Ross Barnowski <rossbar@caltech.edu>
Added a cutoff to limit the exploration of a large, potentially disconnected graph. The use-case is highlighted in #7026.
Reasoning for implementing this in A-star:
It is not directly mentioned in papers that a cutoff can be added. However, since it's not an unknown concept (as also implemented in Dijkstra), I think the original paper on A-star highlights it the best. Going by the definition of the evaluation function f:
where$g(n)$ is the actual cost from source to $n$ and $h(n)$ is the actual cost of an optimal path from $n$ to the target.
It is then given for A* to be admissible that our heuristic$h'(n)$ must be $h'(n) <= h(n)$ . I.e. when $g(n) + h'(n) > cutoff$ , we are guaranteed that no path exists from $n$ to the target that satisfies the cutoff. Meaning, we could essentially add the cutoff check before enqueueing a node (and simply $n$ that fails the cutoff check since $n$ would have the lowest $f(n)$ in the queue.
continue
if it fails this check), or terminate the search when the algorithm explores a nodeI've opted for the former since it reduces the memory footprint by applying the cutoff early.