[WIP] Search interface rework #55
TODO
Hi @ferranpujolcamins,
Pros:
Cons:
To get some little things out of the way: I agree with the API break with regard to findAllBFS and findAllDFS. I really don't think that's a big deal at all.
So, is the cool abstraction we're getting worth the performance degradation and the added complexity being introduced into the code base? My guess is that the vast majority of SwiftGraph users would benefit more from a built-in implementation of A* than from being able to customize BFS or DFS.
I don't think GraphTraverser as it stands would be a suitable protocol for Dijkstra, since Dijkstra returns the shortest distance to every vertex, not a single final vertex. So I think that would stay separate. I think we can implement A* on top of GraphTraverser, but of course we can also pretty easily implement it on its own.
Personally, I think the cons outweigh the pros. Maybe I am missing some, though, and I'd like to discuss it further. I also, frankly, feel bad saying that, because I see you put a lot of work into this and I see how well laid out and structured the code is, even if it does add complexity. If we were getting performance improvements for extra code complexity, I'd say let's do it. If we were getting simplified code for a slight performance decrease, I'd say let's do it too. But I don't think we're actually getting simplified code here: we're getting several new structs and a new protocol for what were previously ~10-15 line single methods.
I would also add two further points:
In conclusion, I think the four things we need to ask ourselves with regards to this change are how it impacts: performance, user customization, maintenance, and code complexity. I would say they are as follows:
I think we should talk about it more, though. I am open to making the change; I just want to make sure we are doing it for the right reasons, and not just because it's cooler from an algorithmic perspective or because you've already put a lot of time into it... Best,
Hi @davecom, thanks for your answer. To be honest, when I started implementing all this, my hope was that generics would give me code reuse without sacrificing performance. It turned out that generics in Swift are not like those in Rust or C++, so in the end I got very bad performance. But I wanted to present the results anyway to hear your thoughts. Thanks to your explanation I now better understand your view of SwiftGraph, so I think we can find common ground. I'm opening yet another PR soon, so you can compare.
Hi @ferranpujolcamins, I really appreciate all of your work on the project. You've given it a good shot in the arm. Best,
Thanks!
Yes, what do you think? Or we could follow your advice above:
In that case can we still reuse most of your code? I think we can?
Let me finish this alternative, simpler PR so we can compare.
Maybe the way to get top performance, API flexibility and code maintainability is meta-programming: https://github.com/krzysztofzablocki/Sourcery/blob/master/guides/Writing%20templates.md
I've tried to get good performance with meta-programming, and again it is difficult. To get maximum performance, each variant of the algorithm needs to be very specific, so it is difficult to work with one generic algorithm that fits all methods. The complexity we would be adding to get good performance is too high, so I vote to merge #57. If we have good test coverage, with tests for all edge cases of all methods, we will be fine.
Introduction
This PR is the follow-up to PR #43.
The goals of this PR are:
- `DFS` and `BFS`.
- `DFS` and `BFS` so an arbitrary computation can be performed over a graph (a closure is fed with the graph vertices in a DFS or BFS order).
Explanation of the changes
`GraphTraverser` is a protocol representing an algorithm to traverse a graph. Types conforming to it implement a graph traversal algorithm in their `from(_ initalVertexIndex: Int, goalTest: (Int) -> Bool, reducer: G.Reducer) -> Int?` method.
An extension to `GraphTraverser` provides default implementations for several convenience methods that wrap `from(_ initalVertexIndex: Int, goalTest: (Int) -> Bool, reducer: G.Reducer) -> Int?`.
`DFS` and `BFS` are structs conforming to `GraphTraverser`, so both algorithms share the API provided by the extension. Furthermore, for both `BFS` and `DFS` there's an extension to Graph with convenience methods to perform a search by calling a method on a graph instance.
`DFSVisitOrder` and `BFSVisitOrder` are variants of `DFS` and `BFS` that can guarantee a certain order of traversal for the neighbors of each visited vertex. I've kept them as separate algorithms because the visit order has a performance penalty, even when the ordering closure is a no-op (`{ $0 }`).
`DFS` and `BFS` have the `withVisitOrder` method to easily construct a `DFSVisitOrder` or `BFSVisitOrder` variant.
Examples
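As a rough illustration of the design described above, here is a minimal, self-contained sketch. It is not the PR's code: `SimpleGraph`, `Edge`, `Traverser`, `BreadthFirst`, `DepthFirst` and `visitOrder(from:)` are hypothetical stand-ins, and the reducer semantics (it receives each traversed edge and returns whether to keep expanding from the edge's target) are an assumption made for the example.

```swift
// Hypothetical, simplified stand-ins; none of these are SwiftGraph's real types.
struct SimpleGraph {
    // adjacency[i] lists the neighbor indices of vertex i.
    var adjacency: [[Int]]
}

// Assumed shape of the edge value handed to the reducer.
typealias Edge = (from: Int, to: Int)

protocol Traverser {
    var graph: SimpleGraph { get }
    // The single requirement: traverse until goalTest passes and return that vertex.
    // Assumption: the reducer sees every traversed edge, and returning false
    // means "do not expand the vertex this edge leads to".
    func from(_ initialVertexIndex: Int,
              goalTest: (Int) -> Bool,
              reducer: (Edge) -> Bool) -> Int?
}

extension Traverser {
    // A convenience method every conforming type gets for free.
    func visitOrder(from start: Int) -> [Int] {
        var order = [start]
        _ = from(start, goalTest: { _ in false }, reducer: { edge in
            order.append(edge.to)
            return true
        })
        return order
    }
}

struct BreadthFirst: Traverser {
    let graph: SimpleGraph

    func from(_ initialVertexIndex: Int,
              goalTest: (Int) -> Bool,
              reducer: (Edge) -> Bool) -> Int? {
        if goalTest(initialVertexIndex) { return initialVertexIndex }
        var visited = [Bool](repeating: false, count: graph.adjacency.count)
        visited[initialVertexIndex] = true
        var queue = [initialVertexIndex]
        var head = 0 // index of the next vertex to dequeue
        while head < queue.count {
            let v = queue[head]
            head += 1
            for n in graph.adjacency[v] where !visited[n] {
                visited[n] = true
                let shouldExpand = reducer((from: v, to: n))
                if goalTest(n) { return n }
                if shouldExpand { queue.append(n) }
            }
        }
        return nil
    }
}

struct DepthFirst: Traverser {
    let graph: SimpleGraph

    func from(_ initialVertexIndex: Int,
              goalTest: (Int) -> Bool,
              reducer: (Edge) -> Bool) -> Int? {
        if goalTest(initialVertexIndex) { return initialVertexIndex }
        var visited = [Bool](repeating: false, count: graph.adjacency.count)
        visited[initialVertexIndex] = true
        // Push edges in reverse so the first neighbor is popped first.
        var stack: [Edge] = graph.adjacency[initialVertexIndex].reversed()
            .map { (from: initialVertexIndex, to: $0) }
        while let edge = stack.popLast() {
            if visited[edge.to] { continue }
            visited[edge.to] = true
            let shouldExpand = reducer(edge)
            if goalTest(edge.to) { return edge.to }
            guard shouldExpand else { continue }
            for n in graph.adjacency[edge.to].reversed() where !visited[n] {
                stack.append((from: edge.to, to: n))
            }
        }
        return nil
    }
}
```

With a graph such as `SimpleGraph(adjacency: [[1, 2], [3], [3], []])`, calling `BreadthFirst(graph: g).visitOrder(from: 0)` and `DepthFirst(graph: g).visitOrder(from: 0)` runs the same convenience method over the two traversal orders, which is the kind of reuse the protocol extension is meant to give.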
API Changes
This PR introduces one and only one breaking API change:
- `findAll` is gone; you must now choose between `findAllBFS` and `findAllDFS`.
All other changes are non-breaking changes to the API.
Performance
Swift is not Rust, so more often than not abstraction comes with a performance penalty. In this PR I've sacrificed a lot of performance in favor of less code duplication.
Below there's a table with the results of some of the performance tests compared to master, stating execution time difference (+ is bad, - is good):
As you can see, this PR introduces a major performance regression compared to our current master, although with some surprisingly big improvements in some cases. So, if performance were the only consideration, we should not merge this PR.
When comparing to a commit on master before the latest merged performance improvements, we see that after merging this PR we could ship a new version with good performance improvements in most cases and a performance drop of up to 30% in only a few cases.
Future Performance
This PR, as is, is bad for performance. The main reason for the drop is that all the methods implemented in the extension of `GraphTraverser` are now written as special cases of an abstract algorithm that operates with closures. Before, these algorithms had no function calls at all.
The good news is that these inefficient methods can be overridden in `DFS` and `BFS`. For example, `bfs(fromIndex: Int, toIndex: Int) -> [E]` is the method whose performance has dropped the most. We could write a specialized version of the search for it, with no closures, just like the one in master now. We must decide what balance between code reuse and performance we want. I've left this for a future PR, though.
Someone implementing a totally new `GraphTraverser` can follow a progressive path:
- `GraphTraverser` and get a bunch of non-optimized convenience methods for free.
- `GraphTraverser` implementation, benchmark it and then decide if some methods need to be specialized for better performance.
Also, if Swift ever gets more efficient abstractions, we might be able to remove some of the specialized functions.
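The override path described above can be sketched as follows. This is only an illustration under assumed names (`Searcher`, `closureBFS`, `GenericBFS`, `TunedBFS` and `reaches(_:from:)` are all hypothetical, not SwiftGraph API): a convenience method is declared as a protocol requirement, given a closure-based default in an extension, and then replaced in one conforming type by a hand-written loop with no closure.

```swift
// Hypothetical names throughout; a sketch of the pattern, not SwiftGraph's API.

// Shared closure-driven BFS: returns the first reachable vertex passing goalTest.
func closureBFS(_ adjacency: [[Int]], from start: Int, goalTest: (Int) -> Bool) -> Int? {
    var visited = [Bool](repeating: false, count: adjacency.count)
    visited[start] = true
    var queue = [start]
    var head = 0
    while head < queue.count {
        let v = queue[head]
        head += 1
        if goalTest(v) { return v }
        for n in adjacency[v] where !visited[n] {
            visited[n] = true
            queue.append(n)
        }
    }
    return nil
}

protocol Searcher {
    var adjacency: [[Int]] { get }
    // Core requirement: the generic, closure-driven search.
    func search(from start: Int, goalTest: (Int) -> Bool) -> Int?
    // Declared as a requirement (not only in the extension) so that a
    // conforming type's own version is also used from generic code.
    func reaches(_ target: Int, from start: Int) -> Bool
}

extension Searcher {
    // Default: a special case of the closure-based search.
    func reaches(_ target: Int, from start: Int) -> Bool {
        return search(from: start, goalTest: { $0 == target }) != nil
    }
}

// Step 1: conform and take the non-optimized default for free.
struct GenericBFS: Searcher {
    let adjacency: [[Int]]
    func search(from start: Int, goalTest: (Int) -> Bool) -> Int? {
        return closureBFS(adjacency, from: start, goalTest: goalTest)
    }
}

// Step 2: after benchmarking, specialize only the methods that need it.
struct TunedBFS: Searcher {
    let adjacency: [[Int]]
    func search(from start: Int, goalTest: (Int) -> Bool) -> Int? {
        return closureBFS(adjacency, from: start, goalTest: goalTest)
    }
    // Specialized version: compares indices inline instead of calling a closure.
    func reaches(_ target: Int, from start: Int) -> Bool {
        var visited = [Bool](repeating: false, count: adjacency.count)
        visited[start] = true
        var queue = [start]
        var head = 0
        while head < queue.count {
            let v = queue[head]
            head += 1
            if v == target { return true }
            for n in adjacency[v] where !visited[n] {
                visited[n] = true
                queue.append(n)
            }
        }
        return false
    }
}
```

Both types answer the same questions, so callers do not change when a method is specialized. Whether the hand-written loop is actually faster in a given case is something only a benchmark can tell.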
Notes
- I tried merging the `goalTest` and `reducer` closures of the `from(_ initalVertexIndex: Int, goalTest: (Int) -> Bool, reducer: G.Reducer) -> Int?` method into a single closure returning a tuple of bools, but this resulted in a slight loss of performance. That's OK, since I think the current solution with two closures is more ergonomic.
- Could Dijkstra's algorithm be implemented as a `GraphTraverser`? I think we could. But shall we? The main point of `GraphTraverser` is to get the bunch of convenience methods it defines in its extension for free. Are those methods a sound interface to Dijkstra's algorithm?
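To make the interface question concrete, here is a minimal sketch of the result Dijkstra's algorithm naturally produces: one distance per vertex, not the single `Int?` vertex that the `from(...)` method returns. The `WeightedEdges` type and the `shortestDistances(in:from:)` function are hypothetical names for this example, and the simple O(V²) selection loop is chosen for brevity; it is not how SwiftGraph implements Dijkstra.

```swift
// Hypothetical weighted adjacency list: edges[v] holds (neighbor, weight) pairs.
typealias WeightedEdges = [[(to: Int, weight: Int)]]

// Textbook Dijkstra without a priority queue, assuming non-negative weights.
// Returns one entry per vertex: the shortest distance from `source`,
// or nil when the vertex is unreachable.
func shortestDistances(in edges: WeightedEdges, from source: Int) -> [Int?] {
    var distances = [Int?](repeating: nil, count: edges.count)
    var done = [Bool](repeating: false, count: edges.count)
    distances[source] = 0
    while true {
        // Pick the closest vertex that has been reached but not yet finalized.
        var current: (vertex: Int, distance: Int)? = nil
        for v in edges.indices where !done[v] {
            guard let d = distances[v] else { continue }
            if let best = current, best.distance <= d { continue }
            current = (vertex: v, distance: d)
        }
        guard let (u, du) = current else { break }
        done[u] = true
        // Relax every edge leaving u.
        for edge in edges[u] {
            let candidate = du + edge.weight
            if let known = distances[edge.to], known <= candidate { continue }
            distances[edge.to] = candidate
        }
    }
    return distances
}
```

A caller gets the whole distance table back in one call, which is a different shape from "find a vertex that passes a goal test", so the convenience methods built around a single returned vertex would not map onto it directly.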