Graph theoretic algorithms for internal use (e.g. Tarjan's algorithm) #16174
Comments
We have Can the algorithm you describe for ODEs apply to solving regular systems of equations as well? |
It could be used for regular systems of equations but there we would have undirected graphs. So the idea would be that given
we could partition that into two subproblems for (a,b) and (c,d). I'm not sure how useful that would be or whether it already happens. In the case of ODEs we can do the same, but we can also go a step further: for ODEs we have a directed graph, so given:
We can identify that (g,h) is a strongly connected component that must be solved as a unit. The variable f is not disconnected from the graph but is not in a strongly connected component with any other variables so that ODE can be solved independently. Once we've solved for f we can eliminate it from the equations for g,h. Then they can be solved as a separate system. Once we have the solution for h the equation for z becomes a single ODE that can be solved independently. For the ODE module this could greatly expand the number of systems that are matched since we only need to match the strongly connected components of the system. Making use of it needs good support for non-homogeneous / non-autonomous systems though. |
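The solving order described above can be sketched without any ODE machinery. The equations of the example are not shown here, so the dependency table below is a hypothetical stand-in (the names f, g, h, z follow the comment; which variable appears in which right-hand side is an assumption), and `scc_order` is an illustrative helper, not an existing SymPy function.

```python
def scc_order(deps):
    """Strongly connected components of a dependency graph.

    deps maps each variable to the variables its equation depends on.
    Components come out dependencies-first (Tarjan's reverse topological
    order), which is the order in which the sub-systems can be solved.
    """
    index, lowlink, stack, on_stack, out = {}, {}, [], set(), []

    def visit(v):
        index[v] = lowlink[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in deps[v]:
            if w not in index:
                visit(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:
                lowlink[v] = min(lowlink[v], index[w])
        if lowlink[v] == index[v]:
            # v is the root of a component: pop everything above it
            comp = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.append(w)
                if w == v:
                    break
            out.append(sorted(comp))

    for v in deps:
        if v not in index:
            visit(v)
    return out


# Hypothetical system (an assumption, not the system from the comment):
# f' depends only on f; g' and h' depend on each other and on f;
# z' depends on h and z.
deps = {
    'f': ['f'],
    'g': ['f', 'g', 'h'],
    'h': ['f', 'g'],
    'z': ['h', 'z'],
}
order = scc_order(deps)
print(order)  # [['f'], ['g', 'h'], ['z']]
```

Each inner list is a sub-system that has to be matched as a unit; everything it depends on appears earlier in the list and can be substituted in first.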
Ah. So this actually combines nicely with topological_sort, since the strongly connected components form a DAG. The topological_sort tells you what order to solve the sub-systems in, although apparently Tarjan's algorithm already returns the components in (reverse) topological order. I suspect many places that use topological_sort could be using this instead, as a more generalized way to handle graphs with cycles. |
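To make the "components form a DAG" point concrete, here is a rough sketch that collapses each component into a single node and topologically sorts the result. It deliberately uses a naive mutual-reachability test instead of Tarjan's algorithm to stay short, and it relies on the standard library's `graphlib` (Python 3.9+). The helper names `reachable` and `condensation` are made up for this illustration.

```python
from graphlib import TopologicalSorter


def reachable(graph, start):
    """Set of vertices reachable from start (including start itself)."""
    seen, todo = {start}, [start]
    while todo:
        for w in graph[todo.pop()]:
            if w not in seen:
                seen.add(w)
                todo.append(w)
    return seen


def condensation(graph):
    """Collapse each strongly connected component into one DAG node."""
    reach = {v: reachable(graph, v) for v in graph}
    comp_of = {}
    for v in graph:
        # v and w share a component when each can reach the other
        comp_of[v] = frozenset(w for w in reach[v] if v in reach[w])
    dag = {c: set() for c in comp_of.values()}
    for v, targets in graph.items():
        for w in targets:
            if comp_of[v] != comp_of[w]:
                dag[comp_of[v]].add(comp_of[w])
    return dag


graph = {'A': ['B', 'C', 'D'], 'B': ['C'], 'C': ['B', 'D'], 'D': []}
dag = condensation(graph)

# TopologicalSorter treats the mapped values as predecessors, so passing
# "component -> components it points to" yields the pointed-to components
# first, i.e. a reverse topological order of the condensed graph.
order = [sorted(c) for c in TopologicalSorter(dag).static_order()]
print(order)  # [['D'], ['B', 'C'], ['A']]
```

The B/C cycle would make a plain topological sort of `graph` fail, but the condensed graph sorts without trouble.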
For example you can topologically sort a substitution, to make it happen "in order". #6257 If there's a cycle, like |
I would describe We could certainly partition the substitutions into strongly connected components. Would those components be unambiguous though? What about something where one replacement expression appears in another e.g. |
I think something similar can be used when evaluating summations with symbols in the limits, see #15767 where one approach is to reorder the summation order. (Which will not solve the issue at hand, but still can be useful if deciding to do an explicit evaluation as it may split it into several independent summations/evaluations.) Edit: and therefore also integrals with symbols in the limits, although I do not know if there is a similar issue there. |
I hadn't thought about that. Maybe you would also need an edge if one term's "old" also contains another's. I'm mostly thinking out loud with this idea. I don't know if it is a good one, just trying to think of ways this algorithm could be useful. |
Here's an implementation of Tarjan's algorithm:

def strongly_connected_components(vertices, edgefunc):
    '''Strongly connected components of a graph

    vertices is an iterable giving the vertices of the graph.
    edgefunc(vertex) gives an iterable of vertices that have edges from
    vertex.
    '''
    def follow(v1):
        # Add to the data structures
        index = len(stack)
        indices[v1] = lowlink[v1] = index
        stack.append(v1)

        # Recurse over descendants
        for v2 in edgefunc(v1):
            if v2 not in indices:
                follow(v2)
                lowlink[v1] = min(lowlink[v1], lowlink[v2])
            elif v2 in stack:
                lowlink[v1] = min(lowlink[v1], indices[v2])

        # Pop off complete connected components
        if lowlink[v1] == indices[v1]:
            component = [stack.pop()]
            while component[-1] is not v1:
                component.append(stack.pop())
            components.append(component[::-1])

    lowlink = {}
    indices = {}
    stack = []
    components = []

    for v in vertices:
        if v in indices:
            continue
        follow(v)

    return components

# This represents a directed graph which in dot notation would look like:
# digraph G {
#     A -> B
#     A -> C
#     A -> D
#     B -> C
#     C -> B
#     C -> D
# }
graph = {
    'A': ('B', 'C', 'D'),
    'B': ('C',),  # trailing comma: ('C') would be the string 'C', not a tuple
    'C': ('B', 'D'),
    'D': (),
}

for c in strongly_connected_components(graph, graph.__getitem__):
    print(c)

Output:

$ python scc.py
['D']
['B', 'C']
['A']
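One practical caveat with the recursive `follow` is CPython's recursion limit (1000 frames by default), which a long dependency chain can exceed. Below is a minimal sketch of an iterative variant that keeps its own frame stack. The name `scc_iterative` is hypothetical, and unlike the version above it uses a running counter for indices and an `on_stack` set for membership tests.

```python
def scc_iterative(vertices, edgefunc):
    """Tarjan's algorithm with an explicit frame stack instead of recursion."""
    index = {}        # discovery number of each visited vertex
    lowlink = {}
    stack = []        # Tarjan's vertex stack
    on_stack = set()  # O(1) membership test for the vertex stack
    components = []

    for root in vertices:
        if root in index:
            continue
        index[root] = lowlink[root] = len(index)
        stack.append(root)
        on_stack.add(root)
        # Each frame is (vertex, iterator over its remaining successors)
        frames = [(root, iter(edgefunc(root)))]
        while frames:
            v, successors = frames[-1]
            for w in successors:
                if w not in index:
                    index[w] = lowlink[w] = len(index)
                    stack.append(w)
                    on_stack.add(w)
                    frames.append((w, iter(edgefunc(w))))
                    break  # descend into w; v's iterator resumes later
                if w in on_stack:
                    lowlink[v] = min(lowlink[v], index[w])
            else:
                # All successors of v are handled: v's "call" returns
                frames.pop()
                if frames:
                    parent = frames[-1][0]
                    lowlink[parent] = min(lowlink[parent], lowlink[v])
                if lowlink[v] == index[v]:
                    component = []
                    while True:
                        w = stack.pop()
                        on_stack.discard(w)
                        component.append(w)
                        if w == v:
                            break
                    components.append(component[::-1])
    return components


graph = {'A': ('B', 'C', 'D'), 'B': ('C',), 'C': ('B', 'D'), 'D': ()}
print(scc_iterative(graph, graph.__getitem__))
# [['D'], ['B', 'C'], ['A']]
```

The components come out in the same order as in the recursive version for this graph, and a chain of several thousand vertices goes through without touching the recursion limit.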
Can it use the same input format as topological_sort? Input format is one thing that a real graph library like networkx will provide more flexibility over, but we should just pick a format that is convenient and use it. Is |
Input format can be easily changed.
It is called over integers. |
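As a small illustration of how cheap the format change is, the sketch below converts between an adjacency mapping and a (V, E) pair of vertices and directed edges. Both helper names are invented for this example.

```python
def adjacency_to_VE(adjacency):
    """Convert {vertex: successors} into a (V, E) pair."""
    V = list(adjacency)
    E = [(v1, v2) for v1, successors in adjacency.items() for v2 in successors]
    return V, E


def VE_to_adjacency(G):
    """Convert a (V, E) pair back into {vertex: [successors]}."""
    V, E = G
    adjacency = {v: [] for v in V}
    for v1, v2 in E:
        adjacency[v1].append(v2)
    return adjacency


graph = {'A': ['B', 'C', 'D'], 'B': ['C'], 'C': ['B', 'D'], 'D': []}
V, E = adjacency_to_VE(graph)
print(V)  # ['A', 'B', 'C', 'D']
print(E)
```

Either direction is a single linear pass over the edges, so the choice of input format should not matter for performance.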
Here's a version that takes the same format as topological_sort:

from sympy import *

def strongly_connected_components(G):
    '''Strongly connected components of a graph

    G is a tuple (V, E) with V the vertices and E the edges as (v1, v2) pairs.
    '''
    V, E = G
    Gmap = {vi: [] for vi in V}
    for v1, v2 in E:
        Gmap[v1].append(v2)

    def follow(v1):
        # Add to the data structures
        index = len(stack)
        indices[v1] = lowlink[v1] = index
        stack.append(v1)

        # Recurse over descendants
        for v2 in Gmap[v1]:
            if v2 not in indices:
                follow(v2)
                lowlink[v1] = min(lowlink[v1], lowlink[v2])
            elif v2 in stack:
                lowlink[v1] = min(lowlink[v1], indices[v2])

        # Pop off complete connected components
        if lowlink[v1] == indices[v1]:
            component = [stack.pop()]
            while component[-1] is not v1:
                component.append(stack.pop())
            components.append(component[::-1])

    lowlink = {}
    indices = {}
    stack = []
    components = []

    for vi in V:
        if vi in indices:
            continue
        follow(vi)

    return components

# This represents a directed graph which in dot notation would look like:
# digraph G {
#     A -> B
#     A -> C
#     A -> D
#     B -> C
#     C -> B
#     C -> D
# }
V = ['A', 'B', 'C', 'D']
E = [
    ('A', 'B'),
    ('A', 'C'),
    ('A', 'D'),
    ('B', 'C'),
    ('C', 'B'),
    ('C', 'D'),
]
G = (V, E)

for c in strongly_connected_components(G):
    print(c)
Here's an example of how we could use this in solve to partition a system of equations (this is an undirected graph so we're just finding the connected components). The function solve_scc solves a system of equations by first partitioning the system into connected components and, for the block-diagonal example below, is much faster than solve.

def solve_scc(eqs, syms):
    # Build graph
    V = syms
    E = []
    # Map back from syms to eqs
    eqmap = {s: set() for s in syms}
    for eq in eqs:
        eqsyms = eq.free_symbols & set(syms)
        for s1 in eqsyms:
            eqmap[s1].add(eq)
            # Visit each unordered pair once, then add both directions
            for s2 in eqsyms:
                if s1 is s2:
                    break
                E.append((s1, s2))
                E.append((s2, s1))
    G = (V, E)

    # Find coupled subsystems:
    coupled_syms = strongly_connected_components(G)

    # Solve subsystems and combine results
    soldict = {}
    for csyms in coupled_syms:
        ceqs = set.union(*[eqmap[s] for s in csyms])
        csol = solve(ceqs, csyms, dict=True)
        assert len(csol) == 1
        soldict.update(csol[0])
    return [soldict]

# Build a big-ish linear system:
Nrep = 5  # Nrep*3 equations
syms = symbols('x:%d' % (3*Nrep,))
a, b, c = symbols('a b c')
B = Matrix([
    [a, b, c],
    [a, -b, c],
    [a, b, -c],
])
rhs = [1, 2, 3] * Nrep
M = BlockMatrix([[B if i == j else zeros(3, 3) for i in range(Nrep)] for j in range(Nrep)])
M = M.as_explicit()
#pprint(M)
b = Matrix([[bi] for bi in rhs])  # rebinds b; the symbol b is already stored in B
x = Matrix([[si] for si in syms])
eqs = list(M*x - b)
#pprint(eqs)

import time

# Solve with solve_scc:
start = time.time()
sol_scc = solve_scc(eqs, syms)
print('solve_scc:', time.time() - start, 'seconds')

# Solve with solve:
start = time.time()
sol = solve(eqs, syms, dict=True)
print('solve:', time.time() - start, 'seconds')

Running this the timings are:

$ python scc.py
solve_scc: 0.47367119789123535 seconds
solve: 35.56143808364868 seconds

The differences grow as a system like this gets larger (increase Nrep). Solving an NxN linear system is maybe |
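Since the equation graph here is undirected, a disjoint-set (union-find) structure is an alternative way to find the coupled groups without building edges in both directions. The sketch below is only an illustration: `connected_groups` is a made-up name, and plain sets of strings stand in for `eq.free_symbols & set(syms)`.

```python
def connected_groups(equation_symbols):
    """Group equations that share unknowns, using union-find.

    equation_symbols holds one set of unknown names per equation.
    Returns lists of equation indices, one list per coupled group.
    Equations containing no unknowns are skipped.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(x, y):
        parent[find(x)] = find(y)

    # Unknowns appearing in the same equation belong to one group
    for syms in equation_symbols:
        ordered = sorted(syms)
        for s in ordered[1:]:
            union(ordered[0], s)

    # Collect equations by the representative of any of their unknowns
    groups = {}
    for i, syms in enumerate(equation_symbols):
        if syms:
            groups.setdefault(find(min(syms)), []).append(i)
    return list(groups.values())


# Equations 0 and 1 share x1; equations 2 and 3 share x4.
eq_syms = [{'x0', 'x1'}, {'x1', 'x2'}, {'x3', 'x4'}, {'x4'}]
groups = connected_groups(eq_syms)
print(groups)  # [[0, 1], [2, 3]]
```

Each group could then be passed to solve separately, in the same way solve_scc handles each component.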
I think it would make sense to add strongly_connected_components. We both have use cases for it, so go ahead, I would say! |
I have thought of a couple of patches that would benefit from having algorithms for e.g. finding connected components of a graph. Do any graph-theoretic algorithms exist anywhere for internal use in SymPy?
I see #8186 rejects the idea of adding a graph theory module for external use but this is about internal use.
I have two examples:
strongly connected components. Those strongly connected components can be solved as reduced problems in a divide-and-conquer style approach. This would give a significant performance boost for many matrices.
For these two it would be useful to have e.g. Tarjan's algorithm:
https://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm
I'm opening this issue to ask: