Add incremental cycle detection algorithm #36

Keno · 2021-11-10T07:43:53Z

This adds an abstract interface for, and a basic, naive implementation of,
an algorithm to solve the incremental (online) cycle detection problem.
The incremental cycle detection problem is as follows:

Starting from some initial acyclic (directed) graph (here taken to be empty),
process a series of edge additions, stopping after the first edge
addition that introduces a cycle.
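
To make the problem statement concrete, here is a minimal brute-force sketch. It is not the algorithm added in this PR: it simply reruns a reachability search before every insertion. The names `reachable` and `first_cycle_edge` and the plain adjacency-list representation are assumptions made for illustration.

```julia
# Hypothetical helpers; the graph is a plain Vector of out-neighbor lists.

# Return true if `target` is reachable from `start` by following out-edges.
function reachable(adj::Vector{Vector{Int}}, start::Int, target::Int)
    seen = falses(length(adj))
    stack = [start]
    while !isempty(stack)
        v = pop!(stack)
        v == target && return true
        seen[v] && continue
        seen[v] = true
        append!(stack, adj[v])
    end
    return false
end

# Process edges in order, starting from an empty graph on `n` vertices.
# Return the index of the first edge that would close a cycle, or `nothing`
# if all edges can be added while keeping the graph acyclic.
function first_cycle_edge(n::Int, edges::Vector{Tuple{Int,Int}})
    adj = [Int[] for _ in 1:n]
    for (i, (u, v)) in enumerate(edges)
        # u -> v closes a cycle exactly when u is already reachable from v.
        reachable(adj, v, u) && return i
        push!(adj[u], v)
    end
    return nothing
end

# The fourth edge closes the cycle 2 -> 3 -> 4 -> 2.
first_cycle_edge(4, [(1, 2), (2, 3), (3, 4), (4, 2)])
```

Each insertion here costs a full graph search, which is the cost that incremental algorithms try to avoid.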

The algorithms for this problem have been developed and improved mostly
over the past five years or so, so they are not currently widely used
(or available in other libraries). That said, ModelingToolkit was using
an implementation in its DAE tearing code (which I intend to replace
with a version based on this code) and there are recent papers showing
applications in compiler optimizations.

The main focus of the current PR is the abstract interface. The algorithm
itself is Algorithm N (for "Naive") from [BFGT15] Section 3. The reference
develops several more algorithms with differing performance patterns,
depending on the sparsity of the graph and there are further improvements
to the asymptotics in the subsequent literature. However, Algorithm N is
simple to understand and extend and works ok for what is needed in MTK,
so I am not currently planning to implement the more advanced algorithms.
I do want to make sure the extension point exists to add them in the future,
which I believe this approach accomplishes.
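
For readers who want a feel for a level-based approach, the following is a rough sketch of one way to maintain a weak topological ordering under edge insertions. It is an illustration under stated assumptions, not the code in this PR and not necessarily identical to Algorithm N in [BFGT15]: the type `LevelTracker`, the function `try_add_edge!`, and the rollback on failure are all hypothetical.

```julia
# Hypothetical type: out-neighbor lists plus one integer level per vertex.
# Invariant: level[u] < level[v] for every stored edge u -> v.
struct LevelTracker
    adj::Vector{Vector{Int}}
    level::Vector{Int}
end
LevelTracker(n::Int) = LevelTracker([Int[] for _ in 1:n], ones(Int, n))

# Try to add u -> v. Returns true if the edge was added, false if it would
# close a cycle (in which case the tracker is left unchanged).
function try_add_edge!(t::LevelTracker, u::Int, v::Int)
    u == v && return false
    if t.level[u] >= t.level[v]
        undo = Tuple{Int,Int}[(v, t.level[v])]   # (vertex, previous level)
        t.level[v] = t.level[u] + 1
        stack = [v]
        while !isempty(stack)
            x = pop!(stack)
            for y in t.adj[x]
                if y == u
                    # v reaches u, so u -> v would close a cycle: roll back.
                    for (w, old) in reverse(undo)
                        t.level[w] = old
                    end
                    return false
                end
                if t.level[y] <= t.level[x]
                    # Push y above x to restore the invariant, then revisit y.
                    push!(undo, (y, t.level[y]))
                    t.level[y] = t.level[x] + 1
                    push!(stack, y)
                end
            end
        end
    end
    push!(t.adj[u], v)
    return true
end

t = LevelTracker(4)
try_add_edge!(t, 1, 2)
try_add_edge!(t, 2, 3)
try_add_edge!(t, 4, 1)   # raises vertex 1 and, through propagation, 2 and 3
```

When the new edge already respects the levels, no search is needed at all, which is where the savings over a fresh reachability check come from.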

[BFGT15] Michael A. Bender, Jeremy T. Fineman, Seth Gilbert, and Robert E. Tarjan. 2015.
A New Approach to Incremental Cycle Detection and Related Problems.
ACM Trans. Algorithms 12, 2, Article 14 (December 2015), 22 pages.
DOI: http://dx.doi.org/10.1145/2756553

codecov · 2021-11-10T08:09:03Z

Codecov Report

Merging #36 (abacfa1) into master (0290c71) will increase coverage by 0.02%.
The diff coverage is 98.96%.

@@            Coverage Diff             @@
##           master      #36      +/-   ##
==========================================
+ Coverage   97.52%   97.54%   +0.02%     
==========================================
  Files         111      112       +1     
  Lines        6229     6325      +96     
==========================================
+ Hits         6075     6170      +95     
- Misses        154      155       +1


YingboMa · 2021-11-10T16:39:33Z

src/cycles/incremental.jl

+"""
+abstract type IncrementalCycleTracker{I} <: AbstractGraph{I} end
+
+function (::Type{IncrementalCycleTracker})(s::AbstractGraph{I}; in_out_reverse=nothing) where {I}


Why not in_out_reverse=false?

The idea is to be able to detect somebody explicitly passing in in_out_reverse=false, in case we want to change these options later, since it's a bit ad hoc. It helps to have a value that represents "no user preference" such that people writing wrappers can use that as a default value if they want without incorrectly signaling user intent. E.g. if we wanted to deprecate this argument, we'd check isa(in_out_reverse, Bool).
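
A small sketch of the sentinel-default pattern described above. The function names and the keyword `reverse_flag` are hypothetical stand-ins, not the PR's API.

```julia
# Hypothetical function; `reverse_flag` stands in for a keyword like `in_out_reverse`.
# `nothing` means "the caller expressed no preference".
function describe_preference(; reverse_flag::Union{Nothing,Bool}=nothing)
    if reverse_flag === nothing
        return "no user preference, using the default"
    else
        # Only reached when the caller explicitly passed true or false,
        # so a deprecation warning could be emitted here.
        return "user explicitly asked for reverse_flag=$(reverse_flag)"
    end
end

# A wrapper can forward `nothing` without pretending the user chose a value.
wrapper(; reverse_flag=nothing) = describe_preference(; reverse_flag=reverse_flag)

wrapper()                    # takes the "no preference" branch
wrapper(reverse_flag=false)  # takes the "explicit" branch
```

With a `false` default, the two calls above would be indistinguishable inside `describe_preference`.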

jpfairbanks · 2021-11-10T21:34:22Z

Sorry for asking a question that could be answered by a more thorough study of the code, but it looks like the in_out_reverse keyword argument is having the effect of interpreting the graph as reversing the in and out neighbors. Is there anything that this algorithm needs to do for that specially?

I would rather have

# Sketch of a wrapper type, written as if inside Graphs.jl
struct ReverseGraph{T,G<:AbstractGraph{T}} <: AbstractGraph{T}
    g::G
end

# [proxy all the AbstractGraph API onto ReverseGraph.g]
nv(r::ReverseGraph) = nv(r.g)
ne(r::ReverseGraph) = ne(r.g)

# swap the role of in and out
inneighbors(r::ReverseGraph, v) = outneighbors(r.g, v)
outneighbors(r::ReverseGraph, v) = inneighbors(r.g, v)

Any algorithm that works on directed graphs could have this flag and I think we should solve this at the type level once and for all. That way we would solve this problem in a central place and not have to put this keyword on every function that expects a DiGraph.

Otherwise, I think this is a good feature to have. It sounds like there has been a big improvement in this area since the last time I seriously kept up with streaming graph literature.

Keno · 2021-11-10T21:40:07Z

> Sorry for asking a question that could be answered by a more thorough study of the code, but it looks like the in_out_reverse keyword argument is having the effect of interpreting the graph as reversing the in and out neighbors. Is there anything that this algorithm needs to do for that specially?

So what this flag does is to reverse whether the algorithm uses inneighbors or outneighbors, because the graphs in MTK only have one of the directions defined for performance reasons. For this algorithm, this is conceptually equivalent to reversing the edges of the graph, except for the topological sort, where reversing the graph would give opposite ordering. If you passed in a reversed graph, you'd also have to put the burden of reversing the edge in the API on the user, which can be confusing. Ideally we'd have some sort of trait that defines whether the graph has inneighbors, outneighbors or both and just use that, but that seemed like a bigger change.

Keno · 2021-11-11T23:08:26Z

I played with replacing in_out_reverse with a trait, but then I realized that doesn't actually fix it, because in_out_reverse also determines whether you can batch the source or the dest vertices, which you can't determine from the trait. Given that, I'd prefer to get this in as is and if we ever get a trait along those lines, we can use it to set an automatic default for in_out_reverse, but it doesn't get rid of the flag.

Keno · 2021-11-11T23:23:08Z

Per request from @YingboMa, added support for constructing the weak topological ordering when the graph is not empty to start with.

jpfairbanks · 2021-11-12T00:20:14Z

I think from the way the AbstractGraph API is designed you are supposed to always support both in and out neighbors, but you might have an O(E) complexity for one direction with O(1) complexity in the other direction.
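
To illustrate the complexity trade-off mentioned above, here is a sketch of a graph that stores only out-neighbor lists and recovers in-neighbors by scanning every list. The helper name `slow_inneighbors` and the adjacency-list representation are assumptions for illustration, not part of Graphs.jl.

```julia
# Hypothetical helper: the graph stores only out-neighbor lists, so finding
# the in-neighbors of `v` means scanning every list, i.e. O(V + E) work.
function slow_inneighbors(adj::Vector{Vector{Int}}, v::Int)
    return [u for u in 1:length(adj) if v in adj[u]]
end

adj = [[2, 3], [3], Int[]]   # edges 1 -> 2, 1 -> 3, 2 -> 3
slow_inneighbors(adj, 3)     # scans all three lists to find the sources
```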

The idea of reversing the graph with a wrapper struct came up on another issue, and it previously came up when we first added support for directed graphs and their adjacency matrices. The first place we had to think about this was the orientation of the adjacency matrix: is A[i,j] the number of directed edges from i to j, or from j to i? It makes sense to me to handle this like LinearAlgebra does, with a wrapper for transpose.

But in this case not only do you need to reverse the graph, you want the toposort to be the opposite ordering of the edges. There is something about this that feels weird to me. Like the toposort should be in the direction of the edges. Given that this issue comes up for every directed graph algorithm I'd like to have a consistent solution. We use a keyword dir=:in or :out in the traversals code.

https://github.com/JuliaGraphs/Graphs.jl/blob/master/src/traversals/bfs.jl#L35

Maybe that solution is appropriate here.

Keno · 2021-11-12T05:25:54Z

> But in this case not only do you need to reverse the graph, you want the toposort to be the opposite ordering of the edges. There is something about this that feels weird to me. Like the toposort should be in the direction of the edges

Well, it's a little specific to this particular algorithm because cycles are preserved under the edge reflection symmetry, so a cycle detection on the reversed graph is a cycle detection on the original graph (but with reversed topsort and a reversed direction of the batch insertion).
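
The symmetry argument can be checked on a small example. The sketch below uses hypothetical helpers on plain adjacency lists (not Graphs.jl types): it flips every edge and confirms that acyclicity is unchanged while a valid topological order becomes valid in reverse.

```julia
# Kahn's algorithm on out-neighbor lists; returns `nothing` if there is a cycle.
function toposort(adj::Vector{Vector{Int}})
    n = length(adj)
    indeg = zeros(Int, n)
    for u in 1:n, v in adj[u]
        indeg[v] += 1
    end
    queue = [v for v in 1:n if indeg[v] == 0]
    order = Int[]
    while !isempty(queue)
        u = popfirst!(queue)
        push!(order, u)
        for v in adj[u]
            indeg[v] -= 1
            indeg[v] == 0 && push!(queue, v)
        end
    end
    return length(order) == n ? order : nothing
end

# Build the graph with every edge flipped.
function reverse_edges(adj::Vector{Vector{Int}})
    rev = [Int[] for _ in 1:length(adj)]
    for u in 1:length(adj), v in adj[u]
        push!(rev[v], u)
    end
    return rev
end

# Check that `order` places u before v for every edge u -> v.
function is_topological(adj, order)
    pos = invperm(order)
    return all(pos[u] < pos[v] for u in 1:length(adj) for v in adj[u])
end

dag = [[2, 3], [4], [4], Int[]]
order = toposort(dag)
is_topological(reverse_edges(dag), reverse(order))  # reversed order fits the flipped graph
```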

I don't disagree that the generic graph reversal wrapper would be useful, I just don't think it's the correct API for this particular function, because the reversal is a bit of an implementation detail of the algorithm. I also think it's mostly unrelated to whether outneighbors is defined, since you still want to do the reversed traversal to support the batching of the source vertices.

Regarding the generic API, I agree that Graphs.jl is currently written with the assumption that both are implemented, but I don't think that's necessarily a good design. Certainly all the basic graph data structures should have both defined, but I don't think forcing downstream graph implementations to implement both is desirable or necessary. It is of course possible to implement the O(E) version, but I think you'd rather get an error if you're already going through the trouble of using a custom, optimized graph type. After all, you can always just convert it to a SimpleDiGraph and do whatever you want on it and it'll be significantly faster than adding an extra O(E) to the complexity.

That said, I don't think the question of whether graphs that don't implement both directions are part of the API needs to be resolved here, since the reversal is necessary for the batching consideration anyway. I think I'd rather punt that particular question to a future point when addressing it is actually required.

jpfairbanks · 2021-11-13T21:21:58Z

Ok, I agree that we shouldn't need to fix that general approach here. We should probably just be consistent with the code in traversals.jl and use dir=[:in|:out] until we decide on the consistent solution.
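
As a rough illustration of the dir=:in / dir=:out convention, here is a sketch that selects the neighbor lists from a keyword. The type `BiAdj` and the function `neighbors_by_dir` are hypothetical and exist only for this example.

```julia
# Hypothetical type: stores both directions so either can be traversed cheaply.
struct BiAdj
    out::Vector{Vector{Int}}
    inn::Vector{Vector{Int}}
end

# Pick the neighbor lists according to `dir`, rejecting any other symbol.
function neighbors_by_dir(g::BiAdj, v::Int; dir::Symbol=:out)
    dir === :out && return g.out[v]
    dir === :in && return g.inn[v]
    throw(ArgumentError("dir must be :in or :out, got $(repr(dir))"))
end

g = BiAdj([[2], Int[]], [Int[], [1]])   # single edge 1 -> 2
neighbors_by_dir(g, 1)                  # follows out-edges by default
neighbors_by_dir(g, 2; dir=:in)         # follows in-edges
```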

Keno · 2021-11-13T23:26:16Z

Updated to use the dir keyword.

Keno · 2021-11-17T04:08:01Z

@jpfairbanks Are we good to go on this?

ViralBShah · 2022-01-07T05:17:17Z

@YingboMa @Keno if this is adequately tested, let's get it merged.

etiennedeg · 2022-01-07T09:57:47Z

Some tests are failing. Call to IncrementalCycleTracker(Gcycle2) without specifying the dir keyword is failing. IncrementalCycleTracker(Gcycle2, dir=:out) is working as expected. I don't know what's going on.
Edit: Ok, we need to replace ::Symbol by ::Union{Nothing, Symbol}.
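
The failure mode is presumably the usual one for a keyword whose default value does not match its type annotation. The sketch below reproduces that pattern with two hypothetical functions; it is not the PR's constructor.

```julia
# Hypothetical functions illustrating the annotation problem.
strict(; dir::Symbol=nothing) = dir                  # default violates the annotation
relaxed(; dir::Union{Nothing,Symbol}=nothing) = dir  # default is allowed

# Calling `strict` without the keyword raises an error; `relaxed` does not.
strict_failed = try
    strict()
    false
catch
    true
end

relaxed()            # returns nothing
relaxed(dir=:out)    # returns :out
strict(dir=:out)     # works, because an explicit Symbol satisfies the annotation
```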


oscardssmith · 2022-01-31T21:06:10Z

For reference, the paper Keno linked is on arxiv at https://arxiv.org/abs/1112.0784

oscardssmith · 2022-02-01T02:06:19Z

Change made. Is there anything else that needs to happen to get this merged? (@Keno is having me finish up this PR).


ViralBShah · 2022-02-08T14:59:10Z

@simonschoelly How do we give more people rights to run CI workflows?

pfitzseb · 2022-02-08T15:02:44Z

There's a setting under Settings > Actions > General to restrict CI only for new GH accounts.

ViralBShah · 2022-02-08T15:13:42Z

It says require approval for first-time contributors. So hopefully you shouldn't need it for the next PR.

oscardssmith · 2022-02-08T16:04:53Z

CI should be fixed (needs a rerun).

oscardssmith · 2022-02-08T16:18:53Z

Once CI passes, are people ready to merge this?

etiennedeg

I did not review that much in detail but it seems to be correct, so after these minor corrections, this is ok for me.
YingboMa's reviews are marked as resolved but I don't see the corrections. What is going on?


oscardssmith · 2022-02-09T14:51:55Z

Changes committed. Not sure why Yingbo's was marked resolved, but now it's fixed.

etiennedeg · 2022-02-09T14:53:58Z

Maybe the weak_topological_levels could be useful in general (maybe renamed as a depth function)


oscardssmith · 2022-02-09T15:15:38Z

The one reason not to give this a more generic name is that some of the more advanced algorithms might not compute the topological_levels. I'd be willing to rename it though if you think it would be better.

etiennedeg · 2022-02-09T15:23:03Z

It was more a thought than a review suggestion, we can still move it elsewhere later, no worries.

ViralBShah · 2022-02-09T16:22:12Z

@oscardssmith Should we merge?

oscardssmith · 2022-02-09T16:24:22Z

As far as I'm concerned, it's ready.

oscardssmith · 2022-02-09T16:49:27Z

Should we tag a new version?

simonschoelly · 2022-02-09T17:06:37Z

> Should we tag a new version?

Do you need this as a dependency for something right now? If yes, I can tag a new version. Technically SemVer would require a new minor version, but I would rather tag a patch version for now, with the understanding that this also means that the API is not fixed.

oscardssmith · 2022-02-09T17:09:37Z

ModelingToolkit would appreciate it (SciML/ModelingToolkit.jl#1450), but it doesn't have to be right now (since MTK has just copied the code from this while waiting for this to be ready).

ViralBShah · 2022-02-09T18:52:04Z

I suggest we do a minor and make MTK depend on this rather than use copied code.

simonschoelly · 2022-02-09T19:12:01Z

Ok, I will quickly release a final version for Julia v1.3 and then create a new minor version with this.

simonschoelly · 2022-02-09T22:22:49Z

Here you go: https://github.com/JuliaGraphs/Graphs.jl/releases

oscardssmith · 2022-02-09T22:27:21Z

thanks!

Keno mentioned this pull request Nov 10, 2021

Refactor tearing code SciML/ModelingToolkit.jl#1338

Merged

Keno force-pushed the kf/onlinecycles branch from 631a14a to 5e0dc27 Compare November 10, 2021 07:59

YingboMa reviewed Nov 10, 2021


YingboMa reviewed Nov 10, 2021


Keno force-pushed the kf/onlinecycles branch from 574547f to 391fddd Compare November 11, 2021 23:22

jpfairbanks mentioned this pull request Nov 13, 2021

Consistent approach to graph direction #69

Open

Keno force-pushed the kf/onlinecycles branch from 391fddd to d44fb72 Compare November 13, 2021 23:26

Keno force-pushed the kf/onlinecycles branch from d44fb72 to 2c97a8d Compare November 17, 2021 06:48

ViralBShah closed this Jan 7, 2022

ViralBShah reopened this Jan 7, 2022

etiennedeg requested changes Jan 7, 2022


Keno and others added 2 commits February 1, 2022 15:18

address code review

793a7e7

oscardssmith force-pushed the kf/onlinecycles branch from 28b5020 to 793a7e7 Compare February 1, 2022 20:19

fix ci on 1.3

1b5f6d1

etiennedeg requested changes Feb 9, 2022


oscardssmith and others added 3 commits February 9, 2022 09:49

Update src/cycles/incremental.jl

353727e

Co-authored-by: Etienne dg <depotetidg@gmail.com>

Update src/cycles/incremental.jl

12d27c2

Co-authored-by: Etienne dg <depotetidg@gmail.com>

fix typo

20fcbdb

Update src/cycles/incremental.jl

abacfa1

Co-authored-by: Yingbo Ma <mayingbo5@gmail.com>

etiennedeg approved these changes Feb 9, 2022


ViralBShah merged commit 075a01e into JuliaGraphs:master Feb 9, 2022

oscardssmith deleted the kf/onlinecycles branch February 9, 2022 16:32

oscardssmith mentioned this pull request Feb 9, 2022

remove compat incremental_cycles SciML/ModelingToolkit.jl#1450

Merged
