Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add incremental cycle detection algorithm #36

Merged
merged 7 commits into from
Feb 9, 2022

Conversation

Keno
Copy link
Contributor

@Keno Keno commented Nov 10, 2021

This adds the abstract interface for and a basic, naive implementation
of an algorithm to solve the incremental (online) cycle detection problem.
The incremental cycle detection problem is as follows:

Starting from some initial acyclic (directed) graph (here taken to be empty),
process a series of edge additions, stopping after the first edge
addition that introduces a cycle.

The algorithms for this problem have been developed and improved mostly
over the past five years or so, so they are not currently widely used
(or available in other libraries). That said, ModelingToolkit was using
an implementation in its DAE tearing code (which I intend to replace
with a version based on this code) and there are recent papers showing
applications in compiler optimizations.

The main focus of the current PR is the abstract interface. The algorithm
itself is Algorithm N (for "Naive") from [BFGT15] Section 3. The reference
develops several more algorithms with differing performance patterns,
depending on the sparsity of the graph and there are further improvements
to the asymptotics in the subsequent literature. However, Algorithm N is
simple to understand and extend and works ok for what is needed in MTK,
so I am not currently planning to implement the more advanced algorithms.
I do want to make sure the extension point exists to add them in the future,
which I believe this approach accomplishes.

[BFGT15] Michael A. Bender, Jeremy T. Fineman, Seth Gilbert, and Robert E. Tarjan. 2015
A New Approach to Incremental Cycle Detection and Related Problems.
ACM Trans. Algorithms 12, 2, Article 14 (December 2015), 22 pages.
DOI: http://dx.doi.org/10.1145/2756553

@codecov
Copy link

codecov bot commented Nov 10, 2021

Codecov Report

Merging #36 (abacfa1) into master (0290c71) will increase coverage by 0.02%.
The diff coverage is 98.96%.

@@            Coverage Diff             @@
##           master      #36      +/-   ##
==========================================
+ Coverage   97.52%   97.54%   +0.02%     
==========================================
  Files         111      112       +1     
  Lines        6229     6325      +96     
==========================================
+ Hits         6075     6170      +95     
- Misses        154      155       +1     

src/cycles/incremental.jl Outdated Show resolved Hide resolved
src/cycles/incremental.jl Outdated Show resolved Hide resolved
"""
abstract type IncrementalCycleTracker{I} <: AbstractGraph{I} end

function (::Type{IncrementalCycleTracker})(s::AbstractGraph{I}; in_out_reverse=nothing) where {I}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not in_out_reverse=false?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is to be able to detect somebody explicitly passing in in_out_reverse=false, in case we want to change these options later, since it's a bit ad hoc. It helps to have a value that represent "no user preference" such that people writing wrappers can use that as a default value if they want without incorrectly signaling user intent. E.g. if we wanted to deprecate this argument, we'd check isa(in_out_reverse, Bool).

@jpfairbanks
Copy link
Member

Sorry for asking a question that could be answered by a more thorough study of the code, but it looks like the in_out_reverse keyword argument is having the effect of interpreting the graph as reversing the in and out neighbors. Is there anything that this algorithm needs to do for that specially?

I would rather have

struct ReverseGraph{G} <: AbstractGraph where G
  g::G
end

# [proxy all the AbstractGraph API onto ReverseGraph.g]
nv(r::ReverseGraph) = nv(r.g)
ne(r::ReverseGraph) = ne(r.g)

# swap the role of in and out
in_neighbors(r::ReverseGraph) = out_neighbors(r.g)
out_neighbors(r::ReverseGraph) = in_neighbors(r.g)

Any algorithm that works on directed graphs could have this flag and I think we should solve this at the type level once and for all. That way we would solve this problem in a central place and not have to put this keyword on every function that expects a DiGraph.

Otherwise, I think this is a good feature to have. It sounds like there has been a big improvement in this area since the last time I seriously kept up with streaming graph literature.

@Keno
Copy link
Contributor Author

Keno commented Nov 10, 2021

Sorry for asking a question that could be answered by a more thorough study of the code, but it looks like the in_out_reverse keyword argument is having the effect of interpreting the graph as reversing the in and out neighbors. Is there anything that this algorithm needs to do for that specially?

So what this flag does is to reverse whether the algorithm uses inneighbors or outneighbors, because the graphs in MTK only have one of the directions defined for performance reasons. For this algorithm, this is conceptually equivalent to reversing the edges of the graph, except for the topological sort, where reversing the graph would give opposite ordering. If you passed in a reversed graph, you'd also have to put the burden of reversing the edge in the API on the user, which can be confusing. Ideally we'd have some sort of trait that defines whether the graph has inneighbors, outneighbors or both and just use that, but that seemed like a bigger change.

@Keno
Copy link
Contributor Author

Keno commented Nov 11, 2021

I played with replacing in_out_reverse with a trait, but then I realized that doesn't actually fix it, because in_out_reverse also determines whether you can batch the source or the dest vertices, which you can't determine from the trait. Given that, I'd prefer to get this in as is and if we ever get a trait along those lines, we can use it to set an automatic default for in_out_reverse, but it doesn't get rid of the flag.

@Keno
Copy link
Contributor Author

Keno commented Nov 11, 2021

Per request from @YingboMa add support for constructing the weak topological ordering when the graph is not empty to start with.

@jpfairbanks
Copy link
Member

I think from the way the AbstractGraph API is designed you are supposed to always support both in and out neighbors, but you might have an O(E) complexity for one direction with O(1) complexity in the other direction.

The idea of reversing the graph with a wrapper struct came up on another issue, and it has previously come up when we first added support for directed graphs and their adjacency matrices. The first place we had to think about this is the orientation of the adjacency matrix, is A[i,j] the number of direct edges from i to j or j to i? It makes sense to me to handle this like LinearAlgebra with a wrapper for transpose.

But in this case not only do you need to reverse the graph, you want the toposort to be the opposite ordering of the edges. There is something about this that feels weird to me. Like the toposort should be in the direction of the edges. Given that this issue comes up for every directed graph algorithm I'd like to have a consistent solution. We use a keyword dir=:in or :out in the traversals code.

https://github.com/JuliaGraphs/Graphs.jl/blob/master/src/traversals/bfs.jl#L35

Maybe that solution is appropriate here.

@Keno
Copy link
Contributor Author

Keno commented Nov 12, 2021

But in this case not only do you need to reverse the graph, you want the toposort to be the opposite ordering of the edges. There is something about this that feels weird to me. Like the toposort should be in the direction of the edges

Well, it's a little specific to this particular algorithm because cycles are preserved under the edge reflection symmetry, so a cycle detection on the reversed graph is a cycle detection on the original graph (but with reversed topsort and a reversed direction of the batch insertion).

I don't disagree that the generic graph reversal wrapper would be useful, I just don't think it's the correct API for this particular function, because the reversal is a bit of an implementation detail of the algorithm. I also think it's mostly unrelated to whether or not outneighbors is defined or not, since you still want to do the reversed traversal to support the batching of the source vertices.

Regarding the generic API, I agree that Graphs.jl is currently written with the assumption that both are implemented, but I don't think that's necessarily a good design. Certainly all the basic graph data structures should have both defined, but I don't think forcing downstream graph implementations to implement both is desirable or necessary. It is of course possible to implement the O(E) version, but I think you'd rather get an error if you're already going through the trouble of using a custom, optimized graph type. After all, you can always just convert it to a SimpleDiGraph and do whatever you want on it and it'll be significantly faster than adding an extra O(E) to the complexity.

That said, I don't think the question of whether graphs that don't implement both directions are part of the API needs to be resolved here, since the reversal is necessary for the batching consideration anyway. I think I'd rather punt that particular question to a future point when addressing it is actually required.

@jpfairbanks
Copy link
Member

Ok, I agree that we shouldn't need to fix that general approach here. We should probably just be consistent with the code in traversals.jl and use dir=[:in|:out] until we decide on the consistent solution.

@Keno
Copy link
Contributor Author

Keno commented Nov 13, 2021

Updated to use the dir keyword.

@Keno
Copy link
Contributor Author

Keno commented Nov 17, 2021

@jpfairbanks Are we good to go on this?

@ViralBShah
Copy link
Contributor

@YingboMa @Keno if this is adequately tested, let's get it merged.

@etiennedeg
Copy link
Member

etiennedeg commented Jan 7, 2022

Some tests are failing. Call to IncrementalCycleTracker(Gcycle2) without specifying the dir keyword is failing. IncrementalCycleTracker(Gcycle2, dir=:out) is working as expected. I don't know what's going on.
Edit: Ok, we need to replace ::Symbol by ::Union{Nothing, Symbol}.

src/cycles/incremental.jl Outdated Show resolved Hide resolved
@oscardssmith
Copy link
Member

For reference, the paper Keno linked is on arxiv at https://arxiv.org/abs/1112.0784

@oscardssmith
Copy link
Member

Change made. Is there anything else that needs to happen to get this merged? (@Keno is having me finish up this PR).

Keno and others added 2 commits February 1, 2022 15:18
This adds the abstract interface for and a basic, naive implementation
of an algorithm to solve the incremental (online) cycle detection problem.
The incremental cycle detection problem is as follows:

Starting from some initial acyclic (directed) graph (here taken to be empty),
process a series of edge additions, stopping after the first edge
addition that introduces a cycle.

The algorithms for this problem have been developed and improved mostly
over the past five years or so, so they are not currently widely used
(or available in other libraries). That said, ModelingToolkit was using
an implementation in its DAE tearing code (which I intend to replace
with a version based on this code) and there are recent papers showing
applications in compiler optimizations.

The main focus of the current PR is the abstract interface. The algorithm
itself is Algorithm N (for "Naive") from [BFGT15] Section 3. The reference
develops several more algorithms with differing performance patterns,
depending on the sparsity of the graph and there are further improvements
to the asymptotics in the subsequent literature. However, Algorithm N is
simple to understand and extend and works ok for what is needed in MTK,
so I am not currently planning to implement the more advanced algorithms.
I do want to make sure the extension point exists to add them in the future,
which I believe this approach accomplishes.

[BFGT15] Michael A. Bender, Jeremy T. Fineman, Seth Gilbert, and Robert E. Tarjan. 2015
    A New Approach to Incremental Cycle Detection and Related Problems.
    ACM Trans. Algorithms 12, 2, Article 14 (December 2015), 22 pages.
    DOI: http://dx.doi.org/10.1145/2756553
@ViralBShah
Copy link
Contributor

@simonschoelly How do we give more people rights to run CI workflows?

@pfitzseb
Copy link

pfitzseb commented Feb 8, 2022

There's a setting under Settings > Actions > General to restrict CI only for new GH accounts:
image

@ViralBShah
Copy link
Contributor

It says require approval for first-time contributors. So hopefully you shouldn't need it for the next PR.

@oscardssmith
Copy link
Member

Ci should be fixed. (needs rerun)

@oscardssmith
Copy link
Member

Once CI passes, are people ready to merge this?

Copy link
Member

@etiennedeg etiennedeg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not review that much in detail but it seems to be correct, so after these minor corrections, this is ok for me.
YingboMa's reviews are marked as resolved but I don't see the corrections , what is going on ?

src/cycles/incremental.jl Outdated Show resolved Hide resolved
src/cycles/incremental.jl Show resolved Hide resolved
oscardssmith and others added 3 commits February 9, 2022 09:49
Co-authored-by: Etienne dg <depotetidg@gmail.com>
Co-authored-by: Etienne dg <depotetidg@gmail.com>
@oscardssmith
Copy link
Member

Changes committed. Not sure why Yingbo's was marked resolved, but now it's fixed.

@etiennedeg
Copy link
Member

Maybe the weak_topological_levels could be useful in general (maybe renamed as a depth function)

Co-authored-by: Yingbo Ma <mayingbo5@gmail.com>
@oscardssmith
Copy link
Member

The one reason not to give this a more generic name is that some of the more advanced algorithms might not compute the topological_levels. I'd be willing to rename it though if you think it would be better.

@etiennedeg
Copy link
Member

It was more a thought than a review suggestion, we can still move it elsewhere later, no worries.

@ViralBShah
Copy link
Contributor

@oscardssmith Should we merge?

@oscardssmith
Copy link
Member

As far as I'm concerned, it's ready.

@ViralBShah ViralBShah merged commit 075a01e into JuliaGraphs:master Feb 9, 2022
@oscardssmith oscardssmith deleted the kf/onlinecycles branch February 9, 2022 16:32
@oscardssmith
Copy link
Member

Should we tag a new version?

@simonschoelly
Copy link
Contributor

Should we tag a new version?

Do you need this as a dependency for something right now? If yes I can tag a new version. Technically SemVer would require a new minor version, but I would rather tag a patch version for now, with the understanding that this also mean, that the api is not fixed.

@oscardssmith
Copy link
Member

ModelingToolkit would appreciate it (SciML/ModelingToolkit.jl#1450), but it doesn't have to be right now (since MTK has just copied the code from this while waiting for this to be ready).

@ViralBShah
Copy link
Contributor

I suggest we do a minor and make MTK depend on this rather than use copied code.

@simonschoelly
Copy link
Contributor

Ok, I will quickly release a final version for Julia v1.3 and then create a new minor version with this.

@simonschoelly
Copy link
Contributor

@oscardssmith
Copy link
Member

thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants