Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wishlist: How many vertices are reachable from each vertex? #1872

Closed
szhorvat opened this issue Dec 6, 2021 · 9 comments · Fixed by #2549
Closed

Wishlist: How many vertices are reachable from each vertex? #1872

szhorvat opened this issue Dec 6, 2021 · 9 comments · Fixed by #2549
Assignees
Labels
good first issue Candidates for first-time contributors who are already familiar with C theory The math behind the issue needs to be worked out; literature research needed wishlist Feature request that has not been chosen for implementation yet; vote or comment to prioritize it!
Milestone

Comments

@szhorvat
Copy link
Member

szhorvat commented Dec 6, 2021

What is the feature or improvement you would like to see?

A function that counts how many vertices are reachable from each vertex in a directed graph.

Here is an example where every vertex is labelled with how many others are reachable from it:

image

Can we devise a fast (faster than O(|V| ^2)) algorithm that accomplishes this?

Use cases for the feature

  • This value is the same as the out-degree of vertices in the transitive closure of the graph. The transitive closure is often a dense graph that cannot be practically stored.
  • It helps in finding the largest out-component of a graph.

References

  • TODO.
@szhorvat szhorvat added theory The math behind the issue needs to be worked out; literature research needed wishlist Feature request that has not been chosen for implementation yet; vote or comment to prioritize it! labels Dec 6, 2021
@vtraag
Copy link
Member

vtraag commented Dec 6, 2021

It seems that faster than O(|V|^2) is difficult. From https://doi.org/10.1016/j.ipl.2016.05.002:

image

@szhorvat
Copy link
Member Author

szhorvat commented Dec 6, 2021

Do we still implement this if it's O(|V|^2)? I'm not sure it's worth it in that case.

We already have this feature in igraph_closeness(), though it's counterintuitive to look for it there.

The intuitive way to solve it is with repeated calls to igraph_subcomponent(). The drawback is the overhead of building the adjlist internally with every call.

Thus, there may be minor benefits to such a function, but I am not sure it's worth it. Opinions? If we do decide to implement it anyway, it can be marked as a good first issue.

@vtraag
Copy link
Member

vtraag commented Dec 6, 2021

Worst case doesn't mean any implementation will always be so bad. I think there's some merit to such a function, so it wouldn't hurt including it.

@szhorvat szhorvat added good first issue Candidates for first-time contributors who are already familiar with C theory The math behind the issue needs to be worked out; literature research needed and removed theory The math behind the issue needs to be worked out; literature research needed labels Dec 6, 2021
@ayushagr2002
Copy link

Is this issue still open? If yes, I would like to work on it. I hope O(V^2) is fine?

@szhorvat
Copy link
Member Author

Please first finish the PR you are working on right now (#1892). Once that is merged, we can talk about this one.

@goar5670
Copy link

Hello, I think I have an idea for a O(|V| + |E|) algorithm but I am not sure it's 100% correct.
if the graph is a DAG, I think we can easily solve this in O(|V| + |E|) using DP where dp[u] represents the number of reachable vertices from node u and dp[u] = sum(dp[v]) for all v in adjlist of u
Otherwise if the graph is not a DAG we notice that for a strongly connected component (SCC) (a set of vertices where each vertex is reachable from the other) the number of reachable vertices is gonna be the exact same so we can run tarjan's SCC and compress the graph generating a DAG and then solving the problem for the new graph and then update the values for the nodes of each SCC.

@goar5670
Copy link

I would like to work on the issue even if my solution turned to be incorrect. That's of course in case you decided to move forward with the O(|V|^2) solution

@Tagl
Copy link
Contributor

Tagl commented Mar 18, 2024

Hello

I am the coauthor of a task in an algorithmic programming contest that had a problem which is exactly this.
The other coauthor (@atli164) realized after implementing the task that there was an optimization to be made.

We can use a SCC algorithm as mentioned before to obtain a DAG.
However we will still need to perform O(|V|^2) operations after that.
So what we want are the fastest operations possible for us.
By maintaining the set of reachable nodes from each starting vertex in a bitset (or bit string, essentially just a dynamic bit-size integer) we can use bit operations to speed up this step.

The time complexity becomes $O(\frac{|V|^2}{w} + |E|)$ where $w$ is the word size of the of the machine. Of course, on modern machines this is almost always $w=64$. Now in practice this isn't a full $64$ times speedup necessarily, but it is substantial.
Note that this also means we need around $|V|^2/8$ bytes to store these sets.

For the task to be solvable in python we decided on the value of $|V| \leq 30,000$, which in a language like C or C++ will run under a second, given that the graph is not dense enough for the factor of $|E|$ to dominate.

With SIMD registers you may get even higher speeds. I'm not sure whether the compiler would automatically optimize that for us and I've only ever written SIMD code in C++. I imagine it is very similar in C.

@Tagl
Copy link
Contributor

Tagl commented Mar 22, 2024

I will implement the method I described and make a pull request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Candidates for first-time contributors who are already familiar with C theory The math behind the issue needs to be worked out; literature research needed wishlist Feature request that has not been chosen for implementation yet; vote or comment to prioritize it!
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants