Add feature: finding densest k-node subgraph. Corresponding to issue #999 #1010

francis-shuoch
Contributor

This answers wish list item #999. I implemented the approximation algorithms proposed in paper 2, which @hagberg mentioned in his comment.

Paper 2 builds on the earlier work in paper 1. Both are referenced in the code.

Basically, this problem is NP-hard. The proposed algorithm is a combination of 5 different algorithms; in the code they correspond to __trivial(), __greedy(), __walks2(), __walks3() and __walks4(). (For reference, paper 1 is the better source, since paper 2 doesn't describe all 5 algorithms in detail.) Each of them handles one aspect of the problem, and by choosing the best of the 5 results, the paper claims a constant approximation ratio. Combining the first 3 gives algorithmA, and adding the remaining 2 gives algorithmB. algorithmA alone can give a reasonable approximation in average cases, while algorithmB is introduced to deal with the worst cases. Paper 2 also deals mainly with the worst cases, in addition to giving a more rigorous performance proof.
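The "run several procedures and keep the best result" structure described above can be sketched roughly as follows. This is only an illustration: the two heuristics here (top-k by degree, and greedy removal of the lowest-degree node) are simple stand-ins chosen for brevity, not the PR's __trivial(), __greedy() or __walks2(), and every function name is hypothetical. The graph is a plain edge list, so nothing beyond the standard library is assumed.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Return {node: set(neighbours)} for an undirected edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)


def count_internal_edges(adj, nodes):
    """Number of edges with both endpoints in `nodes`."""
    nodes = set(nodes)
    return sum(len(adj[u] & nodes) for u in nodes) // 2


def top_degree(adj, k):
    """Stand-in heuristic 1: the k nodes of highest degree."""
    return set(sorted(adj, key=lambda n: len(adj[n]), reverse=True)[:k])


def greedy_peel(adj, k):
    """Stand-in heuristic 2: repeatedly drop the node with the fewest
    neighbours among the nodes still kept, until k nodes remain."""
    kept = dict.fromkeys(adj)  # insertion-ordered, so ties break predictably
    while len(kept) > k:
        worst = min(kept, key=lambda n: sum(1 for m in adj[n] if m in kept))
        del kept[worst]
    return set(kept)


def best_of(adj, k, heuristics=(top_degree, greedy_peel)):
    """Run every heuristic and keep the candidate with the most internal edges."""
    candidates = [h(adj, k) for h in heuristics]
    return max(candidates, key=lambda c: count_internal_edges(adj, c))
```

Taking the maximum over candidates means the combined result is never worse than any single heuristic, which is the property the combination relies on; the approximation guarantee itself comes from the specific procedures in the papers, not from this sketch.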

However, I found the 4th and 5th algorithms (and hence algorithmB) too complicated and too slow. Going by the analysis in the paper, I also don't think they contribute much in the average case. I did write algorithm 4 (__walks3()), but it turned out to be really time-consuming, since it makes quite a lot of calls to the other 3 algorithms. So I dropped both of them for now and will wait for your review and suggestions, because I don't want to dive into this blindly. I'm not a math student and the formulas already made me dizzy, so forgive me for not working it all the way through.

For the tests, I used one graph from the Wikipedia page for this problem, and one built with nx.wheel_graph(), because the centre of a wheel graph must be included if k > 1. In the doctests, I used nx.house_graph(), where again the triangle is the optimal solution when k == 3.
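A small harness along these lines could check any candidate solver against the two generated graphs. It is a sketch under assumptions: the `solver(edges, k) -> set` interface is hypothetical, and the edge lists are written out by hand to mirror what nx.house_graph() and nx.wheel_graph(6) produce (hub 0, rim 1..5). One detail worth noting: for k == 2 a rim edge is just as dense as a hub edge, so the hub is only forced from k == 3 onwards, which is why the wheel checks start there.

```python
# Edge list matching nx.house_graph(): square 0-1-3-2 with roof node 4.
HOUSE_EDGES = [(0, 1), (0, 2), (1, 3), (2, 3), (2, 4), (3, 4)]


def wheel_edges(n):
    """Hub 0 joined to a rim cycle 1..n-1 (assumes n >= 4)."""
    rim = list(range(1, n))
    spokes = [(0, r) for r in rim]
    cycle = [(rim[i], rim[(i + 1) % len(rim)]) for i in range(len(rim))]
    return spokes + cycle


def internal_edges(edges, nodes):
    """Count edges with both endpoints inside `nodes`."""
    nodes = set(nodes)
    return sum(1 for u, v in edges if u in nodes and v in nodes)


def check_solver(solver):
    """Assert that `solver` finds the known optima on the two test graphs."""
    # House graph, k == 3: the roof triangle is the only 3-edge choice.
    assert solver(HOUSE_EDGES, 3) == {2, 3, 4}

    # Wheel with 5 rim nodes: the optimum is the hub plus k-1 consecutive
    # rim nodes, giving (k-1) spokes + (k-2) rim edges = 2k-3 edges.
    wheel = wheel_edges(6)
    for k in (3, 4, 5):
        found = solver(wheel, k)
        assert len(found) == k
        assert 0 in found
        assert internal_edges(wheel, found) == 2 * k - 3
```

Because the harness only asserts on the returned node set, it works the same way for an exact solver and for an approximate one that happens to be optimal on these easy inputs.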

Waiting for your review. I would be happy to listen to your suggestion, and of course, open to bug fixes ;)

@argriffing

Thanks for working on this! I'm more interested in slow exact solutions than in fast approximate solutions. For example, the subset-sum problem is also NP-hard, and it has a naive O(N 2^N) brute-force solution and a more clever O(2^(N/2)) solution. Maybe I should have just not mentioned anything about approximate solutions in the original GitHub issue; sorry if that was misleading.
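To make the subset-sum comparison concrete, here is a minimal sketch of both strategies. The function names are made up for this example, and the meet-in-the-middle version uses hash sets for the matching step, so its running time is roughly proportional to 2^(N/2) in expectation rather than a strict worst-case bound.

```python
from itertools import combinations


def subset_sum_bruteforce(values, target):
    """Try every subset: 2^N subsets, with O(N) work to sum each one."""
    return any(
        sum(combo) == target
        for r in range(len(values) + 1)
        for combo in combinations(values, r)
    )


def half_sums(values):
    """All distinct subset sums of `values` (at most 2^len(values))."""
    sums = {0}
    for v in values:
        sums |= {s + v for s in sums}
    return sums


def subset_sum_meet_in_middle(values, target):
    """Split the input in two halves, enumerate the subset sums of each
    half, then look for a left sum whose complement is a right sum."""
    mid = len(values) // 2
    left = half_sums(values[:mid])
    right = half_sums(values[mid:])
    return any(target - s in right for s in left)
```

Both functions treat the empty subset as valid, so a target of 0 always succeeds. The point of the comparison is that an exact algorithm can still beat plain enumeration by a large factor without being polynomial.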

> TODO: what happens when no such subgraph is founded

I think it is OK to require that the input graph has at least k nodes. If the caller is asking for the set of nodes in the densest subgraph with exactly k nodes, then if their input graph has fewer than k nodes it should be OK to just raise an exception.
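The check suggested above could look something like this. It is a sketch only: `check_k` is a hypothetical helper, and it raises the built-in ValueError to stay dependency-free, whereas code inside networkx might prefer nx.NetworkXError; which exception type fits best is a maintainer decision.

```python
def check_k(nodes, k):
    """Validate k against the node collection and return the node count.

    Raises ValueError when k is not a positive integer or when the graph
    has fewer than k nodes, instead of silently returning a smaller set.
    """
    n = len(set(nodes))
    if not isinstance(k, int) or k < 1:
        raise ValueError(f"k must be a positive integer, got {k!r}")
    if k > n:
        raise ValueError(f"graph has only {n} nodes, cannot select k={k}")
    return n
```

Failing early like this removes the "no such subgraph found" case from the search code entirely, since every graph with at least k nodes has some k-node subgraph.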

@francis-shuoch
Contributor Author

@argriffing It's OK, I'm interested in graph-structure problems myself. Anyway, do you know of any better references on this densest-k subgraph problem? According to the paper I referred to, this problem is not solvable yet (exact solutions).

@argriffing

> According to that paper I referred, this problem is not solvable yet (exact solutions.).

I'm not sure what you mean by this. The problem is already solvable by brute force, and it has already been proved NP-hard, so it has no efficient solution unless P = NP. I'm wondering about strategies that are not efficient but which are faster than the naive brute-force algorithm that enumerates all combinations.
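For reference, the naive baseline that enumerates all combinations can be written in a few lines. This is a sketch with a hypothetical function name; it takes a plain edge list, only considers nodes that appear in at least one edge, and examines all C(n, k) subsets with O(m) work each.

```python
from itertools import combinations


def densest_k_bruteforce(edges, k):
    """Return (node_set, edge_count) for a densest k-node subgraph.

    Enumerates every k-subset of the nodes that appear in `edges` and
    counts the edges inside it. Exact, but exponential in general.
    """
    nodes = sorted({n for edge in edges for n in edge})
    if not 0 <= k <= len(nodes):
        raise ValueError(f"need 0 <= k <= {len(nodes)}, got {k}")

    best_set, best_edges = set(), -1
    for subset in combinations(nodes, k):
        chosen = set(subset)
        internal = sum(1 for u, v in edges if u in chosen and v in chosen)
        if internal > best_edges:
            best_set, best_edges = chosen, internal
    return best_set, best_edges
```

Any faster exact strategy has to return the same edge count as this function, which makes it a convenient oracle for testing on small graphs.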

@francis-shuoch
Contributor Author

@argriffing Oh, now I understand what you mean. You are looking for something like pruning methods. I will see if I can find any. I think there could be some useful heuristics we can come up with.
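One possible shape for such a pruning method is a branch-and-bound search: build the node set one node at a time and abandon a partial set as soon as an optimistic bound shows it cannot beat the best complete set found so far. The sketch below is an assumption about what that could look like, not something taken from the PR or the papers; the function name and the particular bound are illustrative, and it remains exponential in the worst case.

```python
def densest_k_branch_and_bound(edges, k):
    """Exact search for a densest k-node subgraph with simple pruning.

    Returns (node_set, edge_count). Only nodes that appear in `edges`
    are considered.
    """
    adj = {}
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)

    # Visit high-degree nodes first so that good solutions appear early
    # and the bound starts pruning sooner.
    order = sorted(adj, key=lambda n: len(adj[n]), reverse=True)
    if not 0 <= k <= len(order):
        raise ValueError(f"need 0 <= k <= {len(order)}, got {k}")

    best_nodes, best_edges = set(), -1

    def search(start, chosen, internal):
        nonlocal best_nodes, best_edges
        need = k - len(chosen)
        if need == 0:
            if internal > best_edges:
                best_nodes, best_edges = set(chosen), internal
            return

        # Optimistic bound: the `need` best candidates each contribute all
        # their edges into `chosen`, and every pair of them is adjacent.
        gains = sorted(
            (len(adj[c] & chosen) for c in order[start:]), reverse=True
        )
        bound = internal + sum(gains[:need]) + need * (need - 1) // 2
        if bound <= best_edges:
            return  # this branch cannot improve on the incumbent

        # Stop early enough that `need` nodes are still available.
        for i in range(start, len(order) - need + 1):
            node = order[i]
            added = len(adj[node] & chosen)
            chosen.add(node)
            search(i + 1, chosen, internal + added)
            chosen.remove(node)

    search(0, set(), 0)
    return best_nodes, best_edges
```

The bound is deliberately loose so that it is obviously valid; tighter bounds (for example, capping each candidate's contribution by its remaining degree) would prune more at the cost of extra bookkeeping.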

@hagberg added this to the networkx-future milestone on Apr 19, 2014
Base automatically changed from master to main March 4, 2021 18:20
@MridulS
Member

MridulS commented Nov 8, 2023

We are almost at the 10 year anniversary of this PR! Thanks for your work on this @francis-shuoch

Sorry about never getting back to this. A lot has changed in networkx in the last 10 years and it would require some effort to make this work and review with the current main branch. Let me know if you are still interested in getting this in! It's totally fine if you don't have the bandwidth. Thanks again for your contribution!

@MridulS closed this on Nov 28, 2023