Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding tree broadcasting algorithm in a new module. #6928

Merged
merged 27 commits into from Mar 7, 2024

Conversation

Transurgeon
Copy link
Contributor

@Transurgeon Transurgeon commented Sep 14, 2023

Hello NetworkX developers,

I have implemented a first algorithm for the broadcasting module.
You can find an initial proposal and discussion about this here.

This algorithm retrieves the optimal broadcast time of a tree.
I decided to include it in approximation and heuristicssince my teammate and I are planning to add more algorithms which won't necessarily be optimal. (Broadcasting from an arbitrary graph and random originator is NP-complete after-all)

I tried to follow general good practices by including docstrings, references and I hope the code itself is also efficient enough.
Please provide feedback or let me know if you need clarifications on the algorithm itself.

@Transurgeon
Copy link
Contributor Author

Transurgeon commented Sep 21, 2023

@dschult sorry for the ping but could you take a look at this PR please? Would be really appreciated.

Also I was wondering if you could help for the failing test. What type labels am I missing?

Copy link
Member

@dschult dschult left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this -- see the comments below. :}

networkx/algorithms/approximation/broadcasting.py Outdated Show resolved Hide resolved
networkx/algorithms/approximation/broadcasting.py Outdated Show resolved Hide resolved
networkx/algorithms/approximation/broadcasting.py Outdated Show resolved Hide resolved
networkx/algorithms/approximation/broadcasting.py Outdated Show resolved Hide resolved
networkx/algorithms/approximation/broadcasting.py Outdated Show resolved Hide resolved
networkx/algorithms/approximation/broadcasting.py Outdated Show resolved Hide resolved
networkx/algorithms/approximation/broadcasting.py Outdated Show resolved Hide resolved
networkx/algorithms/approximation/broadcasting.py Outdated Show resolved Hide resolved
@Transurgeon
Copy link
Contributor Author

@dschult thank you so much for your review. Could you verify the changes I've made to the code? I will do another review on the docstrings soon.
The next steps on our side will be to try and write a heuristic algorithm for arbitrary graphs and random originator. However, this can only return the time of a "broadcast scheme", I.e. a sequence of broadcast calls. (It is by no means optimal).
Also, I think it would be interesting to add some examples of applications of broadcasting in your guides. Do you think this will be a good idea? And what are the rules/guidelines to make a new guide?

Copy link
Member

@dschult dschult left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those changes to the code make sense.
I've got more comments below. I think this is close to being ready. Most of what I'm asking about are nit-picky details or big-picture doc_string and function name kind of questions.
:)

networkx/algorithms/approximation/broadcasting.py Outdated Show resolved Hide resolved
networkx/algorithms/approximation/broadcasting.py Outdated Show resolved Hide resolved
networkx/algorithms/approximation/broadcasting.py Outdated Show resolved Hide resolved
networkx/algorithms/approximation/broadcasting.py Outdated Show resolved Hide resolved
networkx/algorithms/approximation/broadcasting.py Outdated Show resolved Hide resolved
networkx/algorithms/approximation/broadcasting.py Outdated Show resolved Hide resolved
Copy link
Contributor Author

@Transurgeon Transurgeon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dschult Thanks a lot for your patience, please have a look at my recent updates/code review following the convention we have settled for the notation.
Any feedback is welcome as always!

networkx/algorithms/approximation/broadcasting.py Outdated Show resolved Hide resolved
networkx/algorithms/approximation/broadcasting.py Outdated Show resolved Hide resolved
networkx/algorithms/approximation/broadcasting.py Outdated Show resolved Hide resolved
networkx/algorithms/approximation/broadcasting.py Outdated Show resolved Hide resolved
@Transurgeon
Copy link
Contributor Author

Hey @dschult I was wondering if you could take a look at this PR again. (sorry for pressing you)
On my end the code is looking ready for merging. I made changes according to the notation we have settled upon last time.
I am also thinking about adding some graph generators, notably the "de bruin" and shuffle-exchange graphs, would you guys be interested in adding those topologies?

Copy link
Member

@dschult dschult left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the big question is whether it makes sense to have this return both b_C and b_T as opposed to having two functions.

networkx/algorithms/approximation/broadcasting.py Outdated Show resolved Hide resolved
networkx/algorithms/approximation/broadcasting.py Outdated Show resolved Hide resolved
networkx/algorithms/approximation/broadcasting.py Outdated Show resolved Hide resolved
networkx/algorithms/approximation/broadcasting.py Outdated Show resolved Hide resolved
networkx/algorithms/approximation/broadcasting.py Outdated Show resolved Hide resolved
networkx/algorithms/approximation/broadcasting.py Outdated Show resolved Hide resolved
networkx/algorithms/approximation/broadcasting.py Outdated Show resolved Hide resolved
@Transurgeon
Copy link
Contributor Author

Hey Dan, thanks for another round of review again. I made some changes to the code but I think I still need to go over the optimization you mentioned for the broadcast time of the graph.
As for separating the first function into two different ones (one for broadcast centre and one for broadcast time) if there is no way to decouple them nicely, I don't mind keeping it that way. Please let me know your thoughts.

Copy link
Member

@dschult dschult left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for separating the first function into two different ones (one for broadcast centre and one for broadcast time) if there is no way to decouple them nicely, I don't mind keeping it that way. Please let me know your thoughts.

It looks like finding b_T and b_C are sequential: finding b_T needs to happen first. Then using b_T you can find b_C.
Since the function is named tree_broadcast_center, it needs to compute b_T and b_C. But are there cases when you just want b_T? If so, we should have a function to return b_T and tree_broadcast_center can call that function to get b_T and then use it to find b_C.

Is the "minimum broadcast number" (b_T) equal to the "tree broadcast time"? If so, it seems like in the function tree_broadcast_time, the case when node is None should just return b_T. So, I think there is a still a wording issue there. If they are not equal, then the docs need to make that more clear somehow.

@Transurgeon
Copy link
Contributor Author

As for separating the first function into two different ones (one for broadcast centre and one for broadcast time) if there is no way to decouple them nicely, I don't mind keeping it that way. Please let me know your thoughts.

It looks like finding b_T and b_C are sequential: finding b_T needs to happen first. Then using b_T you can find b_C. Since the function is named tree_broadcast_center, it needs to compute b_T and b_C. But are there cases when you just want b_T? If so, we should have a function to return b_T and tree_broadcast_center can call that function to get b_T and then use it to find b_C.

Is the "minimum broadcast number" (b_T) equal to the "tree broadcast time"? If so, it seems like in the function tree_broadcast_time, the case when node is None should just return b_T. So, I think there is a still a wording issue there. If they are not equal, then the docs need to make that more clear somehow.

You are partially right here, but there are some things I need to clarify.
In order to calculate the broadcasting centre, we do indeed need to pass b_T, but we also need to pass the dict of values labels along with the last vertex v (which is the hub of the broadcast centre).
Unfortunately I dont think we would be able to calculate the broadcast centers by just passing in b_T and the graph G, we would need the other parameters as well. And this means that it would be difficult to decouple the two (time and center).
I propose that we keep it this way. Since if someone were to need both, then they would need to go through the computation twice.
Also another clarification about your second question. The minimum broadcast time (b_T) refers to the broadcast time of a vertex in the centre (in other words the best vertex broadcast time). The docs mention that the broadcast time of an entire graph is the worst broadcast time amongst all its vertices (which is actually the opposite of b_T). This is why we need to compute the maximum of the minimum distances from every vertex to the broadcast centre.
I have made some minor changes to the docs accordingly, please let me know if anything is unclear and I will try to explain to you again.

@Transurgeon
Copy link
Contributor Author

I agree with you that the notation is a bit confusing since we use minimum broadcast time. and by definition the broadcast time of a vertex already refers to the minimum amount of time required to complete the broadcast from that vertex. However, the extra minimum in front of broadcast time (at least to my understanding) says that the particular vertex (or vertices) we are dealing with has the fastest broadcast time in the graph compared to the broadcast time of other vertices. Hopefully this makes a bit more sense to you? If not, I am happy to hear if you have suggestions on making the docs a bit more clear on that end.

@Transurgeon
Copy link
Contributor Author

@dschult Sorry for another ping.
However, as the semester is coming to an end for me, I would like you to review the comments I made on notation and the code changes I applied after your suggestions.
I am currently in the process of writing my final report and would be extremely proud to say that I managed to add these algorithms as open source code to NetworkX.
Thanks a lot for your time and continued support.

Copy link
Member

@dschult dschult left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these in the approximation subfolder? It seems like these are exact methods -- not approximations. You aren't iterating over some approximate method getting closer to the actual value. Should we move these to networkx/algorithms/broadcasting.py

You will need to connect the two new functions to the documentation. Create a new file doc/reference/algorithms/broadcasting.rst. Probably good to copy an existing one and change it to make the names right.

You can check the docs html version by clicking on the "Details" link to the right of the test that is named documentation artifact. The navigate to the page for these functions.

More comments below. Thanks!

networkx/algorithms/approximation/broadcasting.py Outdated Show resolved Hide resolved
networkx/algorithms/approximation/broadcasting.py Outdated Show resolved Hide resolved
networkx/algorithms/approximation/broadcasting.py Outdated Show resolved Hide resolved
networkx/algorithms/approximation/broadcasting.py Outdated Show resolved Hide resolved
@Transurgeon
Copy link
Contributor Author

Why are these in the approximation subfolder? It seems like these are exact methods -- not approximations. You aren't iterating over some approximate method getting closer to the actual value. Should we move these to networkx/algorithms/broadcasting.py

You will need to connect the two new functions to the documentation. Create a new file doc/reference/algorithms/broadcasting.rst. Probably good to copy an existing one and change it to make the names right.

You can check the docs html version by clicking on the "Details" link to the right of the test that is named documentation artifact. The navigate to the page for these functions.

More comments below. Thanks!

You are right, I initially wanted to contribute more algorithms which were heuristics/approximations but I started off with this one since it was the easiest. It is probably better to move these two functions to where you suggested.
I will also apply the other changes you suggested.

@Transurgeon
Copy link
Contributor Author

Hey @dschult thank you so much again for your review. I have tried to generate the docs but don't seem to see the broadcasting subpage in the algorithms section. Could you please help me double check that the algorithm/broadcasting.rst file was properly implemented?

- Each call requires one unit of time.
- A node can only participate in one call per unit of time.
- Each call only involves two adjacent nodes: a sender and a receiver.
"""
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also moved a big part of the docs for the tree_broadcast_center here since it seems like there was a requirement for docstrings for the module. I tried to imitate the example laid out in the boundary module of the algorithms. Please let me know if this looks ok on your end.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. The module doc_string is helpful to give some context along with the list of functions.

Would it be worthwhile working through a simple example. I certainly needed that to understand what was going on. Something like a star graph? maybe:

As an example, consider a star graph with 4 edges connecting 5 nodes. Let one of the non-center nodes be the originator. On step one, they broadcast to the center. On step two, the originator has no one else to broadcast to. But the center node broadcasts to one of the non-originator nodes. Each time-step thereafter the center node broadcasts to another "naive" node. The broadcast time for that node is 4 time steps. If the center node is the originator, they broadcast to one node on each time step. The receiving nodes cannot help due to lack of connections to other nodes. The broadcast time for the central node is 4 time steps. That implies that the broadcast time for the star graph with m edges is m.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we should add more examples. However the one you pointed to above seems a bit less formal to be included in the docs. I think the examples in the unit tests were quite good, maybe I can include some code in this part of the docs? Like the example below.
image

networkx/algorithms/broadcasting.py Show resolved Hide resolved
Copy link
Member

@dschult dschult left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use the word "graph" or "tree" in the docs. The function works for trees. Do the framework and definitions work for regular graphs too? It seems to me that it could all work for graphs too, but the resulting tree used by the broadcasting would depend on the originator -- basically a bfs tree (or min spanning tree) rooted at the originator.

If there isn't a larger mature general-graph literature for broadcasting maybe we should use "tree" instead of graph everywhere. And in that case, this should probably be moved again. This time to algorithms/tree/broadcasting.py

To connect the broadcasting.rst file to the docs you need to add an entry in
doc/reference/algorithms/index.rst. If we move it again to the networkx/algorithms/tree directory then the docs rst info should not be in a separate file but in the doc/reference/algorithms/tree.rst file.

networkx/algorithms/broadcasting.py Show resolved Hide resolved
networkx/algorithms/broadcasting.py Outdated Show resolved Hide resolved
networkx/algorithms/broadcasting.py Outdated Show resolved Hide resolved
- Each call requires one unit of time.
- A node can only participate in one call per unit of time.
- Each call only involves two adjacent nodes: a sender and a receiver.
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. The module doc_string is helpful to give some context along with the list of functions.

Would it be worthwhile working through a simple example. I certainly needed that to understand what was going on. Something like a star graph? maybe:

As an example, consider a star graph with 4 edges connecting 5 nodes. Let one of the non-center nodes be the originator. On step one, they broadcast to the center. On step two, the originator has no one else to broadcast to. But the center node broadcasts to one of the non-originator nodes. Each time-step thereafter the center node broadcasts to another "naive" node. The broadcast time for that node is 4 time steps. If the center node is the originator, they broadcast to one node on each time step. The receiving nodes cannot help due to lack of connections to other nodes. The broadcast time for the central node is 4 time steps. That implies that the broadcast time for the star graph with m edges is m.

networkx/algorithms/broadcasting.py Show resolved Hide resolved
@Transurgeon
Copy link
Contributor Author

Should we use the word "graph" or "tree" in the docs. The function works for trees. Do the framework and definitions work for regular graphs too? It seems to me that it could all work for graphs too, but the resulting tree used by the broadcasting would depend on the originator -- basically a bfs tree (or min spanning tree) rooted at the originator.

If there isn't a larger mature general-graph literature for broadcasting maybe we should use "tree" instead of graph everywhere. And in that case, this should probably be moved again. This time to algorithms/tree/broadcasting.py

To connect the broadcasting.rst file to the docs you need to add an entry in doc/reference/algorithms/index.rst. If we move it again to the networkx/algorithms/tree directory then the docs rst info should not be in a separate file but in the doc/reference/algorithms/tree.rst file.

There are many more algorithms that work for general classes of graphs, but like I mentioned before, they involve approximations/heuristics since the broadcasting problem is NP-hard. In the future, would it be fine to include these new approximation algorithms in this module?
These two algorithms are to my knowledge the only ones related to trees in the broadcasting literature. So I don't believe they should be included in the algorithms/tree/broadcasting path.
Let me know what are your thoughts on this, I personally feel like it's fine to keep it as is for the moment.
P.S. Thanks to your help, I was able to generate the docs correctly, it seems to be working very well.

Copy link
Member

@dschult dschult left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed the style issue (extra space) that my suggestion put in... So you will need to pull down your fork to local before any future changes.

I approve this PR. Now we can get another set of eyes on it.

I think we should leave the functions in this module because the ideas can clearly be extended to graphs from trees (though it gets more complicated).

Thanks!

Copy link
Contributor

@rossbar rossbar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went ahead and pushed up a few minor touchups and rebased on main to ensure new tests/lints run fine. This LGTM, thanks @Transurgeon !

@rossbar rossbar merged commit d71ce11 into networkx:main Mar 7, 2024
41 checks passed
@jarrodmillman jarrodmillman added this to the 3.3 milestone Mar 7, 2024
cvanelteren pushed a commit to cvanelteren/networkx that referenced this pull request Apr 22, 2024
New functions to compute tree broadcast time for undirected
graphs.

Co-authored-by: Transurgeon <peter.zijie@gmail.com>
Co-authored-by: Dan Schult <dschult@colgate.edu>
Co-authored-by: Ross Barnowski <rossbar@berkeley.edu>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

None yet

5 participants