New algorithm for ranking spanning arborescences for (#4322) #4796

edxu96 · 2021-05-13T09:53:03Z

I have done:

Use black.
Run all the tests for algorithms and docs locally.
Unit tests for core classes and functions used by these algorithms.
Use decorator for graph type.

Known problems, as far as I know:

All edges are assumed to have the specified numerical attribute.
There is only one test case, which has 5 spanning arborescences.
The passed graph is deep-copied and frozen inside the generator, in case it is modified later.
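
On the limited test coverage: one way to get more test cases for small graphs is a brute-force reference that enumerates every spanning arborescence and sorts them by total weight. The sketch below is only an illustration under that assumption. It is not the algorithm in this PR and not the one from the paper, it is exponential in the number of nodes, and the names `rank_spanning_arborescences` and `_reaches_root` are hypothetical.

```python
from itertools import product


def _reaches_root(node, parent, root):
    """Return True if following parent pointers from ``node`` ends at ``root``."""
    seen = set()
    while node != root:
        if node in seen:  # walked into a cycle that never reaches the root
            return False
        seen.add(node)
        node = parent[node]
    return True


def rank_spanning_arborescences(edges, descending=True):
    """Brute-force ranking of spanning arborescences (small graphs only).

    ``edges`` maps each (source, target) 2-tuple to a numeric weight.
    Returns a list of (total_weight, edge_tuple) pairs ordered by weight.
    """
    nodes = {n for edge in edges for n in edge}
    # Candidate incoming edges for each node, ignoring self-loops.
    incoming = {v: [e for e in edges if e[1] == v and e[0] != v] for v in nodes}
    ranked = []
    for root in nodes:
        others = [v for v in nodes if v != root]
        # Every non-root node picks exactly one incoming edge.
        for choice in product(*(incoming[v] for v in others)):
            parent = {v: u for u, v in choice}
            if all(_reaches_root(v, parent, root) for v in others):
                ranked.append((sum(edges[e] for e in choice), tuple(choice)))
    ranked.sort(key=lambda item: item[0], reverse=descending)
    return ranked


example_edges = {(0, 1): 5, (0, 2): 1, (1, 2): 4, (2, 1): 2}
ranking = rank_spanning_arborescences(example_edges)
```

The output of a faster implementation can then be compared against `ranking` on randomly generated small digraphs.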

Allow user to specify an attribute for labels (#15)

Rebase on the latest version

networkx/algorithms/tree/rank.py

dschult

These changes might get this through the tests...
Also, the test file should be formatted with black...

dschult · 2021-06-13T17:57:09Z

This looks like it is going to be very hard to review.
You've got Type Hinting which we don't use in the package itself.
You are subclassing the base classes with multiple subclasses when all it seems like you are doing is adding data structures and methods to find the source and target of edges. My first quick look suggests that all of this can be handled using NetworkX's paradigm of edges as 2-tuples (source, target). No complex class structure is needed. The type hinting suggests that nodes and edges are going to be strings, which makes it hard to incorporate into the NetworkX environment, where nodes are any non-None hashable and edges are 2-tuples (or 3-tuples for MultiEdges).

It would also be helpful if you didn't put what are essentially docstrings in the middle of code as if they were comments. Put the docstrings just after the viewable function or class signature so they get processed by our autodoc Sphinx tools into the reference manual for the package. Docstrings on __init__ are OK... but will never get seen, so you should put almost everything into the class docstring, especially the parameters that the user should provide. It might also help to have a general overview of the whole module in the docstring at the top of the module so someone reading the code (or looking at this module to try to figure out whether it is what they want) can get a handle on the almost 1700 lines of code.

There seems to be a lot of unnecessary code in this module. Subclasses that only provide an __init__ method that only calls the __init__ from the base class aren't needed. Also, use Python and NetworkX data structures where possible, like 2-tuples for edges.

I guess a summary of all these comments is a request that you go through the code again with the goal of refactoring it into a minimalist, sleek set of functions and data structures.

The code clearly is a lot of work and it passes the tests and provides a nice feature. Thank you for this!
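
To illustrate the docstring request above, here is a minimal sketch of the layout being asked for. The class name `RankedArborescences` and its parameters are hypothetical and are not taken from the PR. The user-facing description and parameters sit in the class docstring directly after the signature, and implementation notes stay in ordinary comments.

```python
import inspect


class RankedArborescences:
    """Iterate over spanning arborescences in order of total weight.

    Parameters
    ----------
    edges : dict
        Maps each edge, given as a ``(source, target)`` 2-tuple, to a number.
    descending : bool, optional (default=True)
        If True, heavier arborescences come first.
    """

    def __init__(self, edges, descending=True):
        # Implementation notes belong in comments like this one. The
        # user-facing documentation lives in the class docstring above,
        # where documentation tools can find it.
        self.edges = dict(edges)
        self.descending = descending


# The class docstring is what introspection tools report for the class.
class_doc = inspect.getdoc(RankedArborescences)
```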

edxu96 · 2021-06-13T21:29:15Z

Thanks for the reply.

> You've got Type Hinting which we don't use in the package itself.

No problem. Will adapt it to networkx's style.

> You are subclassing the base classes with multiple subclasses when all it seems like you are doing is adding data structures and methods to find the source and target of edges. My first quick look suggests that all of this can be handled using NetworkX's paradigm of edges as 2-tuples (source, target).

This is the first thing I thought about when implementing this algorithm. The paper [camerini1980ranking] uses a data structure with three dictionaries (for the sources, targets, and weights of edges) throughout. There is no other paper or code implementation in the past 41 years, as far as I know. I have studied basic graph algorithms systematically and (I think) am familiar with networkx. The way I see it, redesigning and coding this algorithm more efficiently with a better data structure will take a long time.
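
As a small sketch of how the two views could meet, the three lookups for sources, targets, and weights can be keyed by plain 2-tuple edges. This assumes a simple directed graph without parallel edges, and the helper name `edge_maps` is hypothetical. Whether this is enough for the paper's algorithm is not something this sketch establishes.

```python
def edge_maps(weighted_edges):
    """Build source, target, and weight lookups keyed by 2-tuple edges.

    ``weighted_edges`` is an iterable of (source, target, weight) triples.
    """
    source, target, weight = {}, {}, {}
    for u, v, w in weighted_edges:
        edge = (u, v)  # the edge itself is the dictionary key
        source[edge] = u
        target[edge] = v
        weight[edge] = w
    return source, target, weight


# Nodes only need to be hashable, so strings, ints, and tuples can be mixed.
source, target, weight = edge_maps([("a", 1, 2.5), (1, (2, 3), 4.0)])
```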

> The type hinting suggests that nodes and edges are going to be strings, which makes it hard to incorporate into the NetworkX environment where nodes are any non-None hashable and edges are 2-tuples (or 3-tuples for MultiEdges).

I would suggest that users stick to strings as node names. This is not an algorithm that someone can just play around with ... and at least I am not able to design any unit tests for more generic cases.

> It would also be helpful if you didn't put what are essentially docstrings in the middle of code as if they were comments. Put the docstrings just after the viewable function or class signature so they get processed by our autodoc Sphinx tools into the reference manual for the package. Docstrings on __init__ are OK... but will never get seen, so you should put almost everything into the class docstring, especially the parameters that the user should provide. It might also help to have a general overview of the whole module in the docstring at the top of the module so someone reading the code (or looking at this module to try to figure out whether it is what they want) can get a handle on the almost 1700 lines of code.

No problem. Will do that.

> There seems to be a lot of unnecessary code in this module. Subclasses that only provide an __init__ method that only calls the __init__ from the base class aren't needed.

Will do that. (I didn't implement this algorithm on its own. It was integrated into a huge program in my thesis, and it took me a long time to decouple it from the rest.)

> refactoring it into a minimalist, sleek set of functions and data structures.

Will see what I can do after solving those easy ones. It might be very hard to break them down any further. In [camerini1980ranking], three giant algorithms, written in extremely terse mathematical terms, are presented directly, along with some weird definitions ...

I did learn a lot by implementing it. It works okay with a larger case in my thesis, so I passed the defense :-)

[camerini1980ranking] Camerini, P. M., Fratta, L., & Maffioli, F. (1980). Ranking arborescences in O(Km log n) time. European Journal of Operational Research, 4(4), 235-242.

rossbar

It seems like this one is stuck. I'm +1 for the idea, but -1 for the implementation that introduces many new graph classes which (IMO) makes the code very difficult to understand. Even if the newly-introduced subclasses were made private (which would be necessary IMO), it still wouldn't help the readability very much as the OO design introduces a lot of indirection and new interfaces to learn before you can start to get down to the algorithm itself.

mjschwenne · 2022-06-28T14:41:39Z

I believe that this functionality has already been implemented with the ArborescenceIterator, documentation here, although the complexities of the two methods are different, with this one performing better for graphs with more arborescences (either because they are larger or denser).

To get the $K$ maximum arborescences, create a maximum ArborescenceIterator and break the loop after $K$ arborescences have been generated.
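
A minimal sketch of that approach, assuming a NetworkX version that ships `ArborescenceIterator` in `networkx.algorithms.tree.branchings`. The helper name `k_best_arborescences` and the example graph are made up for illustration, and `itertools.islice` stands in for an explicit loop with `break`.

```python
from itertools import islice

import networkx as nx
from networkx.algorithms.tree.branchings import ArborescenceIterator


def k_best_arborescences(G, k, weight="weight"):
    """Return up to ``k`` spanning arborescences of ``G``, heaviest first."""
    # minimum=False asks the iterator for decreasing total weight.
    iterator = ArborescenceIterator(G, weight=weight, minimum=False)
    # islice stops after k items, like breaking out of a for loop.
    return list(islice(iterator, k))


G = nx.DiGraph()
G.add_weighted_edges_from([(0, 1, 5), (0, 2, 1), (1, 2, 4), (2, 1, 2)])

K = 2
top_k = k_best_arborescences(G, K)
for T in top_k:
    print(sorted(T.edges()), T.size(weight="weight"))
```

Each yielded item is a directed graph, so its total weight can be read with `size(weight=...)` as in the loop above.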

edxu96 · 2022-06-28T18:56:17Z

> I believe that this functionality has already been implemented with the ArborescenceIterator, documentation here, although the complexities of the two methods are different, with this one performing better for graphs with more arborescences (either because they are larger or denser).

> To get the K maximum arborescences, create a maximum ArborescenceIterator and break the loop after K arborescences have been generated.

I agree that this algorithm does the same thing as ArborescenceIterator. Both also create an iterator. Thank you both for pointing it out.

I agree the implementation is not good, because I followed [camerini1980ranking] exactly. The paper itself mentions that the algorithm is not efficient, since it was the first one found for this problem. The type annotations I added are not accurate, so some work is needed to pass the mypy checks.

edxu96 and others added 18 commits January 3, 2021 15:09

Format w/ black

f3a2b74

initial upload existing code

22bf99a

Add API in docs (#6)

735bdc2

Use any numerical edge attribute as weights (#2)

2c26a7c

Merge branch 'master' of https://github.com/edxu96/networkx

cab68a9

Returned SA still use weight attr (#14)

b79f976

Avoid freezing the original graph (#8)

1144382

Label edges automatically if told so (#13)

80e23de

Allow user to specify an attribute for labels (#15)

74d2cff

Merge pull request #16 from edxu96/label

e0a50e7

Allow user to specify an attribute for labels (#15)

Auto label edges if no user-specified label (#17)

039ce2b

Remove dependency on pandas (#11)

8b4a664

Descending and ascending (#5)

8d23634

Use decorator for accepted graph type (#10)

45c3300

Use built-in networkx exceptions (#19)

3993e5c

Merge pull request #20 from networkx/main

466dc7b

Rebase on the latest version

Remove dependency on loguru (#7)

68b3dd5

Update tree.rst and release_dev.rst (#21)

9408ce8

edxu96 mentioned this pull request May 13, 2021

New algorithm for ranking spanning arborescences #4322

Closed

Merge branch 'main' into algo-rank-for-issue-4322

8a304ec

dschult reviewed Jun 12, 2021

networkx/algorithms/tree/rank.py (outdated)

dschult reviewed Jun 12, 2021

networkx/algorithms/tree/rank.py (outdated)

dschult reviewed Jun 12, 2021

edxu96 and others added 4 commits June 13, 2021 17:59

Merge branch 'networkx:main' into algo-rank-for-issue-4322

52f04b2

minor changes

4d9ec1c

shorted cls docstring

23b525a

merge latest

260b56a

merge from the latest

1f0924f

Replace isomorphism/matchhelpers.close with math.isclose (#23)

9cd9dde

rossbar reviewed Jun 28, 2022

edxu96 closed this Jun 28, 2022
