Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

updated functions in core.py #7027

Merged
merged 9 commits into from Nov 17, 2023
Merged

Conversation

Schefflera-Arboricola
Copy link
Contributor

@Schefflera-Arboricola Schefflera-Arboricola commented Oct 18, 2023

  1. Added examples for core_number(G) and k_core(G, k) algorithms.
  2. Added @not_implemented_for("multigraph") to k_core() , k_shell() , k_crust() and k_corona()

@Schefflera-Arboricola Schefflera-Arboricola marked this pull request as draft October 18, 2023 18:41
@Schefflera-Arboricola Schefflera-Arboricola marked this pull request as ready for review October 18, 2023 18:42
@dschult
Copy link
Member

dschult commented Oct 18, 2023

The functions now raise for a multigraph, but they still don't raise when self-loops are present. (except for k_truss for some reason.)

Also note that this changes the previous behavior -- before this, a MultiGraph object with no multi-edges worked. So, if we decide to do it this way, we'll need to deprecate that behavior for two minor version releases. That means we can't use the not_implemented_for decorator until the deprecation period ends.

The wording in the doc_strings does not align with NetworkX terminology either. It talks about parallel edges instead of multiedges, for example. sigh

Seems like k_truss at least checks for no self-loops. but I don't see any tests for either selfloops or multiedges.

@Schefflera-Arboricola
Copy link
Contributor Author

Schefflera-Arboricola commented Nov 5, 2023

@dschult , In the recent 3 commits, I have made the following changes :

  • added test_k_truss_self_loop
  • some fixes in the docs and added Examples for all the functions in core.py.
  • removed some points from Notes because they were also in the Raises section.
  • removed “parallel edges” from some places because I don’t think we can have parallel edges in undirected graphs bcoz :
import networkx as nx
G = nx.Graph()
G.add_edge(1, 2, weight=5, id=1)
G.add_edge(2, 1, weight=8, id=2)
path=nx.shortest_path(G, weight='weight')
print(path) 
for u, v in nx.utils.pairwise(path):
    print(G[u][v]['id']) 

# output :
# {1: {1: [1], 2: [1, 2]}, 2: {2: [2], 1: [2, 1]}}
# 2  # this should have been 1 

The k_core, k_shell, k_crust, and k_corona doesn't raise any exceptions when the given core_number dictionary is different from the core_number dictionary generated by the core_number(G)(see this comment). Is that a bug that needs to be fixed or do we want the user to have the flexibility to define core_numbers as they want? even if it's the latter, I think we should still have some checks to validate the given core_number dictionary. If it’s the former, then we would have to generate the core_number dictionary to check if it’s the same as the given dictionary, so then I guess it would be better to not take the core_number dictionary and instead just compute it inside the functions. Please let me know your thoughts on this.

Thank you :)

@dschult
Copy link
Member

dschult commented Nov 5, 2023

I think your example for parallel edges does give the correct paths, but not the correct edge id because you need a MultiGraph to store multiple edges between 1 and 2. So the graph you created just overwrote the original edge and there is still only one edge between 1 and 2. If you change the first line to G = nx.MultiGraph(), I think it gets the result you want.

As for the optional core_number input argument, this is to avoid computing the core_numbers many times for the same information. Since that it is the time-consuming part of the other functions, a user that wants to use more than one of those functions can compute core_number once and then use the result to compute the other values. They should not expect to create their own core_numbers. It just saves some work if the work is already available to the user.

@Schefflera-Arboricola
Copy link
Contributor Author

@dschult I've updated the docs according to the code. I added another exception in the Raises section of the functions using the core_number() function internally, for when the core_number dictionary is not given and the graph is a Multigraph object. Please let me know if this looks good to you.

Thank you :)

Copy link
Member

@dschult dschult left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make all appropriate errors NetworkXNotImplemented (e.g. by changing line 581). But in all the functions, we can make it raise that exception for self-loops and for multigraph and/or directed.

The history is that NetworkXError used to be the only exception type with messages determining what the problem is. We shifted to multiple exception types later and have only incrementally been updating exceptions. This seems like a good time to do that. It also simplifies the Raises section.

Also, I think we should not mention that the code could work for multigraphs if the core_number is provided. While a carefully constructed dic for core_number might let the code give an answer, there is no well-defined core number for multigraphs and so the returned answer would not make sense. The core_number input is supposed to be created by core_number and that doesn't allow multigraphs. So, let's just return to the raises comment that the function is not defined for graphs with self loops or parallel edges. (with directed in there too when appropriate).

The wording involving dicts and the meaning of the key and value was carefully chosen to be short and hopefully consistent with the following syntax:

... a dict keyed by  <some description of the key> to  <some description of the value> 

e.g.: A dict keyed by node to its core number.

The import statements in each example should be removed. All examples in NetworkX assume that import networkx as nx has been done before the example is used.

I like the vertex -> node changes and the examples seem good! :}
Thanks

Comment on lines 545 to 548
NetworkXError
The onion decomposition is not implemented for graphs with self loops
or parallel edges or for directed graphs.
The onion decomposition is not implemented for graphs with self loops.
NetworkXNotImplemented
If `G` is a Multigraph or Directed graph.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make both of these errors NetworkXNotImplemented (line 581).
The other functions can be handled similarly. The history is that NetworkXError used to be the only exception type with messages determining what the problem is. We shifted to multiple exception types later and have only incrementally been updating exceptions. This seems like a good time to do that.

@Schefflera-Arboricola
Copy link
Contributor Author

@dschult thank you for the review. I've made the suggested changes. Please let me know if this looks good to you.

I think we should add @not_implemented_for("multigraph") decorator to the k_core, k_shell, k_crust and k_corona functions because all these are not defined for multigraphs and it would be less confusing this way. I can create a PR to deprecate the behavior of these 4 algorithms once this PR is merged to avoid any merging conflicts or add a commit to this PR. Let me know your thoughts on this.

Thank you :)

Copy link
Member

@dschult dschult left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we combine the two reasons for nx.NetworkXNotImplemented since the type of the error is the same? Something like:
The k-core is not implemented for graphs with self-loops or for multigraphs.

I will ask that you not use capital letters for Multigraph and Directed graph so they don't get confused with the classes (which has capital letters to start each part of the name). Capitals at the start of the sentence are fine. Otherwise lets pretend that multigraph is a real word and use lowercase. :) Pretty nit-picky... but it's good to be nit-picky sometimes. :}

For the calls of the exceptions, can we use the prefix nx. instead of the special import at the top? This might change other code, but it is arguably more clear and avoids an import at the top. There is a simmilar issue/question about whether to import not_implemented_for specially at the top, or use nx.utils.not_implemented_for(... in every call. I prefer the latter because if you forget where the function is from, you don't have to scroll to the top to find out. It says where it is from. Both forms show up a lot in networkx for sure. And I'm not sure if there is universal agreement on this issue. :}

Thanks for these changes!

@Schefflera-Arboricola
Copy link
Contributor Author

@dschult thank you for the review. I've made the suggested changes. Please let me know if this looks good to you :)

Copy link
Member

@dschult dschult left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm approving this -- looks good.

Just as I say that, I notice that you used .nodes() in the examples to create the NodeView. That is probably preferred to be the attribute (though the method formulation is available for backward compatibility). So, just .nodes is perhaps more aligned with .nodes and .edges being attributes. I'm Ok with leaving it this way too.

I think this is ready for another pair of eyes! :}

Copy link
Member

@dschult dschult left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me!

Copy link
Contributor

@rossbar rossbar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went ahead and applied the "parallel edges" -> "multigraph" feedback to the other remaining instances in the docstrings.

Let's make all appropriate errors NetworkXNotImplemented (e.g. by changing line 581). But in all the functions, we can make it raise that exception for self-loops and for multigraph and/or directed.

One unfortunate consequence of changing the exception type is that this will break folks who are explicitly catching NetworkXError exceptions in try-except statements. If NetworkXNotImplementedError inherited from NetworkXError I don't think this would be an issue, but that's not currently the case.

This is a very minor API break as a result of the cleanup, so I'm not inclined to have it be a blocker - I just wanted to make sure others are aware of the behavior change!

Everything else LGTM so I'll approve but let someone else merge after seeing this comment!

@Schefflera-Arboricola
Copy link
Contributor Author

@rossbar , but we can pass multigraph objects in the k_core , k_shell , k_crust and k_corona like this :

import networkx as nx
G = nx.Graph()
G.add_edges_from([(1, 2),(2, 3)])
cn = nx.core_number(G)
G = nx.MultiGraph(G)
print(nx.k_shell(G, k=1, core_number = cn))
# output : MultiGraph with 3 nodes and 2 edges

So, there won't be any NetworkXNotImplemented exceptions raised in the above case.

Also, earlier @dschult suggested to keep it parallel edges only, when I added this note in the Raisessection about the multigraphs :

Also, I think we should not mention that the code could work for multigraphs if the core_number is provided. While a carefully constructed dic for core_number might let the code give an answer, there is no well-defined core number for multigraphs and so the returned answer would not make sense. The core_number input is supposed to be created by core_number and that doesn't allow multigraphs. So, let's just return to the raises comment that the function is not defined for graphs with self loops or parallel edges. (with directed in there too when appropriate).

So that's why I suggested maybe we should consider adding @not_implemented_for("multigraph") decorator to these 4 algorithms.

I think we should add @not_implemented_for("multigraph") decorator to the k_core, k_shell, k_crust and k_corona functions because all these are not defined for multigraphs and it would be less confusing this way. I can create a PR to deprecate the behavior of these 4 algorithms once this PR is merged to avoid any merging conflicts or add a commit to this PR. Let me know your thoughts on this.

Thank you :)

@rossbar
Copy link
Contributor

rossbar commented Nov 14, 2023

but we can pass multigraph objects in the k_core , k_shell , k_crust and k_corona like this

The issue doesn't have to do with the multigraph instances, but rather with the changing exception type itself. Here's a concrete example; the following code will run fine in networkx 3.2:

>>> import networkx as nx
>>> G = nx.complete_graph(5)
>>> G.add_edge(1, 1)
>>> try:
...     nx.core_number(G)
... except nx.NetworkXError:
...     print("Caught it")
Caught it

However, because this PR changes the exception type from NetworkXError -> NetworkXNotImplementedError, the above try/except statement will no longer gracefully catch the exception:

>>> try:
...     nx.core_number(G)
... except nx.NetworkXError:
...     print("Caught it")
Traceback (most recent call last)
   ...
NetworkXNotImplemented: Input graph has self loops which is not permitted; Consider using G.remove_edges_from(nx.selfloop_edges(G)).

In practice, this means that any user code which is catching exceptions related to the k_ functions may now result in exceptions when previously the code did not fail. Because there are many users of NetworkX, we always try to keep a look out for instances where changing code within NetworkX is likely to break the code of downstream users who depend on it!

Just to restate my opinion from above though: I'm willing to try sneaking this breaking change in, since IMO it is likely to affect only a relatively small number of users (i.e. those explicitly catching exceptions from k_ functions).

@Schefflera-Arboricola
Copy link
Contributor Author

@rossbar I was able to understand your point on the effects of changing NetworkXError to NetworkXNotImplementedError. But, in the above comment, I was referring to the "parallel edges" -> "multigraph" commit that you made. Sorry for not explicitly mentioning it.

Thank you :)

@dschult
Copy link
Member

dschult commented Nov 14, 2023

I think this looks good as it is right now.

It is true that people can get the wrong results if they submit a core_number that is not accurate. And by messing with this in strange ways they can provide a core_number input that is valid for a multigraph (so long as it doesn't have any multiedges). But that actually gives the right answer! :} To make it give the wrong answer you have to change the graph in some way between computing the core_number of G and computing e.g. it's k_core while using that core_number. But this is natural. If you change the graph the k_core will change, so I would hope that people would only use the input core_number if they haven't changed the graph.

The term multigraph versus parallel edges in the doc_string doesn't bother me. I want to avoid MultiGraph because that refers to a class, but multigraph is fine. Another related term is multiedges which I know is used in other places. I don't think we use parallel edges very often. So this shift to multigraph makes the docs align better with the rest of the package. (I think the term "parallel" in "parallel edges" is strange because edges aren't always straight. So brining in terms from geometry might imply things we don't mean. Playing with the words -- and making up new ones like multiedge -- is more fun anyway.)

:)

@Schefflera-Arboricola Schefflera-Arboricola changed the title added examples in core.py updated functions in core.py Nov 16, 2023
@dschult
Copy link
Member

dschult commented Nov 17, 2023

OK... Thanks for this!
With two approvals and a thumbs up by the OP, I'm going to merge it. :)

@dschult dschult merged commit ca32cca into networkx:main Nov 17, 2023
38 checks passed
@jarrodmillman jarrodmillman added this to the 3.3 milestone Nov 17, 2023
cvanelteren pushed a commit to cvanelteren/networkx that referenced this pull request Apr 22, 2024
* added examples in core.py

* updated core.py docs

* added test_k_truss_self_loop

* small bug fix in test_k_truss_self_loop

* updated core.py

* updated core.py file

* inline implementation for decorators and exceptions and modified  sections

* changed .nodes() to .nodes

* Replace parallel edges with multigraph in docstrings.

---------

Co-authored-by: Ross Barnowski <rossbar@berkeley.edu>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

None yet

4 participants