Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Polymorphic isomorphism #5031

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

pinselimo
Copy link
Contributor

This PR is a proposal to resolve issue #4978 by enabling the isomorphism functions to distinguish multiedge input from other input and act accordingly.
For backwards compatibility the old style multiedge functions can still be called, albeit internally using the new style functions.

The second commit of this PR changes two for loops with if clauses to use the all function as I find it much more readable and it short-circuits just fine, so speed should be equal. Feel free to remove this change from the PR however.

@dschult
Copy link
Member

dschult commented Sep 9, 2021

Thanks for this!
Do these need tests of some sort? What is the status of tests for match helpers?

@dschult dschult added this to the networkx-2.7 milestone Sep 9, 2021
@pinselimo
Copy link
Contributor Author

Of the three match helpers only two are tested currently. So if this is an objective, coverage could be improved significantly. Testing did however not get worse with the additions.

@rossbar
Copy link
Contributor

rossbar commented Dec 12, 2021

I think this makes sense... I wonder if we couldn't go a step further and boil the three methods (*_node_match, *_edge_match, and *_multiedge_match) down into one. AFAICT the _edge_match functions are just programatically generated copies of the _node_match functions. If we can fold in the handling of multiedges, then we could potentially deprecate the _edge_, _node_, and _multiedge_ distinctions and just have categorical_match, numerical_match, and generic_match.

Any thoughts on this @pinselimo ? Note that I'm not suggesting this has to be done in this PR, I'm only asking because you've clearly looked at this some.

Resolves issue networkx#4978 by enabling the isomorphism functions to
distinguish multiedge input from other input and act accordingly.
For backwards compatibility the old style multiedge functions can still
be called, albeit internally using the new style functions.
@pinselimo
Copy link
Contributor Author

I'm all for it. This would definitely simplify the API. Also, removing the doc-patching would clean the module up a lot.
Most importantly, consider the following case: A graph has both edge and node attributes with key A. Using a categorical_node_match conveys to the user that only node attributes are compared. This is not the case, right?
So categorical_match would also help avoid such confusion.

Should I open another PR?

(and sorry for the double force push, forgot to run black first)

@rossbar
Copy link
Contributor

rossbar commented Dec 14, 2021

I'm all for it. This would definitely simplify the API. Also, removing the doc-patching would clean the module up a lot.

Agreed, this is exactly what I was thinking. I've never used the matchhelpers in practice, nor does it seem like they are used internally in any of the isomorphism functionality, so I have no baseline for determining whether there are cases where the current API is favorable. Does anyone have thoughts/opinions on this or maybe any insight on the history here, @boothby @dschult ? I looked at the git history for the matchhelpers module but it wasn't terribly informative.

@dschult
Copy link
Member

dschult commented Dec 26, 2021

I understand the desire for fewer functions overall. But in this case it comes at the price of putting the if clauses inside a critical function for the speed of the algorithm. The match functions created here are called a huge number of times. Generally, they will be called with the same types of inputs (e.g. multigraph data dicts). Is it really better to include conditionals inside these inner-loop functions? Maybe some performance studies are in order.

We could reduce the number of functions and still make them fast by including keyword arguments for each option of type. We'd just have to make sure the if clauses stay outside of the match function. But even with that, which is the better API? match_functions(attr, default, value_type="numerical", object_type="node", multiedge=False) or the collection of functions: numerical_node_match(attr, default).

@rossbar rossbar modified the milestones: networkx-2.7, networkx-3.0 Feb 12, 2022
@rossbar rossbar modified the milestones: networkx-3.0, networkx-3.1 Nov 29, 2022
@jarrodmillman jarrodmillman removed this from the 3.2 milestone Jul 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

None yet

4 participants