-
Notifications
You must be signed in to change notification settings - Fork 144
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactored graph matching code and added many new features #960
Conversation
@daxpryce I think I've addressed everything as well as all the stuff I said I was gonna do, lmk if you have any other comments. I bumped the major version since this is API-breaking - hoping to release soon after this gets in |
@dokato would you have any interest in reviewing this PR (no pressure if not). If so, lmk and I can get you the permissions. |
@bdpedigo sure, I'll have a look. I tested parts of it already anyway. |
Big note: I think it would be useful to include the actual estimated permutation matrix, Phat, as a return for all graph matching related things. The place this might become hairy is in the case of paddings being performed, so perhaps there is some minor discussion to take place (possibly) by zoom later today. Some notes:
|
@ebridge2 some responses you thoughts above:
Can you explain why? I can see why for the book, but I think for most practices you never actually want to see this matrix, and it can be greated in one line. And I do feel it adds some complexity, as you mention. We could show how to make this matrix easily, and explain?
I designed these to behave exactly like I dislike the idea of including
Are you talking about |
Sorry, busy weekend. I accepted the invite now @bdpedigo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good, I tested it on a few experiments, so far all run smoothly. A few minor things to consider.
Reference Issues/PRs
Fixes #959
Fixes #858
Closes #792
Closes #425
Closes #346
What does this implement/fix? Highlights
Design decisions
A
and the second to the input matrixB
. For cases where the two matrices are the same, then the first set of indices is basically worthless: it will always be just the sorted indices ofA
, or in otherwordsnp.arange(n)
. This is much like what happens withlinear_sum_assignment
in scipy. However, for differently sized matrices (in particular whenA
is bigger thanB
), then not everyone inA
will get a match, so this matters. See Understanding output of Graph Matching for unequal sized matches #925 for some discussion (it was confusing what happens with differently sized matrices).transport
functionality simply as extra parameters - this is mainly because all of the other parameters would be the same, and I felt it would be annoying to have a completely separate function to do this small adjustment to the matching algorithm.transport=False
. Tons of parameters can be lame. But I did try to alleviate this by labeling those parameters with thetransport_
prefix.numpy.random.Generator
syntax. This is because theRandomState
syntax is considered legacy by numpy so I don't think any new code should bend over backwards to support it, IMO https://numpy.org/doc/stable/reference/random/legacy.html#legacynp.einsum
but it is very hard to read and reason about, IMO.linear_sum_assignment
is called, since this is the only place this matters. I think the computational cost of the shuffle is minimal, and this makes the rest of the code easier to reason about since we dont have as many shuffle/unshuffle operations to do.Remaining work [Done now]
This is currently a work in progress, making the PR to document progress and what needs to be done: