
ENH: Allow QAP to accept adjacency matrices of different sizes #13034

Closed
wants to merge 7 commits into from

Conversation

asaadeldin11
Contributor

Reference issue

What does this implement/fix?

This PR adds "padding" to the quadratic_assignment() function in scipy.optimize, allowing users to input matrices A and B of different sizes. Two padding schemes are implemented, and the user can choose between them: "adopted" (the default) matches the appropriate induced subgraphs, while "naive" matches the appropriate subgraphs. More information can be found in section 2.5 of "Seeded Graph Matching".
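As a rough sketch of the two schemes (this is an illustration, not the PR's actual code; the function name `pad_matrices` and its exact signature are made up for the example):

```python
import numpy as np

def pad_matrices(A, B, method="adopted"):
    """Illustrative sketch: pad the smaller of A and B with zeros so
    both adjacency matrices have the same shape.

    With "adopted", both matrices are first mapped to 2*X - J (J the
    all-ones matrix), following section 2.5 of the Seeded Graph
    Matching paper, so that padded zero entries act as neutral rather
    than reading as absent edges.
    """
    n_small, n_large = sorted((A.shape[0], B.shape[0]))
    if method == "adopted":
        A = 2 * A - np.ones(A.shape)
        B = 2 * B - np.ones(B.shape)
    # np.pad(X, (0, k)) appends k zero rows and k zero columns
    if A.shape[0] == n_small:
        A = np.pad(A, (0, n_large - n_small))
    else:
        B = np.pad(B, (0, n_large - n_small))
    return A, B
```

For example, with a 3 x 3 A and a 5 x 5 B, both returned matrices are 5 x 5; under "adopted", the original entries become +1/-1 and the padding stays 0.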

Additional information

@asaadeldin11 asaadeldin11 marked this pull request as ready for review November 10, 2020 15:27
@asaadeldin11
Contributor Author

@mdhaber please take a look when you have a chance!

@@ -507,6 +519,29 @@ def _quadratic_assignment_faq(A, B,
return OptimizeResult(res)


def _adj_pad(A, B, method):
# pads the matrix with less nodes such that A & B are same size
Contributor

I'm confused by "less nodes"?

Contributor Author

Would "fewer nodes" be clearer?

Contributor

That would be the better word, but "fewer" relative to what?
Do you mean it pads the smaller of A and B to be the same size as the larger?

Contributor Author

Yes. Your wording is clearer, so I'll replace what I had written with that.

B = 2 * B - np.ones((B_n, B_n))

if A.shape[0] == n[0]:
A = pad(A, n)
Contributor

Suggested change
A = pad(A, n)
A = np.pad(A, (0, n[1]-n[0]))

Comment on lines 525 to 528
def pad(X, n):
X_pad = np.zeros((n[1], n[1]))
X_pad[: n[0], : n[0]] = X
return X_pad
Contributor

Suggested change
def pad(X, n):
X_pad = np.zeros((n[1], n[1]))
X_pad[: n[0], : n[0]] = X
return X_pad

if A.shape[0] == n[0]:
A = pad(A, n)
else:
B = pad(B, n)
Contributor

Suggested change
B = pad(B, n)
B = np.pad(B, (0, n[1]-n[0]))
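For readers following these suggestions: `np.pad(X, (0, k))` appends k zero rows and k zero columns, which is exactly what the local `pad(X, n)` helper builds by hand. A quick self-contained check:

```python
import numpy as np

X = np.arange(4.0).reshape(2, 2)
n = (2, 4)  # (smaller size, larger size)

# What the local pad(X, n) helper does by hand:
manual = np.zeros((n[1], n[1]))
manual[: n[0], : n[0]] = X

# The suggested one-liner: pad 0 entries before and n[1]-n[0] after,
# on both axes.
via_numpy = np.pad(X, (0, n[1] - n[0]))

assert np.array_equal(manual, via_numpy)
```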

A_n = A.shape[0]
B_n = B.shape[0]
n = np.sort([A_n, B_n])
if method == "adopted":
Contributor

[image: excerpt from the paper]

Should the n's in the paper have subscripts?

@@ -507,6 +519,29 @@ def _quadratic_assignment_faq(A, B,
return OptimizeResult(res)


def _adj_pad(A, B, method):
# pads the matrix with less nodes such that A & B are same size
# schemes according to section 2.5 of [2]
Contributor

@mdhaber mdhaber Nov 10, 2020

Yes, this does seem to do the following:

[image: excerpt from the paper]

n = 50
p = 0.4
G1 = _er_matrix(n, p)
G2 = G1[: (n - 1), : (n - 1)] # remove two nodes
Contributor

Suggested change
G2 = G1[: (n - 1), : (n - 1)] # remove two nodes
G2 = G1[:-1, :-1] # remove two nodes

Doesn't this remove one node?

@rgommers rgommers added the enhancement (A new feature or improvement) and scipy.optimize labels Nov 15, 2020
@jovo
Copy link

jovo commented Dec 8, 2020

@mdhaber let us know if you need anything else from us prior to reviewing this, thanks!

@mdhaber
Contributor

mdhaber commented Dec 8, 2020

@jovo Thanks for your patience. It would help if we could improve the testing.

  1. First, some questions:
    • Can you explain how the existing test works? As I'm not very familiar with all this, it doesn't jump out to me what the assertion is after and how strong of a test that is (e.g. how wrong the code could be and still pass the test).
    • Also, why is an "Erdos-Renyi graph" used?
    • I haven't looked carefully - does the padding work correctly with all of the existing options (e.g. partial_match and P0)?
  2. Can the existing test be adapted so that more than one node is removed? Perhaps it doesn't matter at all, but without thinking carefully about what is going on, one might assume that a test where the inputs are nearly the same size is "easier" to pass than one in which the inputs are very different sizes.
  3. The existing test is only on col_ind. Is there some sort of test you can perform on the resulting objective function value (even if it's just a bound of some sort)? I know that may be difficult.
  4. Would it be possible to add a test in which you pass in a matrix that has been manually padded in a way that is obviously correct and, e.g., confirm that you get the same result as if you let quadratic_assignment do the padding?
  5. Ideally, can you add a test that validates against a published result or code? I know that's not always possible. If not, can you construct an example in which the optimal solution is obvious and check the result?
  6. Especially if we can't do 3-5, let's add a few tests that check the result of _adj_pad. I'm sure we can check that against an expected result.
  7. The current test only checks the case in which the second input needs to be padded. Can we add a test in which the first input needs to be padded? (It's ok if it's essentially the same but the inputs are swapped.)
  8. Please consider the ideas above for both naive and adopted. We need to test both.

The logic looks right; I'm just trying to think of ways we can confirm that this will do the right thing.
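For instance, item 6's direct check of a padding result against a hand-written expectation could be as simple as the following (using np.pad for the naive scheme, as suggested earlier in the review):

```python
import numpy as np

# "Naive" padding of a hand-written 2x2 adjacency matrix into a 3x3
# one, checked against an explicit expected result.
A = np.array([[0., 1.],
              [1., 0.]])
expected = np.array([[0., 1., 0.],
                     [1., 0., 0.],
                     [0., 0., 0.]])
assert np.array_equal(np.pad(A, (0, 1)), expected)
```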

@jovo

jovo commented Dec 8, 2020

@mdhaber no worries, I understand the (often thankless) difficulties of maintaining high quality packages. I'll talk to @asaadeldin11 and we'll address all those concerns right away, thanks!

padding and low-density subgraphs of `B`.

"naive" : matches `A` to the best fitting subgraph of `B`.

Contributor

Maybe point the reader to more detail in Section 2.5 of [2].

@asaadeldin11
Contributor Author

Thanks for the feedback!

  1. First, some questions:

    • Can you explain how the existing test works? As I'm not very familiar with all this, it doesn't jump out to me what the assertion is after and how strong of a test that is (e.g. how wrong the code could be and still pass the test).

This is a relatively weak test; it was basically just a sanity check to make sure nothing was broken. I agree that we need stronger tests; please see my suggestions below.

  • Also, why is an "Erdos-Renyi graph" used?

No particular reason, just that it is a simple graph model. The main strength of this padding is finding subgraphs in a larger graph, so its applications are mainly on the graph matching side of things, though it does also work with the QAP.

  • I haven't looked carefully - does the padding work correctly with all of the existing options (e.g. partial_match and P0)?

P0 now must be the size of the larger graph (among A and B), so I will add that to the docs. For partial_match, I will clarify in the type check that the indices must be within the sizes of the original input graphs pre-padding (all entries of seedsA must be less than len(A), and likewise for B). Other than that, these options work as intended.

  2. Can the existing test be adapted so that more than one node is removed? Perhaps it doesn't matter at all, but without thinking carefully about what is going on, one might assume that a test where the inputs are nearly the same size is "easier" to pass than one in which the inputs are very different sizes.

Sure. I can generate three 25 x 25 ER graphs with different probabilities (say 0.6, 0.1, 0.2), and construct A as the block graph with edge probabilities [[0.6, 0.1], [0.1, 0.2]] and B as just the 0.6 block, so A is 50 x 50 and B is 25 x 25. Then I would expect nodes 1-25 of A and B to map to each other.
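A rough sketch of that construction (`er_matrix` here is a stand-in helper, analogous to the `_er_matrix` used in the existing test):

```python
import numpy as np

rng = np.random.default_rng(0)

def er_matrix(n, p, rng):
    # symmetric Erdos-Renyi adjacency matrix with no self-loops
    upper = np.triu((rng.random((n, n)) < p).astype(float), k=1)
    return upper + upper.T

n = 25
A11 = er_matrix(n, 0.6, rng)                    # the block B should match
A22 = er_matrix(n, 0.2, rng)
A12 = (rng.random((n, n)) < 0.1).astype(float)  # off-diagonal block

A = np.block([[A11, A12], [A12.T, A22]])        # 50 x 50
B = A11                                         # 25 x 25
# The proposed test would then check that quadratic_assignment, with
# padding enabled, maps nodes 0..24 of A onto B.
```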

  3. The existing test is only on col_ind . Is there some sort of test you can perform on the resulting objective function value (even if it's just a bound of some sort)? I know that may be difficult.

Yes, for the above test I can also check that the objective function value is between 0 and 25^2.

  4. Would it be possible to add a test in which you pass in a matrix that has been manually padded in a way that is obviously correct and, e.g., confirm that you get the same result as if you let quadratic_assignment do the padding?

Sure. I can make small graphs of different sizes and manually pad one, then make sure both calls get the same objective function value, if you think that's sufficient?
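One half of that check can be sketched against the released API (scipy.optimize.quadratic_assignment accepts equal-sized inputs today; the internal-padding side of the comparison is what this PR adds, so it is only described in a comment here):

```python
import numpy as np
from scipy.optimize import quadratic_assignment

rng = np.random.default_rng(42)

def random_adjacency(n, p, rng):
    # symmetric 0/1 adjacency matrix with no self-loops
    upper = np.triu((rng.random((n, n)) < p).astype(float), k=1)
    return upper + upper.T

A = random_adjacency(6, 0.5, rng)
B_small = random_adjacency(4, 0.5, rng)

# "naive" padding done by hand: embed the smaller matrix in zeros
B = np.zeros((6, 6))
B[:4, :4] = B_small

res = quadratic_assignment(A, B, method="faq")
assert sorted(res.col_ind) == list(range(6))  # a valid permutation
# The proposed test would compare res.fun against the value obtained
# when quadratic_assignment does the padding internally.
```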

  5. Ideally, can you add a test that validates against a published result or code? I know that's not always possible. If not, can you construct an example in which the optimal solution is obvious and check the result?

Since the padding section of the SGM paper is pretty short, the only result they show is essentially Figure 3, which I recreated here. This is more of a structural/visual result, so I'm not sure it would be appropriate to put in the tests.

  6. Especially if we can't do 3-5, let's add a few tests that check the result of _adj_pad. I'm sure we can check that against an expected result.
  7. The current test only checks the case in which the second input needs to be padded. Can we add a test in which the first input needs to be padded? (It's ok if it's essentially the same but the inputs are swapped.)
  8. Please consider the ideas above for both naive and adopted. We need to test both.

Sounds good on the above three.

The logic looks right; I'm just trying to think of ways we can confirm that this will do the right thing.

@mdhaber Let me know if you have comments on my above suggestions, thanks!

@mdhaber
Contributor

mdhaber commented Dec 11, 2020

Thanks for the responses. Yeah, I think all that would help. Only comment is about:

objective function value is between 0 and 25^2 as well

Sure. I suppose I was hoping for something a little stronger. Isn't this algorithm supposed to get within twice the optimal objective function value or something? Maybe that's only for the graphs being the same size?

@asaadeldin11
Contributor Author

Sure. I suppose I was hoping for something a little stronger. Isn't this algorithm supposed to get within twice the optimal objective function value or something? Maybe that's only for the graphs being the same size?

The actual objective function range in the tests will be much narrower than this; I just wasn't exactly sure what those values would be off the top of my head. I just meant that the range would be between 0 and 25^2, since those are the absolute bounds (in the tests it will likely be something like 250-350, to be more exact).

@mdhaber
Contributor

mdhaber commented Dec 11, 2020

I know. I just wonder whether you can provide a bound from theory that is tighter than the (what I called trivial) absolute bound.

@mdhaber
Contributor

mdhaber commented Oct 8, 2022

@asaadeldin11 @jovo Were you still interested in this? I may be able to help finish it up now if the new tests are stronger.

On the other hand, we haven't received any bug reports, enhancement requests, or attention from other maintainers for QAP, so I'm not sure how much the enhancement would be used. We could consider keeping this on the backburner until there is more interest.

@jovo

jovo commented Oct 19, 2022

@bdpedigo what do you think? you are in charge of this now :)

@mdhaber mdhaber closed this Dec 6, 2022