[WIP] Gromov-Wasserstein with asymmetric cost matrices #399
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Types of changes
Motivation and context / Related issue
In [12] "Gromov-Wasserstein Averaging of Kernel and Distance Matrices", the authors make no explicit assumption about the symmetry of the cost mastrices C and C'. However, their formula (9) for the computation of the gradient misses at least a 2 factor in the case of symmetric matrices, and even a second term in the case of asymmetric matrices, as explained in subsection 5.1 of "Multilingual Alignment of Word Embedding Spaces", Dan Meller and Gonzague de Carpentier, 2021 (https://www.researchgate.net/publication/357505092_Multilingual_alignment_of_word_embedding_spaces). In the code of ot/gromov.py, the 2 factor has been corrected, but it is not sufficient to support asymmetric cost matrices, which can be useful in the case of directed graphs for example. This contribution corrects the expression of the gradient in ot.gromov.gwggrad to support asymmetric distance matrices, by adding the appropriate term instead of just multiplying by 2.
How has this been tested (if it applies)
PR checklist