Error about transform #53
Comments
@pbarbarant @alexisthual can you take a look?
Thanks @TuDou-PK for your interest in our repo! FUGW basically consists in solving a Gromov-Wasserstein problem between graphs, leading to weighted assignments between source and target nodes. The problem thus differs from a domain adaptation setting, as there is no geometric displacement of the points. The idea of Going back to your example, if you have The notion of ambient dimensions in which each graph is embedded is only relevant for the computation of the geometric cost and is not related to the number of features. @alexisthual It might be worth emphasizing this in the documentation. I hope my explanation is clear :)
@pbarbarant Thanks for your reply. Please help me confirm whether I understand correctly: in your graph application, the aim is to map the 3000-node graph onto the 9000-node graph, and then you can do further processing to compare the mapped 9000-node graph with the target 9000-node graph. As for the node features, whether they are 50-dim or 100-dim does not matter to you, right? Your aim only concerns the node (graph) geometry information. Ah! Maybe I get it: that's why in your code the array shape is like [position, node_number], so you actually treat "node_number" as the feature. That makes sense. So the conclusion is that neither POT nor FUGW is wrong, they just have different aims?
I believe the easiest example is an image, where the underlying graph is the 2D grid and over each pixel (node of the graph) you have a set of features which corresponds, for example, to the three RGB channels. In POT's DA example, you would only work in the RGB space, so your RGB points would move geometrically in this color space and you would get a mapping between colors. With FUGW, you also take into account the graph geometry between pixels, thus a slice of your data array at pixel
OK, thanks for your explanation 👍
Hi, I think something may be wrong in how the mapped result is computed after obtaining the matrix pi.
Please see the transform code from line 338 to 342 in fugw/src/fugw/mappings/dense.py
You use $\pi^{T} \cdot S^{T}$
But the formula should be $\pi \cdot \text{Target}$, which uses the target data, not the source data. Please check the transform code from POT
I can show this from both an application and a theory point of view.
Proof 1 - application
Here is a test based on your example "Transport distributions using dense solvers"
After training and obtaining pi, you can plot the training points to compare them with the mapped points,
The plot will look like:
You can see that the mapped data is actually close to the source data,
and if you use the POT way,
Then the plot will be:
You can see that the mapped data is close to the target data.
Proof 2 - theory
Here I can show you the result does not make sense.
We assume:
The OT matrix pi has shape [3000, 9000].
POT code
From the POT code, if we want to get the mapped source data in the target space, $S_{t}$, we can use $S_{t} = \pi \cdot \text{Target}$.
The shape of $S_{t}$ will be [3000, 100]; the shapes follow from the formula above:
$$[3000, 100] = [3000, 9000] \cdot [9000, 100]$$
The source data, with shape [3000, 50] in the 50-dim space, is mapped to shape [3000, 100] in the 100-dim space, and the number of points does not change. Each point just moves from the 50-dim space to the 100-dim space.
So the explanation of the OT algorithm is:
The OT algorithm maps the data from the source space to the target space without changing the number of points.
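To make the shape bookkeeping above concrete, here is a minimal NumPy sketch of a row-normalized barycentric mapping, which is how I understand the POT-style transform. The sizes are scaled down (30 and 90 points instead of 3000 and 9000), the arrays are random placeholders, and the names `pi`, `target`, and `mapped_source` are illustrative only, not POT's API.

```python
import numpy as np

rng = np.random.default_rng(0)

# Scaled-down stand-ins for the shapes in this issue (assumption):
# 30 source points instead of 3000, 90 target points instead of 9000.
n_source, n_target = 30, 90
d_target = 100

# Hypothetical non-negative transport plan and target data (random placeholders).
pi = rng.random((n_source, n_target))
target = rng.random((n_target, d_target))

# Barycentric mapping: normalize each row of pi so it sums to 1,
# then use those weights to average the target points.
row_mass = pi.sum(axis=1, keepdims=True)
mapped_source = (pi / row_mass) @ target

# One mapped point per source point, expressed in the target feature space.
print(mapped_source.shape)  # (30, 100)
```

Each mapped point is a convex combination of target points, so it lies inside the range of the target data, and the number of rows stays equal to the number of source points.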
FUGW code
According to the FUGW code, if we want to get the mapped source data in the target space, $S_{t}$, we can use $S_{t} = \pi^{T} \cdot S^{T}$.
The shape of $S_{t}$ will be [9000, 50]; the shapes follow from the formula:
$$[9000, 50] = [9000, 3000] \cdot [3000, 50]$$
So the source data, with shape [3000, 50] in the 50-dim space, is mapped to shape [9000, 50], which is still in the 50-dim space. The data is not in the 100-dim target space! It does not make sense!
Please let me know if I am wrong :)
Btw, thanks a lot for your contribution to FUGW, it helps me a lot.
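The same check can be sketched for the $\pi^{T} \cdot S^{T}$ formula quoted above. This is a NumPy illustration under stated assumptions, not a copy of the FUGW source: sizes are scaled down (30 and 90 nodes), the features are assumed to be stored as [n_features, n_source], and the division by the column sums of `pi` is my own assumption about the normalization. All names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Scaled-down stand-ins (assumption): 30 source nodes, 90 target nodes.
n_source, n_target = 30, 90
n_features = 50

# Hypothetical non-negative transport plan (random placeholder).
pi = rng.random((n_source, n_target))

# Assumed layout: one row per feature, one column per source node.
source_features = rng.random((n_features, n_source))

# pi.T @ S.T : [n_target, n_source] @ [n_source, n_features].
# Dividing by the column sums of pi (assumed normalization) makes each
# target node receive a weighted average of source features.
col_mass = pi.sum(axis=0)
transported = (pi.T @ source_features.T) / col_mass[:, None]

# One row per target node, still with the source's 50 features.
print(transported.shape)  # (90, 50)
```

The result has one row per target node and keeps the 50 source features, which matches the [9000, 50] shape computed above: the source features are carried onto the target nodes rather than moved into the 100-dim target feature space.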