
How I could possibly use geomstats for finding correspondences between Gaussian distributions of different rank #1979

Open
ttsesm opened this issue Mar 19, 2024 · 5 comments

@ttsesm

ttsesm commented Mar 19, 2024

Hi,

I've recently discovered your community, so I am quite new to geomstats and to manifolds and optimal transport in general. I am therefore not sure whether this is the right place to ask, but in any case I hope I can get some feedback.

Briefly, I am trying to compute distances between Gaussian distributions of different dimensions (in my case 2D and 3D) and to find their coupling (the matching, or best correspondences, between the two sets). Taking that into account, and after doing some research in the literature, I considered using the Gromov-Wasserstein (GW) distance based on cost matrices computed on the Bures manifold (I've tried different optimal transport libraries, such as POT and OTT, to experiment with these distances and manifolds).

In more practical terms, imagine the following test case: I have a set of 3D Gaussian distributions, defined as ellipsoids by their mean values μ (each a 1x3 vector) and their corresponding covariance matrices Σ1 (each a 3x3 matrix). I also have a set of 2D Gaussian distributions, defined as ellipses by their mean values ν (each a 1x2 vector) and their corresponding covariance matrices Σ2 (each a 2x2 matrix). So in practice the ellipses are the ellipsoids projected onto a plane through a linear map $P: \mathbb{R}^3 \rightarrow \mathbb{R}^2$.
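(For completeness, this uses the standard fact about linear images of Gaussians: if $P \in \mathbb{R}^{2\times 3}$ is the projection, then $\nu = P\mu$ and $\Sigma_2 = P\,\Sigma_1 P^\top$.)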

[images: the 3D ellipsoids and their projected 2D ellipses]

So, in simple words, I want to find which ellipsoid corresponds to which ellipse by using their covariance information. In the ideal scenario the coupling is one-to-one; in our real scenario, however, there will be outliers, meaning that not every ellipsoid will match an ellipse. For that reason I considered an approach based on the unbalanced Gromov-Wasserstein distance or something similar. Since the two distributions are not in the same space, though, we need to somehow bring them into a common space where we can apply the matching and find the coupling and the transport plan.

For this purpose we thought a suitable approach would be to first compute the cost matrix of each set on the Bures-Wasserstein manifold (this should create something similar to a weighted graph of all distributions against all distributions) and then compute their correspondence with the Gromov-Wasserstein distance; a sketch of this pipeline is included after the data below. I've experimented with an oracle ablation test of 50 ellipsoids and 50 ellipses where there is always a one-to-one correspondence, meaning the first 2D Gaussian corresponds to the first 3D Gaussian, the second to the second, the third to the third, and so on. So in the end the coupling should be an identity matrix. However, as you can see from the attached image, I am not really getting the desired output.

[image: the computed coupling matrix, far from the expected identity]

My data are in the following format (in case someone wants to experiment):

import numpy as np

# 2D Gaussians
# Covariances, 2x2 matrices
ct = np.array([[[0.00049392, 0.00018833],
    [0.00018833, 0.00007765]],
   [[0.00037171, 0.00021303],
    [0.00021303, 0.00012497]],
   [[0.00004997, 0.00002954],
    [0.00002954, 0.00012977]],
   [[0.00005111, 0.00000863],
    [0.00000863, 0.00002055]],
   [[0.00022878, 0.00014253],
    [0.00014253, 0.00023158]],
   [[0.00010538, 0.00007804],
    [0.00007804, 0.00030141]],
   [[0.00001159, 0.00000829],
    [0.00000829, 0.00000615]],
   [[0.00007754, 0.00002211],
    [0.00002211, 0.00011025]],
   [[0.00070302, 0.00060664],
    [0.00060664, 0.00052477]],
   [[0.00109267, 0.00122162],
    [0.00122162, 0.00137384]]], dtype=np.float32)
 
# Means, 1x2 vectors
mt = np.array([[-0.34575066, -0.05927338],
   [-0.16067392, -0.15656751],
   [ 0.0884882 , 0.7348885 ],
   [-0.09218572, 0.05595953],
   [ 0.21521865, 0.38351792],
   [-0.22240582, 0.30064186],
   [-0.0520223 , -0.18196055],
   [-0.4030493 , 0.22060394],
   [-0.17406262, -0.09412329],
   [ 0.3939966 , 0.21849754]], dtype=np.float32)
 
# 3D Gaussians
# Covariances, 3x3 matrices
Cs = np.array([[[ 0.00055704, 0.0002181 , 0.00020697],
    [ 0.0002181 , 0.00026268, 0.00017932],
    [ 0.00020697, 0.00017932, 0.00013152]],

   [[ 0.00029736, 0.00036497, -0.00025849],
    [ 0.00036497, 0.0005222 , -0.00034746],
    [-0.00025849, -0.00034746, 0.00028704]],

   [[ 0.00013889, -0.00002427, 0.00009434],
    [-0.00002427, 0.000106 , -0.00010998],
    [ 0.00009434, -0.00010998, 0.00020017]],

   [[ 0.00013997, 0.0000322 , -0.00003177],
    [ 0.0000322 , 0.00013866, 0.0000159 ],
    [-0.00003177, 0.0000159 , 0.00008226]],

   [[ 0.00015876, 0.00015987, 0.00013226],
    [ 0.00015987, 0.00046588, -0.0000928 ],
    [ 0.00013226, -0.0000928 , 0.00045325]],

   [[ 0.00014214, -0.00009106, 0.00025811],
    [-0.00009106, 0.00042213, 0.00002785],
    [ 0.00025811, 0.00002785, 0.00058036]],

   [[ 0.00003409, -0.00000352, 0.00000672],
    [-0.00000352, 0.00000058, -0.00000025],
    [ 0.00000672, -0.00000025, 0.00000226]],

   [[ 0.00025886, -0.00007174, -0.00000447],
    [-0.00007174, 0.00020796, 0.00000071],
    [-0.00000447, 0.00000071, 0.00017473]],

   [[ 0.00000316, -0.00008337, -0.00002749],
    [-0.00008337, 0.00292826, 0.00008949],
    [-0.00002749, 0.00008949, 0.00106939]],

   [[ 0.00080745, -0.00143929, -0.00031194],
    [-0.00143929, 0.00750104, 0.00179461],
    [-0.00031194, 0.00179461, 0.0016364 ]]], dtype=np.float32)
 
# Means, 1x3 vectors
Ms = np.array([[ 0.17411666, 0.5035763 , 0.42941484],
   [ 0.5251834 , -0.27657598, -0.18892948],
   [-0.5073861 , 0.46957424, -0.90752286],
   [ 0.56422734, -0.4835634 , -0.91173404],
   [-0.5370846 , 0.22988465, -0.2466668 ],
   [ 0.04097709, 0.48167574, -0.23595962],
   [ 0.49995327, -0.5364788 , -0.3129182 ],
   [ 0.4078352 , 0.46962914, -0.3338474 ],
   [ 0.5331596 , -0.24272342, -0.30946448],
   [-0.57427835, -0.20426048, -0.28931147]], dtype=np.float32)
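For reference, here is roughly the pipeline I used to produce the coupling above. This is a minimal sketch assuming POT (pip install pot) and SciPy; the bures_wasserstein_sq helper is my own implementation of the closed-form Gaussian W2 distance, not a function from either library.

# Sketch: intra-set Bures-Wasserstein cost matrices + Gromov-Wasserstein coupling
import numpy as np
import ot
from scipy.linalg import sqrtm

def bures_wasserstein_sq(m1, S1, m2, S2):
    # Squared 2-Wasserstein distance between N(m1, S1) and N(m2, S2)
    R = sqrtm(S1)
    cross = sqrtm(R @ S2 @ R).real
    return np.sum((m1 - m2) ** 2) + np.trace(S1 + S2 - 2 * cross)

def intra_cost(means, covs):
    # Pairwise cost matrix within one set of Gaussians ("weighted graph" above)
    n = len(means)
    C = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            C[i, j] = C[j, i] = bures_wasserstein_sq(means[i], covs[i], means[j], covs[j])
    return C

C1 = intra_cost(Ms, Cs)  # 3D set
C2 = intra_cost(mt, ct)  # 2D set
p, q = ot.unif(len(Ms)), ot.unif(len(mt))

# GW matches the two cost structures without needing a common ambient space
coupling = ot.gromov.gromov_wasserstein(C1, C2, p, q, loss_fun='square_loss')
print(np.round(coupling * len(Ms), 2))  # ideally close to the identity in the oracle test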

My question now, considering your experience, is whether I can somehow improve the matching output so that I get more reliable correspondences between the ellipsoids and the ellipses of my two sets by using the functionality of geomstats and possibly different manifolds. I skimmed through the tutorials but couldn't find anything completely relevant; considering that geomstats includes implementations of different manifolds and distances, I am curious whether there is something I could try in order to obtain a better result, or whether geomstats cannot help me much with this task.

Thanks.

p.s. By the way, there is a nice publication here which in principle seems to examine my issue, in case someone is interested. Quite a lot of math to work through, but I guess that is normal for such a task.

@mwilson221
Contributor

mwilson221 commented Apr 6, 2024

Bures-Wasserstein is a distance on symmetric positive semi-definite matrices, so you could just treat the 2-d Gaussians as degenerate 3-d Gaussians and calculate it that way - but you would need a way to locate the plane containing the 2-d Gaussians in 3-d space (see the embedding sketch after the links below). I'm working with linear optimal transport for GMMs, which is a way to get deterministic (1-to-1) couplings, but it requires full-rank covariances. I think you might be able to leverage the fact that the Gaussians are orthogonally projected onto the 2-d plane to get around the full-rank requirement... it gives you a unique 3rd eigenvector for the (2-d) covariances.

Is this something to do with Gaussian Splatting?

Here are some relevant resources:
https://arxiv.org/abs/1907.05254
https://arxiv.org/abs/2105.09755
https://arxiv.org/abs/2104.07970
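For illustration, embedding a 2-d Gaussian into 3-d could look like this. A sketch only: P is an assumed, known 3x2 orthonormal basis of the image plane (with offset o), and the eps ridge is there only if you need full rank.

import numpy as np

# Assumed known: P, a 3x2 orthonormal basis of the image plane, and its offset o
P = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])  # e.g. the xy-plane
o = np.zeros(3)

def lift_gaussian(m2, S2, eps=0.0):
    # Embed a 2-d Gaussian N(m2, S2) as a (possibly degenerate) 3-d Gaussian
    m3 = P @ m2 + o
    S3 = P @ S2 @ P.T                # rank 2, i.e. degenerate, when eps = 0
    return m3, S3 + eps * np.eye(3)  # eps > 0 restores full rank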

@ttsesm
Author

ttsesm commented Apr 8, 2024

@mwilson221 thanks for the feedback.

Actually, I had also thought about this idea of degenerate Gaussians, since, as I understand it, the 3D Gaussians in simple words contain the information of the 2D Gaussians plus the linear transformation (transformation matrix, transport plan, call it what you wish). So if you locate the plane in 3D space, in principle I've got what I need. Based on your suggestion: by leveraging the fact that the Gaussians are orthogonally projected, do you mean adding an extra dimension with a small value to my 2D Gaussians, so that I fulfill the full-rank requirement and can then compare 3D-to-3D Gaussians through the Wasserstein distance, which should possibly give me something?

I had already found the resources you've linked, plus this one, which is also quite interesting. However, as I've mentioned, trying them in practice does not give stable results.

It is related to Gaussian splatting, but not explicitly ;-).

For your Gaussian mixture coupling, are you using POT, OTT, or geomstats?

@mwilson221
Contributor

mwilson221 commented Apr 8, 2024

I don't think my idea in the last post actually gets you anything. If you make the 2-d Gaussians 3-d and then use SPD_Matrices(3).project(), you can do the following to get a 1-to-1 mapping (but it won't necessarily be a coupling); a code sketch follows at the end of this comment:

Let the 3-d gaussians be a gaussian mixture $\nu = \sum_{i} a_iN(m_i, \Sigma_i)$

Let the projected 2-d gaussians be a gaussian mixture $\mu = \sum_{j} b_j N(m_j, \Sigma_j)$

Calculate a distance matrix with $D_{ij} = ||m_i - m_j||^2 + Bures(\Sigma_i, \Sigma_j)$

Calculate the optimal coupling $\gamma = ot.emd(a,b,D)$

Define $T((m_i,\Sigma_i)) = Exp_{(m_i,\Sigma_i)} \big(\sum_{j} \frac{\gamma_{ij}}{a_i} Log_{(m_i,\Sigma_i)}((m_j,\Sigma_j)) \big)$

Exp and Log are with respect to the product manifold $\mathbb{R}^d \times Sym_d^+$. The means will all be in the plane.

I'm using POT for couplings.
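In code, the recipe above might look roughly like this. A sketch only: ot.emd is POT's exact solver; the bw_log/bw_exp helpers implement the standard closed forms of the Bures-Wasserstein log/exp maps via SciPy and are not from POT or geomstats; a, b are assumed mixture weights, and mt3, ct3 are the lifted 2-d means/covariances from the embedding idea above.

import numpy as np
import ot
from scipy.linalg import sqrtm, solve_sylvester

def bures_sq(S1, S2):
    # Squared Bures distance between two PSD matrices
    R = sqrtm(S1)
    return np.trace(S1 + S2 - 2 * sqrtm(R @ S2 @ R).real)

def bw_log(S, L):
    # Bures-Wasserstein log map at S towards L: (T - I) S + S (T - I),
    # with T the optimal transport map between the centered Gaussians
    R = sqrtm(S)
    R_inv = np.linalg.inv(R)
    T = (R_inv @ sqrtm(R @ L @ R) @ R_inv).real
    I = np.eye(len(S))
    return (T - I) @ S + S @ (T - I)

def bw_exp(S, X):
    # Bures-Wasserstein exp map at S: (I + L) S (I + L), with L solving L S + S L = X
    L = solve_sylvester(S, S, X)
    I = np.eye(len(S))
    return (I + L) @ S @ (I + L)

# Assumed inputs: weights a, b; 3-d Gaussians (Ms, Cs); lifted 2-d Gaussians (mt3, ct3)
n, m = len(Ms), len(mt3)
D = np.array([[np.sum((Ms[i] - mt3[j]) ** 2) + bures_sq(Cs[i], ct3[j])
               for j in range(m)] for i in range(n)])
gamma = ot.emd(a, b, D)  # optimal coupling for the mixture-level cost

def T_map(i):
    # Barycentric image of the i-th 3-d Gaussian under gamma
    w = gamma[i] / a[i]
    m_new = Ms[i] + w @ (mt3 - Ms[i])                        # Euclidean part (means)
    X = sum(w[j] * bw_log(Cs[i], ct3[j]) for j in range(m))  # weighted tangent average
    return m_new, bw_exp(Cs[i], X)                           # SPD part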

@ttsesm
Author

ttsesm commented Apr 9, 2024

Interesting, thanks a lot. I will have a look and give an update ;-)

@ttsesm
Author

ttsesm commented Jun 17, 2024

@mwilson221 a quite late update; I was occupied with other tasks.

So, we've tried this degenerate-Gaussian approach, lifting the 2D Gaussians to 3D and then using the SPD_Matrices(3) projection, but it didn't really make any difference.
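Roughly, what we tried looked like this (a sketch: it assumes geomstats's SPDMatrices.projection, which maps a symmetric matrix onto the SPD manifold, and the hypothetical lift_gaussian helper sketched earlier in this thread):

import numpy as np
from geomstats.geometry.spd_matrices import SPDMatrices

spd3 = SPDMatrices(3)
lifted = [lift_gaussian(m, S) for m, S in zip(mt, ct)]     # 2D -> degenerate 3D
mt3 = np.stack([m3 for m3, _ in lifted])
ct3 = np.stack([spd3.projection(S3) for _, S3 in lifted])  # force strictly SPD(3)
# ...then rerun the 3D-to-3D matching on (Ms, Cs) vs (mt3, ct3) as above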

To be honest, I find it both really strange and interesting that it is this hard to find such a correspondence between Gaussian distributions that are, in principle, linearly connected 🤔
