Add Orthogonal Procrustes Distance + similarity->distance #11
Conversation
Warning: this pull request makes anatome fail my sanity checks. E.g. when D is really large (much larger than the # of data points), the similarity should be 1.0, since the linear model has a lot of power to correlate the two data sets.
Code to do the sanity check:
Something in this pull request breaks anatome...
Thank you for reporting. I introduced ...
For now I will use the current version... If I have time I will test or try to fix the new one, but I'm also busy hehehe :) Hope this helps though; at least the sanity check should allow us to catch obvious bugs.
I see other differences too, like:
instead of
I don't think you need to ...
it only assumes centering. In short, Pearson correlation already normalizes in the denominator. Though, I don't think this should have made a difference. Centering is for sure needed, since the product only gives the covariance when things are centered.
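(For reference, not from the thread: the standard Pearson correlation already carries the normalization in its denominator, and only the centering appears explicitly,

$$
r_{xy} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2}\,\sqrt{\sum_i (y_i - \bar{y})^2}}
$$

so an extra division by a norm should not change the correlation, while skipping the centering does, since $\sum_i x_i y_i$ equals the covariance only for zero-mean variables.)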
If I may suggest this implementation for OPD, since it re-uses other code you already wrote:
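(The suggested snippet itself isn't reproduced in this thread. A minimal sketch in that spirit, re-using the `_zero_mean` and `frobenius_norm` helpers visible in the diff below and assuming the usual OPD definition d(X, Y) = ||X||_F^2 + ||Y||_F^2 - 2 ||X^T Y||_* from the OPD paper, could look like this; it is a sketch, not the code from this PR:)

```python
import torch
from torch import Tensor


def _zero_mean(x: Tensor, dim: int) -> Tensor:
    # Center along `dim` (dim=0 subtracts the per-feature mean).
    return x - x.mean(dim=dim, keepdim=True)


def frobenius_norm(x: Tensor) -> Tensor:
    return torch.linalg.norm(x, ord="fro")


def orthogonal_procrustes_distance(x: Tensor, y: Tensor) -> Tensor:
    # Center first, then scale each matrix to unit Frobenius norm
    # (the norm is taken of the *centered* data).
    x = _zero_mean(x, dim=0)
    x = x / frobenius_norm(x)
    y = _zero_mean(y, dim=0)
    y = y / frobenius_norm(y)
    # With unit norms, d = 2 - 2 * nuclear_norm(x^T y); identical inputs give ~0.
    return 2.0 - 2.0 * torch.linalg.norm(x.t() @ y, ord="nuc")
```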
I can confirm that normalizing by the Frobenius norm breaks one of the CCA sanity checks: the check where D is really large fails for CCA. See:
Code:
but this didn't break the plots, surprisingly.
No, that is not it (according to my sanity checks above that ran with ...
Ok, found the bug! You need to do the centering correctly, because
or even better, reuse your ... My sanity checks now pass (for all metrics):
Final sanity check code:
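(The actual script isn't included in this thread. A minimal sketch of the kind of checks described above, reusing `_zero_mean` and `orthogonal_procrustes_distance` from the sketch earlier in this thread; `mean_cca_similarity` here is an illustrative QR-based CCA, not anatome's API:)

```python
import torch

# Assumes _zero_mean and orthogonal_procrustes_distance from the sketch above.


def mean_cca_similarity(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    # Canonical correlations are the singular values of Qx^T Qy, where Qx, Qy
    # are orthonormal bases of the centered data matrices.
    qx, _ = torch.linalg.qr(_zero_mean(x, dim=0))
    qy, _ = torch.linalg.qr(_zero_mean(y, dim=0))
    return torch.linalg.svdvals(qx.t() @ qy).mean()


torch.manual_seed(0)
n, d = 32, 1024                       # D much larger than the number of points
x, y = torch.randn(n, d), torch.randn(n, d)

# With D >> N the linear model has enough power to correlate anything,
# so even unrelated data should give a mean CCA similarity of ~1.0.
assert mean_cca_similarity(x, y) > 0.99

# Comparing a matrix with itself should give ~0 OPD distance
# (the 1e-4 tolerance is the one mentioned later in the thread).
assert orthogonal_procrustes_distance(x, x) < 1e-4
```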
Note, it's better to divide by the norm of the centered data. The accuracy of OPD increases dramatically: comparing the same matrix with itself finally gives 1.0 up to 1e-4, instead of 1e-2.
Results:
x = _zero_mean(x, dim=0)
x /= frobenius_norm(x)
y = _zero_mean(y, dim=0)
y /= frobenius_norm(y)
I don't get your point; do you mean this part?
Yes, but that is not the code you are using for CCA or CKA.
Also, at some point anatome was using:
def _matrix_normalize(input: Tensor,
                      dim: int
                      ) -> Tensor:
    from torch.linalg import norm
    return input - input.mean(dim=dim, keepdim=True) / norm(input, 'fro')
which doesn't center correctly.
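(For clarity: the problem in that version is operator precedence. Without parentheses only the mean gets divided by the norm, so the data is neither centered correctly nor scaled by the norm of the centered data. A corrected sketch, not taken from the PR, would be:)

```python
def _matrix_normalize(input: Tensor,
                      dim: int
                      ) -> Tensor:
    from torch.linalg import norm
    # Center first, then divide by the Frobenius norm of the *centered* data.
    centered = input - input.mean(dim=dim, keepdim=True)
    return centered / norm(centered, 'fro')
```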
I can't find where I got that code... I am 100% sure I didn't write it...
I usually don't write `from torch.linalg import norm`
...
Interesting! Who knows, something weird might have happened in a merge attempt I did in my fork.
But I think the message is clear: if we normalize, centering first and then using that centered data is important (especially if using it for all metrics).
I decided I will only normalize for OPD and only center for CCA.
Undecided for CKA.
anatome/distance.py (outdated)
x = _zero_mean(x, dim=0) / frobenius_norm(x)
y = _zero_mean(y, dim=0) / frobenius_norm(x)
x = frobenius_norm(x) ** 2
y = frobenius_norm(y) ** 2
@moskomule I think this is one source of the bugs (note that `y` is divided by `frobenius_norm(x)`)... it's better to wrap the normalization in one function and have all metrics use it.
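(A sketch of what such a single wrapper could look like, assuming the split the thread converges on: center for CCA and CKA, center plus Frobenius normalization for OPD. The function name and call sites below are illustrative, not anatome's actual API:)

```python
import torch
from torch import Tensor


def _center_and_maybe_normalize(x: Tensor, normalize: bool) -> Tensor:
    # All metrics go through this one helper: always center along dim=0;
    # optionally scale to unit Frobenius norm, computed on the centered data.
    x = x - x.mean(dim=0, keepdim=True)
    if normalize:
        x = x / torch.linalg.norm(x, ord="fro")
    return x


# Illustrative call sites (not the real ones in distance.py):
#   CCA / CKA: x = _center_and_maybe_normalize(x, normalize=False)
#   OPD:       x = _center_and_maybe_normalize(x, normalize=True)
```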
@moskomule I am curious, what is your final conclusion on normalizing the matrices before computing the distances?
My hunch is that OPD is the only one that needs it, and that centering alone is enough for the other two.
I agree, and if I remember correctly, I implemented it that way.
At the risk of being redundant, I do want to note that this is not what the authors of the OPD paper do ([see here](js-d/sim_metric#4 (comment))): they normalize all the time. But with my sanity checks I doubt the difference will be large, and I will do what you do: just center for CCA and CKA, and only normalize for OPD. Thanks for the discussions! :)