Question on equation 1 of the original paper #1

Closed
littleredxh opened this issue Jan 11, 2021 · 3 comments

@littleredxh

I doubt the correctness of equation 1 in the original paper. If f_i and f_j (the features before normalization) are set to have norm 1, then the derivative in equation 1 should be f_j. But the right side of the equation does not appear to be f_j. Can you explain it?

@Dyfine
Owner

Dyfine commented Jan 11, 2021

Hi! In equation 1, we assume that \hat{f_i} = f_i / ||f_i|| and \hat{f_j} = f_j / ||f_j||, and that these normalization operations are differentiable. What we want is the derivative of the inner product between the two normalized embeddings (i.e., \hat{f_i} and \hat{f_j}) with respect to one unnormalized embedding (i.e., f_i), which gives the result in the paper. If we instead consider the inner product between the two unnormalized embeddings (i.e., f_i and f_j), then f_j is correct.
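
For readers following along, here is a sketch of how that derivative can be worked out by hand. It assumes equation 1 has the form quoted in this thread, that is, a factor 1/||f_i|| multiplying (\hat{f_j} - cos_theta_ij \hat{f_i}), and it uses only the quotient rule together with the standard identity partial{||f_i||}/partial{f_i} = f_i/||f_i||. The symbol \theta_{ij} denotes the angle between f_i and f_j.

```latex
\begin{aligned}
\frac{\partial \langle \hat{f_i}, \hat{f_j} \rangle}{\partial f_i}
&= \frac{\partial}{\partial f_i} \frac{\langle f_i, f_j \rangle}{\|f_i\| \, \|f_j\|} \\
&= \frac{f_j}{\|f_i\| \, \|f_j\|}
 - \frac{\langle f_i, f_j \rangle}{\|f_i\|^{2} \, \|f_j\|} \cdot \frac{f_i}{\|f_i\|} \\
&= \frac{1}{\|f_i\|} \left( \hat{f_j} - \cos\theta_{ij} \, \hat{f_i} \right)
\end{aligned}
```

The second term comes from differentiating the 1/||f_i|| factor, and it is what removes the component of \hat{f_j} along \hat{f_i}. As a result the gradient is orthogonal to f_i, which matches the fact that rescaling f_i does not change the cosine similarity.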

@littleredxh
Author

littleredxh commented Jan 11, 2021

Suppose ||f_i|| = ||f_j|| = 1, so that \hat{f_i} = f_i and \hat{f_j} = f_j.
Then in equation 1, the left side should be partial{<\hat{f_i},\hat{f_j}>}/partial{f_i} = partial{<f_i,f_j>}/partial{f_i} = f_j, while the right side should be 1/1*(f_j - cos_theta_ij f_i).

Problem: in this special case, the right side does not equal the left side.

@Dyfine
Owner

Dyfine commented Jan 11, 2021

In fact, the left side of equation 1 is partial{< f_i/||f_i||, f_j/||f_j|| >} / partial{f_i}. When ||f_i|| = ||f_j|| = 1, the equality <\hat{f_i},\hat{f_j}> = <f_i,f_j> is correct. However, partial{<\hat{f_i},\hat{f_j}>}/partial{f_i} = partial{<f_i,f_j>}/partial{f_i} does not hold: the two expressions agree in value at unit-norm points, but they are different functions of f_i. The normalization operation is always applied, so it must always be included in the derivative calculation.
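
The point can also be checked numerically. The following is a minimal sketch in plain Python, not code from the paper or this repository: the helper names (`cos_sim`, `numeric_grad`, `analytic_grad`) and the two unit-norm example vectors are assumptions made for illustration. It compares a central finite-difference gradient of the cosine similarity with the closed form 1/||f_i|| * (\hat{f_j} - cos_theta_ij \hat{f_i}) quoted in this thread, and with f_j.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Euclidean norm."""
    return math.sqrt(dot(a, a))


def cos_sim(fi, fj):
    """<fi/||fi||, fj/||fj||>: the normalization is part of the function."""
    return dot(fi, fj) / (norm(fi) * norm(fj))


def numeric_grad(fi, fj, eps=1e-6):
    """Central finite-difference gradient of cos_sim with respect to fi."""
    grad = []
    for k in range(len(fi)):
        plus = list(fi)
        minus = list(fi)
        plus[k] += eps
        minus[k] -= eps
        grad.append((cos_sim(plus, fj) - cos_sim(minus, fj)) / (2 * eps))
    return grad


def analytic_grad(fi, fj):
    """Closed form quoted in the thread: (fj_hat - cos * fi_hat) / ||fi||."""
    ni, nj = norm(fi), norm(fj)
    c = cos_sim(fi, fj)
    return [(b / nj - c * a / ni) / ni for a, b in zip(fi, fj)]


# Hypothetical unit-norm embeddings, chosen only for illustration.
fi = [0.6, 0.8, 0.0]
fj = [0.0, 0.6, 0.8]

num = numeric_grad(fi, fj)
ana = analytic_grad(fi, fj)

print("numeric gradient :", num)
print("analytic gradient:", ana)
print("f_j              :", fj)
print("<gradient, f_i>  :", dot(ana, fi))
```

Even though both vectors already have norm 1, the numeric gradient should track the closed form rather than f_j, and its inner product with f_i should be approximately zero. The gradient would equal f_j only if the function being differentiated were the unnormalized inner product <f_i, f_j>.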

@Dyfine Dyfine closed this as completed Jan 18, 2021