
A Question on the Gradient in Equation (10) #34

Closed · SenZHANG-GitHub opened this issue Apr 21, 2021 · 5 comments

SenZHANG-GitHub commented Apr 21, 2021

Thanks for the great work! However, it seems the gradient for |u_x| < 1 in Equation (10) is incorrect?

We are taking the gradient of 1 - |u_x| with respect to the flow F_x at q, i.e. \partial{1 - |u_x|} / \partial{F_x}, where u_x = p_x - (q_x + F_x(q)).

By the chain rule, this is \partial{1 - |u_x|} / \partial{u_x} * \partial{u_x} / \partial{F_x}.

Now \partial{1 - |u_x|} / \partial{u_x} = -sgn(u_x), while \partial{u_x} / \partial{F_x} = -1. Thus shouldn't the gradient be +sgn(u_x) rather than -sgn(u_x)? I'm wondering whether this is the factor leading to the failures in fine-tuning.
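As a quick sanity check, the chain rule above can be verified with finite differences (a minimal sketch; p, q, and F are arbitrary scalar example values):

```python
import math

def weight(p, q, F):
    u = p - (q + F)      # u_x = p_x - (q_x + F_x(q))
    return 1.0 - abs(u)  # the bilinear term for |u_x| < 1

p, q, F, eps = 2.0, 1.0, 0.7, 1e-6
u = p - (q + F)  # = 0.3

numeric = (weight(p, q, F + eps) - weight(p, q, F - eps)) / (2 * eps)
analytic = math.copysign(1.0, u)  # +sgn(u_x), as argued above

print(numeric, analytic)  # both are +1.0 here, since u_x > 0
```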

Thanks!

SenZHANG-GitHub (Author) commented Apr 21, 2021

It seems to be a typo in the paper? I checked softsplat.py (lines 141-153) and the code seems to be implemented using +sgn(u_x) and +sgn(u_y) accordingly :)

sniklaus (Owner) commented Apr 21, 2021

I am not so sure; https://www.wolframalpha.com/input/?i=derivate+1-abs%28x%29 clearly lists "-sgn(x)" as one of the alternative forms, and I trust a computer's math more than I trust my own. 🙃

Also, see equation 7 of https://arxiv.org/abs/1506.02025 which similarly uses "-sgn(x - m)". As for the implementation, I employed a grad checker to validate its correctness. Be aware that the math in the paper is "p-centric" ("I am a pixel, what other pixels map to me?") whereas the implementation is "q-centric" ("I am a pixel, where does my flow point to?") for efficiency.
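For illustration, here is a minimal 1-D NumPy sketch of the q-centric view (a toy scatter for intuition, not the actual CUDA kernel):

```python
import numpy as np

def splat_1d(values, flow):
    # Each source pixel q is scattered to the two integer neighbors of its
    # target position q + flow[q], weighted by 1 - |u_x| (bilinear in 2-D).
    out = np.zeros_like(values)
    for q in range(len(values)):
        x = q + flow[q]  # continuous target position
        left = int(np.floor(x))
        for p, w in ((left, 1.0 - (x - left)), (left + 1, x - left)):
            if 0 <= p < len(out):
                out[p] += w * values[q]
    return out

print(splat_1d(np.array([1.0, 0.0, 0.0]), np.array([0.3, 0.0, 0.0])))
# -> [0.7, 0.3, 0.0]: pixel 0 is split between target pixels 0 and 1
```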

I am also not sure what you are referring to by "failure for finetuning" since we very much fine-tune the optical flow estimator (end-to-end training) and we show significant gains by doing so in Table 1 (see "Ours - PWC-Net" versus "Ours - PWC-Net-ft").

SenZHANG-GitHub (Author) commented Apr 21, 2021

> I am not so sure; https://www.wolframalpha.com/input/?i=derivate+1-abs%28x%29 clearly lists "-sgn(x)" as one of the alternative forms, and I trust a computer's math more than I trust my own.
>
> Also, see equation 7 of https://arxiv.org/abs/1506.02025 which similarly uses "-sgn(x - m)". As for the implementation, I employed a grad checker to validate its correctness. Be aware that the math in the paper is "p-centric" ("I am a pixel, what other pixels map to me?") whereas the implementation is "q-centric" ("I am a pixel, where does my flow point to?") for efficiency.
>
> I am also not sure what you are referring to by "failure for finetuning" since we very much fine-tune the optical flow estimator (end-to-end training) and we show significant gains by doing so in Table 1 (see "Ours - PWC-Net" versus "Ours - PWC-Net-ft").

Thanks a lot for your reply! Yes, the derivative of 1 - |u_x| with respect to u_x is -sgn(u_x) for sure. But since Equation (10) differentiates 1 - |u_x| with respect to F_x(q) rather than with respect to u_x, shouldn't we also multiply by the derivative of u_x with respect to F_x(q), which is -1? And multiplying -sgn(u_x) by -1 gives +sgn(u_x).

If we use WolframAlpha, https://www.wolframalpha.com/input/?i=d%2Fdx%281+-+%7Cp+-+%28q%2Bx%29%7C%29 gives us sgn(p - q - x), which is exactly sgn(u_x), recalling that u_x = p - (q + x).
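The same result comes out symbolically (a small SymPy sketch mirroring that query):

```python
import sympy as sp

p, q, x = sp.symbols('p q x', real=True)
expr = 1 - sp.Abs(p - (q + x))
print(sp.diff(expr, x))  # sign(p - q - x), i.e. +sgn(u_x)
```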

On the other hand, the code is great and looks correct to me. For instance, at line 142 we want to splat q' (fltOutput) to p' (fltNorthwest) and calculate the gradient in Equation (10) with respect to the flow at q'. Here

u_x = intNorthwestX - fltOutputX = (int)(floor(fltOutputX)) - fltOutputX < 0, so sgn(u_x) = -1, and

1 - |u_y| = intSoutheastY - fltOutputY = (int)(floor(fltOutputY)) + 1 - fltOutputY.

And the code is fltNorthwest = ((float) (-1.0)) * ((float) (intSoutheastY) - fltOutputY);

So the implementation is using +sgn(u_x) rather than -sgn(u_x).
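To make this concrete in plain Python (a sketch reusing the variable names above with arbitrary example coordinates, not the actual CUDA kernel):

```python
import math

fltOutputX, fltOutputY = 3.4, 5.8  # arbitrary target position q' + F(q')

intNorthwestX = int(math.floor(fltOutputX))      # p'_x of the northwest corner
intSoutheastY = int(math.floor(fltOutputY)) + 1  # p'_y of the southeast corner

u_x = intNorthwestX - fltOutputX  # always in (-1, 0], so sgn(u_x) = -1

# Gradient of the northwest weight w.r.t. F_x: sgn(u_x) * (1 - |u_y|)
fltNorthwest = -1.0 * (intSoutheastY - fltOutputY)

print(u_x, fltNorthwest)  # -0.4 -0.2, matching +sgn(u_x) * (1 - |u_y|)
```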

I previously thought the gradient might be incorrect and that it might be one of the reasons for the difficulties reported in other issues. Now I think the code is correct and those problems are due to other factors :)

sniklaus (Owner) commented

@Spartan09 This isn't really a good spot for raising this issue. Please create a new, separate issue, either in this repository or in the CuPy repository. I would recommend the latter since the issue seems to be with CuPy, not SoftSplat.

@SenZHANG-GitHub Sorry for not having replied yet, I am swamped with work and haven't had time to look into this more closely.

sniklaus (Owner) commented Oct 4, 2021

My apologies for the delay. I looked into this again and I agree with you. Huge thanks for pointing this out and explaining your reasoning in great detail! I have accordingly added a note to the bottom of the project page: https://sniklaus.com/softsplat

sniklaus closed this as completed Oct 4, 2021