adjoint function for complex functions #540

Closed
guochu opened this issue Mar 10, 2020 · 2 comments

guochu commented Mar 10, 2020

The gradient of a complex loss function, meaning a function with complex (intermediate) variables and a real output, should be defined as if each complex number were simply a tuple of real numbers. For this to hold, the adjoint functions of complex functions must satisfy certain rules. I found that the adjoint function for "dot(x, y)" is incorrect in this sense: Zygote defines it as v -> (v*x, v*y), but it should really be v -> (v'*x, v*y). This leads to wrong results for complex gradients. In https://arxiv.org/pdf/2003.04295.pdf a simple rule for defining the adjoint function of a complex function is given (Eq. 15 in the paper), and it is almost as simple as in the real case. Simplified rules are also provided for the special cases of holomorphic functions, anti-holomorphic functions, and functions with real input or real output.

I checked many holomorphic functions in Zygote, and it seems to obey this rule at least in those cases, but I am not sure whether this holds for all functions.
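
To make the "tuples of real numbers" reading concrete, here is a minimal numeric sketch in plain Python rather than Julia. It is not Zygote's implementation: the helper names (`dot`, `loss`, `real_tuple_gradient`) are hypothetical, and the sketch assumes a dot product that conjugates its first argument and a loss equal to its real part. It perturbs the real and imaginary parts of each entry separately and packs the two partial derivatives as dL/da + i*dL/db.

```python
# Illustrative only: a finite-difference reference, not Zygote's adjoint.

def dot(x, y):
    """Inner product that conjugates its first argument (assumed convention)."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def loss(x, y):
    """A real-valued loss built on top of the complex dot product."""
    return dot(x, y).real

def real_tuple_gradient(f, z, h=1e-6):
    """Gradient of a real-valued f, treating each z[k] = a + ib as the pair (a, b).

    Each entry of the result is packed as dL/da + i*dL/db.
    """
    grad = []
    for k in range(len(z)):
        def shifted(delta):
            w = list(z)
            w[k] = w[k] + delta
            return f(w)
        d_re = (shifted(h) - shifted(-h)) / (2 * h)            # perturb real part
        d_im = (shifted(1j * h) - shifted(-1j * h)) / (2 * h)  # perturb imaginary part
        grad.append(complex(d_re, d_im))
    return grad

x = [1 + 2j, -0.5 + 0.25j]
y = [3 - 1j, 2 + 4j]

grad_x = real_tuple_gradient(lambda w: loss(w, y), x)
grad_y = real_tuple_gradient(lambda w: loss(x, w), y)
```

For this particular loss the packed gradient with respect to `x` comes out as `y`, and the one with respect to `y` as `x`, up to rounding. A hand-written adjoint can be checked against a numeric reference of this kind.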

guochu commented Mar 12, 2020

Basically, I was saying that there is a rule for defining the adjoint function of complex functions, although this may not be new. Some of the adjoint functions defined in Zygote are not compatible with this rule; "dot" is one such case, and it is very frequently used. If one builds a complex loss function on top of dot, one is very likely to get a wrong gradient.

MikeInnes commented

See the section of the Zygote docs about this. There are multiple consistent interpretations; we have to pick one and let you convert to a different format if you want it (usually just a case of taking an adjoint).
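
As a rough illustration of how two such interpretations can differ only by a conjugate, the plain-Python sketch below compares two common ways of packing the derivative of a real-valued function of z = a + ib. The example function and the names are assumptions made for illustration; the thread does not say which convention Zygote returns, so the docs remain the reference for that.

```python
# Illustrative only: two packings of the same pair of real partial derivatives.

def f(z):
    """A real-valued, non-holomorphic example function: |z|^2."""
    return abs(z) ** 2

def partials(func, z, h=1e-6):
    """Central differences with respect to the real and imaginary parts of z."""
    d_a = (func(z + h) - func(z - h)) / (2 * h)
    d_b = (func(z + 1j * h) - func(z - 1j * h)) / (2 * h)
    return d_a, d_b

z = 1.5 - 0.5j
d_a, d_b = partials(f, z)

tuple_gradient = complex(d_a, d_b)    # df/da + i*df/db
wirtinger = 0.5 * complex(d_a, -d_b)  # df/dz = (df/da - i*df/db) / 2

# For real-valued f the two are related by a conjugate and a factor of 2.
converted = 2 * wirtinger.conjugate()
```

Here `converted` agrees with `tuple_gradient` up to rounding, so moving between these two conventions needs no extra derivative computation.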
