The gradient of a complex loss function, that is, a function with complex (intermediate) variables and a real output, should be defined as if the complex numbers were simply tuples of real numbers. For this to hold, the adjoint functions of complex functions must satisfy certain rules. I found that the adjoint function for "dot(x, y)" is incorrect in this sense: Zygote defines it as v -> (v*x, v*y), but it should really be v -> (v'*x, v*y). This leads to wrong results for complex gradients. In fact, https://arxiv.org/pdf/2003.04295.pdf provides a simple rule for defining the adjoint function of a complex function (Eq. 15 in the paper), which is almost as simple as the real case. Simplified rules are also provided for the special cases of holomorphic functions, anti-holomorphic functions, and functions with real input or real output.
I checked many holomorphic functions in Zygote, and they seem to obey this rule at least in those cases, but I am not sure whether this holds for all functions.
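The "tuples of real numbers" requirement can be checked numerically without any AD library. The following is a minimal sketch in plain Python (not Zygote or Julia), assuming the convention dot(x, y) = sum(conj(x_i) * y_i) and an arbitrarily chosen loss |dot(x, y) - c|^2; the helper names `real_tuple_gradient`, `loss`, and the test vectors are hypothetical. It compares a finite-difference gradient over real and imaginary parts with a candidate pullback. Under these assumptions the cotangent for x comes out proportional to y (and the one for y proportional to x), with the conjugate landing on v in the first slot.

```python
def dot(x, y):
    # Inner product that conjugates its first argument (assumed convention).
    return sum(a.conjugate() * b for a, b in zip(x, y))

C = 1 + 2j  # arbitrary complex target, only for illustration

def loss(x, y):
    # Real-valued loss built on top of dot.
    return abs(dot(x, y) - C) ** 2

def real_tuple_gradient(f, z, h=1e-6):
    # Treat each z[k] as a (Re, Im) pair and return dL/dRe + i*dL/dIm,
    # estimated with central differences.
    grad = []
    for k in range(len(z)):
        def shifted(delta):
            w = list(z)
            w[k] = w[k] + delta
            return f(w)
        d_re = (shifted(h) - shifted(-h)) / (2 * h)
        d_im = (shifted(1j * h) - shifted(-1j * h)) / (2 * h)
        grad.append(complex(d_re, d_im))
    return grad

x = [1 + 2j, -0.5 + 1j]
y = [0.3 - 1j, 2 + 0.5j]

# Cotangent of the dot output: dL/dRe(z) + i*dL/dIm(z) for L = |z - C|^2.
v = 2 * (dot(x, y) - C)

# Reference gradients from finite differences.
gx_fd = real_tuple_gradient(lambda w: loss(w, y), x)
gy_fd = real_tuple_gradient(lambda w: loss(x, w), y)

# Candidate pullback with a conjugate on v in the first slot.
gx_rule = [v.conjugate() * b for b in y]
gy_rule = [v * a for a in x]

# Same pullback without the conjugate, for comparison.
gx_noconj = [v * b for b in y]

def max_err(a, b):
    return max(abs(p - q) for p, q in zip(a, b))

print("x, with conj:   ", max_err(gx_fd, gx_rule))
print("x, without conj:", max_err(gx_fd, gx_noconj))
print("y:              ", max_err(gy_fd, gy_rule))
```

With a complex cotangent v, only the conjugated version agrees with the finite-difference reference for x; for a real v the two candidates coincide, which is why a check with a purely real downstream loss can hide the difference.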
Basically, I was saying that a rule has been proposed for defining the adjoint function of a complex function, although it may not be new. Some of the adjoint functions defined in Zygote are not compatible with this rule. The function "dot" is one such case, and it is very frequently used. If one builds a complex loss function on top of dot, it is very likely that one would get a wrong gradient.
See the section of the Zygote docs about this. There are multiple consistent interpretations; we have to pick one and let you convert to a different format if you want it (usually just a case of taking an adjoint).
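To illustrate what "multiple consistent interpretations" can mean, here is a small sketch in plain Python (it makes no claim about which convention Zygote actually picks). For a real loss L of a complex variable z, the two candidates dL/dRe(z) + i*dL/dIm(z) and dL/dRe(z) - i*dL/dIm(z) carry the same information and differ only by complex conjugation. The loss Re(z^2) and the names `loss_sq` and `partials` are hypothetical choices for this example.

```python
def loss_sq(z):
    # Real part of the holomorphic function f(z) = z^2.
    return (z * z).real

def partials(f, z, h=1e-6):
    # Central-difference partial derivatives with respect to Re(z) and Im(z).
    d_re = (f(z + h) - f(z - h)) / (2 * h)
    d_im = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)
    return d_re, d_im

z0 = 1.5 - 0.5j
d_re, d_im = partials(loss_sq, z0)

grad_tuple = complex(d_re, d_im)    # "tuple of reals" convention
grad_other = complex(d_re, -d_im)   # conjugate convention

# For f(z) = z^2 the holomorphic derivative is 2*z.
print(grad_tuple, (2 * z0).conjugate())
print(grad_other, 2 * z0)
```

Converting between the two is a single conjugation, which matches the remark that switching formats is usually just a case of taking an adjoint.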