Automatic differentiation for complex numbers #29
So this is actually working entirely by accident; from Zygote's perspective a complex number is just a pair of reals. I'd be happy to help you start to add real support for this, though. The first step would be to add the relevant gradient definitions. The other thing, apart from making this work, is developing a real API for it; see here for some good thinking on that.
I just added those definitions in f2a8560, so a couple of simple things work now.

```julia
julia> gradient(x -> abs2(log(x)), 1+2im)
(0.21478062759933578 - 0.29549955871733735im,)
```

Let me know if I can help with any more support for this stuff.
Zygote is like magic! I sincerely hope it can be stabilized and documented as soon as possible, so that we can try cool stuff on it. Since I am not familiar with this brand-new framework, could you please explain a bit why it is able to treat a complex number as a tuple by accident? Is it decoding the underlying structure of an instance?
Glad you like it :) That's the gist of it. As a source-to-source AD, Zygote sees (and has to differentiate through) everything in your code, including things like the creation of structs. So if you write something like

```julia
struct Foo; x; end
foo.x
```

this effectively has a gradient definition like

```julia
@grad getfield(f::Foo, :x) = f.x, dx -> ((x = dx,),)
```

where we've by default decided that the "gradient" of the foo struct should be a named tuple. (You can of course change this, which is what I did in the complex numbers commit.)
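To make the pair-of-reals / named-tuple idea concrete, here is a toy sketch in Python (chosen so it runs without Zygote; this is not Zygote's actual machinery, and the helper name is made up): field access on a struct returns the value plus a pullback that routes the incoming adjoint into a named-tuple-like container for the whole struct.

```python
# Toy sketch of a struct-field pullback (not Zygote's real code).
class Foo:
    def __init__(self, x):
        self.x = x

def getfield_pullback(obj, name):
    # Forward value, plus a pullback mapping the field adjoint dx
    # into a "named tuple" adjoint for the whole struct.
    return getattr(obj, name), (lambda dx: {name: dx})

val, back = getfield_pullback(Foo(3.0), "x")
grad_foo = back(1.0)
# val == 3.0, grad_foo == {"x": 1.0}
```

Seen this way, a complex number is just a two-field struct, so reverse mode "works" on it even before any complex-specific rules exist.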
Even with the fix on master I think this is incorrect?

```julia
using Zygote
myeps = 1e-8
mydiff(f, z) = ((f(z + myeps) - f(z)) / myeps, (f(z + im * myeps) - f(z)) / myeps)
f(z) = abs2(log(z))
@show gradient(f, 1.0 + 2im)
@show mydiff(f, 1.0 + 2im)
```

I'm not 100% sure how those should match up, though.
To clarify what's going on, a simple example: f(z) = imag(log(z)) at z = 1 + im. The gradient computation is (with h an infinitesimal) f(z + h) - f(z) ≈ imag(h/z) = real(h) * imag(1/z) + imag(h) * real(1/z), hence the gradient is (imag(1/z), real(1/z)) = (-0.5, 0.5). I'm guessing the AD code currently does something like assuming imag is holomorphic, hence the error?
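A quick finite-difference check of this, taking the example function to be f(z) = imag(log(z)) at z = 1 + 1j (consistent with the stated gradient (imag(1/z), real(1/z))); written in Python since it runs without Zygote:

```python
import cmath

def grad_fd(f, z, h=1e-6):
    # Partials of a real-valued f with respect to Re(z) and Im(z),
    # by central differences.
    dx = (f(z + h) - f(z - h)) / (2 * h)
    dy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)
    return dx, dy

f = lambda z: cmath.log(z).imag
dx, dy = grad_fd(f, 1 + 1j)
# dx ≈ imag(1/(1+1j)) = -0.5 and dy ≈ real(1/(1+1j)) = 0.5
```

The numeric partials match the hand derivation, so any AD result disagreeing with (-0.5, 0.5) here is wrong.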
@antoine-levitt Thanks for the explanation, but I think Zygote already treats a complex number as a tuple of real numbers. The problem here is probably in log? One thing we should notice is that gradients are different from derivatives for complex numbers (see the first equation in the reference above).
An easy fix should be wrapping a `conj` around every holomorphic function's gradient in the backward pass.
Indeed, that was the problem; the correct definition is the conjugated one. I got confused by the mentions of holomorphy; this has nothing to do with holomorphy (it's just that for holomorphic functions the gradient operation happens to be a complex multiplication). There's the possibility of nasty bugs here, with gradients defined with real numbers in mind getting used with complex numbers. As long as the real functions are typed appropriately this should be fine.
You are right, and it seems the function in f2a8560 should be fixed. |
Yeah this is probably just a bug in the gradient definition. I've pushed a fix but I haven't been that careful about implementing this (we really need gradient checking tests and so on); I'm happy to help anyone who wants to hack on that or improve things generally. |
Would this be expected to work in the latest release?

```julia
using Zygote
f(z) = z^2
gradient(f, 1+5im)
```

Currently it gives an error "Function output is not scalar" because of this check. Now, I know the complex derivative isn't defined for non-holomorphic functions, so I'm not sure whether this should work in general.

edit: I no longer think it's a good idea to have this.
Seems fishy to use that here.
I think it'd be fine to just change that restriction to allow complex outputs.
Bump. I really would like this to happen. The following errors on master:

If some changes to master are needed and they are not extraordinarily complicated, I'm more than happy to do it myself if you guide me.
Fixed that example on 2c30968; I just forgot the gradient for that operation.

```julia
julia> gradient(x -> x^2, 1+5im)
(2 - 10im,)
```

I'm closing this issue for now; I think the core is there (broadcast works too, thanks to #100), so we can open new issues for any specific things that break.
I think this is a really bad idea. It will silently fail if the function is not holomorphic:
or even if the function is holomorphic but has non-holomorphic parts inside:
The only target this could plausibly aim at is the optimized computation of holomorphic C^n → C functions (https://en.wikipedia.org/wiki/Several_complex_variables). This seems to me a marginal use case compared to computing derivatives of holomorphic C → C functions (better tackled by ForwardDiff) or gradients of C^n → R functions (which already work), and it can be done efficiently by computing the gradient of the real part and using the Cauchy-Riemann equations.
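A sketch of that last point (in Python so it runs standalone; finite differences stand in for AD of the real part): for a holomorphic f, the full derivative is recoverable from u = Re f alone via the Cauchy-Riemann equations, f'(z) = ∂u/∂x − i ∂u/∂y.

```python
import cmath

def holo_derivative(f, z, h=1e-6):
    # Differentiate only the real part of f, then reassemble f'(z)
    # via Cauchy-Riemann: f' = du/dx - i * du/dy.
    u = lambda w: f(w).real
    du_dx = (u(z + h) - u(z - h)) / (2 * h)
    du_dy = (u(z + 1j * h) - u(z - 1j * h)) / (2 * h)
    return complex(du_dx, -du_dy)

z0 = 0.5 + 0.5j
d = holo_derivative(cmath.exp, z0)
# d ≈ exp(z0), the usual holomorphic derivative
```

So a "gradient of the real part" primitive is enough to serve the holomorphic use case, without a separate complex-derivative mode.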
The second example is a bug that would have come up anyway; I pushed a fix.

```julia
julia> gradient(x -> (real(x)+imag(x)*im)^2, 1+5im)
(2 - 10im,)
```

In general, any issues that come up with this are going to come up if you take the gradient of real(f(x)) as well. For the first example:
I just don't think gradients of complex-valued functions should be defined at all. Assuming you still want to do it, there are three cases for scalar-valued functions:

Given a C^n → C function, how do you choose what to return between the last two? This change chose version 2. Given this, there are two options: either blacklist non-holomorphic operations (real, imag, conj, abs) on complex numbers, or accept that a wrong result will be silently returned for functions that are non-holomorphic in any part of their computational graph, even if they are themselves holomorphic (as in the example above). This is why I think the gradient of complex-valued functions should just be left undefined. Users can choose whether they want to compute Jacobians or holomorphic derivatives themselves.
Why? The gradient of a C^n to R function is well-defined, and follows by just seeing C as R^2. |
Hmm, I just tried, and:

This is wrong according to the only "reasonable" definition of gradient for a C^n → R function: ∇f[i] = ∂f/∂Re(z_i) + ∂f/∂Im(z_i) * im. This is the definition that follows from seeing ComplexF64 as a struct of two reals, and the definition that is useful for e.g. gradient-descent optimization.
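A small sketch of why that convention is the useful one for optimization (Python, hypothetical objective): with g = ∂f/∂x + i ∂f/∂y, the plain update z -= lr * g descends f. Here f(z) = |z − (3+4j)|², minimized at z = 3+4j.

```python
target = 3 + 4j
f = lambda z: abs(z - target) ** 2

def grad(z, h=1e-6):
    # Gradient in the "pair of reals" convention: d/dx + i * d/dy.
    gx = (f(z + h) - f(z - h)) / (2 * h)
    gy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)
    return complex(gx, gy)

z = 0j
for _ in range(100):
    z -= 0.1 * grad(z)   # plain gradient descent on the complex variable
# z ≈ 3 + 4j
```

Any other convention (e.g. dropping the conjugation) would point the update in the wrong direction for some objectives.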
Ok, so we need to be careful about any non-holomorphic functions in DiffRules. Fixed in e1b00f7. I agree that neither of those options is good, but I don't yet see that they are inevitable. Let's zero in on:

I'd like to see an example, work through any issues with our current gradient definitions, and see whether there's a fundamental conflict. I don't currently see how this can be the case: there's only one gradient for a holomorphic function, and if Zygote doesn't get that right then it's a bug.
Related to the just-submitted fix in DiffRules: is it really possible to approach this with the suggested blacklist? Isn't a whitelist a better solution, as most functions will not be holomorphic?

On the more general question: I would be eager to provide examples, but as @antoine-levitt alluded to, Zygote currently does not seem to have an official stance on the definition of the derivative of a C^n → C function. One definition is "a 2n-by-2 Jacobian of real values" (the only general one I can think of). Other definitions have to deal with whether the function is holomorphic or not. In particular, here are some confusing results on current master. All of these should be 2×2 real matrices in my opinion:

```julia
julia> gradient(z -> real(z), 10+20im)
(1 + 0im,)

julia> gradient(z -> imag(z), 10+20im) # I do not understand this result
(0 + 1im,)

julia> gradient(z -> real(z)+imag(z), 10+20im) # this seems definitely wrong
(1 + 1im,)
```

Compare these two, which are supposedly the same function, but one of them is explicitly holomorphic:
I think I disagree with this statement. There are derivative-like operations that make sense in mathematical sets that have nice structure (e.g. Lie derivatives or directional derivatives in differential geometry; complex derivatives of holomorphic functions in complex analysis). These derivative-like operations share the important property that they are basis-independent. When implemented on a computer (unless we are using symbolic algebra systems), we usually choose a basis (e.g. x+iy or r*exp(iφ) for complex analysis). The moment we have chosen a basis, all the nice basis-independent entities can be expressed numerically, but we also lose all promises that the derivative-like operations continue to make sense. The real numbers are an exception, because they are 1D, where these issues simply do not arise. For everything else we have to explicitly choose to interpret expressions in basis-independent form (e.g. holomorphic functions act like that) or in basis-dependent form (e.g. using real, imag, and conj). There is no general way to know whether a particular basis-dependent expression is actually equivalent to a basis-independent one: it is in general undecidable whether a mixture of imag/real/conj/abs/angle ends up being a holomorphic function. Edit: The use of the word "basis" above was mathematically imprecise / abusive. The correct term would be a "coordinate patch" or something similar.
The cases you've listed are actually the easy ones: if the function output is real, there's no ambiguity. Zygote's current position on complex gradients is that

```julia
y, pullback = forward(f, x)
dx = pullback(1)
```

naturally generalises to complex numbers; it's actually disabling them that involves a special case. You can also think of this as being the gradient of real(f(x)). The only other option we have, AFAICT, is to return a Jacobian. But I think this has to be an explicit, separate function. Reasoning: writing …
Thanks, that clarified some things. The gradient as defined for a C^n → R function makes sense to me. The derivative of an explicitly holomorphic function makes sense too. The intermediate regime between the two (functions that are holomorphic, but written in a way that does not make that explicit) is very confusing for me... Could you please help me reason through this example (I think it is a bug)?

```julia
# this is the z -> z function written in polar coordinates
julia> gradient(z -> sqrt(abs2(z))*exp(1im*angle(z)), 10+20im)
(0.19999999999999996 + 0.4000000000000001im,)
```

EDIT: there were a couple more examples that were not actually as problematic. I erased them in order not to distract.
To rephrase: for f : C → C, the current gradient either returns the gradient of real(f), which is useful e.g. for optimization, or it returns the holomorphic derivative, which is useful for holomorphic functions. The latter also happens to be the first Wirtinger derivative, and the former happens to be a sum of the (conjugated) Wirtinger derivatives, conj(∂f/∂z) + ∂f/∂z̄. In the language of this post, Zygote currently switches between the "realistic" and the "native" view of the problem, and it is not clear how it makes that choice: http://www.ekinakyurek.com/complex-derivatives-wirtinger/
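The Wirtinger derivatives mentioned here can be computed numerically from the standard definitions ∂f/∂z = ½(∂f/∂x − i ∂f/∂y) and ∂f/∂z̄ = ½(∂f/∂x + i ∂f/∂y); a Python sketch (finite differences, not Zygote):

```python
import cmath

def wirtinger(f, z, h=1e-6):
    # Both Wirtinger derivatives from the x- and y-direction partials.
    df_dx = (f(z + h) - f(z - h)) / (2 * h)
    df_dy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)
    return 0.5 * (df_dx - 1j * df_dy), 0.5 * (df_dx + 1j * df_dy)

dz, dzbar = wirtinger(cmath.exp, 0.3 + 0.4j)              # dz ≈ exp(z), dzbar ≈ 0
dz2, dzbar2 = wirtinger(lambda z: z.conjugate(), 1 + 2j)  # dz2 ≈ 0, dzbar2 ≈ 1
```

For holomorphic functions ∂f/∂z̄ vanishes (Cauchy-Riemann), which is why the two views coincide there and diverge everywhere else.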
Maybe …
Hm, thinking a bit more, I see I was straw-manning against something that was proposed for ForwardDiff, and so I was confused: if done correctly, Zygote will actually do the right thing: (implicitly) propagate the 2×2 Jacobian inside the computational graph, and only assume holomorphy for the last step. This is actually quite clever. OK then, this thing should actually work: it's just that for non-holomorphic functions it will not return the full derivative information (that would be the 2×2 Jacobian). This has the potential to be very confusing and should be documented appropriately at some point, but at least it's consistent. Now we just have to make sure it works correctly.
Ok, great, I'm glad we're all on roughly the same page; this has certainly been an enlightening discussion for me. @Krastanov yes, that was indeed a bug, which is fixed now. I'll also cross-reference that ForwardDiff issue: JuliaDiff/ForwardDiff.jl#157. To summarise: we have two derivatives we might be interested in, the Wirtinger derivatives and the "Jacobi" ones (i.e. columns of the Jacobian of the equivalent R^2 → R^2 function).
Hopefully it's clear that Zygote isn't switching behaviour at any point here; it just happens that the current Jacobi derivative lines up with more standard things where they exist. Ideally, where it doesn't (complex non-holomorphic), we'd just throw an error, but of course we can't tell automatically.
One thing that might be confusing is that the gradient (as defined here, i.e. the gradient of the real part of f) of a holomorphic function is the conjugate of the usual derivative. As long as this is well documented somewhere, it's fine.
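A numeric check of this conjugation (Python, finite differences): for f(z) = z², f'(z) = 2z, so at 1+5j the usual derivative is 2+10j, while the gradient of u = Re f (written ∂u/∂x + i ∂u/∂y) comes out as 2−10j, matching the gradient(x -> x^2, 1+5im) output earlier in the thread.

```python
def grad_of_real_part(f, z, h=1e-6):
    # Gradient of Re f in the "pair of reals" convention.
    u = lambda w: f(w).real
    return complex((u(z + h) - u(z - h)) / (2 * h),
                   (u(z + 1j * h) - u(z - 1j * h)) / (2 * h))

g = grad_of_real_part(lambda z: z * z, 1 + 5j)
# g ≈ 2 - 10j, the conjugate of f'(1+5j) = 2 + 10j
```

So "gradient = conj(derivative)" for holomorphic functions is a direct consequence of the convention, not a bug.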
Yes, @ssfrr just raised that point on Slack as well. It's an extra complication. I think this is probably in some sense consistent with the fact that Zygote produces adjoints, but we should think it through carefully. Hope he won't mind me reposting his thoughts:

I think it makes some sense for …
Normally, what we care about is the back-propagation of … Consider a function … BTW: is … Can't agree more on separating …
Yes, to be clear: the core technical problem is solved, and so gradients of …
From @MikeInnes:

I'd think …
Yes, I've disabled that in d82581f (that statement wasn't meant as an argument for or against that choice). Although I've specifically disabled …
Great, I quickly checked several cases (master branch), and the gradients turn out to be correct.
@MikeInnes, thank you, this is pretty amazing. And yes, your last post made it very clear that "it's clear that Zygote isn't switching behaviour at any point here". |
Ok, we now have docs up that clarify these things: http://fluxml.ai/Zygote.jl/dev/complex/ Thanks again for the discussion, and please do report any bugs! |
I find Zygote supports complex number AD; I like it more now! But I have several questions.

Question 1: Why does it return a NamedTuple rather than a complex number?

Question 2: Why does the following expression complain with a not-implemented error? log is holomorphic and the complex gradient for abs is already defined in Zygote, so there is no reason to break here. I suggest creating some traits for functions (is that possible?!) to help the program know that log is holomorphic.

Question 3: About convention: is the way Zygote defines gradients for complex numbers proper? We have two conventions here; I think they are conjugate to each other. I have an implementation of the latter one here: https://github.com/GiggleLiu/poorman_nn/blob/44b54d4a86f689f528a301e3a4db6b05210bb16a/poornn/functions.py#L1034 I chose the latter convention because we want gradient(log, x) = log'(x) = 1/x rather than gradient(log, x) = log'(x)' = 1/x^*; it is more convenient for holomorphic functions, right?