Efficient complex differentiation by structured 2x2 matrices #87

Open
YingboMa opened this issue Jan 10, 2020 · 8 comments
Labels
Complex Differentiation Relating to any form of complex differentiation

Comments

@YingboMa
Member

For efficient complex differentiation, we need to express the following structured matrices:

holomorphic:

[a -b]
[b  a]

anti-holomorphic:

[a  b]
[b -a]

C->R:

[a b]
[0 0]

R->C:

[0 0]
[a b]

general:

[a c]
[b d]
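As a rough illustration of where these shapes come from, the sketch below estimates the real 2x2 Jacobian of a function of z = x + iy by central differences, with rows (real output, imaginary output) and columns (x, y). The helper name `jacobian` and the step size `h` are assumptions made for this example; they are not part of ChainRules.

```julia
# Hypothetical helper: finite-difference Jacobian of f at z, laid out as
# [∂u/∂x ∂u/∂y; ∂v/∂x ∂v/∂y] for f = u + iv.
function jacobian(f, z::Complex; h = 1e-6)
    fx = (f(z + h) - f(z - h)) / (2 * h)            # derivative along x
    fy = (f(z + h * im) - f(z - h * im)) / (2 * h)  # derivative along y
    return [real(fx) real(fy); imag(fx) imag(fy)]
end

z = 1.0 + 2.0im
J_holo = jacobian(w -> w^2, z)  # holomorphic
J_anti = jacobian(conj, z)      # anti-holomorphic
J_c2r  = jacobian(abs2, z)      # C->R
```

Analytically, `w -> w^2` has Jacobian [2x -2y; 2y 2x], `conj` has [1 0; 0 -1], and `abs2` has [2x 2y; 0 0], which match the holomorphic, anti-holomorphic and C->R patterns listed above.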

The Wirtinger derivative achieves this by changing basis from x, y to z, z̄. However, that introduces more FLOPs, since you need to transform back to x, y when multiplying by a complex number, i.e. x = (z + z̄)/2 and y = (z - z̄)/(2i).

IMO, structured matrices are far more transparent than the Wirtinger derivative, and they don't require a change of basis before multiplying with a complex number.

To implement structured matrices, we could do

struct Holomorphic{T,S}
    a::T
    b::S
end

Base.:*(::Holomorphic, ::Complex) = ...
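One way the elided multiplication might be filled in is sketched below. It assumes that `a` and `b` are plain real numbers (the `<:Real` bounds are an assumption added here) and that the product means applying the matrix [a -b; b a] to the vector (real(z), imag(z)).

```julia
# Sketch only: a holomorphic differential stored as the two real entries
# of the structured matrix [a -b; b a].
struct Holomorphic{T<:Real,S<:Real}
    a::T
    b::S
end

# Apply [a -b; b a] to (x, y), where z = x + iy.
function Base.:*(h::Holomorphic, z::Complex)
    x, y = reim(z)
    return complex(h.a * x - h.b * y, h.b * x + h.a * y)
end

h = Holomorphic(2.0, 3.0)
z = 1.0 + 4.0im
# Same result as multiplying by the complex number a + b*im.
@assert h * z == (2.0 + 3.0im) * z
```

Storing only `a` and `b` keeps the holomorphic case at the cost of a single complex multiplication.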

@oxinabox
Member

Would these structured matrices subtype AbstractDifferential rather than AbstractMatrix?
Not that differentials have to subtype AbstractDifferential.

But they are not matrices as such, since they don't multiply with scalars as matrices do, do they?
How do they multiply with complex numbers?

@YingboMa
Member Author

YingboMa commented Jan 10, 2020

A complex number can be interpreted as

[a -b]
[b  a]

So multiplying by a complex number is the same as multiplying by a ::Holomorphic matrix.
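The correspondence can be checked numerically. In the sketch below, `asmatrix` is a hypothetical helper name introduced only for this example.

```julia
# Hypothetical helper: the real 2x2 matrix that represents multiplication by w.
asmatrix(w::Complex) = [real(w) -imag(w); imag(w) real(w)]

w = 2.0 + 3.0im
z = 1.0 + 4.0im

# The matrix-vector product agrees with complex multiplication.
@assert asmatrix(w) * [real(z), imag(z)] == [real(w * z), imag(w * z)]

# Composing two such maps agrees with the complex product.
@assert asmatrix(w) * asmatrix(z) == asmatrix(w * z)
```

The second check shows that matrices of this shape are closed under multiplication, so chaining holomorphic rules stays within the same structured type.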

@oxinabox
Member

oxinabox commented Jan 10, 2020

Ah, OK. So all parameters in the matrices are Real / differentials for real primals.

@YingboMa
Member Author

Yes, a, b, c, d ∈ ℝ.

@simeonschaub
Member

I have been thinking about exactly this as well and can definitely see your point. On the other hand, what's nice about Wirtinger is that it is very straightforward to take advantage of a function being holomorphic or anti-holomorphic by just using Zero() instead of another type. In these cases, it's also no less efficient than what you're proposing, and because we can just use complex numbers, IMO it is easier to read and write rules and to implement arithmetic on them. The R->C and C->R cases should not use Wirtinger at all. I believe it is much more intuitive to use just a complex number and a wrapper type containing a complex number here, so that the information about the range and domain is transparent at the type level.
What I'm arguing, then, is the following: currently, ChainRules doesn't contain any primitive functions that aren't holomorphic, anti-holomorphic, R->C or C->R, so in rules we basically always use Wirtinger together with Zero(). In these cases, Wirtinger saves us from needing two extra structured matrix types and having to implement the corresponding arithmetic on each. The only situation I can think of where you would need a full Wirtinger object is when you're adding the differentials of a holomorphic and an anti-holomorphic function, and in those cases we can decide down the road to construct a 2x2 Jacobian matrix instead as an optimization. We should keep this discussion in mind, but I believe it still makes sense to keep working on R->C, C->R and Wirtinger differentials first and worry about this once we have worked those out properly.

@YingboMa
Member Author

YingboMa commented Jan 10, 2020

I am not sure about using Zero() and Wirtinger to avoid more types is a good thing. The fact that you can use Zero() and Wirtinger to represent complex derivatives is because you remember how to translate Wirtinger and the position of Zero to different types of functions, e.g. holomorphic or anti-holomorphic.

One could implement ComplexDifferentiationResult{1..5} to represent 2x2 matrices, and that is just one type. Plus, computing the Wirtinger derivative of a function is inconvenient, whereas computing its Jacobian matrix is straightforward.

The point of using different types of 2x2 matrices is exactly not to use Zero and Wirtinger, because that unnecessarily obfuscates the computation: it forces people to learn about the Wirtinger derivative, which is just another basis and is not necessary to make AD work well. In addition, doing ::Wirtinger * ::Complex takes more operations than ::GeneralMatrix * ::Complex, because the Wirtinger derivative is expressed in the z, z̄ basis.
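To make the operation-count claim concrete, here is a sketch that writes both products out in real arithmetic. The function names are hypothetical, and the counts are naive ones that ignore any regrouping a compiler or a smarter implementation might do.

```julia
# General Jacobian [a c; b d] applied to z = x + iy:
# 4 real multiplications and 2 real additions.
function apply_matrix(a, b, c, d, z::Complex)
    x, y = reim(z)
    return complex(a * x + c * y, b * x + d * y)
end

# Wirtinger pair p = ∂f/∂z, q = ∂f/∂z̄ applied to z as p*z + q*conj(z).
# Written out naively: 8 real multiplications and 6 real additions/subtractions.
function apply_wirtinger(p::Complex, q::Complex, z::Complex)
    x, y = reim(z)
    p_re, p_im = reim(p)
    q_re, q_im = reim(q)
    out_re = (p_re * x - p_im * y) + (q_re * x + q_im * y)
    out_im = (p_re * y + p_im * x) + (q_im * x - q_re * y)
    return complex(out_re, out_im)
end

z = 1.0 + 4.0im
# Both calls describe the same linear map, so they agree.
@assert apply_matrix(1.0, 2.0, 3.0, 4.0, z) ≈ apply_wirtinger(2.5 - 0.5im, -1.5 + 2.5im, z)
```

Regrouping the Wirtinger form as (p_re + q_re) * x and so on recovers the four-multiplication count, but that regrouping is the same as converting to the matrix entries first.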

All in all, I still don't see a compelling argument for using the Wirtinger derivative.

@nickrobinson251
Contributor

xref #40

@oxinabox
Member

I am not sure about using Zero() and Wirtinger to avoid more types is a good thing.

Indeed types are cheap.


One thing I will note:
It's totally possible for us to support multiple differential types that represent the same derivative.
We basically have to, see e.g. #85

Whether this is wise or not, I am not sure.
But we can have both schemes, since it is possible to convert between the two.

It may be useful to have both, which I can imagine.
E.g. if some physics math is much easier to write with Wirtinger,
we can provide the functionality to convert rules that return ComplexDifferentiationResult to Wirtinger to do the end math, or the reverse, to use rules written with Wirtinger in a larger system using ComplexDifferentiationResult.
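A minimal sketch of such a conversion follows. `Jacobian2x2` and `WirtingerPair` are hypothetical stand-ins invented for this example; they are not the ChainRules `Wirtinger` type or the proposed `ComplexDifferentiationResult`. The formulas follow from ∂f/∂z = (f_x - i f_y)/2 and ∂f/∂z̄ = (f_x + i f_y)/2, using the [a c; b d] layout from the top of the thread.

```julia
# Hypothetical types for illustration only.
struct Jacobian2x2{T<:Real}    # the matrix [a c; b d] acting on (dx, dy)
    a::T
    b::T
    c::T
    d::T
end

struct WirtingerPair{T<:Real}  # df = dfdz * dz + dfdzbar * conj(dz)
    dfdz::Complex{T}
    dfdzbar::Complex{T}
end

to_wirtinger(J::Jacobian2x2) = WirtingerPair(
    complex(J.a + J.d, J.b - J.c) / 2,
    complex(J.a - J.d, J.b + J.c) / 2,
)

to_jacobian(W::WirtingerPair) = Jacobian2x2(
    real(W.dfdz) + real(W.dfdzbar),  # a
    imag(W.dfdz) + imag(W.dfdzbar),  # b
    imag(W.dfdzbar) - imag(W.dfdz),  # c
    real(W.dfdz) - real(W.dfdzbar),  # d
)

# A holomorphic Jacobian [a -b; b a] has a vanishing z̄ component.
W = to_wirtinger(Jacobian2x2(2.0, 3.0, -3.0, 2.0))
@assert W.dfdz == 2.0 + 3.0im && iszero(W.dfdzbar)
```

The conversion costs only a handful of real additions in each direction, so rules written in one scheme could be consumed by a system built on the other.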

@nickrobinson251 nickrobinson251 added the Complex Differentiation Relating to any form of complex differentiation label Jan 25, 2020