
Matrix-valued DualNumber: coding problems #5

Closed
bsxfan opened this issue May 2, 2013 · 8 comments

bsxfan commented May 2, 2013

Hi All

I would love to contribute to the autodiff tools, but I have hit a hiccup that is making me despondent enough to seriously consider giving up on Julia for a while until it has matured.

I have spent a large effort developing a dual-number type (similar to my existing MATLAB implementation) whose fields can be matrix-valued. This allows one to get forward-mode AD to work smoothly through all kinds of matrix operations, which are then executed efficiently in BLAS/LAPACK. This has worked really well for me in MATLAB.

I have gone to the trouble of translating most of the functionality I have in MATLAB, but now I find I have a piece of code that I simply cannot debug.

If anybody is interested, my code is here:
https://github.com/bsxfan/ad4julia/blob/master/DualNumbers.jl

There are certainly many bugs and missing features remaining in this code, but at the moment I am stuck at the cat() function, which overloads vertical and horizontal concatenation of dual-number matrices.

This function behaves really weirdly. It is supposed to return a DualNum object. It manages to construct that object and can even display it via show() inside my cat() function, but when I return that value, it disappears!

julia> require("DualNumbers.jl")
julia> using(DualNumbers)
julia> a = dualnum(1)
standard part: 1.0
differential part: 0.0

julia> b = dualnum(2)
standard part: 2.0
differential part: 0.0

julia> c = [a b]
here in cat
ST:
1x2 Float64 Array:
1.0  2.0
DI:
1x2 Float64 Array:
0.0  0.0
D:
standard part: 1x2 Float64 Array:
1.0  2.0
differential part: 1x2 Float64 Array:
0.0  0.0    

Note that the stuff that gets displayed is my debug info from cat(). Nothing is returned:

julia> typeof(c)
Nothing

I don't know how to fix this.

@papamarkou (Contributor)

Cheers bsxfan. Although I am not one of the Julia developers, my modest view is that Julia has matured enough for scientific computing. It still needs to advance in terms of graphics and ease of installation on various platforms, but I think it is reasonable to claim that it is ready for technical computing.

From my short experience reading about and coding forward-mode AD: if you want to use dual arithmetic, it is most efficient to implement it by storing as few values as possible. For univariate cases it suffices to define plain dual numbers; see, for instance, the Dual type in the DualNumbers package

https://github.com/scidom/DualNumbers.jl

which does what you were intending to do via the DualNum type. For functions defined on a Cartesian (multivariate) domain, plain dual arithmetic is not efficient: you can store the value of the function once, which leads to dual-like types such as the ones in src/gradual.jl or src/fad_hessian.jl. Check how the relevant types were defined there.
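To make the dual-arithmetic idea concrete, here is a minimal sketch of scalar forward-mode AD in Python (hedged: this is an illustration of the general technique, not the API of DualNumbers.jl or of bsxfan's DualNum type; the names Dual and derivative are made up for this example):

```python
# A dual number x + eps*dx, where eps^2 == 0, carried as (st, di):
# "standard part" and "differential part", mirroring the thread's terminology.
class Dual:
    def __init__(self, st, di=0.0):
        self.st, self.di = float(st), float(di)

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Sum rule: differentials add.
        return Dual(self.st + other.st, self.di + other.di)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (a + eps*b)(c + eps*d) = ac + eps*(ad + bc).
        return Dual(self.st * other.st,
                    self.st * other.di + self.di * other.st)

    __rmul__ = __mul__


def derivative(f, x):
    # Seed the differential part with 1.0 and read off f'(x).
    return f(Dual(x, 1.0)).di


# d/dx (x*x + 3x) at x = 2 is 2*2 + 3 = 7
print(derivative(lambda x: x * x + 3 * x, 2.0))  # 7.0
```

The univariate case really does need only the two stored values per number; the multivariate types mentioned above generalize the differential part to a vector of partials.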

There is an alternative route if you want to make use of BLAS: complex-step differentiation. Although complex-step differentiation is conceptually identical to forward-mode AD, the implementations differ. With the complex step you take a step h numerically close to zero along the imaginary axis, so that ordinary complex arithmetic (and hence BLAS, for instance) performs the differentiation for you. If that is what you have in mind, it would be a nice contribution.
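For reference, the complex-step trick in its simplest form looks like this (a minimal Python sketch of the standard formula f'(x) ≈ Im f(x + ih)/h; the function name is made up for this example):

```python
import cmath


def complex_step_derivative(f, x, h=1e-20):
    # f'(x) ~= Im(f(x + i*h)) / h.
    # Unlike a finite difference, there is no subtraction of nearly
    # equal values, so h can be taken extremely small without
    # cancellation error.
    return f(complex(x, h)).imag / h


print(complex_step_derivative(cmath.sin, 1.0))  # ~ cos(1.0)
```

Because the whole computation is ordinary complex arithmetic, a matrix version can lean on complex BLAS/LAPACK routines directly, which is the appeal papamarkou mentions.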

As a final note, I would suggest not investing too much time in forward AD :) It is rather slow and mostly useful for testing and benchmarking :) For instance, I am currently focusing my effort on source code transformation, which is in general much faster and can hopefully compete with state-of-the-art AD tools.

tshort commented May 2, 2013

That is one weird bug. I can't figure it out.

bsxfan commented May 2, 2013

I have since found at least two similar issues that were recently reported and since fixed:
JuliaLang/julia#2161
JuliaLang/julia#2562
Like mine, both of those involve construction of a parametrized type.

tshort commented May 2, 2013

I managed to get these definitions to work:

vcat(x::DualNum,y::DualNum) = dualnum([x.st, y.st],[x.di, y.di])
hcat(x::DualNum,y::DualNum) = dualnum([x.st  y.st],[x.di  y.di])

bsxfan commented May 2, 2013

Thanks Tom!

Those work when you don't stress them too much, but if, for example, x::DualNum{Float64} and y::DualNum{Float32}, then we get the same problem again. The two components are concatenated successfully, via automatic type promotion inside the existing cat functions, but it seems that when the stack gets too deep, something breaks.

@papamarkou (Contributor)

bsxfan, I looked into your code more carefully, and it seems to offer an alternative way of using BLAS that may be more efficient than I imagined; sorry for the initial erroneous impression. I hope the bug will be sorted out so that we can check how it performs in Julia :)

bsxfan commented May 3, 2013

@tshort thanks very much for your effort in reporting this bug! The speed at which some Julia bugs get fixed is impressive. For me, at GMT+2, it feels like being the shoemaker (http://en.wikipedia.org/wiki/The_Elves_and_the_Shoemaker) who wakes up in the morning to find his problem solved by the work of nocturnal elves :-)

So now I'll continue to work on my DualNum implementation and let you know in this forum when I think it has reached a usable state. One big chunk of work I still need to do is to add bsxfun capability.

@scidom

  • Yes, dual numbers, and forward-mode AD in general, have limited scope for large problems, or when speed is important. And yes, complex-step differentiation is very similar. Nevertheless, I have found my dual-number implementation in MATLAB to be a very valuable tool. There is many a slip twixt the cup and the lip with complex-step differentiation, and if one has another option available, it is much easier to diagnose such problems.
  • Yes, my first Julia dual-number implementation, with scalar components, was very similar to yours (and to complex.jl). But that plan only works up to a point. For example, matrix multiplication works, but inversion, linear equation solution, determinants etc. don't. This is why one needs to explicitly overload matrix operators and functions. So if one wants a more generally applicable dual-number solution, whichever route you choose, you need to write quite a lot of code. I'm using this coding exercise as an opportunity to learn Julia.
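A quick sketch of why inversion needs its own explicit rule rather than falling out of scalar dual arithmetic: using ε² = 0, the inverse of a matrix-valued dual number has a closed form involving the standard part's inverse,

```latex
(A + \varepsilon B)^{-1} \;=\; A^{-1} - \varepsilon\, A^{-1} B A^{-1},
\qquad\text{since}\qquad
(A + \varepsilon B)\,\bigl(A^{-1} - \varepsilon\, A^{-1} B A^{-1}\bigr)
\;=\; I + \varepsilon\,\bigl(B A^{-1} - B A^{-1}\bigr) \;=\; I .
```

so the differential part requires extra solves against A that component-wise scalar rules can never produce; each such matrix operation (inverse, solve, det, ...) needs its own overload.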

@bsxfan bsxfan closed this as completed May 3, 2013
@papamarkou (Contributor)

Thanks bsxfan, this is all very useful. I hope you will contribute even more; it is good to have a general forward-mode AD tool. I haven't tried to implement complex-step differentiation, so I didn't know that it can lead to complications.

In the meantime, working with expressions to do source code transformation involves a steep learning curve; once I have made some progress I will share it here.

Thanks Tom for helping, really appreciated.
