
Get autodiff working #2

Closed
rodluger opened this issue Feb 26, 2018 · 24 comments
Labels
enhancement New feature or request

Comments

@rodluger
Owner

Ultimately we want to use starry for inference, so having analytic derivatives of the light curve will be super useful.

@rodluger rodluger added the enhancement New feature or request label Apr 7, 2018
@ericagol
Collaborator

ericagol commented Apr 26, 2018

I got autodiff working on the s_n(r,b) components. I just finished coding up the s_n function in Julia, and I implemented autodiff using the ForwardDiff package. I haven't added analytic derivatives for the elliptic integrals; ForwardDiff differentiates those as well.

I still haven't computed the transformation and rotation matrices, but those should be straightforward.

@ericagol ericagol self-assigned this Apr 26, 2018
@rodluger
Owner Author

rodluger commented Apr 26, 2018 via email

@rodluger
Owner Author

@ericagol Can you create a julia folder in the top level of the repository and place your code in there?

@ericagol
Collaborator

Yes. I just created a folder called julia/ at the top level of the repo where I placed this code and a preliminary README.md.

@rodluger
Owner Author

Great, thanks!

@rodluger
Owner Author

rodluger commented May 1, 2018

@dfm Autodiff is working beautifully! Thanks for the help.

@rodluger
Owner Author

rodluger commented May 1, 2018

Currently computing gradients of the flux in this test file. Run make test_autodiff to compile it. I'm going to add a gradient option to the pybind interface so we can start using this!

@rodluger
Owner Author

rodluger commented May 2, 2018

@dfm How do I tell Eigen to not compute derivatives for a given variable? Say I have the function

template <typename T>
T testfunction(T& x1, T& x2, T& x3) {
    return x1 + x2 * x2 + x3 * x3 * x3;
}

and I want to compute derivatives with respect to x1 and x2, but not x3. Because of the way I've templated this, all three must have the same type, so if x1 and x2 are AutoDiffScalars, x3 must be one too. A cryptic comment in this example,

/**
* Important note:
* All ActiveScalars which are used in one computation must have
* either a common derivative vector length or a zero-length
* derivative vector.
*/

led me to believe that I could just resize x3.derivatives() to size zero, but that leads to weird assertion errors. Any tips?

@rodluger
Owner Author

rodluger commented May 2, 2018

@dfm I'm pretty sure there's a bug in Eigen: it's related to the reason we had to typecast many of the scalars in function calls to step() and the elliptic integrals to get the code to compile with autodiff. Check out this open Eigen issue and the corresponding code. I'm guessing what happens is that there's an issue with the * operator when the AutoDiffScalar variable has no derivatives. Long story short, if I change my function to

template <typename T>
T testfunction(T& x1, T& x2, T& x3) {
    return T(x1) + T(x2 * x2) + T(x3 * x3 * x3);
}

then everything works as expected. Here's a MWE:

#include <iostream>
#include <iomanip>
#include <cmath>
#include <vector>
#include <string>
#include <algorithm>
#include <Eigen/Core>
#include <unsupported/Eigen/AutoDiff>
using namespace std;
using Grad = Eigen::AutoDiffScalar<Eigen::VectorXd>;

// Dummy function
template <typename T>
T testfunction(T& x1, T& x2, T& x3, T& x4) {
    // This leads to an "Assertion failed" error:
    //return x1 + x2 * x2 + x3 * x3 * x3 * x3 + x4;

    // This compiles and runs fine:
    return T(x1) + T(x2 * x2) + T(x3 * x3 * x3) + T(x4 * x4 * x4 * x4);
}

// Instantiate a Grad type with or without derivatives
Grad new_grad(string name, double value, vector<string>& gradients, int& ngrad) {
    if(find(gradients.begin(), gradients.end(), name) != gradients.end()) {
        return Grad(value, gradients.size(), ngrad++);
    } else {
        return Grad(value);
    }
}

// Let's roll
int main() {

    // The user will supply this vector of parameter names
    // for which we will compute derivatives
    vector<string> gradients;
    gradients.push_back("x1");
    gradients.push_back("x2");

    // Declare our parameters: only the ones the user
    // wants will be differentiated!
    int ngrad = 0;
    Grad x1 = new_grad("x1", 4., gradients, ngrad);
    Grad x2 = new_grad("x2", 3., gradients, ngrad);
    Grad x3 = new_grad("x3", 2., gradients, ngrad);
    Grad x4 = new_grad("x4", 1., gradients, ngrad);

    // Compute the function
    Grad result = testfunction(x1, x2, x3, x4);

    // Print the flux and all the derivatives
    cout << result.value() << endl;
    cout << result.derivatives() << endl;

    return 0;
}

Curiously, if I declare the number of derivatives at compile time using Vector2d instead of VectorXd, then there is no issue. This is specifically a bug when the derivative size is dynamic. But that's the whole point: we want the user to choose which and how many derivatives to compute...
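For reference, here's a minimal standalone example of the fixed-size case (the two-parameter size is just for illustration, not starry's actual setup):

#include <iostream>
#include <Eigen/Core>
#include <unsupported/Eigen/AutoDiff>

// Fixed-size derivative vector: the length (2 here) is known at compile time.
using Grad2 = Eigen::AutoDiffScalar<Eigen::Vector2d>;

int main() {
    Grad2 x1(4., 2, 0);        // value 4, derivative slot 0 of 2
    Grad2 x2(3., 2, 1);        // value 3, derivative slot 1 of 2
    Grad2 y = x1 + x2 * x2;    // no T() casts needed; derivatives() has length 2
    std::cout << y.value() << std::endl << y.derivatives() << std::endl;
    return 0;
}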

I'm going to keep digging -- I'd rather not add T() to everything in my code!

@rodluger
Owner Author

rodluger commented May 2, 2018

FYI: http://eigen.tuxfamily.org/bz/show_bug.cgi?id=1281#c1
Looks like we might just have to add T() to everything...

@dfm
Collaborator

dfm commented May 2, 2018

You definitely won't need to manually cast everything to T. The problem appears when you have inline operations on AutoDiffScalars. It is slightly annoying, but I think you'll be able to fix it pretty fast.
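For example, a sketch of one way to avoid the inline expressions (assuming the fix is simply to name each intermediate result, which forces the same conversion as the explicit T() casts):

// Each intermediate gets its own T, so no bare expression-template
// temporaries leak into the final sum.
template <typename T>
T testfunction(T& x1, T& x2, T& x3) {
    T x2sq = x2 * x2;
    T x3cu = x3 * x3 * x3;
    return x1 + x2sq + x3cu;
}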

@rodluger
Owner Author

rodluger commented May 3, 2018

@dfm Dude:
[image]

@dfm
Collaborator

dfm commented May 3, 2018

Dude! 🎉🎈🍻

@ericagol
Collaborator

ericagol commented May 4, 2018 via email

@rodluger
Owner Author

rodluger commented May 4, 2018

@dfm This is how I'm currently structuring the code:

>>> import starry
>>> map1 = starry.Map()
>>> map1[1, 0] = 1
>>> map1.flux(axis=(0, 1, 0), theta=0.3, xo=0.1, yo=0.1, ro=0.1)
0.9626882655504516
>>> map2 = starry.grad.Map()
>>> map2[1, 0] = 1
>>> map2.flux(axis=(0, 1, 0), theta=0.3, xo=0.1, yo=0.1, ro=0.1)
array([[ 9.62688266e-01,  4.53620580e-04,  0.00000000e+00,
        -6.85580453e-05, -2.99401131e-01, -3.04715096e-03,
         1.48905485e-03, -2.97910667e-01]])

The modules starry and starry.grad are compiled from the same chunk of code, but with a healthy sprinkling of #ifdef STARRY_AUTODIFF to handle AutoDiffScalar-specific implementation details. They therefore have the same classes, methods, properties, and docstrings, but their outputs are of course different. To get this to work, I'm #include-ing that chunk of code twice, once with STARRY_AUTODIFF undefined and once with it defined.
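For concreteness, a rough sketch of the pattern (the file, namespace, and class names here are placeholders, not the actual starry layout):

// maps.h -- the shared chunk of code; no include guard, since it is
// deliberately included twice with different settings
#ifdef STARRY_AUTODIFF
namespace grad {
    using Scalar = Eigen::AutoDiffScalar<Eigen::VectorXd>;
#else
namespace orig {
    using Scalar = double;
#endif

    class Map {
    public:
        Scalar flux();   // identical interface in both modules
    };

}  // namespace grad / orig

// pybind.cpp -- pull the chunk in twice, once per module
#include "maps.h"        // STARRY_AUTODIFF undefined: plain scalars
#define STARRY_AUTODIFF
#include "maps.h"        // defined: AutoDiffScalar version

The pybind11 bindings then expose one Map per namespace, which is roughly how the starry / starry.grad split described above comes about.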

What do you think of this? It certainly hasn't helped my code legibility...

PS: I haven't finished implementing this, but starry.grad.Map().flux() is working on the master branch.

@rodluger
Owner Author

rodluger commented May 7, 2018

Quick update on this. I'm slowly getting things to work with dynamically-sized derivative vectors, which is the ideal way to do this. The most important thing I've learned is that casting to type T is not enough, since type T has an Eigen::Dynamic vector length. I need to actually force the derivative vector of all intermediate variables -- and all function outputs -- to have the correct length. For instance, the following line in flux() doesn't work:

if (b <= ro - 1) return 0;

nor does

if (b <= ro - 1) return T(0);

since neither allocates space for the derivatives of the result. What I have to do is this:

if (b <= ro - 1) return 0 * ro;

where ro is one of the variables I'm differentiating.
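For illustration, the trick amounts to something like this hypothetical helper (not part of the starry code):

#include <Eigen/Core>
#include <unsupported/Eigen/AutoDiff>

using Grad = Eigen::AutoDiffScalar<Eigen::VectorXd>;

// A zero whose derivative vector has the same (dynamic) length as x's --
// which is what `0 * ro` produces implicitly.
Grad zero_like(const Grad& x) {
    return Grad(0.0, Eigen::VectorXd::Zero(x.derivatives().size()));
}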

For some reason I no longer get any compiler errors -- just segfaults when I finally run the code. Debugging this is therefore super tedious. But I'm getting the hang of it.

@dfm
Collaborator

dfm commented May 7, 2018

That does sound tedious! Let me know if there's anything that I can do to help out.

@rodluger
Owner Author

rodluger commented May 7, 2018

Got the flux calculation to work all the way through! Now the user can dynamically choose which and how many derivatives to compute. Gonna take a while for me to clean the code up and push to the master branch, but I think it's downhill from here!

@rodluger
Owner Author

rodluger commented May 9, 2018

Quick update on this: I'm switching back to compile-time defined derivative vector sizes. AutoDiffScalar<VectorXd> is riddled with issues and it's 20-30 times slower than the same calculation with fixed-size derivative vectors. It's actually more efficient to compute all derivatives and let the user choose which ones to output than to selectively compute only a few derivatives.
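Roughly, the new setup looks like this (the derivative count, parameter ordering, and toy function below are illustrative assumptions, not starry's actual layout):

#include <iostream>
#include <Eigen/Core>
#include <unsupported/Eigen/AutoDiff>

// Derivative vector length fixed at compile time (8 is illustrative).
constexpr int NGRAD = 8;
using Grad = Eigen::AutoDiffScalar<Eigen::Matrix<double, NGRAD, 1>>;

int main() {
    // Every input gets its own derivative slot; all NGRAD derivatives are
    // computed, and the wrapper later returns only the ones the user asked for.
    Grad theta(0.3, NGRAD, 0);
    Grad xo(0.1, NGRAD, 1);
    Grad result = theta * theta + xo;           // stand-in for the flux
    std::cout << result.derivatives() << std::endl;  // always length NGRAD
    return 0;
}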

@rodluger
Owner Author

@dfm @ericagol Just need to write it up!
[image]

@rodluger
Owner Author

Closing this issue. There are things that can still be optimized, but I'm happy!

rodluger added a commit that referenced this issue Sep 12, 2019