Derivatives of General Functions #17

Closed
kewp opened this issue Mar 16, 2019 · 8 comments
Labels
question Further information is requested

Comments

@kewp

kewp commented Mar 16, 2019

I think I might have made a mistake in my understanding of autodiff. Taking the reverse single variable example as a starting point:

var f(var x)
{
    return 1 + x + x*x + 1/x + log(x);
}

int main()
{
    var x = 2.0;                         // the input variable x
    var u = f(x);                        // the output variable u

    Derivatives dud = derivatives(u);    // evaluate all derivatives of u

    var dudx = dud(x);                   // extract the derivative du/dx

    cout << "u = " << u << endl;         // print the evaluated output u
    cout << "du/dx = " << dudx << endl;  // print the evaluated derivative du/dx
}

This outputs

u = 8.19315
du/dx = 5.25
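For reference, both printed values agree with evaluating f and its derivative by hand at x = 2 (a quick check, with log taken as the natural logarithm):

```latex
\[
f(x) = 1 + x + x^2 + \frac{1}{x} + \ln x
\quad\Longrightarrow\quad
f'(x) = 1 + 2x - \frac{1}{x^2} + \frac{1}{x}
\]
\[
f(2) = 1 + 2 + 4 + 0.5 + \ln 2 \approx 8.19315,
\qquad
f'(2) = 1 + 4 - 0.25 + 0.5 = 5.25
\]
```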

Now look what happens when I replace this with the following:

var get_x()
{
    var x = 0.0;
    int i; for (i=0; i<2; i++)
        x += 1;
        
    return x;
}

and then

    var x = get_x();                         // the input variable x
    var u = f(x);                        // the output variable u

So I'm now getting x from a function that builds it up in a loop. This is now the output:

u = 8.19315
du/dx = 0

What happened? Even though x still equals 2.0, I can't get a derivative. Why is my var x suddenly not usable as the variable to differentiate with respect to?

@allanleal
Member

Hi @kewp - thanks for reporting this.

Could you please post a complete example to help me test and identify what is going on?

@kewp
Author

kewp commented Mar 16, 2019

No problem. Here is the full code:

// C++ includes
#include <iostream>
using namespace std;

#include <eigen3/Eigen/Core>
using namespace Eigen;

#include <autodiff/reverse.hpp>
#include <autodiff/reverse/eigen.hpp>
using namespace autodiff;

var get_x()
{
    var t = 0.0;
    int i; for (i=0; i<2; i++)
        t += 1;

    return t;
}

var f(var x)
{
    return 1 + x + x*x + 1/x + log(x);
}

int main(int argc, char *argv[])
{
    var x = get_x();                         // the input variable x
    var u = f(x);                        // the output variable u

    Derivatives dud = derivatives(u);    // evaluate all derivatives of u

    var dudx = dud(x);                   // extract the derivative du/dx

    cout << "u = " << u << endl;         // print the evaluated output u
    cout << "du/dx = " << dudx << endl;  // print the evaluated derivative du/dx

    return 0;
}

I'm getting the output

u = 8.19315
du/dx = 0

To "fix" this, change var t = 0.0; to double t = 0.0; on line 14 (inside get_x).

@allanleal
Member

In the earlier implementations of the reverse algorithm, one could retrieve the derivative of a variable with respect to any intermediate variable in the expression tree. For example, it would be possible to retrieve the derivative of u w.r.t. z in the code below in the Derivatives container:

var x = 1.0; // x is a leaf variable (in a leaf node of the expression tree)
var y = 2.0; // y is a leaf variable
var z = x * y; // z is an intermediate variable (in an intermediate node of the expression tree)
var u = sin(z); // u is the root variable (in the root node of the expression tree)

Derivatives dud = derivatives(u);

However, this is not an appropriate default behavior, because it can dangerously grow the Derivatives container to sizes beyond the control and awareness of the user. Your for-loop, for example, could run for thousands of iterations, and then we would need to record the derivatives w.r.t. all of these intermediate, temporary variables for later retrieval.

The decision was then to limit the recorded derivatives to those w.r.t. leaf variables only (e.g., x and y above).

Maybe what could be envisioned in the future is a method, say, allderivatives, that would fully expose to the user all derivatives w.r.t. intermediate variables/expressions (i.e., we would be able to retrieve the derivative of u w.r.t. z in the above example).

I hope this clarifies! :)
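The distinction between leaf and intermediate variables can be sketched with a tiny hand-written tape. This is only an illustration under stated assumptions: `Tape`, `Node`, `leaf`, `mul`, `sine`, `adjoints` and `is_leaf` are hypothetical names, not autodiff's API or internals. The reverse sweep computes an adjoint for every node, including the intermediate z, so the design question is only whether those intermediate entries are kept for the user.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One recorded operation: up to two parents (-1 means "none") and the
// local partial derivative of this node with respect to each parent.
struct Node {
    int parent[2];
    double partial[2];
};

// Hypothetical tape for illustration only; not autodiff's data structure.
struct Tape {
    std::vector<Node> nodes;
    std::vector<double> values;

    // Append a node and return its index on the tape.
    int push(double v, int p0, double d0, int p1, double d1) {
        Node n{{p0, p1}, {d0, d1}};
        nodes.push_back(n);
        values.push_back(v);
        return static_cast<int>(nodes.size()) - 1;
    }

    int leaf(double v) { return push(v, -1, 0.0, -1, 0.0); }
    int mul(int a, int b) { return push(values[a] * values[b], a, values[b], b, values[a]); }
    int sine(int a) { return push(std::sin(values[a]), a, std::cos(values[a]), -1, 0.0); }

    // A leaf is a node that was not built from other nodes.
    bool is_leaf(int i) const { return nodes[i].parent[0] < 0 && nodes[i].parent[1] < 0; }

    // Reverse sweep: returns d(root)/d(node) for every node on the tape.
    std::vector<double> adjoints(int root) const {
        assert(root >= 0 && static_cast<std::size_t>(root) < nodes.size());
        std::vector<double> adj(nodes.size(), 0.0);
        adj[root] = 1.0;
        // Parents always have smaller indices, so one backward pass suffices.
        for (int i = root; i >= 0; --i)
            for (int k = 0; k < 2; ++k)
                if (nodes[i].parent[k] >= 0)
                    adj[nodes[i].parent[k]] += adj[i] * nodes[i].partial[k];
        return adj;
    }
};
```

With x = leaf(1.0), y = leaf(2.0), z = mul(x, y) and u = sine(z), the vector returned by adjoints(u) holds du/dz = cos(2) at index z alongside du/dx = 2 cos(2) and du/dy = cos(2). A leaf-only container would keep just the entries for which is_leaf is true.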

@kewp
Author

kewp commented Mar 17, 2019

Thank you, that helps me understand.

Is this the same behavior you see in forward mode, only tracking the leaf nodes?

@kewp
Author

kewp commented Mar 17, 2019

Oh and just to clarify - a leaf variable is one that is not built up from other variables (vars), only things like double and int... ?

@allanleal
Member

allanleal commented Mar 18, 2019

> Oh and just to clarify - a leaf variable is one that is not built up from other variables (vars), only things like double and int... ?

Yes.

> Is this the same behavior you see in forward mode, only tracking the leaf nodes?

In forward mode, once you define a function f, you specify the argument to differentiate with respect to (using the wrt method) when computing the derivative.

To give an example, the two snippets below are equivalent:

dual x = 8.0;
dual y = 10.0;

double dudx = derivative(f, wrt(x), x, y);

dual t = 2.0;
dual x = t*t*t;
dual y = 5*t;

double dudx = derivative(f, wrt(x), x, y);  // this will work just fine!
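The forward-mode behaviour can be illustrated with a minimal dual-number sketch. Assumptions: `Dual`, `derivative_wrt_x` and the two-argument function `f(x, y) = x*y + x` are hypothetical stand-ins (the thread does not show f for the forward case), and this is not autodiff's implementation. Because the seed is placed on x at evaluation time, it should not matter whether x was written as a literal or computed from t.

```cpp
// Minimal dual number: a value and the derivative carried along with it.
struct Dual {
    double val;
    double grad;
};

Dual operator+(Dual a, Dual b) { return {a.val + b.val, a.grad + b.grad}; }
Dual operator*(Dual a, Dual b) { return {a.val * b.val, a.grad * b.val + a.val * b.grad}; }
Dual operator*(double c, Dual a) { return {c * a.val, c * a.grad}; }

// Hypothetical function used only for this illustration.
Dual f(Dual x, Dual y) { return x * y + x; }

// Seed x with derivative 1 and evaluate f; the arguments are taken by
// value, so the caller's variables are left untouched.
double derivative_wrt_x(Dual x, Dual y) {
    x.grad = 1.0;
    return f(x, y).grad;
}
```

For example, with `Dual t{2.0, 0.0}`, both `derivative_wrt_x(Dual{8.0, 0.0}, Dual{10.0, 0.0})` and `derivative_wrt_x(t * t * t, 5.0 * t)` evaluate df/dx = y + 1 at the same point, since the unseeded t contributes no derivative of its own.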

@allanleal
Member

I'm reconsidering what I wrote earlier:

> However, this is not an appropriate default behavior, because it can dangerously grow the Derivatives container to sizes beyond the control and awareness of the user. Your for-loop, for example, could run for thousands of iterations, and then we would need to record the derivatives w.r.t. all of these intermediate, temporary variables for later retrieval.

Since the expression tree stored in a var is already (by its nature) allowed to grow as much as needed to represent the full history of operations, it does not make much sense to limit the size of the Derivatives container to the number of leaf var variables.

In short - I'm starting to think that Derivatives should be as long as the number of nodes in the expression tree, and not only the number of leaf nodes.
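The effect of recording derivatives for leaf nodes only versus every node can be sketched on the example from this issue. This is a toy expression tree written for illustration: `Expr`, `make`, `collect_derivatives`, `lookup` and the other helpers are hypothetical names, and reporting a missing entry as 0 is an assumption chosen to mirror the `du/dx = 0` output above, not a statement about autodiff's internals. Constants are treated as ordinary leaves here for brevity.

```cpp
#include <cmath>
#include <memory>
#include <unordered_map>
#include <utility>

struct Expr;
using ExprPtr = std::shared_ptr<Expr>;

// Expression-tree node: a value, up to two operands, and the local
// partial derivatives with respect to those operands.
struct Expr {
    double val = 0.0;
    ExprPtr lhs, rhs;
    double dl = 0.0, dr = 0.0;
};

ExprPtr make(double val, ExprPtr lhs = nullptr, double dl = 0.0,
             ExprPtr rhs = nullptr, double dr = 0.0) {
    auto e = std::make_shared<Expr>();
    e->val = val;
    e->lhs = std::move(lhs);
    e->dl = dl;
    e->rhs = std::move(rhs);
    e->dr = dr;
    return e;
}

ExprPtr constant(double v) { return make(v); }
ExprPtr add(const ExprPtr& a, const ExprPtr& b) { return make(a->val + b->val, a, 1.0, b, 1.0); }
ExprPtr mul(const ExprPtr& a, const ExprPtr& b) { return make(a->val * b->val, a, b->val, b, a->val); }
ExprPtr inv(const ExprPtr& a) { return make(1.0 / a->val, a, -1.0 / (a->val * a->val)); }
ExprPtr ln(const ExprPtr& a) { return make(std::log(a->val), a, 1.0 / a->val); }

// f(x) = 1 + x + x*x + 1/x + log(x), as in the issue.
ExprPtr f(const ExprPtr& x) {
    return add(add(add(add(constant(1.0), x), mul(x, x)), inv(x)), ln(x));
}

// x is built in a loop, so the returned node is an intermediate node.
ExprPtr get_x() {
    ExprPtr t = constant(0.0);
    for (int i = 0; i < 2; ++i) t = add(t, constant(1.0));
    return t;
}

using Gradients = std::unordered_map<const Expr*, double>;

// Push the weight w = d(root)/d(e) down the tree, summing over all paths.
void propagate(const ExprPtr& e, double w, bool leaves_only, Gradients& out) {
    const bool is_leaf = !e->lhs && !e->rhs;
    if (is_leaf || !leaves_only) out[e.get()] += w;
    if (e->lhs) propagate(e->lhs, w * e->dl, leaves_only, out);
    if (e->rhs) propagate(e->rhs, w * e->dr, leaves_only, out);
}

Gradients collect_derivatives(const ExprPtr& u, bool leaves_only) {
    Gradients g;
    propagate(u, 1.0, leaves_only, g);
    return g;
}

// Assumption: a node that was not recorded is reported as 0.
double lookup(const Gradients& g, const ExprPtr& e) {
    auto it = g.find(e.get());
    return it == g.end() ? 0.0 : it->second;
}
```

With `x = get_x()` and `u = f(x)`, `lookup(collect_derivatives(u, true), x)` finds no entry for x because x is not a leaf, while `lookup(collect_derivatives(u, false), x)` returns the derivative accumulated at x itself.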

@allanleal
Member

This issue has now been resolved with PRs #111 and #112. There are some regressions in the API, though, and the following is the modified code that works with these fixes:

// C++ includes
#include <iostream>
using namespace std;

#include <Eigen/Core>
using namespace Eigen;

#include <autodiff/reverse.hpp>
#include <autodiff/reverse/eigen.hpp>
using namespace autodiff;

var get_x()
{
    var t = 0.0;
    int i; for (i=0; i<2; i++)
        t += 1;

    return t;
}

var f(var x)
{
    return 1 + x + x*x + 1/x + log(x);
}

int main(int argc, char *argv[])
{
    var x = get_x();                         // the input variable x
    var u = f(x);                        // the output variable u

    auto [dudx] = derivatives(u, wrt(x));    // evaluate du/dx (structured binding, requires C++17)

    cout << "u = " << u << endl;         // print the evaluated output u
    cout << "du/dx = " << dudx << endl;  // print the evaluated derivative du/dx

    return 0;
}


2 participants