Solve systems of nonlinear equations #2023

Closed · bgoodri opened this issue Aug 23, 2016 · 73 comments

@bgoodri (Contributor) commented Aug 23, 2016

Summary:

Support solving a system of nonlinear equations using one of the better-maintained modules in Eigen's unsupported directory. An alternative would be to use KINSOL from SUNDIALS, but I think Eigen would be easier.

Description:

To start with, let's assume the user has written a user-defined function that inputs a vector of length n and outputs a vector of length n that contains the values of the equations that are supposed to be zero at the solution. Also, let's assume the user has written a user-defined function that inputs a vector of length n and outputs a square matrix of order n that represents the Jacobian, i.e. the partial derivative of the i-th equation with respect to the j-th input goes in the i,j cell of this matrix.

(This is easier if these two functions are defined in a local functions block such that data and transformed data are in scope, but I need to create another issue for that.)

The call in the transformed parameters (or model) block of the Stan program would look like

transformed parameters {
  vector[n] solution;
  {
    vector[n] starting_values;
    // fill starting_values with something intelligent
    solution = dogleg(starting_values, equations_functor, jacobian_functor);
  }
}

where the signature of the dogleg function (or we could call it powell or something else) is

vector dogleg(vector, functor, functor);

As an overview, the dogleg C++ function would do these steps:

  1. Copy starting_values into an Eigen::VectorXd theta of doubles, stripping the autodiff vars (an element-wise conversion; a single static_cast won't do it)
  2. Instantiate an Eigen::HybridNonLinearSolver with a suitable hybrj_functor
  3. Call the hybrj1 method of the Eigen::HybridNonLinearSolver with theta as the initial point
  4. Return a var vector after using the implicit function theorem to figure out the derivatives

In detail, for step 2, see the example at
https://bitbucket.org/eigen/eigen/src/5a47e5a5b02e4d6ae1da98c2348f9c1cb01bdaf9/unsupported/test/NonLinearOptimization.cpp?at=default&fileviewer=file-view-default#NonLinearOptimization.cpp-245
The necessary hybrj_functor has an int operator()(const VectorXd &x, VectorXd &fvec) and an int df(const VectorXd &x, MatrixXd &fjac), each returning 0 on success and using the second argument to store the function values and the Jacobian, respectively. So we need to create a hybrj_functor whose operator() calls the functor that is the second argument to dogleg and assigns the resulting vector to fvec, and whose df() method calls the functor that is the third argument to dogleg and assigns the resulting matrix to fjac.
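
As a sketch, that adapter could look something like the following (a minimal illustration, not a finished implementation; hybrj_adapter and its members are made-up names):

#include <Eigen/Dense>
#include <unsupported/Eigen/NonLinearOptimization>

// Wrap the two user-supplied functors in the interface that
// Eigen::HybridNonLinearSolver expects.
template <typename F1, typename F2>
struct hybrj_adapter {
  typedef double Scalar;  // scalar-type hook in Eigen's functor convention
  F1 equations_;          // returns the n-vector of equation values
  F2 jacobian_;           // returns the n-by-n Jacobian
  int n_;

  hybrj_adapter(const F1& eq, const F2& jac, int n)
    : equations_(eq), jacobian_(jac), n_(n) {}

  // Called by the solver to evaluate the system; returning 0 signals success.
  int operator()(const Eigen::VectorXd& x, Eigen::VectorXd& fvec) const {
    fvec = equations_(x);
    return 0;
  }

  // Called by the solver to evaluate the Jacobian; returning 0 signals success.
  int df(const Eigen::VectorXd& x, Eigen::MatrixXd& fjac) const {
    fjac = jacobian_(x);
    return 0;
  }

  // Size hooks in Eigen's functor convention (the system is square).
  int inputs() const { return n_; }
  int values() const { return n_; }
};

Steps 2 and 3 would then be roughly: construct an adapter, construct Eigen::HybridNonLinearSolver<hybrj_adapter<F1, F2> > solver(adapter), and call solver.hybrj1(theta).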

For step 4, it is just like
https://en.wikipedia.org/wiki/Implicit_function_theorem#Application:_change_of_coordinates
We need to evaluate the Jacobian of the equations with respect to the unknowns at the solution; the derivatives of the solution with respect to the parameters are then the negative of that Jacobian's inverse times the Jacobian of the equations with respect to the parameters.
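
In symbols, writing the system as f(x(p), p) = 0 at the solution and differentiating with respect to the parameters p (a standard statement of the theorem, spelled out here for reference):

\frac{\partial x}{\partial p} \;=\; -\left[ \frac{\partial f}{\partial x} \right]^{-1} \frac{\partial f}{\partial p}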

Reproducible Steps:

Does not currently exist

Current Output:

Does not currently exist

Expected Output:

A var vector such that if you pass that vector to equations_functor it returns a numerically zero vector, and if you pass that vector to jacobian_functor it returns a non-singular matrix.

Additional Information:

We need this for the models the Federal Reserve wants us to estimate, and for lots of economics models generally.

Current Version:

v2.11.0

bgoodri added this to the Future milestone Aug 23, 2016
@bob-carpenter (Contributor)

I'm happy to code up the language parts, but you don't want me anywhere near any Jacobian or linear algebra code that matters.

We want to coordinate with @charlesm93 on this because there are PK/PD applications we want to be sure to handle. At least I think this is the same feature (I'm really not that good with all this linear algebra and solving stuff).

@billgillespie

This sounds like a good starting point for the kind of root finding functionality we would want for pharmacometrics (PMX) applications. Some initial thoughts:

  • The solver function would need to accept other parameters as arguments and return a var vector with gradients wrt those parameters.
  • I think we will also want a version that automatically generates the Jacobian.
  • The primary PMX application is the calculation of amounts in each compartment at a periodic steady-state resulting from multiple equal doses at equal intervals. This involves numerically solving a system of nonlinear equations that themselves involve the numerical solution of a system of ODEs.
  • The function should allow for the starting values to be parameters but coerce them to double. This permits intelligent automatic calculation of starting values based on model parameters. For example when calculating the steady-state solution for a pharmacokinetic model we might calculate initial estimates based on scaling of a single dose calculation.
  • I don't have any experience with Eigen::HybridNonLinearSolver, so I don't know how it compares with KINSOL wrt computational efficiency, robustness (e.g., sensitivity to initial estimates), etc.

@bgoodri (Contributor, Author) commented Aug 23, 2016

I think we are saying the same things here. Having the Jacobian be automatic would be nice, but I think it would require fwd mode, which isn't fully in place yet. I haven't used KINSOL either but my sense is that Powell's method, which Eigen implements, is the consensus choice for this sort of thing.

@bob-carpenter (Contributor)

Why would we need forward mode for an embedded Jacobian? In the ODE solver, we just use nested reverse mode.

Forward mode's ready to go, though it could use a lot of optimization. When it's written more efficiently, it should be much faster to calculate N x N Jacobians with N forward-mode calls rather than N backward-mode ones.

@bgoodri (Contributor, Author) commented Aug 25, 2016

Eigen has a signature that omits the functor for the Jacobian and calculates the Jacobian by finite differences. That might be less accurate than autodiff, but I think it is about as many flops as N reverse-mode calls. In any event, we should start with the case where there is an analytical Jacobian; once that is done, the rest will be easy.

@bob-carpenter (Contributor)

I'd think that would depend on how they do finite diffs to calculate a Jacobian. It could be O(N^2) rather than O(N) if they do each derivative separately.
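
To make the counting concrete, here is a sketch of a forward-difference Jacobian that costs N+1 evaluations of f (one base call plus one perturbed call per column); the names are illustrative, and this is not Eigen's internal routine, which may differ:

#include <Eigen/Dense>
#include <algorithm>
#include <cmath>
#include <limits>

template <typename F>
Eigen::MatrixXd fd_jacobian(const F& f, const Eigen::VectorXd& x) {
  const Eigen::VectorXd f0 = f(x);  // one base evaluation
  Eigen::MatrixXd J(f0.size(), x.size());
  const double eps = std::sqrt(std::numeric_limits<double>::epsilon());
  for (int j = 0; j < x.size(); ++j) {
    Eigen::VectorXd xh = x;
    const double h = eps * std::max(std::abs(x(j)), 1.0);  // relative step
    xh(j) += h;
    J.col(j) = (f(xh) - f0) / h;  // one extra evaluation per column
  }
  return J;
}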

@bgoodri (Contributor, Author) commented Aug 28, 2016

@bob-carpenter Is this
https://github.com/stan-dev/math/compare/feature/dogleg
about what it needs to be on the Math side in order for you to generate the code for it on the Stan side?

@bob-carpenter (Contributor)

Yes, that would enable a functor dogleg() to be written as a special expression like integrate_ode() if that's what you're asking.

It'll be a while before I can get to higher-order functions within Stan!

@charlesm93 (Contributor)

I wanted to ask what the status of the dogleg function was. From what I can tell, the math code has been written but not tested (I didn't find unit tests), and we need to expose it to Stan's grammar, in a similar manner to what was done for the ODE integrators. I'm happy to get my hands dirty with both tasks.

We can start with a function that requires an analytical Jacobian, though in the long run we'll want an automatic approximation of the Jacobian for the function to have broader applications.

@bgoodri (Contributor, Author) commented Nov 7, 2016

I think that is about the state of it. Doing an autodiffed Jacobian is easy to implement in reverse mode because the .jacobian() method is already implemented.
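
(For reference, a sketch of what calling the existing rev-mode jacobian functional looks like; the header path and the example system are assumptions for illustration, not code from the branch:)

#include <stan/math/rev/mat.hpp>  // header layout of this era; an assumption
#include <Eigen/Dense>

// An example system, templated on the scalar type so it can be
// instantiated with stan::math::var.
struct example_system {
  template <typename T>
  Eigen::Matrix<T, Eigen::Dynamic, 1>
  operator()(const Eigen::Matrix<T, Eigen::Dynamic, 1>& x) const {
    Eigen::Matrix<T, Eigen::Dynamic, 1> z(2);
    z << x(0) * x(0) - 36, x(1) - 6;
    return z;
  }
};

int main() {
  Eigen::VectorXd x(2), fx;
  Eigen::MatrixXd J;
  x << 1, 1;
  // Fills fx = f(x) and J = df/dx using reverse-mode autodiff.
  stan::math::jacobian(example_system(), x, fx, J);
}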

@charlesm93 (Contributor)

Actually let's take a step back. We need a way to pass in parameters and data. We could use the following signature:

vector equation(vector, real[], real[], int[])

vector dogleg(vector, functor, functor, real[], real[], int[]);

where the additional arguments contain parameters, real data, and integer data. These arguments should also work for the Jacobian, which should depend on the same variables as the equation function.
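
One way this could map onto the math library is to bind the extra arguments once, so the underlying solver still sees a function of the unknowns only. A hypothetical sketch (none of these names come from an actual branch):

#include <Eigen/Dense>
#include <vector>

template <typename F>
struct system_binder {
  const F& f_;
  const std::vector<double>& parms_;
  const std::vector<double>& dat_;
  const std::vector<int>& dat_int_;

  system_binder(const F& f,
                const std::vector<double>& parms,
                const std::vector<double>& dat,
                const std::vector<int>& dat_int)
    : f_(f), parms_(parms), dat_(dat), dat_int_(dat_int) {}

  // The solver calls this with the unknowns only; everything else is bound.
  Eigen::VectorXd operator()(const Eigen::VectorXd& x) const {
    return f_(x, parms_, dat_, dat_int_);
  }
};

The same binder would serve the Jacobian functor, since it depends on the same variables.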

@bgoodri (Contributor, Author) commented Nov 7, 2016

I would personally rather we implement the "local functions" block between the transformed data and parameters blocks, so that the functions would be defined as part of the class and anything from the data and transformed data blocks would be in scope. Going the route of how the signatures for the integrate_ode_* functions are defined is really cumbersome if the equations involve matrices.

@syclik (Member) commented Nov 7, 2016

+1 regarding additional functors.

What do you mean by "defined as part of the class"? I take it you mean the generated C++? I think we should think design first, then constrain design with implementation details later.

@bgoodri (Contributor, Author) commented Nov 7, 2016

I thought we had a consensus on the design a few months ago:

  • Introduce a "local functions" block between transformed data and parameters.
  • Such functions would be member functions of the model class (like log_prob and whatnot) rather than floating around in the namespace. Thus, local functions can access objects declared in data and transformed data without having to pass them as arguments. The user would define a local function that inputs a vector of parameters and outputs a vector that is numerically zero at the solution.
  • Then functors like dogleg() would only need a minimal number of arguments, as in

vector dogleg_with_jacobian(function, function, vector)
vector dogleg(function, vector)

@syclik (Member) commented Nov 7, 2016

Where is the spec? I don't remember seeing an actual design. (I could have missed it and the decision.)

@bgoodri (Contributor, Author) commented Nov 7, 2016

https://github.com/stan-dev/stan/wiki/Functionals-spec

@bob-carpenter (Contributor)

I thought we had a consensus on the design a few months ago:

Yes, we do. I could work on this next. I don't think it'd be that hard.

I'm realizing that a lot of the type inference we've been doing in the main body of Stan programs falls down for functions because of their templating. We can't tell when things are double or not, for instance, at compile time; it'd require reasoning about the instantiations of the functions. That was the bug in the conditional operator, by the way.

I've been thinking about adding the list/tuple type, and it's going to mess with a lot of the basic code, which assumes a type is (int|real|vector|row_vector|matrix) with a number of array dimensions. The constraints don't play into the type system. But adding a tuple is different, because we need to define the types of the elements.

I can start thinking about general functional programming. It'd be awesome to add that. And with functors in C++, I think it might be possible.

@charlesm93 (Contributor) commented Nov 8, 2016

I didn't realize local functions were an option we were contemplating. I agree functionals would be awesome and, among other things, would help a lot for the generalized event handler.

implement the "local functions" block between
the transformed data and parameters blocks

You mean between the parameters and transformed parameters block, right? That way parameters can be used in the function.

@charlesm93 (Contributor) commented Dec 5, 2016

Two points:

  • Are we sticking with Powell's method? Michael suggested we would be better off with a fully gradient-based method since we'll be computing derivatives anyway; Powell's method is a hybrid (it mixes Newton steps with steepest-descent steps). We could go for Newton's method, but it might not be stable enough for a lot of problems.
  • What is the time frame for implementing local functions? I'd rather do it "right" in the first go, but I'm shooting for a working prototype of the solver in Torsten before the end of January (a deliverable for a grant), so I might create a quick and dirty working version.

@bob-carpenter (Contributor) commented Dec 5, 2016 via email

@charlesm93 (Contributor)

As discussed in the meeting, I'm going ahead and developing a first version of the solver. I'll keep it modular, and I'll test with the dogleg method Ben began working on. Should I create a new branch or continue working on feature/dogleg?

@bgoodri (Contributor, Author) commented Feb 2, 2017 via email

@bob-carpenter (Contributor) commented Feb 2, 2017 via email

@charlesm93 (Contributor)

I'm trying to figure out how to call dogleg. I seem to be doing something wrong with the functors (I wrote the functors as I would have written them for the ODE solver):

inline Eigen::VectorXd
algebraEq(const Eigen::VectorXd x) {
  Eigen::VectorXd y(2);
  y(0) = x(0) - 36;
  y(1) = x(1) - 6;
  return y;
}

struct algebraEq_functor {
  inline Eigen::VectorXd
  operator()(const Eigen::VectorXd x) const {
    return algebraEq(x);
  }
};

inline Eigen::MatrixXd
jacobian(const Eigen::VectorXd x) {
  Eigen::MatrixXd y(2, 2);
  y(0, 0) = 1;
  y(0, 1) = 0;
  y(1, 0) = 0;
  y(1, 1) = 1;
  return y;
}

struct jacobian_functor {
  inline Eigen::MatrixXd
  operator()(const Eigen::VectorXd x) const {
    return jacobian(x);
  }
};

TEST(MathMatrix, dogleg) {
  Eigen::VectorXd x(2);
  x << 32, 5;

  Eigen::VectorXd theta;
  theta = stan::math::dogleg(x, algebraEq_functor(), jacobian_functor());
}

The compiler produces the following error message:

 error: no matching conversion for
      functional-style cast from 'const Eigen::VectorXd' (aka 'const
      Matrix<double, Dynamic, 1>') to 'algebraEq_functor'

and

error: no matching conversion for
      functional-style cast from 'const Eigen::VectorXd' (aka 'const
      Matrix<double, Dynamic, 1>') to 'jacobian_functor'
          fjac = F2(x);

I'm guessing I'm passing the wrong arguments to dogleg.
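
(A guess at the cause, based on the fjac = F2(x); line in the error output: inside dogleg the template parameters F1/F2 are types, so F2(x) parses as a functional-style cast of x to the functor type rather than a call to an instance. A minimal illustration of the distinction; dogleg's actual internals live on the branch:)

#include <Eigen/Dense>

template <typename F1, typename F2>
void evaluate_system(const Eigen::VectorXd& x, const F1& f1, const F2& f2,
                     Eigen::VectorXd& fvec, Eigen::MatrixXd& fjac) {
  // Wrong: F2 names a type here, so F2(x) is a functional-style cast,
  // which produces exactly the "no matching conversion" error above.
  // fjac = F2(x);

  // Right: call the instances that were passed in.
  fvec = f1(x);
  fjac = f2(x);
}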

@charlesm93 (Contributor)

The next step is to compute the Jacobian of the solutions with respect to the parameters. @bgoodri suggests using the implicit function theorem, which, admittedly, I am still wrapping my head around.

Here's a scheme I propose implementing:
We know (or can compute) J_f(x), the Jacobian of the function w.r.t. the unknowns x.
We could also, by properly manipulating functors, compute J_f(p), the Jacobian of the function w.r.t. the parameters p.

Applying the chain rule and using the inverse function theorem, I get

J_x(p) = J_f(p) * J_x(f) = J_f(p) * inverse[ J_f(x) ]

Note we assume the functions are differentiable around the points of interest. I think this should work, although I'm wondering if I overlooked a subtlety when applying the chain rule.

@bgoodri Did you have something similar in mind?

@bgoodri (Contributor, Author) commented Feb 15, 2017 via email

@charlesm93 (Contributor)

I uploaded a working prototype of algebra_solver with a simple unit test (all in the rev regime). The function finds the solutions and propagates the gradient!

A couple of things:

  • I got an error when I included the dogleg headers under both the rev and the prim directories, due to redefinition of the functors (as a temporary solution, I did not include rev/.../dogleg.hpp in rev/mat.hpp; the easy fix would be to rename the functors in one of the files).
  • I wanted to use value_of to pass parms as an Eigen vector of doubles instead of vars, but got an error about converting vars to double. I created a new value function to do that. We could also overload value_of to convert var to double, but I'm not sure whether this would be a desirable feature.

OK, the next steps involve: checks and error messages (there are a few constraints posed by the calculation of the Jacobian), the fwd regime, and more unit tests.
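
(For reference, a minimal sketch of the conversion described in the second bullet, assuming the usual var API with .val(); the name value comes from the comment above, not from the library:)

#include <stan/math/rev/mat.hpp>  // header layout of this era; an assumption
#include <Eigen/Dense>

// Strip the vars in a parameter vector down to their double values so
// Eigen's double-based solver can consume them.
inline Eigen::VectorXd value(
    const Eigen::Matrix<stan::math::var, Eigen::Dynamic, 1>& v) {
  Eigen::VectorXd out(v.size());
  for (int i = 0; i < v.size(); ++i)
    out(i) = v(i).val();  // var::val() returns the underlying double
  return out;
}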

@sakrejda (Contributor) commented Feb 22, 2017 via email

@bob-carpenter (Contributor) commented Feb 25, 2017 via email

@charlesm93 (Contributor) commented Mar 27, 2017

Any ideas for particularly hard algebra systems I should throw at the solver in the unit tests?

@betanalpha (Contributor) commented Mar 27, 2017 via email

@charlesm93 (Contributor) commented Mar 29, 2017

I threw two unsolvable systems at the solver and didn't get any error message. I expected Eigen's dogleg function to flag the failure; instead, it either gives me the wrong result or NaN.

Here's one example:

z(0) = x(0) * x(0) + 1
z(1) = x(1) * x(1) + 1

And the output is:

theta = {0, 0}

I'd be surprised if Eigen didn't have some mechanism in place to catch these sorts of errors.

I could implement a check inside algebra_solver that rejects the current proposal when the solution is "bad" (but according to what metric? Do we require the system to be 0 +/- some error? Ok, let's check the algorithm for some error bound on the solution)

@billgillespie

The Eigen class HybridNonLinearSolver is based on one of the nonlinear equation solvers in MINPACK, HYBRJ or HYBRJ1, I am guessing. Based on an incomplete read of the MINPACK documentation, those functions attempt to find the minimum of the L2 norm of the user-specified functions and leave it to the user to determine whether that minimum is a root. In your example it looks like the function correctly found that minimum. We need to add code to check if the result is a root.

@charlesm93 (Contributor)

The Eigen function HybridNonLinearSolver is based on one of the nonlinear equation solvers in MINPACK, HYBRJ or HYBRJ1 I am guessing

Yes, that seems right, see https://eigen.tuxfamily.org/dox/unsupported/group__NonLinearOptimization__Module.html.

We need to add code to check if the result is a root.

OK, that sends us down the "check and accept/reject Metropolis proposal" route. I'll require the system to be 0 +/- 1e-10 and leave it open to discussion whether we want to change the error, give the user control over it, etc.
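
A minimal sketch of that check (the 1e-10 default comes from this comment; the function name and exception type are illustrative):

#include <Eigen/Dense>
#include <cmath>
#include <sstream>
#include <stdexcept>

inline void check_is_root(const Eigen::VectorXd& f_at_candidate,
                          double tol = 1e-10) {
  for (int i = 0; i < f_at_candidate.size(); ++i) {
    if (!(std::abs(f_at_candidate(i)) <= tol)) {  // also catches NaN
      std::stringstream msg;
      msg << "algebra solver: equation " << i << " evaluates to "
          << f_at_candidate(i) << ", which is not within " << tol << " of 0";
      throw std::domain_error(msg.str());
    }
  }
}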

@betanalpha (Contributor) commented Mar 29, 2017 via email

@bob-carpenter (Contributor) commented Mar 31, 2017 via email

@charlesm93 (Contributor) commented Apr 3, 2017

Per Thursday's conversation, I added a test demonstrating how initial guesses can influence the solution in a degenerate case. The Jacobian "adapts" to the solution -- which is pretty neat!

Usually there's both an absolute tolerance and relative tolerance and some logic requiring both tolerances to be met.

@bob-carpenter: currently, the code throws an exception when f(solution) doesn't go to 0 +/- abs_tol. I don't think a relative error is formally defined in this case; the only thing that comes to mind would be an error relative to the initial guess (x(i) * rel_tol) -- but does testing for 0 +/- (x(i) * rel_tol) really make sense?

We'll then have two signatures:

  1. algebra_solver(x, y, dat, dat_int, abs_tol, rel_tol)
  2. algebra_solver(x, y, dat, dat_int)

Next, implementation in Stan. My only concern is translating the system of equations into a functor. To get the Jacobian with respect to either x or y, I constructed the functor's operator() as follows:

  template <typename T>
  inline Eigen::Matrix<T, Eigen::Dynamic, 1>
  operator()(const Eigen::Matrix<T, Eigen::Dynamic, 1>& x) const {
    if (x_is_dv_)
      return degenerateEq(x, y_, dat_, dat_int_);
    else
      return degenerateEq(x_, x, dat_, dat_int_);
  }

The idea is that either x or y can be the independent variables with respect to which we compute the Jacobian. Which is which is determined by the boolean member x_is_dv_.

I'm looking at the way a regular function gets translated from Stan to C++. I assume this is what I'll have to deal with -- or can we tweak how the function passed to the algebraic solver gets translated?

EDIT: what does the std::ostream* pstream_ argument do?

@bob-carpenter (Contributor) commented Apr 3, 2017 via email

@charlesm93 (Contributor)

Relative tolerance is defined relative to where you're at. So if the solution is 1e8, we don't want to just use an absolute tolerance of 1e-10, as the difference is beyond double-precision floating-point capacity (which is about 1e-16).

We currently test the solution by plugging it into the system. We want to check z(x) = 0. The tricky thing is we're not directly measuring an error in x, but in z, so we need to propagate the error. The "final relative tolerance" would be z(x * rel_tol). This gives us how far away from 0 z can be.
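
To make the floating-point constraint concrete: near 10^8 the gap between adjacent doubles is roughly

10^8 \times 2^{-52} \approx 2.2 \times 10^{-8} \gg 10^{-10}

so an absolute tolerance of 1e-10 on a solution of size 1e8 is below the representable resolution.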

Can they [the arguments x and y] both be independent?

No. When you call Jacobian, f is expected to have exactly one independent argument.

Write one with a couple arguments and look at the output in the model file .hpp generated by stanc

Helpful. A bit of functor gymnastics did the trick; now I need to update all the tests.

@bob-carpenter (Contributor) commented Apr 4, 2017 via email

@charlesm93 (Contributor)

That's tricky not because it's a function z(x), but because the function's value is zero, z(x) = 0. Then relative error in z(x) doesn't really make sense.

Ah yes, I get your point. To quote the MINPACK manual: "F-convergence is not tested for systems of nonlinear equations where F(x*) = 0 is the expected result." They propose another convergence test; I'm currently reading through it.

@charlesm93 (Contributor)

After reading through MINPACK (see attached) and dissecting Eigen's code, I found the following tuning parameters:

  • xtol: used in the convergence test; the root finder stops when (delta < xtol * xnorm).
  • maxfev: the maximum number of function evaluations; the root finder stops when it reaches maxfev.

Eigen does not signal when maxfev is reached, which may be an issue.

In addition, I propose adding an absolute tolerance test on F(x*).

ftol: if ||F(x*)|| > ftol, throw an exception.

This gives the user three tuning parameters: xtol, maxfev, and ftol, which are analogous, though not equivalent, to the relative_tolerance, max_num_steps, and absolute_tolerance in the ODE integrator.

The user should understand that the root finder stops either when it reaches acceptable convergence to the solution (as determined by xtol and ||F(x)||) or when it reaches the maximum number of function evaluations. If the final value of F(x*) is not acceptable, an exception is thrown; the error message suggests decreasing xtol or increasing maxfev.

We could try to throw an exception when maxfev is reached, but this might require creating a Stan version of HybridNonLinearSolver.h (since we do not edit Eigen's code).

ANL8074a.pdf

@charlesm93 (Contributor)

RE: what does the std::ostream* pstream_ argument do in integrate_ode?

@bgoodri (Contributor, Author) commented Apr 7, 2017 via email

@charlesm93 (Contributor)

Switched from the feature/dogleg branch to the feature/issue-2023-algebra-solver branch to respect dev norms.

@charlesm93 (Contributor)

I'm ready to submit a pull request, but I've only "finished" (in quotes because I expect we'll do revisions as we review the code) the algebra_solver for the rev case. Is there interest in the fwd case? Even if there is, I'd rather first add the rev version and then work on the fwd case.

@bob-carpenter (Contributor) commented Apr 10, 2017 via email

@charlesm93 (Contributor)

Are the arguments all sufficiently general that we'll be able to plug into Stan with double values wherever rev-mode vars are allowed?

No. One of the function's arguments, y, is expected to contain vars (it is similar to the parameters argument in the ODE solver). Statements such as value_of(y) and y_[i] = y(i).vi_ put constraints on the type of y. The user does pass a dat (and dat_int) argument, but a user who wants to call the algebraic solver on data only would have to rewrite the functor that gets passed in, replacing all dependencies on y with dependencies on dat.

If we want y to be either double or var, I think we need to overload the algebraic solver (i.e., create one version under prim/), which is straightforward! The question is how valuable this would be to users. Would there be a model where the same system gets solved with parameters at one point and with data only at another?

@betanalpha (Contributor) commented Apr 10, 2017 via email

@bob-carpenter (Contributor) commented Apr 10, 2017 via email

@charlesm93 (Contributor)

@betanalpha, @bob-carpenter: OK, I created a prim version of the algebraic solver. I realized over lunch break I would also need it for Torsten. The rev version now calls the prim solver and then builds the vari object on top.

I submitted a pull request. I removed files that were extraneous to the request (such as the prototype fwd version of dogleg), but all these files are still saved on the feature/dogleg branch.
