Variadic argument test framework #993
Comments
I'm OK with the proposed directional derivative tests instead of exhaustive input and output tests. @syclik --- do you have an opinion on this? I'm not sure why testing all combinations of inputs for prim/rev is called out separately. It should be possible to use or borrow from the general testing framework (the one used for `operator_multiplication_test`). I'm OK dropping the vectorization tests there now and relying on a few explicit instantiations for each in the new framework. The trick is going to be to build up functionality in small, understandable PRs.
On Mon, Aug 20, 2018 at 5:48 AM Bob Carpenter wrote:

> I'm OK with the proposed directional derivative tests instead of exhaustive input and output tests. @syclik --- do you have an opinion on this?
I'm not sure what the alternative (exhaustive input and output tests) is. For simplicity, I think the overall description / design is good.

I think the implementation should do a little more than just check the directional derivatives. Here's what I'm thinking: for anything vectorized, this sort of test gives a false sense of correctness. There are two cases I think we should cover (because we've introduced bugs this way before and I think we'll continue to be at risk because it's tricky): the case where we've vectorized with the same var used multiple times, and the case where all the elements are independent vars.

We won't have this sort of testing requirement when using autodiffed operations. We get into this problem when we start specializing derivatives. There isn't really a lower spot to test it at.

So... I know it adds complexity, but maybe a flag to just test the directional derivatives when it's a simply written function, and something that will just include a simple construction of whatever vectorized version under both conditions? Not looking for that to be too crazy or full coverage. As long as the value of the derivative isn't trivial (maybe 0), it should be pretty easy to tell if this mistake was made.

I think with that, the tests become something that we can trust. If not, I think the person writing the function should add more tests to demonstrate that the vectorized code is correct.

Btw, I don't mean we have to test every derivative when it's vectorized.

Oh... and happy to discuss; perhaps there's a different way to prevent that sort of behavior.
> I'm not sure why testing all combinations of inputs for prim/rev is called out separately.
>
> It should be possible to use or borrow from the general testing framework (the one used for `operator_multiplication_test`).
>
> I'm OK dropping the vectorization tests there now and relying on a few explicit instantiations for each in the new framework.
>
> The trick is going to be to build up functionality in small, understandable PRs.

Agreed. Perhaps the first PR would be to have the simplest test with the interface defined. Then other PRs could add more tests incrementally?
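The two vectorization cases above (same var reused for every element vs. independent vars per element) can be illustrated without any autodiff machinery. In this sketch, plain doubles and central finite differences stand in for reverse mode, and all names (`fd`, `sum_sq`, `partial`, `same_var_derivative`) are invented for illustration:

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <vector>

// Central finite-difference derivative of a scalar function of one variable.
double fd(const std::function<double(double)>& f, double x, double h = 1e-5) {
  return (f(x + h) - f(x - h)) / (2 * h);
}

// A "vectorized" function under test: sum of squares of the elements.
double sum_sq(const std::vector<double>& v) {
  double s = 0;
  for (double e : v) s += e * e;
  return s;
}

// Hand-written per-element partial: d(sum_sq)/d(v[i]) = 2 * v[i].
double partial(const std::vector<double>& v, int i) { return 2 * v[i]; }

// Case 1 (independent vars): check each per-element partial separately.
// Case 2 (same var everywhere): the correct derivative is the SUM of the
// per-element partials -- a buggy gradient that drops duplicated-variable
// contributions would report only one term and fail this check.
double same_var_derivative(int n, double x) {
  std::vector<double> v(n, x);
  double total = 0;
  for (int i = 0; i < n; ++i) total += partial(v, i);
  return total;
}
```

For example, with n = 3 elements all set to the same x, the scalar view is h(x) = 3 * x^2, so the finite-difference derivative should match the accumulated per-element partials (6 * x), which is exactly the duplicated-var case that has bitten hand-written gradients before.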
I should have asked this before: is this supposed to be a function with one
return or multiple return values?
@syclik: Which function are you asking about? It's supposed to be a test framework for all of our differentiable functions.

Suppose we have a function f : R^N -> R^M.

The exhaustive strategy is to test each of the N * M first-order, N^2 * M second-order, and N^3 * M third-order derivatives.

The directional derivative strategy is to choose a few M-vectors, say y1, ..., yK, and test the function lambda x. f(x).transpose() * yk, which is R^N -> R. That reduces the testing load by a factor of M without seeming to lose any information, provided the yk aren't regular.

In general, this is going to work to test arbitrary functions of the form f : (R^N1 x ... x R^Nj) -> R^M. The vectorized functions all have this form, and my suggestion was that we just test them like all the unvectorized functions. So far, the vectorization is pretty much for distributions and unary functions. I was just suggesting we not test the 10,000 possible instantiations of student-t, say, or the arbitrary number of instantiations of the vectorized functions (which work on arrays of arbitrary dimension), but make a strategic selection of types to test.

So I don't see how this depends on automatic vs. hand-written derivatives or vectorized vs. unvectorized functions.

@Bob_Carpenter, thanks for the clarification! That helps a lot.

So I'm definitely with you that we don't need to do exhaustive testing. I'm trying to write out what I think are common, yet hard-to-debug, errors in writing custom gradient code.

What I'm thinking is that we're covered if we do this: for f : (R^2 x R^2 x ... x R^2) -> R^M, we test both the case where each of the M args uses the same vars and the case where each of the M args uses independent vars.

I think we'd still be down to R^(2N) -> R, but maybe I'm mistaken.
So with the lingo so far, f : R^N -> R^M, then the thing I was talking about testing is the product w^T * J * v, where J is the Jacobian of f, w is an M-row random column vector, and v is an N-row random column vector.

Whether you're testing with fvars or vars, it's one forward or one reverse pass to get the number you wanna test. If you kept doing them, you could solve for the J.

The finite difference stencils and complex step check code would be simplified as well.
On Aug 20, 2018, at 7:41 PM, Ben Bales wrote:

> So with the lingo so far, f : R^N -> R^M, then the thing I was talking about testing is the product w^T * J * v, where J is the Jacobian of f, w is an M-row random column vector and v is an N-row random column vector.

Cool. I missed the final multiply by v.

> Whether you're testing with fvars or vars, it's one forward or one reverse pass to get the number you wanna test.

That last multiply by v is another O(M) reduction in testing load.

> If you kept doing them, you could solve for the J.

Would that require M passes? I don't think this is necessary. If there are special inputs, we can test for those in the same way.

> The finite difference stencils and complex step check code would be simplified as well.

I'm not sure what you mean by "stencil," but the functionals for finite diffs should stay the same. We can add a functional to compute directional derivatives with reverse mode, but I don't know how you'd use that in testing.
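The w^T * J * v idea above can be checked concretely: one directional finite-difference probe of g(t) = w^T * f(x + t * v) recovers the same scalar as multiplying out a hand-coded Jacobian. This sketch uses a toy f : R^2 -> R^2 in place of a real Math-library function; all names here are invented for illustration:

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Toy f : R^2 -> R^2 standing in for a function under test.
std::array<double, 2> f(const std::array<double, 2>& x) {
  return {x[0] * x[1], x[0] + x[1]};
}

// Hand-coded Jacobian of f at x (what a specialized gradient would encode).
std::array<std::array<double, 2>, 2> jacobian(const std::array<double, 2>& x) {
  return {{{x[1], x[0]}, {1.0, 1.0}}};
}

// w^T * J * v computed from the hand-coded Jacobian.
double wJv_analytic(const std::array<double, 2>& x,
                    const std::array<double, 2>& w,
                    const std::array<double, 2>& v) {
  auto J = jacobian(x);
  double out = 0;
  for (int i = 0; i < 2; ++i)
    for (int j = 0; j < 2; ++j) out += w[i] * J[i][j] * v[j];
  return out;
}

// The same scalar from ONE central-difference pass on g(t) = w^T f(x + t v):
// a single directional probe exercises the whole Jacobian at once.
double wJv_fd(const std::array<double, 2>& x, const std::array<double, 2>& w,
              const std::array<double, 2>& v, double h = 1e-6) {
  std::array<double, 2> xp = {x[0] + h * v[0], x[1] + h * v[1]};
  std::array<double, 2> xm = {x[0] - h * v[0], x[1] - h * v[1]};
  auto fp = f(xp), fm = f(xm);
  return (w[0] * (fp[0] - fm[0]) + w[1] * (fp[1] - fm[1])) / (2 * h);
}
```

With random (non-degenerate) w and v, a mismatch between the two numbers flags an error somewhere in the Jacobian without testing all N * M entries individually, which is the testing-load reduction discussed in the thread.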
… changes so I just wanted to save a copy of this (Issue #993)
… and broke test_autodiff into smaller pieces (Issue #993)
I want to make sure we don't lose track of @bbbales2's effort. His original PR was here: #1021. We identified a couple of problems. The first was that the variables weren't persisting on the stack, so we were dealing with undefined behavior. We can fix this by lazily promoting the argument types to `var`. There's a really cool thing that @bbbales2 did inside his PR that we should take note of.
I updated the issue description with more information. I've been slowly working on this... I'm closer, but still not complete. Would love help if anyone is inclined.

I'm interested. What might be involved?
Great question. First off, a lot of variadic template stuff. (I'm still getting fluent in it; assume C++14.) If you want to know exactly what I'm thinking about, @bbbales2 put together something close to what we need. That was almost correct: instead of promoting all the arguments to `var` up front and then running `gradient()` (which destroys the autodiff stack), we have to promote them lazily. I don't know how much of that is understandable, but that's the problem I'm dealing with right now. If you want to help with this part, I can try to write a proper spec for what I'm doing.
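The variadic "walk every var/double combination" part of this can be sketched in self-contained C++14. Here `fake_var` is a hypothetical, tape-free stand-in for `stan::math::var`, and every name (`call`, `walk`, `test_all_instantiations`) is invented for illustration; a real framework would also check gradients and recover memory per combination:

```cpp
#include <cassert>
#include <cmath>
#include <tuple>
#include <utility>

// Hypothetical stand-in for stan::math::var (value only, no tape).
struct fake_var {
  double val;
  fake_var(double v) : val(v) {}  // implicit, so doubles mix with fake_vars
};
double value_of(double x) { return x; }
double value_of(fake_var x) { return x.val; }
fake_var operator*(fake_var a, fake_var b) { return fake_var(a.val * b.val); }
fake_var operator+(fake_var a, fake_var b) { return fake_var(a.val + b.val); }

// Apply f to a tuple of already-promoted arguments (C++14 has no std::apply).
template <typename F, typename Tuple, std::size_t... I>
double call(F& f, const Tuple& t, std::index_sequence<I...>) {
  return value_of(f(std::get<I>(t)...));
}

// Base case: every argument has been assigned a type; run the value check.
template <typename F, typename... Conv>
void walk(int& n, double expected, F& f, std::tuple<Conv...> conv) {
  double got = call(f, conv, std::index_sequence_for<Conv...>{});
  assert(std::abs(got - expected) < 1e-12);
  ++n;
}

// Recursive case: peel off the next raw double and branch twice --
// once leaving it as double, once promoting it to fake_var.
template <typename F, typename... Conv, typename... Rest>
void walk(int& n, double expected, F& f, std::tuple<Conv...> conv,
          double next, Rest... rest) {
  walk(n, expected, f, std::tuple_cat(conv, std::make_tuple(next)), rest...);
  walk(n, expected, f,
       std::tuple_cat(conv, std::make_tuple(fake_var(next))), rest...);
}

// Entry point: checks f against all 2^N var/double instantiations and
// returns how many combinations were exercised.
template <typename F, typename... Args>
int test_all_instantiations(F f, Args... args) {
  int n = 0;
  double expected = f(args...);
  walk(n, expected, f, std::tuple<>{}, args...);
  return n;
}
```

For a two-argument generic lambda, this exercises all four double/var combinations; the lazy, per-combination construction is exactly what avoids the build-everything-then-`gradient()` problem described above.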
Where is the code you were using to manipulate the list of types? I noticed that an old branch is cited under Implementation:
Whoops. Wrong branch. It's this one:

I updated the PR description with the correct branch.

I figure this is also a non-issue with the new AD testing framework. Closing. If I missed something please reopen.
Edit (@syclik): 4/24/2019. Reducing the issue to just reverse mode. We can create new issues for forward and mix later.
Description
The goal here is to make a nice, generic testing framework built with the new C++14 stuff. This would be used from within Google Test.
Generically, the goal is to automatically test the consistency of prim/rev function implementations with any number of inputs. This would make development of the Math library easier.
What is meant by consistency in all these cases is that, for a given set of inputs, all instantiations agree on the computed values, and the autodiff gradients agree with finite differences.
Other requirements:
Design
A possible interface looks like:
This will be able to handle `std::vector<double>` and `std::vector<int>` with this interface.

NB: I had originally wanted this interface, but I couldn't see a way to implement it. There might be, but I couldn't figure it out.
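The interface snippet itself did not survive this copy of the issue, but the Implementation section describes calls of the form `test_reverse_mode(test_function, 1.0)`. Purely as a sketch of that call shape, with invented internals (a central finite difference stands in for the reverse-mode gradient that the real framework would compute with `var`):

```cpp
#include <cassert>
#include <cmath>

// Sketch of the call shape test_reverse_mode(test_function, 1.0).
// Internals are hypothetical: the real framework would promote x to var,
// rerun f, and compare the reverse-mode gradient against finite diffs.
template <typename F>
double test_reverse_mode(F&& f, double x) {
  double expected_result = f(x);  // prim evaluation
  double h = 1e-6;
  double grad = (f(x + h) - f(x - h)) / (2 * h);  // finite-diff stand-in
  assert(std::isfinite(expected_result) && std::isfinite(grad));
  return grad;  // caller checks this against the known derivative
}
```

For example, `test_reverse_mode([](double x) { return x * x; }, 3.0)` should come out close to 6.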
Implementation
Technical things to make this possible:

- code on the branch `feature/automatic-autodiff-testing`
- promoting arguments to `var` as needed

Instead of computing Jacobians of multiple-input/non-scalar-output functions, I'm more interested in dotting outputs with random vectors and testing gradients in random directions a few times (so tests are always on scalars). It might be premature optimization, but I'm not fond of the idea of test complexity scaling with the number of inputs/outputs, and I think this'd be just as good.
@bbbales2's implementation got close, but it promoted all the types, then ran the `gradient()` function, which destroys the autodiff stack. We have to promote lazily. In other words, for a univariate function, this is how it should behave:

1. The user calls `test_reverse_mode(test_function, 1.0);`. From within `test_reverse_mode`:
2. `double expected_result = test_function(1.0);`
3. `var x = to_var(1.0);`
4. `var result = test_function(x);`
5. `std::vector<double> grad; std::vector<var> vars = {x}; result.grad(vars, grad);`
6. `EXPECT_FLOAT_EQ(expected_result, result.val());`
7. `EXPECT_FLOAT_EQ(finite_diff(test_function, x), grad[0]);` (we need some way to compute the finite diff)
8. `stan::math::recover_memory();`

When this is done with two arguments, we'll have to do steps 2-7 in a loop where we'll have to recursively walk through the possible combinations of `var`/`double` for each argument.

Additional Information
The exact features are unclear because I don't fully understand what is easily doable.
Basically I need an issue to track this stuff.
This is building on stuff from a few testing frameworks that are already in Stan's unit tests:
Current Math Version
v2.18.0