Variadic argument test framework #993
Comments
I'm OK with the proposed directional derivative tests instead of exhaustive input and output tests. @syclik --- do you have an opinion on this? I'm not sure why testing all combinations of inputs for prim/rev is called out separately. It should be possible to use or borrow from the general testing framework (the one used for `operator_multiplication_test`). I'm OK dropping the vectorization tests there now and relying on a few explicit instantiations for each in the new framework. The trick is going to be to build up functionality in small, understandable PRs.
On Mon, Aug 20, 2018 at 5:48 AM Bob Carpenter wrote:

> I'm OK with the proposed directional derivative tests instead of exhaustive input and output tests. @syclik --- do you have an opinion on this?
I'm not sure what the alternative (exhaustive input and output tests) is. For simplicity, I think the overall description / design is good.

I think the implementation should do a little more than just check the directional derivatives. Here's what I'm thinking: for anything vectorized, this sort of test gives a false sense of correctness. There are two cases I think we should cover (because we've introduced bugs this way before and I think we'll continue to be at risk because it's tricky): the case where we've vectorized with the same var used multiple times, and the case where all the elements are independent vars.

We won't have this sort of testing requirement when using autodiffed operations. We get into this problem when we start specializing derivatives. There isn't really a lower spot to test it at.

So... I know it adds complexity, but maybe a flag to just test the directional derivatives when it's a simply written function, and something that will just include a simple construction of whatever vectorized version under both conditions? Not looking for that to be too crazy or full coverage. As long as the value of the derivative isn't trivial (maybe 0), it should be pretty easy to tell if this mistake was made.

I think with that, the tests become something that we can trust. If not, I think the person writing the function should add more tests to demonstrate that the vectorized code is correct.

Btw, I don't mean we have to test every derivative when it's vectorized.

Oh... and happy to discuss; perhaps there's a different way to prevent that sort of behavior.
> I'm not sure why testing all combinations of inputs for prim/rev is called out separately.
>
> It should be possible to use or borrow from the general testing framework (the one used for `operator_multiplication_test`).
>
> I'm OK dropping the vectorization tests there now and relying on a few explicit instantiations for each in the new framework.
>
> The trick is going to be to build up functionality in small, understandable PRs.

Agreed. Perhaps the first PR would be to have the simplest test with the interface defined. Then other PRs could add more tests incrementally?
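The two vectorization cases above (same var reused for every element vs. independent vars per element) can be illustrated without any autodiff machinery. In this sketch, plain doubles and central finite differences stand in for reverse mode, and all names (`fd`, `sum_sq`, `partial`, `same_var_derivative`) are invented for illustration:

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <vector>

// Central finite-difference derivative of a scalar function of one variable.
double fd(const std::function<double(double)>& f, double x, double h = 1e-5) {
  return (f(x + h) - f(x - h)) / (2 * h);
}

// A "vectorized" function under test: sum of squares of the elements.
double sum_sq(const std::vector<double>& v) {
  double s = 0;
  for (double e : v) s += e * e;
  return s;
}

// Hand-written per-element partial: d(sum_sq)/d(v[i]) = 2 * v[i].
double partial(const std::vector<double>& v, int i) { return 2 * v[i]; }

// Case 1 (independent vars): check each per-element partial separately.
// Case 2 (same var everywhere): the correct derivative is the SUM of the
// per-element partials -- a buggy gradient that drops duplicated-variable
// contributions would report only one term and fail this check.
double same_var_derivative(int n, double x) {
  std::vector<double> v(n, x);
  double total = 0;
  for (int i = 0; i < n; ++i) total += partial(v, i);
  return total;
}
```

For example, with n = 3 elements all set to the same x, the scalar view is h(x) = 3 * x^2, so the finite-difference derivative should match the accumulated per-element partials (6 * x), which is exactly the duplicated-var case that has bitten hand-written gradients before.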
I should have asked this before: is this supposed to be a function with one
return or multiple return values?
@syclik: Which function are you asking about? It's supposed to be a test framework for all of our differentiable functions.

Suppose we have a function f : R^N -> R^M.

The exhaustive strategy is to test each of the N * M first-order, N^2 * M second-order, and N^3 * M third-order derivatives.

The directional derivative strategy is to choose a few M-vectors, say y1, ..., yK, and test the function lambda x. f(x).transpose() * yk, which is R^N -> R. That reduces the testing load by a factor of M without seeming to lose any information, provided the yk aren't regular.

In general, this is going to work to test arbitrary functions of the form f : (R^N1 x ... x R^Nj) -> R^M. The vectorized functions all have this form, and my suggestion was that we just test them like all the unvectorized functions. So far, the vectorization is pretty much for distributions and unary functions. I was just suggesting we not test the 10,000 possible instantiations of student-t, say, or the arbitrary number of instantiations of the vectorized functions (which work on arrays of arbitrary dimension), but make a strategic selection of types to test.

So I don't see how this depends on automatic vs. hand-written derivatives or vectorized vs. unvectorized functions.

@Bob_Carpenter, thanks for the clarification! That helps a lot.

So I'm definitely with you that we don't need to do exhaustive testing. I'm trying to write out what I think are common, yet hard-to-debug, errors in writing custom gradient code.

What I'm thinking is that we're covered if we do this: for f : (R^2 x R^2 x ... x R^2) -> R^M, we test both the case where each of the M args uses the same vars and the case where each of the M args uses independent vars.

I think we'd still be down to R^(2N) -> R, but maybe I'm mistaken.
So with the lingo so far, f : R^N -> R^M, then the thing I was talking about testing is the product w^T * J * v, where J is the Jacobian of f, w is an M-row random column vector, and v is an N-row random column vector.

Whether you're testing with fvars or vars, it's one forward or one reverse pass to get the number you wanna test. If you kept doing them, you could solve for the J.

The finite difference stencils and complex step check code would be simplified as well.
On Aug 20, 2018, at 7:41 PM, Ben Bales wrote:

> So with the lingo so far, f : R^N -> R^M, then the thing I was talking about testing is the product w^T * J * v, where J is the Jacobian of f, w is an M-row random column vector and v is an N-row random column vector.

Cool. I missed the final multiply by v.

> Whether you're testing with fvars or vars, it's one forward or one reverse pass to get the number you wanna test.

That last multiply by v is another O(M) reduction in testing load.

> If you kept doing them, you could solve for the J.

Would that require M passes? I don't think this is necessary. If there are special inputs, we can test for those in the same way.

> The finite difference stencils and complex step check code would be simplified as well.

I'm not sure what you mean by "stencil," but the functionals for finite diffs should stay the same. We can add a functional to compute directional derivatives with reverse mode, but I don't know how you'd use that in testing.
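The w^T * J * v idea above can be checked concretely: one directional finite-difference probe of g(t) = w^T * f(x + t * v) recovers the same scalar as multiplying out a hand-coded Jacobian. This sketch uses a toy f : R^2 -> R^2 in place of a real Math-library function; all names here are invented for illustration:

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Toy f : R^2 -> R^2 standing in for a function under test.
std::array<double, 2> f(const std::array<double, 2>& x) {
  return {x[0] * x[1], x[0] + x[1]};
}

// Hand-coded Jacobian of f at x (what a specialized gradient would encode).
std::array<std::array<double, 2>, 2> jacobian(const std::array<double, 2>& x) {
  return {{{x[1], x[0]}, {1.0, 1.0}}};
}

// w^T * J * v computed from the hand-coded Jacobian.
double wJv_analytic(const std::array<double, 2>& x,
                    const std::array<double, 2>& w,
                    const std::array<double, 2>& v) {
  auto J = jacobian(x);
  double out = 0;
  for (int i = 0; i < 2; ++i)
    for (int j = 0; j < 2; ++j) out += w[i] * J[i][j] * v[j];
  return out;
}

// The same scalar from ONE central-difference pass on g(t) = w^T f(x + t v):
// a single directional probe exercises the whole Jacobian at once.
double wJv_fd(const std::array<double, 2>& x, const std::array<double, 2>& w,
              const std::array<double, 2>& v, double h = 1e-6) {
  std::array<double, 2> xp = {x[0] + h * v[0], x[1] + h * v[1]};
  std::array<double, 2> xm = {x[0] - h * v[0], x[1] - h * v[1]};
  auto fp = f(xp), fm = f(xm);
  return (w[0] * (fp[0] - fm[0]) + w[1] * (fp[1] - fm[1])) / (2 * h);
}
```

With random (non-degenerate) w and v, a mismatch between the two numbers flags an error somewhere in the Jacobian without testing all N * M entries individually, which is the testing-load reduction discussed in the thread.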
… changes so I just wanted to save a copy of this (Issue #993)
… and broke test_autodiff into smaller pieces (Issue #993)
I want to make sure we don't lose track of @bbbales2's effort. His original PR was here: #1021. We identified a couple of problems. The first was that the variables weren't persisting on the stack, so we were dealing with undefined behavior. We can fix this by lazily promoting the argument types to `var`. There's a really cool thing that @bbbales2 did inside his PR that we should take note of.
I updated the issue description with more information. I've been slowly working on this... I'm closer, but still not complete. Would love help if anyone is inclined.

I'm interested. What might be involved?
Great question. First off, a lot of variadic template stuff. (I'm still getting fluent in it; assume C++14.) If you want to know exactly what I'm thinking about, @bbbales2 put together something close to what we need. That was almost correct: instead of promoting all the arguments to `var` up front and then running `gradient()` (which destroys the autodiff stack), we have to promote them lazily. I don't know how much of that is understandable, but that's the problem I'm dealing with right now. If you want to help with this part, I can try to write a proper spec for what I'm doing.
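The variadic "walk every var/double combination" part of this can be sketched in self-contained C++14. Here `fake_var` is a hypothetical, tape-free stand-in for `stan::math::var`, and every name (`call`, `walk`, `test_all_instantiations`) is invented for illustration; a real framework would also check gradients and recover memory per combination:

```cpp
#include <cassert>
#include <cmath>
#include <tuple>
#include <utility>

// Hypothetical stand-in for stan::math::var (value only, no tape).
struct fake_var {
  double val;
  fake_var(double v) : val(v) {}  // implicit, so doubles mix with fake_vars
};
double value_of(double x) { return x; }
double value_of(fake_var x) { return x.val; }
fake_var operator*(fake_var a, fake_var b) { return fake_var(a.val * b.val); }
fake_var operator+(fake_var a, fake_var b) { return fake_var(a.val + b.val); }

// Apply f to a tuple of already-promoted arguments (C++14 has no std::apply).
template <typename F, typename Tuple, std::size_t... I>
double call(F& f, const Tuple& t, std::index_sequence<I...>) {
  return value_of(f(std::get<I>(t)...));
}

// Base case: every argument has been assigned a type; run the value check.
template <typename F, typename... Conv>
void walk(int& n, double expected, F& f, std::tuple<Conv...> conv) {
  double got = call(f, conv, std::index_sequence_for<Conv...>{});
  assert(std::abs(got - expected) < 1e-12);
  ++n;
}

// Recursive case: peel off the next raw double and branch twice --
// once leaving it as double, once promoting it to fake_var.
template <typename F, typename... Conv, typename... Rest>
void walk(int& n, double expected, F& f, std::tuple<Conv...> conv,
          double next, Rest... rest) {
  walk(n, expected, f, std::tuple_cat(conv, std::make_tuple(next)), rest...);
  walk(n, expected, f,
       std::tuple_cat(conv, std::make_tuple(fake_var(next))), rest...);
}

// Entry point: checks f against all 2^N var/double instantiations and
// returns how many combinations were exercised.
template <typename F, typename... Args>
int test_all_instantiations(F f, Args... args) {
  int n = 0;
  double expected = f(args...);
  walk(n, expected, f, std::tuple<>{}, args...);
  return n;
}
```

For a two-argument generic lambda, this exercises all four double/var combinations; the lazy, per-combination construction is exactly what avoids the build-everything-then-`gradient()` problem described above.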
Where is the code you were using to manipulate the list of types? I noticed that an old branch is cited under Implementation:
Whoops. Wrong branch. It's this one:

I updated the PR description with the correct branch.

I figure this is also a non-issue with the new AD testing framework. Closing. If I missed something please reopen.
Edit (@syclik): 4/24/2019. Reducing the issue to just reverse mode. We can create new issues for forward and mix later.
Description
The goal here is to make a nice, generic testing framework built with the new C++14 stuff. This would be used from within Google Test.
Generically, the goal is to automatically test the consistency of prim/rev function implementations with any number of inputs. This would make development of the Math library easier.
What is meant by consistency in all these cases is that, for a given set of inputs, all instantiations agree on the computed values, and the autodiff gradients agree with finite differences.
Other requirements:
Design
A possible interface looks like:
This will be able to handle `std::vector<double>` and `std::vector<int>` with this interface.

NB: I had originally wanted this interface, but I couldn't see a way to implement it. There might be, but I couldn't figure it out.
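The interface snippet itself did not survive this copy of the issue, but the Implementation section describes calls of the form `test_reverse_mode(test_function, 1.0)`. Purely as a sketch of that call shape, with invented internals (a central finite difference stands in for the reverse-mode gradient that the real framework would compute with `var`):

```cpp
#include <cassert>
#include <cmath>

// Sketch of the call shape test_reverse_mode(test_function, 1.0).
// Internals are hypothetical: the real framework would promote x to var,
// rerun f, and compare the reverse-mode gradient against finite diffs.
template <typename F>
double test_reverse_mode(F&& f, double x) {
  double expected_result = f(x);  // prim evaluation
  double h = 1e-6;
  double grad = (f(x + h) - f(x - h)) / (2 * h);  // finite-diff stand-in
  assert(std::isfinite(expected_result) && std::isfinite(grad));
  return grad;  // caller checks this against the known derivative
}
```

For example, `test_reverse_mode([](double x) { return x * x; }, 3.0)` should come out close to 6.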
Implementation
Technical things to make this possible:

- code on the branch `feature/automatic-autodiff-testing`
- promoting arguments to `var` as needed

Instead of computing Jacobians of multiple-input/non-scalar-output functions, I'm more interested in dotting outputs with random vectors and testing gradients in random directions a few times (so tests are always on scalars). It might be premature optimization, but I'm not fond of the idea of test complexity scaling with the number of inputs/outputs, and I think this'd be just as good.
@bbbales2's implementation got close, but it promoted all the types, then ran the `gradient()` function, which destroys the autodiff stack. We have to promote lazily. In other words, for a univariate function, this is how it should behave:

1. The user calls `test_reverse_mode(test_function, 1.0);`. From within `test_reverse_mode`:
2. `double expected_result = test_function(1.0);`
3. `var x = to_var(1.0);`
4. `var result = test_function(x);`
5. `std::vector<double> grad; std::vector<var> vars = {x}; result.grad(vars, grad);`
6. `EXPECT_FLOAT_EQ(expected_result, result.val());`
7. `EXPECT_FLOAT_EQ(finite_diff(test_function, x), grad[0]);` (we need some way to compute the finite diff)
8. `stan::math::recover_memory();`

When this is done with two arguments, we'll have to do steps 2-7 in a loop where we'll have to recursively walk through the possible combinations of `var`/`double` for each argument.

Additional Information
The exact features are unclear because I don't fully understand what is easily doable.
Basically I need an issue to track this stuff.
This is building on stuff from a few testing frameworks that are already in Stan's unit tests:
Current Math Version
v2.18.0