feature/neg_binomial_2_log_glm #658
Conversation
… hand-built linear regression
Can one of the admins verify this patch?
Jenkins, test this please.
According to my tests, this is >6x faster than a similar regression built from existing primitives (using neg_binomial_2_log_lpmf).
…re/neg_binomial_2_log_glm
Jenkins, ok to test.
Hi @seantalts, as far as I'm concerned this is done and can be merged.
…re/neg_binomial_2_log_glm
Hey, there was a change recently such that cpplint now runs on test files as well, and this is failing there: http://d1m1s1b1.stat.columbia.edu:8080/job/Math%20Pipeline/view/change-requests/job/PR-658/2/execution/node/23/log/
OK, I've satisfied cpplint for the test as well now.
Jenkins, retest this please.
Some questions and only one thing that definitely needs to change - that annoying `std::string` -> `char*` thing, sorry again.
```cpp
 * If containers are supplied, returns the log sum of the probabilities.
 * @tparam T_n type of positive int vector of dependent variables (labels);
 *   this can also be a single positive integer value;
 * @tparam T_x type of the matrix of independent variables (features); this
```
I think statisticians usually use things like "variates" and "covariates" FWIW. Not sure if we have a consistent nomenclature. @betanalpha do you know if we have a preference?
Changed it to variates and covariates.
```cpp
 * @tparam T_beta type of the weight vector;
 *   this can also be a single value;
 * @tparam T_alpha type of the intercept;
 *   this should be a single value;
```
We often say "scalar" to be more specific when distinguishing "single values" from vectors
OK.
```cpp
typename return_type<T_x, T_beta, T_alpha, T_precision>::type
neg_binomial_2_log_glm_lpmf(const T_n &n, const T_x &x, const T_beta &beta,
                            const T_alpha &alpha, const T_precision &phi) {
  static const std::string function = "neg_binomial_2_log_glm_lpmf";
```
This one also needs the std::string -> char* fix :(
OK!
```cpp
  }).sum();
}
if (include_summand<propto, T_precision>::value) {
  logp += (phi_arr.binaryExpr(phi_arr, [](T_partials_return xx,
```
A thing we might prefer to do here is define `multiply_log` and `lgamma` (below) for Eigen types - it's possible we could save a fair bit of computation by doing that intelligently. This version I think will create a lot of intermediate `var`s. @bob-carpenter Am I right about that? What do you think about this `binaryExpr` (basically `map` + `zip` for Eigen Arrays - runs the function coefficient-wise along both of them in lock step, returning the result at each point as a new array) being here in this file vs. as a new specialization of `multiply_log`? I'm guessing the only real reason to do the latter would be potential performance gains, and to maybe save a slight amount of typing.
Yes, that is what `binaryExpr` is supposed to do, `map` after `zip`, I think. I was hoping that this would already be reasonably efficient. I now see that `apply_scalar_unary` in the Stan math library does something similar to `unaryExpr`. Would that be more efficient, @bob-carpenter? Also, would you have a suggested way of implementing something like `binaryExpr`, perhaps using `apply_scalar_unary` and a `zip`?
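For readers following along, here is a minimal, self-contained sketch of the `binaryExpr` behavior being discussed — plain `double`s rather than `var`s, with the `multiply_log` idea inlined as `x * std::log(y)`:

```cpp
#include <Eigen/Dense>
#include <cmath>
#include <iostream>

int main() {
  Eigen::ArrayXd a(3), b(3);
  a << 1.0, 2.0, 3.0;
  b << 4.0, 5.0, 6.0;

  // binaryExpr walks both arrays in lock step (zip) and applies the lambda
  // coefficient-wise (map), producing a new array expression.
  Eigen::ArrayXd c = a.binaryExpr(b, [](double x, double y) {
    return x * std::log(y);
  });

  std::cout << c << "\n";  // 1*log(4), 2*log(5), 3*log(6)
}
```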
I mean this version is definitely good enough; I'm assuming you did the performance test and the custom derivatives were already better than autodiffing through. If someone (Bob? @bgoodri?) thinks there will be more performance gains in a custom version we can always add that later.
Yes, this version was 4.5x faster than the composite of existing primitives in my tests. I should still run more extensive tests on different shapes of data, though, to be sure the speedup from the new primitive is robust.
```cpp
if (include_summand<propto, T_x, T_beta, T_alpha>::value)
  logp += (n_arr * theta_dbl).sum();
if (include_summand<propto, T_precision>::value) {
  logp += n_plus_phi.unaryExpr([](T_partials_return xx) {
```
Can we just pass `lgamma` in here instead of creating a lambda? I'm assuming you tried and got some type error, curious about that... (applies to all of these `binaryExpr`s and `unaryExpr`s)
No, it seems we cannot just pass `lgamma`. I think `binaryExpr` and `unaryExpr` expect a lambda (or a functor) rather than a plain function. Joys of C++.
Boo.
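If it helps future readers, my understanding (an assumption, not tested against this exact branch) is that the usual culprit is overload resolution rather than a blanket ban on plain functions: `lgamma` names several overloads, so the compiler cannot deduce `unaryExpr`'s template parameter. A lambda or an explicit function-pointer cast pins one down. A sketch with plain `double`s:

```cpp
#include <Eigen/Dense>
#include <cmath>

int main() {
  Eigen::ArrayXd a(3);
  a << 0.5, 1.5, 2.5;

  // Typically fails to compile: std::lgamma names several overloads
  // (float/double/long double), so template deduction is ambiguous.
  // Eigen::ArrayXd bad = a.unaryExpr(std::lgamma);

  // A lambda fixes the argument type...
  Eigen::ArrayXd b = a.unaryExpr([](double x) { return std::lgamma(x); });

  // ...as does casting to a specific function-pointer type.
  Eigen::ArrayXd c =
      a.unaryExpr(static_cast<double (*)(double)>(std::lgamma));
}
```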
```cpp
    && is_constant_struct<T_alpha>::value)) {
  Matrix<T_partials_return, Dynamic, 1> theta_derivative(N, 1);
  theta_derivative = (n_arr - (n_plus_phi / (phi_arr / (theta_dbl.exp())
      + Array<double, Dynamic, 1>::Ones(N, 1)))).matrix();
```
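As a sanity check (my own derivation, not part of the PR), this expression matches the analytic gradient of the NB2 log density with respect to the linear predictor. Writing $\mu = e^\theta$:

```latex
% log p(n | theta, phi) = n log(mu) - (n + phi) log(mu + phi) + const(n, phi)
\frac{\partial \log p}{\partial \theta}
  = \mu \left( \frac{n}{\mu} - \frac{n + \phi}{\mu + \phi} \right)
  = n - \frac{n + \phi}{\phi e^{-\theta} + 1}
```

which is exactly the `n_arr - n_plus_phi / (phi_arr / theta_dbl.exp() + 1)` pattern in the snippet above.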
Do you have wisdom / rules of thumb to share on when `.matrix()` or `.eval()` are required?
Sure, a little bit. .matrix() and .array() convert back and forth between Eigen::Matrix and Eigen::Array types. There is no runtime cost. The two types just provide a different interface: matrix operations vs coefficient-wise operations. Sometimes you need one, sometimes the other.
.eval() forces eager evaluation, as far as I understand it. By default, Eigen has lazy evaluation. I guess there are some cases where you want eager evaluation for efficiency or for correctness purposes, which you can force with .eval(). I haven't used it though.
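A small illustration of both points (standard Eigen behavior, nothing specific to this PR):

```cpp
#include <Eigen/Dense>

int main() {
  Eigen::MatrixXd m = Eigen::MatrixXd::Random(3, 3);

  // .array() / .matrix() only switch the interface; no data is copied.
  Eigen::MatrixXd prod = m * m;                              // matrix product
  Eigen::MatrixXd cwise = (m.array() * m.array()).matrix();  // coefficient-wise

  // .eval() forces a temporary. Without it, lazy evaluation can alias:
  // m = m.transpose();       // classic aliasing bug: reads m while writing it
  m = m.transpose().eval();   // safe: evaluate into a temporary first
}
```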
```cpp
Matrix<double, Dynamic, 1> phi(3, 1);
phi << 2, 1, 0.2;

EXPECT_FLOAT_EQ((stan::math::neg_binomial_2_log_lpmf(n, theta, phi)),
```
This pattern is looking so familiar! I wonder if there's a way to abstract further (not in this PR, just brainstorming). Like maybe using templates to construct a GLM version of any distribution... with C++11 we can have variadic template args and fun things like that. Curious if you have thoughts here.
I agree. This could be done much more generally. I even wonder how much automation could be put into constructing compiler optimisations of the shape f(g(x)) -> f_composed_with_g(x). Ideally, you'd want to just scan a large base of Stan code, see which primitives are most often combined and then automatically implement a primitive for the composite function. That's a bit of a wild idea of course as it requires us to be able to compute derivatives symbolically. But I wonder if we could automate some of it in cases like GLMs like you suggest.
Right - I mean there are symbolic differentiation libraries that are pretty good, though I'm not sure they're great 100% of the time. We might need some kind of heuristic for deciding during compilation 1) whether calculating the symbolic version is likely to bear fruit, and 2) given a symbolic version, whether it's likely to be faster than autodiff.
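To make the brainstorm slightly more concrete, one hypothetical shape such a distribution-generic GLM could take. Every name here is invented for illustration - this is not a proposed Stan Math API, and it ignores autodiff, vectorized parameters, and error checking:

```cpp
#include <Eigen/Dense>

// Hypothetical: build a GLM log density from any scalar lpmf functor and an
// inverse link, with extra distribution parameters passed variadically.
template <typename Lpmf, typename LinkInv, typename... Params>
double generic_glm_lpmf(const Eigen::VectorXi& n, const Eigen::MatrixXd& x,
                        const Eigen::VectorXd& beta, double alpha,
                        Lpmf&& lpmf, LinkInv&& link_inv,
                        const Params&... params) {
  // Linear predictor eta = x * beta + alpha, applied row-wise.
  Eigen::VectorXd eta = ((x * beta).array() + alpha).matrix();
  double logp = 0;
  for (int i = 0; i < n.size(); ++i)
    logp += lpmf(n[i], link_inv(eta[i]), params...);
  return logp;
}
```

A Poisson-log GLM would then pass a Poisson lpmf functor and `exp` as the inverse link, and the NB2 version would add `phi` at the end of the argument list.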
This is now ready to be merged, pending a decision on whether to remove the
Thanks again for all of these! Beautiful.
Regarding #658 (comment), symbolic derivatives use much less memory than autodiff as there's nothing in the expression graph but the inputs and output. Because the calculations are compiled with
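The memory point is visible in miniature with `precomputed_gradients`, which (to my understanding) is how Stan Math attaches hand-computed derivatives: only the operands and the one result node enter the autodiff graph, while intermediates such as `exp(x1)` below live only as `double`s. A sketch, assuming the `stan/math.hpp` umbrella header:

```cpp
#include <stan/math.hpp>
#include <cmath>

// Sketch: y = exp(x1) * x2 with symbolic gradients. The generic-operator
// version would also record an intermediate node for exp(x1); this records
// only the two operands and the output.
stan::math::var exp_times(const stan::math::var& x1,
                          const stan::math::var& x2) {
  double e1 = std::exp(x1.val());
  double v2 = x2.val();
  return stan::math::precomputed_gradients(
      e1 * v2,         // value
      {x1, x2},        // operands kept in the graph
      {e1 * v2, e1});  // symbolic d/dx1 and d/dx2
}
```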
Why is this:
written this way rather than with
@bob-carpenter, could you explain a bit? I'm quite new to C++ and am not familiar with proper use of const and passing by reference. Would this speed things up? UPDATE: Never mind. Michael has since explained it to me and I've updated the code.
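For anyone else new to this, the general idea (a generic illustration, not the actual diff, which is elided above) is that passing large objects by value copies them, while passing by `const` reference does not:

```cpp
#include <Eigen/Dense>

// By value: the whole matrix is copied on every call.
double sum_by_value(Eigen::MatrixXd x) { return x.sum(); }

// By const reference: no copy, and the function promises not to modify the
// caller's object. Preferred for anything larger than a scalar.
double sum_by_cref(const Eigen::MatrixXd& x) { return x.sum(); }
```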
…Matthijs/math into feature/neg_binomial_2_log_glm
@VMatthijs Could you take a quick look at the merge conflict here in the includes? I think we want `scalar_seq_view` still, not sure about the others...
Hi @seantalts, I just had a look at it: `#include <stan/math/prim/scal/fun/size_zero.hpp>` doesn't even exist anymore so it is not needed; `#include <boost/random/bernoulli_distribution.hpp>` is not needed; `#include <stan/math/prim/scal/meta/scalar_seq_view.hpp>` should be required; I must've accidentally deleted it. The real mystery though is why the code still runs, as it should be required. I just tested it and it runs just fine. Moreover, it seems that the only includes I cannot delete are `#include <stan/math/prim/scal/meta/partials_return_type.hpp>` and `#include <stan/math/prim/scal/meta/include_summand.hpp>`. Then it complains, but all the others seem redundant. Do you have any idea why that might be? Thanks!
So good practice is to be explicit about what you need via including the files. But the way C++ is compiled is essentially concatenating everything included (recursively) into one giant file. So as long as something else you rely on includes the stuff you use, it will compile.
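A toy illustration of that transitive-include effect, with hypothetical files:

```cpp
// a.hpp (hypothetical)
#pragma once
#include <vector>
inline std::vector<int> make_vec() { return {1, 2, 3}; }

// b.cpp (hypothetical): never includes <vector> directly, yet this compiles,
// because including a.hpp textually pulls <vector> in transitively.
#include "a.hpp"
std::vector<int> v = make_vec();
```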
Thanks for explaining! That makes sense!
Seems like you might need some new size_zero
…re/neg_binomial_2_log_glm
…Matthijs/math into feature/neg_binomial_2_log_glm
Submission Checklist

- `./runTests.py test/unit`
- `make cpplint`
Summary:
Adds a primitive for neg_binomial_2 regression with a log link function.
Intended Effect:
Speed up such regressions.
How to Verify:
Run performance test (commented out in unit test file).
Side Effects:
Fixes a typo in the logistic regression primitive.
Documentation:
Copyright and Licensing

Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company): Matthijs Vákár

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses: