Conversation

@VMatthijs (Member) commented Oct 24, 2017

Submission Checklist

  • Run unit tests: ./runTests.py test/unit
  • Run cpplint: make cpplint
  • Declare copyright holder and open-source license: see below

Summary:

Adds a primitive for negative binomial 2 regression with a log link function (neg_binomial_2_log_glm_lpmf).
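A usage sketch, based on the signature discussed in the review below (argument order n, x, beta, alpha, phi):

```cpp
// Accumulates the log density of n ~ NegBinomial2Log(alpha + x * beta, phi).
auto logp = stan::math::neg_binomial_2_log_glm_lpmf(n, x, beta, alpha, phi);
```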

Intended Effect:

Speed up such regressions.

How to Verify:

Run performance test (commented out in unit test file).

Side Effects:

Fixes a typo in the logistic regression primitive.

Documentation:

Copyright and Licensing

Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company):

Matthijs Vákár

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:

@stan-buildbot (Contributor)

Can one of the admins verify this patch?

@VMatthijs changed the title from "Feature/neg binomial 2 log glm" to "feature/neg_binomial_2_log_glm" on Oct 24, 2017
@bob-carpenter (Member)

Jenkins, test this please.

@VMatthijs (Member Author)

According to my tests, this is more than 6x faster than a similar regression built from existing primitives (using neg_binomial_2_log_lpmf).

@seantalts (Member)

Jenkins, ok to test.

@VMatthijs (Member Author)

Hi @seantalts, as far as I'm concerned this is done and can be merged.

@seantalts (Member)

Hey, there was a recent change such that cpplint now runs on test files as well, and this PR is failing there: http://d1m1s1b1.stat.columbia.edu:8080/job/Math%20Pipeline/view/change-requests/job/PR-658/2/execution/node/23/log/

@VMatthijs (Member Author)

OK, I've satisfied cpplint for the test as well now.

@seantalts (Member)

Jenkins, retest this please.

@seantalts self-requested a review on November 3, 2017 at 15:48.

@seantalts (Member) left a review:

Some questions and only one thing that definitely needs to change - that annoying std::string -> char* thing, sorry again.

* If containers are supplied, returns the log sum of the probabilities.
* @tparam T_n type of positive int vector of dependent variables (labels);
* this can also be a single positive integer value;
* @tparam T_x type of the matrix of independent variables (features); this
@seantalts (Member):

I think statisticians usually use things like "variates" and "covariates" FWIW. Not sure if we have a consistent nomenclature. @betanalpha do you know if we have a preference?

@VMatthijs (Member Author):

Changed it to variates and covariates.

* @tparam T_beta type of the weight vector;
* this can also be a single value;
* @tparam T_alpha type of the intercept;
* this should be a single value;
@seantalts (Member):

We often say "scalar" to be more specific when distinguishing "single values" from vectors

@VMatthijs (Member Author):

OK.

typename return_type<T_x, T_beta, T_alpha, T_precision>::type
neg_binomial_2_log_glm_lpmf(const T_n &n, const T_x &x, const T_beta &beta,
const T_alpha &alpha, const T_precision &phi) {
static const std::string function = "neg_binomial_2_log_glm_lpmf";
@seantalts (Member):

This one also needs the std::string -> char* fix :(
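A minimal sketch of the intended change, following the char* convention referred to here:

```cpp
// Before: constructs a static std::string at initialization.
// static const std::string function = "neg_binomial_2_log_glm_lpmf";
// After: a string literal bound to a char pointer avoids that.
static const char* function = "neg_binomial_2_log_glm_lpmf";
```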

@VMatthijs (Member Author):

OK!

}).sum();
}
if (include_summand<propto, T_precision>::value) {
logp += (phi_arr.binaryExpr(phi_arr, [](T_partials_return xx,
@seantalts (Member):

A thing we might prefer to do here is define multiply_log and lgamma (below) for Eigen types - it's possible we could save a fair bit of computation by doing that intelligently. This version I think will create a lot of intermediate vars. @bob-carpenter Am I right about that? What do you think about this binaryExpr (basically map + zip for Eigen Arrays - runs the function coefficient-wise along both of them in lock step, returning the result at each point as a new array) being here in this file vs. as a new specialization of multiply_log? I'm guessing the only real reason to do the latter would be potential performance gains, and to maybe save a slight amount of typing.

@VMatthijs (Member Author):

Yes, that is what binaryExpr is supposed to do, map after zip, I think. I was hoping that this would already be reasonably efficient. I now see that apply_scalar_unary in the Stan math library does something similar to unaryExpr. Would that be more efficient, @bob-carpenter ? Also, would you have a suggested way of implementing something like binaryExpr, perhaps using apply_scalar_unary and a zip?
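To make the pattern concrete, a standalone illustration of binaryExpr (a sketch, not the PR's code):

```cpp
#include <Eigen/Dense>
#include <cmath>
#include <iostream>

int main() {
  Eigen::ArrayXd x(3), y(3);
  x << 1.0, 2.0, 3.0;
  y << 0.5, 1.5, 2.5;
  // binaryExpr zips the two arrays and applies the lambda coefficient-wise,
  // i.e. z[i] = x[i] * log(y[i]) -- a multiply_log over arrays.
  Eigen::ArrayXd z =
      x.binaryExpr(y, [](double a, double b) { return a * std::log(b); });
  std::cout << z.sum() << "\n";
}
```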

@seantalts (Member):

I mean this version is definitely good enough; I'm assuming you did the performance test and the custom derivatives were already better than autodiffing through. If someone (Bob? @bgoodri?) thinks there will be more performance gains in a custom version we can always add that later.

@VMatthijs (Member Author):

Yes, this version was 4.5x faster than the composite of existing primitives, in my tests. I should probably still do more extensive tests though on different shapes of data to be sure of how robust the speedup of the new primitives is.

if (include_summand<propto, T_x, T_beta, T_alpha>::value)
logp += (n_arr * theta_dbl).sum();
if (include_summand<propto, T_precision>::value) {
logp += n_plus_phi.unaryExpr([](T_partials_return xx) {
@seantalts (Member):

Can we just pass lgamma in here instead of creating a lambda? I'm assuming you tried and got some type error, curious about that...

(applies to all of these binary- and unaryExprs)

@VMatthijs (Member Author):

No, it seems we cannot just pass lgamma. I think binaryExpr and unaryExpr expect a lambda-abstraction (or a functor). It seems to not expect a function. Joys of C++.
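For what it's worth, one likely explanation is overload resolution rather than functions per se: unaryExpr is a template, so a bare overloaded name like lgamma (std:: overloads plus Stan Math's own) gives the compiler no single signature to deduce, while a lambda or an explicit cast pins it down. A standalone sketch using a hypothetical overloaded function:

```cpp
#include <Eigen/Dense>
#include <cmath>

// Hypothetical stand-in for an overloaded lgamma.
double my_lgamma(double x) { return std::lgamma(x); }
float my_lgamma(float x) { return std::lgamma(x); }

int main() {
  Eigen::ArrayXd a(3);
  a << 1.0, 2.0, 3.0;
  // Eigen::ArrayXd b = a.unaryExpr(my_lgamma);  // error: ambiguous overload
  Eigen::ArrayXd b = a.unaryExpr(
      static_cast<double (*)(double)>(my_lgamma));         // cast resolves it
  Eigen::ArrayXd c =
      a.unaryExpr([](double x) { return my_lgamma(x); });  // so does a lambda
}
```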

@seantalts (Member):

Boo.

&& is_constant_struct<T_alpha>::value)) {
Matrix<T_partials_return, Dynamic, 1> theta_derivative(N, 1);
theta_derivative = (n_arr - (n_plus_phi / (phi_arr / (theta_dbl.exp())
+ Array<double, Dynamic, 1>::Ones(N, 1)))).matrix();
@seantalts (Member):

Do you have wisdom / rules of thumb to share on when .matrix() or .eval() are required?

@VMatthijs (Member Author):

Sure, a little bit. .matrix() and .array() convert back and forth between Eigen::Matrix and Eigen::Array types. There is no runtime cost. The two types just provide a different interface: matrix operations vs coefficient-wise operations. Sometimes you need one, sometimes the other.
.eval() forces eager evaluation, as far as I understand it. By default, Eigen has lazy evaluation. I guess there are some cases where you want eager evaluation for efficiency or for correctness purposes, which you can force with .eval(). I haven't used it though.
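A small standalone illustration of both points (a sketch, not the PR's code):

```cpp
#include <Eigen/Dense>
#include <iostream>

int main() {
  Eigen::MatrixXd m(2, 2);
  m << 1, 2, 3, 4;

  // .array()/.matrix() switch interfaces without copying data:
  Eigen::MatrixXd cwise = (m.array() * m.array()).matrix();  // element-wise square
  Eigen::MatrixXd matmul = m * m;                            // matrix product

  // .eval() forces a lazy expression to be materialized; one classic use is
  // breaking aliasing when the destination also appears on the right:
  m = m.transpose().eval();  // m = m.transpose() alone reads and writes m at once

  std::cout << cwise << "\n\n" << matmul << "\n\n" << m << "\n";
}
```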

Matrix<double, Dynamic, 1> phi(3, 1);
phi << 2, 1, 0.2;

EXPECT_FLOAT_EQ((stan::math::neg_binomial_2_log_lpmf(n, theta, phi)),
@seantalts (Member):

This pattern is looking so familiar! I wonder if there's a way to abstract further (not in this PR, just brainstorming). Like maybe using templates to construct a GLM version of any distribution... with C++11 we can have variadic template args and fun things like that. Curious if you have thoughts here.
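As a rough sketch of what that might look like (purely hypothetical, not an existing Stan Math API):

```cpp
#include <Eigen/Dense>

// Hypothetical generic GLM driver: the inverse link and the observation-level
// log density come in as template parameters (any callables).
template <typename InvLink, typename LogDensity>
double glm_lpdf(const Eigen::VectorXd& y, const Eigen::MatrixXd& x,
                const Eigen::VectorXd& beta, double alpha,
                InvLink inv_link, LogDensity log_density) {
  Eigen::VectorXd eta = x * beta;
  double lp = 0;
  for (Eigen::Index i = 0; i < y.size(); ++i)
    lp += log_density(y(i), inv_link(eta(i) + alpha));
  return lp;
}

// Usage sketch: a normal GLM with identity link and unit variance.
// double lp = glm_lpdf(y, x, beta, alpha,
//                      [](double e) { return e; },
//                      [](double yi, double mu) {
//                        return -0.5 * (yi - mu) * (yi - mu);
//                      });
```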

@VMatthijs (Member Author):

I agree. This could be done much more generally. I even wonder how much automation could be put into constructing compiler optimisations of the shape f(g(x)) -> f_composed_with_g(x). Ideally, you'd want to just scan a large base of Stan code, see which primitives are most often combined and then automatically implement a primitive for the composite function. That's a bit of a wild idea of course as it requires us to be able to compute derivatives symbolically. But I wonder if we could automate some of it in cases like GLMs like you suggest.

@seantalts (Member):

Right - I mean there are symbolic differentiation libraries that are pretty good, though I'm not sure they're great 100% of the time. We might need some kind of heuristic for deciding during compilation whether 1) calculating the symbolic version is likely to bear fruit 2) given a symbolic version, if it's likely to be faster than autodiff.

@VMatthijs (Member Author)

This is now ready to be merged, pending a decision on whether to remove the unaryExpr and binaryExpr.

@seantalts (Member)

Thanks again for all of these! Beautiful.

@bob-carpenter (Member)

Regarding #658 (comment), symbolic derivatives use much less memory than autodiff as there's nothing in the expression graph but the inputs and output. Because the calculations are compiled with double values, they're almost always faster because autodiff is essentially interpreted. Now having said that, the autodiff algorithm isn't bad and it does the dynamic programming right if you happen to use shared computations in the code being autodiffed.

@bob-carpenter (Member)

Why is this:

[](T_partials_return xx, T_partials_return yy) {
  return log_sum_exp(xx, yy);
}

written this way rather than with const T_partials_return& as the argument?
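That is, a sketch of the suggested const-reference form:

```cpp
[](const T_partials_return& xx, const T_partials_return& yy) {
  return log_sum_exp(xx, yy);
}
```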

@VMatthijs (Member Author) commented Nov 13, 2017

@bob-carpenter , could you explain a bit? I'm quite new to C++ and am not familiar with proper use of const and passing by reference. Would this speed things up?

UPDATE: Never mind. Michael has since explained it to me and I've updated the code.

@seantalts (Member)

@VMatthijs Could you take a quick look at the merge conflict here in the includes? I think we want scalar_seq_view still, not sure about the others...

@VMatthijs (Member Author)

Hi @seantalts, I just had a look at it:
  • #include <stan/math/prim/scal/fun/size_zero.hpp> doesn't even exist anymore, so it is not needed;
  • #include <boost/random/bernoulli_distribution.hpp> is not needed;
  • #include <stan/math/prim/scal/meta/scalar_seq_view.hpp> should be required; I must've accidentally deleted it.

The real mystery, though, is why the code still compiles, as scalar_seq_view should be required. I just tested it and it runs just fine. Moreover, it seems that the only includes I cannot delete are
#include <stan/math/prim/scal/meta/partials_return_type.hpp> and
#include <stan/math/prim/scal/meta/include_summand.hpp>;
without those it complains, but all the others seem redundant. Do you have any idea why that might be?

Thanks!

@seantalts (Member) commented Nov 14, 2017 via email

@VMatthijs (Member Author)

Thanks for explaining! That makes sense!

@seantalts (Member)

Seems like you might need some new size_zero.

@seantalts merged commit b338d50 into stan-dev:develop on Nov 14, 2017