Implement ordinal regression GLM (ordered_logistic_glm_lpmf) #1252

t4c1 · 2019-05-23T12:46:57Z

Summary

This implements ordinal regression GLM (ordered_logistic_glm_lpmf) for CPU.
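
As a rough reference for what the function computes, the sketch below evaluates the ordered logistic log probability one observation at a time. It assumes the standard ordered logistic parameterization (linear predictor `x * beta`, `K - 1` increasing cutpoints, outcomes coded `1..K`). The name `ordered_logistic_glm_reference` and the `std::vector` argument types are illustrative assumptions, not code from this PR.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Numerically stable log(1 + exp(a)).
inline double log1p_exp(double a) {
  return a > 0 ? a + std::log1p(std::exp(-a)) : std::log1p(std::exp(a));
}

// inv_logit(a) = 1 / (1 + exp(-a)).
inline double inv_logit(double a) { return 1.0 / (1.0 + std::exp(-a)); }

// log(inv_logit(a)) = -log(1 + exp(-a)).
inline double log_inv_logit(double a) { return -log1p_exp(-a); }

// Hypothetical naive reference (not the PR's implementation):
// returns sum over n of log P(y[n] | eta_n, cuts), with eta_n = x[n] . beta.
double ordered_logistic_glm_reference(
    const std::vector<int>& y,
    const std::vector<std::vector<double>>& x,
    const std::vector<double>& beta,
    const std::vector<double>& cuts) {
  assert(!cuts.empty() && x.size() == y.size());
  const int K = static_cast<int>(cuts.size()) + 1;  // number of classes
  double lp = 0.0;
  for (std::size_t n = 0; n < y.size(); ++n) {
    assert(y[n] >= 1 && y[n] <= K && x[n].size() == beta.size());
    double eta = 0.0;
    for (std::size_t j = 0; j < beta.size(); ++j) {
      eta += x[n][j] * beta[j];
    }
    if (y[n] == 1) {
      // P = 1 - inv_logit(eta - c_1) = inv_logit(c_1 - eta)
      lp += log_inv_logit(cuts[0] - eta);
    } else if (y[n] == K) {
      // P = inv_logit(eta - c_{K-1})
      lp += log_inv_logit(eta - cuts[K - 2]);
    } else {
      // P = inv_logit(eta - c_{k-1}) - inv_logit(eta - c_k)
      lp += std::log(inv_logit(eta - cuts[y[n] - 2])
                     - inv_logit(eta - cuts[y[n] - 1]));
    }
  }
  return lp;
}
```

A loop like this is only useful as a correctness baseline; the middle-class branch in particular is not protected against underflow when the two probabilities are close.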

Tests

New tests are in rev/mat/prob/ordered_logistic_glm_lpmf_test.cpp.

Side Effects

None.

Checklist

Math issue Implement ordinal regression GLM (ordered_logistic_glm_lpmf) #1251
Copyright holder: Tadej Ciglarič, University of Ljubljana

The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:
- Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
- Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
the basic tests are passing
- unit tests pass (to run, use: ./runTests.py test/unit)
- header checks pass, (make test-headers)
- docs build, (make doxygen)
- code passes the built in C++ standards checks (make cpplint)
the code is written in idiomatic C++ and changes are documented in the doxygen
the new changes are tested

t4c1 · 2019-05-23T14:16:24Z

I measured speedups of this compared to using ordered_logistic_lpmf(). Benchmark code is here. Here are results:

bob-carpenter · 2019-05-23T17:42:44Z

stan/math/prim/mat/prob/ordered_logistic_glm_lpmf.hpp

+ * @param y integer vector parameter
+ * @param x design matrix
+ * @param beta weight vector
+ * @param cuts vector of cutpoints


I assume this only accepts a column vector, since that's what we take "vector" to mean. Otherwise, please indicate this can be a row vector or column vector. We want the doc to be clear on types here since the templating isn't providing that information when you template the whole container.

I'm just adding single comments rather than reviewing as I don't want to overreview someone else who might be doing this.

bob-carpenter · 2019-05-23T17:43:41Z

stan/math/prim/mat/prob/ordered_logistic_glm_lpmf.hpp

+  check_finite(function, "Final cut-point", cuts[N_classes - 2]);
+  check_finite(function, "First cut-point", cuts[0]);
+
+  if (size_zero(y, x, beta, cuts))


This isn't your fault, but it's another function that landed without sufficient review---it should be named size_zero_any or something like that. Anyway, guess that's a different PR.

Yeah, this one is on me. This got in before the _any discussion in the is_nan PR. Sorry about that. I am assigning myself to fix this one.
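
To make the naming point concrete: the suggested name describes a check that is true as soon as any one argument is empty. A minimal sketch of such a helper follows. `size_zero_any` here is hypothetical (the name comes from the comment above, not from Stan Math), and the sketch only assumes each argument has a `size()` member.

```cpp
#include <vector>

// Base case: no containers left to inspect, so none was empty.
inline bool size_zero_any() { return false; }

// Hypothetical helper: true if at least one argument has size zero.
template <typename T, typename... Ts>
bool size_zero_any(const T& x, const Ts&... xs) {
  return x.size() == 0 || size_zero_any(xs...);
}
```

With this reading, `if (size_zero_any(y, x, beta, cuts)) return 0;` states at the call site that a single empty input is enough to short-circuit.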

bob-carpenter · 2019-05-28T16:55:09Z

Whether multiplication is expensive or not depends on the context. It's slower than addition, but faster than exponentiation and much faster than a cache miss on memory. Doesn't evaluating .(cut > 0) require branching? I can imagine it's better at allowing vectorization than a literal branch in the code. If you know how to do it, it would be great to speed up Stan's vectorized functions. As is, they're vectorized naively by just calling the function repeatedly. Not in this PR, of course!

t4c1 · 2019-05-28T18:54:32Z

cut>0 is just a comparison, not a branch. I am not sure, but I think comparisons can be vectorized.

Explicitly vectorizing code requires use of intrinsics, which is tedious and not portable. Luckily Eigen does it, so we do not have to. I just assume all Eigen expressions will produce vectorized binary code. While there might be exceptions, Eigen expressions seem to always result in equivalent or faster calculations than doing things element-wise.

bob-carpenter · 2019-05-28T19:39:38Z

cut>0 is just a comparison, not a branch. I am not sure, but I think comparisons can be vectorized.

That would be cool if they could be vectorized. It's not a branch in the if/then structure, but internally, it has to be doing a comparison and taking different behavior based on the result as that's the semantics. But it may be encapsulated such that it doesn't look like a branch to the compiler. That's why I was asking. My compiler knowledge isn't that deep.

Explicitly vectorizing code requires use of intrinsics, which is tedious and not portable. Luckily Eigen does it, so we do not have to. I just assume all Eigen expressions will produce vectorized binary code. While there might be exceptions, Eigen expressions seem to always result in equivalent or faster calculations than doing things element-wise.

That seems to imply we should redo our vectorization to lean on whatever Eigen's using to do it. I'm not saying we should write our own proper vectorization library, just that we might be able to lean on Eigen's. You seem to be writing faster versions of our vectorized functions using Eigen. I'd like to expose that to users as elementwise application of functions happens everywhere.

t4c1 · 2019-05-28T21:03:57Z

taking different behavior

That is not true. Comparisons are basic CPU instructions, just like, for example, addition. The result is simply either 0 or 1. Comparisons without an if or a loop are not branches.
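
A small sketch of the distinction being discussed: the first function uses the 0-or-1 result of a comparison as a multiplier, the second uses an explicit `if`. The function names are illustrative assumptions. Whether a compiler emits a jump or a vector instruction for either form depends on the compiler, target, and optimization level, so this shows the source-level difference only.

```cpp
#include <cstddef>
#include <vector>

// Sum the positive entries using the comparison result (0 or 1) as a mask.
// There is no if statement in the loop body.
double sum_positive_mask(const std::vector<double>& cut) {
  double total = 0.0;
  for (std::size_t i = 0; i < cut.size(); ++i) {
    total += static_cast<double>(cut[i] > 0) * cut[i];
  }
  return total;
}

// The same sum written with an explicit branch.
double sum_positive_branch(const std::vector<double>& cut) {
  double total = 0.0;
  for (std::size_t i = 0; i < cut.size(); ++i) {
    if (cut[i] > 0) {
      total += cut[i];
    }
  }
  return total;
}
```

The two agree for finite inputs. They can differ for infinite entries, because `0 * inf` is NaN in the masked form while the branching form simply skips the element.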

bob-carpenter · 2019-05-29T16:00:46Z

I think I see what you're getting at. Just because comparison has an intrinsically branching behavior, it doesn't entail generating branching instructions. Is the relevant feature here whether you need to jump for one of the branches? On a side note, do you know if the ternary conditional operator (a ? b : c) generates branches? There's a lot of bad folk wisdom around all this stuff and I've never dug into it deeply enough or found it made enough of a difference to show up in end-to-end profiling. So thanks for bearing with.

t4c1 · 2019-05-29T16:17:34Z

Sorry, I don't understand what you mean by the first question. The ternary operator is a branch.

On a side note I assume we want this GLM to have similar input types as softmax regression.

bob-carpenter · 2019-05-29T16:38:08Z

Let me elaborate with an example. Let's say I have a naive implementation of absolute value using the ternary operator.

a > 0 ? a : -a

The question is whether this generates branching instructions or not. How would that compare to the

(a > 0) * a + (1 - (a > 0)) * -a

which shouldn't generate any branching. I saw a Stack Overflow discussion about this recently which said the first version's faster on modern compilers, but now I can't find it.

t4c1 · 2019-05-30T06:23:14Z

The first one would generate a branch and the second would not. But the second would compile into quite a number of instructions. However, both might be optimized by a compiler.
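
For anyone who wants to inspect the generated code, the two variants can be written as standalone functions and compiled with the compiler's assembly output enabled. The names `abs_ternary` and `abs_arith` are illustrative assumptions; no claim is made here about which instructions a particular compiler produces.

```cpp
// Absolute value via the ternary conditional operator.
double abs_ternary(double a) { return a > 0 ? a : -a; }

// Absolute value via arithmetic on the 0-or-1 comparison result.
double abs_arith(double a) { return (a > 0) * a + (1 - (a > 0)) * -a; }
```

Both return the same value for finite inputs, so any difference is in the instructions emitted, which is best checked on the actual compiler and flags in use.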

t4c1 · 2019-05-30T09:18:57Z

This test failure was expected and will disappear once PR1249 gets merged.

stan-buildbot · 2019-06-10T14:27:54Z

(stat_comp_benchmarks/benchmarks/gp_pois_regr/gp_pois_regr.stan, 1.01)
(stat_comp_benchmarks/benchmarks/low_dim_corr_gauss/low_dim_corr_gauss.stan, 1.01)
(stat_comp_benchmarks/benchmarks/irt_2pl/irt_2pl.stan, 1.0)
(stat_comp_benchmarks/benchmarks/pkpd/one_comp_mm_elim_abs.stan, 0.99)
(stat_comp_benchmarks/benchmarks/eight_schools/eight_schools.stan, 0.98)
(stat_comp_benchmarks/benchmarks/gp_regr/gp_regr.stan, 1.01)
(stat_comp_benchmarks/benchmarks/arK/arK.stan, 1.0)
(performance.compilation, 1.01)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan, 1.0)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix/low_dim_gauss_mix.stan, 1.0)
(stat_comp_benchmarks/benchmarks/sir/sir.stan, 0.98)
(stat_comp_benchmarks/benchmarks/pkpd/sim_one_comp_mm_elim_abs.stan, 1.0)
(stat_comp_benchmarks/benchmarks/garch/garch.stan, 1.0)
(stat_comp_benchmarks/benchmarks/gp_regr/gen_gp_data.stan, 1.0)
(stat_comp_benchmarks/benchmarks/arma/arma.stan, 1.0)
Result: 0.99821972904
Commit hash: 1dad46b

stan-buildbot · 2019-06-20T10:04:36Z

(stat_comp_benchmarks/benchmarks/gp_pois_regr/gp_pois_regr.stan, 1.0)
(stat_comp_benchmarks/benchmarks/low_dim_corr_gauss/low_dim_corr_gauss.stan, 0.99)
(stat_comp_benchmarks/benchmarks/irt_2pl/irt_2pl.stan, 1.0)
(stat_comp_benchmarks/benchmarks/pkpd/one_comp_mm_elim_abs.stan, 0.98)
(stat_comp_benchmarks/benchmarks/eight_schools/eight_schools.stan, 1.02)
(stat_comp_benchmarks/benchmarks/gp_regr/gp_regr.stan, 1.02)
(stat_comp_benchmarks/benchmarks/arK/arK.stan, 1.0)
(performance.compilation, 1.01)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan, 1.0)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix/low_dim_gauss_mix.stan, 0.99)
(stat_comp_benchmarks/benchmarks/sir/sir.stan, 1.04)
(stat_comp_benchmarks/benchmarks/pkpd/sim_one_comp_mm_elim_abs.stan, 1.0)
(stat_comp_benchmarks/benchmarks/garch/garch.stan, 1.0)
(stat_comp_benchmarks/benchmarks/gp_regr/gen_gp_data.stan, 1.0)
(stat_comp_benchmarks/benchmarks/arma/arma.stan, 1.01)
Result: 1.00248068624
Commit hash: 1dad46b

stan-buildbot · 2019-06-24T14:33:49Z

(stat_comp_benchmarks/benchmarks/gp_pois_regr/gp_pois_regr.stan, 1.0)
(stat_comp_benchmarks/benchmarks/low_dim_corr_gauss/low_dim_corr_gauss.stan, 0.98)
(stat_comp_benchmarks/benchmarks/irt_2pl/irt_2pl.stan, 1.0)
(stat_comp_benchmarks/benchmarks/pkpd/one_comp_mm_elim_abs.stan, 0.99)
(stat_comp_benchmarks/benchmarks/eight_schools/eight_schools.stan, 0.98)
(stat_comp_benchmarks/benchmarks/gp_regr/gp_regr.stan, 1.0)
(stat_comp_benchmarks/benchmarks/arK/arK.stan, 0.99)
(performance.compilation, 1.01)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan, 0.99)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix/low_dim_gauss_mix.stan, 1.0)
(stat_comp_benchmarks/benchmarks/sir/sir.stan, 0.95)
(stat_comp_benchmarks/benchmarks/pkpd/sim_one_comp_mm_elim_abs.stan, 1.03)
(stat_comp_benchmarks/benchmarks/garch/garch.stan, 1.01)
(stat_comp_benchmarks/benchmarks/gp_regr/gen_gp_data.stan, 0.99)
(stat_comp_benchmarks/benchmarks/arma/arma.stan, 0.99)
Result: 0.99429347567
Commit hash: 94f4635

stan-buildbot · 2019-08-30T18:24:06Z

(stat_comp_benchmarks/benchmarks/gp_pois_regr/gp_pois_regr.stan, 1.0)
(stat_comp_benchmarks/benchmarks/low_dim_corr_gauss/low_dim_corr_gauss.stan, 0.99)
(stat_comp_benchmarks/benchmarks/irt_2pl/irt_2pl.stan, 1.0)
(stat_comp_benchmarks/benchmarks/pkpd/one_comp_mm_elim_abs.stan, 0.99)
(stat_comp_benchmarks/benchmarks/eight_schools/eight_schools.stan, 1.01)
(stat_comp_benchmarks/benchmarks/gp_regr/gp_regr.stan, 1.01)
(compilation, 1.03)
(stat_comp_benchmarks/benchmarks/arK/arK.stan, 1.02)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan, 1.0)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix/low_dim_gauss_mix.stan, 1.0)
(stat_comp_benchmarks/benchmarks/sir/sir.stan, 1.01)
(stat_comp_benchmarks/benchmarks/pkpd/sim_one_comp_mm_elim_abs.stan, 1.0)
(stat_comp_benchmarks/benchmarks/garch/garch.stan, 1.01)
(stat_comp_benchmarks/benchmarks/gp_regr/gen_gp_data.stan, 1.0)
(stat_comp_benchmarks/benchmarks/arma/arma.stan, 0.93)
Result: 0.99891798471
Commit hash: bd54230

rok-cesnovar · 2019-09-11T09:25:30Z

Do we have any volunteers to review this? Bob's and my initial comments were addressed, so this is definitely ready for a proper review.

andrjohns

This looks great! I always learn a lot about using Eigen from your pulls.

The rev testing is very thorough, but I'm not sure what the policy on tests is for the prob functions (i.e. whether you also need prim/fwd/mix tests here). @syclik, what's your view here?

t4c1 · 2019-09-13T06:25:05Z

What is going on with these tests, running for 22 hours?

rok-cesnovar · 2019-09-13T06:33:51Z

It's something with the AWS instances, I think. Restarting the entire test run is the only way I know how to fix this.

cc: @serban-nicusor-toptal

stan-buildbot · 2019-09-13T11:18:05Z

(stat_comp_benchmarks/benchmarks/gp_pois_regr/gp_pois_regr.stan, 0.98)
(stat_comp_benchmarks/benchmarks/low_dim_corr_gauss/low_dim_corr_gauss.stan, 0.97)
(stat_comp_benchmarks/benchmarks/irt_2pl/irt_2pl.stan, 1.0)
(stat_comp_benchmarks/benchmarks/pkpd/one_comp_mm_elim_abs.stan, 0.89)
(stat_comp_benchmarks/benchmarks/eight_schools/eight_schools.stan, 0.99)
(stat_comp_benchmarks/benchmarks/gp_regr/gp_regr.stan, 1.01)
(stat_comp_benchmarks/benchmarks/arK/arK.stan, 1.0)
(performance.compilation, 1.01)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan, 1.0)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix/low_dim_gauss_mix.stan, 1.01)
(stat_comp_benchmarks/benchmarks/sir/sir.stan, 1.0)
(stat_comp_benchmarks/benchmarks/pkpd/sim_one_comp_mm_elim_abs.stan, 1.0)
(stat_comp_benchmarks/benchmarks/garch/garch.stan, 0.98)
(stat_comp_benchmarks/benchmarks/gp_regr/gen_gp_data.stan, 0.98)
(stat_comp_benchmarks/benchmarks/arma/arma.stan, 0.99)
Result: 0.98804564761
Commit hash: 7973dbc

t4c1 · 2019-09-13T11:25:30Z

@andrjohns This is ready for next review.

andrjohns

Looks great!

t4c1 and others added 11 commits May 21, 2019 12:00

First wip implementation of ordered_logistic_glm_lpmf and a problematic test.

a0008b5

overflow prevention

74ba255

added big checks

388fd9b

performance improvements

c056c7d

completed tests and removed debugging code

0bfeae1

added doxygen

ed3b35f

added newlines to end of new files

65a76dd

Merge commit 'f49f224561ea2f2e10082d721075d27161ea6f25' into HEAD

991dba1

[Jenkins] auto-formatting by clang-format version 6.0.0 (tags/google/stable/2017-11-14)

9705d67

fixed cpplint

2122dd5

fixed headers

4837bca

andrjohns reviewed May 23, 2019

bob-carpenter reviewed May 23, 2019

bob-carpenter reviewed May 23, 2019

t4c1 and others added 3 commits May 30, 2019 09:18

changed input types and enabled broadcasting

30fdb65

optimized log1p_exp calculations

a279322

[Jenkins] auto-formatting by clang-format version 6.0.0 (tags/google/stable/2017-11-14)

c3cd074

Merge branch 'develop' into CPU_ordered_logistic_glm_lpmf

5440978

t4c1 and others added 2 commits June 10, 2019 12:13

underflow prevention

3efdbee

[Jenkins] auto-formatting by clang-format version 6.0.0 (tags/google/stable/2017-11-14)

1dad46b

rok-cesnovar reviewed Jun 22, 2019

t4c1 added 2 commits June 24, 2019 10:43

Merge branch 'develop' into CPU_ordered_logistic_glm_lpmf

84e628a

updated to use single meta include

94f4635

serban-nicusor-toptal added this to the 2.20.0++ milestone Jul 18, 2019

t4c1 mentioned this pull request Aug 26, 2019

Implement GLMs in OpenCL #1319

Closed

7 tasks

t4c1 added 2 commits August 30, 2019 16:05

Merge branch 'develop' into CPU_ordered_logistic_glm_lpmf

51a4617

replace is_constant_struct with is_constant_all

bd54230

andrjohns requested changes Sep 12, 2019

addressed review comments

7973dbc

andrjohns approved these changes Sep 16, 2019

andrjohns merged commit 591efb9 into stan-dev:develop Sep 16, 2019

rok-cesnovar mentioned this pull request Aug 18, 2020

use of glm functions for multilevel models paul-buerkner/brms#984

Closed
