Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement ordinal regression GLM (ordered_logistic_glm_lpmf) #1252

Merged
merged 22 commits into from Sep 16, 2019

Conversation

t4c1
Copy link
Contributor

@t4c1 t4c1 commented May 23, 2019

Summary

This implements ordinal regression GLM (ordered_logistic_glm_lpmf) for CPU.

Tests

New tests are in rev/mat/prob/ordered_logistic_glm_lpmf_test.cpp.

Side Effects

None.

Checklist

@t4c1
Copy link
Contributor Author

t4c1 commented May 23, 2019

I measured speedups of this compared to using ordered_logistic_lpmf(). Benchmark code is here. Here are results:

glm_speedup

* @param y integer vector parameter
* @param x design matrix
* @param beta weight vector
* @param cuts vector of cutpoints
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this only accepts a column vector, since that's what we take "vector" to mean. Otherwise, please indicate this can be a row vector or column vector. We want the doc to be clear on types here since the templating isn't providing that information when you template the whole container.

I'm just adding single comments rather than reviewing as I don't want to overreview someone else who might be doing this.

check_finite(function, "Final cut-point", cuts[N_classes - 2]);
check_finite(function, "First cut-point", cuts[0]);

if (size_zero(y, x, beta, cuts))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't your fault, but it's another function that landed without sufficient review---it should be named size_zero_any or something like that. Anyway, guess that's a different PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this one is on me. This got in before the _any discussion in the is_nan PR. Sorry about that. I am assigning myself to fix this one.

@bob-carpenter
Copy link
Contributor

bob-carpenter commented May 28, 2019 via email

@t4c1
Copy link
Contributor Author

t4c1 commented May 28, 2019

cut>0 is just a comparison, not a branch. I am not sure, but I think comparisons can be vectorized.

Explicitely vectorizing code requires use of intrinsics, which is tedious and not portable. Luckily Eigen does it, so we do not have to. I just assume all Eigen expression will produce vectorized binary code. While there might be exceptions, Eigen expessions seem to always results in equivalent or faster calculations than doing things element-wise.

@bob-carpenter
Copy link
Contributor

bob-carpenter commented May 28, 2019 via email

@t4c1
Copy link
Contributor Author

t4c1 commented May 28, 2019

taking different behavior

That is not true. Comparisons are basic CPU instructions, just like for example adition. Just the result is either 0 or 1. Comparisons without if/loop are not branches.

@bob-carpenter
Copy link
Contributor

bob-carpenter commented May 29, 2019 via email

@t4c1
Copy link
Contributor Author

t4c1 commented May 29, 2019

Sorry, I don't understant, what do you mean with the first question. Ternary operator is a branch.

On a side note I assume we want this GLM to have similar input types as softmax regression.

@bob-carpenter
Copy link
Contributor

Let me elaborate with an example. Let's say I have a naive implementation of absolute value using the ternary operator.

a > 0 : a ? -a

The question is whether this generates branching instructions or not. How would that compare to the

(a > 0) * a + (1 - (a > 0)) * -a

which shouldn't generate any branching. I saw a Stack Overflow discussion about this recently which said the first version's faster on modern compilers, but now I can't find it.

@t4c1
Copy link
Contributor Author

t4c1 commented May 30, 2019

FIrst one would generate a branch and second would not. But the second would compile into quite a number of instructions. However, both might be optimized by a compiler.

@t4c1
Copy link
Contributor Author

t4c1 commented May 30, 2019

This test failure was expected and will disappear once PR1249 gets merged.

@stan-buildbot
Copy link
Contributor

(stat_comp_benchmarks/benchmarks/gp_pois_regr/gp_pois_regr.stan, 1.01)
(stat_comp_benchmarks/benchmarks/low_dim_corr_gauss/low_dim_corr_gauss.stan, 1.01)
(stat_comp_benchmarks/benchmarks/irt_2pl/irt_2pl.stan, 1.0)
(stat_comp_benchmarks/benchmarks/pkpd/one_comp_mm_elim_abs.stan, 0.99)
(stat_comp_benchmarks/benchmarks/eight_schools/eight_schools.stan, 0.98)
(stat_comp_benchmarks/benchmarks/gp_regr/gp_regr.stan, 1.01)
(stat_comp_benchmarks/benchmarks/arK/arK.stan, 1.0)
(performance.compilation, 1.01)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan, 1.0)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix/low_dim_gauss_mix.stan, 1.0)
(stat_comp_benchmarks/benchmarks/sir/sir.stan, 0.98)
(stat_comp_benchmarks/benchmarks/pkpd/sim_one_comp_mm_elim_abs.stan, 1.0)
(stat_comp_benchmarks/benchmarks/garch/garch.stan, 1.0)
(stat_comp_benchmarks/benchmarks/gp_regr/gen_gp_data.stan, 1.0)
(stat_comp_benchmarks/benchmarks/arma/arma.stan, 1.0)
Result: 0.99821972904
Commit hash: 1dad46b

@stan-buildbot
Copy link
Contributor

(stat_comp_benchmarks/benchmarks/gp_pois_regr/gp_pois_regr.stan, 1.0)
(stat_comp_benchmarks/benchmarks/low_dim_corr_gauss/low_dim_corr_gauss.stan, 0.99)
(stat_comp_benchmarks/benchmarks/irt_2pl/irt_2pl.stan, 1.0)
(stat_comp_benchmarks/benchmarks/pkpd/one_comp_mm_elim_abs.stan, 0.98)
(stat_comp_benchmarks/benchmarks/eight_schools/eight_schools.stan, 1.02)
(stat_comp_benchmarks/benchmarks/gp_regr/gp_regr.stan, 1.02)
(stat_comp_benchmarks/benchmarks/arK/arK.stan, 1.0)
(performance.compilation, 1.01)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan, 1.0)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix/low_dim_gauss_mix.stan, 0.99)
(stat_comp_benchmarks/benchmarks/sir/sir.stan, 1.04)
(stat_comp_benchmarks/benchmarks/pkpd/sim_one_comp_mm_elim_abs.stan, 1.0)
(stat_comp_benchmarks/benchmarks/garch/garch.stan, 1.0)
(stat_comp_benchmarks/benchmarks/gp_regr/gen_gp_data.stan, 1.0)
(stat_comp_benchmarks/benchmarks/arma/arma.stan, 1.01)
Result: 1.00248068624
Commit hash: 1dad46b

@stan-buildbot
Copy link
Contributor

(stat_comp_benchmarks/benchmarks/gp_pois_regr/gp_pois_regr.stan, 1.0)
(stat_comp_benchmarks/benchmarks/low_dim_corr_gauss/low_dim_corr_gauss.stan, 0.98)
(stat_comp_benchmarks/benchmarks/irt_2pl/irt_2pl.stan, 1.0)
(stat_comp_benchmarks/benchmarks/pkpd/one_comp_mm_elim_abs.stan, 0.99)
(stat_comp_benchmarks/benchmarks/eight_schools/eight_schools.stan, 0.98)
(stat_comp_benchmarks/benchmarks/gp_regr/gp_regr.stan, 1.0)
(stat_comp_benchmarks/benchmarks/arK/arK.stan, 0.99)
(performance.compilation, 1.01)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan, 0.99)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix/low_dim_gauss_mix.stan, 1.0)
(stat_comp_benchmarks/benchmarks/sir/sir.stan, 0.95)
(stat_comp_benchmarks/benchmarks/pkpd/sim_one_comp_mm_elim_abs.stan, 1.03)
(stat_comp_benchmarks/benchmarks/garch/garch.stan, 1.01)
(stat_comp_benchmarks/benchmarks/gp_regr/gen_gp_data.stan, 0.99)
(stat_comp_benchmarks/benchmarks/arma/arma.stan, 0.99)
Result: 0.99429347567
Commit hash: 94f4635

@serban-nicusor-toptal serban-nicusor-toptal added this to the 2.20.0++ milestone Jul 18, 2019
@t4c1 t4c1 mentioned this pull request Aug 26, 2019
7 tasks
@stan-buildbot
Copy link
Contributor

(stat_comp_benchmarks/benchmarks/gp_pois_regr/gp_pois_regr.stan, 1.0)
(stat_comp_benchmarks/benchmarks/low_dim_corr_gauss/low_dim_corr_gauss.stan, 0.99)
(stat_comp_benchmarks/benchmarks/irt_2pl/irt_2pl.stan, 1.0)
(stat_comp_benchmarks/benchmarks/pkpd/one_comp_mm_elim_abs.stan, 0.99)
(stat_comp_benchmarks/benchmarks/eight_schools/eight_schools.stan, 1.01)
(stat_comp_benchmarks/benchmarks/gp_regr/gp_regr.stan, 1.01)
(compilation, 1.03)
(stat_comp_benchmarks/benchmarks/arK/arK.stan, 1.02)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan, 1.0)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix/low_dim_gauss_mix.stan, 1.0)
(stat_comp_benchmarks/benchmarks/sir/sir.stan, 1.01)
(stat_comp_benchmarks/benchmarks/pkpd/sim_one_comp_mm_elim_abs.stan, 1.0)
(stat_comp_benchmarks/benchmarks/garch/garch.stan, 1.01)
(stat_comp_benchmarks/benchmarks/gp_regr/gen_gp_data.stan, 1.0)
(stat_comp_benchmarks/benchmarks/arma/arma.stan, 0.93)
Result: 0.99891798471
Commit hash: bd54230

@rok-cesnovar
Copy link
Member

Do we have any volunteers to review this? Bob's and my intial comments were addressed so this is definitely ready for a proper review.

Copy link
Collaborator

@andrjohns andrjohns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! I always learn a lot about using Eigen from your pulls.

The rev testing is very thorough, but I'm not sure what the policy on tests are for the prob functions (i.e. whether you also need prim/fwd/mix tests here) @syclik what's your view here?

@t4c1
Copy link
Contributor Author

t4c1 commented Sep 13, 2019

What is going on with these tests, running for 22 hours?

@rok-cesnovar
Copy link
Member

Its somethin with the AWS instances I think. Restarting the entire tests is the only way I know how to fix this.

cc: @serban-nicusor-toptal

@stan-buildbot
Copy link
Contributor

(stat_comp_benchmarks/benchmarks/gp_pois_regr/gp_pois_regr.stan, 0.98)
(stat_comp_benchmarks/benchmarks/low_dim_corr_gauss/low_dim_corr_gauss.stan, 0.97)
(stat_comp_benchmarks/benchmarks/irt_2pl/irt_2pl.stan, 1.0)
(stat_comp_benchmarks/benchmarks/pkpd/one_comp_mm_elim_abs.stan, 0.89)
(stat_comp_benchmarks/benchmarks/eight_schools/eight_schools.stan, 0.99)
(stat_comp_benchmarks/benchmarks/gp_regr/gp_regr.stan, 1.01)
(stat_comp_benchmarks/benchmarks/arK/arK.stan, 1.0)
(performance.compilation, 1.01)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan, 1.0)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix/low_dim_gauss_mix.stan, 1.01)
(stat_comp_benchmarks/benchmarks/sir/sir.stan, 1.0)
(stat_comp_benchmarks/benchmarks/pkpd/sim_one_comp_mm_elim_abs.stan, 1.0)
(stat_comp_benchmarks/benchmarks/garch/garch.stan, 0.98)
(stat_comp_benchmarks/benchmarks/gp_regr/gen_gp_data.stan, 0.98)
(stat_comp_benchmarks/benchmarks/arma/arma.stan, 0.99)
Result: 0.98804564761
Commit hash: 7973dbc

@t4c1
Copy link
Contributor Author

t4c1 commented Sep 13, 2019

@andrjohns This is ready for next review.

Copy link
Collaborator

@andrjohns andrjohns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants