
Allow LMM to init GLMM #588

Merged
palday merged 7 commits into main from pa/init on Jan 6, 2022

Conversation

@palday (Member) commented Jan 4, 2022

Benchmarking suggests that the GLM inits we're currently using are the best for small models (although at a few hundred millisecond total difference in the fit, it's realistically a wash). But things get more interesting if we look at big models.

Here's an example from the English Lexicon Project data -- I've split it into two plots because the scale changes pretty dramatically.

[Plot 1: objective over the first 50 iterations (counted after the LMM was fit, for the LMM-initialized models)]

[Plot 2: objective over all iterations after the first 50]

Here's the dataframe showing the progress (wrapped in a zip file to make GitHub happy):
glmm_fitlog_by_init.arrow.zip

The relevant timing info

Minimizing 973   Time: 1:42:50 ( 6.34  s/it) # GLM init
6179.284576 seconds (1.23 M allocations: 1.374 GiB, 0.00% gc time, 0.00% compilation time)
Minimizing 38    Time: 0:00:04 ( 0.12  s/it) # LMM being fit....
Minimizing 113   Time: 0:11:16 ( 5.98  s/it) # beta and theta from LMM
689.532497 seconds (285.49 k allocations: 1.341 GiB, 0.06% gc time)
Minimizing 38    Time: 0:00:04 ( 0.12  s/it) 
Minimizing 556   Time: 1:00:04 ( 6.48  s/it) # theta from LMM
3618.443314 seconds (767.97 k allocations: 1.358 GiB, 0.01% gc time)
Minimizing 38    Time: 0:00:04 ( 0.12  s/it)
Minimizing 113 Time: 0:12:10 ( 6.47  s/it) # beta from LMM
744.312527 seconds (285.59 k allocations: 1.341 GiB, 0.08% gc time)

It could be interesting to plot the norm of the successive differences in the parameter vector, so we can see how much movement we're getting -- and maybe how much of that is change in beta and how much is change in theta.
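
Something along these lines could compute that movement -- a minimal sketch, assuming the fitlog is a vector of (parameter vector, objective) pairs and that the leading p entries of each parameter vector are the fixed-effects coefficients (both assumptions, not the package API):

using LinearAlgebra

# `fitlog` assumed to hold (params, objective) pairs; `p` = number of fixed effects.
params   = first.(fitlog)                # parameter vector at each iteration
movement = norm.(diff(params))           # distance moved between successive iterations

# Splitting the movement into beta and theta components (assumes beta comes first):
beta_movement  = norm.(diff([x[1:p]         for x in params]))
theta_movement = norm.(diff([x[(p + 1):end] for x in params]))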

@dmbates

codecov bot commented Jan 4, 2022

Codecov Report

Merging #588 (1aa42af) into main (abca4ea) will increase coverage by 0.01%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #588      +/-   ##
==========================================
+ Coverage   96.22%   96.23%   +0.01%     
==========================================
  Files          28       28              
  Lines        2516     2523       +7     
==========================================
+ Hits         2421     2428       +7     
  Misses         95       95              
Impacted Files                       Coverage Δ
src/generalizedlinearmixedmodel.jl   90.15% <100.00%> (+0.22%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update abca4ea...1aa42af.

@dmbates (Collaborator) commented Jan 4, 2022

Thanks for doing this and for cleaning up some of our earlier code. This definitely looks worth pursuing.

I'm not sure I understand the plots, particularly the objective axis. What is the scaling between the first and second plots?

Where are the values for lmm-β on the second plot? The oscillating behavior for lmm-βθ is alarming, as is the difference in the eventual objective values for the different starting-value methods.

Lots of good stuff here to contemplate.

@dmbates (Collaborator) commented Jan 4, 2022

P.S. Let me know when you want a review of this.

@palday (Member, Author) commented Jan 4, 2022

@dmbates I messed up the first plot -- I was trying to see if I could get everything on one scale, so I log-scaled the objective on the first plot but not on the second.

For the second plot, I think lmm-β is overplotted by glm-init.

I'm a bit concerned by the difference in the final objectives as well -- it's 75 points of log-likelihood! I think part of it is that we're looking at a very large dataset: even if the change in log-likelihood per observation is tiny, it is summed over a huge number of observations, so small perturbations in the parameters can still produce a big change in the total.
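
As a back-of-the-envelope illustration (the numbers below are invented for scale, not taken from this fit):

n_obs      = 750_000     # hypothetical order of magnitude for a large dataset
per_obs_ll = 1e-4        # tiny per-observation difference in log-likelihood
per_obs_ll * n_obs       # ≈ 75, on the order of the gap between the final objectives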

Taken together, it seems that the random effects dominate the optimization in larger models.

Do you have a better name for the kwarg? I'm leaning towards adding a note to the docstring that this feature and its kwarg are experimental and may disappear or change without being considered a breaking change. Then we can merge and use this code for further experimentation elsewhere.

@dmbates (Collaborator) commented Jan 4, 2022

I should have thought of a logarithmic scale.

Are you referring to the thin kwarg? I agree that it is a poor choice. In retrospect, I think it would be sufficient to make the argument a Boolean rather than a peculiar numeric value where smaller means more information, until you get to 0, where it doesn't. If someone wants to prune the values after the fact, they can just go ahead and do it.
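
To make the contrast concrete, here is a sketch of the two interfaces; `form` and `data` are placeholders, and the Boolean keyword name used below is hypothetical, not an existing API:

using MixedModels

# Current style (sketch): `thin` is an integer stride, where smaller values
# record more of the optimization history.
fit(MixedModel, form, data; thin=1)

# Suggested alternative: a plain Boolean toggle (the name `fitlog` here is
# hypothetical); anyone who wants a thinner record can prune it afterwards.
fit(MixedModel, form, data; fitlog=true)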

By the way, we currently have the initial value repeated because we start the structure with the initial value and then also record the first iteration. If we want to allow ourselves breaking changes later, we should fix that and maybe eliminate a few of the fields of the OptSummary struct that would then redundantly record the starting values, the initial objective, the final objective, and the converged values.

@palday (Member, Author) commented Jan 4, 2022

For this PR, I meant the lmminit kwarg -- maybe init_from_lmm? Clearer but longer.
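
For reference, a sketch of what a call might look like under that name; the keyword and the values it accepts are exactly what's under discussion, and `form`/`data` are placeholders:

using MixedModels

# Hypothetical usage of the proposed keyword: initialize beta and theta from a
# preliminary LMM fit before starting the GLMM optimization.
glmm = fit(MixedModel, form, data, Bernoulli(); init_from_lmm=[:β, :θ])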

For OptSummary, we could potentially change some fields to be properties that reference the fitlog. That wouldn't be breaking in terms of the API, and we could define our own serialization methods, so changing the way the Serialization stdlib serializes that struct wouldn't be breaking IMHO.
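
A rough sketch of that idea; the field names and the (parameter vector, objective) layout of the fitlog are assumptions for illustration, not the actual struct definition:

# Derive the redundant OptSummary fields from the fitlog instead of storing them.
function Base.getproperty(s::OptSummary, name::Symbol)
    fl = getfield(s, :fitlog)                        # assumed: (params, objective) pairs
    name === :initial  && return first(first(fl))   # starting parameter values
    name === :finitial && return last(first(fl))    # initial objective
    name === :final    && return first(last(fl))    # converged parameter values
    name === :fmin     && return last(last(fl))     # final objective
    return getfield(s, name)
end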

palday marked this pull request as ready for review January 6, 2022 03:11
palday requested a review from dmbates January 6, 2022 03:11
@palday (Member, Author) commented Jan 6, 2022

@dmbates I think this is good to go.

@dmbates (Collaborator) left a review comment

Looks good. Thanks for doing this.

palday merged commit df95a29 into main Jan 6, 2022
palday deleted the pa/init branch January 6, 2022 15:52