
simulate! and thus parametricbootstrap for GLMM #418

Merged: 23 commits merged into master from pa/glmmsimulate on Oct 29, 2020

Conversation

@palday (Member) commented on Oct 13, 2020

Closes #245.

I had to add a few accessor-to-preallocated-buffer methods for GLMM elsewhere (fixef!, etc.).
stderror! currently does a hidden allocation, but I'll fix that once I've looked at StatsBase.stderror.

There's also still some debugging code in there, but it should otherwise be in a good state to check for conceptual errors.
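
For context, these in-place accessors follow the usual preallocated-buffer pattern, roughly sketched below (the body is illustrative only, not the code merged in this PR; it assumes using MixedModels as in the demo that follows):

# Sketch of the preallocated-buffer accessor pattern: write the fixed-effects
# estimates into a caller-supplied vector so that repeated calls (e.g. inside
# the bootstrap loop) do not allocate a fresh vector each time.
function fixef!(v::AbstractVector{T}, m::GeneralizedLinearMixedModel{T}) where {T}
    copyto!(v, m.β)
end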

julia> using DataFrames

julia> using MixedModels

julia> using StableRNGs

julia> contra = MixedModels.dataset(:contra)
Arrow.Table: (dist = ["D01", "D01", "D01", "D01", "D01", "D01", "D01", "D01", "D01", "D01"  …  "D61", "D61", "D61", "D61", "D61", "D61", "D61", "D61", "D61", "D61"], urban = ["Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y"  …  "N", "N", "N", "N", "N", "N", "N", "N", "N", "N"], livch = ["3+", "0", "2", "3+", "0", "0", "3+", "3+", "1", "3+"  …  "1", "3+", "3+", "2", "2", "3+", "2", "3+", "0", "3+"], age = [18.44, -5.56, 1.44, 8.44, -13.56, -11.56, 18.44, -3.56, -5.56, 1.44  …  -5.56, 14.44, 19.44, -9.56, -2.56, 14.44, -4.56, 14.44, -13.56, 10.44], use = ["N", "N", "N", "N", "N", "N", "N", "N", "N", "N"  …  "N", "N", "N", "Y", "N", "N", "N", "N", "N", "N"])

julia> gm0 = fit(MixedModel, @formula(use ~ 1+age+abs2(age)+urban+livch+(1|urban&dist)), contra, Bernoulli(), fast=true);

julia> bs = parametricbootstrap(StableRNG(42), 1000, deepcopy(gm0));
Progress: 100%|███████████████████████████████████████ Time: 0:00:14

julia> combine(groupby(DataFrame(bs.β), :coefname), :β => first ∘ shortestcovint => :lower, :β => last ∘ shortestcovint => :upper)
7×3 DataFrame
│ Row │ coefname    │ lower       │ upper       │
│     │ Symbol      │ Float64     │ Float64     │
├─────┼─────────────┼─────────────┼─────────────┤
│ 1   │ (Intercept) │ -1.36078    │ -0.671704   │
│ 2   │ age         │ -0.01329    │ 0.0218545   │
│ 3   │ abs2(age)   │ -0.00565161 │ -0.00294269 │
│ 4   │ urban: Y    │ 0.439763    │ 1.09529     │
│ 5   │ livch: 1    │ 0.495756    │ 1.11646     │
│ 6   │ livch: 2    │ 0.493062    │ 1.23956     │
│ 7   │ livch: 3+   │ 0.549297    │ 1.23275     │

# CIs from the normal Wald approximation for comparison
julia> DataFrame(coef=fixefnames(gm0), lower=fixef(gm0) .- 2 .* stderror(gm0), upper=fixef(gm0) .+ 2 .* stderror(gm0))
7×3 DataFrame
│ Row │ coef        │ lower       │ upper       │
│     │ String      │ Float64     │ Float64     │
├─────┼─────────────┼─────────────┼─────────────┤
│ 1   │ (Intercept) │ -1.39311    │ -0.667259   │
│ 2   │ age         │ -0.0153779  │ 0.0218421   │
│ 3   │ abs2(age)   │ -0.00581783 │ -0.00289668 │
│ 4   │ urban: Y    │ 0.407463    │ 1.09011     │
│ 5   │ livch: 1    │ 0.486554    │ 1.14029     │
│ 6   │ livch: 2    │ 0.516467    │ 1.26265     │
│ 7   │ livch: 3+   │ 0.527929    │ 1.27843     │
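
The bootstrap intervals above come from shortestcovint, which returns the endpoints of the shortest contiguous interval covering a given proportion (0.95 by default) of the values. The idea can be sketched as follows (an illustrative reimplementation, not the MixedModels source):

# Find the narrowest window over the sorted values that covers `level` of them.
function shortestcovint_sketch(v::AbstractVector, level=0.95)
    sorted = sort(v)
    n = length(sorted)
    w = ceil(Int, level * n)                        # number of points to cover
    widths = [sorted[i + w - 1] - sorted[i] for i in 1:(n - w + 1)]
    i = argmin(widths)                              # left edge of the narrowest window
    (sorted[i], sorted[i + w - 1])
end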

@codecov (bot) commented on Oct 13, 2020

Codecov Report

Merging #418 into master will decrease coverage by 0.09%.
The diff coverage is 90.38%.

@@            Coverage Diff             @@
##           master     #418      +/-   ##
==========================================
- Coverage   94.31%   94.22%   -0.09%     
==========================================
  Files          23       23              
  Lines        1637     1681      +44     
==========================================
+ Hits         1544     1584      +40     
- Misses         93       97       +4     
Impacted Files                        Coverage Δ
src/linearmixedmodel.jl               94.23% <ø> (-0.02%) ⬇️
src/simulate.jl                       89.55% <87.50%> (-1.88%) ⬇️
src/generalizedlinearmixedmodel.jl    87.09% <91.66%> (+0.54%) ⬆️
src/bootstrap.jl                      98.13% <100.00%> (+0.03%) ⬆️
src/mixedmodel.jl                     81.81% <100.00%> (+0.86%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 52ca2ca...137cce7

@palday (Member, Author) commented on Oct 14, 2020

I need to fix the Gaussian with non-identity link pathway.

@palday requested a review from @dmbates on October 15, 2020

@dmbates (Collaborator) left a comment:

I like the idea of using the StableRNGs package in the tests.

I think it will be necessary to consider the scaling in building η in the simulate! method. I put a comment there. I'm pretty sure that using sdest applied to the penalized weighted least squares structure for the working response is not correct.

(Inline review comments on test/bootstrap.jl and src/simulate.jl were resolved; some refer to outdated code.)

src/simulate.jl (outdated):
end

# scale by lm.σ and add fixed-effects contribution
BLAS.gemv!('N', one(T), lm.X, β, lm.σ, η)
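
For reference, BLAS.gemv!('N', alpha, A, x, beta, y) overwrites y with alpha*A*x + beta*y, so the call above is equivalent to the allocating form:

# η enters holding the unscaled random-effects contribution; gemv! scales it
# by lm.σ while adding the fixed-effects term lm.X * β in place:
η .= lm.X * β .+ lm.σ .* η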

@dmbates (Collaborator) commented:

Are you sure that lm.σ is the correct multiple here? It just calls sdest(lm), which calls varest(m), which is based on pwrss(m). And that value is not necessarily related to any scale parameter.

@palday (Member, Author) replied:

Yep, my deep dive into the GLMM code has highlighted this to me. It 'works' here because sigma is usually close to 1 for families without a dispersion parameter, and I think unit scaling is correct.
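
The rule being described can be illustrated with a small sketch (the helper names are assumptions for illustration, not the MixedModels API):

using Distributions

# Sketch: families with no dispersion parameter (Bernoulli, Binomial, Poisson)
# keep the random-effects contribution on the unit scale; dispersed families
# (e.g. Normal, Gamma) scale it by the estimated dispersion σ.
has_dispersion(d::Distribution) = !(d isa Union{Bernoulli,Binomial,Poisson})
simulation_scale(d::Distribution, σ::Real) = has_dispersion(d) ? σ : one(σ)

simulation_scale(Bernoulli(), 0.98)   # 1.0 — unit scaling, as argued above
simulation_scale(Gamma(), 0.98)       # 0.98 — scale by the dispersion estimate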

@palday (Member, Author) commented on Oct 18, 2020

The failure on nightly looks unrelated:

   MethodError: propertynames(::LinearMixedModel{Float64}, ::Bool) is ambiguous. Candidates:
    propertynames(m::LinearMixedModel, private) in MixedModels at /home/runner/work/MixedModels.jl/MixedModels.jl/src/linearmixedmodel.jl:564
    propertynames(x, private::Bool) in Base at reflection.jl:1530
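
The ambiguity arises because the package method leaves private untyped while the Base method constrains it to Bool, so neither is more specific. A toy reproduction (illustrative type, not MixedModels code):

struct Toy end
Base.propertynames(::Toy, private) = (:a, :b)        # untyped `private` is ambiguous
                                                     # with Base.propertynames(x, private::Bool)
Base.propertynames(::Toy, private::Bool) = (:a, :b)  # annotating the argument resolves it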

@palday (Member, Author) commented on Oct 18, 2020

@dmbates Care to take a second look?

src/bootstrap.jl (outdated):

@@ -290,7 +304,7 @@ function tidyσs(bsamp::MixedModelBootstrap{T}) where {T}
     )
     for (iter, r) in enumerate(bstr)
         setθ!(bsamp, iter) # install r.θ in λ
-        σ = r.σ
+        σ = r.σ === missing ? 1 : r.σ

@dmbates (Collaborator) commented:

This could be written as

σ = coalesce(r.σ, one(T))

to keep type consistency and stay more in the spirit of the missing design.
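
For reference, coalesce returns its first argument that is not missing:

julia> coalesce(missing, 1.0)   # the missing value falls through to the fallback
1.0

julia> coalesce(0.5, 1.0)       # non-missing values pass through unchanged
0.5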

@palday (Member, Author) replied:

Nice, thanks for the tip!


@dmbates (Collaborator) left a comment:

Just one minor suggestion: use coalesce instead of x === missing ? ...

@palday merged commit 0bf61a2 into master on Oct 29, 2020
@palday deleted the pa/glmmsimulate branch on October 29, 2020