Skip to content

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pool for gamlss objects #405

Closed
dnzmarcio opened this issue Jun 9, 2021 · 1 comment · Fixed by #406
Closed

pool for gamlss objects #405

dnzmarcio opened this issue Jun 9, 2021 · 1 comment · Fixed by #406

Comments

@dnzmarcio
Copy link
Contributor

The function pool ignores that there are different models for each parameter in gamlss (Generalized Additive Models for Location, Scale and Shape) objects such that it will average over the regression coefficients from different parameters that are modelled with the same covariates.

Libraries and data generation

# Libraries
library(gamlss)
library(mice)

# Data
group <- sample(c("A", "B"), size = 100, replace = TRUE)
biomarker <- rnorm(n = 100, mean = 0, sd = 1)
design.matrix.mu <- model.matrix(~ group + biomarker)
design.matrix.tau <- model.matrix(~ biomarker)
beta <- c(-1, 0.5, 1)
gamma <- c(1, 0.6)
sigma <- plogis(1)
mu <- plogis(design.matrix.mu%*%beta)
nu <- exp(design.matrix.tau%*%gamma)

outcome <- rBEINF(n = 100, mu = mu, sigma = sigma, nu = nu, tau = 0.1)
dataset <- data.frame(outcome, group, biomarker)
dataset[sample(1:100, size = 10), "biomarker"] <- NA

Fitting a gamlss model with formula for the mean (mu) and the probability of inflated zeros (nu)

fit <- gamlss(formula = outcome ~ group + biomarker,
              nu.formula = ~ biomarker,
              data = na.omit(dataset),
              family = "BEINF")
## GAMLSS-RS iteration 1: Global Deviance = -53.5503 
## GAMLSS-RS iteration 2: Global Deviance = -56.593 
## GAMLSS-RS iteration 3: Global Deviance = -56.6955 
## GAMLSS-RS iteration 4: Global Deviance = -56.7008 
## GAMLSS-RS iteration 5: Global Deviance = -56.7012
summary(fit)
## ******************************************************************
## Family:  c("BEINF", "Beta Inflated") 
## 
## Call:  gamlss(formula = outcome ~ group + biomarker, nu.formula = ~biomarker,  
##     family = "BEINF", data = na.omit(dataset)) 
## 
## Fitting method: RS() 
## 
## ------------------------------------------------------------------
## Mu link function:  logit
## Mu Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -1.4053     0.4058  -3.463 0.000846 ***
## groupB        1.1891     0.4763   2.497 0.014517 *  
## biomarker     1.0107     0.2567   3.937 0.000171 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## ------------------------------------------------------------------
## Sigma link function:  logit
## Sigma Coefficients:
##             Estimate Std. Error t value Pr(>|t|)   
## (Intercept)   0.6505     0.2025   3.213  0.00187 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## ------------------------------------------------------------------
## Nu link function:  log 
## Nu Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   0.8971     0.2430   3.692 0.000397 ***
## biomarker     0.4460     0.2642   1.688 0.095138 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## ------------------------------------------------------------------
## Tau link function:  log 
## Tau Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -2.6027     0.7328  -3.552 0.000634 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## ------------------------------------------------------------------
## No. of observations in the fit:  90 
## Degrees of Freedom for the fit:  7
##       Residual Deg. of Freedom:  83 
##                       at cycle:  5 
##  
## Global Deviance:     -56.70123 
##             AIC:     -42.70123 
##             SBC:     -25.20256 
## ******************************************************************

Multiple imputation

imp <- mice(m = 2, dataset)
## 
##  iter imp variable
##   1   1  biomarker
##   1   2  biomarker
##   2   1  biomarker
##   2   2  biomarker
##   3   1  biomarker
##   3   2  biomarker
##   4   1  biomarker
##   4   2  biomarker
##   5   1  biomarker
##   5   2  biomarker
fit <- with(data = imp, gamlss(formula = outcome ~ group + biomarker,
                               nu.formula = ~ biomarker,
                               family = "BEINF"))
## GAMLSS-RS iteration 1: Global Deviance = -37.6422 
## GAMLSS-RS iteration 2: Global Deviance = -41.2915 
## GAMLSS-RS iteration 3: Global Deviance = -41.4323 
## GAMLSS-RS iteration 4: Global Deviance = -41.4397 
## GAMLSS-RS iteration 5: Global Deviance = -41.4405 
## GAMLSS-RS iteration 1: Global Deviance = -33.0859 
## GAMLSS-RS iteration 2: Global Deviance = -37.5867 
## GAMLSS-RS iteration 3: Global Deviance = -37.7456 
## GAMLSS-RS iteration 4: Global Deviance = -37.7563 
## GAMLSS-RS iteration 5: Global Deviance = -37.7567

Issue: pool ignores that there are two regression models for mu and tau resulting in only one set of coefficients

summary(pool(fit))
##          term   estimate std.error  statistic        df    p.value
## 1 (Intercept) -0.5395432 1.5824961 -0.3409444  3.004534 0.75559497
## 2      groupB  0.9770695 0.4330088  2.2564658 88.762810 0.02649718
## 3   biomarker  0.6821122 0.3807480  1.7915055  7.769334 0.11209532

Session Info

devtools::session_info()
## - Session info ---------------------------------------------------------------
##  setting  value                       
##  version  R version 4.0.3 (2020-10-10)
##  os       Windows 10 x64              
##  system   x86_64, mingw32             
##  ui       RTerm                       
##  language (EN)                        
##  collate  English_United States.1252  
##  ctype    English_United States.1252  
##  tz       America/Los_Angeles         
##  date     2021-06-09                  
## 
## - Packages -------------------------------------------------------------------
##  package     * version date       lib source        
##  assertthat    0.2.1   2019-03-21 [1] CRAN (R 4.0.3)
##  backports     1.2.1   2020-12-09 [1] CRAN (R 4.0.3)
##  broom         0.7.5   2021-02-19 [1] CRAN (R 4.0.4)
##  callr         3.5.1   2020-10-13 [1] CRAN (R 4.0.3)
##  cli           2.4.0   2021-04-05 [1] CRAN (R 4.0.5)
##  crayon        1.4.1   2021-02-08 [1] CRAN (R 4.0.4)
##  DBI           1.1.0   2019-12-15 [1] CRAN (R 4.0.3)
##  desc          1.2.0   2018-05-01 [1] CRAN (R 4.0.3)
##  devtools      2.3.2   2020-09-18 [1] CRAN (R 4.0.3)
##  digest        0.6.27  2020-10-24 [1] CRAN (R 4.0.3)
##  dplyr         1.0.5   2021-03-05 [1] CRAN (R 4.0.4)
##  ellipsis      0.3.1   2020-05-15 [1] CRAN (R 4.0.4)
##  evaluate      0.14    2019-05-28 [1] CRAN (R 4.0.3)
##  fansi         0.4.2   2021-01-15 [1] CRAN (R 4.0.3)
##  fs            1.5.0   2020-07-31 [1] CRAN (R 4.0.3)
##  gamlss      * 5.2-0   2020-09-12 [1] CRAN (R 4.0.3)
##  gamlss.data * 5.1-4   2019-05-15 [1] CRAN (R 4.0.3)
##  gamlss.dist * 5.1-7   2020-07-13 [1] CRAN (R 4.0.3)
##  generics      0.1.0   2020-10-31 [1] CRAN (R 4.0.3)
##  glue          1.4.2   2020-08-27 [1] CRAN (R 4.0.3)
##  htmltools     0.5.0   2020-06-16 [1] CRAN (R 4.0.3)
##  knitr         1.31    2021-01-27 [1] CRAN (R 4.0.4)
##  lattice       0.20-41 2020-04-02 [2] CRAN (R 4.0.3)
##  lifecycle     1.0.0   2021-02-15 [1] CRAN (R 4.0.4)
##  magrittr      2.0.1   2020-11-17 [1] CRAN (R 4.0.4)
##  MASS        * 7.3-53  2020-09-09 [2] CRAN (R 4.0.3)
##  Matrix        1.2-18  2019-11-27 [2] CRAN (R 4.0.3)
##  memoise       1.1.0   2017-04-21 [1] CRAN (R 4.0.3)
##  mice        * 3.13.0  2021-01-27 [1] CRAN (R 4.0.5)
##  nlme        * 3.1-149 2020-08-23 [2] CRAN (R 4.0.3)
##  pillar        1.6.0   2021-04-13 [1] CRAN (R 4.0.5)
##  pkgbuild      1.1.0   2020-07-13 [1] CRAN (R 4.0.3)
##  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.0.3)
##  pkgload       1.1.0   2020-05-29 [1] CRAN (R 4.0.3)
##  prettyunits   1.1.1   2020-01-24 [1] CRAN (R 4.0.3)
##  processx      3.4.4   2020-09-03 [1] CRAN (R 4.0.3)
##  ps            1.4.0   2020-10-07 [1] CRAN (R 4.0.3)
##  purrr         0.3.4   2020-04-17 [1] CRAN (R 4.0.3)
##  R6            2.5.0   2020-10-28 [1] CRAN (R 4.0.3)
##  Rcpp          1.0.6   2021-01-15 [1] CRAN (R 4.0.4)
##  remotes       2.2.0   2020-07-21 [1] CRAN (R 4.0.3)
##  rlang         0.4.10  2020-12-30 [1] CRAN (R 4.0.3)
##  rmarkdown     2.5     2020-10-21 [1] CRAN (R 4.0.3)
##  rprojroot     2.0.2   2020-11-15 [1] CRAN (R 4.0.3)
##  sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 4.0.3)
##  stringi       1.5.3   2020-09-09 [1] CRAN (R 4.0.3)
##  stringr       1.4.0   2019-02-10 [1] CRAN (R 4.0.3)
##  survival      3.2-7   2020-09-28 [2] CRAN (R 4.0.3)
##  testthat      3.0.0   2020-10-31 [1] CRAN (R 4.0.3)
##  tibble        3.1.1   2021-04-18 [1] CRAN (R 4.0.3)
##  tidyr         1.1.3   2021-03-03 [1] CRAN (R 4.0.4)
##  tidyselect    1.1.0   2020-05-11 [1] CRAN (R 4.0.3)
##  usethis       1.6.3   2020-09-17 [1] CRAN (R 4.0.3)
##  utf8          1.2.1   2021-03-12 [1] CRAN (R 4.0.3)
##  vctrs         0.3.7   2021-03-29 [1] CRAN (R 4.0.5)
##  withr         2.4.2   2021-04-18 [1] CRAN (R 4.0.3)
##  xfun          0.22    2021-03-11 [1] CRAN (R 4.0.4)
##  yaml          2.2.1   2020-02-01 [1] CRAN (R 4.0.3)
@stefvanbuuren
Copy link
Member

Thanks for your note.

I am afraid I cannot solve this. gamlss fits a model for each moment of the distribution. From your mail I conclude that the tidy() function for gamlss objects only stores the model parameters for the model of the first moment.

I see two ways to go forward:

  • The royal solution: Adapt tidy() and glance() functions for gamlss object to include all model coefficients;
  • The quick way: Manually extract the fitted coefficients and their variances for each model, and use mice::pool.scalar() to combine them.

@amices amices locked and limited conversation to collaborators Jun 9, 2021

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants