passing maxit to glm.control when fitting earth #1018

Closed
elad663 opened this issue Mar 17, 2019 · 3 comments

elad663 commented Mar 17, 2019

I am fitting an earth classification model and need more than the default 25 GLM iterations. earth calls glm internally, and glm accepts a control argument that is passed on to glm.control, which is how maxit is set:

    mars_fit <- earth(formula = response ~ x1 + x2, data = dat,
                      glm = list(family=binomial, control = list(maxit = 50)))

However, when I pass the same glm = list(family=binomial, control = list(maxit = 50)) argument to train, it is silently dropped: the setting never reaches earth, and no error is raised.

Minimal, reproducible example:

It is hard to simulate data that needs more than the default 25 iterations, so instead I set maxit to 1 and show that the model summary reports a higher iteration count. See the iters column printed under GLM (family binomial, link logit).
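The maxit = 1 trick can be checked in isolation with base R's glm, without earth or caret. The following is a minimal sketch on simulated data (the coefficients and variable names are illustrative assumptions, not the dataset from this issue). If the control list is honored, the capped fit should report exactly one iteration, while the default fit takes more.

```r
# Base-R sketch: cap the IRLS iterations and read back what glm actually did.
# The data here are simulated for illustration only.
set.seed(123)
n  <- 1000
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rbinom(n, size = 1, prob = plogis(0.72 * x1 - 0.86 * x2))

# Default control: maxit = 25
fit_default <- glm(y ~ x1 + x2, family = binomial)

# Capped control: glm warns that the algorithm did not converge,
# so the warning is suppressed here on purpose.
fit_capped <- suppressWarnings(
  glm(y ~ x1 + x2, family = binomial, control = list(maxit = 1))
)

# The iter component records how many IRLS iterations were run.
iters <- c(default = fit_default$iter, capped = fit_capped$iter)
print(iters)
```

An iteration count above 1 in a fit that was supposedly capped at maxit = 1, as in the summary output in this report, is therefore evidence that the control list never reached glm.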

Adding glm to the tuneGrid fails with Error: The tuning parameter grid should not have columns nprune, degree.

How can I set maxit?
Thank you.
Also asked on Stack Overflow: https://stackoverflow.com/questions/55210511/how-to-pass-glm-control-argument-for-earth-using-caret-maxit

Minimal dataset:

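    rm(list = ls())
    set.seed(123)
    library(caret)
    library(earth)

    # Simulate a binary response driven by x1, x2 and an x1^2 * x2 interaction,
    # with extra noise added on the logit scale.
    n <- 1000
    b1 <- 0.72; b2 <- -0.86; b12 <- 2.91
    x1 <- rnorm(n = n, sd = 1)
    x2 <- rnorm(n = n, sd = 1)
    noise <- rnorm(n = n, sd = 3)
    prob <- plogis(b1 * x1 + b2 * x2 + b12 * x1^2 * x2 + noise)
    y <- rbinom(n = n, size = 1, prob = prob)
    dat <- data.frame(y = y, x1 = x1, x2 = x2)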

Minimal, runnable code:

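    # 10-fold cross-validation over a grid of earth tuning parameters
    fit_control <- trainControl(method = "cv", number = 10)
    mars_grid <- expand.grid(degree = 1:2, nprune = 2:10)

    # maxit = 1 is deliberately too low; if it were passed through,
    # the GLM summary below would report iters = 1.
    mars_fit <- train(factor(y) ~ x1 + x2, method = "earth", trControl = fit_control,
                      tuneGrid = mars_grid, data = dat,
                      glm = list(family = binomial, control = list(maxit = 1)))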


> summary(mars_fit$finalModel)
Call: earth(x=matrix[1000,2], y=factor.object, keepxy=TRUE, glm=list(family=function.object),
            degree=2, nprune=10)

GLM coefficients
                                            1
(Intercept)                         0.0133392
h(x1- -0.445662)                    3.8482523
h(-0.10828-x1)                     -2.3430108
h(x1- -0.10828)                    -5.8111977
h(-0.34965-x1) * h(x2- -0.805698)   3.5749689
h(x1- -0.34965) * h(x2- -0.805698) -2.5430102
h(-0.10828-x1) * h(x2-0.366114)    -2.0382430
h(x1- -0.10828) * h(x2-0.686135)    2.5903118
h(x1-0.144476) * h(x2- -0.805698)   5.5678739
h(x1-0.431099) * h(x2-0.76164)     -6.5535434

Earth selected 10 of 17 terms, and 2 of 2 predictors
Termination condition: Reached nk 21
Importance: x1, x2
Number of terms at each degree of interaction: 1 3 6
Earth GCV 0.2073745    RSS 197.7424    GRSq 0.1720774    RSq 0.2089513

GLM (family binomial, link logit):
  nulldev  df      dev  df devratio  AIC iters converged
 1386.194 999 1128.068 990    0.186 1148     5         1

Session Info:

R version 3.5.2 (2018-12-20)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.6 LTS

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] earth_4.7.0        plotmo_3.5.2       TeachingDemos_2.10 plotrix_3.7-4     
[5] caret_6.0-81       ggplot2_3.1.0      lattice_0.20-38   

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.0         pillar_1.3.1       compiler_3.5.2     gower_0.2.0       
 [5] plyr_1.8.4         iterators_1.0.10   class_7.3-15       tools_3.5.2       
 [9] rpart_4.1-13       ipred_0.9-8        lubridate_1.7.4    tibble_2.0.1      
[13] nlme_3.1-137       gtable_0.2.0       pkgconfig_2.0.2    rlang_0.3.1       
[17] Matrix_1.2-16      foreach_1.4.4      rstudioapi_0.9.0   prodlim_2018.04.18
[21] e1071_1.7-0.1      withr_2.1.2        dplyr_0.8.0.1      stringr_1.4.0     
[25] generics_0.0.2     recipes_0.1.4      stats4_3.5.2       nnet_7.3-12       
[29] grid_3.5.2         tidyselect_0.2.5   glue_1.3.1         data.table_1.12.0 
[33] R6_2.4.0           survival_2.43-3    lava_1.6.5         reshape2_1.4.3    
[37] purrr_0.3.1        magrittr_1.5       splines_3.5.2      MASS_7.3-51.1     
[41] scales_1.0.0       codetools_0.2-16   ModelMetrics_1.2.2 assertthat_0.2.0  
[45] timeDate_3043.102  colorspace_1.4-0   stringi_1.4.3      lazyeval_0.2.1    
[49] munsell_0.5.0      crayon_1.3.4      

topepo (Owner) commented Mar 20, 2019

It's a bug. I'll fix it for the next release.

topepo (Owner) commented Mar 25, 2019

This should do it, but please test. If you don't pass glm, caret populates it with a default; if you do pass it, your value overrides that default.
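The default-or-override behavior described here can be sketched in a few lines of base R. This is not caret's actual code: resolve_glm_arg is a hypothetical helper, and the default of list(family = binomial) is an assumption based on the glm=list(family=function.object) shown in the summary call earlier in this thread.

```r
# Hypothetical helper illustrating "populate if missing, otherwise keep the
# user's value". It only resolves the argument list; it does not fit a model.
resolve_glm_arg <- function(...) {
  dots <- list(...)
  if (!("glm" %in% names(dots))) {
    # Assumed default for a classification fit: binomial family, no control
    dots$glm <- list(family = binomial)
  }
  dots
}

# No glm supplied: the default is filled in, so glm.control's own maxit applies.
args_default <- resolve_glm_arg(degree = 2)

# glm supplied: the user's list, including control, is forwarded untouched.
args_user <- resolve_glm_arg(
  degree = 2,
  glm = list(family = binomial, control = list(maxit = 50))
)
print(args_user$glm$control$maxit)
```

Under this pattern the user's control list is kept whole rather than merged with the default, which matches the "passing it overrides the default" wording above.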

elad663 (Author) commented Mar 25, 2019

Works well for me. Thank you.

topepo closed this as completed Mar 25, 2019