Incorrect definition of glm.nb model #688

Closed
jpclemens0 opened this issue Jul 13, 2017 · 1 comment

@jpclemens0 jpclemens0 commented Jul 13, 2017

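The glm.nb model does not produce suitable output for the default cross-validation cost function when fitting count data, because type = 'response' is missing from the predict call in its model definition. Without it, predictions are returned on the scale of the linear predictor (the predict.glm default) rather than as expected counts. Compare with the definition of the glm model.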
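
To illustrate the scale mismatch outside of caret, here is a minimal sketch that fits the same formula directly with MASS::glm.nb on the full quine data and compares the two prediction types. The names fit, pred_link, pred_resp and rmse are made up for this illustration, and the sketch assumes the default log link.

```r
library(MASS)  # glm.nb and the quine data set

# Fit the negative binomial model from this issue on the full data (no resampling).
fit <- glm.nb(Days ~ Sex/(Age + Eth*Lrn), data = quine)

# predict() on a glm-type fit defaults to type = "link",
# i.e. the linear predictor (log scale under the default log link).
pred_link <- predict(fit, newdata = quine)

# type = "response" returns expected counts, comparable to the observed Days.
pred_resp <- predict(fit, newdata = quine, type = "response")

# With a log link, the response-scale values are the exponentiated link-scale values.
print(all.equal(unname(exp(pred_link)), unname(pred_resp)))

# Hypothetical helper: RMSE of predictions against observed counts.
rmse <- function(pred, obs) sqrt(mean((pred - obs)^2))
print(c(link = rmse(pred_link, quine$Days),
        response = rmse(pred_resp, quine$Days)))
```

Any metric computed from the link-scale predictions compares log-scale values with raw counts, which is why the resampled RMSE and R-squared are not meaningful without type = 'response'.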

Minimal, reproducible example:

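library(caret)
library(MASS)  # provides glm.nb and the quine data set

# Summary function that prints the first few predictions before scoring them,
# so the scale of the pred column is visible for each resample.
cost <- function(data, lev = NULL, model = NULL) {
  print(head(data))
  defaultSummary(data)
}

train_control <- trainControl(method = 'cv', number = 2, summaryFunction = cost)
tr_out <- train(Days ~ Sex/(Age + Eth*Lrn), data = quine, trControl = train_control,
                method = 'glm.nb', tuneLength = 1)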

Session Info:

> sessionInfo()
R version 3.4.0 (2017-04-21)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12.5

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] MASS_7.3-47     caret_6.0-76    ggplot2_2.1.0   lattice_0.20-35

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.11       magrittr_1.5       splines_3.4.0      munsell_0.4.3      colorspace_1.3-2   foreach_1.4.3     
 [7] minqa_1.2.4        stringr_1.2.0      car_2.1-5          plyr_1.8.4         tools_3.4.0        parallel_3.4.0    
[13] nnet_7.3-12        pbkrtest_0.4-7     grid_3.4.0         gtable_0.2.0       nlme_3.1-131       mgcv_1.8-17       
[19] quantreg_5.33      MatrixModels_0.4-1 iterators_1.0.8    lme4_1.1-13        Matrix_1.2-10      nloptr_1.0.4      
[25] reshape2_1.4.2     ModelMetrics_1.1.0 codetools_0.2-15   stringi_1.1.5      compiler_3.4.0     scales_0.4.1      
[31] stats4_3.4.0       SparseM_1.77  
topepo added a commit that referenced this issue Jul 24, 2017

@topepo topepo commented Jul 24, 2017

You are correct! The new code gives:

> tr_out <- train(Days ~ Sex/(Age + Eth*Lrn), data = quine, trControl = train_control, method = modelInfo, tuneLength = 1)
    pred obs rowIndex
137    1   0       89
23    43  11       20
142    5  19      127
51    19   6       46
91     8  81       94
103   41   0      110
        pred obs rowIndex
1  17.125190   2        1
4  13.649380   5        4
6  13.649380  13        6
7  13.649380  20        7
12  5.210164   7       12
13  5.210164  14       13
       pred obs rowIndex
2  15.44885  11        2
3  15.44885  14        3
5  10.59268   5        5
8  10.59268  22        8
9  13.85107   6        9
10 13.85107   6       10

(the first set is a random check that train does to make sure that the evaluation function is working correctly)
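
For anyone on a caret version without the fix, one possible workaround is to pass a patched copy of the model definition to train. The sketch below is an assumption about how such a modelInfo object could be built with getModelInfo; it is not necessarily how the object in the call above was created, and the predict body is modelled on the one the glm model uses.

```r
library(caret)
library(MASS)

# Copy the built-in glm.nb definition (regex = FALSE requests an exact name match).
modelInfo <- getModelInfo("glm.nb", regex = FALSE)[[1]]

# Override predict so that it returns expected counts instead of the linear predictor.
modelInfo$predict <- function(modelFit, newdata, submodels = NULL) {
  if (!is.data.frame(newdata)) newdata <- as.data.frame(newdata)
  predict(modelFit, newdata, type = "response")
}

# Same resampling setup as in the original report, using the patched definition.
train_control <- trainControl(method = "cv", number = 2)
tr_out <- train(Days ~ Sex/(Age + Eth*Lrn), data = quine,
                trControl = train_control, method = modelInfo, tuneLength = 1)
print(tr_out$results)
```

Passing a list as method leaves the installed package untouched, so the override can be dropped once the corrected definition is available.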

@topepo topepo closed this Aug 18, 2017