update.train making .weights an independent variable when method = 'ranger' #935

Closed
mb158127 opened this issue Aug 27, 2018 · 2 comments

@mb158127 mb158127 commented Aug 27, 2018

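As the title says: after calling update() on a train object fit with method = 'ranger' and case weights, .weights appears in the final model as a predictor. I am trying to override the train object's $bestTune to get a different $finalModel for prediction. I am not sure whether this belongs here or with ranger. Thanks.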

Minimal, reproducible example:

library(caret)

grid <- expand.grid(mtry = 1:2,
                    min.node.size = c(5, 10, 15),
                    splitrule = "variance" )

ctrl <- trainControl(method = "cv",
                     number = 5)

## define response for model
response <- iris$Sepal.Length / iris$Sepal.Width
## define case.weights for ranger
wgt <- iris$Sepal.Width

## fit initial train using ranger
trn.ranger <- train( y = response,
                     x = iris[, 3:5],
                     weights = wgt,
                     method = "ranger",
                     tuneGrid = grid,
                     verbose = FALSE,
                     trControl = ctrl,
                     num.trees = 20,
                     importance = "impurity")

varImp(trn.ranger) ## 3 features in model
trn.ranger$bestTune ##bestTune is mtry=1, min.node.size = 10

## update to choose different bestTune
trn.ranger.update <- update(trn.ranger, param = list(mtry = 2, 
                                                     splitrule = "variance", 
                                                     min.node.size = 15))
varImp(trn.ranger.update) ## new 4th feature is .weights

## errors because newdata has no .weights column
predict(trn.ranger.update, newdata = iris)
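
Until a fix is available, one possible workaround is to skip update() and refit with the chosen parameters directly. The sketch below is not part of the original report: it assumes that trainControl(method = "none") combined with a one-row tuneGrid fits a single model without resampling, and the names chosen, fit.fixed, model.vars, and preds are illustrative only.

```r
library(caret)

set.seed(935)  # arbitrary seed, only so repeated runs match

response <- iris$Sepal.Length / iris$Sepal.Width
wgt <- iris$Sepal.Width

## the single parameter combination we want instead of bestTune
chosen <- data.frame(mtry = 2,
                     splitrule = "variance",
                     min.node.size = 15)

## method = "none" skips resampling and fits one model on all rows
fit.fixed <- train(y = response,
                   x = iris[, 3:5],
                   weights = wgt,
                   method = "ranger",
                   tuneGrid = chosen,
                   trControl = trainControl(method = "none"),
                   num.trees = 20,
                   importance = "impurity")

## names of the features the final ranger model was fit on
model.vars <- names(fit.fixed$finalModel$variable.importance)
print(model.vars)

## prediction on the original predictor columns
preds <- predict(fit.fixed, newdata = iris[, 3:5])
```

The trade-off is that the resampling results stored in the original train object are not carried over, so this only helps when the goal is a $finalModel for prediction.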

Session Info:

> sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server >= 2012 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] caret_6.0-80         ggplot2_3.0.0        lattice_0.20-35      RevoUtils_11.0.0    
[5] RevoUtilsMath_11.0.0

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.17        lubridate_1.7.4     tidyr_0.8.1         class_7.3-14        assertthat_0.2.0   
 [6] ipred_0.9-6         psych_1.8.4         foreach_1.5.0       ranger_0.9.0        R6_2.2.2           
[11] plyr_1.8.4          magic_1.5-8         stats4_3.5.0        e1071_1.6-8         pillar_1.2.3       
[16] rlang_0.2.1         lazyeval_0.2.1      kernlab_0.9-26      rpart_4.1-13        Matrix_1.2-14      
[21] splines_3.5.0       CVST_0.2-2          ddalpha_1.3.3       gower_0.1.2         stringr_1.3.1      
[26] foreign_0.8-70      munsell_0.5.0       broom_0.4.4         compiler_3.5.0      pkgconfig_2.0.1    
[31] mnormt_1.5-5        dimRed_0.1.0        nnet_7.3-12         tidyselect_0.2.4    tibble_1.4.2       
[36] prodlim_2018.04.18  DRR_0.0.3           codetools_0.2-15    RcppRoll_0.2.2      dplyr_0.7.6        
[41] withr_2.1.2         MASS_7.3-49         recipes_0.1.2       ModelMetrics_1.1.0  grid_3.5.0         
[46] nlme_3.1-137        gtable_0.2.0        magrittr_1.5        scales_0.5.0        stringi_1.2.2      
[51] reshape2_1.4.3      bindrcpp_0.2.2.9000 timeDate_3043.102   robustbase_0.93-0   geometry_0.3-6     
[56] lava_1.6.1          iterators_1.0.10    tools_3.5.0         glue_1.2.0          DEoptimR_1.0-8     
[61] purrr_0.2.5         sfsmisc_1.1-2       abind_1.4-5         parallel_3.5.0      survival_2.41-3    
[66] yaml_2.1.19         colorspace_1.3-2    bindr_0.1.1  

@mb158127 mb158127 commented Oct 25, 2018

Please let me know if there is something more I should include @topepo

topepo added a commit that referenced this issue Nov 15, 2018

@topepo topepo commented Nov 15, 2018

Sorry, just now rotating back to caret.

It is fixed but, to be honest, I'll be merging PRs and fixing bugs until the weekend. I would wait until then before testing.

library(caret)
#> Loading required package: lattice
#> Loading required package: ggplot2

grid <- expand.grid(mtry = 1:2,
                    min.node.size = c(5, 10, 15),
                    splitrule = "variance" )

ctrl <- trainControl(method = "cv",
                     number = 5)

## define response for model
response <- iris$Sepal.Length / iris$Sepal.Width
## define case.weights for ranger
wgt <- iris$Sepal.Width

## fit initial train using ranger
trn.ranger <- train( y = response,
                     x = iris[, 3:5],
                     weights = wgt,
                     method = "ranger",
                     tuneGrid = grid,
                     verbose = FALSE,
                     trControl = ctrl,
                     num.trees = 20,
                     importance = "impurity")

varImp(trn.ranger) ## 3 features in model
#> ranger variable importance
#> 
#>              Overall
#> Petal.Width   100.00
#> Species        40.53
#> Petal.Length    0.00
trn.ranger$bestTune ## no seed is set, so bestTune varies by run; here mtry = 1, min.node.size = 15
#>   mtry splitrule min.node.size
#> 3    1  variance            15

## update to choose different bestTune
trn.ranger.update <- update(trn.ranger, param = list(mtry = 2, 
                                                     splitrule = "variance", 
                                                     min.node.size = 15))
varImp(trn.ranger.update) ## .weights is no longer listed as a 4th feature
#> ranger variable importance
#> 
#>              Overall
#> Petal.Length  100.00
#> Petal.Width    61.78
#> Species         0.00

Created on 2018-11-14 by the reprex package (v0.2.1)
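
For anyone who wants to check an installed caret version against this issue, a small programmatic test along the following lines may help. It is a sketch rather than anything taken from the commit: the names fit, fit.upd, upd.vars, and weights.leaked are hypothetical, and it assumes the feature names can be read from the ranger object's variable.importance vector (available here because importance = "impurity" is set).

```r
library(caret)

set.seed(935)  # arbitrary seed, only so repeated runs match

response <- iris$Sepal.Length / iris$Sepal.Width
wgt <- iris$Sepal.Width

fit <- train(y = response,
             x = iris[, 3:5],
             weights = wgt,
             method = "ranger",
             tuneGrid = expand.grid(mtry = 1:2,
                                    min.node.size = c(5, 15),
                                    splitrule = "variance"),
             trControl = trainControl(method = "cv", number = 5),
             num.trees = 20,
             importance = "impurity")

## refit the final model with a different parameter combination
fit.upd <- update(fit, param = list(mtry = 2,
                                    splitrule = "variance",
                                    min.node.size = 15))

## TRUE would mean the case weights were treated as a predictor
upd.vars <- names(fit.upd$finalModel$variable.importance)
weights.leaked <- ".weights" %in% upd.vars
print(weights.leaked)
```

On a version that contains the fix, weights.leaked would be expected to be FALSE, matching the three-feature varImp() output above.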

@topepo topepo closed this Nov 15, 2018