xgbDART #742
Conversation
Codecov Report

```
@@           Coverage Diff           @@
##           master     #742   +/-   ##
=======================================
  Coverage   16.97%   16.97%
=======================================
  Files          90       90
  Lines       13187    13187
=======================================
  Hits         2238     2238
  Misses      10949    10949
```

Continue to review full report at Codecov.
```r
if( !is.null(modelFit$param$objective) && modelFit$param$objective == 'binary:logitraw'){
  p <- predict(modelFit, newdata)
  out <- exp(p)/(1+exp(p))
```
topepo (Owner), Sep 27, 2017:

Using `out <- binomial()$linkinv(p)` would be better since it takes into account potential numerical issues.
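A minimal sketch (not part of the PR) of the numerical issue the reviewer is pointing at: the hand-rolled inverse logit overflows for large positive linear predictors, while `binomial()$linkinv()` computes the same quantity stably.

```r
p <- c(-800, 0, 800)       # extreme linear predictors
exp(p) / (1 + exp(p))      # naive: exp(800) overflows to Inf, so Inf/Inf = NaN
binomial()$linkinv(p)      # stable inverse logit: ~0, 0.5, ~1
```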
```r
"Minimum Loss Reduction",
"Subsample Percentage",
"Subsample Ratio of Columns",
"Fraction of previous trees to drop during dropout",
```
topepo (Owner), Sep 27, 2017:

Can you shorten these and use consistent capitalization (e.g. maybe "Fraction of Previous Trees")? The labels might get used in ggplot legends or facets, and long labels might be an issue.
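For context, a hedged sketch of where these strings live: caret's custom-model definitions carry a `parameters` data frame whose `label` column is what surfaces in `ggplot`/`plot` legends and facet strips, hence the request for short, consistently capitalized labels. The parameter names below are illustrative, not verbatim from the PR.

```r
# Illustrative caret model-definition parameters table; `label` is what
# appears in plot legends/facets, so it should stay short.
parameters <- data.frame(
  parameter = c("gamma", "subsample", "colsample_bytree", "rate_drop"),
  class     = rep("numeric", 4),
  label     = c("Minimum Loss Reduction", "Subsample Percentage",
                "Subsample Ratio of Columns", "Fraction of Previous Trees")
)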
```diff
@@ -1,5 +1,5 @@
 Package: caret
-Version: 6.0-77
+Version: 6.0-78
```
topepo (Owner), Sep 27, 2017:

I just revved the file to bump the version up, so this isn't needed.
It looks good. I had a few minor notes that you should see.

No problem, all very reasonable; implemented.

Thanks!
`xgboost` offers a third booster type option: DART. It allows controlling under-/over-fitting via dropout, so trees added merely to correct trivial errors can be prevented from dominating. Relevant reference by Rashmi & Gilad-Bachrach here. All tests in RegressionTests/Code work fine. (The standard warning when passing `xgb.DMatrix` as input remains.) Due to its design (it has to traverse all the previous trees before making the "next fit"), it is slower than `xgbTree`.

Comment: I have found it to be good in terms of `varImp` insights. Some artificial noise variables that tricked `xgbTree` were weeded out by `xgbDART` in some toy examples I tried.
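A hedged end-to-end sketch of how the new method would be used once merged; the method string "xgbDART" is what this PR adds, the dataset and resampling setup are arbitrary, and the tuning grid is left to caret's defaults.

```r
library(caret)

set.seed(1)
# Tune the DART booster through caret's usual train() interface.
fit <- train(Species ~ ., data = iris,
             method    = "xgbDART",
             trControl = trainControl(method = "cv", number = 5))

varImp(fit)   # the variable-importance view discussed above
```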