
xgbDART #742

Merged 15 commits into master on Sep 28, 2017

Conversation

hadjipantelis
Contributor

'xgboost' offers a third booster type option: DART. It allows controlling under-/over-fitting through drop-outs; trees added only to correct trivial errors may be dropped. Relevant reference by Rashmi & Gilad-Bachrach here. All tests in RegressionTests/Code work fine. (The standard warning when passing xgb.DMatrix as input remains.) Due to its design (it has to traverse all the previous trees before making the "next fit"), it is slower than xgbTree.

Comment: I have found it to be good in terms of varImp insights. Some artificial noise variables that tricked xgbTree were weeded out by xgbDART in some toy examples I tried.
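A minimal sketch of how the new method could be driven through caret, assuming the model code is "xgbDART" and that its tuning grid mirrors xgbTree plus the DART-specific drop-out parameters rate_drop and skip_drop (the exact column names should match the module's parameters data frame):

```r
# Candidate tuning grid for the DART booster; rate_drop / skip_drop
# are the DART-specific drop-out parameters, the rest mirror xgbTree.
dart_grid <- expand.grid(
  nrounds          = c(100, 200),
  max_depth        = c(3, 6),
  eta              = 0.3,
  gamma            = 0,
  subsample        = 0.8,
  colsample_bytree = 0.8,
  rate_drop        = c(0.10, 0.50),  # fraction of previous trees to drop
  skip_drop        = 0.5,            # probability of skipping the drop-out
  min_child_weight = 1
)
nrow(dart_grid)  # 8 candidate models

# Illustrative fit (needs caret + xgboost installed):
# fit <- train(Species ~ ., data = iris, method = "xgbDART",
#              tuneGrid = dart_grid,
#              trControl = trainControl(method = "cv", number = 5))
```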

@codecov-io

codecov-io commented Sep 27, 2017

Codecov Report

Merging #742 into master will not change coverage.
The diff coverage is n/a.


@@           Coverage Diff           @@
##           master     #742   +/-   ##
=======================================
  Coverage   16.97%   16.97%           
=======================================
  Files          90       90           
  Lines       13187    13187           
=======================================
  Hits         2238     2238           
  Misses      10949    10949

Last update bcd2cd9...3acde64.


if(!is.null(modelFit$param$objective) && modelFit$param$objective == 'binary:logitraw'){
  p <- predict(modelFit, newdata)
  out <- exp(p)/(1 + exp(p))
Owner

Using out <- binomial()$linkinv(p) would be better since it takes into account potential numerical issues
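The numerical point can be checked in base R: the hand-rolled inverse logit overflows to NaN for large linear predictors, while binomial()$linkinv() stays stable:

```r
p <- c(-800, 0, 800)                 # extreme linear predictors
naive  <- exp(p) / (1 + exp(p))      # exp(800) overflows to Inf -> Inf/Inf = NaN
stable <- binomial()$linkinv(p)      # numerically stable inverse logit

naive   # 0.0 0.5 NaN
stable  # 0.0 0.5 1.0
```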

"Minimum Loss Reduction",
"Subsample Percentage",
"Subsample Ratio of Columns",
"Fraction of previous trees to drop during dropout",
Owner

Can you shorten these and use consistent capitalization (e.g. maybe "Fraction of Previous Trees")? The labels might get used in ggplot legends or facets and long labels might be an issue.
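One way to address this, with a hypothetical label set (names and wording illustrative, not the final ones used in the package), keeping every label in consistent title case and short enough for ggplot facet strips:

```r
# Hypothetical shortened parameter labels with consistent title case,
# along the lines the review suggests.
labels <- c(
  gamma            = "Minimum Loss Reduction",
  subsample        = "Subsample Percentage",
  colsample_bytree = "Subsample Ratio of Columns",
  rate_drop        = "Fraction of Previous Trees"
)
all(nchar(labels) <= 30)  # TRUE: all fit comfortably in a facet strip
```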

@@ -1,5 +1,5 @@
Package: caret
Version: 6.0-77
Version: 6.0-78
Owner

I just revved the file to bump the version up so this isn't needed.

@topepo
Owner

topepo commented Sep 27, 2017

It looks good. I had a few minor notes that you should see.

@hadjipantelis
Contributor Author

No problem, all very reasonable; implemented.

@topepo topepo merged commit 87c1ac5 into topepo:master Sep 28, 2017
@topepo
Owner

topepo commented Sep 28, 2017

Thanks!

3 participants