Variable importance #29

ghost · 2016-03-07T12:47:09Z

Almond and I (in coordination with Fabian Scheipl) wrote a function to extract and plot the (in-bag) risk reduction per base-learner from a fitted mboost model.
The contribution of each base-learner to the overall risk reduction can be used as a measure of variable importance for the base-learners or variables in the model.
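To make the idea concrete, here is a minimal sketch in base R of how a risk-reduction-based importance could be accumulated. It is not the code from this PR: the vectors `risk`, `selected` and `bl_names` are made-up toy values standing in for the in-bag risk path, the per-iteration selected base-learner, and the base-learner names of a fitted model. In mboost, extractors along the lines of `risk()` and `selected()` should provide these, but that mapping is an assumption here.

```r
# Toy stand-ins (illustrative values only, not from a real fit):
# in-bag risk at the offset and after each of 5 boosting iterations,
# and the index of the base-learner selected in each iteration.
risk     <- c(10.0, 8.0, 7.0, 6.5, 6.0, 5.8)
selected <- c(1, 2, 1, 3, 1)
bl_names <- c("bols(x1)", "bbs(x2)", "bols(x3)", "bbs(x4)")

# Risk reduction achieved in each single iteration
risk_drop <- -diff(risk)

# Accumulate the reductions per base-learner;
# base-learners that were never selected end up with 0
reduction <- vapply(seq_along(bl_names),
                    function(j) sum(risk_drop[selected == j]),
                    numeric(1))
names(reduction) <- bl_names

# Relative contribution of each base-learner to the total reduction
share <- reduction / sum(reduction)
print(round(share, 3))
```

By construction, the per-learner reductions sum to the difference between the initial and the final risk, so the shares can be read as percentages of the total in-bag risk reduction.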

The function varimp(object) simply returns an object of class varimp containing the risk reduction for each base-learner.
The plot function for these varimp objects (ultimately a lattice::barchart) additionally offers some visualisation options: display in percentages or absolute values, the number and order of the displayed bars, and whether to focus on base-learners or on the variables involved in the base-learners.

hofnerb · 2016-03-07T16:58:34Z

Thanks a lot. We currently have an issue with Travis-CI but I will try to fix this and have a look at your PR afterwards.

ghost · 2016-03-20T05:46:28Z

Just a quick update: the Travis build still reports two notes caused by the variable importance code:

plot.varimp: no visible binding for global variable ‘ylab’
plot.varimp: no visible binding for global variable ‘blearner’

We already tried Hadley Wickham's advice to solve the problem with the following line:
if (getRversion() >= "2.15.1") globalVariables(c("ylab", "blearner"))
Unfortunately, that didn't work out. In addition to the notes, we then got an error:

Error in registerNames(names, package, ".__global__", add) :
The namespace for package "mboostDevel" is locked; no changes in
the global variables list may be made.

The question is: can we keep it that way despite the notes?

hofnerb · 2016-03-21T12:25:27Z

No, the issue has to be resolved.

1. Please specify all possible options of arguments like type and blorder in the function definition, e.g. type = c("blearner", "variable"), and use type <- match.arg(type) in your function body.
2. Do not throw an error if the user wants to specify xlab and ylab. Use xlab = NULL and ylab = NULL in your function definition. If these arguments are not set, i.e., NULL, keep the default labels as you do now. Otherwise use the user-specified labels. Having xlab and ylab in your function definition should also remove the note for ylab.
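The two remarks above can be sketched as follows. This is a hypothetical helper (`resolve_plot_args` is not part of the PR, and the default label strings are placeholders), meant only to show the `match.arg()` pattern and the `NULL`-default pattern for labels in isolation.

```r
# Hypothetical helper, not the PR's plot.varimp: it only resolves
# the `type` argument and the axis labels.
resolve_plot_args <- function(type = c("blearner", "variable"),
                              xlab = NULL, ylab = NULL) {
  # Picks the first option if `type` is missing; errors on invalid values
  type <- match.arg(type)

  # Fall back to default labels only when the user did not supply any
  # (the label texts here are placeholders)
  if (is.null(xlab)) xlab <- "Risk reduction"
  if (is.null(ylab)) ylab <- if (type == "variable") "Variables" else "Base-learners"

  list(type = type, xlab = xlab, ylab = ylab)
}

# Defaults: first entry of `type`, default labels
str(resolve_plot_args())

# User-specified label overrides the default
str(resolve_plot_args("variable", ylab = "Covariates"))
```

Because `xlab` and `ylab` are now formal arguments, they are local variables inside the function, which is why a "no visible binding" note for such a name should disappear.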

Regarding the issue with blearner, I do not have a solution at the moment. I remember that we had similar issues before, but I do not remember how we fixed them.

ghost · 2016-03-21T16:39:55Z

Ok, thanks for the advice. We implemented remarks 1) and 2), and that eliminated the note for ylab, of course.
blearner is still an issue. This seems to be quite a common problem. Is there anyone else we could ask for support?

hofnerb · 2016-03-22T16:25:41Z

I now seem to have found the error. type = "variable" should not work as you use the object blearner which isn't defined:

    if (type == "variable") {
        barchart(variable ~ reduction, groups = blearner, data = plot_data,
                 horizontal = TRUE, xlab = xlab, ylab = ylab, xlim = xlim,
                 scales = list(x = list(tck = c(1, 0), at = seq(0, sum(x), length.out = 5))),
                 stack = TRUE, auto.key = auto.key, ...)
    }

I'd guess this should be plot_data[, "blearner"]? Might that be the case?

ghost · 2016-03-22T22:07:03Z

..yes, that's it - thanks a lot!! The note for blearner is gone, too.

Actually the barchart was correctly created with the call in the post above (with groups = blearner), but it also works with groups = plot_data[, "blearner"].
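A possible explanation for why both calls plot correctly while only one triggers the note: lattice evaluates `groups` inside `data`, so the bare name works at run time, but the static code analysis behind R CMD check (the codetools package) only sees an undefined symbol. The sketch below assumes codetools is installed (it ships with standard R distributions) and uses two hypothetical wrapper functions; nothing is plotted, the functions are only analysed.

```r
library(codetools)

# Bare column name in `groups`: fine for lattice at run time,
# but it looks like an undefined global variable to static analysis.
plot_bare <- function(plot_data) {
  barchart(variable ~ reduction, groups = blearner, data = plot_data)
}

# Explicit column extraction: no free symbol left outside the formula.
plot_explicit <- function(plot_data) {
  barchart(variable ~ reduction, groups = plot_data[, "blearner"],
           data = plot_data)
}

# List the symbols that are treated as global *variables*.
# Names inside the formula (variable, reduction) are not inspected.
findGlobals(plot_bare, merge = FALSE)$variables
findGlobals(plot_explicit, merge = FALSE)$variables
```

This would also be consistent with the notes seen above: `variable` and `reduction` appear only inside the formula and were never reported, whereas `blearner` was passed as an ordinary argument.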

Is there anything else that caught your attention, or can we leave the code as it is?

hofnerb · 2016-07-04T15:18:21Z

Hi @tkuehn13,

I was a bit too lazy (and had too little time) to pull in your patch before changing the structure of mboost (I got rid of the stupid subfolders mboostPatch and mboostDevel). Now I cannot easily use your patch for mboostDevel anymore. Do you have any ideas how to solve this issue? I do not want to lose your change history etc. by simply copy-pasting the files into the new structure...

Sorry for this.
Benjamin

t-8-n and others added 30 commits February 8, 2016 22:17

new functions for variable importance added

938944f

Documentation of varimp

b4b335f

Horizontal y-labels for plot.varimp_mboost

c65fe7a

Details for varimp documentation

af8206c

varimp.Rd typos

568ccb7

glmboost problem fixed

18d82fd

binary response problem fixed

4772b07

change of extraction of baselearner names

77b04f7

new tests for variable importance

f5be92a

change plotting to lattice

3158d97

- added zero-sel.probs to varimp-object

07d69c0

- changed selprob order to decreasing=FALSE

selprob in plot for 'other'-bar

f664d6f

variable_names now extracted by variable.names(object)

d854c08

plot with type = "variable"

2471418

still -> no selprob -> no "other" -> no maxchar

combine steps for blearner/variable plot

21e3248

Sorting of variables/baselearners now achieved by interpreting them a…

07b903e

…s ordered factors instead of sorting the data.frame. Now, variables are correctly sorted as well. bl per variable are accumulated again

Routines in case of negative risk reduction values are incorporated.

eb6b470

Added S3 Method as.data.frame.varimp

5cecf09

Set plot type default to "variable"

8fec4cc

No auto.key for glmboost-barchart

1cf4aad

auto.key is now allowed to be user specified

ab9f567

Add auto.key only if number of blearner exceeds number of variables

31d517e

new test for varimp

f64e312

small changes to varimp plot

e55854a

Argument blorder added to plot.varimp in order to adjust the order of…

26ff030

… the baselearners.

change in control structure for baselearner order in varimp

9ddedac

Sort variable names in interaction term to guarantee identifiability

26d3c33

Docu updated,

8686148

variables in interactions sorted for identifiability selprob -> selfreq

\dontrun for ggplot example

a402b2c

fix typos

ccd1d98

t-8-n added 2 commits March 6, 2016 18:38

update of usage part in varimp-manual

7b0b2b3

Merge remote-tracking branch 'upstream/master'

98eced7

t-8-n added 8 commits March 7, 2016 18:49

fix notes and warnings of package check

1ed6714

mismatch between code and doc fixed

c84fb49

fix typos

e71c8df

update doc for as.data.frame.varimp

ae587a0

avoid Notes: 'no visible binding for global variable'

f35c28e

globalVariables function added to namespace

fd22277

added option 'add = false' for setting of global variables

109e868

removed setting of global variables

d33b55c

t-8-n added 2 commits March 21, 2016 15:41

xlab, ylab and options for type and blorder added to plot.varimp func…

cc07c98

…tion definition

update varimp manual

5f2013e

update call of function barchart

3152090

hofnerb mentioned this pull request Sep 20, 2016

Plotting variable selection frequency and coefficients #52

Closed

hofnerb closed this Sep 28, 2016
