diff --git a/docs/articles/tutorial/cost_sensitive_classif.html b/docs/articles/tutorial/cost_sensitive_classif.html
index e58a04febb..ef7df3b71e 100644
--- a/docs/articles/tutorial/cost_sensitive_classif.html
+++ b/docs/articles/tutorial/cost_sensitive_classif.html
@@ -391,7 +391,7 @@
The nonsense.filter is now registered in mlr and shown by listFilterMethods().
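For context, a custom filter like this is registered via makeFilter(). The following is a minimal sketch of what such a registration might look like; the alphabetical scoring logic is purely illustrative:
makeFilter(
  name = "nonsense.filter",
  desc = "Calculates scores according to alphabetical order of features",
  pkg = "",
  supported.tasks = c("classif", "regr", "surv"),
  supported.features = c("numerics", "factors", "ordered"),
  fun = function(task, nselect, decreasing = TRUE, ...) {
    # Score features by the (reverse) alphabetical rank of their names
    feats = getTaskFeatureNames(task)
    imp = order(feats, decreasing = decreasing)
    names(imp) = feats
    imp
  }
)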
listFilterMethods()$id
-## [1] anova.test auc
-## [3] carscore cforest.importance
-## [5] chi.squared FSelectorRcpp.gainratio
-## [7] FSelectorRcpp.infogain FSelectorRcpp.symuncert
-## [9] kruskal.test linear.correlation
-## [11] mrmr nonsense.filter
-## [13] oneR permutation.importance
-## [15] praznik.CMIM praznik.DISR
-## [17] praznik.JMI praznik.JMIM
-## [19] praznik.MIM praznik.MRMR
-## [21] praznik.NJMIM randomForest.importance
-## [23] randomForestSRC.rfsrc randomForestSRC.var.select
-## [25] ranger.impurity ranger.permutation
-## [27] rank.correlation relief
-## [29] univariate.model.score variance
-## 33 Levels: anova.test auc carscore cforest.importance ... variance
You can use it like any other filter method already integrated in mlr (i.e., via the method argument of generateFilterValuesData() or the fw.method argument of makeFilterWrapper()); see also the page on feature selection.
d = generateFilterValuesData(iris.task, method = c("nonsense.filter", "anova.test"))
d
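The same filter can also drive embedded feature selection inside a learner. A minimal sketch, where the learner and the number of retained features are arbitrary choices:
# Wrap a learner so that nonsense.filter keeps the 2 top-ranked features
lrn = makeFilterWrapper(learner = "classif.lda", fw.method = "nonsense.filter", fw.abs = 2)
mod = train(lrn, iris.task)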
diff --git a/docs/articles/tutorial/feature_selection.html b/docs/articles/tutorial/feature_selection.html
index 00c5bd5d94..311773cd53 100644
--- a/docs/articles/tutorial/feature_selection.html
+++ b/docs/articles/tutorial/feature_selection.html
@@ -316,24 +316,30 @@
Calculating the feature importance
Different methods for calculating the feature importance are built into mlr’s function generateFilterValuesData(). Currently, classification, regression and survival analysis tasks are supported. A table showing all available methods can be found in the article on filter methods.
The most basic approach is to use generateFilterValuesData() directly on a Task() with a character string specifying the filter method.
-fv = generateFilterValuesData(iris.task, method = "FSelectorRcpp.infogain")
+fv = generateFilterValuesData(iris.task, method = "FSelectorRcpp_information.gain")
## Loading required namespace: FSelectorRcpp
fv
## FilterValues:
## Task: iris-example
-## name type FSelectorRcpp.infogain
-## 1 Sepal.Length numeric 0.4521286
-## 2 Sepal.Width numeric 0.2672750
-## 3 Petal.Length numeric 0.9402853
-## 4 Petal.Width numeric 0.9554360
+## name type FSelectorRcpp_information.gain
+## 1 Sepal.Length numeric 0.4521286
+## 2 Sepal.Width numeric 0.2672750
+## 3 Petal.Length numeric 0.9402853
+## 4 Petal.Width numeric 0.9554360
fv is a FilterValues() object and fv$data contains a data.frame that gives the importance values for all features. Optionally, a vector of filter methods can be passed.
-fv2 = generateFilterValuesData(iris.task, method = c("FSelectorRcpp.infogain", "chi.squared"))
-fv2$data
-## name type FSelectorRcpp.infogain chi.squared
-## 1 Sepal.Length numeric 0.4521286 0.6288067
-## 2 Sepal.Width numeric 0.2672750 0.4922162
-## 3 Petal.Length numeric 0.9402853 0.9346311
-## 4 Petal.Width numeric 0.9554360 0.9432359
+fv2 = generateFilterValuesData(iris.task,
+ method = c("FSelectorRcpp_information.gain", "FSelector_chi.squared"))
+fv2$data
+## name type FSelectorRcpp_information.gain
+## 1 Sepal.Length numeric 0.4521286
+## 2 Sepal.Width numeric 0.2672750
+## 3 Petal.Length numeric 0.9402853
+## 4 Petal.Width numeric 0.9554360
+## FSelector_chi.squared
+## 1 0.6288067
+## 2 0.4922162
+## 3 0.9346311
+## 4 0.9432359
A bar plot of importance values for the individual features can be obtained using function plotFilterValues().
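A minimal call, reusing the fv object from above with default plot settings:
plotFilterValues(fv)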
@@ -352,7 +358,7 @@
Function filterFeatures() supports these three methods as shown in the following example. Moreover, you can either specify the method for calculating the feature importance or you can use previously computed importance values via argument fval.
# Keep the 2 most important features
-filtered.task = filterFeatures(iris.task, method = "FSelectorRcpp.infogain", abs = 2)
+filtered.task = filterFeatures(iris.task, method = "FSelectorRcpp_information.gain", abs = 2)
# Keep the 25% most important features
filtered.task = filterFeatures(iris.task, fval = fv, perc = 0.25)
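The third method keeps every feature whose importance exceeds a fixed cutoff. A sketch, with the cutoff value chosen arbitrarily:
# Keep all features with an importance value above 0.5
filtered.task = filterFeatures(iris.task, fval = fv, threshold = 0.5)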
@@ -384,12 +390,13 @@
Using fixed parameters
In the following example we calculate the 10-fold cross-validated error rate mmce of the k-nearest neighbor classifier (FNN::fnn()) with preceding feature selection on the iris (datasets::iris()) data set. We use FSelectorRcpp_information.gain as the importance measure, aiming to reduce the dataset to the two features with the highest importance. In each resampling iteration, feature selection is carried out on the corresponding training data set before fitting the learner.
-lrn = makeFilterWrapper(learner = "classif.fnn", fw.method = "FSelectorRcpp.infogain", fw.abs = 2)
-rdesc = makeResampleDesc("CV", iters = 10)
-r = resample(learner = lrn, task = iris.task, resampling = rdesc, show.info = FALSE, models = TRUE)
-r$aggr
-## mmce.test.mean
-## 0.04
+lrn = makeFilterWrapper(learner = "classif.fnn",
+ fw.method = "FSelectorRcpp_information.gain", fw.abs = 2)
+rdesc = makeResampleDesc("CV", iters = 10)
+r = resample(learner = lrn, task = iris.task, resampling = rdesc, show.info = FALSE, models = TRUE)
+r$aggr
+## mmce.test.mean
+## 0.04
You may want to know which features have been used. Luckily, we have called resample() with the argument models = TRUE, which means that r$models contains a list of models (makeWrappedModel()) fitted in the individual resampling iterations. In order to access the selected feature subsets, we can call getFilteredFeatures() on each model.
sfeats = sapply(r$models, getFilteredFeatures)
table(sfeats)
@@ -408,7 +415,7 @@
The threshold of the filter method (fw.threshold)
In the following regression example we consider the BostonHousing (mlbench::BostonHousing()) data set. We use a Support Vector Machine and determine the optimal percentage value for feature selection such that the 3-fold cross-validated mean squared error (mse()) of the learner is minimal. Additionally, we tune the hyperparameters of the algorithm at the same time. A random search with five iterations is used as the tuning search strategy.
-lrn = makeFilterWrapper(learner = "regr.ksvm", fw.method = "chi.squared")
+lrn = makeFilterWrapper(learner = "regr.ksvm", fw.method = "FSelector_chi.squared")
ps = makeParamSet(makeNumericParam("fw.perc", lower = 0, upper = 1),
makeNumericParam("C", lower = -10, upper = 10,
trafo = function(x) 2^x),
@@ -463,13 +470,13 @@
## mse.test.mean
## 21.38273
After tuning we can generate a new wrapped learner with the optimal percentage value for further use (e.g., to predict on new data).
-lrn = makeFilterWrapper(learner = "regr.lm", fw.method = "chi.squared",
+lrn = makeFilterWrapper(learner = "regr.lm", fw.method = "FSelector_chi.squared",
fw.perc = res$x$fw.perc, C = res$x$C, sigma = res$x$sigma)
mod = train(lrn, bh.task)
mod
## Model for learner.id=regr.lm.filtered; learner.class=FilterWrapper
## Trained on: task.id = BostonHousing-example; obs = 506; features = 13
-## Hyperparameters: fw.method=chi.squared,fw.perc=0.338
+## Hyperparameters: fw.method=FSelector_ch...,fw.perc=0.338
getFilteredFeatures(mod)
## [1] "crim" "dis" "rad" "lstat"
@@ -517,7 +524,7 @@
control = ctrl, show.info = FALSE)
sfeats
## FeatSel result:
-## Features (15): mean_perimeter, mean_smoothness, mean_compactness, mean_concavepoints, SE_radius, SE_area, SE_compactness, SE_concavepoints, SE_symmetry, SE_fractaldim, worst_texture, worst_smoothness, worst_compactness, worst_concavepoints, tsize
+## Features (15): mean_perimeter, mean_smoothness, mean_compactne...
## cindex.test.mean=0.7014085
sfeats is a FeatSelResult (selectFeatures()) object. The selected features and the corresponding performance can be accessed as follows:
sfeats$x
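The corresponding cross-validated performance is stored in the y slot; a minimal sketch:
sfeats$y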
@@ -539,7 +546,7 @@
show.info = FALSE)
sfeats
## FeatSel result:
-## Features (11): crim, zn, chas, nox, rm, dis, rad, tax, ptratio, b, lstat
+## Features (11): crim, zn, chas, nox, rm, dis, rad, tax, ptratio...
## mse.test.mean=23.5662834
Further information about the sequential feature selection process can be obtained by function analyzeFeatSelResult().
analyzeFeatSelResult(sfeats)
@@ -579,7 +586,7 @@
sfeats = getFeatSelResult(mod)
sfeats
## FeatSel result:
-## Features (17): mean_radius, mean_texture, mean_smoothness, mean_compactness, mean_concavepoints, mean_symmetry, SE_perimeter, SE_area, SE_compactness, SE_concavity, SE_concavepoints, SE_symmetry, worst_texture, worst_smoothness, worst_compactness, worst_concavity, pnodes
+## Features (17): mean_radius, mean_texture, mean_smoothness, mea...
## cindex.test.mean=0.6796954
The selected features are:
sfeats$x
@@ -601,27 +608,27 @@
lapply(r$models, getFeatSelResult)
## [[1]]
## FeatSel result:
-## Features (18): mean_radius, mean_perimeter, mean_compactness, mean_concavity, mean_concavepoints, SE_texture, SE_perimeter, SE_compactness, SE_concavepoints, SE_symmetry, SE_fractaldim, worst_perimeter, worst_area, worst_smoothness, worst_compactness, worst_concavity, worst_symmetry, pnodes
+## Features (18): mean_radius, mean_perimeter, mean_compactness, ...
## cindex.test.mean=0.5382065
##
## [[2]]
## FeatSel result:
-## Features (18): mean_radius, mean_perimeter, mean_area, mean_smoothness, mean_concavity, mean_symmetry, mean_fractaldim, SE_texture, SE_area, SE_compactness, SE_concavepoints, worst_radius, worst_texture, worst_area, worst_compactness, worst_concavepoints, tsize, pnodes
+## Features (18): mean_radius, mean_perimeter, mean_area, mean_sm...
## cindex.test.mean=0.6349051
##
## [[3]]
## FeatSel result:
-## Features (20): mean_texture, mean_smoothness, mean_concavity, mean_concavepoints, mean_symmetry, mean_fractaldim, SE_texture, SE_perimeter, SE_compactness, SE_concavepoints, SE_symmetry, SE_fractaldim, worst_texture, worst_area, worst_smoothness, worst_compactness, worst_concavity, worst_concavepoints, worst_symmetry, pnodes
+## Features (20): mean_texture, mean_smoothness, mean_concavity, ...
## cindex.test.mean=0.6812985
##
## [[4]]
## FeatSel result:
-## Features (11): mean_perimeter, mean_concavity, mean_concavepoints, mean_symmetry, SE_perimeter, SE_symmetry, worst_smoothness, worst_compactness, worst_concavity, worst_symmetry, tsize
+## Features (11): mean_perimeter, mean_concavity, mean_concavepoi...
## cindex.test.mean=0.6924829
##
## [[5]]
## FeatSel result:
-## Features (14): mean_area, mean_smoothness, mean_fractaldim, SE_texture, SE_area, SE_compactness, SE_concavity, SE_concavepoints, SE_symmetry, SE_fractaldim, worst_area, worst_compactness, tsize, pnodes
+## Features (14): mean_area, mean_smoothness, mean_fractaldim, SE...
## cindex.test.mean=0.6701811
diff --git a/docs/articles/tutorial/feature_selection_files/figure-html/unnamed-chunk-4-1.png b/docs/articles/tutorial/feature_selection_files/figure-html/unnamed-chunk-4-1.png
index 24e3aa36d2..65e69f8e53 100644
Binary files a/docs/articles/tutorial/feature_selection_files/figure-html/unnamed-chunk-4-1.png and b/docs/articles/tutorial/feature_selection_files/figure-html/unnamed-chunk-4-1.png differ
diff --git a/docs/articles/tutorial/filter_methods.html b/docs/articles/tutorial/filter_methods.html
index 2d8cdc63ad..8eaca17820 100644
--- a/docs/articles/tutorial/filter_methods.html
+++ b/docs/articles/tutorial/filter_methods.html
@@ -290,20 +290,20 @@
Current methods
Method
@@ -377,7 +377,7 @@
X
-chi.squared
+FSelector_chi.squared
FSelector
Chi-squared statistic of independence between feature and target
X
@@ -391,21 +391,77 @@
-FSelectorRcpp.gainratio
-FSelectorRcpp
-Entropy-based Filters: Algorithms that find ranks of importance of discrete attributes, basing on their entropy with a continous class attribute
+FSelector_gain.ratio
+FSelector
+Entropy-based gain ratio between feature and target
X
X
+
+X
+
+
X
+
+
+
+FSelector_information.gain
+FSelector
+Entropy-based information gain between feature and target
X
X
+
+
X
+
+
+X
+
+
+
+FSelector_oneR
+FSelector
+oneR association rule
+X
+X
+
+
+X
+
+
X
-FSelectorRcpp.infogain
+FSelector_relief
+FSelector
+RELIEF algorithm
+X
+X
+
+
+X
+
+
+X
+
+
+
+FSelector_symmetrical.uncertainty
+FSelector
+Entropy-based symmetrical uncertainty between feature and target
+X
+X
+
+
+X
+
+
+X
+
+
+
+FSelectorRcpp_gain.ratio
FSelectorRcpp
Entropy-based Filters: Algorithms that find ranks of importance of discrete attributes, based on their entropy with a continuous class attribute
X
@@ -419,7 +475,7 @@
-FSelectorRcpp.symuncert
+FSelectorRcpp_information.gain
FSelectorRcpp
Entropy-based Filters: Algorithms that find ranks of importance of discrete attributes, based on their entropy with a continuous class attribute
X
@@ -433,6 +489,20 @@
+FSelectorRcpp_symmetrical.uncertainty
+FSelectorRcpp
+Entropy-based Filters: Algorithms that find ranks of importance of discrete attributes, based on their entropy with a continuous class attribute
+X
+X
+
+X
+X
+X
+X
+X
+
+
+
kruskal.test
Kruskal Test for binary and multiclass classification tasks
@@ -446,7 +516,7 @@
X
-
+
linear.correlation
Pearson correlation between feature and target
@@ -460,7 +530,7 @@
X
-
+
mrmr
mRMRe
Minimum redundancy, maximum relevance filter
@@ -474,20 +544,6 @@
X
X
-
-oneR
-FSelector
-oneR association rule
-X
-X
-
-
-X
-
-
-X
-
-
permutation.importance
@@ -503,7 +559,7 @@
X
-praznik.CMIM
+praznik_CMIM
praznik
Minimal conditional mutual information maximisation filter
X
@@ -517,7 +573,7 @@
-praznik.DISR
+praznik_DISR
praznik
Double input symmetrical relevance filter
X
@@ -531,7 +587,7 @@
-praznik.JMI
+praznik_JMI
praznik
Joint mutual information filter
X
@@ -545,7 +601,7 @@
-praznik.JMIM
+praznik_JMIM
praznik
Minimal joint mutual information maximisation filter
X
@@ -559,7 +615,7 @@
-praznik.MIM
+praznik_MIM
praznik
Mutual information maximisation filter
X
@@ -573,7 +629,7 @@
-praznik.MRMR
+praznik_MRMR
praznik
Minimum redundancy maximal relevancy filter
X
@@ -587,7 +643,7 @@
-praznik.NJMIM
+praznik_NJMIM
praznik
Minimal normalised joint mutual information maximisation filter
X
@@ -685,20 +741,6 @@
-relief
-FSelector
-RELIEF algorithm
-X
-X
-
-
-X
-
-
-X
-
-
-
univariate.model.score
Resamples an mlr learner for each input feature individually. The resampling performance is used as filter score, with rpart as default learner.
@@ -712,7 +754,7 @@
X
X
-
+
variance
A simple variance filter
diff --git a/docs/articles/tutorial/learning_curve_files/figure-html/LearningCurveTPFP-1.png b/docs/articles/tutorial/learning_curve_files/figure-html/LearningCurveTPFP-1.png
index d2a6e48ea3..53aa3e4b61 100644
Binary files a/docs/articles/tutorial/learning_curve_files/figure-html/LearningCurveTPFP-1.png and b/docs/articles/tutorial/learning_curve_files/figure-html/LearningCurveTPFP-1.png differ
diff --git a/docs/articles/tutorial/nested_resampling.html b/docs/articles/tutorial/nested_resampling.html
index 46468e0549..0e94d05561 100644
--- a/docs/articles/tutorial/nested_resampling.html
+++ b/docs/articles/tutorial/nested_resampling.html
@@ -581,7 +581,7 @@
Filter methods assign an importance value to each feature. Based on these values you can select a feature subset by either keeping all features with importance higher than a certain threshold or by keeping a fixed number or percentage of the highest ranking features. Often, neither the threshold nor the number or percentage of features is known in advance, and thus tuning is necessary.
In the example below the threshold value (fw.threshold) is tuned in the inner resampling loop. For this purpose the base Learner (makeLearner()) "regr.lm" is wrapped twice. First, makeFilterWrapper() is used to fuse linear regression with a feature filtering preprocessing step. Then a tuning step is added by makeTuneWrapper().
# Tuning of the percentage of selected filters in the inner loop
-lrn = makeFilterWrapper(learner = "regr.lm", fw.method = "chi.squared")
+lrn = makeFilterWrapper(learner = "regr.lm", fw.method = "FSelector_chi.squared")
ps = makeParamSet(makeDiscreteParam("fw.threshold", values = seq(0, 1, 0.2)))
ctrl = makeTuneControlGrid()
inner = makeResampleDesc("CV", iters = 3)
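The remaining wiring follows the usual nested resampling pattern. A minimal sketch, in which the outer resampling strategy is an assumption:
# Fuse the filter wrapper with a tuner over fw.threshold
lrn = makeTuneWrapper(lrn, resampling = inner, par.set = ps, control = ctrl, show.info = FALSE)
# Evaluate the whole pipeline in an outer resampling loop
outer = makeResampleDesc("CV", iters = 3)
r = resample(lrn, bh.task, resampling = outer, models = TRUE, show.info = FALSE)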
@@ -606,17 +606,17 @@
## [[1]]
## Model for learner.id=regr.lm.filtered.tuned; learner.class=TuneWrapper
## Trained on: task.id = BostonHousing-example; obs = 338; features = 13
-## Hyperparameters: fw.method=chi.squared
+## Hyperparameters: fw.method=FSelector_ch...
##
## [[2]]
## Model for learner.id=regr.lm.filtered.tuned; learner.class=TuneWrapper
## Trained on: task.id = BostonHousing-example; obs = 337; features = 13
-## Hyperparameters: fw.method=chi.squared
+## Hyperparameters: fw.method=FSelector_ch...
##
## [[3]]
## Model for learner.id=regr.lm.filtered.tuned; learner.class=TuneWrapper
## Trained on: task.id = BostonHousing-example; obs = 337; features = 13
-## Hyperparameters: fw.method=chi.squared
+## Hyperparameters: fw.method=FSelector_ch...
The result of the feature selection can be extracted by function getFilteredFeatures(). Almost always, all 13 features are selected.
lapply(r$models, function(x) getFilteredFeatures(x$learner.model$next.model))
## [[1]]
@@ -814,12 +814,12 @@
## $`BostonHousing-example`$regr.lm.featsel
## $`BostonHousing-example`$regr.lm.featsel[[1]]
## FeatSel result:
-## Features (11): zn, indus, chas, nox, rm, dis, rad, tax, ptratio, b, lstat
+## Features (11): zn, indus, chas, nox, rm, dis, rad, tax, ptrati...
## mse.test.mean=27.2177924
##
## $`BostonHousing-example`$regr.lm.featsel[[2]]
## FeatSel result:
-## Features (10): crim, zn, nox, rm, dis, rad, tax, ptratio, b, lstat
+## Features (10): crim, zn, nox, rm, dis, rad, tax, ptratio, b, l...
## mse.test.mean=19.9134734
getBMRFeatSelResults(res, drop = TRUE)
## $regr.rpart
@@ -828,12 +828,12 @@
## $regr.lm.featsel
## $regr.lm.featsel[[1]]
## FeatSel result:
-## Features (11): zn, indus, chas, nox, rm, dis, rad, tax, ptratio, b, lstat
+## Features (11): zn, indus, chas, nox, rm, dis, rad, tax, ptrati...
## mse.test.mean=27.2177924
##
## $regr.lm.featsel[[2]]
## FeatSel result:
-## Features (10): crim, zn, nox, rm, dis, rad, tax, ptratio, b, lstat
+## Features (10): crim, zn, nox, rm, dis, rad, tax, ptratio, b, l...
## mse.test.mean=19.9134734
You can access results for individual learners and tasks and inspect them further.
feats = getBMRFeatSelResults(res, learner.id = "regr.lm.featsel", drop = TRUE)
@@ -891,7 +891,7 @@
Example 3: One task, two learners, feature filtering with tuning
Here is a minimal example for feature filtering with tuning of the feature subset size.
# Feature filtering with tuning in the inner resampling loop
-lrn = makeFilterWrapper(learner = "regr.lm", fw.method = "chi.squared")
+lrn = makeFilterWrapper(learner = "regr.lm", fw.method = "FSelector_chi.squared")
ps = makeParamSet(makeDiscreteParam("fw.abs", values = seq_len(getTaskNFeats(bh.task))))
ctrl = makeTuneControlGrid()
inner = makeResampleDesc("CV", iters = 2)
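A minimal sketch of the remaining steps, assuming a 3-fold outer loop:
# Tune fw.abs in the inner loop, then evaluate in the outer loop
lrn = makeTuneWrapper(lrn, resampling = inner, par.set = ps, control = ctrl, show.info = FALSE)
outer = makeResampleDesc("CV", iters = 3)
r = resample(lrn, bh.task, resampling = outer, models = TRUE, show.info = FALSE)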
diff --git a/docs/news/index.html b/docs/news/index.html
index 5fad52e039..1734c05dca 100644
--- a/docs/news/index.html
+++ b/docs/news/index.html
@@ -323,20 +323,45 @@
filter - new
-- praznik.JMI
-- praznik.DISR
-- praznik.JMIM
-- praznik.MIM
-- praznik.NJMIM
-- praznik.MRMR
-- praznik.CMIM
+- praznik_JMI
+- praznik_DISR
+- praznik_JMIM
+- praznik_MIM
+- praznik_NJMIM
+- praznik_MRMR
+- praznik_CMIM
+- FSelectorRcpp_gain.ratio
+- FSelectorRcpp_information.gain
+- FSelectorRcpp_symmetrical.uncertainty
filter - general
-- Replaced filters information.gain, gainratio and symmetrical.uncertainty depending on package FSelector by package FSelectorRcpp. The change comes with a ~ 100 times speedup.
+- Added filters FSelectorRcpp_gain.ratio, FSelectorRcpp_information.gain and FSelectorRcpp_symmetrical.uncertainty from package FSelectorRcpp. These filters are ~ 100 times faster than the implementation in the FSelector pkg. Please note that both implementations do things slightly differently internally, so the FSelectorRcpp methods should not be seen as a direct replacement for the FSelector pkg.
+- Prefixed all filters from pkg FSelector with FSelector_ to distinguish them from the new FSelectorRcpp filters:
+- information.gain -> FSelector_information.gain
+- gain.ratio -> FSelector_gain.ratio
+- symmetrical.uncertainty -> FSelector_symmetrical.uncertainty
+- chi.squared -> FSelector_chi.squared
+- relief -> FSelector_relief
+- oneR -> FSelector_oneR
@@ -347,6 +372,15 @@
regr.liquidSVM
+