
Feature weighting for decision trees and random forests #17307

Open
jhogg11 opened this issue May 21, 2020 · 6 comments

Comments

@jhogg11

jhogg11 commented May 21, 2020

Basic idea: allow decision trees and random forests (and possibly other ensemble algorithms) to accept feature weightings that make specific features more or less likely to be selected as split points.

https://www.researchgate.net/publication/220338672_Random_feature_weights_for_decision_tree_ensemble_construction

I was looking to see if anything similar existed and came across the above paper. If I'm understanding the method correctly, the idea is to use random feature weights to create highly diverse ensembles that outperform other common methods, particularly where noise is present.

A general feature_weights parameter could allow for tuning based on domain knowledge or other methods that might vastly improve the ability of a tree or forest to generalize. Random forests could accept either a single set of weights to be used by every tree or an array of weights, one per tree estimator.
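To make the two options concrete, the inputs might look something like this (my own illustration; a feature_weights parameter does not currently exist in scikit-learn, and the values are made up):

import numpy as np

n_features, n_estimators = 10, 100
rng = np.random.default_rng(0)

# Option 1: a single weight vector shared by every tree, e.g. encoding domain
# knowledge that features 2 and 7 should be favoured.
shared_weights = np.ones(n_features)
shared_weights[[2, 7]] = 5.0

# Option 2: one random weight vector per tree, in the spirit of the RFW paper,
# to increase ensemble diversity (the exponent is an arbitrary choice here).
per_tree_weights = rng.uniform(size=(n_estimators, n_features)) ** 3
per_tree_weights /= per_tree_weights.sum(axis=1, keepdims=True)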

In full disclosure, I'm really just getting my feet wet in terms of decision trees, random forests, etc. Feel free to let me know if this is wrong or does not make sense.

@NicolasHug
Member

Thanks for the suggestion @jhogg11, but this does not seem to meet our inclusion criteria in terms of number of citations: https://scikit-learn.org/stable/faq.html#what-are-the-inclusion-criteria-for-new-algorithms

Would you know of any examples where this has been successfully used?

@jhogg11
Author

jhogg11 commented May 21, 2020

I'm not aware of any real-world implementations. I just found the paper by searching after noticing that certain models would generalize worse after additional features were added. It seems like a natural extension of decision trees, especially given that weighting parameters already exist for samples and classes, and I would think it would provide a significant opportunity to improve the effectiveness of random forests.

Personally, I would not call it a new algorithm but rather a modification to an existing one. Understandable if there is disagreement about that.

@xuyxu

xuyxu commented May 22, 2020

@NicolasHug @jhogg11
Here is one example in Anomaly Detection: Robust Random Cut Forest Based Anomaly Detection On Streams (https://proceedings.mlr.press/v48/guha16.pdf):

Unlike Isolation Forest, which randomly selects one attribute at each internal node, the robust random cut tree in this paper selects an attribute with probability proportional to the range of the data along that attribute (which corresponds to feature_weights here).
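Roughly, the selection rule is something like this (a toy sketch of my reading of it, not the paper's implementation):

import numpy as np

rng = np.random.default_rng(0)
# Toy data: five attributes with very different value ranges.
X = rng.normal(size=(200, 5)) * np.array([1.0, 0.1, 5.0, 2.0, 0.5])

# Choose the split attribute with probability proportional to its range, so
# attributes spanning a wider range are picked more often.
ranges = X.max(axis=0) - X.min(axis=0)
chosen_attribute = rng.choice(X.shape[1], p=ranges / ranges.sum())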

@jhogg11
Author

jhogg11 commented May 23, 2020

Here's another paper that takes a slightly different approach to feature weights:
http://www.aclweb.org/old_anthology/O/O08/O08-6001.pdf

Rather than retaining all features as candidates at every split point and altering the likelihood of each feature being chosen (like the paper referenced in the OP - "RFW"), this paper proposes using only a subset of features at each decision point, with the features weighted to be more or less likely to be included in the subset ("IRFA"). IRFA is also claimed to outperform vanilla random forests.

I think RFW is the better and more general of the two approaches (and the paper is more in-depth in its comparisons with other tree-based algorithms). RFW does not preclude using a subset of features at each split and might then have the same effect as IRFA, whereas IRFA does preclude using the full set of features.
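To make the contrast concrete, here is how I picture the two schemes at a single node (my own toy sketch, not code from either paper; improvements stands in for the per-feature split gains):

import numpy as np

rng = np.random.default_rng(0)
n_features = 8
weights = rng.uniform(size=n_features)        # per-feature weights
improvements = rng.uniform(size=n_features)   # stand-in for per-feature split gains

# RFW-style: every feature stays a candidate, but its gain is scaled by its
# weight before the best split is chosen.
best_rfw = int(np.argmax(improvements * weights))

# IRFA-style: only a weighted random subset of features is considered at the node.
subset = rng.choice(n_features, size=3, replace=False, p=weights / weights.sum())
best_irfa = int(subset[np.argmax(improvements[subset])])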

One other thing that might be worth mentioning is that feature weighting is already available in the nearest neighbor algorithms in sklearn through the w parameter of wminkowski distance.

https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.DistanceMetric.html

I look at decision trees as being similar to KNN in that both rely on nearby points, just with different ways of defining the neighborhood (a hyperrectangle vs. a radius determined by the kth nearest neighbor). Maybe someone with more knowledge can comment, but tuning feature weights so that a given feature is more or less likely to be selected for a split point seems highly analogous to tuning feature weights to make features more or less likely to affect the nearest-neighbor determination.
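For comparison, this is roughly what feature weighting looks like on the KNN side today (the weights are arbitrary, and I'm using a custom metric callable rather than wminkowski just to keep the sketch self-contained):

import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
feature_weights = np.array([1.0, 1.0, 3.0, 3.0])  # arbitrary: favour the petal features

def weighted_euclidean(a, b, w=feature_weights):
    # Scale each squared coordinate difference by its feature weight.
    return np.sqrt(np.sum(w * (a - b) ** 2))

knn = KNeighborsClassifier(n_neighbors=5, metric=weighted_euclidean).fit(X, y)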

@jnothman
Member

jnothman commented May 25, 2020 via email

Can this be implemented as an extension Splitter to the current code base? I'm not sure how easily we should justify adding it to the primary implementation, as every piece of added complexity comes with substantial maintenance cost.

@jhogg11
Author

jhogg11 commented May 30, 2020


Would this be as simple as adding an additional feature_weights parameter and applying the weights to current_proxy_improvement in the splitter node_split method, e.g., something like this?

current_proxy_improvement = self.criterion.proxy_impurity_improvement()
current_proxy_improvement *= self.feature_weights[current.feature]  # new line

if current_proxy_improvement > best_proxy_improvement:
    ...
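In plain Python, the intent of that change would be something like the following (just a toy illustration of the selection logic at one node; the real splitter is Cython and feature_weights is hypothetical):

import numpy as np

def pick_split_feature(proxy_improvements, feature_weights):
    # proxy_improvements: the best proxy impurity improvement found for each
    # candidate feature at this node; feature_weights: non-negative weights.
    weighted = np.asarray(proxy_improvements) * np.asarray(feature_weights)
    return int(np.argmax(weighted))

# Feature 1 wins after weighting even though feature 0 has the larger raw gain.
print(pick_split_feature([0.30, 0.25, 0.10], [0.5, 2.0, 1.0]))  # -> 1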
