Feature importance for random forests #210

Closed
mjmckp opened this Issue May 23, 2018 · 3 comments

mjmckp commented May 23, 2018

When training a random forest, the scikit-learn package returns the feature importances (see http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html), where the feature importance metric is the mean decrease in impurity (as described here: http://blog.datadive.net/selecting-good-features-part-iii-random-forests/). Can ML.NET also provide this information?
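For readers unfamiliar with the metric, here is a minimal pure-Python sketch of how mean decrease in impurity can be computed. It is an illustration only, not scikit-learn's or ML.NET's actual implementation: the function names, the toy "trees" (lists of recorded splits), and the single final normalization step are all assumptions made for the example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(parent, left, right, n_total):
    """Impurity decrease of one split, weighted by the share of samples reaching it."""
    n = len(parent)
    children = (len(left) * gini(left) + len(right) * gini(right)) / n
    return (n / n_total) * (gini(parent) - children)


def mean_decrease_impurity(trees, n_features):
    """Average per-feature impurity decrease over all trees, normalized to sum to 1.

    Hypothetical input format: each tree is a list of splits, root first, where a
    split is (feature_index, parent_labels, left_labels, right_labels).
    """
    totals = [0.0] * n_features
    for splits in trees:
        n_total = len(splits[0][1])  # the root node sees every training sample
        for feature, parent, left, right in splits:
            totals[feature] += impurity_decrease(parent, left, right, n_total)
    means = [total / len(trees) for total in totals]
    norm = sum(means)
    return [m / norm for m in means] if norm > 0 else means


# Toy forest with two features and four training samples (made-up data).
tree_a = [(0, [0, 0, 1, 1], [0, 0], [1, 1])]
tree_b = [
    (1, [0, 0, 1, 1], [0, 0, 1], [1]),
    (0, [0, 0, 1], [0, 0], [1]),
]
importances = mean_decrease_impurity([tree_a, tree_b], n_features=2)
print(importances)
```

In this toy forest, feature 0 produces the purer splits, so it ends up with the larger share of the total importance.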

Member

codemzs commented May 23, 2018

Hi @mjmckp, yes, this information is maintained in the runtime, but there isn't a high-level API at the moment that exposes feature importance. We are working on exposing a model summary via a high-level API, which should contain feature importance.

shauheen added this to To do in v0.3 May 23, 2018

shauheen added this to the 0618 milestone Jun 13, 2018

shauheen moved this from To do to In Progress in v0.3 Jun 19, 2018

Member

zeahmed commented Jun 19, 2018

Looking at the new API proposal #371, this feature will be automatically available.

For example, in the proposed example, look at the line where the predictor is created. The predictor contains several methods to save models/metadata, and SaveSummary is one of them; it emits feature importance where applicable.

Also, if the predictor implements ICanGetSummaryInKeyValuePairs, then feature importance can be obtained as a List of KeyValuePairs. For example, see here for reference.

Member

shauheen commented Jun 28, 2018

Closing this issue as #371 would address the main problem.

shauheen closed this Jun 28, 2018

shauheen moved this from In Progress to Done in v0.3 Jun 28, 2018
