How to use Feature selection #136

Closed
Vijay27anand opened this issue Nov 23, 2018 · 11 comments

@Vijay27anand

I want to use the FeatureSelectorByMutualInformation feature selection. Can you please provide a sample code snippet showing how to use it with a dataset? I could not find one in the samples provided.

@asthana86
Contributor

@Zruty0 would be able to help with this one.

@Vijay27anand
Author

@Zruty0 what I am trying to do is use feature selection by mutual information to return the list of columns with their scores. Can you help with a sample? Also, do I need to apply one-hot encoding and normalization to the IDataView before doing feature selection?

@asthana86
Contributor

@Zruty0 is out on parental leave. Tom/Ivan, can you folks help answer Vijay's query?
@Ivanidzo4ka @TomFinley

@asthana86 added the "question" (Further information is requested) label Nov 28, 2018

@Ivanidzo4ka

Sure.
https://github.com/dotnet/machinelearning/blob/master/docs/samples/Microsoft.ML.Samples/Dynamic/FeatureSelectionTransform.cs
Here is an example of our Count selection and Mutual information transforms in action.
They are going to be part of the 0.8 release (which is around the corner), but you can always try our daily NuGet packages to test them.

@Vijay27anand
Author

Thanks @Ivanidzo4ka, it was helpful. I need more details on how to get the scores of individual columns, similar to what we get from Azure ML Studio. I have attached a sample ML Studio screenshot showing the scores of the columns. In the same way, I want to identify the top columns to consider for my Label.
(Attached screenshot: azureml-mi)
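
For readers wondering what a per-column mutual information score of this kind represents, here is a minimal, library-neutral sketch in Python. It is not ML.NET code and not the implementation behind the transform discussed here. The helper names (`bin_values`, `mutual_information`, `score_columns`), the equal-width binning, and the toy data are all assumptions made for illustration. Each numeric column is discretized into bins, its mutual information with the label is computed, and the columns are ranked by that score.

```python
import math
from collections import Counter

def bin_values(values, num_bins=4):
    """Map numeric values to equal-width bin indices in [0, num_bins - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)  # a constant column carries no information
    width = (hi - lo) / num_bins
    return [min(int((v - lo) / width), num_bins - 1) for v in values]

def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
    return mi

def score_columns(columns, labels, num_bins=4):
    """Return (column name, score) pairs sorted from most to least informative."""
    scores = {
        name: mutual_information(bin_values(values, num_bins), labels)
        for name, values in columns.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical toy data: one column that tracks the label, one that does not.
columns = {
    "informative": [1, 2, 3, 4, 6, 7, 8, 9],
    "noise": [5, 1, 9, 3, 5, 1, 9, 3],
}
labels = [0, 0, 0, 0, 1, 1, 1, 1]

ranked = score_columns(columns, labels)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

On this toy data the "informative" column scores 1 bit (it fully determines the label) and the "noise" column scores 0, so keeping the top-k columns from `ranked` mimics selecting the highest-scoring features.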

@CESARDELATORRE
Contributor

@Vijay27anand - As Ivan mentioned, the upcoming ML.NET v0.8 release also brings improved model explainability. In 0.8 we have included tools that we use internally at Microsoft to help machine learning developers better understand the feature importance of models ("Permutation Feature Importance") and to create high-capacity models that can be easily interpreted by others ("Generalized Additive Models").
I'm writing a blog post covering these features, which will be published next week. Stay tuned and check it out on the .NET Blog. 👍
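
As a rough illustration of the idea behind permutation feature importance, here is a minimal Python sketch. It is not the ML.NET API. The function names, the accuracy metric, the `toy_model`, and the data are hypothetical and chosen only to show the technique: shuffle one feature column at a time, re-score the already trained model, and treat the average drop in the metric as that feature's importance.

```python
import random

def accuracy(model, rows, labels):
    """Fraction of rows for which the model predicts the correct label."""
    correct = sum(1 for row, y in zip(rows, labels) if model(row) == y)
    return correct / len(labels)

def permutation_importance(model, rows, labels, feature_names, repeats=20, seed=0):
    """Average accuracy drop when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importance = {}
    for name in feature_names:
        drops = []
        for _ in range(repeats):
            # Shuffle only this column; every other column stays in place.
            shuffled = [row[name] for row in rows]
            rng.shuffle(shuffled)
            permuted_rows = [dict(row, **{name: v}) for row, v in zip(rows, shuffled)]
            drops.append(baseline - accuracy(model, permuted_rows, labels))
        importance[name] = sum(drops) / repeats
    return importance

def toy_model(row):
    """Hypothetical 'trained' model that looks only at the 'used' feature."""
    return 1 if row["used"] > 0.5 else 0

used = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
ignored = [3, 1, 4, 1, 5, 9, 2, 6]
rows = [{"used": u, "ignored": g} for u, g in zip(used, ignored)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

importance = permutation_importance(toy_model, rows, labels, ["used", "ignored"])
for name, drop in importance.items():
    print(f"{name}: mean accuracy drop {drop:.3f}")
```

Because the toy model never reads "ignored", shuffling that column leaves accuracy unchanged and its importance is zero, while shuffling "used" degrades accuracy. Unlike a mutual information score, this measure depends on the particular trained model.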

@Vijay27anand
Author

@CESARDELATORRE with 0.8, MutualInformationFeatureSelectionUtils.Train seems to have been removed. I was using it to understand the scores for the columns in mutual information feature selection. Please let me know which one I should use now.

@CESARDELATORRE
Contributor

Adding @rogancarr - Rogan, is there something comparable to MutualInformationFeatureSelectionUtils.Train in the new model explainability features coming in 0.8?
Is there any other way to understand the scores for the columns in mutual information feature selection?

@Vijay27anand
Author

@CESARDELATORRE @rogancarr can you explain how to understand the scores for the columns in mutual information feature selection in 0.8?

@Vijay27anand
Author

It will be implemented with #2328.

@justinormont
Contributor


6 participants