Adding support for most learning tasks to PFI #1832
src/Microsoft.ML.Transforms/PermutationFeatureImportanceExtensions.cs
…d put array delta calculations into a helper function.
IPredictionTransformer<IPredictor> model,
IDataView data,
string label = DefaultColumnNames.Label,
string features = DefaultColumnNames.Features,
Per its description, this needs to be a list. Also, FFM is a case where one can have multiple feature columns.
This is a really good point. Field-Aware Factorization Machines and at least one other internal learner have multiple feature columns, which is not supported by PFI. I'm going to add this as a separate fix, because it's a somewhat orthogonal issue and might be a bit extensive.
Good find!
public void TestPfiMulticlassClassificationOnDenseFeatures()
{
    var data = GetDenseDataset(TaskType.MulticlassClassification);
    var model = ML.MulticlassClassification.Trainers.LogisticRegression().Fit(data);
Could you add a test using FFM with multiple feature columns?
(See previous comment)
/// y = 10x1 + 10x2vBuff + 30x3 + e.
/// Within xBuff feature 2nd slot will be sparse most of the time.
/// 2nd slot of xBuff has the least importance: Evaluation metrics do not change a lot when this slot is permuted.
/// x2 has the biggest importance.
You mean x3?
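For context on the doc comment under discussion, here is a minimal, dependency-free sketch of how permutation feature importance ranks features. It is not the ML.NET implementation or API: `PfiSketch`, `MeanSquaredError`, and `PermutationImportance` are hypothetical names, the model is a fixed linear function with coefficients 10, 10, and 30 borrowed from the comment, the noise term is omitted, and all three features are assumed to be drawn from the same uniform distribution (the thread does not state the real feature scales).

```csharp
using System;
using System.Linq;

// Hypothetical illustration only; none of these names come from ML.NET.
public static class PfiSketch
{
    // Mean squared error of `predict` over the rows of `x` against labels `y`.
    public static double MeanSquaredError(Func<double[], double> predict, double[][] x, double[] y)
    {
        double sum = 0;
        for (int i = 0; i < x.Length; i++)
        {
            double err = predict(x[i]) - y[i];
            sum += err * err;
        }
        return sum / x.Length;
    }

    // For each feature column: shuffle that column across rows, re-score,
    // and report the increase in MSE relative to the unpermuted baseline.
    public static double[] PermutationImportance(
        Func<double[], double> predict, double[][] x, double[] y, int seed = 0)
    {
        var rng = new Random(seed);
        double baseline = MeanSquaredError(predict, x, y);
        int featureCount = x[0].Length;
        var deltas = new double[featureCount];

        for (int j = 0; j < featureCount; j++)
        {
            // Fisher-Yates shuffle of the row indices.
            int[] order = Enumerable.Range(0, x.Length).ToArray();
            for (int i = order.Length - 1; i > 0; i--)
            {
                int k = rng.Next(i + 1);
                (order[i], order[k]) = (order[k], order[i]);
            }

            // Copy the data, replacing only column j with its permuted values.
            double[][] permuted = x.Select(row => (double[])row.Clone()).ToArray();
            for (int i = 0; i < x.Length; i++)
                permuted[i][j] = x[order[i]][j];

            deltas[j] = MeanSquaredError(predict, permuted, y) - baseline;
        }
        return deltas;
    }

    public static void Main()
    {
        // Assumed synthetic data: y = 10*x1 + 10*x2 + 30*x3, noise omitted,
        // every feature uniform on [0, 1).
        var rng = new Random(42);
        double[][] x = Enumerable.Range(0, 1000)
            .Select(_ => new[] { rng.NextDouble(), rng.NextDouble(), rng.NextDouble() })
            .ToArray();
        Func<double[], double> model = r => 10 * r[0] + 10 * r[1] + 30 * r[2];
        double[] y = x.Select(model).ToArray();

        double[] deltas = PermutationImportance(model, x, y);
        for (int j = 0; j < deltas.Length; j++)
            Console.WriteLine($"feature {j}: MSE increase = {deltas[j]:F3}");
    }
}
```

Under these assumptions the feature with coefficient 30 shows the largest MSE increase when permuted. The ranking depends on feature scale as well as on the coefficient, so it may differ for the actual test dataset.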
This PR adds support to Permutation Feature Importance for Multiclass Classification, Ranking, and Clustering.
Fixes #1771