Q: Interpreting Feature PFI results #4637

Open
lefig opened this issue Jan 8, 2020 · 5 comments
lefig commented Jan 8, 2020

Hi all,

I have been doing a deep dive into some of my models to understand more about feature relevance. My results from running feature explanatory analysis on a binary classification model are as follows:

2020-01-08 11:34:03.813 +00:00 [INF] BinaryFastTreeParameters
2020-01-08 11:34:03.815 +00:00 [INF] Bias: 0
2020-01-08 11:34:03.816 +00:00 [INF] Feature Weights:
2020-01-08 11:34:03.843 +00:00 [INF] Feature: CloseWeight: 0.1089412
2020-01-08 11:34:03.931 +00:00 [INF] Feature: OpenWeight: 0.3691619
2020-01-08 11:34:03.932 +00:00 [INF] Feature: HighWeight: 0.06676193
2020-01-08 11:34:03.933 +00:00 [INF] Feature: LowWeight: 0.1926264
2020-01-08 11:34:03.934 +00:00 [INF] Feature: STO_FastStochWeight: 0.19846
2020-01-08 11:34:03.938 +00:00 [INF] Feature: STO_StochKWeight: 0.5019926
2020-01-08 11:34:03.941 +00:00 [INF] Feature: STO_StochDWeight: 0.3781931
2020-01-08 11:34:03.942 +00:00 [INF] Feature: STOWeight: 0
2020-01-08 11:34:03.943 +00:00 [INF] Feature: CCI_TypicalPriceAvgWeight: 0.131141
2020-01-08 11:34:03.944 +00:00 [INF] Feature: CCI_TypicalPriceMADWeight: 0.1299266
2020-01-08 11:34:03.946 +00:00 [INF] Feature: CCIWeight: 1
2020-01-08 11:34:03.947 +00:00 [INF] Feature: RSIDownWeight: 0.4761779
2020-01-08 11:34:03.948 +00:00 [INF] Feature: RSIUpWeight: 0.1249975
2020-01-08 11:34:03.951 +00:00 [INF] Feature: RSIWeight: 0.2877662
2020-01-08 11:34:03.952 +00:00 [INF] Feature: MOMWeight: 0.1822069
2020-01-08 11:34:03.953 +00:00 [INF] Feature: ADX_PositiveDirectionalIndexWeight: 0.2435836
2020-01-08 11:34:03.954 +00:00 [INF] Feature: ADX_NegativeDirectionalIndexWeight: 0.4263106
2020-01-08 11:34:03.955 +00:00 [INF] Feature: ADXWeight: 0.1899773
2020-01-08 11:34:03.956 +00:00 [INF] Feature: CMOWeight: 0.2601428

But for PFI I have the following:
2020-01-08 11:34:09.369 +00:00 [INF] Calculating Binary Classification Feature PFI
2020-01-08 11:34:09.371 +00:00 [INF] Feature PFI for learner:BinaryFastTree
2020-01-08 11:34:09.383 +00:00 [INF] Close| 0.000000
2020-01-08 11:34:09.384 +00:00 [INF] Open| 0.000000
2020-01-08 11:34:09.385 +00:00 [INF] High| 0.000000
2020-01-08 11:34:09.386 +00:00 [INF] Low| 0.000000
2020-01-08 11:34:09.391 +00:00 [INF] STO_FastStoch| 0.000000
2020-01-08 11:34:09.400 +00:00 [INF] STO_StochK| 0.000000
2020-01-08 11:34:09.401 +00:00 [INF] STO_StochD| 0.000000
2020-01-08 11:34:09.402 +00:00 [INF] STO| 0.000000
2020-01-08 11:34:09.404 +00:00 [INF] CCI_TypicalPriceAvg| 0.000000
2020-01-08 11:34:09.406 +00:00 [INF] CCI_TypicalPriceMAD| 0.000113
2020-01-08 11:34:09.408 +00:00 [INF] CCI| 0.000000
2020-01-08 11:34:09.414 +00:00 [INF] RSIDown| 0.000221
2020-01-08 11:34:09.416 +00:00 [INF] RSIUp| 0.000000
2020-01-08 11:34:09.431 +00:00 [INF] RSI| 0.000000
2020-01-08 11:34:09.443 +00:00 [INF] MOM| -0.003003
2020-01-08 11:34:09.457 +00:00 [INF] ADX_PositiveDirectionalIndex| 0.000000
2020-01-08 11:34:09.467 +00:00 [INF] ADX_NegativeDirectionalIndex| 0.000000
2020-01-08 11:34:09.470 +00:00 [INF] ADX| 0.000000
2020-01-08 11:34:09.479 +00:00 [INF] CMO| 0.000000

My question is essentially: what should I read (if anything) into the zero values for PFI? The same goes for the evaluation score:
2020-01-08 11:34:17.135 +00:00 [INF] Score: -4.640871
2020-01-08 11:34:17.138 +00:00 [INF] Probability: 0.1351293

I would appreciate any thoughts you may have on using this information to improve the model's accuracy.

Thank you
Fig

antoniovs1029 self-assigned this Jan 8, 2020
antoniovs1029 (Member) commented Jan 8, 2020

Can you please share the code you used to print those values, so I can check a couple of things?

lefig (Author) commented Jan 10, 2020

Pleasure and thank you for your help!

The logging functions:

private void LogModelWeights(LinearBinaryModelParameters subModel, string name)
{
    var weights = subModel.Weights.ToList();

    // Log the model parameters.
    Logger.Info(name + "Parameters");
    Logger.Info("Bias: " + subModel.Bias);
    Logger.Info("Feature Weights:");

    // Feature weights. `features` and `contributions` are class-level fields;
    // this assumes the weight vector is in the same order as `features`.
    for (int i = 0; i < features.Length; i++)
    {
        contributions[i].Weight = weights[i];
        contributions[i].Contribution = 0;  // The contribution will be assigned by the prediction
                                            // engine using CalculateFeatureContribution (below).
        Logger.Info(" Feature: " + contributions[i].Name + " Weight: " + contributions[i].Weight);
    }
}
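
For reference, a minimal sketch of the CalculateFeatureContribution step mentioned in the comment above, using ML.NET's FeatureContributionCalculatingEstimator; mlContext, model (the linear prediction transformer), and transformedData are assumed from the surrounding code:

// Compute per-row feature contributions for a linear model.
// `mlContext`, `model`, and `transformedData` are assumed from the surrounding code.
var contributionEstimator = mlContext.Transforms.CalculateFeatureContribution(model, normalize: true);
var contributionData = contributionEstimator.Fit(transformedData).Transform(transformedData);
// This appends a "FeatureContributions" vector column, one value per feature slot.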

private void LogPermutationMetrics(IDataView transformedData,
    ImmutableArray<BinaryClassificationMetricsStatistics> permutationMetrics)
{
    // Build a map from each slot of the feature vector back to its source column name.
    var allFeatureNames = GetColumnNamesUsedForPFI(transformedData);
    var mapFields = new List<string>();
    for (int i = 0; i < allFeatureNames.Count(); i++)
    {
        var slotField = new VBuffer<ReadOnlyMemory<char>>();
        if (transformedData.Schema[allFeatureNames[i]].HasSlotNames())
        {
            // Multi-slot columns contribute one entry per slot.
            transformedData.Schema[allFeatureNames[i]].GetSlotNames(ref slotField);
            for (int j = 0; j < slotField.Length; j++)
            {
                mapFields.Add(allFeatureNames[i]);
            }
        }
        else
        {
            mapFields.Add(allFeatureNames[i]);
        }
    }

    // Now let's look at which features are most important to the model overall.
    // Get the feature indices sorted by their impact on AUC: the importance, i.e. the
    // absolute average change in AreaUnderRocCurve calculated by
    // PermutationFeatureImportance, ordered from most important to least important.
    var sortedIndices = permutationMetrics
        .Select((metrics, index) => new { index, metrics.AreaUnderRocCurve })
        .OrderByDescending(feature => Math.Abs(feature.AreaUnderRocCurve.Mean));

    Console.WriteLine("Feature indices sorted by their impact on AUC:");
    foreach (var feature in sortedIndices)
    {
        Console.WriteLine($"{mapFields[feature.index],-20}|\t{Math.Abs(feature.AreaUnderRocCurve.Mean):F6}");
    }

    Console.WriteLine("PFI AUC logged as the following:");
    // Combine metrics with feature names and format for display.
    // `importances` is a class-level field parallel to the feature vector.
    for (int i = 0; i < permutationMetrics.Length; i++)
    {
        Logger.Info($"{importances[i].Name}|\t{permutationMetrics[i].AreaUnderRocCurve.Mean:F6}");
        importances[i].AUC = permutationMetrics[i].AreaUnderRocCurve.Mean;
    }
}
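
As an aside, a simpler way to map PFI indices back to feature names is to read the slot names of the concatenated feature vector in one call. A minimal sketch, assuming the feature vector column is named "Features" (and using System.Linq):

private static string[] GetFeatureSlotNames(IDataView transformedData)
{
    // PFI results come back in the same order as the slots of the feature vector.
    // Assumes the concatenated feature column is named "Features".
    VBuffer<ReadOnlyMemory<char>> slotNames = default;
    transformedData.Schema["Features"].GetSlotNames(ref slotNames);
    return slotNames.DenseValues().Select(n => n.ToString()).ToArray();
}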


najeeb-kazmi (Member) commented Jan 14, 2020

Hi @lefig - can you share the code that generates the objects passed to these logging functions?
LinearBinaryModelParameters subModel
IDataView transformedData
ImmutableArray<BinaryClassificationMetricsStatistics> permutationMetrics

Please also share code for any data processing and model training.

A PFI value of 0 for a feature means that permuting that feature's values did not change AreaUnderRocCurve much. This is not the same as the weight learned by the model being 0: a feature can have non-zero weights that are not statistically significant, and you could still end up with PFI metrics of 0.

Note that the PFI value is just one indicator of feature importance, not a conclusive statement of it. That said, so many features having a PFI of 0 warrants some further investigation. Here are a few reasons I can think of that could possibly explain this.

  • permutationCount used for calculating PFI is 1 (or a small number). Please double-check that the value of this argument is something reasonable (try something like 10 or 30; see the sketch after this list).
  • The model itself might not be very good, so the change in AreaUnderRocCurve isn't very large when a feature is permuted. What is the actual AreaUnderRocCurve of this model evaluated on the training and test data? AreaUnderRocCurve ~0.5 or ~0.6 would indicate a particularly poor model, which you would expect to be about as poor when a feature is permuted, hence no change in AreaUnderRocCurve.
  • PFI indicates feature importance only on the data it is evaluated on. Are you computing permutationMetrics on a very small dataset? That could also give rise to 0 change in AreaUnderRocCurve.
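
A minimal sketch of the first two checks; mlContext, model (the prediction transformer, not the whole pipeline), and testData (already passed through the feature-engineering transforms) are placeholder names:

// 1. Check the model's baseline quality first.
var metrics = mlContext.BinaryClassification.Evaluate(model.Transform(testData));
Console.WriteLine($"Baseline AUC: {metrics.AreaUnderRocCurve:F4}");

// 2. Run PFI with a larger permutationCount, over all rows
// (numberOfExamplesToUse defaults to the whole dataset).
var pfi = mlContext.BinaryClassification.PermutationFeatureImportance(
    predictionTransformer: model,
    data: testData,
    labelColumnName: "Label",
    permutationCount: 30);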
lefig (Author) commented Jan 17, 2020

Hi @najeeb-kazmi

Thank you for your kind help. The code that generates the metrics is as follows (this is an example of one such learner that requires a calibrator).

private void CalculateGamCalibratedClassificationPermutationFeatureImportance(MLContext mlContext,
    IDataView transformedData, ITransformer trainedModel, string learner)
{
    // Cast the trained model to the concrete prediction transformer. Note: if
    // trainedModel is a TransformerChain (a full pipeline), this cast returns
    // null, so the last transformer would need to be extracted first.
    var singleTrainerModel = trainedModel as BinaryPredictionTransformer<
        CalibratedModelParametersBase<GamBinaryModelParameters, PlattCalibrator>>;

    // Calculate permutation feature importance.
    ImmutableArray<BinaryClassificationMetricsStatistics> permutationMetrics =
        mlContext.BinaryClassification.PermutationFeatureImportance(
            predictionTransformer: singleTrainerModel,
            data: transformedData,
            labelColumnName: "Label",
            numberOfExamplesToUse: 100,
            permutationCount: 50);

    Logger.Info("Calculating Binary Classification Feature PFI");
    Logger.Info("Feature PFI for learner:" + learner);
    LogPermutationMetrics(transformedData, permutationMetrics);
}

I tend to think (your point 2) that the model is poor and needs some features removed. Hence I was hoping to gain some insight into which features those are, so that I can proceed with changing the model.

Best wishes
Fig

najeeb-kazmi (Member) commented Jan 17, 2020

@lefig

  • This is a GAM model; your original comment was for a linear model. For which model are you seeing 0 PFI metrics? For a linear model, you can use L1 regularization to remove unimportant features (force their weights to be 0). For GAM models, see this example for how to understand feature importance. Basically, features whose bin effects are relatively flat are less important (and could be removed), while features whose bin effects show some trend are more important; see the sketch after this list. Try PFI after these features have been removed.

  • What is the AUC of this model?

  • Maybe using only 100 rows is the reason you are not seeing non-zero PFI. Try using the entire dataset.
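
A minimal sketch of inspecting the bin effects, assuming gamModel is the GamBinaryModelParameters (e.g. singleTrainerModel.Model.SubModel from the cast shown in the previous comment):

// Features whose shape function is nearly flat barely move the score and are
// candidates for removal. `gamModel` is an assumed placeholder; uses System.Linq.
for (int i = 0; i < gamModel.NumberOfShapeFunctions; i++)
{
    var binEffects = gamModel.GetBinEffects(i);
    double spread = binEffects.Max() - binEffects.Min();
    Console.WriteLine($"Feature {i}: bin-effect spread = {spread:F4}");
}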
