@paulbkoch I'm looking for a way to show probabilities in local explanations.
At the moment, this is what we have:
- feature contributions in terms of logits (positive and negative)
- an intercept
To obtain the predicted score for the positive class (in a binary setting), interpretml does the following:
1. start a sum using the intercept value as the initial value;
2. iterate over each feature (including interactions):
   - 2.1) get the corresponding logit by indexing the `model.term_scores_` attribute (which has to do with bins and such; I need to dig deeper in this regard);
   - 2.2) add that logit to the running sum (which started from the intercept);
3. append a 0 to the final sum of logits, obtaining an array with two columns and N rows (one per sample being predicted): `[0, sum_of_logits]`;
4. apply softmax to get the predicted "probabilities" for the negative and positive classes.
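The steps above can be sketched roughly as follows (the intercept and per-term logits are made-up illustrative values, not something read out of a real fitted model):

```python
import numpy as np

# Hypothetical per-sample term contributions in logit space;
# in a fitted EBM these would come from indexing model.term_scores_.
intercept = -1.2
logit_contribs = np.array([0.8, -0.3, 2.1])  # one value per term

# Steps 1-2: start from the intercept, add each term's logit.
score = intercept + logit_contribs.sum()

# Step 3: build [0, sum_of_logits] for the negative/positive classes.
scores = np.array([0.0, score])

# Step 4: softmax over the two scores gives the class "probabilities".
probs = np.exp(scores) / np.exp(scores).sum()

# Note: softmax over [0, s] reduces to the plain sigmoid of s,
# so probs[1] == 1 / (1 + exp(-score)).
```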
What I want to know is the following:
is there a way, starting from the predicted probability of the positive class (say, 0.85), to know each feature's individual contribution in terms of probability?
e.g.
feat1: +0.12
feat2: -0.13
feat3: +0.4
...
so that the individual contributions (ideally "partial" probabilities) add up to the predicted probability of the positive class (i.e., 0.85)?
Additional question: how should we treat the intercept in such a case?
I tried passing each contribution in terms of logits through the logistic function, but obviously it makes little sense.
Thank you very much. You did outstanding work with this library!
Hi @francescopisu -- Traditionally, you cannot do this since the relationship between changes in probability and logits is nonlinear (softmax). There is however a very interesting paper that came out a few months ago where they were able to do this by using an approximation of the sigmoid function (https://arxiv.org/pdf/2211.06360.pdf). I was thinking someday we might implement this as an option because it would be pretty neat to be able to offer additive probabilities. To do it justice though I think you would want to modify the loss function to better match their sigmoid approximation.
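A quick numeric check of the nonlinearity point, with made-up logit contributions (the intercept is folded into the list for simplicity): applying the sigmoid to each term separately and summing does not recover the model's predicted probability.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative per-term logit contributions (intercept included).
logits = np.array([0.9, -0.4, 1.2])

# What the model actually predicts: sigmoid of the summed logits.
p_true = sigmoid(logits.sum())

# The naive per-term approach: sum of sigmoids.
p_naive = sigmoid(logits).sum()

# p_naive can even exceed 1, since each sigmoid(l_i) lies in (0, 1)
# but nothing constrains their sum -- the mapping is not additive.
```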