
Discrepancy between local explainability and global explainability #371

Closed
yaoching0 opened this issue Mar 22, 2022 · 11 comments · Fixed by #372
Labels
bug Something isn't working

Comments

@yaoching0

yaoching0 commented Mar 22, 2022

Hi,
I trained TabNet with 4700-dimensional features and checked the global explainability. (Because of its sparsity, I applied unique(); the screenshot shows the indices into the global explainability matrix.)
Then I passed the training data to the .explain(training data) function and summed explain_matrix across the rows, but I got a totally different result from the global explainability.
For example, the index with the maximum value in the global explainability even has a value of 0 in the output of the explain() function.
[screenshot]

@eduardocarvp
Collaborator

Hello @yaoching0 ,

Are you using embeddings for categorical features in your model?

@yaoching0
Author

@eduardocarvp No, all features are numerical, with values between 0 and 1.

@yaoching0
Author

My data is preprocessed and the model trained exactly as in the example.
[screenshot]
[screenshot]

@yaoching0
Author

This is my training data format
[screenshot]

@Optimox
Collaborator

Optimox commented Mar 22, 2022

The clf.feature_importances_ are normalized while the individual importances are not; have a look at #180.

So you need to divide by the sum of each row. Moreover, you want to average, not sum:

```python
avg_imp = np.mean(explain_matrix, axis=0)
avg_imp = avg_imp / np.sum(avg_imp)
```

(make sure the axes are correct)
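Putting that together, a minimal sketch of the comparison, assuming a fitted TabNetClassifier `clf` and a numpy array `X_train` with the same preprocessing used for training:

```python
import numpy as np

# explain() returns per-sample importances and the attention masks.
explain_matrix, masks = clf.explain(X_train)  # shape: (n_samples, n_features)

# Average over samples, then normalize so the importances sum to 1,
# matching the scale of clf.feature_importances_.
avg_imp = np.mean(explain_matrix, axis=0)
avg_imp = avg_imp / np.sum(avg_imp)

# The two vectors should now be comparable (up to the dataloader
# caveat discussed further down in this thread).
print(np.abs(avg_imp - clf.feature_importances_).max())
```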

@yaoching0

This comment was marked as outdated.

@yaoching0

This comment was marked as outdated.

@yaoching0
Author

@Optimox
I have a new clue: after the calculation they have the same values, but in different positions. Why is this happening?
[screenshot]

@yaoching0
Author

yaoching0 commented Mar 22, 2022

I restarted the Jupyter kernel, and now they seem to match.

@Optimox
Collaborator

Optimox commented Mar 22, 2022

@yaoching0,

Actually, I'm surprised that things are working for you now.

I ran a test and was not able to reproduce the feature_importances_ exactly. This is because the internal feature importances are computed on the train DATALOADER, while clf.explain(X) creates a new dataloader.

So if any parameter changes the training dataloader, you'll end up with a different score. This can happen in different scenarios (see the sketch after this comment):

  • drop_last=True: the feature importances are computed without a few rows (randomly selected, since shuffle=True for the training dataloader). Potentially this means the feature importances are somewhat random (but still representative). I think it's fair to call this a bug.
  • weights=1: some examples are over-sampled in the training loader, which changes the final feature importances.

I think those are the only two reasons, but there might be a few other scenarios I did not spot.

In the end: this is a bug and I'll fix it; thank you very much for finding it. In the meantime, the differences come from the sample used for the internal feature importances, so you can trust both methods. If you see a big difference in your case, it is due to the high sparsity of your data, and the training-set importances might not be very representative.
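To illustrate the drop_last point (a hypothetical sketch with made-up sizes, not pytorch-tabnet internals): the training-style loader and the loader built by explain() can simply see different rows.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

X = torch.rand(100, 4700)  # 100 rows, 4700 features

# Training-style loader: shuffle=True with drop_last=True silently excludes
# the last len(X) % batch_size rows of the shuffled order, and *which* rows
# are excluded changes every epoch.
train_loader = DataLoader(TensorDataset(X), batch_size=32,
                          shuffle=True, drop_last=True)

# explain()-style loader: deterministic, sees every row.
explain_loader = DataLoader(TensorDataset(X), batch_size=32,
                            shuffle=False, drop_last=False)

print(sum(batch[0].shape[0] for batch in train_loader))    # 96
print(sum(batch[0].shape[0] for batch in explain_loader))  # 100
```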

@Optimox Optimox reopened this Mar 22, 2022
@Optimox Optimox added the bug Something isn't working label Mar 22, 2022
@eduardocarvp
Collaborator

Yes, I think there are some problems as well. I was wondering whether the reducing matrix has any influence, since in one case we sum first and then apply the reduction, and in the other case the reverse. But since there are no np.abs calls, only sums and averages, I guess it should be the same...
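That linearity argument can be checked numerically with a tiny sketch (shapes and names are illustrative, not the library's internals): because the reduction is a matrix product, summing over samples before or after applying it gives the same result.

```python
import numpy as np

rng = np.random.default_rng(0)
imp = rng.random((8, 10))          # per-sample importances over 10 embedded dims
reduce_mat = rng.random((10, 6))   # maps embedded dims back to 6 original features

sum_then_reduce = imp.sum(axis=0) @ reduce_mat
reduce_then_sum = (imp @ reduce_mat).sum(axis=0)
print(np.allclose(sum_then_reduce, reduce_then_sum))  # True
```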
