BUG: SHAP Partition explainer fails for a single token text input #3515
Comments
Thanks for the report. Would you be so kind and provide some sample |
Hey, you can use this dataset: https://www.kaggle.com/datasets/uciml/sms-spam-collection-dataset
Thanks for pointing us to the dataset. I don't want to sound rude, but we have so many issues that we really need to choose what we work on, so it is best for us if we can reproduce a bug directly with the code provided in the bug description. If your time allows, it would be great if you could add the loading and defining of the dataset to your issue, so that we can reproduce it without looking up how to load data from Kaggle + defining the
You can use this code snippet:
Error shown for test1:
Hey @CloseChoice, |
Issue Description
I'm trying to use the SHAP Partition explainer on text data with a custom tokeniser function. However, when the input text contains only a single token, for example "hello", it fails at the masker clustering step.
Minimal Reproducible Example
Traceback
I noticed that the value of pt is [], an empty array.
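To illustrate why pt can come out empty, here is a rough sketch. It is not SHAP's actual clustering code: toy_partition_tree and has_enough_tokens are hypothetical names made up for this example, and the row layout [left_id, right_id, distance, leaf_count] only mimics a scipy-style linkage matrix. The point is that a binary merge tree over n tokens has exactly n - 1 rows, so a single token leaves nothing to merge.

```python
def toy_partition_tree(tokens):
    """Toy stand-in for a partition tree (NOT SHAP's implementation).

    Greedily merges the two left-most clusters until one remains and
    returns one row per merge: [left_id, right_id, distance, leaf_count].
    """
    # Leaves get ids 0..n-1; each cluster is (cluster_id, leaf_count).
    clusters = [(i, 1) for i in range(len(tokens))]
    next_id = len(tokens)
    rows = []
    while len(clusters) > 1:
        (left_id, left_n), (right_id, right_n) = clusters[0], clusters[1]
        size = left_n + right_n
        # Use the merged size as a dummy "distance" for illustration.
        rows.append([left_id, right_id, float(size), size])
        clusters[:2] = [(next_id, size)]
        next_id += 1
    return rows


def has_enough_tokens(tokens):
    """Hypothetical pre-check: a merge tree needs at least two leaves."""
    return len(tokens) >= 2


print(toy_partition_tree(["hello", "world", "again"]))  # two merge rows
print(toy_partition_tree(["hello"]))                    # no merges at all
print(has_enough_tokens(["hello"]))
```

With three tokens the sketch returns two rows; with the single token "hello" it returns an empty list, which matches the empty pt seen here. Any code that later indexes into columns of that array would then have nothing to work with, which is presumably where the failure comes from. A check like has_enough_tokens on the caller's side is only a possible stopgap until the masker handles this case itself.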