Error in if (ncol(res) == 1) { : argument is of length zero #51

Open
marboe123 opened this issue Oct 16, 2022 · 5 comments
Labels
enhancement New feature or request

Comments

@marboe123

Hello,

I receive the error above after I use this command:

shap_values <- fastshap::explain(model_n, X = trainval, exact = TRUE)

model_n is a multiclass classification model fitted with xgboost.
trainval is my training data.

When I run this set-up for binary classification, the shap values are calculated correctly.
When I run this set-up for multiclass classification, the error above is generated.

Do you have any idea what the cause could be?

Thanks a lot!

@marboe123
Author

PS:

If I use the SHAPforxgboost package to calculate the SHAP values, I also get an error:

Error in `colnames<-`(`*tmp*`, value = c(colnames(X_train), "BIAS")) : 
  attempt to set 'colnames' on an object with less than two dimensions

Several other users have run into the same error, as can be seen at the bottom of this post:

https://liuyanguu.github.io/post/2019/07/18/visualization-of-shap-for-xgboost/

And here:

liuyanguu/SHAPforxgboost#35

Maybe this points to the root cause of the error.

Thank you.

@bgreenwell
Owner

bgreenwell commented Oct 20, 2022

@marboe123 Can you show the output of calling predict(mymodel, newdata = data, predcontrib = TRUE)? I suspect XGBoost returns a list or an array (one element for each class) in the multiclass case. If so, it should be a simple fix.

@marboe123
Author

@bgreenwell Thank you for your response! The output is:
[image: screenshot of the predict() output]

If I look into the lists, I see BIAS, BIAS.1, BIAS.2, and similarly for my variables: x, x.1, x.2, etc.

Are these the SHAP values I can use, if the SHAP values themselves are all I need?

@bgreenwell
Owner

Yes! I’ll fix the package to account for multiclass models. What you have here is a list with one component of Shapley values for each of your three class outcomes; in the binary case you really only need one. I’ll leave this issue open until I can get a fix in, but those are exactly what you’re looking for!
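
For reference, here is a minimal base R sketch of why a list-shaped result would trip an `if (ncol(res) == 1)` check, and how the per-class components can be pulled apart. The list below is hand-built stand-in data shaped like the output described above (one matrix per class, each with a BIAS column); the names `res`, `make_contrib`, `class_1` and so on are made up for illustration, and the exact shape returned by predict() may differ between xgboost versions.

```r
# Stand-in for a multiclass predcontrib result: a list with one matrix per
# class (hypothetical data; real values would come from predict()).
set.seed(1)
make_contrib <- function() {
  m <- matrix(rnorm(8), nrow = 2)
  colnames(m) <- c("x1", "x2", "x3", "BIAS")
  m
}
res <- list(
  class_1 = make_contrib(),
  class_2 = make_contrib(),
  class_3 = make_contrib()
)

# A plain list has no dim attribute, so ncol(res) is NULL. The comparison
# `NULL == 1` has length zero, and `if` stops with an error.
err <- tryCatch(
  if (ncol(res) == 1) "one column" else "several columns",
  error = function(e) conditionMessage(e)
)
print(err)

# Per-class SHAP values: drop the BIAS column from each component.
shap_by_class <- lapply(res, function(m) {
  m[, colnames(m) != "BIAS", drop = FALSE]
})
str(shap_by_class)
```

Each element of `shap_by_class` can then be handled like the single matrix from the binary case, one class at a time.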

@marboe123
Author

That is great. I have already been testing with predict(mymodel, newdata = data, predcontrib = TRUE).
Initially I ran xgboost hyperparameter tuning for binary classification, computing SHAP values with fastshap, on a 6 GB GPU. This sometimes resulted in out-of-memory errors.
Next I moved to a 24 GB GPU and the memory errors went away.
I have now switched to multiclass, with SHAP values from predict(mymodel, newdata = data, predcontrib = TRUE), and I am hitting the memory limit again.
Is it possible that fastshap needs less memory than the predict method?
I will also test my multiclass script without SHAP values to see whether it runs without memory errors.
Thanks!

@bgreenwell bgreenwell added the enhancement New feature or request label May 5, 2023