ngroup=0 when using xgb_model.pred(pred_interactions=True) #4276

Closed
kyoungrok0517 opened this issue Mar 19, 2019 · 10 comments · Fixed by #4522

Comments


kyoungrok0517 commented Mar 19, 2019

Related: shap/shap#464


Problem
I get the following error when I call predict with pred_interactions=True to get feature interactions. The error occurs while reshaping the result array. I suspect the cause is that ngroup (the 2nd axis of (35255,0,601,601)) becomes 0, which looks like a bug.

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-14-6baa08e16bb5> in <module>()
----> 1 preds = model.predict(dtest, pred_contribs=True, approx_contribs=True, pred_interactions=True)

~/anaconda/lib/python3.6/site-packages/xgboost/core.py in predict(self, data, output_margin, ntree_limit, pred_leaf, pred_contribs, approx_contribs, pred_interactions, validate_features)
   1237                     preds = preds.reshape(nrow, data.num_col() + 1, data.num_col() + 1)
   1238                 else:
-> 1239                     preds = preds.reshape(nrow, ngroup, data.num_col() + 1, data.num_col() + 1)
   1240             elif pred_contribs:
   1241                 ngroup = int(chunk_size / (data.num_col() + 1))

ValueError: cannot reshape array of size 148317785 into shape (35255,0,601,601)
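
The numbers in the error message can be checked by hand. The sketch below assumes the interactions branch computes ngroup by dividing the per-row chunk size by (num_col + 1) squared, by analogy with the pred_contribs line visible at 1241 in the traceback. That formula and the helper name ngroup_for are assumptions for illustration, not code copied from core.py.

```python
def ngroup_for(total_size, nrow, ncol):
    """Recompute ngroup from the sizes in the error message.

    Returns (chunk_size, ngroup under the assumed interactions formula,
    ngroup under the pred_contribs formula shown at line 1241).
    """
    chunk_size = total_size // nrow   # values returned per row
    width = ncol + 1                  # feature columns plus one extra column
    # Assumed formula for the pred_interactions branch (hypothetical).
    ngroup_interactions = int(chunk_size / (width * width))
    # Formula shown in the traceback for the pred_contribs branch.
    ngroup_contribs = int(chunk_size / width)
    return chunk_size, ngroup_interactions, ngroup_contribs


# Sizes taken from the traceback: 148317785 values, 35255 rows, 600 columns.
print(*ngroup_for(148317785, 35255, 600))  # 4207 0 7
```

A per-row chunk of 4207 values is consistent with a contributions layout of 7 groups by 601 columns, so the returned buffer appears to hold pred_contribs-shaped output while the Python side tries to reshape it as interactions. Under the assumed formula, that mismatch is what drives ngroup to 0.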
Author

kyoungrok0517 commented Mar 19, 2019

Here are the files needed for testing.

How to run
It won't take long. The dataset is small.

python SHAP_Analysis_For_Report.py ./model_dense.pkl ./report.dmatrix

Expected error
ValueError: cannot reshape array of size 63105 into shape (15,0,601,601)

[00:02:25] 140190x600 matrix with 76352399 entries loaded from ./data/dense/test.dmatrix
Using 15 samples
Traceback (most recent call last):
  File "SHAP_Analysis_For_Report.py", line 91, in <module>
    preds = model.predict(dtest, **shap_params)
  File "C:\Users\kyoun\Anaconda3\lib\site-packages\xgboost\core.py", line 1306, in predict
    preds = preds.reshape(nrow, ngroup, data.num_col() + 1, data.num_col() + 1)
ValueError: cannot reshape array of size 63105 into shape (15,0,601,601)


rnarukulla-deloitte commented Apr 25, 2019

I get the same error when using an XGBoost model built with the CLI interface. Can someone help?

Collaborator

hcho3 commented May 1, 2019

@kyoungrok0517
Author

So the devs are not interested in this problem?

Collaborator

hcho3 commented May 31, 2019

@kyoungrok0517 Sorry I didn't have a chance to look at this. We have a backlog of issues currently, and are trying to address them when we are able to. A little patience would be appreciated.

Collaborator

hcho3 commented May 31, 2019

@kyoungrok0517 And I apologize for not giving you any update for more than 2 months. I will look at this issue over this weekend.

Collaborator

hcho3 commented May 31, 2019

@kyoungrok0517 Quick update: I noticed that your script sets pred_contribs=True, approx_contribs=True, and pred_interactions=True together. I think only one of pred_contribs and pred_interactions can be used at a time, but XGBoost does not raise an error for this combination. I will check whether the problem persists when using pred_interactions=True exclusively.
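
Until the library rejects the combination itself, a caller-side check can catch it before predict is called. This is a minimal sketch: validate_shap_params is a hypothetical helper, not part of XGBoost, and it only inspects the keyword dictionary that the script passes to model.predict.

```python
def validate_shap_params(params):
    """Reject a keyword dict that enables both SHAP output modes.

    Hypothetical caller-side guard; it does not call XGBoost.
    """
    if params.get("pred_contribs") and params.get("pred_interactions"):
        raise ValueError(
            "Set only one of pred_contribs and pred_interactions."
        )
    return params


# Passes: only pred_interactions is enabled.
shap_params = validate_shap_params(
    {"pred_interactions": True, "approx_contribs": True}
)
```

The checked dictionary can then be passed on unchanged, as in model.predict(dtest, **shap_params).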

Author

kyoungrok0517 commented May 31, 2019 via email

Collaborator

hcho3 commented May 31, 2019

@kyoungrok0517 Interesting, so I changed your script from

# interactions
shap_params = {
    'pred_contribs': True,
    'approx_contribs': True,
    'pred_interactions': True
}
preds = model.predict(dtest, **shap_params)

to

# interactions
shap_params = {
    'pred_interactions': True,
    'approx_contribs': True
}
preds = model.predict(dtest, **shap_params)

and I no longer get the ngroup = 0 error. The modified script finishes in about 20 minutes on a C5.9xlarge machine.

I filed #4522 to fix the issue. Specifically, with that change you will get an error when both pred_contribs and pred_interactions are set.

Author

kyoungrok0517 commented May 31, 2019 via email

lock bot locked as resolved and limited conversation to collaborators Aug 29, 2019