
XGBClassifier.predict_proba outputs incorrect array dimension! #1215

Closed
jyu-theartofml opened this issue May 20, 2016 · 3 comments

@jyu-theartofml

Hi, I'm using the following code to fit the xgb classifier:
clf = XGBClassifier(objective='multi:softprob', n_estimators=300, subsample=0.8, gamma=1.3)
clf.fit(training_d[features], training_d['outcome'], early_stopping_rounds=10,
        eval_set=[(training_d[features], training_d['outcome']), (validation_d[features], validation_d['outcome'])],
        eval_metric='mlogloss')
but when I use the predict_proba function:
predict1 = np.array(clf.predict_proba(validation_d[features]))
it returns the wrong shape: validation_d[features] is (6238, 37), so the prediction should be (6238, 5), but predict1 comes out as (5, 12476). I can't figure out where the 12476 is coming from.

Does anyone have a similar issue? Thanks!

--Jenny

@KhaoticMind
Contributor

12476 = 6238 * 2

I'm getting this same behavior when using XGBClassifier from Python. Using the native xgb interface (objective = 'multi:softprob') and booster.predict I get correct results.

I'm running on commit 9c26566.

I can try updating to the latest commit and see if the problem persists.
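
For reference, a rough sketch of that native-interface workaround, reusing the names from the original post (the num_class value, the round counts and the assumption that 'outcome' is already encoded as integers 0..4 are mine, not from the post):

import xgboost as xgb

# build DMatrix objects from the same frames used in the original post
dtrain = xgb.DMatrix(training_d[features], label=training_d['outcome'])
dvalid = xgb.DMatrix(validation_d[features], label=validation_d['outcome'])

params = {'objective': 'multi:softprob', 'num_class': 5,
          'eval_metric': 'mlogloss', 'subsample': 0.8, 'gamma': 1.3}
bst = xgb.train(params, dtrain, num_boost_round=300,
                evals=[(dtrain, 'train'), (dvalid, 'valid')],
                early_stopping_rounds=10)

# with multi:softprob the booster returns one row per sample, one column per class
proba = bst.predict(dvalid)  # expected shape: (6238, 5)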

@KhaoticMind
Contributor

KhaoticMind commented May 31, 2016

Ok! Found the issue!

The problem is here:

if self.objective == "multi:softprob":
   return class_probs
else:
   classone_probs = class_probs
   classzero_probs = 1.0 - classone_probs
   return np.vstack((classzero_probs, classone_probs)).transpose()
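
As a quick check, feeding a dummy (6238, 5) multiclass output through that else branch reproduces the reported shape (the zeros array is just a stand-in for the real booster prediction):

import numpy as np

class_probs = np.zeros((6238, 5))      # stand-in for the 5-class softprob output
classone_probs = class_probs
classzero_probs = 1.0 - classone_probs
stacked = np.vstack((classzero_probs, classone_probs)).transpose()
print(stacked.shape)                   # (5, 12476), the shape reported above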

self.objective is not being updated in the fit method (where it detects the multi:softprob case):

if self.n_classes_ > 2:
   # Switch to using a multiclass objective in the underlying XGB instance
   xgb_options["objective"] = "multi:softprob"
   xgb_options['num_class'] = self.n_classes_

The simple fix should be to update self.objective inside this if.
I will implement it and do a PR later/tomorrow.
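
Roughly, the change would look something like this (just a sketch of the idea, not necessarily the exact code that will go into the PR):

if self.n_classes_ > 2:
   # Switch to using a multiclass objective in the underlying XGB instance
   xgb_options["objective"] = "multi:softprob"
   xgb_options['num_class'] = self.n_classes_
   # sketch of the fix: keep the wrapper attribute in sync so that
   # predict_proba later sees the objective actually used for training
   self.objective = xgb_options["objective"]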

KhaoticMind added a commit to KhaoticMind/xgboost that referenced this issue May 31, 2016
Save the actual objective used on xgboost.train.

Not saving it was causing problems in predict_proba; see issue dmlc#1215
@jyu-theartofml
Author

Thank you!

tlorieul pushed a commit to tlorieul/xgboost that referenced this issue Jun 8, 2016
Save the actual objective used on xgboost.train.

Not saving it was causing problems in predict_proba; see issue dmlc#1215
@lock lock bot locked as resolved and limited conversation to collaborators Oct 26, 2018