
XGBClassifier.predict_proba outputs incorrect array dimension! #1215

Closed
jyu-theartofml opened this issue May 20, 2016 · 3 comments

@jyu-theartofml

Hi, I'm using the following code to fit the xgb classifier:
clf = XGBClassifier(objective='multi:softprob', n_estimators=300, subsample=0.8, gamma=1.3)
clf.fit(training_d[features], training_d['outcome'], early_stopping_rounds=10,
        eval_set=[(training_d[features], training_d['outcome']), (validation_d[features], validation_d['outcome'])],
        eval_metric='mlogloss')
but when I use the predict_proba function:
predict1 = np.array(clf.predict_proba(validation_d[features]))
it returns the wrong shape: validation_d[features] is (6238, 37), so the prediction should be (6238, 5), but predict1 comes out as (5, 12476). I can't figure out where the 12476 is coming from.

Does anyone have a similar issue? Thanks!

--Jenny

@KhaoticMind
Contributor

12476 = 6238 * 2

I'm getting this same behavior when using XGBClassifier from Python. Using the native xgb interface (objective = 'multi:softprob') and booster.predict I get correct results.

I'm running on commit 9c26566.

I can try updating to the latest commit and see if the problem persists.
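
For reference, a rough sketch of that native-interface workaround, reusing the names from the original post (the num_class value, the round counts and the assumption that 'outcome' is already encoded as integers 0..4 are mine, not from the post):

import xgboost as xgb

# build DMatrix objects from the same frames used in the original post
dtrain = xgb.DMatrix(training_d[features], label=training_d['outcome'])
dvalid = xgb.DMatrix(validation_d[features], label=validation_d['outcome'])

params = {'objective': 'multi:softprob', 'num_class': 5,
          'eval_metric': 'mlogloss', 'subsample': 0.8, 'gamma': 1.3}
bst = xgb.train(params, dtrain, num_boost_round=300,
                evals=[(dtrain, 'train'), (dvalid, 'valid')],
                early_stopping_rounds=10)

# with multi:softprob the booster returns one row per sample, one column per class
proba = bst.predict(dvalid)  # expected shape: (6238, 5)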

@KhaoticMind
Contributor

KhaoticMind commented May 31, 2016

Ok! Found the issue!

The problem is here:

if self.objective == "multi:softprob":
   return class_probs
else:
   classone_probs = class_probs
   classzero_probs = 1.0 - classone_probs
   return np.vstack((classzero_probs, classone_probs)).transpose()
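
As a quick check, feeding a dummy (6238, 5) multiclass output through that else branch reproduces the reported shape (the zeros array is just a stand-in for the real booster prediction):

import numpy as np

class_probs = np.zeros((6238, 5))      # stand-in for the 5-class softprob output
classone_probs = class_probs
classzero_probs = 1.0 - classone_probs
stacked = np.vstack((classzero_probs, classone_probs)).transpose()
print(stacked.shape)                   # (5, 12476), the shape reported above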

self.objective is not being updated in the fit method (where it detects the multi:softprob case):

if self.n_classes_ > 2:
   # Switch to using a multiclass objective in the underlying XGB instance
   xgb_options["objective"] = "multi:softprob"
   xgb_options['num_class'] = self.n_classes_

The simple fix should be to update self.objective inside this if.
I will implement it and do a PR later/tomorrow.
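
Roughly, the change would look something like this (just a sketch of the idea, not necessarily the exact code that will go into the PR):

if self.n_classes_ > 2:
   # Switch to using a multiclass objective in the underlying XGB instance
   xgb_options["objective"] = "multi:softprob"
   xgb_options['num_class'] = self.n_classes_
   # sketch of the fix: keep the wrapper attribute in sync so that
   # predict_proba later sees the objective actually used for training
   self.objective = xgb_options["objective"]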

KhaoticMind added a commit to KhaoticMind/xgboost that referenced this issue May 31, 2016
Save the actual objective used on xgboost.train.

Not saving it was causing problems in predict_proba; see issue dmlc#1215
@jyu-theartofml
Author

Thank you!

tlorieul pushed a commit to tlorieul/xgboost that referenced this issue Jun 8, 2016
Save the actual objective used on xgboost.train.

Not saving it was causing problems in predict_proba; see issue dmlc#1215
@lock lock bot locked as resolved and limited conversation to collaborators Oct 26, 2018