GradientBoostingClassifier doesn't work with least squares loss #1085

Closed
larsmans opened this Issue Aug 30, 2012 · 10 comments

5 participants

@larsmans
scikit-learn member

Triggered by this SO question: GradientBoostingClassifier's docstring states that loss may be "ls", in which case least squares regression will be performed, but when you actually try that, a ValueError is raised. I'm not sure whether the code or the docs should be changed.

(I also noticed that Huber and quantile loss are not advertised in the regressor's docstring.)

@pprett
scikit-learn member

Thanks for pointing this out - I've changed that in #1036 but I should fix this now. I'll remove "ls" from GradientBoostingClassifier.

@amueller
scikit-learn member

@pprett this is fixed now, right?

@pprett
scikit-learn member

@amueller correct, it has been addressed in #1088

@pprett pprett closed this Sep 6, 2012
@smcinerney

In 0.13.1 the error string is still the seriously non-obvious "ValueError: n_classes must be 1 for regression"
e.g. for GradientBoostingClassifier(loss='ls') or (loss='huber')

Could you change this to a more user-friendly "Loss function '%s' is not supported for classifier %s"?
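
A minimal sketch of the kind of friendlier check suggested here (the names `check_loss`, `CLASSIFICATION_LOSSES`, and `REGRESSION_LOSSES` are hypothetical, not the actual scikit-learn internals):

```python
# Hypothetical validation helper -- not scikit-learn's real code, just an
# illustration of raising a clear error when a regression-only loss such
# as 'ls' or 'huber' is passed to the classifier.
CLASSIFICATION_LOSSES = {"deviance"}
REGRESSION_LOSSES = {"ls", "lad", "huber", "quantile"}

def check_loss(loss, is_classifier):
    allowed = CLASSIFICATION_LOSSES if is_classifier else REGRESSION_LOSSES
    if loss not in allowed:
        kind = "classification" if is_classifier else "regression"
        raise ValueError(
            "Loss %r is not supported for %s; use one of %s"
            % (loss, kind, sorted(allowed)))

# check_loss("ls", is_classifier=True) would then raise:
#   ValueError: Loss 'ls' is not supported for classification; use one of ['deviance']
```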

@amueller
scikit-learn member

Well 0.13.1 is the last release and won't change any more ;) Can you test the current development version please?

@larsmans
scikit-learn member

Still the case in master. I'm on it.

@larsmans
scikit-learn member

(Side note: ls is called squared_loss in SGD and mean_squared_error in metrics.)

@larsmans
scikit-learn member

I think I'm going to need some help here. GradientBoostingClassifier docstring says:

loss : {'deviance'}, optional (default='deviance')
    loss function to be optimized. 'deviance' refers to
    deviance (= logistic regression) for classification
    with probabilistic outputs.

I don't know what this means. There is a loss option, but deviance is the only supported value? Then why is it there? @glouppe, @arjoly?

@ogrisel
scikit-learn member

I think you should ask @pprett too. The narrative doc has a bit more content: http://scikit-learn.org/dev/modules/ensemble.html#gradient-tree-boosting

@pprett
scikit-learn member

@larsmans I created a PR #2308 that fixes the issue. GradientBoostingClassifier currently supports only one loss ('deviance') - internally it uses either Binomial or Multinomial Deviance depending on the number of class labels.
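
A sketch of that selection logic (hypothetical helper names, not the actual scikit-learn implementation; the deviance formula below is the standard logistic-loss formulation, up to scaling conventions):

```python
import math

# Hypothetical sketch: with loss='deviance', the classifier internally
# picks binomial deviance for 2 classes and multinomial deviance for more.
def select_deviance(n_classes):
    if n_classes < 2:
        raise ValueError("need at least 2 classes")
    return "binomial" if n_classes == 2 else "multinomial"

# One common formulation of binomial deviance (logistic loss) for labels
# y in {0, 1} and raw scores f: mean of log(1 + exp(-(2y - 1) * f)).
def binomial_deviance(y, raw_scores):
    return sum(math.log1p(math.exp(-(2 * yi - 1) * fi))
               for yi, fi in zip(y, raw_scores)) / len(y)
```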
