
[WIP] GBTRegressor does not provide uncertainty estimates #9

Merged

merged 3 commits into master from trees on Mar 28, 2016

Conversation

@betatim (Member) commented Mar 23, 2016

Started a simple wrapper for regressors like GBTRegressor to provide uncertainty estimates as well
as central predictions.

Any ideas on how to test that this indeed gives the central prediction and the 68% confidence interval (right terminology?)? The best idea so far is to use a very large number of samples on a simple problem and check that the predicted standard deviation is approximately equal to the std used when sampling; a sketch of that is below.
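
A minimal sketch of that test idea (names, sizes, and the tolerance are placeholders; it assumes a plain `GradientBoostingRegressor` with `loss="quantile"` as the underlying quantile estimator):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.RandomState(0)
true_std = 0.2

# simple problem: constant target plus Gaussian noise with known std
X = rng.uniform(0, 5, size=(10000, 1))
y = 1.0 + true_std * rng.randn(len(X))

# the 16th/84th percentiles of a Gaussian sit one std either side of the mean
low = GradientBoostingRegressor(loss="quantile", alpha=0.16).fit(X, y)
high = GradientBoostingRegressor(loss="quantile", alpha=0.84).fit(X, y)

predicted_std = (high.predict(X) - low.predict(X)) / 2.0
assert abs(predicted_std.mean() - true_std) < 0.05
```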

Todo:

  • documentation
  • check that GP uses same definition of uncertainty
  • smarter tests

    # noise level depends on value of `X`
    return np.abs(2.5 - X) / 2.5 + 0.1

def sample_noise(X, std=0.2, noise=constant_noise):
@betatim (Member, Author) commented on these lines:

should pass in an rng / use check_random_state

Wrap a GBTRegressor to provide uncertainty estimates as well
as central predictions

def constant_noise(X):
    return np.ones_like(X)

def sample_noise(X, std=0.2, noise=constant_noise):
@betatim (Member, Author) commented on this line:

use the random_state pattern instead of relying on the global rng
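
That pattern, sketched (only the signature of `sample_noise` appears in the diff above; the body here is illustrative):

```python
import numpy as np
from sklearn.utils import check_random_state

def constant_noise(X):
    return np.ones_like(X)

def sample_noise(X, std=0.2, noise=constant_noise, random_state=None):
    # check_random_state accepts None, an int seed, or a RandomState
    # instance, so callers control reproducibility explicitly
    rng = check_random_state(random_state)
    return std * noise(X) * rng.randn(*X.shape)
```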

@glouppe (Member) commented Mar 23, 2016

I am not fond of the nomenclature, std != uncertainty != quantile.

@betatim (Member, Author) commented Mar 23, 2016

I'm flexible, but out of ideas at the moment for a good (short) name.

GBTQuantileRegressor?

GBTQuantiles(quantiles=list_of_quantiles) with per_quantile_prediction = predict()? You set all the quantiles you want estimated and we return each one separately. For the 68% interval that would be GBTQuantiles([0.16, 0.5, 0.84]).
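
A rough sketch of what that interface might look like (hypothetical names; it fits one `GradientBoostingRegressor` with `loss="quantile"` per requested quantile, which is also what the diff further down does):

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.ensemble import GradientBoostingRegressor

class GBTQuantiles(BaseEstimator, RegressorMixin):
    def __init__(self, quantiles=(0.16, 0.5, 0.84), random_state=None):
        self.quantiles = quantiles
        self.random_state = random_state

    def fit(self, X, y):
        # one quantile-loss GBT per requested quantile
        self.regressors_ = [
            GradientBoostingRegressor(loss="quantile", alpha=a,
                                      random_state=self.random_state)
            for a in self.quantiles]
        for rgr in self.regressors_:
            rgr.fit(X, y)
        return self

    def predict(self, X):
        # one column per quantile, in the order they were requested
        return np.column_stack([rgr.predict(X)
                                for rgr in self.regressors_])
```

`GBTQuantiles((0.16, 0.5, 0.84)).fit(X, y).predict(X)` would then return an (n_samples, 3) array: lower, central, and upper predictions.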

@MechCoder (Member) commented

I'm planning to review my GradientBoosting know-how this weekend. I can have a look after that.

@betatim (Member, Author) commented Mar 24, 2016

If you need this, merge it; I'm away till Monday night without internet.

                random_state=rng)
            for a in self.quantiles]
        for rgr in self.regressors_:
            rgr.fit(X, y)
A reviewer (Member) commented on these lines:

return self
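
That is, the standard scikit-learn convention of ending `fit` with `return self`, which is what lets calls chain (using the hypothetical `GBTQuantiles` sketched earlier; `X_train` etc. are placeholders):

```python
est = GBTQuantiles().fit(X_train, y_train)  # fit(...) hands back the estimator
y_quantiles = est.predict(X_test)
```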

@MechCoder (Member) commented

@glouppe @betatim
How do we compute the standard deviation of the predictions given the quantiles? There is no assumption about the conditional distribution of Y given X, right?
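
(For what it's worth: under a Gaussian assumption on Y given X, which the PR indeed does not make, the 16th and 84th percentiles sit one standard deviation either side of the mean, so a sketch would be:)

```python
# valid only if Y | X is assumed Gaussian; pred_16 and pred_84 are
# hypothetical arrays of 16th/84th-percentile predictions
std_estimate = (pred_84 - pred_16) / 2.0
```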

@betatim (Member, Author) commented Mar 28, 2016

The last item on the todo list ("check definition of uncertainty") is something I'd like to punt to #23 instead of waiting to merge this until we converge.

@MechCoder (Member) commented

This looks ok to me.

@MechCoder merged commit fc07663 into scikit-optimize:master on Mar 28, 2016
@betatim deleted the trees branch on March 28, 2016 16:46
@betatim (Member, Author) commented Mar 28, 2016

📈

@glouppe (Member) commented Mar 29, 2016

It would be good to expose the base_estimator in this newly introduced class. The default params of GradientBoostingRegressor are very likely to give poor results.
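
A sketch of how that could look (hypothetical; it assumes the supplied estimator exposes its quantile as an `alpha` parameter, as `GradientBoostingRegressor` does):

```python
from sklearn.base import BaseEstimator, RegressorMixin, clone
from sklearn.ensemble import GradientBoostingRegressor

class GBTQuantiles(BaseEstimator, RegressorMixin):
    def __init__(self, base_estimator=None, quantiles=(0.16, 0.5, 0.84)):
        self.base_estimator = base_estimator
        self.quantiles = quantiles

    def fit(self, X, y):
        base = self.base_estimator
        if base is None:
            # bare defaults are rarely well tuned; callers can pass
            # a configured GradientBoostingRegressor instead
            base = GradientBoostingRegressor(loss="quantile")
        # clone so each quantile gets an independent copy of the settings
        self.regressors_ = [clone(base).set_params(alpha=a)
                            for a in self.quantiles]
        for rgr in self.regressors_:
            rgr.fit(X, y)
        return self
```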

holgern added a commit that referenced this pull request Jan 29, 2020