
Using this package on machine learning results #22

Closed
ivan-marroquin opened this issue Feb 15, 2021 · 4 comments

@ivan-marroquin

Hi,

Thanks for making this package available to us! I have a simple question:

I have a data set split into train/validation sets. A regressor (or classifier) is trained, and then, using the validation set, I get an estimate, Y_prediction. From the same validation set, I have the Y_true. So, how would you suggest computing the standard deviation of the predictions from a regressor (or classifier)?

Many thanks,

Ivan

@IanChar
Member

IanChar commented Feb 16, 2021

Hey Ivan, thanks so much for your interest in the toolbox! To use our evaluations, your model needs to produce some uncertainty estimate (something the toolbox itself does not provide). There are many ways to do this, but one easy approach is to alter your regression model so that it outputs the parameters of a conditional Gaussian (i.e., instead of producing y_pred, produce mu_pred and sigma_pred for each output) and then train on the negative log likelihood. You can improve this further by building an ensemble of these models. I would recommend this paper as a good starting point:
Lakshminarayanan, Balaji, Alexander Pritzel, and Charles Blundell. "Simple and scalable predictive uncertainty estimation using deep ensembles." NeurIPS 2017.
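
For concreteness, here is a minimal sketch of the mean/variance approach described above, assuming PyTorch; the network, loss, and data below are illustrative placeholders, not part of uncertainty-toolbox:

```python
import torch
import torch.nn as nn

class MeanVarianceNet(nn.Module):
    """Hypothetical net that outputs mu_pred and a log-variance for a conditional Gaussian."""
    def __init__(self, in_dim, hidden=64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.mu_head = nn.Linear(hidden, 1)
        self.log_var_head = nn.Linear(hidden, 1)  # predicting log variance keeps sigma > 0

    def forward(self, x):
        h = self.body(x)
        return self.mu_head(h), self.log_var_head(h)

def gaussian_nll(mu, log_var, y):
    # 0.5 * [log sigma^2 + (y - mu)^2 / sigma^2], up to an additive constant
    return 0.5 * (log_var + (y - mu) ** 2 / log_var.exp()).mean()

# Placeholder data and training loop (hypothetical shapes).
X_train, Y_train = torch.randn(256, 8), torch.randn(256, 1)
model = MeanVarianceNet(in_dim=8)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(100):
    mu, log_var = model(X_train)
    loss = gaussian_nll(mu, log_var, Y_train)
    opt.zero_grad()
    loss.backward()
    opt.step()

# At evaluation time, sigma_pred = (0.5 * log_var).exp() gives the per-point
# standard deviation alongside mu_pred.
```

The per-point mu_pred/sigma_pred pairs are exactly the kind of uncertainty estimate the toolbox's evaluations expect; an ensemble variant would train several such networks and combine their means and variances as in the paper above.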

@ivan-marroquin
Author

Hi @IanChar

Many thanks for your prompt answer! What about using bootstrap confidence intervals? They can be applied to estimate uncertainty in the quality of the classification (or prediction). So, I could use the "+/- error" from the confidence interval to estimate low/high bounds on the standard deviation of the prediction samples. What do you think?

Ivan
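
As a rough illustration of the bootstrap idea above (a sketch assuming a scikit-learn-style regressor; the model, data, and resampling count are placeholders):

```python
import numpy as np
from sklearn.linear_model import LinearRegression  # stand-in regressor

rng = np.random.default_rng(0)
# Placeholder training/validation data.
X_train = rng.normal(size=(200, 5))
y_train = X_train @ rng.normal(size=5) + rng.normal(scale=0.1, size=200)
X_val = rng.normal(size=(50, 5))

n_boot = 200
preds = np.empty((n_boot, len(X_val)))
for b in range(n_boot):
    # Resample the training set with replacement and refit.
    idx = rng.integers(0, len(X_train), size=len(X_train))
    m = LinearRegression().fit(X_train[idx], y_train[idx])
    preds[b] = m.predict(X_val)

mu_pred = preds.mean(axis=0)    # point prediction per validation sample
sigma_pred = preds.std(axis=0)  # bootstrap std dev per validation sample
lo, hi = np.percentile(preds, [2.5, 97.5], axis=0)  # 95% bootstrap interval
```

The standard deviation of the bootstrap predictions gives a per-point sigma_pred, and the percentile interval gives the "+/-" bounds mentioned above; note this mainly captures model-fit variability rather than observation noise.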

@IanChar
Member

IanChar commented Feb 18, 2021

Yep, that's definitely a valid method too!

@IanChar
Member

IanChar commented Feb 19, 2021

Marking this issue as closed.

IanChar closed this as completed Feb 19, 2021