[MRG+2] GaussianNB(): new parameter var_smoothing
#9681
Conversation
On what basis can we assume/know this will not degrade performance elsewhere?
It's obviously changing other results in our tests.
I agree with @jnothman: if you want to do something like this, I would add a parameter like min_std so you can adjust it.
I've added min_variance.
I was thinking about this a couple of weeks ago and couldn't find any references on it. I think it's a bit unclear whether you want a minimum or an additive constant. I think a Bayesian prior would be an additive constant, right? I guess in practice it doesn't make a lot of difference. Having a reference would be good, though.
Actually, a Bayesian prior would pull it towards a specific value, not be an additive constant... That might be the most natural thing to implement, imho.
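A minimal numpy sketch of the three options under discussion (all names here are illustrative, not from the PR): an additive constant, a hard minimum, and a prior-style shrinkage that pulls each variance toward a target value.

```python
import numpy as np

def smooth_variances(var, n, eps=1e-9, var_floor=1e-9, prior_var=1.0, prior_n=10):
    """Illustrative comparison of the three smoothing schemes discussed."""
    additive = var + eps                    # add a constant to every variance
    floored = np.maximum(var, var_floor)    # enforce a minimum variance
    # prior-style shrinkage: pull each variance toward prior_var,
    # with prior_n acting as a pseudo-count against n observed samples
    shrunk = (n * var + prior_n * prior_var) / (n + prior_n)
    return additive, floored, shrunk
```

In the shrinkage variant, small-sample variances are pulled most strongly toward the prior value, which is the "pull towards a specific value" behaviour described above.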
sklearn/naive_bayes.py (outdated diff):

```
@@ -354,11 +363,13 @@ def _partial_fit(self, X, y, classes=None, _refit=False,
        n_classes = len(self.classes_)
        self.theta_ = np.zeros((n_classes, n_features))
        self.sigma_ = np.zeros((n_classes, n_features))
        # create a 2d-array of uncorrected variances for further use
        # in _update_mean_variance()
        self.sigma_uncorrected_ = np.zeros((n_classes, n_features))
```
Why not use the same trick as before (cf. below)?
# put epsilon back each time?
Because we don't use a constant epsilon anymore.
And we can't subtract min_variance from self.sigma_ (that way we'd always get zero when computing the previous sigma).
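A hedged sketch of the trick being discussed (hypothetical names, deliberately simplified): with a constant epsilon you can "unsmooth" the stored variances before each incremental update and re-smooth afterwards, which is exactly what breaks once epsilon varies.

```python
import numpy as np

# Not the actual sklearn code: with a constant epsilon, the smoothed
# variances can be "unsmoothed" before an incremental update and
# re-smoothed afterwards.
def partial_fit_step(sigma_smoothed, batch_var, n_old, n_new, epsilon=1e-9):
    sigma = sigma_smoothed - epsilon  # recover the uncorrected variance
    # pooled variance of the old and new batches (equal means assumed,
    # to keep the sketch short)
    sigma = (n_old * sigma + n_new * batch_var) / (n_old + n_new)
    return sigma + epsilon            # put epsilon back each time

# If epsilon instead depends on the data seen so far (e.g. a fraction of
# the maximum variance), the exact value added last time must be stored,
# or the uncorrected variances kept separately, as this PR did at first.
```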
So far I can't find any difference between adding epsilon and setting a minimum variance when using naive Bayes. But I found that adding epsilon is more stable when used in non-naive Bayes classification:
[Plots comparing the min_variance and epsilon variants]
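For concreteness, the two variants only diverge on features whose variance is already below the threshold, which may explain why no difference shows up for naive Bayes on well-behaved data. A tiny illustrative sketch:

```python
import numpy as np

var = np.array([0.0, 1e-12, 0.5, 2.0])  # per-feature variances
eps = 1e-9

print(var + eps)             # additive: every variance shifted by eps
print(np.maximum(var, eps))  # minimum: only near-zero variances change
```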
What do you mean by non-naive Bayes?
By the way, this dataset is very non-Gaussian and therefore a very odd example. I think I would usually assume that people use the StandardScaler before GaussianNB and that the data is at least somewhat Gaussian.
Non-naive Bayes means using the full covariance matrix instead of just a variance vector in the multivariate Gaussian pdf (see scipy.stats.multivariate_normal.logpdf()).
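A short scipy sketch of this "non-naive" setting (the data and the size of the jitter are illustrative): the epsilon analogue is a small additive term on the covariance diagonal that keeps the matrix invertible.

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))        # samples from one class
mean = X.mean(axis=0)
cov = np.cov(X, rowvar=False)        # full covariance matrix, not just variances
eps = 1e-9 * cov.diagonal().max()    # additive jitter on the diagonal

log_density = multivariate_normal.logpdf(X, mean=mean, cov=cov + eps * np.eye(3))
```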
What dataset do you suggest for testing naive Bayes?
Non-naive Bayes with a full covariance matrix is quadratic discriminant analysis (aka QDA), which is already in sklearn.
epsilon is too generic a name. Please try to improve the variable name and docstring. I am not opposed to exposing this internal epsilon parameter.
Is min_var_scaling better?
It is not a scaling parameter, but an additive one. And since 3de55d0 it is not a minimum variance but a value added to all variances.
var_smoothing is fine with me.
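For reference, the semantics being named here ended up, roughly, as follows in GaussianNB; this is a simplified sketch, not the verbatim source:

```python
import numpy as np

def compute_epsilon(X, var_smoothing=1e-9):
    # a portion of the largest feature variance, added to all
    # per-class variances for numerical stability
    return var_smoothing * np.var(X, axis=0).max()

# sigma_ holds the per-class variances computed from the data:
# sigma_ += compute_epsilon(X, var_smoothing)
```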
@agramfort
+1 for MRG
Looks great! Please add an entry to doc/whats_new, and we'll merge. Thanks!
I added the PR link in whats_new; this way it is easier to find details about the change.
This one should be merged when the CIs are green.
Thank you all.
Merging, thanks a lot @Mottl!
I hope the purpose of adding this parameter will be documented. I am taking an ML course and working on a classification problem. My teacher asked me to explain in the report how I tuned the selected models, including the meaning of their parameters. I was stuck there, unable to find out what var_smoothing really means, until I found this...
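For later readers with the same question: var_smoothing adds a fraction of the largest feature variance to every per-class variance, which widens the per-feature Gaussians and stabilizes features with near-zero variance. It can be tuned like any other hyperparameter; a small sketch using standard scikit-learn tools:

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import GaussianNB

X, y = load_digits(return_X_y=True)
search = GridSearchCV(GaussianNB(), {"var_smoothing": np.logspace(-12, -2, 11)}, cv=5)
search.fit(X, y)
print(search.best_params_)
```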
This pull request changes epsilon to 1/20 of the maximum variance to improve prediction strength.
MNIST handwritten digits recognition test:
score before: 0.563
score after: 0.798
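One way to approximate this before/after comparison locally (a hedged sketch: sklearn's small digits set stands in for MNIST, so the scores will differ from those quoted; the old constant epsilon of 1e-9 is emulated by rescaling var_smoothing):

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

X, y = load_digits(return_X_y=True)

# "after": epsilon proportional to the largest feature variance (the default)
new = cross_val_score(GaussianNB(var_smoothing=1e-9), X, y, cv=5).mean()

# "before": emulate the old absolute epsilon = 1e-9 by dividing out the max variance
old = cross_val_score(
    GaussianNB(var_smoothing=1e-9 / np.var(X, axis=0).max()), X, y, cv=5
).mean()
print(f"old-style epsilon: {old:.3f}, new-style epsilon: {new:.3f}")
```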