[MRG+2] GaussianNB(): new parameter `var_smoothing` #9681

Merged
merged 14 commits into scikit-learn:master from Mottl:native_bayes_epsilon on Sep 18, 2017

Conversation

5 participants
@Mottl
Contributor

Mottl commented Sep 3, 2017

This pull request changes epsilon to 1/20 of the maximum variance to improve prediction strength.
MNIST handwritten digits recognition test:
score before: 0.563
score after: 0.798
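
For readers who want to try a comparable experiment, here is a minimal sketch (not the benchmark above): it varies the smoothing value on scikit-learn's bundled digits dataset as a stand-in for MNIST, using the var_smoothing parameter this PR eventually introduces; the scores will differ from the numbers quoted above.

```python
# Hedged sketch: compare GaussianNB scores for the default smoothing vs. a
# much larger value, on the bundled digits dataset (a stand-in for MNIST).
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for smoothing in (1e-9, 1e-2):  # default vs. a larger smoothing value
    clf = GaussianNB(var_smoothing=smoothing).fit(X_train, y_train)
    print(smoothing, clf.score(X_test, y_test))
```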

@Mottl Mottl changed the title from Changed epsilon to improve prediction strength of Naive Bayes classification to [MRG] Changed epsilon to improve prediction strength of Naive Bayes classification Sep 3, 2017

@Mottl Mottl changed the title from [MRG] Changed epsilon to improve prediction strength of Naive Bayes classification to Changed epsilon to improve prediction strength of Naive Bayes classification Sep 3, 2017

@jnothman
Member

jnothman commented Sep 3, 2017

On what basis can we assume/know this will not degrade performance elsewhere?

@jnothman
Member

jnothman commented Sep 3, 2017

It's obviously changing other results in our tests

@Mottl Mottl changed the title from Changed epsilon to improve prediction strength of Naive Bayes classification to [WIP] Changed epsilon to improve prediction strength of Naive Bayes classification Sep 4, 2017

@agramfort
Member

agramfort commented Sep 4, 2017

I agree with @jnothman

If you want to do something like this, I would add a parameter like min_std so you can adjust it.

Mottl added some commits Sep 5, 2017

@Mottl
Contributor

Mottl commented Sep 5, 2017

I've added a min_variance parameter to GaussianNB(), which by default is computed as 1e-9 multiplied by the maximum variance across all dimensions. It behaves much like adding an epsilon to the variance, as in the current code.
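
As an illustration of the default described here (a rough sketch, not the actual diff), the floor is tied to the scale of the data via its largest per-feature variance:

```python
import numpy as np

# Illustrative only: how a data-dependent epsilon could be derived and applied.
X = np.random.RandomState(0).rand(100, 5)

var_smoothing = 1e-9
epsilon = var_smoothing * np.var(X, axis=0).max()  # scale-aware default

sigma = np.var(X, axis=0)               # per-feature variances
sigma_min = np.maximum(sigma, epsilon)  # "min_variance": clip from below
sigma_add = sigma + epsilon             # additive epsilon, as in the current code
```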

@Mottl Mottl changed the title from [WIP] Changed epsilon to improve prediction strength of Naive Bayes classification to [MRG] GaussianNB(): new parameter `min_variance` Sep 5, 2017

@amueller
Member

amueller commented Sep 6, 2017

I was thinking about this a couple of weeks ago and couldn't find any references on it. I think it's a bit unclear whether you want a minimum or an additive constant. I think a Bayesian prior would be an additive constant, right? I guess in practice it doesn't make a lot of difference. Having a reference would be good, though.

@amueller
Member

amueller commented Sep 6, 2017

Actually, a Bayesian prior would pull it towards a specific value, not be an additive constant... That might be the most natural thing to implement, imho?
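
For reference, one way to make this remark concrete (a hedged sketch, not something pursued further in this thread): with the class mean known, a conjugate inverse-gamma prior on σ² gives a posterior mean that shrinks the sample variance toward the prior's value rather than adding a fixed constant.

```latex
% Sketch, assuming x_1,...,x_n ~ N(\mu, \sigma^2) with \mu known and an
% Inverse-Gamma(\alpha, \beta) prior on \sigma^2:
%   \sigma^2 \mid x_{1:n} \sim \mathrm{InvGamma}\!\left(\alpha + \tfrac{n}{2},\;
%       \beta + \tfrac{1}{2}\sum_{i=1}^{n}(x_i - \mu)^2\right)
\mathbb{E}\!\left[\sigma^2 \mid x_{1:n}\right]
  = \frac{\beta + \tfrac{1}{2}\sum_{i=1}^{n}(x_i - \mu)^2}{\alpha + \tfrac{n}{2} - 1}
% which interpolates between the prior mean \beta/(\alpha - 1) and the sample variance.
```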

@Mottl
Contributor

Mottl commented Sep 7, 2017

Adding ε (or setting a minimum variance) is necessary for a single purpose: to keep the Gaussian PDF computable, since σ appears in the denominator:

f(x) = 1 / (σ √(2π)) · exp(−(x − μ)² / (2σ²))

It is not obvious to me why we should add ε to the variances of all features instead of only those that produce a division-by-zero error.

So it seems to me that it is slightly better to use a minimum variance than to add ε to all variances, but the difference is negligible.
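
A small sketch of the failure mode being described (illustrative only, not the sklearn internals): a constant feature has zero variance, so the per-feature Gaussian log-density blows up unless the variance is smoothed.

```python
import numpy as np

x = np.array([3.0, 3.0, 3.0])    # a constant feature within one class
mu, var = x.mean(), x.var()      # var == 0.0

with np.errstate(divide="ignore", invalid="ignore"):
    log_pdf = -0.5 * (np.log(2 * np.pi * var) + (4.0 - mu) ** 2 / var)
print(log_pdf)                   # nan: -inf + inf without any epsilon

eps = 1e-9 * 1.0                 # e.g. var_smoothing * (a hypothetical max variance of 1.0)
log_pdf = -0.5 * (np.log(2 * np.pi * (var + eps)) + (4.0 - mu) ** 2 / (var + eps))
print(log_pdf)                   # finite (very negative) once the variance is smoothed
```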

@Mottl
Contributor

Mottl commented Sep 7, 2017

So far I can't find any difference between adding epsilon and setting a minimum variance when using naive Bayes. But I found that adding epsilon is more stable when used in non-naive Bayes classification:

Non-naive Bayes classification
==============================
MNIST handwritten digits
Train dataset: (1000, 784), test dataset: (1000, 784)
Score (epsilon=1e-09)       = 0.832
Score (min_variance=1e-09)  = 0.832
Score (epsilon=1e-08)       = 0.919
Score (min_variance=1e-08)  = 0.48
Score (epsilon=1e-07)       = 0.919
Score (min_variance=1e-07)  = 0.479
Score (epsilon=1e-06)       = 0.919
Score (min_variance=1e-06)  = 0.487
Score (epsilon=1e-05)       = 0.919
Score (min_variance=1e-05)  = 0.496
Score (epsilon=0.0001)      = 0.919
Score (min_variance=0.0001) = 0.588
Score (epsilon=0.001)       = 0.921
Score (min_variance=0.001)  = 0.668
Score (epsilon=0.01)        = 0.933
Score (min_variance=0.01)   = 0.729
Score (epsilon=0.1)         = 0.936
Score (min_variance=0.1)    = 0.488
Score (epsilon=1)           = 0.865
Score (min_variance=1)      = 0.928
Score (epsilon=10)          = 0.709
Score (min_variance=10)     = 0.87
Score (epsilon=100)         = 0.351
Score (min_variance=100)    = 0.204

@Mottl Mottl changed the title from [MRG] GaussianNB(): new parameter `min_variance` to [WIP] GaussianNB(): new parameter `min_variance` Sep 8, 2017

@Mottl Mottl changed the title from [WIP] GaussianNB(): new parameter `min_variance` to [WIP] GaussianNB(): new parameter `epsilon` Sep 8, 2017

@Mottl Mottl changed the title from [WIP] GaussianNB(): new parameter `epsilon` to [MRG] GaussianNB(): new parameter `epsilon` Sep 8, 2017

@amueller
Member

amueller commented Sep 8, 2017

What do you mean by non-naive Bayes?

@amueller
Member

amueller commented Sep 8, 2017

btw, this dataset is very non-Gaussian and therefore a very odd example. I think I would usually assume that people use the StandardScaler before GaussianNB and that the data is at least somewhat Gaussian.
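
For completeness, the usage pattern referred to here would look roughly like this (a sketch; X_train, y_train, X_test, y_test are placeholders):

```python
# Scale features before GaussianNB so they are on a comparable scale.
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.naive_bayes import GaussianNB

clf = make_pipeline(StandardScaler(), GaussianNB())
# clf.fit(X_train, y_train)
# clf.score(X_test, y_test)
```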

@Mottl
Contributor

Mottl commented Sep 8, 2017

Non-naive Bayes means using the full covariance matrix instead of just a variance vector in the multivariate Gaussian PDF (see scipy.stats.multivariate_normal.logpdf()).
I'm planning to add a non-naive Bayes class in another PR in a couple of days. That's why sharing the same epsilon parameter becomes somewhat better.
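
A rough sketch of that "non-naive" idea (illustrative only, not part of this PR): estimate one full covariance matrix per class and score with scipy.stats.multivariate_normal.logpdf.

```python
import numpy as np
from scipy.stats import multivariate_normal

def fit_full_cov_bayes(X, y):
    # Per-class mean, full covariance, and log prior.
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = (Xc.mean(axis=0),
                     np.cov(Xc, rowvar=False),
                     np.log(len(Xc) / len(X)))
    return params

def predict_full_cov_bayes(params, X):
    classes = sorted(params)
    # Log joint = class-conditional log density + log prior, per class.
    log_joint = np.column_stack([
        multivariate_normal.logpdf(X, mean=params[c][0], cov=params[c][1],
                                   allow_singular=True) + params[c][2]
        for c in classes
    ])
    return np.array(classes)[log_joint.argmax(axis=1)]
```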

@Mottl
Contributor

Mottl commented Sep 8, 2017

What dataset do you suggest for testing Naive Bayes?

@agramfort
Member

agramfort commented Sep 12, 2017

Non-naive Bayes with a full covariance matrix is quadratic discriminant analysis (aka QDA), which is already in sklearn.
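
For reference, the estimator mentioned here is available as follows (a minimal sketch; reg_param blends the per-class covariance estimates with the identity, playing a role similar to the smoothing discussed in this PR):

```python
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

qda = QuadraticDiscriminantAnalysis(reg_param=0.1)
# qda.fit(X_train, y_train)
# qda.score(X_test, y_test)
```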

@agramfort
Member

agramfort commented Sep 12, 2017

epsilon is too generic a name. Please try to improve the variable name and docstring. I am not opposed to exposing this internal epsilon parameter.

@Mottl
Contributor

Mottl commented Sep 12, 2017

Does smoothing look better than epsilon?

@agramfort
Member

agramfort commented Sep 13, 2017

@Mottl
Contributor

Mottl commented Sep 13, 2017

It is not a scaling parameter, but an additive one. And since 3de55d0 it is not a minimum variance but a value added to all variances.
var_smoothing?
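
A tiny sketch of the two behaviours being contrasted here (illustrative only, not the PR diff):

```python
import numpy as np

sigma = np.array([0.0, 1e-12, 0.5, 2.0])  # per-feature variances
eps = 1e-9 * sigma.max()                   # var_smoothing * largest variance

clipped  = np.maximum(sigma, eps)  # earlier "min_variance" behaviour: a floor
smoothed = sigma + eps             # behaviour since 3de55d0: add eps to every variance
```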

@agramfort
Member

agramfort commented Sep 14, 2017

Mottl added some commits Sep 15, 2017

@Mottl
Contributor

Mottl commented Sep 15, 2017

@agramfort
Done.

@agramfort agramfort changed the title from [MRG] GaussianNB(): new parameter `epsilon` to [MRG+1] GaussianNB(): new parameter `epsilon` Sep 17, 2017

@agramfort
Member

agramfort commented Sep 17, 2017

+1 for MRG

@jnothman

Looks great! Please add an entry to doc/whats_new, and we'll merge. Thanks!

@jnothman jnothman changed the title from [MRG+1] GaussianNB(): new parameter `epsilon` to [MRG+2] GaussianNB(): new parameter `epsilon` Sep 18, 2017

@lesteve
Member

lesteve commented Sep 18, 2017

I added the PR link in whats_new; this way it is easier to find details about the change.

@Mottl Mottl changed the title from [MRG+2] GaussianNB(): new parameter `epsilon` to [MRG+2] GaussianNB(): new parameter `var_smoothing` Sep 18, 2017

@lesteve
Member

lesteve commented Sep 18, 2017

This one should be merged when the CIs are green.

@Mottl
Contributor

Mottl commented Sep 18, 2017

Thank you all

@lesteve
Member

lesteve commented Sep 18, 2017

Merging, thanks a lot @Mottl!

@lesteve lesteve merged commit 08b524d into scikit-learn:master Sep 18, 2017

6 checks passed

ci/circleci Your tests passed on CircleCI!
codecov/patch 100% of diff hit (target 96.17%)
codecov/project 96.17% (+<.01%) compared to 2dd1574
continuous-integration/appveyor/pr AppVeyor build succeeded
continuous-integration/travis-ci/pr The Travis CI build passed
lgtm analysis: Python No alert changes

@Mottl Mottl deleted the Mottl:native_bayes_epsilon branch Sep 18, 2017

maskani-moh added a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017

jwjohnson314 pushed a commit to jwjohnson314/scikit-learn that referenced this pull request Dec 18, 2017
