# [MRG] Generative Classification #2468

Open · wants to merge 20 commits · +655 −6

## Conversation

Member

### jakevdp commented Sep 21, 2013

This PR adds a simple meta-estimator which accepts any generative model (normal approximation, GMM, KernelDensity, etc.) and uses it to construct a generative Bayesian classifier.

Todo:

- code documentation
- narrative docs
- testing
- examples
- allow class-wise cross validation for the density model?
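The core idea can be sketched in a few lines (an illustration only, not the PR's actual code): fit one density model per class, then classify a point by the largest posterior `P(y) * P(x | y)`. Here the per-class "density model" is a 1-D Gaussian, i.e. the normal-approximation option; swapping in KDE or a GMM is exactly what the meta-estimator generalizes.

```python
# Minimal sketch of generative Bayesian classification (not the PR's code):
# fit one density model per class, classify by largest log P(y) + log P(x|y).
import math

def fit_generative(X, y):
    """Return, per class, a (prior, mean, variance) Gaussian model."""
    models = {}
    for c in sorted(set(y)):
        xs = [x for x, label in zip(X, y) if label == c]
        mean = sum(xs) / len(xs)
        var = sum((x - mean) ** 2 for x in xs) / len(xs)
        models[c] = (len(xs) / len(X), mean, var)
    return models

def log_gaussian(x, mean, var):
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def predict(models, x):
    # argmax over classes of log P(y) + log P(x | y)
    return max(models, key=lambda c: math.log(models[c][0])
               + log_gaussian(x, models[c][1], models[c][2]))

X = [0.9, 1.1, 1.0, 4.9, 5.1, 5.0]
y = [0, 0, 0, 1, 1, 1]
models = fit_generative(X, y)
print(predict(models, 1.2), predict(models, 4.7))
```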
- jakevdp: basic generative classification framework (6e89ddd)
Owner

### ogrisel commented Sep 22, 2013

I think to make the discussion more fruitful it would be great to provide some examples on datasets where such models are actually useful, either from a pure classification performance point of view, or more likely as samplers to generate new labeled samples for specific classes (a bit like you did with this KDE sampling example for digits).
Member

### jakevdp commented Sep 22, 2013

> ...or more likely as samplers to generate new labeled samples for specific classes

Ah, I hadn't even thought of that possibility! Yes, we could implement a sample routine, which would use the underlying models to return a random set of new observations fitting the training data. Great idea! I'll work on some examples soon to make the utility of this approach more clear.
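Such a `sample` routine could look roughly like this (a hedged sketch; the names and signature are invented, and per-class Gaussians stand in for the fitted density models): draw a class according to the class priors, then draw a point from that class's model.

```python
# Hypothetical sketch of the proposed `sample` routine: pick a class
# according to the class priors, then draw from that class's fitted
# density model. `models` maps class -> (prior, mean, variance) for a
# Gaussian stand-in; nothing here is the PR's actual API.
import random

def sample(models, n_samples, seed=0):
    rng = random.Random(seed)
    classes = list(models)
    priors = [models[c][0] for c in classes]
    out = []
    for _ in range(n_samples):
        c = rng.choices(classes, weights=priors)[0]
        prior, mean, var = models[c]
        out.append((rng.gauss(mean, var ** 0.5), c))  # (new point, label)
    return out

demo_models = {0: (0.5, 0.0, 1.0), 1: (0.5, 10.0, 1.0)}
for x, label in sample(demo_models, 5):
    print(round(x, 2), label)
```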

### jakevdp added some commits Sep 22, 2013

- jakevdp: don't duplicate BaseNB methods in GenerativeBayes (2100f1d)
- jakevdp: add GenerativeBayes tests (d530d27)
Member

### jakevdp commented Sep 22, 2013

 I added doc strings and tests. An incompatibility came up in the case of GMM: I opened an issue at #2473.

### jakevdp added some commits Sep 22, 2013

- jakevdp: add test for generative sampling (bebdf97)
- jakevdp: make GenerativeBayes cloneable (0b294de)

### mblondel and 1 other commented on an outdated diff Sep 23, 2013

sklearn/generative.py
```python
        additional keyword arguments to be passed to the constructor
        specified by density_estimator.
    """
    def __init__(self, density_estimator, **kwargs):
        self.density_estimator = density_estimator
        self.kwargs = kwargs

        # run this here to check for any exceptions; we avoid assigning
        # the result here so that the estimator can be cloned.
        self._choose_estimator(density_estimator, **kwargs)

    def _choose_estimator(self, density_estimator, **kwargs):
        if isinstance(density_estimator, str):
            dclass = MODEL_TYPES.get(density_estimator)
            return dclass(**kwargs)
        elif isinstance(density_estimator, type):
```

#### mblondel Sep 23, 2013

Owner

Looks like `type` is undefined.

Owner

it's a builtin

Owner

### larsmans commented Sep 25, 2013

 I don't think this should be combined with Naive Bayes, except in the docs. The charm of Naive Bayes lies in its speed and simple code, no need to mess with that.

### larsmans and 1 other commented on an outdated diff Sep 25, 2013

sklearn/generative.py
```python
"""
Bayesian Generative Classification
==================================

This module contains routines for general Bayesian generative classification.
Perhaps the best-known instance of generative classification is the Naive
Bayes Classifier, in which the distribution of each training class is
approximated by an axis-aligned multi-dimensional normal distribution, and
unknown points are evaluated by comparing their posterior probability under
each model.
```

#### larsmans Sep 25, 2013

Owner

There's no such thing as "the" Naive Bayes classifier. You're thinking of Gaussian NB, but NLP people will variously think of Bernoulli or multinomial NB. (The first time I encountered the Gaussian variant was while reading sklearn source code :)

#### jakevdp Sep 25, 2013

Member

Interesting! The first time I encountered any version other than Gaussian NB was reading the sklearn source code 😄

Owner

### larsmans commented Sep 25, 2013

 On second thought: @jakevdp, is it too much of a stretch to merge this thing into the naive_bayes module? I guess it's not really "naive" in the NB sense, but it would remove some clutter in the top-level module. Also, take a look at the NB narrative docs, which explain pretty much the same thing that you're explaining in the module docstring.
Member

### jakevdp commented Sep 25, 2013

I initially thought about putting this within sklearn.naive_bayes (given that it inherits from BaseNB!) but didn't because, though it is Bayesian, it's distinctly not Naive in the sense that the term is used. If we could start over, it would make more sense to have a submodule for generative classification of which naive Bayes is a part, rather than the other way around. But given that we've made the API choice to have a naive_bayes submodule, I thought it would be less confusing to put general generative classification in its own module. Regardless of where the code goes, I had envisioned combining the narrative documentation for the two: as you mention, we can adapt the theoretical background currently under the heading of Naive Bayes and show how it applies in both the naive and the general case.
Owner

### larsmans commented Sep 25, 2013

> it would make more sense to have a submodule for generative classification of which naive bayes is a part

True, but I've seen other people's production codebases that depend on MultinomialNB being in naive_bayes.py, and I'd have some explaining to do if we broke that :p

Combining the narratives was mainly what I was aiming at. It's your call to decide if it fits well enough to also combine the code.

(FYI, I see you're using BaseNB. I've been thinking about killing that, because MultinomialNB and BernoulliNB can be implemented more straightforwardly as pure linear models, sharing no code with GaussianNB.)

### jakevdp added some commits Nov 13, 2013

- jakevdp: Bug fix: generative normalization (404c8b3)
- jakevdp: move GenerativeBayes to naive_bayes module (8d390cf)
Member

### jakevdp commented Nov 23, 2013

 @larsmans - I ended up following your advice and moving everything into the naive_bayes submodule. That location might be a bit misleading, but I think it is cleaner. Still some tests failing... I'm going to try to fix those.

### jakevdp added some commits Nov 23, 2013

- jakevdp: change GenerativeBayes to be cloneable (a029445)
- jakevdp: fix NormalApproximation normalization (ead86cd)
- jakevdp: DOC: add 1D generative classification example (714e4c3)
- jakevdp: DOC: add generative classification sampling example (fb49106)
- jakevdp: DOC: document GenerativeBayes (ed3ac25)
- jakevdp: TST: fix GenerativeBayes to pass common tests (868f650)
Member

### jakevdp commented Dec 10, 2013

 I think this is pretty close to finished now. I added narrative documentation, examples, and the tests should pass. One missing feature that would be really helpful would be the ability to do class-wise cross-validation of the density estimators within GenerativeBayes. I'm not sure what the right interface would be for that, however... any ideas?
Member

### jakevdp commented Dec 10, 2013

 Hmm... is there any way that program state can affect the results of cross_val_score? It fails here, but passes on my machine, and passes when I run the code alone. There doesn't seem to be any random element that would affect it... that's really strange.
- jakevdp: Merge remote-tracking branch 'upstream/master' into generative_class (90cd1a6)
Member

### jakevdp commented Dec 10, 2013

 Ah - looks like it was something that had changed in master. I'll adjust the tests so that they will pass.
- jakevdp: TST: modify GenerativeBayes tests to pass (061f3fb)

### coveralls commented Dec 10, 2013

 Coverage remained the same when pulling 061f3fb on jakevdp:generative_class into ffde690 on scikit-learn:master.
Member

### jakevdp commented Dec 10, 2013

 Changing status to MRG: I think this is ready for a final review, unless we want to add class-wise cross-validation at this time.

### ogrisel commented on an outdated diff Dec 11, 2013

doc/modules/naive_bayes.rst
```rst
density model to each category to estimate :math:`P(x_i \mid y)`. Some
examples of more flexible density models are:

- :class:`sklearn.neighbors.KernelDensity`: discussed in :ref:`kernel_density`
- :class:`sklearn.mixture.GMM`: discussed in :ref:`clustering`

Though it can be much more computationally intense,
using one of these models rather than a naive Gaussian model can lead to much
better generative classifiers, and can be especially applicable in cases of
unbalanced data where accurate posterior classification probabilities are
desired.

.. figure:: ../auto_examples/images/plot_1d_generative_classification_1.png
   :target: ../auto_examples/plot_1d_generative_classification.html
   :align: center
   :scale: 50%

   from the training data
```

#### ogrisel Dec 11, 2013

Owner

from the training data?

### ogrisel commented on an outdated diff Dec 11, 2013

doc/modules/naive_bayes.rst
```rst
    >>> from sklearn.datasets import make_blobs
    >>> X, y = make_blobs(10, centers=2, random_state=0)
    >>> clf = GenerativeBayes(density_estimator='kde')
    >>> clf.fit(X, y)
    >>> clf.predict(X)
    array([0, 1, 0, 1, 1, 0, 1, 0, 0, 1])
    >>> y
    array([0, 1, 0, 1, 1, 0, 1, 0, 0, 1])

The KDE-based Generative classifier for this problem has 100% accuracy on
the training data.
The specified density estimator can be 'kde', 'gmm', 'norm_approx',
or a custom class which has the same semantics as
:class:`sklearn.neighbors.KernelDensity` (see the documentation of
:class:`GenerativeBayes` for details).
```

#### ogrisel Dec 11, 2013

Owner

Your explanation is clear, but I think it would be great if you could find a good online reference from the literature for people who want to dig further.

### ogrisel commented on an outdated diff Dec 11, 2013

sklearn/naive_bayes.py
```python
                 'gmm': GMM,
                 'kde': KernelDensity}


class GenerativeBayes(BaseNB):
    """
    Generative Bayes Classifier

    This is a meta-estimator which performs generative Bayesian classification
    using flexible underlying density models.

    Parameters
    ----------
    density_estimator : str, class, or instance
        The density estimator to use for each class. Options are
        'norm_approx' : Normal Approximation (i.e. naive Bayes)
```

#### ogrisel Dec 11, 2013

Owner

I think using 'normal_approximation' would be a more explicit name. If you do the change, don't forget to update the narrative doc.

#### ogrisel Dec 11, 2013

Owner

Also I would be more explicit by replacing: "Normal Approximation (i.e. naive Bayes)" by "Axis-aligned Normal Approximation (i.e. Gaussian naive Bayes)"

#### ogrisel Dec 11, 2013

Owner

If 'normal_approximation' is too long, at least 'normal_approx' instead of norm_approx which I find too confusing.

Owner

### ogrisel commented Dec 11, 2013

 What about the "allow class-wise cross validation for the density model" item in your todo list?
Owner

### ogrisel commented Dec 11, 2013

Is GenerativeBayes(density_estimator="norm_approx") strictly equivalent to GaussianNB (speed, public API including fitted attributes)? If so, why not mark GaussianNB as deprecated in favor of GenerativeBayes(density_estimator="norm_approx")?
Member

### jakevdp commented Dec 11, 2013

Hi @ogrisel - thanks for the comments. A few responses:

- I'll change norm_approx to normal_approximation and update the docs.
- Regarding cross-validation: I've been thinking about the most intuitive way to do this. We could build the functionality into GenerativeBayes, but then it wouldn't be available outside the class. It might be more useful in the long run to create a new DensityEstimatorMixin class, similar to ClassifierMixin and RegressorMixin, which would contain some sort of cross-validation tool (as well as computing score from score_samples, and other generally applicable methods). Then any estimator which works in GenerativeBayes could inherit from this, and use the cross-validation from there.
- Regarding deprecating GaussianNB: I hadn't considered this, primarily because I thought the new tool would be slower for the Gaussian NB case. I initially added normal_approximation just for ease of testing against GaussianNB. But I did some benchmarks, and the new method seems to be marginally faster than GaussianNB, and it returns the same results by construction. Given that, deprecation might be worth considering.
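The mixin idea might look something like this (all names here are hypothetical, and a toy uniform density stands in for KDE/GMM): generic methods such as a `score` built from `score_samples`, which any density model could inherit, analogous to ClassifierMixin and RegressorMixin.

```python
# Hypothetical sketch of the DensityEstimatorMixin idea (names invented
# here, not from the PR): generic methods derived from score_samples that
# any density model could inherit.
import math

class DensityEstimatorMixin:
    def score(self, X):
        # total log-likelihood of X under the fitted model
        return sum(self.score_samples(X))

class UniformDensity(DensityEstimatorMixin):
    """Toy density model: uniform on [low, high]."""
    def fit(self, X):
        self.low, self.high = min(X), max(X)
        return self

    def score_samples(self, X):
        width = self.high - self.low
        return [math.log(1.0 / width) if self.low <= x <= self.high
                else float("-inf") for x in X]

model = UniformDensity().fit([0.0, 2.0, 4.0])
print(model.score([1.0, 3.0]))  # 2 * log(1/4)
```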
Member

### jakevdp commented Dec 11, 2013

 There is one difference between GenerativeBayes('normal_approximation') and GaussianNB, however: GenerativeBayes doesn't have sigma_ and theta_ attributes exposed. We could do this with some model introspection, however...
Owner

### ogrisel commented Dec 11, 2013

How much faster is "marginally faster"? Is this a fixed ratio, or does it depend on n_samples / n_features?
Owner

### ogrisel commented Dec 11, 2013

 I am not sure what you mean by class wise CV but I agree this can always be tackled in another PR later.
Member

### jakevdp commented Dec 11, 2013

By class-wise CV I mean this: the GenerativeBayes classifier fits a density estimator (i.e. normal approximation, KDE, GMM, etc.) to the distribution of training points for each class. That is, for data with three classes, it fits KDE three times to subsets of the data. Currently, you have to choose the same hyperparameters for each, which is not optimal. It would be best to do separate cross-validation on each of the three density estimators. This is what I mean by class-wise CV: I'm not sure what the best interface is for something like this.
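A concrete (and purely illustrative, not the PR's API) picture of why per-class tuning matters: tune a KDE bandwidth separately for each class by leave-one-out log-likelihood of that class's own points. A tight class and a spread-out class end up preferring very different bandwidths.

```python
# Illustration of class-wise CV (not the PR's API): for each class, pick a
# KDE bandwidth by leave-one-out log-likelihood of that class's own points.
import math

def kde_logpdf(x, train, h):
    # Gaussian kernel density estimate at x, bandwidth h
    s = sum(math.exp(-0.5 * ((x - t) / h) ** 2) for t in train)
    return math.log(s / (len(train) * h * math.sqrt(2 * math.pi)) + 1e-300)

def loo_score(points, h):
    # leave-one-out log-likelihood of the class's points for bandwidth h
    return sum(kde_logpdf(p, points[:i] + points[i + 1:], h)
               for i, p in enumerate(points))

X = [0.0, 1.0, 2.0, 3.0, 4.0, 9.0, 9.1, 9.2, 9.3]  # spread class 0, tight class 1
y = [0, 0, 0, 0, 0, 1, 1, 1, 1]
grid = [0.05, 0.1, 0.5, 1.0, 3.0]

best = {}
for c in set(y):
    pts = [x for x, lab in zip(X, y) if lab == c]
    best[c] = max(grid, key=lambda h: loo_score(pts, h))
print(best)  # the tight class prefers a much narrower bandwidth
```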
 jakevdp  Address ogrisel comments in GenerativeBayes  476f3ba
Member

### jakevdp commented Dec 11, 2013

Hi, I addressed all of @ogrisel's comments. Regarding the benchmarks, it seems that GenerativeBayes is ~10 percent slower for small problems, and ~10 percent faster for large problems (see the timings in this notebook). The speed difference in the small case likely comes from the overhead of creating multiple classes in GenerativeBayes. The speed difference in the large case likely comes from the fact that GenerativeBayes constructs each masked array only once, while GaussianNB, as currently written, constructs it twice (this is silly and should be fixed regardless: see line 170).

### coveralls commented Dec 11, 2013

 Coverage remained the same when pulling 476f3ba on jakevdp:generative_class into ffde690 on scikit-learn:master.
Member

### jakevdp commented Dec 11, 2013

 Fixed the GaussianNB thing in #2659. Once it's merged I'll re-do the benchmark script.

### mblondel and 1 other commented on an outdated diff Dec 12, 2013

sklearn/naive_bayes.py
```python
            Training data. shape = [n_samples, n_features]

        y : array-like
            Target values, array of float values, shape = [n_samples]
        """
        X, y = check_arrays(X, y, sparse_format='dense')
        y = column_or_1d(y, warn=True)

        estimator = self._choose_estimator(self.density_estimator,
                                           self.model_kwds)

        self.classes_ = np.sort(np.unique(y))
        n_classes = len(self.classes_)
        n_samples, self.n_features_ = X.shape

        masks = [(y == c) for c in self.classes_]
```

#### mblondel Dec 12, 2013

Owner

You could use LabelBinarizer or label_binarize from the preprocessing module.

#### jakevdp Dec 12, 2013

Member

The output of label_binarize would have to be converted to boolean, though... I'm not sure that would be either more efficient or more readable. What do you think?

#### mblondel Dec 12, 2013

Owner

Indeed, you're right. I guess a comment like "class membership masks" would help understanding.

### mblondel commented on an outdated diff Dec 12, 2013

sklearn/naive_bayes.py
```python
    Parameters
    ----------
    density_estimator : str, class, or instance
        The density estimator to use for each class. Options are

        - 'normal_approximation' : Axis-aligned Normal Approximation
          (i.e. Gaussian Naive Bayes)
        - 'gmm' : Gaussian Mixture Model
        - 'kde' : Kernel Density Estimate

        The default is 'normal_approximation'.
        Alternatively, a class or class instance can be specified. The
        instantiated class should be a sklearn estimator, and contain a
        score_samples method with semantics similar to those in
        :class:`sklearn.neighbors.KDE`.

    model_kwds : dict or None
        Additional keyword arguments to be passed to the constructor
        specified by density_estimator. Default=None.
```

#### mblondel Dec 12, 2013

Owner

Could you document the fitted attributes?

### mblondel commented on an outdated diff Dec 12, 2013

sklearn/naive_bayes.py
```python

DENSITY_MODELS = {'normal_approximation': _NormalApproximation,
                  'gmm': GMM,
                  'kde': KernelDensity}


class GenerativeBayes(BaseNB):
    """
    Generative Bayes Classifier

    This is a meta-estimator which performs generative Bayesian classification
    using flexible underlying density models.

    Parameters
    ----------
    density_estimator : str, class, or instance
```

#### mblondel Dec 12, 2013

Owner

Do you need to support classes? I think the rest of the scikit usually only supports instances. If there's no compelling reason, I'd rather remove the feature so as to not create any inconsistencies in user code.

Owner

### mblondel commented Dec 12, 2013

The user guide is really nice. I'm totally sold!

### mblondel commented on an outdated diff Dec 12, 2013

doc/modules/naive_bayes.rst
```rst
This type of classification can be performed with the :class:`GenerativeBayes`
estimator. The estimator can be used very easily:

    >>> from sklearn.naive_bayes import GenerativeBayes
    >>> from sklearn.datasets import make_blobs
    >>> X, y = make_blobs(10, centers=2, random_state=0)
    >>> clf = GenerativeBayes(density_estimator='kde')
    >>> clf.fit(X, y)
    GenerativeBayes(density_estimator='kde', model_kwds=None)
    >>> clf.predict(X)
    array([0, 1, 0, 1, 1, 0, 1, 0, 0, 1])
    >>> y
    array([0, 1, 0, 1, 1, 0, 1, 0, 0, 1])

The KDE-based Generative classifier for this problem has 100% accuracy on
the training data.
```

#### mblondel Dec 12, 2013

Owner

I'd rather use test data if possible (achieving 100% accuracy on training data is not necessarily a good sign). Also, could you say a few words on how to prevent overfitting? For example, when using GMM as the base density estimator, n_components should not be set too high.

Member

### jakevdp added some commits Dec 12, 2013

- jakevdp: address mblondel's comments (b7295f6)
- jakevdp: clarification: change masks to class_membership_masks (4d23183)

### coveralls commented Dec 12, 2013

 Coverage remained the same when pulling 4d23183 on jakevdp:generative_class into ffde690 on scikit-learn:master.

Member

### jakevdp commented Dec 12, 2013

Addressed @mblondel's comments, except for the suggestion to add a note about over-fitting. I'm realizing that this really shouldn't be considered complete without a way to cross-validate the density model for each class. A few ideas for how to approach this:

1. Build cross-validation machinery into GenerativeBayes. Advantage: simple and straightforward. Disadvantage: people might want the functionality outside the classifier.
2. Create a DensityEstimator mixin that contains a cross-validation routine for each density estimator. Advantage: the automated cross-validation could then be used outside GenerativeBayes. Disadvantage: perhaps confusing? Not all estimators have CV built-in.
3. Expose per-class estimator attributes, in much the same way that Pipeline objects expose the underlying attributes of their steps (for example, with three KDE estimators, you might allow passing bandwidth as an array, which will be spread among the estimators). Advantage: this would allow the cross-validation to be performed by the user, which is more typical of the scikit-learn interface. Disadvantage: the final classification score is not the right metric for the underlying estimators... you'd end up having to hack the score function and perform multiple grid searches by hand to do it correctly.

I'd love to hear any thoughts you have on this: the best path is not entirely apparent to me. Are there any other meta-estimators in the package where underlying estimators are cross-validated independently?
Owner

### mblondel commented Dec 12, 2013

 In practice, do you observe much better performance by tuning the parameters for each class?
Member

### jakevdp commented Dec 12, 2013

> In practice, do you observe much better performance by tuning the parameters for each class?

I actually haven't tried that in particular, but I'm anticipating such a request from users! I think in an extremely unbalanced problem, it would probably make a difference.
- jakevdp: Merge remote-tracking branch 'upstream/master' into generative_class (811d4b0)
Owner

### ogrisel commented Dec 12, 2013

I am not sure I fully understand the tradeoff. I think I need to see some code for the CV of such nested models to better grasp it and give you feedback. Maybe you could implement:

> 1. Build cross-validation machinery into GenerativeBayes. Advantage: simple and straightforward. Disadvantage: people might want the functionality outside the classifier.

as a start, and we can then discuss whether we should remove it or refactor it into one of the other 2 options?
Member

### jakevdp commented Dec 12, 2013

> In practice, do you observe much better performance by tuning the parameters for each class?

I haven't actually checked this, but I'd imagine that in the case of an unbalanced dataset, it could make a difference.
Owner

### ogrisel commented Dec 12, 2013

I get the following error when running the example to build the doc:

```
Traceback (most recent call last):
  File "examples/plot_1d_generative_classification.py", line 56, in <module>
    clf = GenerativeBayes(density_estimator=density_estimators[i])
  File "/Users/ogrisel/code/scikit-learn/sklearn/naive_bayes.py", line 710, in __init__
    self._choose_estimator(density_estimator, self.model_kwds)
  File "/Users/ogrisel/code/scikit-learn/sklearn/naive_bayes.py", line 722, in _choose_estimator
    raise ValueError('invalid density_estimator')
ValueError: invalid density_estimator
```

The error message should be more explicit and include the name of the passed estimator (or its str representation) and the reason why it's not valid. In this case we are passing an instance and it looks up a class. I guess this code needs to be updated, and a test needs to be added for that invalid-input check.

### ogrisel commented on an outdated diff Dec 12, 2013

sklearn/naive_bayes.py
```python

        # run this here to check for any exceptions; we avoid assigning
        # the result here so that the estimator can be cloned.
        self._choose_estimator(density_estimator, self.model_kwds)

    def _choose_estimator(self, density_estimator, kwargs=None):
        """Choose the estimator based on the input"""
        dclass = DENSITY_MODELS.get(density_estimator)

        if dclass is not None:
            if kwargs is None:
                kwargs = {}
            density_estimator = dclass(**kwargs)

        if not hasattr(dclass, 'score_samples'):
            raise ValueError('invalid density_estimator')
```

#### ogrisel Dec 12, 2013

Owner

This should better be:

```python
        if not hasattr(density_estimator, 'score_samples'):
            raise TypeError('Invalid density_estimator: %s.'
                            ' Missing required score_samples method.'
                            % density_estimator)
```

### ogrisel commented on an outdated diff Dec 12, 2013

doc/modules/naive_bayes.rst
```rst

    >>> from sklearn.naive_bayes import GenerativeBayes
    >>> from sklearn.datasets import make_blobs
    >>> X, y = make_blobs(100, centers=2, random_state=0)
    >>> clf = GenerativeBayes(density_estimator='kde')
    >>> clf.fit(X[:-10], y[:-10])
    GenerativeBayes(density_estimator='kde', model_kwds=None)
    >>> clf.predict(X[-10:])
    array([1, 1, 1, 1, 0, 0, 1, 1, 0, 1])
    >>> y[-10:]
    array([1, 1, 1, 1, 0, 0, 1, 1, 0, 1])

The KDE-based Generative classifier for this problem has 100% accuracy on
this small subset of test data.
The specified density estimator can be 'kde', 'gmm',
'normal_approximation', or any class or estimator
```

#### ogrisel Dec 12, 2013

Owner

"any class or estimator" => "any estimator" if we drop the class support.

### ogrisel commented on an outdated diff Dec 12, 2013

doc/modules/naive_bayes.rst
```rst
points drawn from the model.

This type of generative model can be used in higher dimensions to do some
very interesting analysis. For example, here's a generative bayes model
which uses kernel density estimation trained on the digits dataset. The
top panel shows a selection of the input digits, while the bottom panel
shows draws from the class-wise probability distributions. These give an
intuitive feel to what the model "thinks" each digit looks like:

.. figure:: ../auto_examples/images/plot_generative_sampling_2.png
   :target: ../auto_examples/plot_generative_sampling.html
   :align: center
   :scale: 50%

This result can be compared to the
`similar figure <../auto_examples/neighbors/plot_digits_kde_sampling.html`_
```

#### ogrisel Dec 12, 2013

Owner

Missing ">" before the "`_".

- jakevdp: address ogrisel's comments in GenerativeBayes (3f8666a)
Member

### jakevdp commented Dec 12, 2013

 Thanks @ogrisel. I've addressed all your comments. Regarding the CV issue: I think the first-order solution is to simply expose the estimator parameters using the get_params machinery in BaseEstimator. We can internally label the estimators, e.g. "est1", "est2", so that the fit parameters would become est1__paramname, est2__paramname, etc. This would be a quick addition, and allow the usual cross-validation tools to have access to the parameters.
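The naming scheme being proposed here mirrors Pipeline's `step__param` convention. A toy sketch (all class and label names are hypothetical, not the PR's code) of flattening per-class sub-estimator parameters:

```python
# Toy sketch of the proposed naming scheme (hypothetical, mirroring the
# Pipeline `step__param` convention): sub-estimators labelled "est0",
# "est1", ... expose their hyperparameters as e.g. "est0__bandwidth", so
# generic grid-search tooling could address each class's model separately.
class ToyKDE:
    def __init__(self, bandwidth=1.0):
        self.bandwidth = bandwidth

class ToyMeta:
    def __init__(self, sub_estimators):
        self.sub_estimators = sub_estimators  # one density model per class

    def get_params(self):
        return {"est%d__%s" % (i, name): value
                for i, est in enumerate(self.sub_estimators)
                for name, value in vars(est).items()}

    def set_params(self, **kwargs):
        for key, value in kwargs.items():
            label, _, name = key.partition("__")
            setattr(self.sub_estimators[int(label[3:])], name, value)
        return self

meta = ToyMeta([ToyKDE(), ToyKDE()])
meta.set_params(est0__bandwidth=0.1, est1__bandwidth=2.0)
print(meta.get_params())
```

The catch, raised in the comments that follow, is that the real per-class estimators only exist after fit has seen the data.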

### coveralls commented Dec 12, 2013

 Coverage remained the same when pulling 3f8666a on jakevdp:generative_class into aa8139b on scikit-learn:master.
Owner

### ogrisel commented Dec 16, 2013

> Regarding the CV issue: I think the first-order solution is to simply expose the estimator parameters using the get_params machinery in BaseEstimator. We can internally label the estimators, e.g. "est1", "est2", so that the fit parameters would become est1__paramname, est2__paramname, etc. This would be a quick addition, and allow the usual cross-validation tools to have access to the parameters.

I am not sure that will work, as the number of sub-estimators depends on the number of classes. The list of sub-estimators in the estimators_ attribute is therefore only generated once we see the data in fit, so as to be able to extract the number of classes or features from the data shape. On the other hand, the grid search tooling manipulates the model and its parameters independently of the data, in particular prior to any call to fit. Hence we have a design mismatch.

Maybe it would be possible to hack get/set_params to store the sub-estimator parameters on the GenerativeBayes object itself and delay the recursive call to set_params on the sub-estimator objects until fit time.
Member

### jakevdp commented Dec 16, 2013

Yes, I ran into that mismatch when I gave this strategy a shot. I'll think about your idea of hacking get/set_params, but I'm starting to think that just providing a CV tool within GenerativeBayes itself might be the answer.
Owner

### ogrisel commented Dec 18, 2013

 That might indeed be a better way. Note however that we have a similar issue for multi-class or multi-label classifiers that implement the OvR strategy by combining n_classes binary classifiers. It is possible that having per-classifier hyperparameter tuning (e.g. regularizer strength) would be beneficial for the overall performance of the model. @mblondel @pprett might want to pitch-in.
Owner

### mblondel commented Dec 18, 2013

 I don't have any experience with tuning each binary classifier separately. One concern I have is that each binary classifier may produce predictions with different scales (e.g. one with predictions in [-1, 1], another one with predictions in [-5, 5]) and thus the argmax rule might not work at all. In any case, this is a combinatorial search and thus randomized search seems the way to go.

### jgbos commented Jan 15, 2014

Hey guys, I hope I'm not just wasting space in your inbox. I've tried to follow this discussion, but wanted to provide a couple notes from a user. I have utilized GMM classifiers in the past, and I've also started playing with this commit to see the results using a GMM. One big feature needed here is the capability of tuning the number of components, n_components, for each class. I saw Jake was concerned with which features users would be interested in having; this is a biggie for people who use this type of classifier. It definitely impacts performance. Unfortunately I cannot provide you an example of a dataset (company policy).
Member

### jakevdp commented Jan 15, 2014

Thanks @jgbos - I agree that individually tuning hyperparameters is a vital feature of this. I'm still trying to figure out the best way to approach that, though (and I haven't had much time to work on this lately).


### ngaloppo commented Jan 14, 2016

 Is there any chance that there would be some progress on this PR, or is it buried forever? I understand that we are hung up on the last TODO item. I'm wondering if we can come to a solution that does not require the ability to do class-wise cross validation for the density model?

### agramfort commented on the diff Jan 14, 2016

doc/modules/naive_bayes.rst
```rst
Non-naive Bayes
---------------

As mentioned above, naive Bayesian methods are generally very fast, but often
inaccurate estimators. This can be addressed by relaxing the assumptions that
make the models naive, so that more accurate classifications are possible.

If we return to the general formalism outlined above, we can see that the
generic model for Bayesian classification is:

.. math::

   \hat{y} = \arg\max_y P(y) \prod_{i=1}^{n} P(x_i \mid y).

This model only becomes "naive" when we introduce certain assumptions about
the form of :math:`P(x_i \mid y)`, e.g. that each class is drawn from an
axis-aligned normal distribution (the assumption for Gaussian Naive Bayes).
```

#### agramfort Jan 14, 2016

Owner

What makes the model naive is that you assume conditional independence of the features. I find this paragraph unclear.

### agramfort commented on the diff Jan 14, 2016

doc/modules/naive_bayes.rst
```rst
inaccurate estimators. This can be addressed by relaxing the assumptions that
make the models naive, so that more accurate classifications are possible.

If we return to the general formalism outlined above, we can see that the
generic model for Bayesian classification is:

.. math::

   \hat{y} = \arg\max_y P(y) \prod_{i=1}^{n} P(x_i \mid y).

This model only becomes "naive" when we introduce certain assumptions about
the form of :math:`P(x_i \mid y)`, e.g. that each class is drawn from an
axis-aligned normal distribution (the assumption for Gaussian Naive Bayes).

However, assumptions like these are in no way required for generative
Bayesian classification formalism: we can equally well fit any suitable
density model to each category to estimate :math:`P(x_i \mid y)`. Some
```

#### agramfort Jan 14, 2016

Owner

This gives the impression that your code estimates a KDE/GMM for each feature, but you actually estimate `P(x | y)`.

Note that this can be problematic in high dimension (KDE has issues in high dim). A middle ground could be to also support a KDE/GMM for each feature, i.e. keep the naive independence assumption. This could be done with an option.

Owner

### agramfort commented Jan 14, 2016

 really cool examples :) @jakevdp you'll need to rebase

### danielravina commented Jun 25, 2016

 @jakevdp just wondering, will you merge this anytime soon?
Owner

### agramfort commented Jun 26, 2016

 @danielravina I am not sure @jakevdp has time to finish this. Please take over if you want and see my comments.
Member

### jakevdp commented Jun 26, 2016

 Probably will not be finishing this myself. The main reason I never finished the PR is that I never really figured out how to deal cleanly with per-class hyperparameters.

### jengelman commented Jul 19, 2017

@danielravina @jakevdp Did either of you or anyone else end up picking this back up? I would be interested in working on this if not.
Member

### jmschrei commented Jul 19, 2017

 This PR is actually fairly similar to the BayesClassifier / NaiveBayes classifiers in pomegranate (see tutorial here: https://github.com/jmschrei/pomegranate/blob/master/tutorials/Tutorial_5_Bayes_Classifiers.ipynb). If you pick this up I'd be happy to review it, but be sure to read the above discussion thoroughly to understand what the stalling issues were.