[MRG] Generative Classification #2468
Conversation
I think to make the discussion more fruitful it would be great to provide some examples on datasets where such models are actually useful, either from a pure classification performance point of view or, more likely, as samplers to generate new labeled samples for specific classes (a bit like you did with the KDE sampling example for digits).
Ah, I hadn't even thought of that possibility! Yes, we could implement a sampling method. I'll work on some examples soon to make the utility of this approach more clear.
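For concreteness, a minimal sketch of the per-class sampling idea using today's scikit-learn API (the bandwidth value and the digits dataset are illustrative; KernelDensity.sample works for the default Gaussian kernel):

    import numpy as np
    from sklearn.datasets import load_digits
    from sklearn.neighbors import KernelDensity

    X, y = load_digits(return_X_y=True)

    # Fit one density model per class...
    models = {}
    for label in np.unique(y):
        models[label] = KernelDensity(bandwidth=3.0).fit(X[y == label])

    # ...then draw new labeled samples from any class we like.
    new_threes = models[3].sample(n_samples=10, random_state=0)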
I added doc strings and tests. An incompatibility came up in the case of GMM: I opened an issue at #2473.
sklearn/generative.py
if isinstance(density_estimator, str):
    dclass = MODEL_TYPES.get(density_estimator)
    return dclass(**kwargs)
elif isinstance(density_estimator, type):
Looks like `type` is undefined.
It's a builtin.
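For readers following along, a sketch of how the dispatch under review might look in full. The contents of MODEL_TYPES and the helper name are illustrative, and GaussianMixture is the modern name for the GMM estimator discussed in this thread:

    from sklearn.mixture import GaussianMixture
    from sklearn.neighbors import KernelDensity

    MODEL_TYPES = {'gmm': GaussianMixture, 'kde': KernelDensity}

    def _resolve_density_estimator(density_estimator, **kwargs):
        if isinstance(density_estimator, str):
            dclass = MODEL_TYPES.get(density_estimator)
            if dclass is None:
                raise ValueError("unknown density estimator: %r" % density_estimator)
            return dclass(**kwargs)
        elif isinstance(density_estimator, type):
            # `type` is the builtin: this branch catches a class passed directly.
            return density_estimator(**kwargs)
        # Otherwise assume an already-constructed estimator instance.
        return density_estimator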
I don't think this should be combined with Naive Bayes, except in the docs. The charm of Naive Bayes lies in its speed and simple code, no need to mess with that.
sklearn/generative.py
Bayes Classifier, in which the distribution of each training class is
approximated by an axis-aligned multi-dimensional normal distribution, and
unknown points are evaluated by comparing their posterior probability under
each model.
There's no such thing as "the" Naive Bayes classifier. You're thinking of Gaussian NB, but NLP people will variously think of Bernoulli or multinomial NB. (The first time I encountered the Gaussian variant was while reading sklearn source code :)
Interesting! The first time I encountered any version other than Gaussian NB was reading the sklearn source code 😄
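For anyone unfamiliar with the variants being contrasted here, all three ship in scikit-learn today (the feature types in the comments are their standard use cases):

    from sklearn.naive_bayes import BernoulliNB, GaussianNB, MultinomialNB

    GaussianNB()     # continuous features: per-class axis-aligned Gaussians
    MultinomialNB()  # count features, common in NLP (e.g. bag-of-words)
    BernoulliNB()    # binary features (e.g. word presence/absence)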
On second thought: @jakevdp, is it too much of a stretch to merge this thing into the `naive_bayes` module?
I initially thought about putting this within `naive_bayes`. Regardless of where the code goes, I had envisioned combining the narrative documentation for the two: as you mention, we can adapt the theoretical background currently put under the heading of Naive Bayes and show how it applies in both the naive and the general case.
True, but I've seen other people's production codebases that depend on it. Combining the narratives was the main thing I was aiming at. It's your call to decide if it fits well enough to also combine the code. (FYI, I see you're using …)
@larsmans - I ended up following your advice and moving everything into the `naive_bayes` module. Still some tests failing... I'm going to try to fix those.
I think this is pretty close to finished now. I added narrative documentation and examples, and the tests should pass. One missing feature that would be really helpful is the ability to do class-wise cross-validation of the density estimators within the classifier.
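A sketch of what that missing feature amounts to if done by hand, assuming KernelDensity as the per-class density model: tune the bandwidth separately on each class's training data. GridSearchCV works here because KernelDensity exposes a score method returning the total log-likelihood; the helper name is illustrative:

    import numpy as np
    from sklearn.model_selection import GridSearchCV
    from sklearn.neighbors import KernelDensity

    def fit_class_densities(X, y, bandwidths=None):
        if bandwidths is None:
            bandwidths = np.logspace(-1, 1, 20)
        models = {}
        for label in np.unique(y):
            # Cross-validate the bandwidth on this class's samples only.
            grid = GridSearchCV(KernelDensity(), {'bandwidth': bandwidths})
            grid.fit(X[y == label])
            models[label] = grid.best_estimator_
        return models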
Hmm... is there any way that program state can affect the results of the tests?
Ah - looks like it was something that had changed in master. I'll adjust the tests so that they will pass.
Changing status to MRG: I think this is ready for a final review, unless we want to add class-wise cross-validation at this time.
Thanks @ogrisel. I've addressed all your comments. Regarding the CV issue: I think the first-order solution is to simply expose the estimator parameters using the standard `get_params`/`set_params` machinery.
I am not sure that will work, as the number of sub-estimators is dependent on the number of classes. The list of sub-estimators in the fitted model is therefore only known at fit time.
Yes, I ran into that mismatch when I gave this strategy a shot. I'll think about your idea of hacking around it.
That might indeed be a better way. Note however that we have a similar issue for multi-class or multi-label classifiers that implement the OvR strategy by combining several binary classifiers.
I don't have any experience with tuning each binary classifier separately. One concern I have is that each binary classifier may produce predictions with different scales (e.g. one with predictions in [-1, 1], another one with predictions in [-5, 5]) and thus the argmax rule might not work at all. In any case, this is a combinatorial search and thus randomized search seems the way to go.
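To make the combinatorial point concrete, here is a hypothetical per-class parameter space. The indexed estimators__i__... names do not exist in this PR (that naming scheme is exactly the open question), but they show what a joint search would have to cover:

    from scipy.stats import randint
    from sklearn.model_selection import RandomizedSearchCV

    # One n_components draw per class-specific GMM -- hypothetical naming.
    param_distributions = {
        'estimators__0__n_components': randint(1, 10),
        'estimators__1__n_components': randint(1, 10),
        'estimators__2__n_components': randint(1, 10),
    }
    # A full grid would be 9**n_classes combinations; randomized search
    # samples the joint space instead:
    # search = RandomizedSearchCV(clf, param_distributions, n_iter=30)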
Hey guys, I hope I'm not just wasting space in your inbox. I've tried to follow this discussion, but wanted to provide a couple of notes from a user. I have utilized GMM classifiers in the past, and I've also started playing with this commit to see the results using a GMM. One big feature needed for this functionality is the capability of tuning the number of components, `n_components`, for each class.
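One way to get per-class tuning of n_components without touching the meta-estimator's API, sketched with the modern GaussianMixture: select the component count for each class by BIC on that class's training data (the helper name is illustrative):

    import numpy as np
    from sklearn.mixture import GaussianMixture

    def best_gmm_per_class(X, y, max_components=10):
        models = {}
        for label in np.unique(y):
            X_c = X[y == label]
            # Fit candidate mixtures and keep the one with the lowest BIC.
            candidates = [GaussianMixture(n_components=n, random_state=0).fit(X_c)
                          for n in range(1, max_components + 1)]
            models[label] = min(candidates, key=lambda m: m.bic(X_c))
        return models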
Thanks @jgbos - I agree that individually tuning hyperparameters is a vital feature of this estimator. I'm still trying to figure out the best way to approach that, though (and I haven't had much time to work on this lately).
Is there any chance that there would be some progress on this PR, or is it buried forever? I understand that we are hung up on the last TODO item. I'm wondering if we can come to a solution that does not require the ability to do class-wise cross-validation for the density model?
This model only becomes "naive" when we introduce certain assumptions about
the form of :math:`P(x_i \mid y)`, e.g. that each class is drawn from an
axis-aligned normal distribution (the assumption for Gaussian Naive Bayes).
What makes the model naive is that you assume conditional independence of the features. I find this paragraph unclear.
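For reference, a compact statement of the distinction the reviewers are drawing (these are the standard formulas, not text from the PR). A generative classifier predicts

    \hat{y} = \arg\max_y \; P(y)\, P(x \mid y)

and becomes "naive" only under the conditional-independence assumption

    P(x \mid y) = \prod_{i=1}^{n} P(x_i \mid y)

Gaussian NB then additionally takes each :math:`P(x_i \mid y)` to be a univariate normal; the normality assumption alone is not what makes it naive.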
I find this paragraph erroneous.
Yes, it's wrong, as I suggested in 2016 ;)
Really cool examples :) @jakevdp, you'll need to rebase.
@jakevdp just wondering, will you merge this anytime soon?
@danielravina I am not sure @jakevdp has time to finish this. Please take over if you want and see my comments.
Probably will not be finishing this myself. The main reason I never finished the PR is that I never really figured out how to deal cleanly with per-class hyperparameters.
@danielravina @jakevdp did either of you, or anyone else, end up picking this back up? I would be interested in working on this if not.
This PR is actually fairly similar to the BayesClassifier / NaiveBayes classifiers in pomegranate (see tutorial here: https://github.com/jmschrei/pomegranate/blob/master/tutorials/Tutorial_5_Bayes_Classifiers.ipynb). If you pick this up I'd be happy to review it, but be sure to read the above discussion thoroughly to understand what the stalling issues were.
Should this be moved to scikit-learn-extras, or is it not complete enough?
Hi, I am facing the same problem with a few estimators I wrote that also require per-class hyperparameters, i.e., lists/arrays/tuples. The way I see it, there are two things you can do:
This PR adds a simple meta-estimator which accepts any generative model (normal approximation, `GMM`, `KernelDensity`, etc.) and uses it to construct a generative Bayesian classifier.

Todo:
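A hypothetical usage sketch of what the description proposes; the class name GenerativeBayes and its methods are inferred from this discussion, not a released scikit-learn API, so the calls are left commented:

    from sklearn.neighbors import KernelDensity
    # from sklearn.naive_bayes import GenerativeBayes  # proposed here; never merged

    # clf = GenerativeBayes(density_estimator=KernelDensity(bandwidth=1.0))
    # clf.fit(X_train, y_train)          # one density model per class
    # y_pred = clf.predict(X_test)       # argmax of per-class posteriors
    # X_new, y_new = clf.sample(100)     # draw new labeled samples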