Add Multinomial and Bernoulli Naive Bayes algorithms #183

Open
1 of 2 tasks
sgrigory opened this issue Nov 29, 2021 · 3 comments

Comments

@sgrigory
Contributor

sgrigory commented Nov 29, 2021

Currently, the linfa-bayes crate contains the Gaussian Naive Bayes algorithm. It should not be very difficult to add the other kinds of Naive Bayes present in sklearn:

  • Multinomial Naive Bayes
  • Bernoulli Naive Bayes

A new algorithm needs its own implementation of the methods joint_log_likelihood and update_feature_log_prob, plus its own hyperparameters; the rest of the code stays more or less the same.
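To make the two methods concrete, here is a minimal sketch of what they could look like for the multinomial case. It assumes plain `Vec<f64>` data instead of ndarray, Laplace smoothing with a parameter `alpha`, and a hypothetical `MultinomialSketch` type; none of these names come from linfa-bayes or from the draft branch.

```rust
/// Hypothetical, simplified Multinomial Naive Bayes model.
/// This is an illustration only, not the linfa-bayes type.
struct MultinomialSketch {
    /// log P(c), one entry per class
    class_log_prior: Vec<f64>,
    /// log theta_{c,i}, indexed as [class][feature]
    feature_log_prob: Vec<Vec<f64>>,
}

/// Turn per-class feature counts into smoothed log probabilities:
/// theta_{c,i} = (N_{c,i} + alpha) / (N_c + alpha * n_features).
fn update_feature_log_prob(feature_count: &[Vec<f64>], alpha: f64) -> Vec<Vec<f64>> {
    feature_count
        .iter()
        .map(|counts| {
            let total = counts.iter().sum::<f64>() + alpha * counts.len() as f64;
            counts
                .iter()
                .map(|&n| ((n + alpha) / total).ln())
                .collect::<Vec<f64>>()
        })
        .collect()
}

impl MultinomialSketch {
    /// Fit from rows of non-negative counts `x` and class indices `y`.
    /// Assumes `x` is non-empty and every class appears at least once.
    fn fit(x: &[Vec<f64>], y: &[usize], n_classes: usize, alpha: f64) -> Self {
        let n_features = x[0].len();
        let mut feature_count = vec![vec![0.0; n_features]; n_classes];
        let mut class_count = vec![0.0; n_classes];
        for (row, &c) in x.iter().zip(y) {
            class_count[c] += 1.0;
            for (acc, &v) in feature_count[c].iter_mut().zip(row) {
                *acc += v;
            }
        }
        let n = y.len() as f64;
        Self {
            class_log_prior: class_count.iter().map(|&k| (k / n).ln()).collect(),
            feature_log_prob: update_feature_log_prob(&feature_count, alpha),
        }
    }

    /// Unnormalised log P(c) + sum_i x_i * log theta_{c,i}, one value per class.
    fn joint_log_likelihood(&self, row: &[f64]) -> Vec<f64> {
        self.class_log_prior
            .iter()
            .zip(&self.feature_log_prob)
            .map(|(&prior, log_prob)| {
                prior + row.iter().zip(log_prob).map(|(&x, &lp)| x * lp).sum::<f64>()
            })
            .collect()
    }
}
```

The predicted class is the index of the largest value returned by `joint_log_likelihood`; that step does not depend on the distribution, which is why it can be shared.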

I have created a draft implementation of Multinomial Naive Bayes in this branch, based on the current code of Gaussian Naive Bayes and the sklearn implementation of MultinomialNB. At the moment a large part of the code is copy-pasted from Gaussian Naive Bayes, but both can be refactored to deduplicate the shared code.

Would you consider this a useful feature to have? If yes, I can finalise the draft and open a PR.

@VasanthakumarV @bytesnake, tagging you since you authored and reviewed the original Gaussian Naive Bayes implementation in #51

@bytesnake
Member

Hi @sgrigory,

> It should not be very difficult to add other kinds of Naive Bayes present in sklearn

No, not really; the question is rather how we want to add them. The type system should allow us to be generic over the distribution. There are some distribution libraries in Rust, but few with MAP estimation.

> For a new algorithm one needs to reimplement methods joint_log_likelihood and update_feature_log_prob and the hyperparameters - the rest of the code stays more or less the same.

Sounds like a good candidate for a trait.
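As a rough illustration of that idea, the sketch below puts the distribution-specific part behind a trait and keeps prediction as a shared default method. The trait name `NaiveBayesSketch`, the `BernoulliSketch` type, and the slice-based signatures are assumptions made for this example, not the linfa-bayes API.

```rust
/// Hypothetical trait: each Naive Bayes variant supplies only its own
/// joint log-likelihood; everything else is shared.
trait NaiveBayesSketch {
    /// Unnormalised log P(c) + log P(x | c), one value per class.
    fn joint_log_likelihood(&self, row: &[f64]) -> Vec<f64>;

    /// Shared across variants: pick the class with the largest score.
    fn predict(&self, row: &[f64]) -> usize {
        let jll = self.joint_log_likelihood(row);
        let mut best = 0;
        for (c, &score) in jll.iter().enumerate() {
            if score > jll[best] {
                best = c;
            }
        }
        best
    }
}

/// Hypothetical Bernoulli variant with already-fitted parameters.
struct BernoulliSketch {
    /// log P(c), one entry per class
    class_log_prior: Vec<f64>,
    /// P(x_i = 1 | c), indexed as [class][feature]; assumed strictly in (0, 1)
    feature_prob: Vec<Vec<f64>>,
}

impl NaiveBayesSketch for BernoulliSketch {
    fn joint_log_likelihood(&self, row: &[f64]) -> Vec<f64> {
        self.class_log_prior
            .iter()
            .zip(&self.feature_prob)
            .map(|(&prior, probs)| {
                prior
                    + row
                        .iter()
                        .zip(probs)
                        // Any positive value counts as "feature present".
                        .map(|(&x, &p)| if x > 0.0 { p.ln() } else { (1.0 - p).ln() })
                        .sum::<f64>()
            })
            .collect()
    }
}
```

A multinomial or Gaussian variant would implement the same trait with a different `joint_log_likelihood` and inherit `predict` unchanged.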

> I have created a draft implementation of Multinomial Naive Bayes in this branch, based on the current code of Gaussian Naive Bayes [..]

👍

> Would you consider this a useful feature to have? If yes, I can finalise the draft and open a PR.

Yes, we would accept such a PR. To be really useful, we have to figure out how to:

  1. handle mixed-type datasets
  2. handle distributions with different parametrizations

It may therefore be refactored in the future, but nevertheless we will accept such a PR gladly :)

@yuancc06

yuancc06 commented Aug 2, 2022

Hi. I have tried Multinomial Naive Bayes and it predicts the correct result very well. However, in some cases I need the joint likelihood for further calculations, and I cannot get those numbers because the corresponding function is pub(crate). I wonder if the developers plan to make the likelihood/probability function public. Thank you.

@YuhanLiin
Collaborator

That would require making the NaiveBayes trait public. I'd accept a PR that does this, but with the other method hidden from the docs so that people don't rely on the trait for anything other than joint likelihood.
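One way this could look is sketched below, using Rust's `#[doc(hidden)]` attribute to keep a method callable but out of the rendered documentation. The trait name `PublicNaiveBayesSketch`, the method signatures, and the `class_log_prior` helper are hypothetical and chosen only for illustration; the real trait in linfa-bayes may differ.

```rust
/// Hypothetical public trait exposing the joint log-likelihood.
pub trait PublicNaiveBayesSketch {
    /// Documented and intended for downstream use.
    fn joint_log_likelihood(&self, row: &[f64]) -> Vec<f64>;

    /// Still part of the trait, but omitted from rustdoc output so that
    /// users are not encouraged to depend on it.
    #[doc(hidden)]
    fn class_log_prior(&self) -> &[f64];
}

/// Hypothetical model that scores a row with the class prior only,
/// just to show the trait being implemented and called.
pub struct PriorOnlySketch {
    log_prior: Vec<f64>,
}

impl PublicNaiveBayesSketch for PriorOnlySketch {
    fn joint_log_likelihood(&self, _row: &[f64]) -> Vec<f64> {
        self.log_prior.clone()
    }

    fn class_log_prior(&self) -> &[f64] {
        &self.log_prior
    }
}
```

Note that `#[doc(hidden)]` only affects documentation; the hidden method remains public API as far as the compiler is concerned.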
