Fix Naive Bayes classifier computations in high dimensions. #2022
This is a fix for #2017. I found out
Below are the results on a test set made up of 20% of the total data. I used the code and dataset provided by @marcovirgolin.
Oh, good point. That should be easy: I think all that's needed is to create a 1000-dimensional linearly separable dataset and then expect that NBC classifies it properly.
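A minimal sketch of what such a check could look like, using only the C++ standard library rather than mlpack's own classes. Every name here (`Dataset`, `MakeSeparable`, `GaussianNB`, `Train`, `Classify`, `Accuracy`) is hypothetical, as are the class means, the noise level, and the variance floor. The classifier sums log-densities instead of multiplying densities, because a product of 1000 small densities can underflow a `double`; whether that matches the change made in this PR is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <random>
#include <vector>

// Hypothetical container: one vector per point, plus one label per point.
struct Dataset
{
  std::vector<std::vector<double>> points;
  std::vector<std::size_t> labels;
};

// Draw `perClass` points per class. Class c has mean 10 * c in every
// dimension with unit-variance noise (assumed values), so the classes are
// far apart and linearly separable in practice.
Dataset MakeSeparable(std::size_t classes, std::size_t perClass,
                      std::size_t dims, unsigned seed)
{
  std::mt19937 rng(seed);
  std::normal_distribution<double> noise(0.0, 1.0);
  Dataset d;
  for (std::size_t c = 0; c < classes; ++c)
  {
    for (std::size_t i = 0; i < perClass; ++i)
    {
      std::vector<double> x(dims);
      for (double& v : x)
        v = 10.0 * static_cast<double>(c) + noise(rng);
      d.points.push_back(x);
      d.labels.push_back(c);
    }
  }
  return d;
}

// Per-class, per-dimension Gaussian parameters and log class priors.
struct GaussianNB
{
  std::vector<std::vector<double>> mean, var;
  std::vector<double> logPrior;
};

// Fit the model; assumes every class has at least one training point.
GaussianNB Train(const Dataset& d, std::size_t classes)
{
  assert(!d.points.empty());
  const std::size_t dims = d.points[0].size();
  GaussianNB m;
  m.mean.assign(classes, std::vector<double>(dims, 0.0));
  m.var.assign(classes, std::vector<double>(dims, 0.0));
  m.logPrior.assign(classes, 0.0);
  std::vector<double> count(classes, 0.0);
  for (std::size_t i = 0; i < d.points.size(); ++i)
  {
    count[d.labels[i]] += 1.0;
    for (std::size_t j = 0; j < dims; ++j)
      m.mean[d.labels[i]][j] += d.points[i][j];
  }
  for (std::size_t c = 0; c < classes; ++c)
    for (std::size_t j = 0; j < dims; ++j)
      m.mean[c][j] /= count[c];
  for (std::size_t i = 0; i < d.points.size(); ++i)
    for (std::size_t j = 0; j < dims; ++j)
    {
      const double diff = d.points[i][j] - m.mean[d.labels[i]][j];
      m.var[d.labels[i]][j] += diff * diff;
    }
  for (std::size_t c = 0; c < classes; ++c)
  {
    for (std::size_t j = 0; j < dims; ++j)
      m.var[c][j] = m.var[c][j] / count[c] + 1e-9;  // Floor avoids log(0).
    m.logPrior[c] = std::log(count[c] / static_cast<double>(d.points.size()));
  }
  return m;
}

// Pick the class with the largest log-posterior (up to a shared constant).
std::size_t Classify(const GaussianNB& m, const std::vector<double>& x)
{
  const double twoPi = 6.283185307179586;
  std::size_t best = 0;
  double bestScore = -std::numeric_limits<double>::infinity();
  for (std::size_t c = 0; c < m.mean.size(); ++c)
  {
    double score = m.logPrior[c];
    for (std::size_t j = 0; j < x.size(); ++j)
    {
      const double diff = x[j] - m.mean[c][j];
      score -= 0.5 * (std::log(twoPi * m.var[c][j]) + diff * diff / m.var[c][j]);
    }
    if (score > bestScore) { bestScore = score; best = c; }
  }
  return best;
}

// Fraction of points in `d` whose predicted label matches the true label.
double Accuracy(const GaussianNB& m, const Dataset& d)
{
  std::size_t correct = 0;
  for (std::size_t i = 0; i < d.points.size(); ++i)
    if (Classify(m, d.points[i]) == d.labels[i])
      ++correct;
  return static_cast<double>(correct) / static_cast<double>(d.points.size());
}
```

A test along these lines would train on one call to `MakeSeparable`, evaluate on a second call with a different seed, and require the accuracy to be close to 1. An actual mlpack test would use the library's own classifier and matrix types instead of these stand-ins.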
Great. Also, I found that the last update fails NBCTest. I'll look into it.
@rcurtin Sorry for the slow progress. Yeah, I want this work to be included in the mlpack-3.2.1 patch release. :)
Also, I made training and test datasets for the high-dimension check. The training dataset has 200 points covering 5 classes, and each point has 1,000 dimensions. The test dataset has 50 points and is otherwise the same as the training dataset.
Below is the code that I used for dataset generation.