My Naive Bayes Classifier for ham/spam message classification.
The classifier estimates the probability that a message belongs to each class (ham or spam) and picks the most probable class.
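In the usual Naive Bayes notation, this probability can be written as follows. The symbols are assumptions chosen for illustration and are not taken from the project: $c$ is a class, $m$ is a message made of the words $w_1, \dots, w_n$, and $P(c)$ is the fraction of training messages labelled $c$.

$$
P(c \mid m) \propto P(c) \prod_{i=1}^{n} P(w_i \mid c)
$$

The "naive" part is the assumption that the words are independent of each other given the class, which is what allows the product over single-word probabilities.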
To handle words that do not occur in the training set, which would otherwise get a probability of zero and cancel the whole product, additive smoothing is used.
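A common form of additive smoothing is sketched below. The notation is again an assumption: $\mathrm{count}(w, c)$ is the number of times word $w$ occurs in training messages of class $c$, $N_c$ is the total number of words in that class, $|V|$ is the size of the vocabulary, and $\alpha > 0$ is the smoothing parameter (the value used by the project is not stated; $\alpha = 1$ is a frequent choice).

$$
P(w \mid c) = \frac{\mathrm{count}(w, c) + \alpha}{N_c + \alpha \, |V|}
$$

With this estimate, a word that was never seen in class $c$ still gets the small non-zero probability $\alpha / (N_c + \alpha |V|)$.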
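Because multiplying many small probabilities produces values that are too small to represent reliably, it is easier to work with logarithms: the logarithm of a product is the sum of the logarithms, so the product of probabilities becomes a sum of log-probabilities.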
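The log-space scoring with additive smoothing can be sketched in a few lines of Python. This is a minimal illustration, independent of the project's own implementation: the function names `train`, `log_score` and `classify`, the whitespace tokenizer, and the default `alpha=1.0` are all assumptions.

```python
import math
from collections import Counter


def train(samples):
    """Build counts from an iterable of (label, message) pairs."""
    class_counts = Counter()   # number of messages per class
    word_counts = {}           # label -> Counter of word occurrences
    vocabulary = set()         # every distinct word seen in training
    for label, message in samples:
        class_counts[label] += 1
        words = message.lower().split()  # assumed tokenizer: split on whitespace
        word_counts.setdefault(label, Counter()).update(words)
        vocabulary.update(words)
    return class_counts, word_counts, vocabulary


def log_score(message, label, model, alpha=1.0):
    """log P(label) + sum of log P(word | label), with additive smoothing."""
    class_counts, word_counts, vocabulary = model
    counts = word_counts[label]
    total_words = sum(counts.values())
    denominator = total_words + alpha * len(vocabulary)
    score = math.log(class_counts[label] / sum(class_counts.values()))
    for word in message.lower().split():
        # Counter returns 0 for unseen words, so alpha keeps the log finite.
        score += math.log((counts[word] + alpha) / denominator)
    return score


def classify(message, model, alpha=1.0):
    """Return the label with the highest log-score."""
    class_counts = model[0]
    return max(class_counts, key=lambda label: log_score(message, label, model, alpha))
```

A possible usage, with made-up training data:

```python
model = train([("spam", "win money now"), ("ham", "see you at lunch")])
print(classify("free money", model))
```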
You can find the training set file spam.txt in the resources folder. The data used can be found here. Each record should consist of two fields, type and message, separated by a tab character ('\t').
In the Main file you can change the file path of the training set and the message variable that holds the text to classify.
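Reading data in this two-field format might look like the Python sketch below. The helper names `parse_line` and `load_training_set` are hypothetical, and UTF-8 encoding is an assumption about the file.

```python
def parse_line(line):
    """Split one record into (type, message) at the first tab."""
    # maxsplit=1 keeps any further tabs inside the message text.
    # A line without a tab raises ValueError when unpacking.
    label, message = line.rstrip("\n").split("\t", 1)
    return label, message


def load_training_set(path):
    """Read all non-empty lines of a tab-separated training file."""
    with open(path, encoding="utf-8") as f:  # encoding is an assumption
        return [parse_line(line) for line in f if line.strip()]
```

Splitting only at the first tab means a message that itself contains a tab is still read as a single field.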