
Fix for #4 - no prior used in Bayes Classifier #5

Merged (1 commit into cardmagic:master on Dec 31, 2013)

Conversation


@bmuller bmuller commented Dec 8, 2010

Without taking the prior probability of each class into consideration, you have a classifier that uses only the likelihood (more specifically, the log likelihood) rather than a classifier that applies Bayes' theorem.
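As a minimal sketch of the point above (hypothetical code, not the gem's actual API or data layout), a naive Bayes score combines a log-prior term with the summed log likelihoods; dropping the log-prior term leaves a likelihood-only classifier:

```ruby
# Hypothetical sketch of naive Bayes scoring with add-one smoothing;
# none of these names come from the classifier gem itself.
def classify(word_counts, categories)
  total_docs = categories.sum { |c| c[:doc_count] }.to_f
  best = categories.max_by do |cat|
    # log P(class): the prior term this pull request adds
    log_prior = Math.log(cat[:doc_count] / total_docs)
    # sum over words of count * log P(word | class)
    log_likelihood = word_counts.sum do |word, n|
      n * Math.log((cat[:word_counts].fetch(word, 0) + 1.0) /
                   (cat[:total_words] + 1.0))
    end
    log_prior + log_likelihood # drop log_prior and you score by likelihood only
  end
  best[:name]
end

categories = [
  { name: :spam, doc_count: 9, total_words: 20,
    word_counts: { "buy" => 10, "now" => 10 } },
  { name: :ham, doc_count: 1, total_words: 20,
    word_counts: { "hi" => 10, "there" => 10 } }
]

classify({ "buy" => 1 }, categories)    # => :spam (likelihood and prior agree)
classify({ "unseen" => 1 }, categories) # => :spam (only the prior can decide)
```

When a word is equally (un)likely under every class, the prior alone decides the label — that is exactly the term the original code dropped.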


dabble commented Oct 8, 2011

Thank you.

I spent a while reviewing http://en.wikipedia.org/wiki/Naive_Bayes_classifier
to better understand this change.

My feedback would be to update your change to replace spaces with tabs as
the original file is mostly tabbed. [Note: I don't have rights to this repository.]

The only downside I see to accepting this change is if someone called train_
multiple times per document instead of once per document.


bmuller commented Oct 9, 2011

Calling train more than once per document would result in bias even for a classifier just using log likelihoods.

Right now, the classifier uses only the likelihood to estimate the posterior, which means it's not a naive Bayes classifier at all. The prior must be taken into account.

Since it looks like this rather egregious bug is not going to be fixed, I suggest using https://github.com/livingsocial/ankusa


dabble commented Oct 9, 2011

Actually, I was thinking of calling train_ once each for the title, author, and body of a document.
Since, without this fix, the classifier only counted words, this would not affect the outcome.

The only reason I had for doing it (and I didn't) was to avoid concatenating strings only
to break them apart again.

As you point out, without this, it isn't really naive Bayes. (I'm not sure counting words
multiple times per document is really naive Bayes either...)

I hadn't run across the ankusa classifier. Thanks.

cardmagic pushed a commit that referenced this pull request Dec 31, 2013
Fix for #4 - no prior used in Bayes Classifier
@cardmagic cardmagic merged commit 8eefec5 into cardmagic:master Dec 31, 2013
@cardmagic
Owner

I am sorry it took so long to merge this in.
