GitHub - xqbumu/bayesian: Naive Bayesian Classification for Golang.

Naive Bayesian Classification with TF-IDF support

Perform naive Bayesian classification into an arbitrary number of classes on sets of strings.

Forked from github.com/jbrukh/bayesian

Added TF-IDF (term frequency–inverse document frequency) support, which improves accuracy quite a bit.

Background

See code comments for a refresher on naive Bayesian classifiers.

Installation

Using the go command:

go get github.com/jbrukh/bayesian
go install !$

Documentation

See the GoPkgDoc documentation here.

Features

Conditional probability and "log-likelihood"-like scoring.
Underflow detection.
Simple persistence of classifiers.
Statistics.

Example 1 (plain, no TF-IDF)

To use the classifier, first you must create some classes and train it:

import . "github.com/jbrukh/bayesian" // import path as given in the Installation section

const (
    Good Class = "Good"
    Bad Class = "Bad"
)

classifier := NewClassifier(Good, Bad)
goodStuff := []string{"tall", "rich", "handsome"}
badStuff  := []string{"poor", "smelly", "ugly"}
classifier.Learn(goodStuff, Good)
classifier.Learn(badStuff,  Bad)

Then you can ascertain the scores of each class and the most likely class your data belongs to:

scores, likely, _ := classifier.LogScores(
                        []string{"tall", "girl"},
                     )

A higher score indicates a more likely class, and likely is the index of the highest-scoring class. Alternatively (but with some risk of float underflow), you can obtain actual probabilities:

probs, likely, _ := classifier.ProbScores(
                        []string{"tall", "girl"},
                     )

Example 2 (TF-IDF)

To use the TF-IDF classifier, create some classes and train it as before. You must then call ConvertTermsFreqToTfIdf() after training and before calling any of the classification methods (LogScores, SafeProbScores, ProbScores):

import . "github.com/jbrukh/bayesian" // import path as given in the Installation section

const (
    Good Class = "Good"
    Bad Class = "Bad"
)

classifier := NewClassifierTfIdf(Good, Bad) // Extra constructor
goodStuff := []string{"tall", "rich", "handsome"}
badStuff  := []string{"poor", "smelly", "ugly"}
classifier.Learn(goodStuff, Good)
classifier.Learn(badStuff,  Bad)

classifier.ConvertTermsFreqToTfIdf() // IMPORTANT !!

Then you can ascertain the scores of each class and the most likely class your data belongs to:

scores, likely, _ := classifier.LogScores(
                        []string{"tall", "girl"},
                     )

A higher score indicates a more likely class, and likely is the index of the highest-scoring class. Alternatively (but with some risk of float underflow), you can obtain actual probabilities:

probs, likely, _ := classifier.ProbScores(
                        []string{"tall", "girl"},
                     )

Use wisely.
