bow_mnb

Given training data in BOW format, trains a multinomial Naive Bayes classifier

One method for training and testing, optional parameters - one line of code to train and test the classifier

Usage example: First download and install the package in your terminal:

$python3 -m pip install bow_mnb

Then in your python file:

import bow_mnb.mnb as mnb
mnb.train_and_test('./training/files/', './validation/files/')

Specify paths for training and testing data as folders containing subfolders, each with the name of their class label

Example:

Class labels A, B, and C

Directory structure should be as follows:

training
    ├──A
       ├──doc1.txt
       ├──doc2.txt
       └──...
    ├──B
       ├──doc1.txt
       ├──doc2.txt
       └──...
    └──C
       ├──doc1.txt
       ├──doc2.txt
       └──...

For the above directory structure, I would train the classifier as follows:

mnb.train_and_test('./training/')

An individiual document should be in BOW format (i.e., each line is a given word and its number of occurences, followed by a newline character)

Example:

snow 5
orange 2
the 98
blog 4
...

currently no support for non-BOW format // TO-DO - more file type and format handling

Optional paramters: set fileType to be txt or res, default is txt set tfidf=True to use TF-IDF vectorization instead of count vectorization (default) set stem=True to use SnowballStemmer on tokens try different laplace value to tune classifier (0.1, 0.05, 0.01, 0.005, 0.001, etc.) set outputFile=True to write predicted labels for some test or validation data (per document) to a single text file

Name		Name	Last commit message	Last commit date
Latest commit History 17 Commits
.vscode		.vscode
dist		dist
src		src
tests		tests
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml
setup.cfg		setup.cfg

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

bow_mnb

Given training data in BOW format, trains a multinomial Naive Bayes classifier

One method for training and testing, optional parameters - one line of code to train and test the classifier

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

bow_mnb

Given training data in BOW format, trains a multinomial Naive Bayes classifier

One method for training and testing, optional parameters - one line of code to train and test the classifier

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages