textclass

Text classification tool for Classificationbox.

Usage

  1. Prepare teaching data
  2. Run Classificationbox
  3. Teach and test

Prepare teaching data

Create a directory structure that organizes the files into classes, with each folder as the class name:

/teaching-items
	/class1
		class1example1.txt
		class1example2.txt
		class1example3.txt
	/class2
		class2example1.txt
		class2example2.txt
		class2example3.txt
	/class3
		class3example1.txt
		class3example2.txt
		class3example3.txt

The files can be text of any size, one file per example.

Run Classificationbox

In a terminal, with the MB_KEY environment variable set to your Machine Box key, run:

docker run -p 8080:8080 -e "MB_KEY=$MB_KEY" machinebox/classificationbox

Teach and test

Use the textclass tool to teach the model:

textclass -teachratio 0.8 -src ./teaching-items

With -teachratio 0.8, the tool posts a random 80% of the files to Classificationbox for teaching and uses the remaining 20% to test the model.
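
The random split described above can be sketched in Go as follows. This is an illustration under stated assumptions, not the tool's implementation: the `split` function and its `seed` parameter are hypothetical, the ratio is assumed to lie between 0 and 1, and the teaching share is rounded down to a whole number of files.

```go
package main

import (
	"fmt"
	"math/rand"
)

// split shuffles a copy of items and returns the first ratio share for
// teaching and the remainder for testing. The input slice is not modified.
// (Hypothetical helper; ratio is assumed to be between 0 and 1.)
func split(items []string, ratio float64, seed int64) (teach, held []string) {
	shuffled := append([]string(nil), items...)
	rng := rand.New(rand.NewSource(seed))
	rng.Shuffle(len(shuffled), func(i, j int) {
		shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
	})
	n := int(float64(len(shuffled)) * ratio) // rounds down
	return shuffled[:n], shuffled[n:]
}

func main() {
	items := []string{
		"a.txt", "b.txt", "c.txt", "d.txt", "e.txt",
		"f.txt", "g.txt", "h.txt", "i.txt", "j.txt",
	}
	// A fixed seed keeps the example repeatable; a real run would
	// normally vary the seed so each split is different.
	teach, held := split(items, 0.8, 1)
	fmt.Printf("teaching with %d files, testing with %d files\n", len(teach), len(held))
}
```

Keeping the test files out of the teaching set is what makes the later accuracy figure meaningful: the model is scored only on examples it has not seen.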

Watch the magic happen

You will be prompted a few times as the tool goes through its various stages. The tool will:

  1. Create a new model
  2. Use a percentage of the data to teach the model
  3. Use the remaining items to validate the model
  4. Display the results, including the percentage accuracy of the model
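
As a rough illustration of the final stage, the sketch below computes a percentage accuracy from a list of test outcomes. It assumes accuracy means the share of test files whose predicted class matches the folder they came from; the `result` type and `accuracy` function are hypothetical names, and the tool may calculate or present its figure differently.

```go
package main

import "fmt"

// result records, for one test file, the class it belongs to and the class
// the model predicted. (Hypothetical type, introduced only for this sketch.)
type result struct {
	Expected  string
	Predicted string
}

// accuracy returns the percentage of results where the prediction matched
// the expected class. An empty list yields 0 rather than dividing by zero.
func accuracy(results []result) float64 {
	if len(results) == 0 {
		return 0
	}
	correct := 0
	for _, r := range results {
		if r.Expected == r.Predicted {
			correct++
		}
	}
	return 100 * float64(correct) / float64(len(results))
}

func main() {
	results := []result{
		{Expected: "class1", Predicted: "class1"},
		{Expected: "class2", Predicted: "class2"},
		{Expected: "class3", Predicted: "class1"}, // a wrong prediction
		{Expected: "class1", Predicted: "class1"},
	}
	fmt.Printf("accuracy: %.1f%%\n", accuracy(results)) // prints "accuracy: 75.0%"
}
```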