taylorwood/ADRDemo

ADRDemo

A simple F# demonstration of Automated Document Recognition using techniques like text tokenization, n-grams, TF-IDF weighting, CSV parsing, and text classification.
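As a rough illustration of the first three techniques, here is a minimal sketch. It is not taken from the repository: the function names (`tokenize`, `ngrams`, `termFrequencies`, `inverseDocumentFrequencies`, `tfIdf`) are hypothetical, the tokenizer assumes lowercase ASCII letters and digits are the only word characters, and the IDF uses the plain `log (N / df)` form with no smoothing.

```fsharp
open System.Text.RegularExpressions

// Lowercase the text and keep runs of letters/digits as tokens.
// Assumption: ASCII-only tokens are good enough for the demo data.
let tokenize (text: string) : string list =
    Regex.Matches(text.ToLowerInvariant(), "[a-z0-9]+")
    |> Seq.cast<Match>
    |> Seq.map (fun m -> m.Value)
    |> Seq.toList

// Join every run of n consecutive tokens into one space-separated n-gram.
// Returns an empty list when there are fewer than n tokens.
let ngrams (n: int) (tokens: string list) : string list =
    tokens
    |> List.windowed n
    |> List.map (String.concat " ")

// Term frequency: occurrences of each term divided by the document length.
let termFrequencies (terms: string list) : Map<string, float> =
    let total = float (List.length terms)
    terms
    |> List.countBy id
    |> List.map (fun (term, count) -> term, float count / total)
    |> Map.ofList

// Inverse document frequency: log (N / number of documents containing the term).
let inverseDocumentFrequencies (docs: string list list) : Map<string, float> =
    let n = float (List.length docs)
    docs
    |> List.collect List.distinct
    |> List.countBy id
    |> List.map (fun (term, df) -> term, log (n / float df))
    |> Map.ofList

// TF-IDF weight per term; terms never seen in the corpus get weight 0.
let tfIdf (idf: Map<string, float>) (terms: string list) : Map<string, float> =
    termFrequencies terms
    |> Map.map (fun term tf ->
        tf * (idf |> Map.tryFind term |> Option.defaultValue 0.0))

// Usage: bigrams of a short sentence.
let bigrams = ngrams 2 (tokenize "The quick brown fox")
// bigrams = ["the quick"; "quick brown"; "brown fox"]
```

With this IDF form, a term that appears in every training document gets weight 0, so it contributes nothing to classification.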

The code expects training data as plain-text files, organized into one folder per category:

  • \TrainingData
    • \CategoryA
      • \Sample1.txt
      • \Sample2.txt
    • \CategoryB
      • \SampleA.txt

...and a plain-text file to be classified: "unknown.txt".
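The folder layout above could be read along these lines. This is a sketch under assumptions rather than the repository's own loader: `loadTrainingData` is a hypothetical name, only `*.txt` files directly inside each category folder are read, and the folder name is used as the category label.

```fsharp
open System.IO

// Return (category, text) pairs for every .txt file one level below root.
// Assumption: each immediate subfolder of root is one category.
let loadTrainingData (root: string) : (string * string) list =
    Directory.GetDirectories root
    |> Array.toList
    |> List.collect (fun dir ->
        let category = DirectoryInfo(dir).Name
        Directory.GetFiles(dir, "*.txt")
        |> Array.toList
        |> List.map (fun file -> category, File.ReadAllText file))
```

Calling `loadTrainingData "TrainingData"` on the example layout would yield two samples labeled `CategoryA` and one labeled `CategoryB`. The order of the results depends on the file system, so sort them if a stable order matters.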

It also expects a CSV file containing a word whitelist, but this can easily be changed to a blacklist ("stopwords") or removed altogether.
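The difference between the whitelist and stopword variants comes down to one negation, as the following sketch shows. The names `parseWordList`, `keepWhitelisted`, and `dropStopwords` are hypothetical, and the parser assumes a simple CSV with no quoted fields or embedded commas; the actual file format used by the demo is not described here.

```fsharp
open System

// Parse a simple CSV word list into a set of lowercase words.
// Assumption: plain comma/newline-separated words, no quoting.
let parseWordList (csv: string) : Set<string> =
    csv.Split([| ','; '\r'; '\n' |], StringSplitOptions.RemoveEmptyEntries)
    |> Array.map (fun w -> w.Trim().ToLowerInvariant())
    |> Array.filter (fun w -> w <> "")
    |> Set.ofArray

// Whitelist: keep only tokens that appear in the list.
let keepWhitelisted (whitelist: Set<string>) (tokens: string list) : string list =
    tokens |> List.filter (fun t -> Set.contains t whitelist)

// Blacklist ("stopwords"): drop tokens that appear in the list.
let dropStopwords (stopwords: Set<string>) (tokens: string list) : string list =
    tokens |> List.filter (fun t -> not (Set.contains t stopwords))
```

To remove filtering altogether, skip both functions and pass the tokens through unchanged.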
