ghalib/perceptron
Folders and files
| Name | Name | Last commit date | ||
|---|---|---|---|---|
Repository files navigation
Fast averaged-perceptron binary classifier. Includes weighted-tf feature extractor. Weight and feature vectors represented as cache-friendly std::vectors, which resulted in a 4-5x speedup over the previous representation as std::maps. Evaluation: Initial rough-cut accuracy of 95.8% achieved with 2 training iterations on the news20.binary dataset at http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html. Training set consists of 8000 examples: 4000 of +1 and 4000 of -1 taken from news20.binary. Remaining 11996 examples used for testing. Thanks to Justin Palmer for explaining the difference between an averaged and a regular perceptron as well as teaching me the lazy update 'trick'. It caused a huge speedup. Any mistakes in the code are mine alone.