Large scale k-nn experiments
Latest commit e1480a0 (Dec 28, 2012) by tdunning: Fixed vectorizer to avoid various NPE's, handle corpus normalization, vector normalization and added csv output.

This is a large-scale k-nn project designed to test various approaches from the literature.

To rebuild the PDF of the paper on k-means clustering, run the following commands in the docs/scaling-k-means directory:

$ /usr/texbin/pdflatex scaling-k-means.tex 
$ /usr/texbin/bibtex scaling-k-means
$ /usr/texbin/pdflatex scaling-k-means.tex 
$ /usr/texbin/pdflatex scaling-k-means.tex 

You will need to install pdfTeX to do this. MacTeX and TeXShop provide nice capabilities for dealing with LaTeX files. See the MacTeX and TeXShop websites.
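
The command sequence above can be wrapped in a small shell function, which also makes it easier to see why pdflatex runs three times. This is only a sketch: the function name build_paper and the TEXBIN variable are assumptions for illustration and are not part of this repository. It assumes pdflatex and bibtex live in the same directory, as they do under /usr/texbin.

```shell
# Sketch only: build_paper and TEXBIN are hypothetical names, not part of this project.
build_paper() {
    texbin="${TEXBIN:-/usr/texbin}"   # directory holding pdflatex and bibtex
    doc="${1:-scaling-k-means}"       # base name of the .tex file, without extension

    "$texbin/pdflatex" "$doc.tex" || return 1   # pass 1: record citation keys in the .aux file
    "$texbin/bibtex" "$doc"       || return 1   # build the .bbl bibliography from the .aux file
    "$texbin/pdflatex" "$doc.tex" || return 1   # pass 2: pull the bibliography into the document
    "$texbin/pdflatex" "$doc.tex" || return 1   # pass 3: resolve the remaining cross-references
}
```

After sourcing the function from the docs/scaling-k-means directory, running build_paper with no arguments repeats the four commands above. If your TeX binaries are installed elsewhere, set TEXBIN to that directory first.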

More details anon