This is an odd little project that maps Wikipedia documents to vectors of integers and then compresses them.
Sorry for the poor documentation. This is mostly a personal project.
Example:

    java -cp target/CompressWikipediaAsIntegerVectors-0.0.1-SNAPSHOT.jar:target/lib/* me.lemire.wikipediacompress.Benchmark /home/dlemire/enwiki-20130102-pages-articles.xml.bz2 ../IndexWikipedia/crdict.txt
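
To give a rough idea of what "map documents to vectors of integers and then compress them" can mean, here is a minimal sketch. It is not the code in this project: the class name DocVectorSketch, its methods, and the tiny in-memory dictionary are all hypothetical. Delta coding followed by variable-byte coding is just one common scheme, chosen here for illustration; the benchmark may well use other codecs. The sketch assumes word IDs are non-negative.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from the project.
final class DocVectorSketch {

    // Map a document to the sorted, distinct IDs of the words found in the dictionary.
    static int[] toSortedIds(String text, Map<String, Integer> dictionary) {
        return Arrays.stream(text.toLowerCase().split("\\W+"))
                .filter(dictionary::containsKey)   // ignore unknown words
                .mapToInt(dictionary::get)
                .distinct()
                .sorted()
                .toArray();
    }

    // Delta-encode the sorted IDs, then write each gap with variable-byte coding:
    // 7 payload bits per byte, high bit set on the last byte of each gap.
    static byte[] compress(int[] sortedIds) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int previous = 0;
        for (int id : sortedIds) {
            int gap = id - previous;
            previous = id;
            while (gap >= 128) {
                out.write(gap & 127);
                gap >>>= 7;
            }
            out.write(gap | 128);
        }
        return out.toByteArray();
    }

    // Reverse of compress: rebuild each gap, then prefix-sum back to the IDs.
    static int[] decompress(byte[] data) {
        int[] ids = new int[data.length];
        int count = 0, previous = 0, gap = 0, shift = 0;
        for (byte b : data) {
            gap |= (b & 127) << shift;
            if ((b & 128) != 0) {       // last byte of this gap
                previous += gap;
                ids[count++] = previous;
                gap = 0;
                shift = 0;
            } else {
                shift += 7;
            }
        }
        return Arrays.copyOf(ids, count);
    }

    public static void main(String[] args) {
        // Made-up dictionary; a real one would be loaded from a file.
        Map<String, Integer> dictionary = new HashMap<>();
        dictionary.put("the", 0);
        dictionary.put("compress", 5);
        dictionary.put("wikipedia", 300);
        dictionary.put("integers", 1000);

        int[] ids = toSortedIds("Compress Wikipedia as integers, compress the integers!", dictionary);
        byte[] packed = compress(ids);
        System.out.println(Arrays.toString(ids) + " -> " + packed.length + " bytes");
        System.out.println(Arrays.equals(ids, decompress(packed)));
    }
}
```

Sorting the IDs first keeps the gaps small, which is what lets most of them fit in a single byte. The cost is that word order and repetition within the document are lost, so this only suits a bag-of-words style representation.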