Tools and scripts for testing and evaluating perlin
This tool is used to generate collections on which the testbench can run.
It takes the number of documents, the length of each document, optionally the size of the vocabulary, and the output file:
collection-generator -d 1000 -l 10 -v 100 collection.bin
This would generate a collection of 1000 documents, each 10 terms long, with a total of 100 distinct terms. The collection would be written to
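
As a rough illustration of what such a synthetic collection looks like, the sketch below builds one in memory. It is not the tool's actual implementation: the function name `generate_collection`, the uniform term distribution, and the seed parameter are all assumptions, and the binary output format is not reproduced here.

```python
import random


def generate_collection(num_docs, doc_len, vocab_size, seed=0):
    """Return num_docs documents, each a list of doc_len term IDs.

    Assumption: term IDs are drawn uniformly from range(vocab_size).
    The real tool may use a different distribution and writes a
    binary file rather than returning Python lists.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same data
    return [
        [rng.randrange(vocab_size) for _ in range(doc_len)]
        for _ in range(num_docs)
    ]


# Mirrors the flags in the example invocation: -d 1000 -l 10 -v 100
collection = generate_collection(num_docs=1000, doc_len=10, vocab_size=100)
```

With uniform sampling the number of distinct terms is at most the vocabulary size; it is not guaranteed that every term appears.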
The testbench is currently used only to measure and profile indexing. It takes a collection generated by
collection-generator as input: