A library and application built on top of Lucene, using trec_eval to evaluate queries for the CACM collection

# LuceneEval

A library and application built on top of Lucene. It provides convenience methods to load the CACM collection and its related files, run its queries, and evaluate the results using trec_eval.

## Configuration

There is currently no external configuration; to change a setting you need to edit the code.
The file LuceneEval.java holds the configuration variables.

Default configuration is:

    DATAFILE = "data/cacm/cacm.all";
    CACM_XML = "data/results/cacm.all.xml";
    QUERYFILE = "data/cacm/query.text";
    STOPWORDLIST = "data/cacm/common_words";
    CACM_QRELS_FILE = "data/cacm/qrels.text";
    TREC_QRELS_FILE = "data/results/trec_qrels";
    TREC_SEARCHRESULTS_FILE = "data/results/trec_searchresults";
    TREC_RESULTS_FILE = "data/results/trec_results";
    RESULTS_LIMIT = 20;
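
As a rough illustration of what editing these settings might look like, here is a minimal sketch of the defaults written as Java constants. The class name `LuceneEvalDefaults`, the field types and the modifiers are assumptions made for this example; only the names and values are taken from the defaults listed above, and the actual declarations in LuceneEval.java may differ.

```java
// Hypothetical excerpt: the class name, field types and modifiers are
// assumptions; only the names and values come from the README defaults.
final class LuceneEvalDefaults {
    static final String DATAFILE = "data/cacm/cacm.all";
    static final String CACM_XML = "data/results/cacm.all.xml";
    static final String QUERYFILE = "data/cacm/query.text";
    static final String STOPWORDLIST = "data/cacm/common_words";
    static final String CACM_QRELS_FILE = "data/cacm/qrels.text";
    static final String TREC_QRELS_FILE = "data/results/trec_qrels";
    static final String TREC_SEARCHRESULTS_FILE = "data/results/trec_searchresults";
    static final String TREC_RESULTS_FILE = "data/results/trec_results";
    // Presumably the maximum number of hits kept per query.
    static final int RESULTS_LIMIT = 20;

    public static void main(String[] args) {
        System.out.println("Reading collection from " + DATAFILE);
        System.out.println("Keeping up to " + RESULTS_LIMIT + " results per query");
    }
}
```

With a layout like this, changing a setting means editing the corresponding constant and recompiling.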

## Dependencies

- Lucene - provides a Java-based indexing and search implementation, as well as spellchecking, hit highlighting and advanced analysis/tokenization capabilities
- SimpleXML - high-performance XML serialization and configuration framework for Java
- trec_eval - the standard tool used for evaluating an ad hoc retrieval run, given the results file and a standard set of judged results
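
To make the trec_eval dependency more concrete, the sketch below shows the two plain-text line formats trec_eval reads: a qrels line (`query iteration doc relevance`) and a results line (`query Q0 doc rank score run_tag`). The class `TrecFormatSketch`, its methods and the run tag `luceneeval` are hypothetical names for this example, not LuceneEval's own code. The sketch also assumes that each CACM qrels line starts with a query number and a document number, and that every listed pair counts as relevant.

```java
import java.util.Locale;

// Hypothetical helper, not part of LuceneEval.
final class TrecFormatSketch {

    // Assumed CACM qrels line: "<query> <doc> 0 0", e.g. "01 1410  0 0".
    // trec_eval qrels line:    "<query> <iteration> <doc> <relevance>".
    static String cacmQrelToTrec(String cacmLine) {
        String[] fields = cacmLine.trim().split("\\s+");
        if (fields.length < 2) {
            throw new IllegalArgumentException("Malformed qrels line: " + cacmLine);
        }
        int queryId = Integer.parseInt(fields[0]); // drops the leading zero
        int docId = Integer.parseInt(fields[1]);
        // Assumption: every pair listed in the CACM file is a relevant one.
        return queryId + " 0 " + docId + " 1";
    }

    // trec_eval results line: "<query> Q0 <doc> <rank> <score> <run tag>".
    static String trecResultLine(int queryId, String docId, int rank,
                                 double score, String runTag) {
        return String.format(Locale.ROOT, "%d Q0 %s %d %.4f %s",
                queryId, docId, rank, score, runTag);
    }

    public static void main(String[] args) {
        System.out.println(cacmQrelToTrec("01 1410  0 0"));
        System.out.println(trecResultLine(1, "1410", 1, 7.25, "luceneeval"));
    }
}
```

Files in these formats are what trec_eval takes on its command line, qrels file first and results file second, for example `trec_eval data/results/trec_qrels data/results/trec_searchresults` if the default paths are used.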

## License

LuceneEval by Ivan Kanakarakis is licensed under the GNU GPLv3.
See COPYING for details.