A Java implementation of a Lucene-based search engine. The corpus is a collection of news articles aggregated from four sources:
- Financial Times Limited (1991, 1992, 1993, 1994)
- Federal Register (1994)
- Foreign Broadcast Information Service (1996)
- Los Angeles Times (1989, 1990)
$ git clone https://github.com/httpdaniel/CorpusSE.git
$ cd CorpusSE
$ mvn package
$ java -cp target/CorpusSE-1.0-SNAPSHOT.jar CreateIndex
$ java -cp target/CorpusSE-1.0-SNAPSHOT.jar CorpusSearch

The results are written to a file, "Results.txt", in the corpus folder.

$ cd corpus
$ ./trec_eval-9.0.7/trec_eval qrels.assignment2.part1 Results.txt
$ ./trec_eval-9.0.7/trec_eval -m map -m recall qrels.assignment2.part1 Results.txt
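
For reference, trec_eval reads a results file with six whitespace-separated columns per line: query ID, the literal `Q0`, document ID, rank, score, and a run tag. The sketch below shows how one such line could be formatted in Java. It assumes Results.txt follows this standard layout; the class name `TrecResultLine`, the helper `format`, and the sample values (`401`, `DOC-1`, `demo`) are hypothetical and not taken from this repository.

```java
import java.util.Locale;

class TrecResultLine {

    // Hypothetical helper (not part of CorpusSE): builds one line in the
    // six-column layout that trec_eval expects:
    //   <queryId> Q0 <docId> <rank> <score> <runTag>
    static String format(String queryId, String docId, int rank, double score, String runTag) {
        // Locale.ROOT keeps the decimal separator a '.' regardless of system locale.
        return String.format(Locale.ROOT, "%s Q0 %s %d %.4f %s",
                queryId, docId, rank, score, runTag);
    }

    public static void main(String[] args) {
        // Illustrative values only; real IDs come from the topics and the index.
        System.out.println(format("401", "DOC-1", 1, 12.3456, "demo"));
    }
}
```

Lines for each query are usually written in descending score order, with ranks assigned in that same order.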