Skip to content

dipanshu124/xapian-evaluation

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

93 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Evaluation
==========

Evaluation module is one of essential module for a search engine to be used in academia. This module eases to calculate various metric precision, recall, precision @ rank, Precision @ recall which validates implementation and check the engine on result presented on literature. Xapian being an open source engine have implemented evaluation module to help people from academia.

Buiding Evaluation Module
-------------------------

User needs to build Evaluation module on their system after cloning the source repository. This can be done using
::

 autoreconf -fiv
 ./configure
 make
 sudo make install

Using Evaluation Module
------------------------

Evaluation metric currently in xapian
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

 * Mean Average precision(MAP)
 * Mean Relevance Precision(MRP)
 * Precision at Rank
 * Precision at Recall
 * Recall

Indexing Collection with evaluation Module
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

User can index a collection using
::

 trec_index config

Find a sample config file in current directory names 'config'

User needs to compulsory configure textfile, db and stopsfile parameters to be able to use index a collection. User needs to set indexbigram parameter to true to be able to index bigrams while indexing to use bigram language model weighting scheme.

Searching Topics file and creating run with evaluation module
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

User can search a collection using index build to create a run with queries in topic file.
::

 trec_query config
 trec_search config

User need to compulsory configure queryfile, topicfile and stopsfile to search using a topic file, needs to set queryparserbigram to true to be able to add bigrams in query object formed by bigram.

Evaluating the run
^^^^^^^^^^^^^^^^^^^

User needs to run to get the evaluation result for your run.
::

 trec_adhoceval config

It's compulsory to set relfile and runfile for using this option.

Configuring evaluation module
------------------------------

Configurable Parameters:
^^^^^^^^^^^^^^^^^^^^^^^^

	 - textfile     - (INPUT) path/filename of text file to be indexed for evaluation
	 - language     - (INPUT) corpus language to be indexed for stemming
	 - db           - (OUTPUT) path of database which is index of above textfile
	 - querytype    - (INPUT) type of query
	 - queryfile    - (OUTPUT) path/filename of query file to be ran and searched
	 - resultsfile  - (OUTPUT) path/filename of results file(run file)
	 - transfile    - (INPUT) path/filename of transaction file
	 - noresults    - (INPUT) no of results to save in results log file
	 - topicfile    - (INPUT) path/filename of topic file for which relevance judgement file is available
	 - topicfields  - (INPUT) fields of topic to use from topic file in evaluation
	 - relfile      - (INPUT) path/filename of relevance judgements file
	 - runname      - (INPUT) name of the evaluation run
	 - nterms       - (INPUT) no of terms to pick from the topic
	 - stopsfile    - (INPUT) name of the stopword file
	 - evaluationfiles - (OUTPUT) path/filename of the evaluation files(where evaluation result is stored).
	 - indexbigram  - (INPUT) Index Bigram in the index.
	 - queryparsebigram -(INPUT) Parse and add bigrams to the Query.
	 - weightingscheme - (INPUT) Specifies the weighting scheme and any parameters (passed to ``Xapian::Weight::create()``)

About

Evaluation Module for Xapian

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • C++ 85.6%
  • M4 6.9%
  • C 6.9%
  • Makefile 0.6%