More detailed results and the implementation approach are described in report.pdf. All computed results for the various corpora are located in the results folder.
First, install the required dependencies listed in the requirements.txt file. The data folder contains all the required corpora, and the corresponding questions are located in the questions_df.csv file. The evaluation can be run using the command:
python3 src/retrieval_evaluation_pipeline.py

Before running, you may want to modify the parameters of RetrievalEvaluationPipeline in the retrieval_evaluation_pipeline.py file, such as the embedding function, the corpus, and the label for the questions:
rep = RetrievalEvaluationPipeline(
    embedding_function,
    corpus_file='data/chatlogs.md',
    questions_label='chatlogs'
)
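
To evaluate several corpora in one go, the corpus_file and questions_label arguments could be generated from the contents of the data folder. The sketch below is only an illustration: the helper corpus_configs is hypothetical and not part of the repository, and it assumes that every corpus is a .md file whose name without the extension equals its questions label (as with data/chatlogs.md and 'chatlogs' above). Check that assumption against questions_df.csv before relying on it.

```python
from pathlib import Path


def corpus_configs(data_dir="data", suffix=".md"):
    """Yield one dict of keyword arguments per corpus file in data_dir.

    Hypothetical helper. Assumes the questions label equals the file
    name without its extension, e.g. data/chatlogs.md -> 'chatlogs'.
    """
    # Sort the matches so the corpora are processed in a stable order.
    for path in sorted(Path(data_dir).glob(f"*{suffix}")):
        yield {
            "corpus_file": path.as_posix(),
            "questions_label": path.stem,
        }


if __name__ == "__main__":
    # Print the arguments that would be passed for each corpus found.
    for kwargs in corpus_configs():
        print(kwargs)
```

Each yielded dict can then be unpacked into the constructor shown above, for example RetrievalEvaluationPipeline(embedding_function, **kwargs).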