# Step 4 - Evaluate Retrieved Results

This notebook evaluates the retrieval results generated in the previous step using two metrics: Mean Reciprocal Rank (MRR) and normalized Discounted Cumulative Gain (nDCG).

More info:
- Discounted Cumulative Gain: https://en.wikipedia.org/wiki/Discounted_cumulative_gain
- Mean Reciprocal Rank: https://en.wikipedia.org/wiki/Mean_reciprocal_rank

Input:
1. database file
2. answer file
3. result file

Output:
1. MRR and mean nDCG scores for each result file; larger values mean better results.

In [1]:
import os
import json
from evaluation import Evaluation_MRR, Evaluation_nDCG
%pylab inline

Populating the interactive namespace from numpy and matplotlib


In [2]:
"""
Make sure these variables are set correctly.
database_file: path to the database file generated in the previous steps
answer_file: path to the answer file
result_file: a list of paths to result files
"""

database_file = './data source/The Last Of Us/output/The_Last_Of_Us_top3keywords.json'
answer_file = './data source/The Last Of Us/output/correct_answers.json'
result_file = ['./data source/The Last Of Us/output/The_Last_Of_Us_results.json',
              './data source/The Last Of Us/output/The_Last_Of_Us_ma1_results.json',
              './data source/The Last Of Us/output/The_Last_Of_Us_ma3_results.json',
              './data source/The Last Of Us/output/The_Last_Of_Us_ma5_results.json',
              './data source/The Last Of Us/output/The_Last_Of_Us_top3keywords_results.json',
              './data source/The Last Of Us/output/The_Last_Of_Us_top3keywords_ma1_results.json',
              './data source/The Last Of Us/output/The_Last_Of_Us_top3keywords_ma3_results.json',
              './data source/The Last Of Us/output/The_Last_Of_Us_top3keywords_ma5_results.json']
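
Since the cell above asks that the paths be set correctly, a quick existence check before running the evaluation can save a long wait on a typo. The following is a minimal sketch; `missing_paths` is a hypothetical helper, not part of the `evaluation` module, and the demo path is made up for illustration.

```python
import os

def missing_paths(paths):
    """Return the subset of `paths` that does not exist on disk."""
    return [p for p in paths if not os.path.exists(p)]

# Demo with one path that exists (the current directory) and one that
# is assumed not to exist.
demo = [os.getcwd(), "/definitely/not/here.json"]
print(missing_paths(demo))
```

In this notebook the call would look like `missing_paths([database_file, answer_file, *result_file])`; an empty list means every input file was found.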

In [3]:
# Score every result file with Mean Reciprocal Rank and keep the best one.
tester0 = Evaluation_MRR(database_file)
best_score = 0
best_result = ""
for file in result_file:
    print(f'\nEvaluating result file: {file}\n')
    score = tester0.evaluation_mean_rec_rank(file, answer_file)
    # Strict comparison: on a tie, the earlier file in the list is kept.
    if score > best_score:
        best_score = score
        best_result = file
    print(f'MRR: {score}')
print(f'\nHighest Performance:\n score = {best_score}, \n result = {best_result}\n')


Evaluating result file: ./data source/The Last Of Us/output/The_Last_Of_Us_results.json

number of results: 8
query: daughter got shot dead
query: fireflies
query: giraffe
query: infected girl
******Hit! Found an answer at rank 0!******
query: clickers are hear
query: brother tommy
query: Ellie riding a horse
******Hit! Found an answer at rank 1!******
query: weapon bow and arrow
******Hit! Found an answer at rank 3!******
MRR: 0.21875

Evaluating result file: ./data source/The Last Of Us/output/The_Last_Of_Us_ma1_results.json

number of results: 8
query: daughter got shot dead
query: fireflies
query: giraffe
query: infected girl
******Hit! Found an answer at rank 0!******
query: clickers are hear
query: brother tommy
query: Ellie riding a horse
******Hit! Found an answer at rank 1!******
query: weapon bow and arrow
******Hit! Found an answer at rank 3!******
MRR: 0.21875

Evaluating result file: ./data source/The Last Of Us/output/The_Last_Of_Us_ma3_results.json

number of results: 8
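
The MRR values in the log above can be reproduced by hand. The sketch below assumes that the logged ranks are 0-based and that a query with no hit contributes zero; both assumptions are inferred from the log, not taken from the source of `Evaluation_MRR`. `mean_reciprocal_rank` is a hypothetical helper name.

```python
def mean_reciprocal_rank(first_hit_ranks, num_queries):
    """Mean reciprocal rank from 0-based ranks of the first hit per query.

    Queries without a hit are absent from `first_hit_ranks` and add 0.
    """
    return sum(1.0 / (rank + 1) for rank in first_hit_ranks) / num_queries

# First result file: hits at ranks 0, 1 and 3 across 8 queries.
print(mean_reciprocal_rank([0, 1, 3], num_queries=8))  # 0.21875
```

That is (1/1 + 1/2 + 1/4) / 8 = 0.21875, which matches the `MRR: 0.21875` line logged for the first two result files.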

In [4]:
# Score every result file with mean nDCG and keep the best one.
tester1 = Evaluation_nDCG(database_file)
best_score = 0
best_result = ""
for file in result_file:
    print(f'\nEvaluating result file: {file}\n')
    score = tester1.evaluation_mean_nDCG(file, answer_file)
    # Strict comparison: on a tie, the earlier file in the list is kept.
    if score > best_score:
        best_score = score
        best_result = file
    # The value labelled 'DCG' here is the mean nDCG over all queries.
    print(f'DCG: {score}')

print(f'\nHighest Performance:\n score = {best_score}, \n result = {best_result}')


Evaluating result file: ./data source/The Last Of Us/output/The_Last_Of_Us_results.json

query: daughter got shot dead
DCG=0.0, IDCG=4.543559338088346
query: fireflies
DCG=0.0, IDCG=0.0
query: giraffe
DCG=0.0, IDCG=4.543559338088346
query: infected girl
******Hit! Found an answer at rank 1!******
******Hit! Found an answer at rank 2!******
******Hit! Found an answer at rank 3!******
DCG=2.1309297535714578, IDCG=4.543559338088346
query: clickers are hear
DCG=0.0, IDCG=0.0
query: brother tommy
DCG=0.0, IDCG=2.1309297535714578
query: Ellie riding a horse
******Hit! Found an answer at rank 2!******
******Hit! Found an answer at rank 3!******
******Hit! Found an answer at rank 4!******
******Hit! Found an answer at rank 5!******
******Hit! Found an answer at rank 10!******
******Hit! Found an answer at rank 11!******
DCG=2.23752394519728, IDCG=4.543559338088346
query: weapon bow and arrow
******Hit! Found an answer at rank 4!******
******Hit! Found an answer at rank 5!******
******Hit! Fou

DCG=3.552988821983009, IDCG=4.543559338088346
DCG: 0.284542602652628

Highest Performance:
 score = 0.284542602652628, 
 result = ./data source/The Last Of Us/output/The_Last_Of_Us_top3keywords_ma5_results.json
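
The per-query `DCG` and `IDCG` values logged for the first result file are consistent with binary relevance, 1-based ranks, a gain of 1/log2(rank + 1), and a cutoff at rank 10. The sketch below works under those assumptions, which are inferred from the log and not from the source of `Evaluation_nDCG`; `dcg_at_k` and `ideal_dcg` are hypothetical helper names.

```python
import math

def dcg_at_k(hit_ranks, k=10):
    """DCG with binary gains; `hit_ranks` are 1-based ranks of relevant items."""
    return sum(1.0 / math.log2(rank + 1) for rank in hit_ranks if rank <= k)

def ideal_dcg(num_relevant, k=10):
    """DCG of a perfect ranking: all relevant items first, capped at k."""
    return dcg_at_k(range(1, min(num_relevant, k) + 1), k)

# "infected girl": hits at ranks 1, 2, 3 (log shows DCG=2.1309...)
print(dcg_at_k([1, 2, 3]))
# "Ellie riding a horse": the hit at rank 11 falls outside the cutoff
# (log shows DCG=2.2375...)
print(dcg_at_k([2, 3, 4, 5, 10, 11]))
# Ten relevant items in the top ten (log shows IDCG=4.5435...)
print(ideal_dcg(10))
```

Dividing a query's DCG by its IDCG gives its nDCG; for "infected girl" that is roughly 2.13 / 4.54, or about 0.47. How the evaluation treats queries whose IDCG is 0.0 is not visible in the log, so this sketch does not cover that case.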
