In [26]:
from math import log

## Evaluation of text retrieval systems

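Evaluation follows the Cranfield methodology, as is also common in data science: a reusable test collection of documents, queries, and relevance judgements is used to compare systems. Results are scored with precision and recall at a given cutoff length.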

* __Precision (P)__ = |relevant documents retrieved|/|total retrieved documents|, aka __are the retrieved results all relevant?__
* __Recall (R)__ = |relevant documents retrieved|/|total relevant documents|, aka __have all the relevant documents been retrieved?__
* __Fb-measure__ = ((b^2 + 1) * P * R) / (b^2 * P + R), combines P and R via a weighted harmonic mean (b > 1 weights recall more heavily; b = 1 gives F1).
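
The three measures above can be sketched for a single query as follows. This is a minimal illustration, not part of the original notebook: the function name `precision_recall_f`, the document ids, and the choice to return 0.0 when a denominator is empty are all assumptions.

```python
def precision_recall_f(retrieved, relevant, b=1.0):
    """Return (precision, recall, F_b) for a retrieved set against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents retrieved
    # Assumption: an empty denominator yields 0.0 instead of raising.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    denominator = b ** 2 * precision + recall
    f_measure = (b ** 2 + 1) * precision * recall / denominator if denominator else 0.0
    return precision, recall, f_measure

# Hypothetical query: 4 documents retrieved, 2 of them among the 5 relevant ones.
p, r, f1 = precision_recall_f(["d1", "d2", "d3", "d4"], ["d1", "d3", "d7", "d8", "d9"])
print(p, r, f1)
```

Here precision is 2/4 and recall is 2/5; passing a different `b` shifts the balance of the F-measure between the two.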

### How to Summarize a Ranking

#### Average precision

Average precision is the sum of the precision values at each rank where a relevant document is retrieved, divided by the total number of relevant documents in the collection. It uses binary relevance judgements (0: not relevant, 1: relevant).


In [27]:
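def average_precision(ranked_list, total_relevant_doc):
    """Average precision of a ranked list of binary relevance judgements."""
    sum_precision = 0
    current_relevant_doc = 0
    for index, doc_relevance in enumerate(ranked_list):
        if doc_relevance != 0:
            current_relevant_doc += 1
            # Precision at this rank: relevant documents seen so far / rank.
            sum_precision += current_relevant_doc / (index + 1)
    # Divide by all relevant documents in the collection, so relevant
    # documents that were never retrieved contribute a precision of 0.
    return sum_precision / total_relevant_doc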

In [28]:
ranked_list_A = [1,1,0,0,0,0,0,0,0,0]
average_precision(ranked_list_A, 5)

0.4

In [29]:
ranked_list_B = [0,1,0,0,1,0,0,0,0,1]
average_precision(ranked_list_B, 5)

0.24
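
To see where the 0.24 comes from, it can help to list the precision at each rank that holds a relevant document. The helper below is a small sketch added for illustration; its name, `precision_at_relevant_ranks`, is an assumption and not part of the original notebook.

```python
def precision_at_relevant_ranks(ranked_list):
    """Return (rank, precision) pairs for each rank holding a relevant document."""
    contributions = []
    relevant_seen = 0
    for rank, doc_relevance in enumerate(ranked_list, start=1):
        if doc_relevance != 0:
            relevant_seen += 1
            contributions.append((rank, relevant_seen / rank))
    return contributions

ranked_list_B = [0, 1, 0, 0, 1, 0, 0, 0, 0, 1]
for rank, precision in precision_at_relevant_ranks(ranked_list_B):
    print(rank, precision)
```

The relevant documents sit at ranks 2, 5, and 10, giving precisions of 1/2, 2/5, and 3/10. Their sum is 1.2, and dividing by the 5 relevant documents in the collection gives 0.24.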

#### Normalized Discounted Cumulative Gain (nDCG)

Uses multi-level relevance judgements (1: not relevant, up to N: most relevant, with N > 2).
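
As a reading aid, the quantity computed by the `dcg` and `ndcg` functions in this section can be written as follows, where $rel_i$ is the relevance of the document at rank $i$ and $n$ is the length of the ranked list. The symbols are introduced here for illustration. The natural logarithm matches `math.log` as used in the code; other formulations commonly use a base-2 logarithm.

$$
\mathrm{DCG} = rel_1 + \sum_{i=2}^{n} \frac{rel_i}{\ln i},
\qquad
\mathrm{nDCG} = \frac{\mathrm{DCG}}{\mathrm{DCG}_{\text{ideal}}}
$$

In this notebook, $\mathrm{DCG}_{\text{ideal}}$ is the DCG of a list of $n$ documents that all have the maximum relevance.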

In [36]:
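def dcg(ranked_list):
    """Discounted cumulative gain of a list of graded relevance judgements."""
    dcg_value = 0
    for index, doc_relevance in enumerate(ranked_list):
        if index == 0:
            # The top-ranked document is not discounted.
            dcg_value += doc_relevance
        else:
            # Discount by the log of the 1-based rank. math.log is the natural
            # log; log2 is another common choice and gives different values.
            dcg_value += doc_relevance / log(index + 1)
    return dcg_value

def ndcg(ranked_list, max_relevance=3):
    """DCG normalized by the DCG of a list made only of max_relevance documents."""
    # Assumes the collection holds at least len(ranked_list) documents
    # judged max_relevance, so this list is the ideal ranking.
    optimal_dcg = dcg([max_relevance] * len(ranked_list))
    return dcg(ranked_list) / optimal_dcg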

In [40]:
ranked_list = [3,2,1,1,3,1,1,2,1]
print('DCG: {}'.format(dcg(ranked_list)))
print('nDCG: {}'.format(ndcg(ranked_list)))

DCG: 11.869906908688476
nDCG: 0.5902216528493285
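
The normalization above assumes that every position could be filled by a document of maximum relevance. A common variant, sketched here as an assumption rather than as the notebook's method, takes the ideal ranking to be the same judgements sorted from most to least relevant and uses a base-2 logarithm for the discount. The names `dcg_log2` and `ndcg_sorted_ideal` are hypothetical, and the resulting values are not comparable with the ones printed above.

```python
from math import log2

def dcg_log2(ranked_list):
    """DCG with a base-2 log discount; the first rank is not discounted."""
    return sum(
        relevance if rank == 1 else relevance / log2(rank)
        for rank, relevance in enumerate(ranked_list, start=1)
    )

def ndcg_sorted_ideal(ranked_list):
    """Normalize by the DCG of the same judgements sorted best-first."""
    ideal_dcg = dcg_log2(sorted(ranked_list, reverse=True))
    # Assumption: a list with no gain at all scores 0.0 instead of raising.
    return dcg_log2(ranked_list) / ideal_dcg if ideal_dcg else 0.0

ranked_list = [3, 2, 1, 1, 3, 1, 1, 2, 1]
print('nDCG (sorted ideal, log2): {}'.format(ndcg_sorted_ideal(ranked_list)))
```

With this variant, a list that is already sorted from most to least relevant scores exactly 1, which makes the score easier to interpret when few highly relevant documents exist.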
