## Implementing IR evaluation metrics

In [33]:
import numpy as np
from typing import List

### Precision

Precision is the proportion of retrieved documents (RET) that are relevant (REL).

$\mathcal{P} = \frac{|\text{RET} \: \cap \: \text{REL}|}{|\text{RET}|}$

In [10]:
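def precision(relevance_query: List[int]) -> float:

    assert set(relevance_query).issubset((0,1)), "Only binary values (0, 1) allowed."
    assert len(relevance_query) > 0, "At least one retrieved document is required."

    # Relevant retrieved documents divided by all retrieved documents.
    return sum(relevance_query) / len(relevance_query)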

In [14]:
precision([0, 0, 0, 1])

0.25
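
The functions in this notebook take a list of binary relevance labels in ranked order. As a minimal sketch of how such a list might be built, assume a ranked list of retrieved document IDs and a set of relevant IDs. The helper name `to_binary_relevance` and the IDs are hypothetical and not part of the original notebook; the sketch also assumes the ranked list contains no duplicate IDs.

```python
from typing import Hashable, Iterable, List, Set


def to_binary_relevance(retrieved: Iterable[Hashable], relevant: Set[Hashable]) -> List[int]:
    """Map a ranked list of retrieved IDs to 0/1 relevance labels (hypothetical helper)."""
    return [1 if doc_id in relevant else 0 for doc_id in retrieved]


# Illustrative IDs only: RET is the ranked result list, REL the judged-relevant set.
ret = ["d7", "d2", "d9", "d4"]
rel = {"d4", "d5"}

labels = to_binary_relevance(ret, rel)
# |RET ∩ REL| / |RET|, computed directly from the sets.
precision_from_sets = len(set(ret) & rel) / len(ret)

print(labels)               # [0, 0, 0, 1]
print(precision_from_sets)  # 0.25
```

The label list `[0, 0, 0, 1]` is the same input used in the cell above, so the set-based formula and the list-based function agree on 0.25.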

### Precision at K

Precision at K is the proportion of the top-K retrieved documents that are relevant.

In [24]:
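def precision_at_k(relevance_query: List[int], k: int) -> float:

    assert set(relevance_query).issubset((0,1)), "Only binary values (0, 1) allowed."
    assert k > 0, "K must be greater than or equal to 1."
    assert len(relevance_query) > 0, "At least one retrieved document is required."

    # If fewer than K documents were retrieved, divide by the number actually retrieved.
    top_k = relevance_query[:k]
    return sum(top_k) / len(top_k)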

In [25]:
precision_at_k([0, 0, 0, 1], 1)

0.0

### Recall at K

Recall at K is the proportion of all relevant documents that are retrieved in the top K.

In [26]:
def recall_at_k(relevance_query: List[int], k: int, num_relevant_docs: int) -> float:

    assert set(relevance_query).issubset((0,1)), "Only binary values (0, 1) allowed."
    assert k > 0, "K must be greater than or equal to 1."
    assert num_relevant_docs > 0, "Number of relevant docs must be greater than or equal to 1."

    # Relevant documents in the top K divided by all relevant documents for the query.
    return sum(relevance_query[:k]) / num_relevant_docs

In [27]:
recall_at_k([0, 0, 0, 1], 1, 4)

0.0
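
To see how the two metrics move as K grows, the following sketch tabulates both for a single ranking. The ranking and the total number of relevant documents are illustrative assumptions, and the two functions are restated without their assertions so the block runs on its own.

```python
from typing import List


def precision_at_k(relevance_query: List[int], k: int) -> float:
    top_k = relevance_query[:k]
    return sum(top_k) / len(top_k)


def recall_at_k(relevance_query: List[int], k: int, num_relevant_docs: int) -> float:
    return sum(relevance_query[:k]) / num_relevant_docs


ranking = [1, 0, 1, 0]  # illustrative labels, not taken from the notebook
total_relevant = 2      # assumed number of relevant documents for the query

# One (K, precision, recall) row per cutoff.
rows = [
    (k, precision_at_k(ranking, k), recall_at_k(ranking, k, total_relevant))
    for k in range(1, len(ranking) + 1)
]
for k, p, r in rows:
    print(f"K={k}  P@K={p:.2f}  R@K={r:.2f}")
```

Recall can only stay flat or rise as K increases, while precision may go up or down depending on whether the newly included document is relevant.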

### Average precision

Average precision is the mean of the **Precision at K** values, computed by advancing **K** to each rank at which a relevant document is found. The computation stops once a **recall** of 1 is reached.

In [28]:
def average_precision(relevance_query: List[int]) -> float:

    assert set(relevance_query).issubset((0,1)), "Only binary values (0, 1) allowed."
    
    cumulative_precision = 0
    relevant_count = 0

    for observation_count, relevance in enumerate(relevance_query, 1):
        if relevance:
            # Precision at the rank of this relevant document.
            relevant_count += 1
            cumulative_precision += relevant_count / observation_count

    # No relevant documents: return 0.0 instead of dividing by zero.
    if relevant_count == 0:
        return 0.0

    return cumulative_precision / relevant_count

In [29]:
average_precision([0, 1, 0, 1, 1, 1, 1])

0.5961904761904762
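
The result above can be unpacked by listing the precision at each rank where a relevant document appears. This is a sketch only: `average_precision_trace` is a hypothetical helper, and returning 0.0 for a list with no relevant documents is an assumed convention.

```python
from typing import List, Tuple


def average_precision_trace(relevance_query: List[int]) -> Tuple[List[Tuple[int, float]], float]:
    """Return (rank, precision at that rank) for each relevant hit, and their mean."""
    hits = []
    relevant_count = 0
    for rank, relevance in enumerate(relevance_query, 1):
        if relevance:
            relevant_count += 1
            hits.append((rank, relevant_count / rank))
    # Assumed convention: AP is 0.0 when the list has no relevant document.
    ap = sum(p for _, p in hits) / len(hits) if hits else 0.0
    return hits, ap


hits, ap = average_precision_trace([0, 1, 0, 1, 1, 1, 1])
for rank, p in hits:
    print(f"rank {rank}: precision = {p:.4f}")
print(f"AP = {ap:.4f}")  # AP = 0.5962
```

The relevant documents sit at ranks 2, 4, 5, 6 and 7, giving precisions 1/2, 2/4, 3/5, 4/6 and 5/7, whose mean is the value shown above.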

### Mean Average Precision (MAP)

The mean of the **Average Precision** values computed over several queries.

In [30]:
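def mean_average_precision(relevance_queries: List[List[int]]) -> float:

    assert len(relevance_queries) > 0, "At least one query is required."
    assert all(set(sublist).issubset((0, 1)) for sublist in relevance_queries), "Only binary values (0, 1) allowed in all query results."

    # Mean of the per-query average precision values.
    average_precisions = [average_precision(relevance_query) for relevance_query in relevance_queries]
    return sum(average_precisions) / len(average_precisions)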

In [31]:
mean_average_precision([[1, 0, 1], [0, 1, 1]])

0.7083333333333333

### Discounted Cumulative Gain at K

Let $\text{REL}_i$ be the relevance associated with the document at rank $i$, $1 \leq i \leq K$. We define:

$\text{DCG@K} = \sum_{i = 1}^{K} \frac{\text{REL}_i}{\log_2 (\max (i,\: 2))}$

In [36]:
def discounted_cumulative_gain(relevance_query: List[int], k: int) -> float:
    
    assert all(x >= 0 for x in relevance_query), "All elements must be integers greater than or equal to 0."
    assert k > 0, "K must be greater than or equal to 1."

    # max(i, 2) keeps the discount at log2(2) = 1 for ranks 1 and 2.
    return sum(relevance / np.log2(max(i, 2)) for i, relevance in enumerate(relevance_query[:k], 1))

In [37]:
discounted_cumulative_gain([4, 4, 3, 0, 0, 1, 3, 3, 3, 0], 6)

10.279642067948915
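
As a check on the result above, the sum can be expanded term by term for the first six relevance grades $(4, 4, 3, 0, 0, 1)$. The discount is $\log_2 2 = 1$ at ranks 1 and 2, and the values are rounded to four decimals:

$\frac{4}{1} + \frac{4}{1} + \frac{3}{\log_2 3} + \frac{0}{\log_2 4} + \frac{0}{\log_2 5} + \frac{1}{\log_2 6} \approx 8 + 1.8928 + 0.3869 \approx 10.2796$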

### Normalized Discounted Cumulative Gain at K

Given the DCG@K of a query, divide it by the best possible DCG@K for that query, that is, the DCG@K of the same relevance grades sorted from highest to lowest.

In [38]:
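def normalized_discounted_cumulative_gain(relevance_query: List[int], k: int) -> float:

    assert all(x >= 0 for x in relevance_query), "All elements must be integers greater than or equal to 0."
    assert k > 0, "K must be greater than or equal to 1."

    # Ideal ranking: the same relevance grades sorted from highest to lowest.
    ideal_query = sorted(relevance_query, reverse=True)
    ideal_dcg = discounted_cumulative_gain(ideal_query, k)

    # If every grade is 0 the ideal DCG is 0; return 0.0 instead of dividing by zero.
    if ideal_dcg == 0:
        return 0.0

    return discounted_cumulative_gain(relevance_query, k) / ideal_dcg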

In [40]:
normalized_discounted_cumulative_gain([4,4,3,0,0,1,3,3,3,0], 6)

0.7424602308163405
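
To make the normalization step visible, the sketch below prints the ideal ordering alongside both gain values for the same input. It uses the standard-library `math.log2` instead of NumPy so that it runs on its own, and `dcg_at_k` is a hypothetical stand-in for the function defined earlier, with the same discount.

```python
import math
from typing import List


def dcg_at_k(relevance_query: List[int], k: int) -> float:
    # Same discount as the formula in this notebook: log2(max(rank, 2)).
    return sum(rel / math.log2(max(rank, 2)) for rank, rel in enumerate(relevance_query[:k], 1))


observed = [4, 4, 3, 0, 0, 1, 3, 3, 3, 0]
ideal = sorted(observed, reverse=True)  # best possible ordering of the same grades

dcg = dcg_at_k(observed, 6)
idcg = dcg_at_k(ideal, 6)

print(ideal)                 # [4, 4, 3, 3, 3, 3, 1, 0, 0, 0]
print(round(dcg, 4))         # 10.2796
print(round(idcg, 4))
print(round(dcg / idcg, 4))  # 0.7425
```

The observed ranking loses gain because three documents with grade 3 appear below rank 6, while the ideal ordering places them inside the top 6.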