# Understanding the Evaluation Metric
This notebook follows the structure of [this great notebook](https://www.kaggle.com/pestipeti/explanation-of-map5-scoring-metric).

# Mean Average Precision (MAP)
Submissions are evaluated according to the Mean Average Precision @ 12 (MAP@12):

$$MAP@12 = {1 \over U} \sum_{u=1}^{U} {1 \over min(m,12)} \sum_{k=1}^{min(n,12)}P(k) \times rel(k)$$

where `U` is the number of users, `P(k)` is the precision at cutoff `k`, `rel(k)` is an indicator function equal to 1 if the item at rank `k` is a relevant (correct) label and 0 otherwise, `n` is the number of predictions per user, and `m` is the number of ground-truth items for that user. <br>
We will build towards the final function step by step.

In [None]:
import numpy as np

gt = np.array(['a', 'b', 'c', 'd', 'e'])

preds1 = np.array(['b', 'c', 'a', 'd', 'e'])
preds2 = np.array(['a', 'b', 'c', 'd', 'e'])
preds3 = np.array(['f', 'b', 'c', 'd', 'e'])
preds4 = np.array(['a', 'f', 'e', 'g', 'b'])
preds5 = np.array(['a', 'f', 'c', 'g', 'b'])
preds6 = np.array(['d', 'c', 'b', 'a', 'e'])

# Precision

Precision is the positive predictive value. In classification, precision is the number of true positives (positive predictions that are actually positive) divided by the number of all positive predictions made by the model (the sum of true positives and false positives).

$$ P = { \#\ of\ correct\ positive\ predictions\over \#\ of\ all\ positive\ predictions  } = {TP \over (TP + FP)}$$

<br>
In information retrieval, precision is the fraction of the retrieved documents that are relevant to the user.

$${\displaystyle {\text{P}}={\frac {|\{{\text{relevant documents}}\}\cap \{{\text{retrieved documents}}\}|}{|\{{\text{retrieved documents}}\}|}}}$$
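
The set-based definition above can be sketched in a few lines of plain Python. The helper name `set_precision` and the example documents are made up for illustration; they are not part of the competition code.

```python
def set_precision(relevant, retrieved):
    """Fraction of retrieved items that are relevant (hypothetical helper)."""
    retrieved = set(retrieved)
    if not retrieved:
        # Assumption: treat an empty retrieval as precision 0 to avoid dividing by zero
        return 0.0
    return len(set(relevant) & retrieved) / len(retrieved)


# Illustrative data: 3 of the 5 retrieved documents are relevant
relevant_docs = {"a", "b", "c", "d", "e"}
retrieved_docs = ["a", "f", "e", "g", "b"]
precision = set_precision(relevant_docs, retrieved_docs)
print(precision)
```

Here `a`, `e` and `b` are relevant, so the precision is 3 out of 5.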

# Precision at K

Precision at cutoff `k`, `P(k)`, is simply the precision calculated by considering only the subset of your predictions from rank 1 through `k`. <br>
To calculate it, we take the top `k` recommendations and compute their precision against the ground truth.<br>
Example 1:
If `gt=[a,b,c,d,e]` and `pred=[b,c,a,d,e]`, then for `P@1` we only take the first recommendation from `pred`, i.e. `b`, and find its precision with respect to `gt`.
$${\displaystyle {\text{P(1)}}={\frac {|\{{\text{gt}}\}\cap \{{\text{pred[:1]}}\}|}{|\{{\text{pred[:1]}}\}|}}={\frac {\text{1}}{\text{1}}}}$$
<br>
Example 2:
If `gt=[a,b,c,d,e]` and `pred=[f,b,c,d,e]`, then for `P@1` we only take the first recommendation from `pred`, i.e. `f`, and find its precision with respect to `gt`.
$${\displaystyle {\text{P(1)}}={\frac {|\{{\text{gt}}\}\cap \{{\text{pred[:1]}}\}|}{|\{{\text{pred[:1]}}\}|}}={\frac {\text{0}}{\text{1}}}}$$
<br>
Example 3:
If `gt=[a,b,c,d,e]` and `pred=[a,f,e,g,b]`, then for `P@2` we only take the top 2 recommendations from `pred`, i.e. `[a,f]`, and find their precision with respect to `gt`. The intersection of the two sets has size `1`, since only `a` is present in `gt`.
$${\displaystyle {\text{P(2)}}={\frac {|\{{\text{gt}}\}\cap \{{\text{pred[:2]}}\}|}{|\{{\text{pred[:2]}}\}|}}={\frac {\text{1}}{\text{2}}}}$$
<br>

Some more Examples:

| true  | predicted   | k  | P(k) |
|:-:|:-:|:-:|:-:|
| [a, b, c, d, e]  | [b, c, a, d, e]   | 1  | 1.0  |
| [a, b, c, d, e]  | [a, b, c, d, e]   | 1  | 1.0  |
| [a, b, c, d, e]  | [f, b, c, d, e]   | 1  | 0.0  |
| [a, b, c, d, e]  | [a, f, e, g, b]   | 2  | $\frac{1}{2}$  |
| [a, b, c, d, e]  | [a, f, c, g, b]   | 3  | $\frac{2}{3}$  |
| [a, b, c, d, e]  | [d, c, b, a, e]   | 3  | $\frac{3}{3}$  |

In [None]:
def precision_at_k(y_true, y_pred, k=12):
    """Compute precision at k for one sample.

    Parameters
    ----------
    y_true : np.array
        Array of correct recommendations (order doesn't matter)
    y_pred : np.array
        Array of predicted recommendations (order does matter)
    k : int, optional
        Number of top-ranked predictions to consider

    Returns
    -------
    score : float
        Precision at k
    """
    # Distinct items among the top-k predictions that also appear in y_true
    intersection = np.intersect1d(y_true, y_pred[:k])
    return len(intersection) / k

In [None]:
assert precision_at_k(gt, preds1, k=1) == 1.0
assert precision_at_k(gt, preds2, k=1) == 1.0
assert precision_at_k(gt, preds3, k=1) == 0.0
assert precision_at_k(gt, preds4, k=2) == 1/2
assert precision_at_k(gt, preds5, k=3) == 2/3
assert precision_at_k(gt, preds6, k=3) == 3/3
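
Calling the function once per cutoff recomputes the intersection each time. As a rough sketch, `P(k)` for every `k` can also be obtained in one pass with a cumulative sum. The name `precision_curve` is hypothetical, and the sketch assumes the predictions contain no duplicates (a duplicated hit would be counted twice here, unlike with `np.intersect1d`).

```python
import numpy as np


def precision_curve(y_true, y_pred):
    """Return P(1), ..., P(n) for one sample (hypothetical helper).

    Assumes y_pred has no duplicate items.
    """
    hits = np.isin(y_pred, y_true)            # True where the prediction is relevant
    ranks = np.arange(1, len(y_pred) + 1)     # 1, 2, ..., n
    return np.cumsum(hits) / ranks            # running hit count divided by rank


gt_demo = np.array(['a', 'b', 'c', 'd', 'e'])
preds_demo = np.array(['a', 'f', 'c', 'g', 'b'])
curve = precision_curve(gt_demo, preds_demo)
print(curve)
```

For this sample the hits fall at ranks 1, 3 and 5, so the third entry of `curve` is 2/3, matching the table above.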

# Rel at K

`rel(k)` is an indicator function equal to 1 if the item at rank `k` is a relevant (correct) label, and 0 otherwise.<br>
Example 1:
If `gt=[a,b,c,d,e]` and `pred=[b,c,a,d,e]`, then for `rel(1)` we only take the first recommendation from `pred`, i.e. `b`, and check if it's relevant, i.e. present in `gt`.
$${\displaystyle {\text{rel(1)}}=1.0}$$
<br>
Example 2:
If `gt=[a,b,c,d,e]` and `pred=[f,b,c,d,e]`, then for `rel(1)` we only take the first recommendation from `pred`, i.e. `f`, and check if it's relevant, i.e. present in `gt`.
$${\displaystyle {\text{rel(1)}}=0.0}$$
<br>
Example 3:
If `gt=[a,b,c,d,e]` and `pred=[a,f,e,g,b]`, then for `rel(2)` we only take the second recommendation from `pred`, i.e. `f`, and check if it's relevant, i.e. present in `gt`.
$${\displaystyle {\text{rel(2)}}=0.0}$$
<br>

Some more Examples:

| true  | predicted   | k  | rel(k) |
|:-:|:-:|:-:|:-:|
| [a, b, c, d, e]  | [b, c, a, d, e]   | 1  | 1.0  |
| [a, b, c, d, e]  | [a, b, c, d, e]   | 1  | 1.0  |
| [a, b, c, d, e]  | [f, b, c, d, e]   | 1  | 0.0  |
| [a, b, c, d, e]  | [a, f, e, g, b]   | 2  | 0.0  |
| [a, b, c, d, e]  | [a, f, c, g, b]   | 3  | 1.0  |
| [a, b, c, d, e]  | [d, c, b, a, e]   | 3  | 1.0  |

In [None]:
def rel_at_k(y_true, y_pred, k=12):
    """Compute relevance at k for one sample.

    Parameters
    ----------
    y_true : np.array
        Array of correct recommendations (order doesn't matter)
    y_pred : np.array
        Array of predicted recommendations (order does matter)
    k : int, optional
        Rank (1-based) of the prediction to check

    Returns
    -------
    score : int
        1 if the prediction at rank k is in y_true, 0 otherwise
    """
    # A rank beyond the end of the prediction list cannot be relevant
    if k > len(y_pred):
        return 0
    return int(y_pred[k-1] in y_true)

In [None]:
assert rel_at_k(gt, preds1, k=1) == 1.0
assert rel_at_k(gt, preds2, k=1) == 1.0
assert rel_at_k(gt, preds3, k=1) == 0.0
assert rel_at_k(gt, preds4, k=2) == 0.0
assert rel_at_k(gt, preds5, k=3) == 1.0
assert rel_at_k(gt, preds6, k=3) == 1.0
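
When the whole ranking is needed, the indicator can be evaluated for every rank at once. This is only a sketch: `relevance_vector` is a made-up name, and it relies on `np.isin` for the membership test.

```python
import numpy as np


def relevance_vector(y_true, y_pred):
    """Return rel(1), ..., rel(n) as an array of 0s and 1s (hypothetical helper)."""
    return np.isin(y_pred, y_true).astype(int)


gt_demo = np.array(['a', 'b', 'c', 'd', 'e'])
preds_demo = np.array(['a', 'f', 'e', 'g', 'b'])
rel = relevance_vector(gt_demo, preds_demo)
print(rel)
```

Entry `rel[k-1]` then plays the role of `rel(k)`; for this sample `rel[1]` is 0 because `f` is not in the ground truth.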

# Average Precision at K

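This is the sum of the products `P(k) × rel(k)` over all ranks `k` up to the cutoff, divided by `min(m, 12)`, where `m` is the number of ground-truth items. This is the normalization that the code below uses.

$${1\over{min(m,12)}} {\sum_{k=1}^{min(n,12)}P(k) \times rel(k)}$$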
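
As a worked instance, take `gt=[a,b,c,d,e]` and `pred=[a,f,c,g,b]` with a cutoff of 3 in place of 12, so that the normalizing term is 3. The first prediction `a` is relevant (`P(1)=1`), the second prediction `f` is not (`P(2)=1/2`, `rel(2)=0`), and the third prediction `c` is relevant (`P(3)=2/3`):

$$AP@3 = {1 \over 3}\left(1 \times 1 + {1 \over 2} \times 0 + {2 \over 3} \times 1\right) = {5 \over 9} \approx 0.556$$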

In [None]:
def average_precision_at_k(y_true, y_pred, k=12):
    """Compute average precision at k for one sample.

    Parameters
    ----------
    y_true : np.array
        Array of correct recommendations (order doesn't matter)
    y_pred : np.array
        Array of predicted recommendations (order does matter)
    k : int, optional
        Maximum number of predicted recommendations

    Returns
    -------
    score : float
        Average precision at k
    """
    # Without any ground truth there is nothing to score
    if len(y_true) == 0:
        return 0.0

    ap = 0.0
    # Stop at the number of available predictions so we never index past the end
    for i in range(1, min(k, len(y_pred)) + 1):
        ap += precision_at_k(y_true, y_pred, i) * rel_at_k(y_true, y_pred, i)

    return ap / min(k, len(y_true))

In [None]:
assert average_precision_at_k(gt, preds1, k=1) == 1.0
assert average_precision_at_k(gt, preds2, k=1) == 1.0
assert average_precision_at_k(gt, preds3, k=1) == 0.0
assert average_precision_at_k(gt, preds4, k=2) == 0.5
assert np.isclose(average_precision_at_k(gt, preds5, k=3), 5/9)
assert average_precision_at_k(gt, preds6, k=3) == 1.0

# Mean Average Precision at K

Take the mean of the Average Precision over all users.

In [None]:
def mean_average_precision(y_true, y_pred, k=12):
    """Compute MAP at k.

    Parameters
    ----------
    y_true : np.array
        2D array of correct recommendations (order doesn't matter)
    y_pred : np.array
        2D array of predicted recommendations (order does matter)
    k : int, optional
        Maximum number of predicted recommendations

    Returns
    -------
    score : float
        MAP at k
    """
    # One average-precision score per (ground truth, prediction) row
    return np.mean([average_precision_at_k(true_row, pred_row, k)
                    for true_row, pred_row in zip(y_true, y_pred)])

In [None]:
y_true = np.array([gt, gt, gt, gt, gt, gt])
y_pred = np.array([preds1, preds2, preds3, preds4, preds5, preds6])

In [None]:
print(average_precision_at_k(gt, preds1, k=4))
print(average_precision_at_k(gt, preds2, k=4))
print(average_precision_at_k(gt, preds3, k=4))
print(average_precision_at_k(gt, preds4, k=4))
print(average_precision_at_k(gt, preds5, k=4))
print(average_precision_at_k(gt, preds6, k=4))

In [None]:
mean_average_precision(y_true, y_pred, k=4)
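
As a cross-check, the whole metric can be sketched in a compact, self-contained form. The function names `ap_at_k_vectorized` and `map_at_k_vectorized` are hypothetical. The sketch assumes predictions contain no duplicates and normalizes by `min(k, len(y_true))`, as the loop-based functions above do.

```python
import numpy as np


def ap_at_k_vectorized(y_true, y_pred, k=12):
    """Average precision at k for one sample (hypothetical helper)."""
    top = np.asarray(y_pred)[:k]
    if len(top) == 0 or len(y_true) == 0:
        return 0.0
    hits = np.isin(top, y_true)                                  # rel(1..k)
    precisions = np.cumsum(hits) / np.arange(1, len(top) + 1)    # P(1..k)
    return float(np.sum(precisions * hits) / min(k, len(y_true)))


def map_at_k_vectorized(y_true_rows, y_pred_rows, k=12):
    """Mean of the per-sample average precisions (hypothetical helper)."""
    scores = [ap_at_k_vectorized(t, p, k)
              for t, p in zip(y_true_rows, y_pred_rows)]
    return float(np.mean(scores))


# Same toy data as in this notebook, redefined so the block runs on its own
truth = ['a', 'b', 'c', 'd', 'e']
predictions = [
    ['b', 'c', 'a', 'd', 'e'],
    ['a', 'b', 'c', 'd', 'e'],
    ['f', 'b', 'c', 'd', 'e'],
    ['a', 'f', 'e', 'g', 'b'],
    ['a', 'f', 'c', 'g', 'b'],
    ['d', 'c', 'b', 'a', 'e'],
]
score = map_at_k_vectorized([truth] * len(predictions), predictions, k=4)
print(score)
```

With `k=4` the six per-sample scores are 1, 1, 23/48, 5/12, 5/12 and 1, so their mean is 0.71875.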

### Thanks for reading