# Introduction

Hi, I'm George. Building on this fantastic [notebook](https://www.kaggle.com/debarshichanda/understanding-mean-average-precision), I'll try to **optimize** the computation of the **Mean Average Precision @ K (MAP@K)**.

**Please feel free to comment and/or upvote!**

Cheers!


## TLDR
- Two optimized versions (V2 and V3) are proposed for computing MAP@K; in the table below they run roughly 10x and 20x faster than the baseline.
- The baseline approach (V1) can be found below and in the original notebook.

#### The execution times of the three versions are shown in the table below.


| **Sample Size**  | **Baseline (V1) Time** | **V2 Time** | **V3 Time** |
|--------------------------|------------------------|-------------|-------------|
| 1.000                    | 463 ms                 | 43.9 ms     | 21 ms       |
| 10.000                   | 4.53 sec               | 430 ms      | 210 ms      |
| 100.000                  | 46.5 sec               | 4.35 sec    | 2.1 sec     |



## Baseline
**First, I'll run the code from the aforementioned notebook to set a baseline.**

**The following snippets were taken directly from there.**
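
In formula form, the baseline code below computes the following. This is my reading of the code; the symbols $T$ (set of correct items), $p_i$ (the $i$-th prediction) and $N$ (number of samples) are introduced here for illustration and do not appear in the original notebook.

$$
P@i = \frac{\lvert T \cap \{p_1, \dots, p_i\} \rvert}{i},
\qquad
rel@i =
\begin{cases}
1 & \text{if } p_i \in T \\
0 & \text{otherwise}
\end{cases}
$$

$$
AP@K = \frac{1}{\min(K, \lvert T \rvert)} \sum_{i=1}^{K} P@i \cdot rel@i,
\qquad
MAP@K = \frac{1}{N} \sum_{n=1}^{N} AP@K^{(n)}
$$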

In [None]:
import numpy as np

def precision_at_k_v1(y_true, y_pred, k=12):
    """ Computes Precision at k for one sample
    
    Parameters
    __________
    y_true: np.array
            Array of correct recommendations (Order doesn't matter)
    y_pred: np.array
            Array of predicted recommendations (Order does matter)
    k: int, optional
       Maximum number of predicted recommendations
            
    Returns
    _______
    score: double
           Precision at k
    """
    intersection = np.intersect1d(y_true, y_pred[:k])
    return len(intersection) / k


def rel_at_k_v1(y_true, y_pred, k=12):
    """ Computes Relevance at k for one sample
    
    Parameters
    __________
    y_true: np.array
            Array of correct recommendations (Order doesn't matter)
    y_pred: np.array
            Array of predicted recommendations (Order does matter)
    k: int, optional
       Maximum number of predicted recommendations
            
    Returns
    _______
    score: double
           Relevance at k
    """
    if y_pred[k-1] in y_true:
        return 1
    else:
        return 0
    
def average_precision_at_k_v1(y_true, y_pred, k=12):
    """ Computes Average Precision at k for one sample
    
    Parameters
    __________
    y_true: np.array
            Array of correct recommendations (Order doesn't matter)
    y_pred: np.array
            Array of predicted recommendations (Order does matter)
    k: int, optional
       Maximum number of predicted recommendations
            
    Returns
    _______
    score: double
           Average Precision at k
    """
    ap = 0.0
    for i in range(1, k+1):
        ap += precision_at_k_v1(y_true, y_pred, i) * rel_at_k_v1(y_true, y_pred, i)
        
    return ap / min(k, len(y_true))

def mean_average_precision_v1(y_true, y_pred, k=12):
    """ Computes MAP at k
    
    Parameters
    __________
    y_true: np.array
            2D Array of correct recommendations (Order doesn't matter)
    y_pred: np.array
            2D Array of predicted recommendations (Order does matter)
    k: int, optional
       Maximum number of predicted recommendations
            
    Returns
    _______
    score: double
           MAP at k
    """
    return np.mean([average_precision_at_k_v1(gt, pred, k) for gt, pred in zip(y_true, y_pred)])

### Baseline Execution

In [None]:
gtruth = np.array(['a', 'b', 'c', 'd', 'e', 'f', 'h', 'i', 'j', 'k', 'l', 'm'])
preds =  np.array(['a', 'f', 'c', 'g', 'b', 'k', 'o', 'n', 'x', 'l', 'q', 'i'])

In [None]:
# Construct a "large" dataset 
n_samples = 1_000

y_true = [gtruth] * n_samples
y_pred = [preds] * n_samples

len(y_true), len(y_pred)

In [None]:
# Baseline Benchmark @ 1.000

%timeit mean_average_precision_v1(y_true, y_pred, k=12)

In [None]:
# Construct a "larger" dataset 
n_samples = 10_000

y_true = [gtruth] * n_samples
y_pred = [preds] * n_samples

len(y_true), len(y_pred)

In [None]:
# Baseline Benchmark @ 10.000
%timeit mean_average_precision_v1(y_true, y_pred, k=12)

In [None]:
# Construct an even "larger" dataset 
n_samples = 100_000

y_true = [gtruth] * n_samples
y_pred = [preds] * n_samples

len(y_true), len(y_pred)

In [None]:
# Baseline Benchmark @ 100.000

%timeit mean_average_precision_v1(y_true, y_pred, k=12)

# Let's make it faster!

### First, we'll work with our initial two vectors and go step by step

In [None]:
print(gtruth)
print(preds)

In [None]:
# Suppose we want to compute the metric @K
# The first step is to take the first K items from the recommended items (ranking)
K = 12
y_pred = preds[:K]
y_pred

In [None]:
# Using the function np.intersect1d you can get the items that the 2 vectors have in common
print(np.intersect1d(gtruth, y_pred))

**BUT** if you check the documentation for [np.intersect1d](https://numpy.org/doc/stable/reference/generated/numpy.intersect1d.html), you'll discover two other useful parameters:
- assume_unique=True
- return_indices=True
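
A word of caution: `assume_unique=True` tells NumPy that neither input contains repeated items, so it is only safe when that is actually the case. If a ranking might contain repeats, one option is to de-duplicate it first while keeping the rank order. Below is a minimal sketch; the helper name `dedupe_keep_order` is hypothetical and not part of the original notebook.

```python
import numpy as np

def dedupe_keep_order(items):
    """Drop repeated items, keeping the first occurrence of each (hypothetical helper)."""
    items = np.asarray(items)
    # np.unique sorts the values; return_index gives the position of each first occurrence
    _, first_idx = np.unique(items, return_index=True)
    # Sorting those positions restores the original ranking order
    return items[np.sort(first_idx)]

ranked = np.array(['a', 'f', 'a', 'c', 'f', 'b'])
print(dedupe_keep_order(ranked))
```

Whether de-duplication is the right treatment depends on how your competition or use case scores repeated recommendations.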

In [None]:
# 3 things are returned here
# - The common items between the two vectors
# - The indexes of the common items in the first vector (truth vector)
# - The indexes of the common items in the second vector (ranking/recommendations vector)
np.intersect1d(gtruth, y_pred, assume_unique=True, return_indices=True)

In [None]:
# We will only need the indexes for the ranking vector
_, _, pred_indexes = np.intersect1d(gtruth, y_pred, assume_unique=True, return_indices=True)

pred_indexes

We cannot use this vector as it is, so we'll transform it into something a bit more intuitive:

- We create a mask for these positions by instantiating a vector of 0s
- and then set the positions given by the aforementioned indexes to 1s.

In [None]:
x = np.zeros(len(y_pred), dtype=int)
x[pred_indexes] = 1

x

#### If you think it through, this mask is exactly the **Relevance @ K** vector

So let's rewrite it with a more readable naming convention

In [None]:
rel_at_k = np.zeros(len(y_pred), dtype=int)
rel_at_k[pred_indexes] = 1

### The next step is to try to calculate the precision at each K up to 12. 

To do that we need to count **how many common items** we have **at each step**.

This is simply **the cumulative sum over the Relevance@K vector**.


Let's check:

In [None]:
# As you can see, the second printed row shows the number of items in common
# with our ground truth vector at each step K (index)

# Initial Relevance@K vector
print(rel_at_k)

# The cumulative sum of the Relevance@K vector
print(rel_at_k.cumsum())

In [None]:
# We assign the intersection counts to this variable
intersection_count_at_k = rel_at_k.cumsum()

Having the **Intersection Counts** and the **Relevance scores** at each K, all we need
is to compute the actual **Precision** at each K.

To do this we'll need the denominator for each step.

This can be obtained by creating the vector [1, 2, ..., 12].

In [None]:
ranks = np.arange(1, len(y_pred) + 1, 1)
ranks

In [None]:
# Now we divide the Intersection Counts by the Ranks to obtain
# the Precision@K
precisions_at_k = intersection_count_at_k / ranks
precisions_at_k

In [None]:
# As a next step we multiply the Precision@K by the Relevance@K.

# This zeroes out the positions where we did not have a matching item
precisions_at_k = precisions_at_k * rel_at_k
precisions_at_k

### As a final step in order to calculate the **Average Precision @K** we need to take the average of the **precisions_at_k** variable

In [None]:
avg_precision_at_k = precisions_at_k.mean()

avg_precision_at_k
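
One detail worth noting: `.mean()` divides by the number of kept predictions (K), whereas the baseline divides by `min(k, len(y_true))`. The two agree in this notebook because the ground truth has 12 items and K never exceeds 12. If your ground-truth vectors can be shorter than K, a variant that keeps the baseline's denominator might look like the sketch below. The function name `avg_precision_at_k_min_norm` is hypothetical, and both inputs are assumed to be non-empty and free of repeats.

```python
import numpy as np

def avg_precision_at_k_min_norm(y_true, y_pred, k=12):
    """AP@K normalised by min(k, len(y_true)), as in the baseline (hypothetical name)."""
    y_pred = np.asarray(y_pred)[:k]
    # Assumption: y_true and y_pred contain no repeated items
    _, _, pred_idx = np.intersect1d(
        y_true, y_pred, assume_unique=True, return_indices=True)
    rel = np.zeros(len(y_pred), dtype=int)
    rel[pred_idx] = 1
    precisions = rel.cumsum() / np.arange(1, len(y_pred) + 1)
    # Sum the precisions at the relevant ranks, then apply the baseline's denominator
    return (precisions * rel).sum() / min(k, len(y_true))

short_truth = np.array(['a', 'c'])
ranking = np.array(['a', 'b', 'c', 'd'])
print(avg_precision_at_k_min_norm(short_truth, ranking, k=4))
```

With only two correct items and K = 4, this divides the summed precisions by 2 rather than 4, so it scores higher than a plain `.mean()` would.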

## Mean Average Precision at K (MAP@K)

Having calculated the Average Precision @ K for a single (ground truth, recommendations) pair, all we have to do is **loop over all the other pairs** and **take the mean of their scores**.

Since each step has already been shown above, I'll simply write the two functions that calculate
- avg_precision_at_k
- mean_average_precision (@K)

In [None]:
def avg_precision_at_k_v2(y_true, y_pred, max_k=12, adjust_with_rel=True):
    """
    Computes the Average Precision at K in a more efficient way

    Parameters
    ----------
    y_true: np.array
        Array of correct recommendations (Order doesn't matter)
    y_pred: np.array
        Array of predicted recommendations (Order does matter)
    max_k: int, optional
        Maximum number of predicted recommendations
    adjust_with_rel: bool, optional
        Whether to multiply the precisions by the Rel@K vector
        
    Returns
    -------
    score: float
        Average Precision at k
    """
    y_pred = y_pred[:max_k]

    _, _, pred_indexes = np.intersect1d(
        y_true, y_pred, assume_unique=True, return_indices=True)

    rel_at_k = np.zeros(len(y_pred), dtype=int)
    rel_at_k[pred_indexes] = 1

    intersection_count_at_k = rel_at_k.cumsum()
    ranks = np.arange(1, len(y_pred) + 1, 1)
    precisions_at_k = intersection_count_at_k / ranks

    if adjust_with_rel:
        precisions_at_k = precisions_at_k * rel_at_k

    return precisions_at_k.mean()

def mean_average_precision_v2(y_true, y_pred, k=12):
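    """
    Computes the Mean Average Precision @K (MAP@K) in a more
    efficient way

    Parameters
    ----------
    y_true: sequence of np.array
        Correct recommendations per sample (Order doesn't matter)
    y_pred: sequence of np.array
        Predicted recommendations per sample (Order does matter)
    k: int, optional
        Maximum number of predicted recommendations

    Returns
    -------
    score: float
        MAP at k
    """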
    scores = []
    
    
    for gt, pred in zip(y_true, y_pred):
        scores.append(avg_precision_at_k_v2(gt, pred, k))

    return np.mean(scores)

In [None]:
# Construct a "large" dataset 
n_samples = 1_000

y_true = [gtruth] * n_samples
y_pred = [preds] * n_samples

len(y_true), len(y_pred)

In [None]:
# V2 Benchmark @ 1.000

# The V1 benchmark at 1.000 pairs is: 479 ms ± 17.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit mean_average_precision_v2(y_true, y_pred, k=12)

In [None]:
# Construct a "larger" dataset 
n_samples = 10_000

y_true = [gtruth] * n_samples
y_pred = [preds] * n_samples

len(y_true), len(y_pred)

In [None]:
# V2 Benchmark @ 10.000

# The V1 benchmark at 10.000 pairs is: 4.44 s ± 49.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit mean_average_precision_v2(y_true, y_pred, k=12)

In [None]:
# Construct an even "larger" dataset 
n_samples = 100_000

y_true = [gtruth] * n_samples
y_pred = [preds] * n_samples

len(y_true), len(y_pred)

In [None]:
# V2 Benchmark @ 100.000

# The V1 benchmark at 100.000 pairs is: 44.5 s ± 1.14 s per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit mean_average_precision_v2(y_true, y_pred, k=12)

## Do the two implementations (baseline & V2) give the same results?

### Let's check the Average Precision At K implementations

In [None]:
gtruth = np.array(['a', 'b', 'c', 'd', 'e', 'f', 'h', 'i', 'j', 'k', 'l', 'm'])
preds =  np.array(['a', 'f', 'c', 'g', 'b', 'k', 'o', 'n', 'x', 'l', 'q', 'i'])
gtruth.shape

In [None]:
for i in range(1, 13):
    baseline = average_precision_at_k_v1(gtruth, preds, k=i)
    v2 = avg_precision_at_k_v2(gtruth, preds, max_k=i)
    
    print(f'Avg Precision @ {i} | Baseline (v1): {baseline} | V2: {v2} | Are equal: {baseline==v2}')
    assert baseline == v2

# So, it seems that the two implementations return the same results on this example

### Let's check the MAP@K implementations

In [None]:
# Construct a dataset 
n_samples = 10

y_true = [gtruth] * n_samples
y_pred = [preds] * n_samples

len(y_true), len(y_pred)

In [None]:
for i in range(1, 13):
    baseline = mean_average_precision_v1(y_true, y_pred, k=i)
    v2 = mean_average_precision_v2(y_true, y_pred, k=i)

    print(f'MAP @ {i} | Baseline (v1): {baseline} | V2: {v2} | Are equal: {baseline==v2}')
    assert baseline == v2

# So, it seems that the two implementations for MAP@K return the same results on this example
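
The checks above reuse one (ground truth, ranking) pair, and they compare floats with `==`. As a complementary sanity check, here is a minimal self-contained sketch that compares a plain loop against the cumulative-sum approach on random data using `np.isclose`. The names `ap_loop` and `ap_vectorised` are hypothetical stand-ins, not the notebook's functions, and both divide by K, which matches the baseline only when the ground truth has at least K items.

```python
import numpy as np

def ap_loop(y_true, y_pred, k):
    """Reference AP@K: explicit loop over ranks, normalised by k."""
    truth = set(y_true)
    hits, total = 0, 0.0
    for rank, item in enumerate(y_pred[:k], start=1):
        if item in truth:
            hits += 1
            total += hits / rank  # precision at this rank, counted only on a hit
    return total / k

def ap_vectorised(y_true, y_pred, k):
    """AP@K via intersect1d + cumsum, mirroring the V2 approach."""
    y_pred = y_pred[:k]
    _, _, idx = np.intersect1d(y_true, y_pred, assume_unique=True, return_indices=True)
    rel = np.zeros(len(y_pred), dtype=int)
    rel[idx] = 1
    return (rel.cumsum() / np.arange(1, len(y_pred) + 1) * rel).mean()

rng = np.random.default_rng(0)
catalogue = np.arange(50)  # integer item ids, chosen only for this illustration
for _ in range(200):
    # replace=False keeps both vectors free of repeats, as assume_unique requires
    y_true = rng.choice(catalogue, size=12, replace=False)
    y_pred = rng.choice(catalogue, size=12, replace=False)
    for k in range(1, 13):
        assert np.isclose(ap_loop(y_true, y_pred, k), ap_vectorised(y_true, y_pred, k))
print("all random cases agree")
```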

# V3 Implementation!

This implementation tries to beat V2 by relying on **one critical assumption**.

The assumption is that the **number of recommendations is the same for all samples**, so a single (n_samples, k) matrix can hold the Rel@K vectors of every pair.

In [None]:
def mean_average_precision_v3(y_true, y_pred, k=12):
    
    # compute the Rel@K for all items     
    rel_at_k = np.zeros((len(y_true), k), dtype=int)
    
    # collect the intersection indexes (for the ranking vector) for all pairs
    for idx, (truth, pred) in enumerate(zip(y_true, y_pred)):
        _, _, inter_idxs = np.intersect1d(truth, pred[:k], assume_unique=True, return_indices=True)         
        rel_at_k[idx, inter_idxs] = 1
    
    # Calculate the intersection counts for all pairs     
    intersection_count_at_k = rel_at_k.cumsum(axis=1)
    
    # we have the same denominator for all ranking vectors     
    ranks = np.arange(1, k + 1, 1)
    
    # Calculating the Precision@K for all Ks for all pairs     
    precisions_at_k = intersection_count_at_k / ranks
    # Multiply with the Rel@K for all pairs     
    precisions_at_k = precisions_at_k * rel_at_k

    # Calculate the average precisions @ K for all pairs
    average_precisions_at_k = precisions_at_k.mean(axis=1)
    
    # calculate the final MAP@K
    map_at_k = average_precisions_at_k.mean()

    return map_at_k
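
V3 still loops over the pairs in Python to build the Rel@K matrix. If the ground truth can also be stored as a rectangular 2D array, that loop could in principle be replaced by a broadcast comparison. The sketch below is an assumption-laden illustration rather than a benchmarked V4: the name `map_at_k_broadcast` is hypothetical, it expects 2D arrays with at least `k` prediction columns and no repeated predictions per row, and it builds an (n_samples, k, n_truth) boolean array, so memory use grows accordingly.

```python
import numpy as np

def map_at_k_broadcast(y_true, y_pred, k=12):
    """MAP@K without a Python loop over samples (hypothetical name)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)[:, :k]
    # rel[n, i] is True when prediction i of sample n appears in that sample's truth row
    rel = (y_pred[:, :, None] == y_true[:, None, :]).any(axis=2)
    ranks = np.arange(1, y_pred.shape[1] + 1)
    precisions = rel.cumsum(axis=1) / ranks
    # Zero out non-matching ranks, average per sample, then across samples
    return (precisions * rel).mean(axis=1).mean()

gtruth = np.array(['a', 'b', 'c', 'd', 'e', 'f', 'h', 'i', 'j', 'k', 'l', 'm'])
preds = np.array(['a', 'f', 'c', 'g', 'b', 'k', 'o', 'n', 'x', 'l', 'q', 'i'])
y_true_2d = np.tile(gtruth, (10, 1))
y_pred_2d = np.tile(preds, (10, 1))
print(map_at_k_broadcast(y_true_2d, y_pred_2d, k=12))
```

I have not timed this variant here, so whether it beats V3 on your data is something to measure with `%timeit`.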

In [None]:
# Construct a "large" dataset 
n_samples = 1_000
y_true = [gtruth] * n_samples
y_pred = [preds] * n_samples
len(y_true), len(y_pred)

In [None]:
# V3 Benchmark @ 1.000

# V1 benchmark at 1.000 pairs is: 479 ms ± 17.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) (baseline)
# V2 benchmark at 1.000 pairs is: 29.6 ms ± 617 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit mean_average_precision_v3(y_true, y_pred, k=12)

In [None]:
# Construct a "larger" dataset 
n_samples = 10_000

y_true = [gtruth] * n_samples
y_pred = [preds] * n_samples

len(y_true), len(y_pred)

In [None]:
# V3 Benchmark @ 10.000

# V1 benchmark at 10.000 pairs is: 4.44 s ± 49.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
# V2 benchmark at 10.000 pairs is: 329 ms ± 52.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit mean_average_precision_v3(y_true, y_pred, k=12)

In [None]:
# Construct an even "larger" dataset 
n_samples = 100_000

y_true = [gtruth] * n_samples
y_pred = [preds] * n_samples

len(y_true), len(y_pred)

In [None]:
# V3 Benchmark @ 100.000 items

# V1 benchmark at 100.000 pairs is: 44.5 s ± 1.14 s per loop (mean ± std. dev. of 7 runs, 1 loop each)
# V2 benchmark at 100.000 pairs is: 3.15 s ± 388 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit mean_average_precision_v3(y_true, y_pred, k=12)

In [None]:
# Finally we check whether V1, V2 and V3 have the same results

# Construct a dataset 
n_samples = 10

y_true = [gtruth] * n_samples
y_pred = [preds] * n_samples

len(y_true), len(y_pred)

for i in range(1, 13):
    baseline = mean_average_precision_v1(y_true, y_pred, k=i)
    v2 = mean_average_precision_v2(y_true, y_pred, k=i)
    v3 = mean_average_precision_v3(y_true, y_pred, k=i)
    
    print(f'MAP @ {i} | Baseline (v1): {baseline} | V2: {v2}  | V3: {v3} | Are equal: {baseline==v2}, {baseline==v3}')
    assert baseline == v2
    assert baseline == v3

# So, it seems that all three implementations for MAP@K return the same results on this example

## Thank you for your time!

### Feel free to comment and/or upvote!