# Multi Label Classification

In multi-label classification, each sample can have one or more
classes associated with it. A simple example of this type of problem is a
task in which you are asked to predict the different objects present in a given image.

Imagine an image that contains a chair, a flower-pot and a window, but no other
objects such as a computer, bed or TV. That one image has multiple targets
associated with it, which makes this a multi-label classification problem.
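
One way to picture the targets for the image described above is sketched here. The class vocabulary, its ordering and the variable names are assumptions made for illustration only. The same labels are shown both as a list of class indices (the form used by the ranking metrics in this section) and as a 0/1 indicator vector (the binary format that log loss needs).

```python
# Hypothetical class vocabulary; the order is arbitrary but must stay fixed
classes = ["chair", "flower-pot", "window", "computer", "bed", "tv"]

# Objects that are present in the example image
image_labels = ["chair", "flower-pot", "window"]

# Representation 1: list of class indices
class_to_index = {name: i for i, name in enumerate(classes)}
label_indices = sorted(class_to_index[name] for name in image_labels)

# Representation 2: one 0/1 entry per class
indicator = [1 if name in image_labels else 0 for name in classes]

print(label_indices)
print(indicator)
```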

The metrics for this type of classification problem are a bit different. Some of
the most common suitable metrics are:
- Precision at k (P@k)
- Average precision at k (AP@k)
- Mean average precision at k (MAP@k)
- Log loss

### Precision at k (P@k)

Given a list of actual classes for a sample and a list of predicted classes for
the same sample, precision at k is the number of hits among the top-k predictions,
divided by k. Note that the code below divides by the number of predictions
actually considered, which equals k whenever at least k predictions are available.

In [5]:
def pk(y_true, y_pred, k):
    """
    This function calculates precision at k
    for a single sample
    :param y_true: list of values, actual classes
    :param y_pred: list of values, predicted classes
    :param k: number of top predictions to consider
    :return: precision at a given value k
    """
    
    # if k is 0, return 0. we should never have this
    # as k is always >= 1
    if k == 0:
        return 0
    
    # we are interested only in top-k predictions
    y_pred = y_pred[:k]
    
    # convert predictions to set
    pred_set = set(y_pred)
    
    # convert actual values to set
    true_set = set(y_true)
    
    # find common values
    common_values = pred_set.intersection(true_set)
    
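    # an empty prediction list cannot contain any hits
    if len(y_pred) == 0:
        return 0

    # return number of common values over the number of
    # predictions considered (k, or fewer if the list is shorter)
    return len(common_values) / len(y_pred)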
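
To make the choice of denominator concrete, here is a minimal sketch. The name `precision_at_k` is hypothetical; it is meant to mirror `pk` above for non-empty prediction lists and is not a reference implementation. It shows that the result equals hits divided by k when at least k predictions exist, and hits divided by the number of predictions when the list is shorter than k.

```python
def precision_at_k(y_true, y_pred, k):
    """Hypothetical compact variant of pk, for illustration only."""
    top_k = y_pred[:k]
    # assumption: no predictions (or k == 0) means a precision of 0
    if k == 0 or not top_k:
        return 0.0
    hits = len(set(top_k) & set(y_true))
    # divide by the number of predictions actually considered
    return hits / len(top_k)


y_true = [1, 2, 3]
y_pred = [0, 1, 2]
p1 = precision_at_k(y_true, y_pred, 1)  # top-1 is [0]: no hit
p2 = precision_at_k(y_true, y_pred, 2)  # top-2 is [0, 1]: one hit out of two
p3 = precision_at_k(y_true, y_pred, 3)  # top-3 is [0, 1, 2]: two hits out of three

# only one prediction is available, so the denominator is 1, not 3
p_short = precision_at_k([0, 2], [0], 3)

print(p1, p2, p3, p_short)
```

If you prefer the strict "divided by k" definition, a short prediction list would be penalised instead: the last call would give 1/3 rather than 1.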

### Average precision at k (AP@k) 

AP@k is calculated using P@k.
For example, to calculate AP@3, we calculate P@1, P@2 and P@3
and then divide their sum by 3.
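
Written as a formula, the definition above can be sketched as follows. The notation is my own shorthand, not taken from the source:

$$
\mathrm{AP@}k = \frac{1}{k} \sum_{i=1}^{k} \mathrm{P@}i
$$

As a worked example, take actual classes `[1, 2, 3]` and predictions `[0, 1, 2]`. The top-1 list contains no hit, the top-2 list contains one hit and the top-3 list contains two hits, so:

$$
\mathrm{AP@}3 = \frac{\mathrm{P@}1 + \mathrm{P@}2 + \mathrm{P@}3}{3}
= \frac{0 + \tfrac{1}{2} + \tfrac{2}{3}}{3} = \frac{7}{18} \approx 0.389
$$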

In [6]:
def apk(y_true, y_pred, k):
    """
    This function calculates average precision at k
    for a single sample
    :param y_true: list of values, actual classes
    :param y_pred: list of values, predicted classes
    :return: average precision at a given value k
    """
    # initialize p@k list of values
    pk_values = []
    
    # loop over all cutoffs i from 1 to k (inclusive)
    for i in range(1, k + 1):
        # calculate p@i and append to list
        pk_values.append(pk(y_true, y_pred, i))
        
    # if we have no values in the list, return 0
    if len(pk_values) == 0:
        return 0
    
    # else, we return the sum of the list over its length
    return sum(pk_values) / len(pk_values)

In [7]:
y_true = [[1, 2, 3], [0, 2], [1], [2, 3], [1, 0], []]

y_pred = [[0, 1, 2], [1], [0, 2, 3], [2, 3, 4, 0], [0, 1, 2], [0]]

for i in range(len(y_true)):
    for j in range(1, 4):
        print(f"""
        y_true={y_true[i]},
        y_pred={y_pred[i]},
        AP@{j}={apk(y_true[i], y_pred[i], k=j)}""")


        y_true=[1, 2, 3],
        y_pred=[0, 1, 2],
        AP@1=0.0

        y_true=[1, 2, 3],
        y_pred=[0, 1, 2],
        AP@2=0.25

        y_true=[1, 2, 3],
        y_pred=[0, 1, 2],
        AP@3=0.38888888888888884

        y_true=[0, 2],
        y_pred=[1],
        AP@1=0.0

        y_true=[0, 2],
        y_pred=[1],
        AP@2=0.0

        y_true=[0, 2],
        y_pred=[1],
        AP@3=0.0

        y_true=[1],
        y_pred=[0, 2, 3],
        AP@1=0.0

        y_true=[1],
        y_pred=[0, 2, 3],
        AP@2=0.0

        y_true=[1],
        y_pred=[0, 2, 3],
        AP@3=0.0

        y_true=[2, 3],
        y_pred=[2, 3, 4, 0],
        AP@1=1.0

        y_true=[2, 3],
        y_pred=[2, 3, 4, 0],
        AP@2=1.0

        y_true=[2, 3],
        y_pred=[2, 3, 4, 0],
        AP@3=0.8888888888888888

        y_true=[1, 0],
        y_pred=[0, 1, 2],
        AP@1=1.0

        y_true=[1, 0],
        y_pred=[0, 1, 2],
        AP@2=1.0

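        y_true=[1, 0],
        y_pred=[0, 1, 2],
        AP@3=0.8888888888888888

        y_true=[],
        y_pred=[0],
        AP@1=0.0

        y_true=[],
        y_pred=[0],
        AP@2=0.0

        y_true=[],
        y_pred=[0],
        AP@3=0.0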

### Mean average precision at k (MAP@k)

In machine learning, we are interested in all samples, which is why we have mean average precision
at k, or MAP@k. MAP@k is simply the average of AP@k over all samples and can be calculated with the following Python code:

In [9]:
def mapk(y_true, y_pred, k):
    """
    This function calculates mean avg precision at k
    over all samples
    :param y_true: list of lists, actual classes for each sample
    :param y_pred: list of lists, predicted classes for each sample
    :param k: number of top predictions to consider
    :return: mean avg precision at a given value k
    """
    # initialize empty list for apk values
    apk_values = []
    
    # loop over all samples
    for i in range(len(y_true)):
        # store apk values for every sample
        apk_values.append(apk(y_true[i], y_pred[i], k=k))
        
    # return mean of apk values list
    return sum(apk_values) / len(apk_values)


In [20]:
print(mapk(y_true, y_pred, k=1))
print(mapk(y_true, y_pred, k=2))
print(mapk(y_true, y_pred, k=3))
print(mapk(y_true, y_pred, k=4))

0.3333333333333333
0.375
0.3611111111111111
0.34722222222222215


P@k, AP@k and MAP@k all range from 0 to 1 with 1 being the best.

Please note that sometimes you might see different implementations of P@k and
AP@k on the internet. For example, let’s take a look at one of these
implementations.

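This implementation is another version of AP@k where order matters and the
predictions are weighted: precision is accumulated only at the ranks where a
prediction is a hit, and the sum is divided by the smaller of k and the number of
actual classes. Its results will differ slightly from what I have presented.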

In [None]:
# taken from:
# https://github.com/benhamner/Metrics/blob/master/Python/ml_metrics/average_precision.py
import numpy as np


def apk(actual, predicted, k=10):
    """
    Computes the average precision at k.
    This function computes the AP at k between two lists of
    items.
    Parameters
    ----------
    actual : list
        A list of elements to be predicted (order doesn't matter)
    predicted : list
        A list of predicted elements (order does matter)
    k : int, optional
        The maximum number of predicted elements
    Returns
    -------
    score : double
        The average precision at k over the input lists
    """
    # keep only the top-k predictions
    if len(predicted) > k:
        predicted = predicted[:k]

    score = 0.0
    num_hits = 0.0

    for i, p in enumerate(predicted):
        # count a hit only the first time a correct item appears
        if p in actual and p not in predicted[:i]:
            num_hits += 1.0
            score += num_hits / (i + 1.0)

    if not actual:
        return 0.0

    return score / min(len(actual), k)
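
To see how the two definitions can diverge, the sketch below puts them side by side on one sample. Both function names are hypothetical and the bodies are my own condensed restatements of the two approaches discussed in this section, so treat them as an illustration rather than as either author's exact code.

```python
def apk_mean_of_pk(y_true, y_pred, k):
    """Mean of P@1..P@k, as in the first AP@k definition of this section."""
    values = []
    for i in range(1, k + 1):
        top_i = y_pred[:i]
        hits = len(set(top_i) & set(y_true))
        values.append(hits / len(top_i) if top_i else 0.0)
    return sum(values) / len(values) if values else 0.0


def apk_rank_weighted(actual, predicted, k):
    """Precision is accumulated only at ranks that are hits."""
    if not actual:
        return 0.0
    score, num_hits = 0.0, 0
    for i, p in enumerate(predicted[:k]):
        # ignore repeated predictions of the same item
        if p in actual and p not in predicted[:i]:
            num_hits += 1
            score += num_hits / (i + 1)
    return score / min(len(actual), k)


y_true = [2, 3]
y_pred = [2, 3, 4, 0]

a = apk_mean_of_pk(y_true, y_pred, 3)     # (1 + 1 + 2/3) / 3
b = apk_rank_weighted(y_true, y_pred, 3)  # (1/1 + 2/2) / min(2, 3)
print(a, b)
```

The first version is pulled below 1 by the miss at rank 3, while the second gives a perfect score because both actual classes were found before any miss occurred.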


### Log loss for multi-label classification

You can convert the targets to binary format and then compute a log loss for each column.
Finally, you take the average of the log loss over all columns. This is also known as
**mean column-wise log loss**.
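
The steps just described can be sketched in plain Python as follows. The helper names, the clipping constant `eps` and the toy probabilities are assumptions chosen for illustration; in practice you would probably rely on a library implementation of binary log loss instead.

```python
import math


def binarize(labels, n_classes):
    """Turn a list of class indices into a 0/1 indicator vector."""
    return [1 if c in labels else 0 for c in range(n_classes)]


def log_loss_column(y_true_col, y_prob_col, eps=1e-15):
    """Binary log loss for one column; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for t, p in zip(y_true_col, y_prob_col):
        p = min(max(p, eps), 1 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true_col)


def mean_columnwise_log_loss(y_true, y_prob, n_classes):
    """Average the per-class binary log loss over all classes."""
    y_bin = [binarize(labels, n_classes) for labels in y_true]
    losses = []
    for c in range(n_classes):
        true_col = [row[c] for row in y_bin]
        prob_col = [row[c] for row in y_prob]
        losses.append(log_loss_column(true_col, prob_col))
    return sum(losses) / n_classes


# two samples, three classes; y_prob holds one predicted probability per class
y_true = [[0, 2], [1]]
y_prob = [[0.9, 0.2, 0.7], [0.1, 0.8, 0.3]]

loss = mean_columnwise_log_loss(y_true, y_prob, n_classes=3)
print(loss)
```

Unlike P@k, AP@k and MAP@k, this metric needs a predicted probability for every class rather than a ranked list of predicted classes, and lower values are better.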

Of course, there are other ways you can implement this,
and you should explore them as you come across them.

We have now reached a stage where we can say that we know all binary,
multi-class and multi-label classification metrics.