# Multi-label classification

Multi-label classification covers problems where each sample can have several target labels instead of just one. For example, given an image, we need to predict which objects are present in it.

The evaluation metrics for this type of problem are also different. The most common ones are:
1. Precision at k (P@k)
1. Average precision at k (AP@k)
1. Mean average precision at k (MAP@k)
1. Log loss

## Precision at k (P@k)

It is defined as the number of hits in the top-k predictions (predictions that are among the actual classes), divided by the number of top-k predictions considered. Don't confuse it with the precision used in binary or multi-class classification.

In [1]:
def precision_k(y_true, y_pred, k):
    """
    Computes the precision at k for a single sample
    
    :param y_true: Actual classes for the sample
    :param y_pred: Predicted classes for the sample
    :param k: The value for k
    :returns computed precision at k
    """
    
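    if k == 0:
        return 0
    
    # Keep only the top-k predictions
    y_pred = y_pred[:k]
    if len(y_pred) == 0:
        # Nothing to score; this also avoids a division by zero
        return 0
    y_pred_set = set(y_pred)
    
    y_true_set = set(y_true)
    
    # Predicted classes that are also actual classes
    common_classes = y_pred_set.intersection(y_true_set)
    
    # y_pred is already truncated to at most k items
    return len(common_classes) / len(y_pred)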
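
As a quick check of the definition, the sketch below recomputes P@k for a single sample. The helper name `p_at_k` and the sample values are chosen here purely for illustration. It assumes the same convention as `precision_k`: the predictions are ordered from most to least confident, and the hit count is divided by the number of predictions actually available in the top k.

```python
def p_at_k(y_true, y_pred, k):
    """Fraction of the top-k predictions that are among the actual classes."""
    top_k = y_pred[:k]
    if not top_k:
        # No predictions to score
        return 0.0
    hits = set(top_k) & set(y_true)
    return len(hits) / len(top_k)


y_true = [1, 2, 3]
y_pred = [0, 1, 2]  # assumed to be ordered from most to least confident

p1 = p_at_k(y_true, y_pred, 1)  # top-1 is [0]: no hit
p2 = p_at_k(y_true, y_pred, 2)  # top-2 is [0, 1]: one hit out of two
p3 = p_at_k(y_true, y_pred, 3)  # top-3 is [0, 1, 2]: two hits out of three
print(p1, p2, p3)
```

Note that the score grows with k here only because the correct classes sit lower in the ranking.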

## Average precision at k (AP@k)

It is the average of P@i for all i from 1 to k, e.g. AP@3 is the average of P@1, P@2 and P@3.

In [2]:
def avg_precision_k(y_true, y_pred, k):
    """
    Computes the average precision at k for a single sample
    
    :param y_true: Actual classes for the sample
    :param y_pred: Predicted classes for the sample
    :param k: The value for k
    :returns computed average precision at k
    """
    if k == 0:
        return 0
    
    p_k = []
    for i in range(1, k+1):
        p_k.append(precision_k(y_true, y_pred, i))
    
    if len(p_k) == 0:
        return 0
    return sum(p_k)/len(p_k)
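
Because AP@k averages P@1 through P@k, the same set of predicted classes scores higher when the correct ones are ranked first. The following is a minimal sketch of that effect. The helper names `p_at_k` and `ap_at_k` are hypothetical stand-ins that mirror the functions defined in this section, and the two rankings are made up for illustration.

```python
def p_at_k(y_true, y_pred, k):
    """Fraction of the top-k predictions that are among the actual classes."""
    top_k = y_pred[:k]
    if not top_k:
        return 0.0
    return len(set(top_k) & set(y_true)) / len(top_k)


def ap_at_k(y_true, y_pred, k):
    """Mean of P@1, P@2, ..., P@k (the definition used in this section)."""
    if k <= 0:
        return 0.0
    return sum(p_at_k(y_true, y_pred, i) for i in range(1, k + 1)) / k


y_true = [1, 2, 3]
hits_last = [0, 1, 2]   # the wrong class is ranked first
hits_first = [1, 2, 0]  # same classes, correct ones ranked first

ap_last = ap_at_k(y_true, hits_last, 3)    # (0 + 1/2 + 2/3) / 3
ap_first = ap_at_k(y_true, hits_first, 3)  # (1 + 1 + 2/3) / 3
print(ap_last, ap_first)
```

Both rankings have the same P@3, but `ap_first` is larger because the early positions contribute to every term of the average.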

Let's apply these functions to an example.

In [3]:
y_true = [
    [1, 2, 3],
    [0, 2],
    [1],
    [2, 3],
    [1, 0],
    []
]

y_pred = [
    [0, 1, 2],
    [1],
    [0, 2, 3],
    [2, 3, 4, 0],
    [0, 1, 2],
    [0]
]

In [5]:
for yt, yp in zip(y_true, y_pred):
    for j in range(1, 4):
        print(
        f"""
        y_true={yt},
        y_pred={yp},
        AP@{j}={avg_precision_k(yt, yp, j)}
        """)


        y_true=[1, 2, 3],
        y_pred=[0, 1, 2],
        AP@1=0.0
        

        y_true=[1, 2, 3],
        y_pred=[0, 1, 2],
        AP@2=0.25
        

        y_true=[1, 2, 3],
        y_pred=[0, 1, 2],
        AP@3=0.38888888888888884
        

        y_true=[0, 2],
        y_pred=[1],
        AP@1=0.0
        

        y_true=[0, 2],
        y_pred=[1],
        AP@2=0.0
        

        y_true=[0, 2],
        y_pred=[1],
        AP@3=0.0
        

        y_true=[1],
        y_pred=[0, 2, 3],
        AP@1=0.0
        

        y_true=[1],
        y_pred=[0, 2, 3],
        AP@2=0.0
        

        y_true=[1],
        y_pred=[0, 2, 3],
        AP@3=0.0
        

        y_true=[2, 3],
        y_pred=[2, 3, 4, 0],
        AP@1=1.0
        

        y_true=[2, 3],
        y_pred=[2, 3, 4, 0],
        AP@2=1.0
        

        y_true=[2, 3],
        y_pred=[2, 3, 4, 0],
        AP@3=0.8888888888888888
        

        y_true=[1, 0],
        y_pred=[0, 1, 2],
        AP@1=1.

The values above are computed for one sample at a time, but in machine learning we want a single score over all samples. For this we calculate the **mean average precision at k or MAP@k**.

## Mean average precision at k (MAP@k)

It is the mean of AP@k over all samples.

In [6]:
def map_k(y_true, y_pred, k):
    """
    Computes the mean average precision at k or map@k
    
    :param y_true: Actual classes
    :param y_pred: Predicted classes
    :param k: the value of k
    :returns the mean average precision at k
    """
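    if k == 0:
        return 0
    apk = []
    # Compute AP@k for every (actual, predicted) pair
    for yt, yp in zip(y_true, y_pred):
        apk.append(avg_precision_k(yt, yp, k))
    if len(apk) == 0:
        # No samples; avoid a division by zero
        return 0
    return sum(apk) / len(apk)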

In [7]:
# Calculate for an example
y_true = [
    [1, 2, 3],
    [0, 2],
    [1],
    [2, 3],
    [1, 0],
    []
]

y_pred = [
    [0, 1, 2],
    [1],
    [0, 2, 3],
    [2, 3, 4, 0],
    [0, 1, 2],
    [0]
]

for i in range(1, 5):
    print(f'MAP@{i}: {map_k(y_true, y_pred, k=i)}')

MAP@1: 0.3333333333333333
MAP@2: 0.375
MAP@3: 0.3611111111111111
MAP@4: 0.34722222222222215


## Mean column-wise log loss

To calculate this, convert the targets to a binary format with one column per class, and compute the log loss for each column. Then take the average of the per-column log losses. This is known as **mean column-wise log loss**.
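
The steps above can be sketched in plain Python as follows. This is only an illustration under some assumptions: the model is assumed to output one probability per class for each sample, the function names (`binarize`, `log_loss`, `mean_columnwise_log_loss`) and the data are made up for this example, and probabilities are clipped with a small `eps` to avoid taking the log of zero.

```python
import math


def binarize(labels, num_classes):
    """Turn lists of class indices into rows of 0/1 indicators."""
    return [[1 if c in sample else 0 for c in range(num_classes)]
            for sample in labels]


def log_loss(y_true, y_prob, eps=1e-15):
    """Binary log loss for one column, with probabilities clipped to (eps, 1 - eps)."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)


def mean_columnwise_log_loss(y_true_bin, y_prob):
    """Average the per-column log loss over all label columns."""
    num_classes = len(y_true_bin[0])
    losses = []
    for c in range(num_classes):
        col_true = [row[c] for row in y_true_bin]
        col_prob = [row[c] for row in y_prob]
        losses.append(log_loss(col_true, col_prob))
    return sum(losses) / num_classes


# Illustrative data: 3 samples, 3 classes
y_true = [[0, 2], [1], [0, 1]]
y_prob = [
    [0.9, 0.2, 0.7],  # assumed per-class probabilities for sample 1
    [0.1, 0.8, 0.3],
    [0.6, 0.6, 0.1],
]

y_true_bin = binarize(y_true, num_classes=3)
score = mean_columnwise_log_loss(y_true_bin, y_prob)
print(y_true_bin)
print(score)
```

Unlike the metrics at k, this one works on predicted probabilities rather than on a ranked list of classes, and lower values are better.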