
**CG@k - Cumulative Gain**

In [3]:
from typing import List

import numpy as np


def cumulative_gain(relevance: List[float], k: int) -> float:
    """Cumulative gain at k (CG@k): the sum of the first k relevance labels.

    Parameters
    ----------
    relevance : `List[float]`
        Relevance labels, listed in ranked order
    k : `int`
        Number of top elements to sum

    Returns
    -------
    score : float
    """
    # Only the first k labels contribute to CG@k
    score = round(sum(relevance[:k]), 3)
    return score


In [4]:
relevance = [0.99, 0.94, 0.88, 0.74, 0.71, 0.68]
k = 5
print(cumulative_gain(relevance, k))

4.26
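
CG@k ignores the order of the items inside the top k, which is the motivation for the position-discounted metrics that follow. A minimal sketch of that property, assuming a hypothetical helper `cg_at_k` and made-up relevance labels:

```python
import math


def cg_at_k(relevance, k):
    """Hypothetical helper: sum of the first k relevance labels."""
    return sum(relevance[:k])


ranking = [0.99, 0.94, 0.88, 0.74, 0.71]
shuffled = [0.71, 0.88, 0.99, 0.74, 0.94]  # same labels, different order

# Both orderings give the same CG@5 (up to floating-point rounding),
# even though the second ranking puts the best item in third place.
print(math.isclose(cg_at_k(ranking, 5), cg_at_k(shuffled, 5)))
```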


**DCG@k - Discounted Cumulative Gain**

In [82]:
from typing import List
import math
import numpy as np


def discounted_cumulative_gain(relevance: List[float], k: int, method: str = "standard") -> float:
    """Discounted Cumulative Gain

    Parameters
    ----------
    relevance : `List[float]`
        Video relevance list
    k : `int`
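        Number of top positions to include
    method : `str`, optional
        Metric implementation method, one of:
        `standard` - discounts each relevance: rel / log2(position + 1)
        `industry` - also uses an exponential gain: (2**rel - 1) / log2(position + 1)
        Any other value raises `ValueError`

    Returns
    -------
    score : `float`
        Metric score
    """
    # Index i is 0-based, so position i + 1 is discounted by log2(i + 2)
    if method == 'standard':
        score = sum(rel / math.log2(i + 2) for i, rel in enumerate(relevance[:k]))
    elif method == 'industry':
        # Parentheses matter: the whole gain (2**rel - 1) is divided by the discount
        score = sum((2 ** rel - 1) / math.log2(i + 2) for i, rel in enumerate(relevance[:k]))
    else:
        raise ValueError(f"method must be 'standard' or 'industry', got {method!r}")

    return score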

In [83]:
relevance = [0.99, 0.94, 0.88, 0.74, 0.71, 0.68]
k = 5
method = 'standard'
print(discounted_cumulative_gain(relevance, k, method))

2.6164401144680056
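
The two `method` options differ only in the gain term, so they coincide on binary labels and diverge on graded ones. A minimal sketch of that comparison, assuming a hypothetical helper `dcg_at_k` and made-up label lists (neither comes from the notebook):

```python
import math


def dcg_at_k(relevance, k, method="standard"):
    """Hypothetical helper covering both DCG variants."""
    if method not in ("standard", "industry"):
        raise ValueError(f"unknown method: {method!r}")
    total = 0.0
    for i, rel in enumerate(relevance[:k]):
        gain = rel if method == "standard" else 2 ** rel - 1
        total += gain / math.log2(i + 2)  # 0-based index i, discount log2(i + 2)
    return total


binary = [1, 0, 1, 1, 0]  # illustrative labels: relevant / not relevant
graded = [3, 2, 3, 0, 1]  # illustrative labels on a 0..3 scale

# With 0/1 labels, 2**rel - 1 equals rel, so both variants agree.
print(dcg_at_k(binary, 5, "standard") == dcg_at_k(binary, 5, "industry"))
# With graded labels, the exponential gain rewards highly relevant items more.
print(dcg_at_k(graded, 5, "standard"), dcg_at_k(graded, 5, "industry"))
```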


**nDCG - Normalized Discounted Cumulative Gain**

In [88]:
from typing import List
import math

import numpy as np


def normalized_dcg(relevance: List[float], k: int, method: str = "standard") -> float:
    """Normalized Discounted Cumulative Gain.

    Parameters
    ----------
    relevance : `List[float]`
        Video relevance list
    k : `int`
        Number of top positions to include
    method : `str`, optional
        Metric implementation method, one of:
        `standard` - discounts each relevance: rel / log2(position + 1)
        `industry` - also uses an exponential gain: (2**rel - 1) / log2(position + 1)
        Any other value raises `ValueError`

    Returns
    -------
    score : `float`
        Metric score
    """
    if method not in ('standard', 'industry'):
        raise ValueError(f"method must be 'standard' or 'industry', got {method!r}")

    def dcg(values):
        # Gain is rel for 'standard' and 2**rel - 1 for 'industry'
        top = values[:k]
        gains = top if method == 'standard' else [2 ** rel - 1 for rel in top]
        return sum(gain / math.log2(i + 2) for i, gain in enumerate(gains))

    # The ideal DCG uses the same formula on the labels sorted in descending order
    iDCG = dcg(sorted(relevance, reverse=True))
    score = dcg(relevance) / iDCG

    return score

In [89]:
relevance = [0.99, 0.94, 0.74, 0.88, 0.71, 0.68]
k = 5
method = 'standard'
print(normalized_dcg(relevance, k, method))

0.9962906539247512
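
Two quick sanity checks for nDCG are that an ideally ordered list scores 1.0 and that any other ordering of the same labels scores lower. A minimal sketch with the `standard` gain only, assuming a hypothetical helper `ndcg_at_k` that returns 0.0 when every label is zero (that fallback is an assumption, not something the notebook defines):

```python
import math


def ndcg_at_k(relevance, k):
    """Hypothetical helper: nDCG@k with the 'standard' gain."""

    def dcg(values):
        return sum(rel / math.log2(i + 2) for i, rel in enumerate(values[:k]))

    ideal = dcg(sorted(relevance, reverse=True))
    # Assumption: treat an all-zero list as score 0.0 instead of dividing by zero
    return dcg(relevance) / ideal if ideal > 0 else 0.0


relevance = [0.99, 0.94, 0.74, 0.88, 0.71, 0.68]

print(ndcg_at_k(sorted(relevance, reverse=True), 5))  # ideal order gives 1.0
print(ndcg_at_k(relevance[::-1], 5))                  # worst order scores below 1.0
```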


**Average nDCG - Average Normalized Discounted Cumulative Gain**

In [92]:
from typing import List
import math

import numpy as np

def avg_ndcg(list_relevances: List[List[float]], k: int, method: str = 'standard') -> float:
    """Average nDCG

    Parameters
    ----------
    list_relevances : `List[List[float]]`
        Video relevance matrix for various queries
    k : `int`
        Number of top positions to include
    method : `str`, optional
        Metric implementation method, one of:
        `standard` - discounts each relevance: rel / log2(position + 1)
        `industry` - also uses an exponential gain: (2**rel - 1) / log2(position + 1)
        Any other value raises `ValueError`

    Returns
    -------
    score : `float`
        Metric score
    """
    if method not in ('standard', 'industry'):
        raise ValueError(f"method must be 'standard' or 'industry', got {method!r}")

    def dcg(values):
        # Gain is rel for 'standard' and 2**rel - 1 for 'industry'
        top = values[:k]
        gains = top if method == 'standard' else [2 ** rel - 1 for rel in top]
        return sum(gain / math.log2(i + 2) for i, gain in enumerate(gains))

    av_nDCG = []

    for elem in list_relevances:
        # Normalize each query by the DCG of its own ideal (descending) ordering
        iDCG = dcg(sorted(elem, reverse=True))
        av_nDCG.append(dcg(elem) / iDCG)

    # Unweighted mean of the per-query nDCG values
    score = sum(av_nDCG) / len(av_nDCG)

    return score

In [93]:
list_relevances = [[0.99, 0.94, 0.88, 0.89, 0.72, 0.65],
                   [0.99, 0.92, 0.93, 0.74, 0.61, 0.68], 
                   [0.99, 0.96, 0.81, 0.73, 0.76, 0.69]]  
k = 5
method = 'standard'
print(avg_ndcg(list_relevances, k, method))


0.9961322104432755
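
When every query has the same number of items, the per-query loop can be replaced by array operations. A minimal NumPy sketch of that idea, assuming a rectangular relevance matrix, the `standard` gain only, and at least one non-zero label per query; the name `avg_ndcg_numpy` is hypothetical:

```python
import numpy as np


def avg_ndcg_numpy(list_relevances, k):
    """Hypothetical vectorized average nDCG@k ('standard' gain only)."""
    rel = np.asarray(list_relevances, dtype=float)  # shape: (n_queries, n_items)
    k = min(k, rel.shape[1])
    discounts = np.log2(np.arange(2, k + 2))        # log2(2), ..., log2(k + 1)
    dcg = (rel[:, :k] / discounts).sum(axis=1)      # one DCG@k per query
    ideal = -np.sort(-rel, axis=1)                  # each row sorted descending
    idcg = (ideal[:, :k] / discounts).sum(axis=1)   # assumed non-zero per query
    return float(np.mean(dcg / idcg))


list_relevances = [[0.99, 0.94, 0.88, 0.89, 0.72, 0.65],
                   [0.99, 0.92, 0.93, 0.74, 0.61, 0.68],
                   [0.99, 0.96, 0.81, 0.73, 0.76, 0.69]]
print(avg_ndcg_numpy(list_relevances, 5))
```

The result should agree with the loop-based `avg_ndcg` for `method='standard'` up to floating-point rounding.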
