# **Normalized Discounted Cumulative Gain**

This notebook shows how normalized discounted cumulative gain (nDCG) is computed. We train an SVD model on our dataset and calculate the nDCG of its predictions.

Importing the necessary libraries

In [None]:
import random
from scipy.sparse import csr_matrix, dok_matrix
from math import ceil
import numpy as np
import pandas as pd
from surprise import SVD
from surprise import Dataset
from surprise import Reader
from surprise import accuracy
from surprise.model_selection import train_test_split
from collections import defaultdict

**Param** *predictions*: The prediction object given by the model

**Param** *n*: (default = 10) The number of top predictions to keep for each user

**Description:** The following function iterates over the *predictions* object, groups the predictions by *uid*, and returns the *n* items with the highest estimated rating for each user.


In [None]:
def get_top_n(predictions, n=10):
    # First map the predictions to each user.
    top_n = defaultdict(list)
    for uid, iid, true_r, est, _ in predictions:
        top_n[uid].append((iid, true_r, est))

    # Then sort the predictions for each user and retrieve the n highest ones.
    for uid, user_ratings in top_n.items():
        user_ratings.sort(key=lambda x: x[2], reverse=True)
        top_n[uid] = user_ratings[:n]

    return top_n
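
To see the grouping and ranking step in isolation, here is a small sketch on hand-made tuples laid out like Surprise predictions (`uid`, `iid`, true rating, estimate, details). The helper name `top_n_by_estimate` and the sample values are illustrative assumptions, not part of the notebook's data. It uses `heapq.nlargest` instead of a full sort, which should pick the same items whenever the estimates are distinct.

```python
import heapq
from collections import defaultdict


def top_n_by_estimate(predictions, n=10):
    """Group (uid, iid, true_r, est, details) tuples by user, keep the n highest estimates."""
    per_user = defaultdict(list)
    for uid, iid, true_r, est, _ in predictions:
        per_user[uid].append((iid, true_r, est))
    # nlargest returns the items in descending order of the key (the estimate).
    return {
        uid: heapq.nlargest(n, ratings, key=lambda r: r[2])
        for uid, ratings in per_user.items()
    }


# Hypothetical sample data, not taken from rating.csv.
sample = [
    ("u1", "a", 4.0, 3.5, None),
    ("u1", "b", 2.0, 4.2, None),
    ("u1", "c", 5.0, 4.8, None),
    ("u2", "a", 3.0, 2.9, None),
]
toy_top = top_n_by_estimate(sample, n=2)
print(toy_top)
```

With `n=2`, user `u1` keeps items `c` and `b` (the two highest estimates), even though item `a` has a higher true rating than `b`.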

**Param** *user*: Represents the uid

**Param** *predictions*: The list of `(iid, true_r, est)` tuples for that user, as returned by `get_top_n`

**Description:** The following function iterates over the *predictions* for the *user* passed to it and, for each item whose true rating and estimate are both positive, calculates the ratio of the **ground truth** to the **predicted truth**.
The average of these ratios is returned as a key-value pair keyed by *user*. Note that this is a simplified ratio-based score; it does not apply the logarithmic position discount of the standard nDCG formula.

In [None]:
def calc_ndcg(user, predictions):
    # Ratio of true rating to estimated rating for each item.
    ratios = []
    for _, gt, pt in predictions:  # gt: ground truth, pt: predicted truth
        if pt > 0 and gt > 0:
            ratios.append(gt / pt)

    # Guard against division by zero when no item has positive ratings.
    ndcg = sum(ratios) / len(ratios) if ratios else 0.0
    return {user: ndcg}
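
For comparison, the textbook nDCG divides the discounted cumulative gain of the predicted ranking by that of the ideal ranking, rather than averaging rating ratios. The sketch below is not the method used in this notebook; the names `dcg_at_k` and `ndcg_at_k`, the linear gain, and the log2 discount are assumptions chosen to match the common definition. It accepts the same `(iid, true_r, est)` tuples that `get_top_n` produces.

```python
from math import log2


def dcg_at_k(relevances, k):
    """Sum rel_i / log2(i + 1) over the first k positions (positions are 1-indexed)."""
    return sum(rel / log2(pos + 1) for pos, rel in enumerate(relevances[:k], start=1))


def ndcg_at_k(user_ratings, k=10):
    """user_ratings: list of (iid, true_r, est) tuples for one user."""
    # Rank items by predicted rating, then read off the true ratings in that order.
    ranked = sorted(user_ratings, key=lambda r: r[2], reverse=True)
    predicted_order = [true_r for _, true_r, _ in ranked]
    # The ideal ranking sorts the same items by their true ratings.
    ideal_order = sorted(predicted_order, reverse=True)
    ideal = dcg_at_k(ideal_order, k)
    return dcg_at_k(predicted_order, k) / ideal if ideal > 0 else 0.0


# Hypothetical ratings for one user: the model ranks "a" above "b",
# although "b" has the higher true rating.
example = [("a", 3.0, 4.5), ("b", 5.0, 4.0), ("c", 1.0, 2.0)]
print(round(ndcg_at_k(example, k=3), 4))
```

A ranking that matches the true order scores 1.0, and misordered items lower the score. The ideal ordering here is computed only from the items passed in, so feeding it a truncated top-n list gives a different value than scoring against all of a user's test items.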

### Reading Data
Read the data with pandas, then load the DataFrame into the *data* variable using the `Reader` class and `Dataset.load_from_df` from **Surprise**

In [None]:
df = pd.read_csv('../rating.csv', sep="\t")
print(df.head())

reader = Reader(rating_scale=(1, 5))

data = Dataset.load_from_df(df[['U_ID', 'P_ID', 'RATING']], reader)

Splitting the data into train and test sets in a 4:1 ratio

In [None]:
trainset, testset = train_test_split(data, test_size=.20)

### Singular Value Decomposition Model
Define the algorithm, train it on the trainset, and predict ratings for the testset

In [None]:
algo = SVD()
algo.fit(trainset)

predictions = algo.test(testset)

In [None]:
predictions[1:5]

Then compute the root mean square error (RMSE)

In [None]:
accuracy.rmse(predictions)
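
As a reminder of what this number means, RMSE is the square root of the mean squared difference between the true and estimated ratings. Below is a minimal sketch on plain tuples, assuming the `(uid, iid, true_r, est, details)` layout used earlier; `rmse_by_hand` is an illustrative name and not part of Surprise.

```python
from math import sqrt


def rmse_by_hand(predictions):
    """Root mean squared error over (uid, iid, true_r, est, details) tuples."""
    if not predictions:
        raise ValueError("predictions is empty")
    squared_errors = [(true_r - est) ** 2 for _, _, true_r, est, _ in predictions]
    return sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical values: the errors are 1.0 and -1.0, so the RMSE is 1.0.
toy_predictions = [("u1", "a", 4.0, 3.0, None), ("u1", "b", 2.0, 3.0, None)]
print(rmse_by_hand(toy_predictions))
```

Because the errors are squared before averaging, a few large misses raise RMSE more than many small ones.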

Testing the model and computing the NDCG values to measure the accuracy

In [None]:
top_n = get_top_n(predictions, n=10)

In [None]:
# top_n is a dict, so convert its items to a list before slicing.
list(top_n.items())[1:5]

In [None]:
ndcg_values_final = []

# Only the first 21 users are evaluated; the loop stops once itr exceeds 20.
for itr, (uid, user_ratings) in enumerate(top_n.items(), start=1):
    res = calc_ndcg(uid, user_ratings)
    ndcg_values_final.append(res[uid])
    if itr > 20:
        break

In [None]:
print("Final ndcg value: ", sum(ndcg_values_final)/len(ndcg_values_final))