# <center>DataLab Cup 4: Recommender Systems</center>
<center>Shan-Hung Wu & DataLab</center>
<center>Fall 2023</center>

Team: 陳瑜旋轉陳玟旋轉陳瑜旋

Team Member: 111501538 劉杰閎、111062588 陳玟璇、111062697 吳律穎

## Platform: [Kaggle](https://www.kaggle.com/t/b06e248a3827434f80c4fdc6009d5fe0)

Please download the dataset and the environment source code from Kaggle.

## Environment Setting


In [1]:
import os
import random
import copy

import tensorflow as tf
import numpy as np
import pandas as pd
from tqdm import tqdm

from evaluation.environment import TrainingEnvironment, TestingEnvironment
from sentence_transformers import SentenceTransformer



In [2]:
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        # Currently, memory growth needs to be the same across GPUs
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        # Use only the first GPU (index 0)
        tf.config.experimental.set_visible_devices(gpus[0], 'GPU')
        logical_gpus = tf.config.experimental.list_logical_devices('GPU')
        print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
    except RuntimeError as e:
        # Memory growth must be set before GPUs have been initialized
        print(e)

2 Physical GPUs, 1 Logical GPUs



In [3]:
# Official hyperparameters for this competition (do not modify)
N_TRAIN_USERS = 1000
N_TEST_USERS = 2000
N_ITEMS = 209527
HORIZON = 2000
TEST_EPISODES = 5
SLATE_SIZE = 5

## Datasets

In [4]:
# Dataset paths
USER_DATA = os.path.join('dataset', 'user_data.json')
ITEM_DATA = os.path.join('dataset', 'item_data.json')

# Output file path
OUTPUT_PATH = os.path.join('output', 'output.csv')

### User Data

In [5]:
df_user = pd.read_json(USER_DATA, lines=True)
df_user

| | user_id | history |
|---|---|---|
| 0 | 0 | [42558, 65272, 13353] |
| 1 | 1 | [146057, 195688, 143652] |
| 2 | 2 | [67551, 85247, 33714] |
| 3 | 3 | [116097, 192703, 103229] |
| 4 | 4 | [68756, 140123, 135289] |
| ... | ... | ... |
| 1995 | 1995 | [95090, 131393, 130239] |
| 1996 | 1996 | [2360, 147130, 8145] |
| 1997 | 1997 | [99794, 138694, 157888] |
| 1998 | 1998 | [55561, 60372, 51442] |


### Item Data

In [6]:
df_item = pd.read_json(ITEM_DATA, lines=True)
df_item

| | item_id | headline | short_description |
|---|---|---|---|
| 0 | 0 | Over 4 Million Americans Roll Up Sleeves For O... | Health experts said it is too early to predict... |
| 1 | 1 | American Airlines Flyer Charged, Banned For Li... | He was subdued by passengers and crew when he ... |
| 2 | 2 | 23 Of The Funniest Tweets About Cats And Dogs ... | "Until you have a dog you don't understand wha... |
| 3 | 3 | The Funniest Tweets From Parents This Week (Se... | "Accidentally put grown-up toothpaste on my to... |
| 4 | 4 | Woman Who Called Cops On Black Bird-Watcher Lo... | Amy Cooper accused investment firm Franklin Te... |
| ... | ... | ... | ... |
| 209522 | 209522 | RIM CEO Thorsten Heins' 'Significant' Plans Fo... | Verizon Wireless and AT&T are already promotin... |
| 209523 | 209523 | Maria Sharapova Stunned By Victoria Azarenka I... | Afterward, Azarenka, more effusive with the pr... |
| 209524 | 209524 | Giants Over Patriots, Jets Over Colts Among M... | Leading up to Super Bowl XLVI, the most talked... |
| 209525 | 209525 | Aldon Smith Arrested: 49ers Linebacker Busted ... | CORRECTION: An earlier version of this story i... |


## Text Embedding from item descriptions

Here we use `SentenceTransformer` to compute text embeddings for each item's description. Every item gets a headline embedding and a short_description embedding, both with 768 dimensions.


In [7]:
dataset_dir = './dataset/'
if os.path.exists(dataset_dir + 'train_data_embedding.pkl'):
    df_item_train = pd.read_pickle(dataset_dir + 'train_data_embedding.pkl')
else:
    sbert = SentenceTransformer('all-mpnet-base-v2')
    df_item_train = pd.read_json(ITEM_DATA, lines=True)
    df_item_train['headline_embeddings'] = df_item_train['headline'].apply(lambda x: sbert.encode(x))
    df_item_train['short_description_embeddings'] = df_item_train['short_description'].apply(lambda x: sbert.encode(x))
    df_item_train.to_pickle(dataset_dir + 'train_data_embedding.pkl')

df_item_train

Unnamed: 0,item_id,headline,short_description,headline_embeddings,short_description_embeddings
0,0,Over 4 Million Americans Roll Up Sleeves For O...,Health experts said it is too early to predict...,"[-0.054995973, 0.10514701, 0.0009537986, -0.07...","[0.04689467, 0.089309394, -0.018575395, -0.029..."
1,1,"American Airlines Flyer Charged, Banned For Li...",He was subdued by passengers and crew when he ...,"[-0.020863444, 0.011131575, 0.0013632453, -0.0...","[0.017128233, -0.0062120855, 0.015252358, 0.02..."
2,2,23 Of The Funniest Tweets About Cats And Dogs ...,"""Until you have a dog you don't understand wha...","[0.017761054, 0.053476874, 6.918786e-05, -0.03...","[0.10238154, 0.07736524, 0.0020822838, -0.0614..."
3,3,The Funniest Tweets From Parents This Week (Se...,"""Accidentally put grown-up toothpaste on my to...","[-0.0029250348, 0.01137404, 0.0045979875, -0.0...","[0.04334459, 0.056244634, 0.0071496996, -0.057..."
4,4,Woman Who Called Cops On Black Bird-Watcher Lo...,Amy Cooper accused investment firm Franklin Te...,"[-0.0049342206, 0.053551663, 0.027952224, -0.0...","[-0.0066743735, 0.03416268, -0.00058029604, 0...."
...,...,...,...,...,...
209522,209522,RIM CEO Thorsten Heins' 'Significant' Plans Fo...,Verizon Wireless and AT&T are already promotin...,"[0.041218493, -0.007820907, -0.01887703, -0.02...","[-0.029524302, -0.0045847334, -0.054970894, -0..."
209523,209523,Maria Sharapova Stunned By Victoria Azarenka I...,"Afterward, Azarenka, more effusive with the pr...","[-0.047861934, -0.027825285, -0.0048302715, -0...","[0.03547541, -0.027677324, 0.019167567, -0.007..."
209524,209524,"Giants Over Patriots, Jets Over Colts Among M...","Leading up to Super Bowl XLVI, the most talked...","[-0.0816778, 0.022369152, 0.027179016, 0.02018...","[-0.020275101, 0.10664522, -0.007810726, -0.01..."
209525,209525,Aldon Smith Arrested: 49ers Linebacker Busted ...,CORRECTION: An earlier version of this story i...,"[-0.04274766, 0.12479968, -0.047635496, -0.057...","[0.044633802, 0.014033731, -0.004920267, -0.02..."


### Creating User Embedding dataframe

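+ We build a user embedding dataframe that represents the features of each user.
+ For each user, we take the three items in the user's click history, look up their embeddings, and average them to obtain the user's embedding. The last column is the concatenation of the headline and short description embeddings.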
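
The averaging step can also be written as a single array operation. The following is a minimal sketch, assuming the item embeddings are stacked into one NumPy array whose row `i` belongs to item `i`; the helper `build_user_embeddings` and the toy data are hypothetical and only illustrate the idea.

```python
import numpy as np

def build_user_embeddings(histories, item_embeddings):
    """Average the embeddings of the items in each user's history.

    histories: list of equally long lists of item ids, one per user.
    item_embeddings: array of shape (n_items, dim); row i belongs to item i.
    """
    history_ids = np.asarray(histories)        # (n_users, history_size)
    gathered = item_embeddings[history_ids]    # (n_users, history_size, dim)
    return gathered.mean(axis=1)               # (n_users, dim)

# Toy data (hypothetical): 4 items with 2-dimensional embeddings,
# 2 users with 3 clicked items each.
toy_items = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 2.0]])
toy_histories = [[0, 1, 2], [1, 2, 3]]
toy_users = build_user_embeddings(toy_histories, toy_items)
```

Applying the same helper to the headline array and to the short description array, then joining the two results with `np.concatenate(..., axis=1)`, would give the concatenated column.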


In [8]:
N_USERS = 2000
N_ITEMS = 209527
EMBEDDING_DIM = 768
HISTORY_SIZE = 3

In [9]:
if os.path.exists(dataset_dir + 'user_embedding.pkl'):
    df_user_embedding = pd.read_pickle(dataset_dir+'user_embedding.pkl')
else:
    df_user_embedding_list = []
    for user in range(N_USERS):
        # print(df_user.iloc[user])
        sum_headline = tf.zeros(shape=(EMBEDDING_DIM,)) # since all embeddings are (768,)
        sum_short_description = tf.zeros(shape=(EMBEDDING_DIM,)) # since all embeddings are (768,)
        for item in df_user.iloc[user]["history"]:
            headline_tensor = tf.convert_to_tensor(df_item_train.iloc[item]["headline_embeddings"])
            short_description_tensor = tf.convert_to_tensor(df_item_train.iloc[item]["short_description_embeddings"])
            sum_headline += headline_tensor
            sum_short_description += short_description_tensor
        sum_headline = tf.divide(sum_headline, HISTORY_SIZE)
        sum_short_description = tf.divide(sum_short_description, HISTORY_SIZE)
        concat_embedding = tf.concat([sum_headline, sum_short_description],axis=0)
        df_user_embedding_list.append([user, sum_headline.numpy(), sum_short_description.numpy(), concat_embedding.numpy()])
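    df_user_embedding = pd.DataFrame(df_user_embedding_list, columns = ['user_id', 'headline_embedding', 'short_description_embedding','concat_embeddings'])
    # Cache the result so the next run can load it instead of recomputing
    df_user_embedding.to_pickle(dataset_dir + 'user_embedding.pkl')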

In [10]:
print(df_user_embedding.iloc[0])
print(df_user_embedding.iloc[0]["concat_embeddings"][768])
display(df_user_embedding.head(5))

user_id                                                                        0
headline_embedding             [0.038156908, 0.041387293, -0.004624029, -0.02...
short_description_embedding    [0.03716896, 0.0512002, -0.025162613, -0.01830...
concat_embeddings              [0.038156908, 0.041387293, -0.004624029, -0.02...
Name: 0, dtype: object
0.03716896


Unnamed: 0,user_id,headline_embedding,short_description_embedding,concat_embeddings
0,0,"[0.038156908, 0.041387293, -0.004624029, -0.02...","[0.03716896, 0.0512002, -0.025162613, -0.01830...","[0.038156908, 0.041387293, -0.004624029, -0.02..."
1,1,"[-0.01718974, 0.04380578, -0.0033005949, 0.026...","[-0.022690356, 0.041603807, -0.009130617, -0.0...","[-0.01718974, 0.04380578, -0.0033005949, 0.026..."
2,2,"[-0.0044823308, -0.017107317, -0.038572405, -0...","[0.03541447, 0.021701857, -0.016264068, -0.025...","[-0.0044823308, -0.017107317, -0.038572405, -0..."
3,3,"[-0.03414116, 0.035626348, -0.024177575, 0.043...","[-0.032475274, 0.034354758, -0.0064822477, 0.0...","[-0.03414116, 0.035626348, -0.024177575, 0.043..."
4,4,"[-0.0039870082, 0.082041055, -0.01948834, -0.0...","[0.009200389, 0.0539891, -0.026606128, -0.0117...","[-0.0039870082, 0.082041055, -0.01948834, -0.0..."


### Creating Item embedding dataframe

We also build a new item embedding dataframe that drops the original text descriptions of each item. The last column is the concatenation of the headline and short description embeddings.

In [11]:
if (os.path.exists(dataset_dir+'item_embedding.pkl')):
    df_item_embedding = pd.read_pickle(dataset_dir+'item_embedding.pkl')
else:
    item_embedding_list = []
    for item in range(len(df_item_train)):
        headline_tensor = tf.convert_to_tensor(df_item_train.iloc[item]["headline_embeddings"])
        short_description_tensor = tf.convert_to_tensor(df_item_train.iloc[item]["short_description_embeddings"])
        concat_embedding = tf.concat([headline_tensor, short_description_tensor],axis=0)
        item_embedding_list.append([item, headline_tensor.numpy(), short_description_tensor.numpy(), concat_embedding.numpy()])
    df_item_embedding = pd.DataFrame(item_embedding_list, columns=["item_id","headline_embeddings","short_description_embeddings","concat_embeddings"])
    df_item_embedding.to_pickle(dataset_dir + 'item_embedding.pkl')

In [12]:
print(df_item_embedding.iloc[0])
print(df_item_embedding.iloc[0]["concat_embeddings"][768])
print("len of item dataframe is ", len(df_item_embedding))
display(df_item_embedding.head(5))

item_id                                                                         0
headline_embeddings             [-0.054995973, 0.10514701, 0.0009537986, -0.07...
short_description_embeddings    [0.04689467, 0.089309394, -0.018575395, -0.029...
concat_embeddings               [-0.054995973, 0.10514701, 0.0009537986, -0.07...
Name: 0, dtype: object
0.04689467
len of item dataframe is  209527


Unnamed: 0,item_id,headline_embeddings,short_description_embeddings,concat_embeddings
0,0,"[-0.054995973, 0.10514701, 0.0009537986, -0.07...","[0.04689467, 0.089309394, -0.018575395, -0.029...","[-0.054995973, 0.10514701, 0.0009537986, -0.07..."
1,1,"[-0.020863444, 0.011131575, 0.0013632453, -0.0...","[0.017128233, -0.0062120855, 0.015252358, 0.02...","[-0.020863444, 0.011131575, 0.0013632453, -0.0..."
2,2,"[0.017761054, 0.053476874, 6.918786e-05, -0.03...","[0.10238154, 0.07736524, 0.0020822838, -0.0614...","[0.017761054, 0.053476874, 6.918786e-05, -0.03..."
3,3,"[-0.0029250348, 0.01137404, 0.0045979875, -0.0...","[0.04334459, 0.056244634, 0.0071496996, -0.057...","[-0.0029250348, 0.01137404, 0.0045979875, -0.0..."
4,4,"[-0.0049342206, 0.053551663, 0.027952224, -0.0...","[-0.0066743735, 0.03416268, -0.00058029604, 0....","[-0.0049342206, 0.053551663, 0.027952224, -0.0..."


In [13]:
from scipy import spatial
from sklearn.metrics.pairwise import cosine_similarity

## Create user-user similarity matrix 

To account for the similarity between users, we use the user embeddings computed earlier.

In [14]:
display(df_user_embedding)

Unnamed: 0,user_id,headline_embedding,short_description_embedding,concat_embeddings
0,0,"[0.038156908, 0.041387293, -0.004624029, -0.02...","[0.03716896, 0.0512002, -0.025162613, -0.01830...","[0.038156908, 0.041387293, -0.004624029, -0.02..."
1,1,"[-0.01718974, 0.04380578, -0.0033005949, 0.026...","[-0.022690356, 0.041603807, -0.009130617, -0.0...","[-0.01718974, 0.04380578, -0.0033005949, 0.026..."
2,2,"[-0.0044823308, -0.017107317, -0.038572405, -0...","[0.03541447, 0.021701857, -0.016264068, -0.025...","[-0.0044823308, -0.017107317, -0.038572405, -0..."
3,3,"[-0.03414116, 0.035626348, -0.024177575, 0.043...","[-0.032475274, 0.034354758, -0.0064822477, 0.0...","[-0.03414116, 0.035626348, -0.024177575, 0.043..."
4,4,"[-0.0039870082, 0.082041055, -0.01948834, -0.0...","[0.009200389, 0.0539891, -0.026606128, -0.0117...","[-0.0039870082, 0.082041055, -0.01948834, -0.0..."
...,...,...,...,...
1995,1995,"[0.007958271, 0.06905828, -0.012985666, 0.0113...","[-0.005231034, 0.03755425, -0.013397035, -0.01...","[0.007958271, 0.06905828, -0.012985666, 0.0113..."
1996,1996,"[-0.036823038, 0.03161138, -0.017325308, 0.003...","[-0.010244309, -0.01270321, 0.00085817085, -0....","[-0.036823038, 0.03161138, -0.017325308, 0.003..."
1997,1997,"[0.02073842, 0.056454603, 0.000992029, 0.00940...","[0.012727796, 0.0049337894, -0.012769024, 0.02...","[0.02073842, 0.056454603, 0.000992029, 0.00940..."
1998,1998,"[0.016565206, 0.06024626, -0.0010410805, -0.01...","[0.025898555, 0.046939295, -0.02147156, -0.017...","[0.016565206, 0.06024626, -0.0010410805, -0.01..."


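Because we cannot collect data to learn the preferences of the test users, we first use the user embeddings above to compute the cosine similarity between each test user and each train user. When recommending items to a test user, we can then find the train users most similar to that user and recommend the items those train users like.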
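
As an illustration of the similarity computation, the sketch below computes all pairwise cosine similarities with one matrix product instead of a double loop. It assumes the embeddings are available as NumPy arrays with no all-zero rows; `cosine_similarity_matrix` and the toy vectors are hypothetical names introduced for this example.

```python
import numpy as np

def cosine_similarity_matrix(a, b):
    """Return S with S[i, j] = cosine similarity between a[i] and b[j]."""
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_unit @ b_unit.T

# Toy data (hypothetical): 2 "train" users and 3 "test" users in 2 dimensions.
toy_train = np.array([[1.0, 0.0], [1.0, 1.0]])
toy_test = np.array([[2.0, 0.0], [0.0, 3.0], [-1.0, 0.0]])
toy_sim = cosine_similarity_matrix(toy_train, toy_test)  # shape (2, 3)
```

Rows correspond to train users and columns to test users, which matches the layout of the matrix built in the next cell.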

In [15]:
if(os.path.exists(dataset_dir + 'user_user_similarity_matrix.pkl')):
    df_user_user_similarity_matrix = pd.read_pickle(dataset_dir + 'user_user_similarity_matrix.pkl')
else:
    user_user_similarity_matrix = []
    for train_user in range(0,1000):
        row_similarity_list = []
        for test_user in range(1000,2000):
            # Named `similarity` to avoid shadowing the imported sklearn cosine_similarity
            similarity = 1 - spatial.distance.cosine(df_user_embedding.iloc[train_user]["concat_embeddings"], df_user_embedding.iloc[test_user]["concat_embeddings"])
            row_similarity_list.append(similarity)
        user_user_similarity_matrix.append(row_similarity_list)
    # print(user_user_similarity_matrix)
    columns_names = list(range(1000,2000))
    df_user_user_similarity_matrix = pd.DataFrame(user_user_similarity_matrix, columns = columns_names)
    df_user_user_similarity_matrix.to_pickle(dataset_dir + 'user_user_similarity_matrix.pkl')


In [16]:
display(df_user_user_similarity_matrix)

Unnamed: 0,1000,1001,1002,1003,1004,1005,1006,1007,1008,1009,...,1990,1991,1992,1993,1994,1995,1996,1997,1998,1999
0,0.281648,0.208695,0.125609,0.179470,0.183822,0.109009,0.052345,0.202675,0.335739,0.217125,...,0.173621,0.222536,0.221223,0.189922,0.258909,0.221459,0.169270,0.253004,0.236725,0.312044
1,0.107756,0.157925,0.171894,0.191783,0.215627,0.263724,0.066528,0.048420,0.140302,0.219080,...,0.188666,0.127834,0.170375,0.164114,0.132382,0.155832,0.172978,0.137667,0.036835,0.184401
2,0.340798,0.104089,0.117053,0.121768,0.192938,0.217003,-0.007815,0.082012,0.093833,0.201811,...,0.162832,0.347783,0.093973,0.229857,0.219791,0.337218,-0.006569,0.166411,0.199871,0.136065
3,0.152984,0.204095,0.054092,0.151933,0.109114,0.051870,0.128989,0.186338,0.399219,0.216707,...,0.244051,0.087133,0.480642,0.195987,0.244529,0.177108,0.142491,0.211308,-0.009201,0.387058
4,0.410860,0.238368,0.188938,0.215626,0.339893,0.351212,0.086864,0.204705,0.207856,0.335223,...,0.293352,0.420971,0.161138,0.279038,0.344157,0.386043,0.085055,0.218188,0.275574,0.251951
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
995,0.318691,0.269468,0.119631,0.186885,0.302532,0.243200,0.058353,0.258401,0.215519,0.326197,...,0.224176,0.286827,0.162822,0.274992,0.254537,0.387950,0.023907,0.317955,0.197057,0.175260
996,0.246249,0.342056,0.188454,0.261937,0.381595,0.282175,0.023506,0.282524,0.276801,0.293508,...,0.296109,0.172917,0.195978,0.312822,0.194241,0.184523,0.052026,0.216403,0.212665,0.304270
997,0.253551,0.193244,0.278929,0.220450,0.320774,0.388695,0.158398,0.211402,0.324682,0.186920,...,0.220253,0.273932,0.362760,0.306700,0.410991,0.453089,0.130172,0.230232,0.169021,0.406711
998,0.232465,0.211833,0.121255,0.133941,0.225623,0.214997,0.058819,0.195395,0.183076,0.203365,...,0.160095,0.239532,0.179513,0.185097,0.231373,0.376211,0.013491,0.250970,0.063096,0.201342


## Simulation Environments (Testing)

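Before running the testing environment, we load the data collected earlier in the training environment and use it as the basis for our recommendations. Notably, we record only the items a user actually clicked and do not record the items that were shown but not clicked. We also tried recording everything and feeding it to the models described at the end of this report, but that biased the models, so in the end we kept only the clicked items.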

### Load Experience Data

When loading the experience data, we use a dictionary named `user_item_info` to record which items each user clicked and how many times each item was clicked.
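
The nested count structure can be sketched with the standard library alone. The example assumes one `user_id, item_id` pair per line in the click log; `load_click_counts` and the in-memory log are hypothetical stand-ins for illustration.

```python
import io
from collections import Counter, defaultdict

def load_click_counts(lines):
    """Build {user_id: Counter({item_id: click_count})} from 'user, item' lines."""
    counts = defaultdict(Counter)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        user_id, item_id = (int(field) for field in line.split(","))
        counts[user_id][item_id] += 1
    return counts

# Toy log (hypothetical): user 0 clicked item 42 twice and item 7 once.
toy_log = io.StringIO("0, 42\n0, 42\n0, 7\n3, 42\n")
toy_counts = load_click_counts(toy_log)
```

A `Counter` returns 0 for unseen items, so no `try`/`except KeyError` is needed when incrementing.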

In [17]:
user_item_info = {uid: {} for uid in range(N_TEST_USERS)}  # for every element: {user_id: {item_id: click_count}}

with open('./dataset/clicked_ids_output_final.txt', 'r') as file:
    for line in file:
        line = line.strip().split(', ')
        user_id = int(line[0])
        item_id = int(line[1])
        
        try:
            user_item_info[user_id][item_id] += 1
        except KeyError:
            user_item_info[user_id][item_id] = 1

### Top 3 alike user

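As noted above, the preferences of the test users cannot be obtained by collecting data, so we use the user-user similarity matrix computed earlier to find the three train users closest to each test user. We then normalize the three similarities so that they sum to 1 and use them as weights. Finally, we multiply each of these train users' clicked items and click counts by the corresponding weight to obtain the test user's interest items and click counts.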
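
A small worked example of this weighting, written as a sketch with plain dictionaries. It assumes the top similarities are positive; `blend_neighbor_clicks` and the toy numbers are hypothetical.

```python
def blend_neighbor_clicks(similarities, neighbor_clicks, k=3):
    """Blend the click counts of the k most similar train users.

    similarities: {train_user_id: similarity to the test user}
    neighbor_clicks: {train_user_id: {item_id: click_count}}
    """
    top = sorted(similarities.items(), key=lambda pair: pair[1], reverse=True)[:k]
    total = sum(sim for _, sim in top)
    blended = {}
    for user_id, sim in top:
        weight = sim / total  # the k weights sum to 1
        for item_id, count in neighbor_clicks.get(user_id, {}).items():
            blended[item_id] = blended.get(item_id, 0.0) + weight * count
    return blended

# Toy data (hypothetical): user 3 is the least similar and is ignored.
toy_similarities = {0: 0.5, 1: 0.3, 2: 0.2, 3: 0.1}
toy_clicks = {0: {10: 4}, 1: {10: 2, 11: 10}, 2: {12: 5}, 3: {13: 100}}
toy_profile = blend_neighbor_clicks(toy_similarities, toy_clicks)
```

Here item 10 receives 0.5 × 4 + 0.3 × 2 = 2.6 weighted clicks, item 11 receives 0.3 × 10 = 3.0, and item 12 receives 0.2 × 5 = 1.0.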

In [18]:
# ensure test user is empty dict
for uid in range(1000, 2000):
    user_item_info[uid] = {}

for test_user_id in range(1000, 2000):
    # print("Origin similarity:", df_user_item_similarity_latest.iloc[test_user][0])
    column_data = df_user_user_similarity_matrix[test_user_id]
    top_three_similarity = column_data.nlargest(3) # top 3 alike user and its similarity
    
    sum_of_similarities = top_three_similarity.sum() # used for normalization
    normalized_list = [top_three_similarity.iloc[i]/sum_of_similarities for i in range(3)]
    # print(normalized_list, "\n")
    
    for i in range(3):
        alike_user_id = top_three_similarity.index[i]
        interest_items = list(user_item_info[alike_user_id].keys())
        interest_count = list(user_item_info[alike_user_id].values())
        interest_count = [x * normalized_list[i] for x in interest_count]
                
        for j in range(len(interest_items)):
            item_id = interest_items[j]
            item_count = interest_count[j]
            try:
                user_item_info[test_user_id][item_id] += item_count
            except KeyError:
                user_item_info[test_user_id][item_id] = item_count

In [19]:
item_dict = {}
item_weights = {}

for user_id in tqdm(range(N_TEST_USERS)):
    item_dict[user_id] = list(user_item_info[user_id].keys())
    interest_count = np.array(list(user_item_info[user_id].values()))
    total_count = interest_count.sum()
    item_weights[user_id] = interest_count / total_count

100%|██████████| 2000/2000 [00:01<00:00, 1590.62it/s]


In [20]:
TESTING = True

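Once we have the interest items and click counts for every user, whenever the testing environment presents a user we recommend items from that user's interest items, and items with larger click counts have a higher chance of being recommended. In addition, after a user clicks an item, the probability that the same user clicks it again in the same episode drops sharply, so when a click occurs we set that user's weight for the clicked item to 0 and the item no longer appears in the recommendations for the rest of the episode. (The weights are copied from the originals at the start of each episode, so changing them does not affect the next episode.)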
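
One way to draw five distinct items in a single call is weighted sampling without replacement. The sketch below uses NumPy's `Generator.choice` for this, which is an alternative to resampling until the slate has no duplicates, not the method used in the cell that follows. `sample_slate` and the toy weights are hypothetical.

```python
import numpy as np

def sample_slate(items, weights, slate_size, rng):
    """Draw slate_size distinct items; items with weight 0 are never drawn."""
    weights = np.asarray(weights, dtype=float)
    if np.count_nonzero(weights) < slate_size:
        raise ValueError("not enough items with non-zero weight")
    probabilities = weights / weights.sum()  # renormalize after zeroing clicks
    picked = rng.choice(len(items), size=slate_size, replace=False, p=probabilities)
    return [items[i] for i in picked]

rng = np.random.default_rng(0)
toy_items = [101, 102, 103, 104, 105, 106, 107]
toy_weights = [5.0, 1.0, 1.0, 0.0, 2.0, 1.0, 3.0]  # item 104 was already clicked
toy_slate = sample_slate(toy_items, toy_weights, slate_size=5, rng=rng)
```

The explicit check matters because a user with fewer than five remaining candidates cannot fill a slate from the interest items alone.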

In [21]:
if(TESTING):
    # Initialize the testing environment
    test_env = TestingEnvironment()
    scores = []

    # The item_ids here is for the random recommender
    item_ids = [i for i in range(N_ITEMS)]

    # Repeat the testing process for 5 times
    for _ in range(TEST_EPISODES):
        # [TODO] Load your model weights here (in the beginning of each testing episode)
        # [TODO] Code for loading your model weights...
        # df_test_similarity = df_user_item_similarity_latest.copy() # reload from train
        current_weights = copy.deepcopy(item_weights)
        
        # Start the testing process
        with tqdm(desc='Testing') as pbar:
            # Run as long as there exist some active users
            while test_env.has_next_state():
                # Get the current user id
                cur_user = test_env.get_state()

                # [TODO] Employ your recommendation policy to generate a slate of 5 distinct items
                # [TODO] Code for generating the recommended slate...
                # Here we provide a simple random implementation
                # slate = random.sample(item_ids, k=SLATE_SIZE)

                interest_items = item_dict[cur_user]
                weight = current_weights[cur_user]
                
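                slate = random.choices(interest_items, weight, k=SLATE_SIZE)
                # random.choices samples with replacement, so resample until all items are distinct
                while len(np.unique(slate)) != SLATE_SIZE:
                    slate = random.choices(interest_items, weight, k=SLATE_SIZE)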
    
                    
                # Get the response of the slate from the environment
                clicked_id, in_environment = test_env.get_response(slate)
                if (clicked_id != -1):
                    weight_idx = interest_items.index(clicked_id)
                    current_weights[cur_user][weight_idx] = 0
                    

                # [TODO] Update your model here (optional)
                # [TODO] You can update your model at each step, or perform a batched update after some interval
                # [TODO] Code for updating your model...
   
   
                # Update the progress indicator
                pbar.update(1)

        # Record the score of this testing episode
        scores.append(test_env.get_score())

        # Reset the testing environment
        test_env.reset()

        # [TODO] Delete or reset your model weights here (in the end of each testing episode)
        # [TODO] Code for deleting your model weights...
        # df_test_similarity = df_user_item_similarity.copy()
    # Calculate the average scores 
    avg_scores = [np.average(score) for score in zip(*scores)]

    # Generate a DataFrame to output the result in a .csv file
    df_result = pd.DataFrame([[user_id, avg_score] for user_id, avg_score in enumerate(avg_scores)], columns=['user_id', 'avg_score'])
    df_result.to_csv(OUTPUT_PATH, index=False)
    display(df_result)
else:
    print("Not to test this time")

Testing: 268914it [13:46, 325.39it/s]
Testing: 271462it [14:09, 319.64it/s]
Testing: 271148it [14:00, 322.46it/s]
Testing: 271236it [14:08, 319.56it/s]
Testing: 269362it [13:55, 322.54it/s]


| | user_id | avg_score |
|---|---|---|
| 0 | 0 | 0.0095 |
| 1 | 1 | 0.8039 |
| 2 | 2 | 0.0243 |
| 3 | 3 | 0.0226 |
| 4 | 4 | 0.0159 |
| ... | ... | ... |
| 1995 | 1995 | 0.0029 |
| 1996 | 1996 | 0.0032 |
| 1997 | 1997 | 0.0029 |
| 1998 | 1998 | 0.0028 |


In [22]:
if(TESTING):
    total_score = df_result['avg_score'].sum()
    print(f"Total test score:{total_score}")
    print(f"eval metric: {1-total_score/2000}")

Total test score:135.2122
eval metric: 0.9323939


## Models we have tried during the competition

### 1. Vanilla FunkSVD
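* We trained a FunkSVD model as the policy network. The environment provides the correct label, and we compute the loss between this label and the predicted result and minimize the total loss.
* Result: the total loss decreased, but the average score did not improve. The loss was probably poorly defined: the labels of items that were not selected should not all be set to 0, because any of them might be the target item.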

### 2. Q-learning
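* The competition task resembles a reinforcement learning (RL) problem, so we tried Q-learning. We define the state as the user id and the action as a slate of five item ids. The reward is the click result returned by the environment for the slate: 1 when an item is clicked and -1 otherwise.
* Result: after several episodes the cumulative reward showed no upward trend, and each episode took a long time to run. We suspect the action set is too large, so we abandoned this method.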
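
To put a number on the size of the action set, the short calculation below counts the unordered slates of five distinct items, which comes out on the order of 10^24. It is only a back-of-the-envelope sketch; treating the Q-function as a table with one entry per (user, slate) pair is an assumption made for illustration.

```python
import math

N_ITEMS = 209527       # number of items in the competition
SLATE_SIZE = 5         # items per slate
N_TRAIN_USERS = 1000   # number of states when the state is the user id

# Unordered slates of 5 distinct items
n_slates = math.comb(N_ITEMS, SLATE_SIZE)
# Entries of a tabular Q-function over (state, action) pairs (illustrative assumption)
n_q_entries = N_TRAIN_USERS * n_slates

print(f"distinct slates:   {n_slates:.3e}")
print(f"tabular Q entries: {n_q_entries:.3e}")
```

An episode visits only a tiny fraction of these actions, which is consistent with the lack of improvement reported above.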

### 3. DNN-based recommender
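* We built a DNN-based model from several embedding layers and dense layers. It takes a user id and an item id as input and predicts a rating from 1 to 5, where 1 means no interest at all and 5 means extremely interested.
* Result: the idea was that accurate predictions would tell us each user's rating for every item, so we could recommend the items rated 5 in the environment. However, perhaps because the model was poorly designed or the collected data was still insufficient (we had already run the training environment to collect more than 700,000 additional records), the training loss decreased very slowly and remained fairly high. We searched for the cause for a long time without finding it, so we did not adopt this method either.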

### 4. Item-based Collaborative Filtering
* This method considers the similarity between users and items and recommends the items most similar to the user first.
* The code is as follows:

#### User-Item Similarity Matrix

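Here we compute the cosine similarity between every user and every item and store the result in a dataframe. For example, the value at row 10, column 100 is the similarity between user 10 and item 100; the higher the value, the more likely the user is to like the item.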

In [23]:
dataset_dir = './dataset/'
if os.path.exists(dataset_dir + 'user_item_similarity.pkl'):
    df_user_item_similarity = pd.read_pickle(dataset_dir + 'user_item_similarity.pkl')
else:
    with open(dataset_dir+'user_item_similarity.txt','a') as fin:
        user_item_similarity_matrix = []
        for user in range(N_USERS):
            cosine_similarity_list = []
            for item in range(N_ITEMS):
                # calculating cosine similarity
                # Named `similarity` to avoid shadowing the imported sklearn cosine_similarity
                similarity = 1 - spatial.distance.cosine(df_user_embedding.iloc[user]["concat_embeddings"], df_item_embedding.iloc[item]["concat_embeddings"])
                cosine_similarity_list.append(similarity)
            
            user_item_similarity_matrix.append(cosine_similarity_list)
            fin.write(",".join([str(x) for x in cosine_similarity_list]))
            fin.write('\n')
            fin.flush()
        df_user_item_similarity = pd.DataFrame(user_item_similarity_matrix)
        df_user_item_similarity.to_pickle(dataset_dir + 'user_item_similarity.pkl')

In [24]:
display(df_user_item_similarity.head(5))
print(df_user_item_similarity.shape)

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,209517,209518,209519,209520,209521,209522,209523,209524,209525,209526
0,0.062287,0.003323,0.261205,0.274883,0.064868,0.013853,0.226747,0.061278,0.196566,0.109174,...,0.043999,0.059647,0.192302,0.062932,0.180494,0.012018,0.048684,0.079522,-0.015544,0.075266
1,0.221764,0.055316,0.112049,0.190925,0.091024,0.078525,0.017141,0.173482,0.051898,0.018566,...,-0.004871,-0.035946,0.117543,-0.005578,0.029166,0.013565,0.055716,0.008542,0.07323,0.03899
2,0.106126,0.022207,0.117865,0.251699,-0.012615,-0.014647,0.126634,0.021103,0.089571,0.015671,...,0.03622,0.084124,-0.012249,0.043811,0.079477,-0.061269,0.03782,0.004233,0.000226,0.029401
3,0.093782,0.016448,0.107896,0.075283,0.06736,-0.035875,0.099989,0.168689,0.12879,0.098932,...,0.056915,0.040844,0.170552,0.072513,0.178354,0.060635,0.035779,0.203117,0.084404,0.080729
4,0.122536,0.071247,0.153996,0.225814,0.183863,0.045011,0.210932,0.074886,0.209297,0.033176,...,0.063099,0.134406,0.060474,0.110714,0.092821,0.012666,0.129592,0.059556,0.067587,0.003772


(2000, 209527)


### Train Item-based CF

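During training we keep two dataframes, `df_next` and `df_current`. Before each iteration, `df_current` is copied from `df_next`. Each time we recommend to a user, we pick the five items with the highest similarity in that user's row of `df_current`. An item that has been clicked should not be recommended again in the same iteration, so we set its score to -1 for the current iteration and to 1 for the next iteration. If the user clicks none of the five recommended items, we penalize those items by multiplying their scores by 0.9.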
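
The update rule can be illustrated on a single user's score row. This is a minimal sketch using NumPy arrays in place of dataframe rows; `top_k_items`, `update_scores`, `PENALTY`, and the toy scores are hypothetical names and values chosen for the example.

```python
import numpy as np

PENALTY = 0.9  # multiplier applied to every item of a slate that got no click

def top_k_items(scores, k):
    """Return the indices of the k highest scores, best first."""
    candidates = np.argpartition(scores, -k)[-k:]
    return candidates[np.argsort(scores[candidates])[::-1]]

def update_scores(current, nxt, slate, clicked_id):
    """Apply the click / no-click rule to one user's score rows in place."""
    if clicked_id == -1:
        # No click: penalize every recommended item in both rows
        current[slate] *= PENALTY
        nxt[slate] *= PENALTY
    else:
        # Click: hide the item for this iteration, boost it for the next one
        current[clicked_id] = -1.0
        nxt[clicked_id] = 1.0

toy_next = np.array([0.1, 0.9, 0.3, 0.8, 0.5, 0.7, 0.2])
toy_current = toy_next.copy()

first_slate = top_k_items(toy_current, k=5)
update_scores(toy_current, toy_next, first_slate, clicked_id=1)   # user clicks item 1

second_slate = top_k_items(toy_current, k=5)
update_scores(toy_current, toy_next, second_slate, clicked_id=-1)  # no click this time
```

`np.argpartition` avoids sorting the full row of 209527 scores on every step, which is one possible way to reduce the per-step cost.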

In [25]:
df_next = df_user_item_similarity
model_save_dir = "./model/item_based_CF"
if not os.path.exists(model_save_dir):
    os.makedirs(model_save_dir)

In [26]:
# Initialize the training environment
train_env = TrainingEnvironment()

training_scores = []
# Reset the training environment (this can be useful when you have finished one episode of simulation and do not want to re-initialize a new environment)
train_env.reset()
for epoch in range(1):
    # APPLY CHANGE TO CURRENT DF.
    df_current = df_next.copy()
    with tqdm(desc='Training') as pbar:
        # Check if there exist any active users in the environment
        while (train_env.has_next_state()):
            # Get the current user ID
            user_id = train_env.get_state()
            sorted_indices = np.argsort(df_current.iloc[user_id].to_numpy())
            # Recommend the five items with the highest similarity scores
            slate = sorted_indices[-5:].tolist()
            clicked_id, in_environment = train_env.get_response(slate)
            # Update similarity matrix here
            if(clicked_id == -1): # no item in the slate was clicked
                for item in slate:
                    # A single .iloc[row, col] call ensures the write reaches the dataframe
                    df_current.iloc[user_id, item] *= 0.9
                    df_next.iloc[user_id, item] *= 0.9
            else:
                # Hide the clicked item for the rest of this epoch and boost it for the next
                df_current.iloc[user_id, clicked_id] = -1
                df_next.iloc[user_id, clicked_id] = 1
                        
            pbar.update(1)
    training_scores.append(train_env.get_score())
    df_next.to_pickle(f"{model_save_dir}/epoch_{epoch}.pkl")
    train_env.reset()

# Get the normalized session length score of all users
avg_train_scores = [np.average(score) for score in zip(*training_scores)]
df_train_score = pd.DataFrame([[user_id, score] for user_id, score in enumerate(avg_train_scores)], columns=['user_id', 'avg_score'])
display(df_train_score)
print(training_scores)
total_score = df_train_score['avg_score'].sum()
print(f"Total train score:{total_score}")

Training: 14896it [06:24, 38.72it/s]


| | user_id | avg_score |
|---|---|---|
| 0 | 0 | 0.0025 |
| 1 | 1 | 0.0035 |
| 2 | 2 | 0.0025 |
| 3 | 3 | 0.0035 |
| 4 | 4 | 0.0025 |
| ... | ... | ... |
| 995 | 995 | 0.0035 |
| 996 | 996 | 0.0025 |
| 997 | 997 | 0.0025 |
| 998 | 998 | 0.0025 |


[[0.0025, 0.0035, 0.0025, 0.0035, 0.0025, 0.0035, 0.0025, 0.0025, 0.0025, 0.0055, 0.0065, 0.0025, 0.0035, 0.0025, 0.0025, 0.0045, 0.0025, 0.0025, 0.0025, 0.0025, 0.0035, 0.0025, 0.0035, 0.005, 0.0025, 0.0025, 0.0075, 0.0025, 0.0035, 0.0025, 0.004, 0.0045, 0.004, 0.0025, 0.0025, 0.0025, 0.0025, 0.0025, 0.0025, 0.0025, 0.0025, 0.0025, 0.139, 0.0025, 0.005, 0.006, 0.0025, 0.005, 0.0025, 0.0025, 0.004, 0.005, 0.011, 0.0045, 0.0025, 0.0035, 0.014, 0.0025, 0.0025, 0.0075, 0.0045, 0.0025, 0.0025, 0.004, 0.0025, 0.0035, 0.0025, 0.0025, 0.0025, 0.0025, 0.0035, 0.0035, 0.0035, 0.0055, 0.0025, 0.005, 0.0025, 0.0045, 0.004, 0.0025, 0.0195, 0.0025, 0.0025, 0.004, 0.0035, 0.0035, 0.0025, 0.0025, 0.0325, 0.0025, 0.0085, 0.0035, 0.0035, 0.0035, 0.0035, 0.005, 0.0025, 0.004, 0.005, 0.0025, 0.0045, 0.0025, 0.0035, 0.0025, 0.0035, 0.004, 0.006, 0.0025, 0.0025, 0.0025, 0.0025, 0.0025, 0.0035, 0.0025, 0.0035, 0.0025, 0.0025, 0.0045, 0.0025, 0.0025, 0.005, 0.0025, 0.0025, 0.004, 0.011, 0.0025, 0.004, 0.0035