Let's load the cleaned dataframe and the cosine similarity matrix computed with the EmbeddingGemma model. We will repeat the same steps used for the previous models.

In [2]:
import pandas as pd
import gc

In [4]:
sg_df_clean = pd.read_csv("sg_df_clean.csv")

In [5]:
Full_cosine_matrix = pd.read_pickle("Full_cosine_matrix_EmbeddingGemma.pkl")

In [6]:
gc.collect()

0

As in notebooks 3 and 5, we first convert each rating to a ratio between 0 and 1 and then compute a weighted game score:

In [8]:
sg_df_clean["rating_ratio"] = sg_df_clean["rating"]/100
print(sg_df_clean["rating_ratio"])

0        0.352941
1        0.913793
2        1.000000
3        0.862069
4        0.639706
           ...   
72366    0.808989
72367    0.681199
72368    1.000000
72369    1.000000
72370    0.666667
Name: rating_ratio, Length: 72371, dtype: float64


In [9]:
c = sg_df_clean["rating_ratio"].mean()    # mean rating ratio across all games
m = sg_df_clean["user_reviews"].median()  # median number of user reviews per game

In [11]:
def weighted_game_score(x, c=c, m=m):
    # Blend the game's own rating with the global mean c; the more reviews a game
    # has relative to m, the more its own rating counts
    r = x["rating_ratio"]  # rating ratio of this game (row)
    n = x["user_reviews"]  # number of user reviews of this game (row)
    return (n * r) / (n + m) + (m * c) / (n + m)
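The effect of this formula is easiest to see on a small worked example. The sketch below restates the same weighted average as a standalone function and applies it to two imaginary games with a perfect rating. The values `c_demo` and `m_demo` are assumptions chosen for illustration; they are not the values computed from `sg_df_clean`.

```python
def weighted_score(r, n, c, m):
    """Blend a game's own rating r (based on n reviews) with the global mean c."""
    # Algebraically the same as (n*r)/(n+m) + (m*c)/(n+m)
    return (n * r + m * c) / (n + m)


# Hypothetical values for illustration only
c_demo = 0.75  # assumed mean rating ratio
m_demo = 20    # assumed median review count

few = weighted_score(r=1.0, n=2, c=c_demo, m=m_demo)      # perfect rating, 2 reviews
many = weighted_score(r=1.0, n=2000, c=c_demo, m=m_demo)  # perfect rating, 2000 reviews

print(round(few, 4))   # 0.7727, pulled toward the mean
print(round(many, 4))  # 0.9975, stays close to its own rating
```

With few reviews the score stays near the global mean, so a game with two perfect reviews cannot outrank a widely reviewed game with a slightly lower rating.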

In [12]:
sg_df_clean["game_score"] = sg_df_clean.apply(weighted_game_score, axis=1)

In [13]:
sg_df_clean[["name","game_score"]].sample(20)

       name                                           game_score
4665   Armored Brigade                                0.898101
37591  Awakening: The Golden Age Collector's Edition  0.512823
10112  Dino Delivery                                  0.60886
40366  Grim Earth                                     0.871797
30229  Highlands                                      0.809754
24254  Little Island                                  0.672588
55569  Nurtured human Plan: Meet a Date！             0.724364
22340  MX vs ATV Legends                              0.561361
22485  Heart and Axe                                  0.672588
34816  Nexagon: Deathmatch                            0.794876


In [14]:
def recommend_games(game, df=sg_df_clean, sim_matrix=Full_cosine_matrix):
    # 1. Find the game's index in the dataframe so we can look it up in the similarity matrix
    try:
        index = df[df['name'] == game].index[0]
    except IndexError:
        # No row matched: the name must be spelled exactly as it appears on Steam
        return "The game you typed does not exist in the database. Please make sure the spelling exactly matches the game's name on Steam."

    # 2. Create a temporary copy so the original dataframe is not modified
    temp_df = df.copy()

    # 3. Add a column with the cosine similarity of every game to the specified game
    temp_df['similarity'] = sim_matrix[index]

    # 4. FILTER (get the top 20 matches)
    # Sort by similarity (descending); iloc[1:21] takes the top 20 and skips the first row,
    # which is expected to be the game itself since it is the most similar
    top_similar = temp_df.sort_values('similarity', ascending=False).iloc[1:21]

    # 5. RANK (pick the 5 best-quality games)
    # Sort the 20 candidates by 'game_score' and keep the top 5
    top_picks = top_similar.sort_values('game_score', ascending=False).head(5)

    # 6. Return only the relevant columns
    cols = ['name', 'similarity', 'game_score']
    return top_picks[cols]
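The filter-then-rank idea can also be shown without pandas. The following is a minimal sketch on toy data, not the notebook's implementation: the function `recommend`, its parameters `k_similar` and `k_final`, and all names, scores, and similarities are made up for illustration. Unlike the function above, it excludes the query game by index instead of assuming it sorts first.

```python
def recommend(query, names, scores, sim, k_similar=3, k_final=2):
    """Two-stage pick: keep the k_similar most similar games, then re-rank them by score."""
    if query not in names:
        return []  # unknown game: nothing to recommend
    q = names.index(query)

    # Stage 1 (filter): all other games, most similar to the query first
    others = [i for i in range(len(names)) if i != q]
    candidates = sorted(others, key=lambda i: sim[q][i], reverse=True)[:k_similar]

    # Stage 2 (rank): order the candidates by quality score, best first
    picks = sorted(candidates, key=lambda i: scores[i], reverse=True)[:k_final]
    return [(names[i], sim[q][i], scores[i]) for i in picks]


# Toy data (hypothetical): five games, their quality scores, and a symmetric similarity matrix
names = ["A", "B", "C", "D", "E"]
scores = [0.90, 0.60, 0.95, 0.80, 0.99]
sim = [
    [1.0, 0.8, 0.7, 0.6, 0.1],
    [0.8, 1.0, 0.5, 0.4, 0.2],
    [0.7, 0.5, 1.0, 0.3, 0.2],
    [0.6, 0.4, 0.3, 1.0, 0.1],
    [0.1, 0.2, 0.2, 0.1, 1.0],
]

result = recommend("A", names, scores, sim)
print(result)  # [('C', 0.7, 0.95), ('D', 0.6, 0.8)]
```

Game E has the highest score but is never recommended for A because it does not survive the similarity filter. This is the reason for filtering first and ranking second.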



In [23]:
recommend_games(game = "Hogwarts Legacy")

       name                           similarity  game_score
1419   LEGO® Harry Potter: Years 5-7  0.459989    0.890344
52738  Map Of Materials               0.447996    0.883701
41856  LEGO® Harry Potter: Years 1-4  0.513148    0.865144
71142  Magical Girl Konoha            0.447116    0.842292
4416   Hazordhu                       0.479801    0.838831


In [15]:
recommend_games(game = "The Elder Scrolls V: Skyrim Special Edition")

       name                                               similarity  game_score
16413  The Elder Scrolls IV: Oblivion® Game of the Ye...  0.507183    0.955539
51296  The Elder Scrolls IV: Oblivion® Game of the Ye...  0.499036    0.955496
5785   The Elder Scrolls III: Morrowind® Game of the ...  0.439858    0.953199
9109   The Elder Scrolls V: Skyrim                        0.63859     0.948258
3765   Sid Meier's Pirates!                               0.39045     0.941963


In [25]:
def rec_rating():
    # Recommendations for five well-known games, to be judged by hand below
    games = [
        "The Elder Scrolls V: Skyrim Special Edition",
        "DOOM Eternal",
        "Hollow Knight",
        "Hades",
        "ELDEN RING",
    ]
    return tuple(recommend_games(game=g) for g in games)


In [27]:
rec_rating()

(                                                    name  similarity  \
 16413  The Elder Scrolls IV: Oblivion® Game of the Ye...    0.507183   
 51296  The Elder Scrolls IV: Oblivion® Game of the Ye...    0.499036   
 5785   The Elder Scrolls III: Morrowind® Game of the ...    0.439858   
 9109                         The Elder Scrolls V: Skyrim    0.638590   
 3765                                Sid Meier's Pirates!    0.390450   
 
        game_score  
 16413    0.955539  
 51296    0.955496  
 5785     0.953199  
 9109     0.948258  
 3765     0.941963  ,
                 name  similarity  game_score
 3401   Ultimate Doom    0.530816    0.964302
 33218  Devil Daggers    0.464657    0.955320
 7623            DOOM    0.534766    0.952628
 23005        DOOM II    0.470500    0.948779
 53559    HYPER DEMON    0.466565    0.941629,
                                 name  similarity  game_score
 13307  Shovel Knight: Treasure Trove    0.468102    0.962854
 68244                      Void

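Judging each recommendation by hand as good or not, these are the scores for the five games, in the order they were queried:
1) The Elder Scrolls V: Skyrim Special Edition: 4/5
2) DOOM Eternal: 5/5
3) Hollow Knight: 4/5
4) Hades: 5/5
5) ELDEN RING: 4/5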

Total = 22/25
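For comparison across notebooks, the tally can be reproduced in a few lines. This is only a convenience sketch: the dictionary `manual_scores` simply restates the hand-assigned scores above, in the order the games are queried in `rec_rating`.

```python
# Hand-assigned scores (good recommendations out of 5) per queried game
manual_scores = {
    "The Elder Scrolls V: Skyrim Special Edition": 4,
    "DOOM Eternal": 5,
    "Hollow Knight": 4,
    "Hades": 5,
    "ELDEN RING": 4,
}

total = sum(manual_scores.values())
possible = 5 * len(manual_scores)
print(f"{total}/{possible} = {total / possible:.0%}")  # 22/25 = 88%
```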

Almost all of the recommendations are good. A clear miss is Sid Meier's Pirates!, recommended for Skyrim, although to be fair it has the lowest similarity score of that set (0.39). The results are comparable to those of ModernBertEmbed.