# Game Recommendation Engine:
## Project Overview:

 - This end-to-end Game Recommendation System provides intelligent and personalized video game suggestions using a hybrid approach that blends content-based filtering with smart search functionality powered by NLP (Fuzzy Matching).
 - It utilizes enriched metadata from a comprehensive video game dataset of more than 10,000 games, filtered across major platforms like PS5, PS4, Xbox One, Xbox Series X/S and PC.
 - The system extracts relevant features like genres, tags, developers, ESRB ratings, and store availability, enabling accurate similarity-based recommendations using TF-IDF vectorization and cosine similarity.
 - A responsive and visually appealing user interface is built using Streamlit, allowing users to:
   - Search games using approximate names or partial inputs
   - View recommendations with cover images, ratings, and purchase links

This project demonstrates proficiency in data preprocessing, NLP techniques, API integration, and end-to-end deployment, making it an ideal showcase for real-world Data Science and Machine Learning applications in the gaming domain.

## 🎮 This notebook will build the Game Recommendation Engine by:
 - 🧹 Textual Data Cleaning & Preprocessing:
   - Strip unnecessary characters, convert text to lowercase, and remove noise for consistent formatting.
   - Construct a custom textual feature called “soup”, which combines metadata like developers, genres, tags, publishers, the cleaned description, and the rating label into a single field.
   - Apply lemmatization to standardize word forms and improve the accuracy of textual similarity.
 - 🔍 Smart Search with Fuzzy Matching:
   - Use process.extractOne() from a fuzzy-matching library (the code in this notebook imports it from RapidFuzz rather than fuzzywuzzy) to support approximate string matching for user queries.
   - This lets users find games using partial titles, typos, or inconsistent casing for a smoother search experience.
   - The best-matching game title is then passed into the recommendation engine for similarity scoring.
 - 📊 TF-IDF Vectorization for Content-Based Filtering:
   - Convert the “soup” into TF-IDF vectors, which weight each term by how often it appears in a game's text and how rare it is across the whole catalogue.
   - Calculate cosine similarity between games based on these vectors to identify the most relevant recommendations.
 - 🎯 Recommendation Output:
   - For each game search input, return the Top 10 most similar games, complete with metadata like cover image, rating, platform, and purchase links (if available).

In [22]:
# import required library:
import pandas as pd

# Load data:
games = pd.read_csv("../Data/Games.csv")

# Preview the data:
games.head(5)

Unnamed: 0,id,title,rating,released,background_image_url,website,ratings,store,developers,genres,tags,publishers,esrb_rating,description
0,3498,Grand Theft Auto V,4.47,2013-09-17,https://media.rawg.io/media/games/20a/20aa03a1...,http://www.rockstargames.com/V/,"['exceptional', 'recommended', 'meh', 'skip']","[('Steam', 'store.steampowered.com'), ('PlaySt...","['Rockstar North', 'Rockstar Games']",['Action'],"['Singleplayer', 'Steam Achievements', 'Multip...",['Rockstar Games'],Mature,"Rockstar Games went bigger, since their previo..."
1,3328,The Witcher 3: Wild Hunt,4.64,2015-05-18,https://media.rawg.io/media/games/618/618c2031...,https://thewitcher.com/en/witcher3,"['exceptional', 'recommended', 'meh', 'skip']","[('GOG', 'gog.com'), ('PlayStation Store', 'st...",['CD PROJEKT RED'],"['Action', 'RPG']","['Singleplayer', 'Full controller support', 'A...",['CD PROJEKT RED'],Mature,"The third game in a series, it holds nothing b..."
2,4200,Portal 2,4.59,2011-04-18,https://media.rawg.io/media/games/2ba/2bac0e87...,http://www.thinkwithportals.com/,"['exceptional', 'recommended', 'meh', 'skip']","[('Xbox Store', 'microsoft.com'), ('Steam', 's...",['Valve Software'],"['Shooter', 'Puzzle']","['Singleplayer', 'Steam Achievements', 'Multip...","['Electronic Arts', 'Valve']",Everyone 10+,Portal 2 is a first-person puzzle game develop...
3,4291,Counter-Strike: Global Offensive,3.57,2012-08-21,https://media.rawg.io/media/games/736/73619bd3...,http://blog.counter-strike.net/,"['recommended', 'meh', 'exceptional', 'skip']","[('PlayStation Store', 'store.playstation.com'...","['Valve Software', 'Hidden Path Entertainment']",['Shooter'],"['Steam Achievements', 'Multiplayer', 'Full co...",['Valve'],Mature,Counter-Strike is a multiplayer phenomenon in ...
4,5286,Tomb Raider (2013),4.06,2013-03-05,https://media.rawg.io/media/games/021/021c4e21...,http://www.tombraider.com,"['recommended', 'exceptional', 'meh', 'skip']","[('Xbox 360 Store', 'marketplace.xbox.com'), (...",['Crystal Dynamics'],['Action'],"['Singleplayer', 'Multiplayer', 'Full controll...",['Square Enix'],Mature,A cinematic revival of the series in its actio...


In [23]:
# Data Information:
games.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10919 entries, 0 to 10918
Data columns (total 14 columns):
 #   Column                Non-Null Count  Dtype  
---  ------                --------------  -----  
 0   id                    10919 non-null  int64  
 1   title                 10919 non-null  object 
 2   rating                10919 non-null  float64
 3   released              10919 non-null  object 
 4   background_image_url  10919 non-null  object 
 5   website               10919 non-null  object 
 6   ratings               10919 non-null  object 
 7   store                 10919 non-null  object 
 8   developers            10919 non-null  object 
 9   genres                10919 non-null  object 
 10  tags                  10919 non-null  object 
 11  publishers            10919 non-null  object 
 12  esrb_rating           10919 non-null  object 
 13  description           10919 non-null  object 
dtypes: float64(1), int64(1), object(12)
memory usage: 1.2+ MB


## Cleaning Textual Data:

In [24]:
# Starting with cleaning Title data:

import re
import unicodedata

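def clean_title(title):
    # Fold accented characters to ASCII (non-Latin characters are dropped entirely)
    title = unicodedata.normalize('NFKD', title).encode('ascii', 'ignore').decode('utf-8', 'ignore')
    title = title.lower()
    # Keep only letters, digits and whitespace
    title = re.sub(r'[^a-z0-9\s]','',title)
    # Collapse runs of whitespace, then trim (trimming last avoids stray edge spaces)
    title = re.sub(r'\s+',' ',title).strip()
    return title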

# Apply the function to a newly created column:
games['title_clean'] = games['title'].apply(clean_title)

# Preview:
games.head(5)

Unnamed: 0,id,title,rating,released,background_image_url,website,ratings,store,developers,genres,tags,publishers,esrb_rating,description,title_clean
0,3498,Grand Theft Auto V,4.47,2013-09-17,https://media.rawg.io/media/games/20a/20aa03a1...,http://www.rockstargames.com/V/,"['exceptional', 'recommended', 'meh', 'skip']","[('Steam', 'store.steampowered.com'), ('PlaySt...","['Rockstar North', 'Rockstar Games']",['Action'],"['Singleplayer', 'Steam Achievements', 'Multip...",['Rockstar Games'],Mature,"Rockstar Games went bigger, since their previo...",grand theft auto v
1,3328,The Witcher 3: Wild Hunt,4.64,2015-05-18,https://media.rawg.io/media/games/618/618c2031...,https://thewitcher.com/en/witcher3,"['exceptional', 'recommended', 'meh', 'skip']","[('GOG', 'gog.com'), ('PlayStation Store', 'st...",['CD PROJEKT RED'],"['Action', 'RPG']","['Singleplayer', 'Full controller support', 'A...",['CD PROJEKT RED'],Mature,"The third game in a series, it holds nothing b...",the witcher 3 wild hunt
2,4200,Portal 2,4.59,2011-04-18,https://media.rawg.io/media/games/2ba/2bac0e87...,http://www.thinkwithportals.com/,"['exceptional', 'recommended', 'meh', 'skip']","[('Xbox Store', 'microsoft.com'), ('Steam', 's...",['Valve Software'],"['Shooter', 'Puzzle']","['Singleplayer', 'Steam Achievements', 'Multip...","['Electronic Arts', 'Valve']",Everyone 10+,Portal 2 is a first-person puzzle game develop...,portal 2
3,4291,Counter-Strike: Global Offensive,3.57,2012-08-21,https://media.rawg.io/media/games/736/73619bd3...,http://blog.counter-strike.net/,"['recommended', 'meh', 'exceptional', 'skip']","[('PlayStation Store', 'store.playstation.com'...","['Valve Software', 'Hidden Path Entertainment']",['Shooter'],"['Steam Achievements', 'Multiplayer', 'Full co...",['Valve'],Mature,Counter-Strike is a multiplayer phenomenon in ...,counterstrike global offensive
4,5286,Tomb Raider (2013),4.06,2013-03-05,https://media.rawg.io/media/games/021/021c4e21...,http://www.tombraider.com,"['recommended', 'exceptional', 'meh', 'skip']","[('Xbox 360 Store', 'marketplace.xbox.com'), (...",['Crystal Dynamics'],['Action'],"['Singleplayer', 'Multiplayer', 'Full controll...",['Square Enix'],Mature,A cinematic revival of the series in its actio...,tomb raider 2013
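
The effect of the title cleaning is easiest to see on a few edge cases. The sketch below mirrors the cleaning steps of `clean_title` on hand-picked sample titles (the samples are illustrative, not taken from the dataset apart from the first and last). It shows that accented letters are folded to ASCII, while titles written entirely in a non-Latin script collapse to an empty string.

```python
import re
import unicodedata

def clean_title(title):
    # ASCII-fold, lowercase, drop punctuation, squeeze and trim whitespace
    title = unicodedata.normalize('NFKD', title).encode('ascii', 'ignore').decode('utf-8', 'ignore')
    title = title.lower()
    title = re.sub(r'[^a-z0-9\s]', '', title)
    title = re.sub(r'\s+', ' ', title).strip()
    return title

samples = [
    "Counter-Strike: Global Offensive",
    "Pokémon Sword",          # accented letter is folded to plain 'e'
    "  Tomb Raider (2013) ",  # punctuation and edge spaces are removed
    "初恋日记",                 # no ASCII characters survive
]
cleaned = {s: clean_title(s) for s in samples}
for original, result in cleaned.items():
    print(repr(original), "->", repr(result))
```

Rows whose cleaned title is empty cannot be matched by a text search, which is why such rows need separate handling later in the notebook.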


In [27]:
import ast

# Handle 'not rated' for ratings and 'not available' for others
games['ratings'] = games['ratings'].replace('not rated', "['not rated']")
games['ratings'] = games['ratings'].apply(lambda x: ast.literal_eval(x) if isinstance(x, str) else x)

# For other columns with 'not available'
columns_to_clean = ['developers', 'genres', 'tags', 'publishers', 'store']
for col in columns_to_clean:
    games[col] = games[col].replace('not available', "['not available']")
    games[col] = games[col].apply(lambda x: ast.literal_eval(x) if isinstance(x, str) else x)
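
`ast.literal_eval` raises an exception on any cell that is not a valid Python literal, so a single malformed value can abort the whole `apply`. A more defensive variant might look like the sketch below. The helper name `parse_list_cell` and its fallback behaviour are assumptions for illustration, not part of the notebook.

```python
import ast

def parse_list_cell(value, placeholder="not available"):
    """Turn a stringified Python list into a real list (hypothetical helper)."""
    if not isinstance(value, str):
        return value                      # already parsed (or NaN): leave untouched
    if value.strip() == placeholder:
        return [placeholder]              # bare placeholder becomes a one-item list
    try:
        parsed = ast.literal_eval(value)
    except (ValueError, SyntaxError):
        return [value]                    # malformed cell: keep the raw text as one item
    return list(parsed) if isinstance(parsed, (list, tuple)) else [parsed]

print(parse_list_cell("['Action', 'RPG']"))
print(parse_list_cell("not available"))
print(parse_list_cell("[('Steam', 'store.steampowered.com')]"))
```

It could be applied column by column in the same way as the lambda above, for example `games[col].apply(parse_list_cell)`.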

In [None]:
# Converting columns from list to normal strings:
games['ratings'] = games['ratings'].apply(lambda x: ", ".join(x))
games['developers'] = games['developers'].apply(lambda x: ", ".join(x))
games['genres'] = games['genres'].apply(lambda x: ", ".join(x))
games['tags'] = games['tags'].apply(lambda x: ", ".join(x))
games['publishers'] = games['publishers'].apply(lambda x: ", ".join(x))

In [37]:
# Creating new columns for store name and store domain:
games['store_name'] = games['store'].apply(lambda x: ", ".join([store[0] for store in x]))
games['store_domain'] = games['store'].apply(lambda x: ", ".join([store[1] for store in x]))

In [46]:
# Preview the data:
games.head()

Unnamed: 0,id,title,rating,released,background_image_url,website,ratings,store,developers,genres,tags,publishers,esrb_rating,description,title_clean,store_name,store_domain
0,3498,Grand Theft Auto V,4.47,2013-09-17,https://media.rawg.io/media/games/20a/20aa03a1...,http://www.rockstargames.com/V/,"exceptional, recommended, meh, skip","[(Steam, store.steampowered.com), (PlayStation...","Rockstar North, Rockstar Games",Action,"Singleplayer, Steam Achievements, Multiplayer,...",Rockstar Games,Mature,"Rockstar Games went bigger, since their previo...",grand theft auto v,"Steam, PlayStation Store, Epic Games, Xbox 360...","store.steampowered.com, store.playstation.com,..."
1,3328,The Witcher 3: Wild Hunt,4.64,2015-05-18,https://media.rawg.io/media/games/618/618c2031...,https://thewitcher.com/en/witcher3,"exceptional, recommended, meh, skip","[(GOG, gog.com), (PlayStation Store, store.pla...",CD PROJEKT RED,"Action, RPG","Singleplayer, Full controller support, Atmosph...",CD PROJEKT RED,Mature,"The third game in a series, it holds nothing b...",the witcher 3 wild hunt,"GOG, PlayStation Store, Steam, Xbox Store, Nin...","gog.com, store.playstation.com, store.steampow..."
2,4200,Portal 2,4.59,2011-04-18,https://media.rawg.io/media/games/2ba/2bac0e87...,http://www.thinkwithportals.com/,"exceptional, recommended, meh, skip","[(Xbox Store, microsoft.com), (Steam, store.st...",Valve Software,"Shooter, Puzzle","Singleplayer, Steam Achievements, Multiplayer,...","Electronic Arts, Valve",Everyone 10+,Portal 2 is a first-person puzzle game develop...,portal 2,"Xbox Store, Steam, PlayStation Store, Xbox 360...","microsoft.com, store.steampowered.com, store.p..."
3,4291,Counter-Strike: Global Offensive,3.57,2012-08-21,https://media.rawg.io/media/games/736/73619bd3...,http://blog.counter-strike.net/,"recommended, meh, exceptional, skip","[(PlayStation Store, store.playstation.com), (...","Valve Software, Hidden Path Entertainment",Shooter,"Steam Achievements, Multiplayer, Full controll...",Valve,Mature,Counter-Strike is a multiplayer phenomenon in ...,counterstrike global offensive,"PlayStation Store, Steam, Xbox 360 Store","store.playstation.com, store.steampowered.com,..."
4,5286,Tomb Raider (2013),4.06,2013-03-05,https://media.rawg.io/media/games/021/021c4e21...,http://www.tombraider.com,"recommended, exceptional, meh, skip","[(Xbox 360 Store, marketplace.xbox.com), (Stea...",Crystal Dynamics,Action,"Singleplayer, Multiplayer, Full controller sup...",Square Enix,Mature,A cinematic revival of the series in its actio...,tomb raider 2013,"Xbox 360 Store, Steam, PlayStation Store, Goog...","marketplace.xbox.com, store.steampowered.com, ..."


In [49]:
# Adding a rating_label based on the numeric rating, falling back to the ratings string:
def map_rating(row):
    num = row['rating']
    cat = row['ratings']

    # No numeric rating: fall back to the categorical string, if any
    if pd.isna(num):
        return cat if pd.notna(cat) else 'not available'
    
    # Map based on contiguous numeric bands, so no value falls between thresholds:
    if num < 1.60:
        return 'skip'
    elif num < 2.60:
        return 'meh'
    elif num < 4.60:
        return 'recommended'
    else:
        return 'exceptional'
    
# Apply the function:
games['rating_label'] = games.apply(map_rating, axis=1)

# Preview:
games.head()

Unnamed: 0,id,title,rating,released,background_image_url,website,ratings,store,developers,genres,tags,publishers,esrb_rating,description,title_clean,store_name,store_domain,rating_label
0,3498,Grand Theft Auto V,4.47,2013-09-17,https://media.rawg.io/media/games/20a/20aa03a1...,http://www.rockstargames.com/V/,"exceptional, recommended, meh, skip","[(Steam, store.steampowered.com), (PlayStation...","Rockstar North, Rockstar Games",Action,"Singleplayer, Steam Achievements, Multiplayer,...",Rockstar Games,Mature,"Rockstar Games went bigger, since their previo...",grand theft auto v,"Steam, PlayStation Store, Epic Games, Xbox 360...","store.steampowered.com, store.playstation.com,...",recommended
1,3328,The Witcher 3: Wild Hunt,4.64,2015-05-18,https://media.rawg.io/media/games/618/618c2031...,https://thewitcher.com/en/witcher3,"exceptional, recommended, meh, skip","[(GOG, gog.com), (PlayStation Store, store.pla...",CD PROJEKT RED,"Action, RPG","Singleplayer, Full controller support, Atmosph...",CD PROJEKT RED,Mature,"The third game in a series, it holds nothing b...",the witcher 3 wild hunt,"GOG, PlayStation Store, Steam, Xbox Store, Nin...","gog.com, store.playstation.com, store.steampow...",exceptional
2,4200,Portal 2,4.59,2011-04-18,https://media.rawg.io/media/games/2ba/2bac0e87...,http://www.thinkwithportals.com/,"exceptional, recommended, meh, skip","[(Xbox Store, microsoft.com), (Steam, store.st...",Valve Software,"Shooter, Puzzle","Singleplayer, Steam Achievements, Multiplayer,...","Electronic Arts, Valve",Everyone 10+,Portal 2 is a first-person puzzle game develop...,portal 2,"Xbox Store, Steam, PlayStation Store, Xbox 360...","microsoft.com, store.steampowered.com, store.p...",recommended
3,4291,Counter-Strike: Global Offensive,3.57,2012-08-21,https://media.rawg.io/media/games/736/73619bd3...,http://blog.counter-strike.net/,"recommended, meh, exceptional, skip","[(PlayStation Store, store.playstation.com), (...","Valve Software, Hidden Path Entertainment",Shooter,"Steam Achievements, Multiplayer, Full controll...",Valve,Mature,Counter-Strike is a multiplayer phenomenon in ...,counterstrike global offensive,"PlayStation Store, Steam, Xbox 360 Store","store.playstation.com, store.steampowered.com,...",recommended
4,5286,Tomb Raider (2013),4.06,2013-03-05,https://media.rawg.io/media/games/021/021c4e21...,http://www.tombraider.com,"recommended, exceptional, meh, skip","[(Xbox 360 Store, marketplace.xbox.com), (Stea...",Crystal Dynamics,Action,"Singleplayer, Multiplayer, Full controller sup...",Square Enix,Mature,A cinematic revival of the series in its actio...,tomb raider 2013,"Xbox 360 Store, Steam, PlayStation Store, Goog...","marketplace.xbox.com, store.steampowered.com, ...",recommended
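
The rating bands can also be written as a sorted list of cut points, which keeps the thresholds in one place. This is a minimal sketch, assuming each band is closed on the left (so a rating of exactly 4.60 counts as 'exceptional'). The names `CUT_POINTS`, `LABELS` and `rating_to_label` are hypothetical.

```python
from bisect import bisect_right

# Lower bounds of 'meh', 'recommended' and 'exceptional' (assumed half-open bands)
CUT_POINTS = [1.60, 2.60, 4.60]
LABELS = ["skip", "meh", "recommended", "exceptional"]

def rating_to_label(rating):
    # bisect_right counts how many cut points are <= rating,
    # which is exactly the index of the band the rating falls into
    return LABELS[bisect_right(CUT_POINTS, rating)]

for value in (1.59, 1.60, 4.47, 4.59, 4.60, 4.64):
    print(value, "->", rating_to_label(value))
```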


In [63]:
# Cleaning non-English parts:
import re

def remove_non_english(text):
    # Pattern to match a common non-English section header and everything after it
    pattern = r'(?:^|\s)(?:Español|Français|Deutsch|Português|Русский|日本語|中文|한국어|Italiano|Polski|Türkçe|العربية|हिन्दी|ไทย|繁體中文)[\s\S]*$'
    
    # Remove the non-English part
    return re.sub(pattern, '', text, flags=re.IGNORECASE).strip()

# Apply function:
games['description_clean'] = games['description'].apply(remove_non_english)
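
A quick check on small strings shows what this truncation does and where it can over-trim. The sketch below uses an equivalent form of the pattern with the same header list. The sample strings are made up for illustration, except the second, which imitates a description that opens with a Japanese header.

```python
import re

HEADERS = (r'Español|Français|Deutsch|Português|Русский|日本語|中文|한국어|'
           r'Italiano|Polski|Türkçe|العربية|हिन्दी|ไทย|繁體中文')
PATTERN = re.compile(r'(?:^|\s)(?:' + HEADERS + r')[\s\S]*$', flags=re.IGNORECASE)

def remove_non_english(text):
    # Cut the text at the first language header (at the start or after whitespace)
    return PATTERN.sub('', text).strip()

# Everything from the header onwards is dropped
print(repr(remove_non_english("A great game.\nEspañol\nUn gran juego.")))

# A description that starts with a header is emptied completely
print(repr(remove_non_english("日本語バージョン\nFrom the creators of the series")))

# Caveat: a language name used inside an English sentence also triggers the cut
print(repr(remove_non_english("Learn Deutsch in ten days")))
```

The second case explains how a row can end up with an empty `description_clean` even though its original description contains English text.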

In [65]:
# Create a soup of the textual data:
games['soup'] = games['developers'] + " " + games['genres'] + " " + games['tags'] + " " + games['publishers'] + " " + games['description_clean'] + " " + games['rating_label']
print(f"Generated Soup\n{games['soup'][0]}")

Generated Soup
Rockstar North, Rockstar Games Action Singleplayer, Steam Achievements, Multiplayer, Full controller support, Atmospheric, Great Soundtrack, RPG, Co-op, Open World, cooperative, First-Person, Third Person, Funny, Sandbox, Comedy, Third-Person Shooter, Moddable, Crime, vr mod Rockstar Games Rockstar Games went bigger, since their previous installment of the series. You get the complicated and realistic world-building from Liberty City of GTA4 in the setting of lively and diverse Los Santos, from an old fan favorite GTA San Andreas. 561 different vehicles (including every transport you can operate) and the amount is rising with every update. 
Simultaneous storytelling from three unique perspectives: 
Follow Michael, ex-criminal living his life of leisure away from the past, Franklin, a kid that seeks the better future, and Trevor, the exact past Michael is trying to run away from. 
GTA Online will provide a lot of additional challenge even for the experienced players, comi

### Clean and Lemmatize soup:

In [None]:
# Import required libraries:

import re
import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
nltk.download('stopwords')
nltk.download('punkt')
nltk.download('wordnet')
nltk.download('omw-1.4')

[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\USER\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\USER\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package wordnet to
[nltk_data]     C:\Users\USER\AppData\Roaming\nltk_data...
[nltk_data]   Package wordnet is already up-to-date!
[nltk_data] Downloading package omw-1.4 to
[nltk_data]     C:\Users\USER\AppData\Roaming\nltk_data...
[nltk_data]   Package omw-1.4 is already up-to-date!


True

In [66]:
# Initialize stop words and lemmatizer:
stop_words = set(stopwords.words('english'))
lemmatizer = WordNetLemmatizer()

# Define the function to clean the soup:
def clean_text(text):
    # Lowercase, replace non-word characters with spaces, then tokenize
    tokens = nltk.word_tokenize(re.sub(r'\W',' ', text.lower()))
    # Drop stop words and lemmatize the remaining tokens
    filtered = [lemmatizer.lemmatize(word) for word in tokens if word not in stop_words]
    return ' '.join(filtered)

# Apply function to soup:
games['cleaned_soup'] = games['soup'].apply(clean_text)

In [73]:
# Preview the first row of cleaned soup:
games['cleaned_soup'][0]

'rockstar north rockstar game action singleplayer steam achievement multiplayer full controller support atmospheric great soundtrack rpg co op open world cooperative first person third person funny sandbox comedy third person shooter moddable crime vr mod rockstar game rockstar game went bigger since previous installment series get complicated realistic world building liberty city gta4 setting lively diverse los santos old fan favorite gta san andreas 561 different vehicle including every transport operate amount rising every update simultaneous storytelling three unique perspective follow michael ex criminal living life leisure away past franklin kid seek better future trevor exact past michael trying run away gta online provide lot additional challenge even experienced player coming fresh story mode player around help likely ruin mission every gta mechanic date experienced player unique customizable character community content paired leveling system tends keep everyone busy engaged r

## Preprocessing Data:

In [74]:
# Apply TF-IDF Vectorization on clean soup:
from sklearn.feature_extraction.text import TfidfVectorizer      # Import package

# Initialize Vectorizer:
tfidf = TfidfVectorizer(stop_words='english')
tfidf_matrix = tfidf.fit_transform(games['cleaned_soup'])

# Print output:
print(f"Successfully vectorized soup:\n{tfidf_matrix.shape}")

Successfully vectorized soup:
(10919, 53048)
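
The shape above reads as 10,919 games by 53,048 vocabulary terms. To make the weighting concrete, here is a from-scratch sketch on a toy corpus. It follows the smoothed IDF, `ln((1 + n) / (1 + df)) + 1`, and the L2 row normalisation that scikit-learn documents as the `TfidfVectorizer` defaults, but it tokenises on whitespace only, so it is an illustration rather than a replacement for the vectorizer. The function name and the toy documents are assumptions.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return (vocabulary, L2-normalised TF-IDF rows) for whitespace-tokenised docs."""
    tokenised = [doc.split() for doc in docs]
    vocab = sorted({tok for toks in tokenised for tok in toks})
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears
    df = Counter(tok for toks in tokenised for tok in set(toks))
    # Smoothed inverse document frequency
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1 for tok in vocab}
    rows = []
    for toks in tokenised:
        counts = Counter(toks)
        row = [counts[tok] * idf[tok] for tok in vocab]
        norm = math.sqrt(sum(v * v for v in row)) or 1.0
        rows.append([v / norm for v in row])   # unit-length row
    return vocab, rows

toy_soups = [
    "action open world crime",
    "action rpg open world fantasy",
    "puzzle shooter",
]
vocab, rows = tfidf_vectors(toy_soups)
print(vocab)
print([round(v, 3) for v in rows[0]])
```

In the first toy document, "crime" appears in only one document and therefore receives a higher weight than "action", which appears in two.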


# Computing Cosine Similarity using TF-IDF Vectorized Data:
---

In [75]:
# Import required library:
from sklearn.metrics.pairwise import cosine_similarity

# Compute the data:
cosine_sim = cosine_similarity(tfidf_matrix,tfidf_matrix)

# Print output:
print(f"Successfully computed data:\n{cosine_sim.shape} ")

Successfully computed data:
(10919, 10919) 
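
Each entry `cosine_sim[i][j]` is the cosine of the angle between the TF-IDF vectors of games `i` and `j`, so recommending means sorting one row and skipping the game itself. The sketch below shows that lookup in plain Python on three made-up vectors. The helper names `cosine` and `top_n_similar` are hypothetical.

```python
import math

def cosine(u, v):
    # dot(u, v) / (|u| * |v|), with 0.0 for an all-zero vector
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def top_n_similar(vectors, idx, n=2):
    # Score every other item against item `idx`, highest similarity first
    scores = [(j, cosine(vectors[idx], vec)) for j, vec in enumerate(vectors) if j != idx]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:n]

vectors = [
    [1.0, 1.0, 0.0],   # game 0
    [1.0, 0.9, 0.1],   # game 1: close to game 0
    [0.0, 0.1, 1.0],   # game 2: mostly different terms
]
print(top_n_similar(vectors, 0, n=2))
```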


In [76]:
# Save the cosine sim using NumPy:
import numpy as np
np.save("cosine_sim.npy", cosine_sim)
print("Saved!")

Saved!
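
A dense 10,919 × 10,919 matrix is large on disk and in memory. The arithmetic below estimates its size, assuming 8-byte `float64` entries (the usual result for float64 TF-IDF input) and ignoring the small `.npy` header.

```python
def dense_matrix_bytes(n_rows, n_cols, itemsize=8):
    # Raw payload of a dense array: one item per cell
    return n_rows * n_cols * itemsize

n_games = 10919
float64_bytes = dense_matrix_bytes(n_games, n_games, itemsize=8)
float32_bytes = dense_matrix_bytes(n_games, n_games, itemsize=4)

print(f"float64: {float64_bytes / 1e9:.2f} GB")
print(f"float32: {float32_bytes / 1e9:.2f} GB")
```

If the file size becomes a problem for deployment, one option is to downcast before saving, for example `np.save("cosine_sim.npy", cosine_sim.astype(np.float32))`, at the cost of some precision in the scores.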


In [78]:
# Converting released column to datetime and formatting for display purposes:
games['released'] = pd.to_datetime(games['released'], errors='coerce')

# Formatting:
games['release_date'] = games['released'].dt.strftime('%d %b %Y')

# Preview:
games.head()

Unnamed: 0,id,title,rating,released,background_image_url,website,ratings,store,developers,genres,...,esrb_rating,description,title_clean,store_name,store_domain,rating_label,soup,cleaned_soup,description_clean,release_date
0,3498,Grand Theft Auto V,4.47,2013-09-17,https://media.rawg.io/media/games/20a/20aa03a1...,http://www.rockstargames.com/V/,"exceptional, recommended, meh, skip","[(Steam, store.steampowered.com), (PlayStation...","Rockstar North, Rockstar Games",Action,...,Mature,"Rockstar Games went bigger, since their previo...",grand theft auto v,"Steam, PlayStation Store, Epic Games, Xbox 360...","store.steampowered.com, store.playstation.com,...",recommended,"Rockstar North, Rockstar Games Action Singlepl...",rockstar north rockstar game action singleplay...,"Rockstar Games went bigger, since their previo...",17 Sep 2013
1,3328,The Witcher 3: Wild Hunt,4.64,2015-05-18,https://media.rawg.io/media/games/618/618c2031...,https://thewitcher.com/en/witcher3,"exceptional, recommended, meh, skip","[(GOG, gog.com), (PlayStation Store, store.pla...",CD PROJEKT RED,"Action, RPG",...,Mature,"The third game in a series, it holds nothing b...",the witcher 3 wild hunt,"GOG, PlayStation Store, Steam, Xbox Store, Nin...","gog.com, store.playstation.com, store.steampow...",exceptional,"CD PROJEKT RED Action, RPG Singleplayer, Full ...",cd projekt red action rpg singleplayer full co...,"The third game in a series, it holds nothing b...",18 May 2015
2,4200,Portal 2,4.59,2011-04-18,https://media.rawg.io/media/games/2ba/2bac0e87...,http://www.thinkwithportals.com/,"exceptional, recommended, meh, skip","[(Xbox Store, microsoft.com), (Steam, store.st...",Valve Software,"Shooter, Puzzle",...,Everyone 10+,Portal 2 is a first-person puzzle game develop...,portal 2,"Xbox Store, Steam, PlayStation Store, Xbox 360...","microsoft.com, store.steampowered.com, store.p...",recommended,"Valve Software Shooter, Puzzle Singleplayer, S...",valve software shooter puzzle singleplayer ste...,Portal 2 is a first-person puzzle game develop...,18 Apr 2011
3,4291,Counter-Strike: Global Offensive,3.57,2012-08-21,https://media.rawg.io/media/games/736/73619bd3...,http://blog.counter-strike.net/,"recommended, meh, exceptional, skip","[(PlayStation Store, store.playstation.com), (...","Valve Software, Hidden Path Entertainment",Shooter,...,Mature,Counter-Strike is a multiplayer phenomenon in ...,counterstrike global offensive,"PlayStation Store, Steam, Xbox 360 Store","store.playstation.com, store.steampowered.com,...",recommended,"Valve Software, Hidden Path Entertainment Shoo...",valve software hidden path entertainment shoot...,Counter-Strike is a multiplayer phenomenon in ...,21 Aug 2012
4,5286,Tomb Raider (2013),4.06,2013-03-05,https://media.rawg.io/media/games/021/021c4e21...,http://www.tombraider.com,"recommended, exceptional, meh, skip","[(Xbox 360 Store, marketplace.xbox.com), (Stea...",Crystal Dynamics,Action,...,Mature,A cinematic revival of the series in its actio...,tomb raider 2013,"Xbox 360 Store, Steam, PlayStation Store, Goog...","marketplace.xbox.com, store.steampowered.com, ...",recommended,"Crystal Dynamics Action Singleplayer, Multipla...",crystal dynamic action singleplayer multiplaye...,A cinematic revival of the series in its actio...,05 Mar 2013


In [81]:
# Checking info:
games.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10919 entries, 0 to 10918
Data columns (total 22 columns):
 #   Column                Non-Null Count  Dtype         
---  ------                --------------  -----         
 0   id                    10919 non-null  int64         
 1   title                 10919 non-null  object        
 2   rating                10919 non-null  float64       
 3   released              10793 non-null  datetime64[ns]
 4   background_image_url  10919 non-null  object        
 5   website               10919 non-null  object        
 6   ratings               10919 non-null  object        
 7   store                 10919 non-null  object        
 8   developers            10919 non-null  object        
 9   genres                10919 non-null  object        
 10  tags                  10919 non-null  object        
 11  publishers            10919 non-null  object        
 12  esrb_rating           10919 non-null  object        
 13  description     

In [None]:
# Checking for empty string data:
games[games['title_clean'] == ""]

Unnamed: 0,id,title,rating,released,background_image_url,website,ratings,store,developers,genres,...,esrb_rating,description,title_clean,store_name,store_domain,rating_label,soup,cleaned_soup,description_clean,release_date
7391,349870,初恋日记,1.6,2019-09-20,https://media.rawg.io/media/screenshots/603/60...,not available,"skip, meh","[(Steam, store.steampowered.com)]",YEARS,"RPG, Indie",...,not rated,School Years is a series of visual novel adapt...,,Steam,store.steampowered.com,meh,"YEARS RPG, Indie Singleplayer, RPG, Partial Co...",year rpg indie singleplayer rpg partial contro...,School Years is a series of visual novel adapt...,20 Sep 2019
10532,549619,封灵档案,2.5,2021-01-29,https://media.rawg.io/media/screenshots/f70/f7...,not available,"meh, skip, exceptional, recommended","[(Steam, store.steampowered.com)]",（Hong Kong）GKD Game Studio,not available,...,not rated,about this game封灵-1V5（Notes Of Soul 2.0) is a ...,,Steam,store.steampowered.com,meh,（Hong Kong）GKD Game Studio not available Multi...,hong kong gkd game studio available multiplaye...,about this game封灵-1V5（Notes Of Soul 2.0) is a ...,29 Jan 2021
10617,61197,太吾绘卷,3.92,2018-09-21,https://media.rawg.io/media/screenshots/e1f/e1...,not available,"exceptional, recommended, meh, skip","[(Steam, store.steampowered.com)]",ConchShip Games,"Adventure, RPG, Strategy, Casual, Indie",...,not rated,"OverviewIn the ""Taiwu"" Universe, besides playi...",,Steam,store.steampowered.com,recommended,"ConchShip Games Adventure, RPG, Strategy, Casu...",conchship game adventure rpg strategy casual i...,"OverviewIn the ""Taiwu"" Universe, besides playi...",21 Sep 2018
10723,61944,关于我被小学女生绑架这件事,3.0,2018-05-31,https://media.rawg.io/media/screenshots/17b/17...,not available,"skip, exceptional, meh, recommended","[(Steam, store.steampowered.com)]","ZiX Solutions, 猫薄荷制作组","Adventure, Casual, Indie",...,not rated,梧桐主催，全新力作http://store.steampowered.com/app/871...,,Steam,store.steampowered.com,recommended,"ZiX Solutions, 猫薄荷制作组 Adventure, Casual, Indie...",zix solution 猫薄荷制作组 adventure casual indie sin...,梧桐主催，全新力作http://store.steampowered.com/app/871...,31 May 2018


In [86]:
# Removing rows where title_clean is ""
games = games[games['title_clean'].str.strip().astype(bool)]

In [88]:
# Checking for description:
games[games['description_clean'] == ""]

Unnamed: 0,id,title,rating,released,background_image_url,website,ratings,store,developers,genres,...,esrb_rating,description,title_clean,store_name,store_domain,rating_label,soup,cleaned_soup,description_clean,release_date
7055,14096,Sakura Angels,2.57,2015-01-16,https://media.rawg.io/media/screenshots/daa/da...,not available,"skip, recommended, meh, exceptional","[(Steam, store.steampowered.com)]",Winged Cloud,"Casual, Indie",...,not rated,日本語バージョンダウンロード開始されました。\nFrom the creators of S...,sakura angels,Steam,store.steampowered.com,meh,"Winged Cloud Casual, Indie Singleplayer, steam...",winged cloud casual indie singleplayer steam t...,,16 Jan 2015


In [91]:
# Removing rows where description_clean is ""
games = games[games['description_clean'].str.strip().astype(bool)]

In [94]:
# Filling missing values for release_date:
games['release_date'] = games['release_date'].fillna('Not Available')

In [97]:
# Final check before saving (leaving `released` as it is, since it is not required):
games.isnull().sum()

id                        0
title                     0
rating                    0
released                126
background_image_url      0
website                   0
ratings                   0
store                     0
developers                0
genres                    0
tags                      0
publishers                0
esrb_rating               0
description               0
title_clean               0
store_name                0
store_domain              0
rating_label              0
soup                      0
cleaned_soup              0
description_clean         0
release_date              0
dtype: int64

In [98]:
# Resetting Index:
games.reset_index(drop=True, inplace=True)

In [99]:
# Saving the updated games data:
games.to_csv("games_recommended.csv", index=False)
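
One thing to watch: if the cells run in the order shown, `cosine_sim.npy` was saved with 10,919 rows before the rows with an empty `title_clean` or `description_clean` were dropped and the index was reset, so row `i` of the matrix may no longer describe row `i` of `games_recommended.csv`. A possible remedy is to drop the same positions from the matrix (or to recompute it after filtering). The sketch below shows the idea in plain Python on a 4 × 4 toy matrix. The helper name and the data are made up.

```python
def filter_square_matrix(matrix, keep):
    """Keep only the rows and columns whose positions are listed in `keep`."""
    return [[matrix[i][j] for j in keep] for i in keep]

sim = [
    [1.0, 0.2, 0.7, 0.1],
    [0.2, 1.0, 0.3, 0.4],
    [0.7, 0.3, 1.0, 0.5],
    [0.1, 0.4, 0.5, 1.0],
]
title_clean = ["portal 2", "", "half life", "doom"]   # position 1 would be dropped

# Positions that survive the filtering step
keep = [i for i, title in enumerate(title_clean) if title.strip()]
aligned = filter_square_matrix(sim, keep)

for row in aligned:
    print(row)
```

With NumPy the same selection can be written as `cosine_sim[np.ix_(keep, keep)]`, where `keep` holds the original integer positions of the retained rows.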

# Content Based Recommendation Engine with Smart Search:
---

In [101]:
# Saving games_recommended.csv as pkl file:
import pandas as pd
import pickle

# Load data:
games_df = pd.read_csv("games_recommended.csv")

# Save into PKL:
with open("games_recommended.pkl", "wb") as f:
    pickle.dump(games_df, f)

print("Pickle file saved successfully.")

Pickle file saved successfully.


In [None]:
import re

# Define function:
def clean_desc(text):
    # Remove newlines and any header starting with '###' (like ###Plot)
    text = re.sub(r'(\r\n|\r|\n|###\w+)', ' ', text)
    
    # Replace multiple spaces with a single space and strip leading/trailing spaces
    text_clean = re.sub(r'\s+', ' ', text).strip()
    
    # Return the cleaned text
    return text_clean

# Apply the function:
games_df['description_clean'] = games_df['description_clean'].apply(clean_desc)

In [None]:
games_df.to_csv("games_recommended.csv", index=False)

In [35]:
import re
import pandas as pd
import numpy as np
# Load data:
games = pd.read_csv("../Data/games_recommended.csv")
# Load precomputed cosine sim:
cosine_sim = np.load("../Recommendation Engine/cosine_sim.npy")

In [36]:
# Save to pkl file:
import pickle

with open("games_recommended.pkl", 'wb') as f:
    pickle.dump(games, f)

print("Pickle file successfully saved")

Pickle file successfully saved


In [1]:
import re
import pandas as pd
import numpy as np
# Load data:
games = pd.read_csv("../Data/games_recommended.csv")
# Load precomputed cosine sim:
cosine_sim = np.load("../Recommendation Engine/cosine_sim.npy")

# === Content Based Recommendation Function ===
---

In [2]:
# Import Rapid Fuzz:
from rapidfuzz import process, fuzz
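
Fuzzy matching maps a misspelt or partial query onto the closest known title, and returns nothing when no title is close enough. The sketch below illustrates that behaviour with the standard library's `difflib` as a stand-in, so it runs without extra packages. It is not the RapidFuzz call used in this notebook, and its similarity scores are on a different scale (0 to 1). The helper name, the cutoff and the sample titles are assumptions.

```python
import difflib

def best_title_match(query, titles, cutoff=0.6):
    """Return the closest title to `query`, or None if nothing clears the cutoff."""
    matches = difflib.get_close_matches(query.lower().strip(), titles, n=1, cutoff=cutoff)
    return matches[0] if matches else None

titles = ["grand theft auto v", "the witcher 3 wild hunt", "portal 2"]

print(best_title_match("witcher 3 wild hnt", titles))   # typo in the query
print(best_title_match("Portl 2", titles))              # missing letter, mixed case
print(best_title_match("zzzz", titles))                 # nothing similar
```

Checking for the "no match" case before indexing into the similarity matrix avoids recommending games for an unrelated title.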

In [3]:
alias_dict = {
    'gta5': 'grand theft auto v',
    'gta 5': 'grand theft auto v',
    'gta v': 'grand theft auto v',
    'gta': 'grand theft auto',
    'gta 4': 'grand theft auto iv',
    'gta4': 'grand theft auto iv',

    'witcher 3': 'the witcher 3 wild hunt',
    'tw3': 'the witcher 3 wild hunt',

    'rdr': 'red dead redemption',
    'rdr2': 'red dead redemption 2',
    'red dead 2': 'red dead redemption 2',
    'red dead': 'red dead redemption 2',

    'botw': 'the legend of zelda breath of the wild',
    'zelda botw': 'the legend of zelda breath of the wild',

    'elden': 'elden ring',
    'elden ring': 'elden ring',

    'gow': 'god of war',
    'god of war 4': 'god of war',
    'god of war': 'god of war',

    'minecraft': 'minecraft',

    'fortnite': 'fortnite',

    'cod': 'call of duty',
    'call of duty': 'call of duty',

    'hzd': 'horizon zero dawn',
    'horizon': 'horizon zero dawn',

    'spiderman': 'marvels spider man',
    'spider man': 'marvels spider man',
    'marvel spiderman': 'marvels spider man',

    'cyberpunk': 'cyberpunk 2077',
    'cyberpunk 2077': 'cyberpunk 2077',

    'ac valhalla': 'assassins creed valhalla',
    'assassins creed valhalla': 'assassins creed valhalla',
    'acv': 'assassins creed valhalla',
    'ac': 'assassins creed',
    'ac2': 'assassins creed 2',

    're8': 'resident evil village',
    'resident evil 8': 'resident evil village',
    'village': 'resident evil village',

    'tlou': 'the last of us',
    'tlou2': 'the last of us part ii',
    'last of us': 'the last of us',
    'last of us 2': 'the last of us part ii',

    'dragon ball': 'dragon ball z',
    'dragon z': 'dragon ball fighterz',
    'dbz': 'dragon ball z',
    'dbz budokai': 'dragon ball budokai tenkaichi',
    'dbz sparking zero': 'dragon ball sparking zero',

    'hogwarts': 'hogwarts legacy',

    'sekiro': 'sekiro shadows die twice',

    'hellblade': 'hellblade senuas sacrifice'
}
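
The alias table is meant to be consulted after the query has been normalized, so that inputs such as "GTA-5" or "TW3!" resolve to a full title. A minimal sketch of that lookup follows. The helper name `resolve_alias` and the three-entry excerpt are assumptions made for illustration, and the cleaning regex mirrors the one used in `recommend_games`.

```python
import re

# Small excerpt of the alias table above, repeated so the sketch runs on its own
alias_excerpt = {
    'gta5': 'grand theft auto v',
    'gta 5': 'grand theft auto v',
    'tw3': 'the witcher 3 wild hunt',
}

def resolve_alias(user_input, aliases):
    """Normalize the query, then swap in the full title if an alias matches."""
    # Lowercase, trim, and drop everything except letters, digits, and whitespace
    cleaned = re.sub(r'[^a-zA-Z0-9\s]', '', user_input.lower().strip())
    # Fall back to the cleaned query when no alias is registered
    return aliases.get(cleaned, cleaned)

print(resolve_alias("GTA-5", alias_excerpt))
print(resolve_alias("  TW3! ", alias_excerpt))
print(resolve_alias("Elden Ring", alias_excerpt))
```

Queries without an alias pass through unchanged (apart from normalization), so they can still be handed to the fuzzy matcher.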

In [18]:
def recommend_games(user_input, top_n=10):
    try:
        # Validate user input
        if not isinstance(user_input, str) or not user_input.lower().strip():
            raise ValueError("User input must not be empty. Please enter a game title to get recommendations.")
        
        # Clean user input
        user_input_clean = re.sub(r'[^a-zA-Z0-9\s]', '', user_input.lower().strip())

        # Use alias if available
        if user_input_clean in alias_dict:
            user_input_clean = alias_dict[user_input_clean]

        # Fuzzy-match the cleaned input against all game titles
        match_result = process.extractOne(user_input_clean, games['title_clean'].to_list(), scorer=fuzz.ratio)
        # Handle the case where no fuzzy match is found
        if match_result is None:
            raise ValueError(f"Game '{user_input}' was not found in the data. It may be added in a future update of the application.")
        
        best_match = match_result[0]
        print(f"Best Match is: {best_match}")

        # Get index of the best match
        idx = games[games['title_clean'] == best_match].index[0]

        # Ensure that idx is valid
        if idx < 0 or idx >= len(games):
            raise IndexError(f"Index {idx} is out of range.")
        
        # Pair every game index with its similarity to the matched game,
        # using the precomputed cosine similarity matrix
        sim_scores = list(enumerate(cosine_sim[idx]))

        # Sort by similarity (highest first) and drop the queried game itself by index,
        # so a duplicate entry with the same score cannot push it into the results
        ranked = sorted(sim_scores, key=lambda x: x[1], reverse=True)
        similar_games_idx = [pair for pair in ranked if pair[0] != idx][:top_n]

        # Prepare results
        results = []
        for i, _ in similar_games_idx:
            game_data = games.loc[i]
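            # Safely split store names and domains into lists, or use empty lists
            # when the value is missing (NaN is not a string)
            store_names = game_data['store_name'].split(', ') if isinstance(game_data['store_name'], str) else []
            store_domains = game_data['store_domain'].split(', ') if isinstance(game_data['store_domain'], str) else []

            # Pair each store with its domain; zip stops at the shorter list
            store_display = ', '.join(
                f"{name} : https://{domain}" for name, domain in zip(store_names, store_domains)
            )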
                
            # Check if all necessary fields exist
            if any(field not in game_data for field in ['title', 'description_clean','genres', 'release_date', 'rating', 'tags', 'developers', 'publishers', 'esrb_rating', 'background_image_url','website']):
                continue

            results.append({
                'Title': games.loc[i,'title'],
                'Description': games.loc[i, 'description_clean'],
                'Genre': games.loc[i, 'genres'],
                'Release Date': games.loc[i,'release_date'],
                'Rating': games.loc[i,'rating'],
                'Platforms': games.loc[i, 'platforms'],
                'Stores': store_display,
                'Tags': games.loc[i,'tags'],
                'Developer': games.loc[i, 'developers'],
                'Publisher': games.loc[i,'publishers'],
                'ESRB_Rating': games.loc[i,'esrb_rating'],
                'Poster': games.loc[i, 'background_image_url'],
                'Website': games.loc[i, 'website'],
                'Screenshots': games.loc[i, 'screenshots']
            })


        return results
    
    except ValueError as ve:
        return {'Error': str(ve)}
    
    except IndexError as ie:
        return {'Error': f'Index error: {str(ie)}'}
    
    except Exception as e:
        return {'Error': f'An unexpected error occurred: {str(e)}'}
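
The ranking step inside `recommend_games` boils down to picking the highest-scoring entries from one row of the similarity matrix while leaving out the queried game. A minimal standalone sketch with plain lists follows. The helper name `top_similar` and the scores are made up for illustration. Filtering by index keeps the query out of the results even when a duplicate entry ties with it at 1.0.

```python
def top_similar(sim_row, query_idx, top_n=3):
    """Return (index, score) pairs for the top_n most similar items, skipping the query."""
    # Drop the query's own entry explicitly instead of assuming it sorts first
    candidates = [(i, s) for i, s in enumerate(sim_row) if i != query_idx]
    # Highest similarity first
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return candidates[:top_n]

# Made-up similarity row for item 1; item 3 is an exact duplicate (score 1.0)
row = [0.20, 1.00, 0.65, 1.00, 0.10]
print(top_similar(row, query_idx=1, top_n=3))
```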

In [37]:
recommend_games('elden ring',10)

Best Match is: elden ring


[{'Title': "Alwa's Awakening",
  'Description': "In Alwa's Awakening you play as Zoe, a heroine sent from another world to bring peace to the land of Alwa. Equipped only with a magic staff she awakens in a distant land and must set out to help the people. Traverse dangerous dungeons, meet interesting and fun characters and explore the world in this 8-bit adventure game. Just like the old classics you won't have a flashing arrow telling you exactly where to go and what to do next. Instead you are free to find your own way and by using your magic staff you can progress through the over 400 unique challenging rooms in the game. Alwa’s Awakening is a game that tries to stay as close as possible to the authentic 8-bit look with sweet pixel art, a soundtrack filled with catchy chiptunes and so much charm it’ll bring you right back to the NES era. With easy to understand controls the game is easy to learn but tough to master, just like how games were in the old days! A retro game with an auth