## Content Based Recommendation

## Import the dataset
Dataset: https://www.kaggle.com/datasets/trolukovich/steam-games-complete-dataset
This notebook uses the file steam_games.csv from that dataset.


In [5]:
import pandas as pd

games = pd.read_csv('./datasets/steam_games.csv', low_memory=False)
pd.DataFrame(games.columns, columns=['columns']).T

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19
columns,url,types,name,desc_snippet,recent_reviews,all_reviews,release_date,developer,publisher,popular_tags,game_details,languages,achievements,genre,game_description,mature_content,minimum_requirements,recommended_requirements,original_price,discount_price


In [6]:
# Drop empty descriptions
games = games.dropna(subset=['game_description'])

In [7]:
games.head()

Unnamed: 0,url,types,name,desc_snippet,recent_reviews,all_reviews,release_date,developer,publisher,popular_tags,game_details,languages,achievements,genre,game_description,mature_content,minimum_requirements,recommended_requirements,original_price,discount_price
0,https://store.steampowered.com/app/379720/DOOM/,app,DOOM,Now includes all three premium DLC packs (Unto...,"Very Positive,(554),- 89% of the 554 user revi...","Very Positive,(42,550),- 92% of the 42,550 use...","May 12, 2016",id Software,"Bethesda Softworks,Bethesda Softworks","FPS,Gore,Action,Demons,Shooter,First-Person,Gr...","Single-player,Multi-player,Co-op,Steam Achieve...","English,French,Italian,German,Spanish - Spain,...",54.0,Action,"About This Game Developed by id software, the...",,"Minimum:,OS:,Windows 7/8.1/10 (64-bit versions...","Recommended:,OS:,Windows 7/8.1/10 (64-bit vers...",$19.99,$14.99
1,https://store.steampowered.com/app/578080/PLAY...,app,PLAYERUNKNOWN'S BATTLEGROUNDS,PLAYERUNKNOWN'S BATTLEGROUNDS is a battle roya...,"Mixed,(6,214),- 49% of the 6,214 user reviews ...","Mixed,(836,608),- 49% of the 836,608 user revi...","Dec 21, 2017",PUBG Corporation,"PUBG Corporation,PUBG Corporation","Survival,Shooter,Multiplayer,Battle Royale,PvP...","Multi-player,Online Multi-Player,Stats","English,Korean,Simplified Chinese,French,Germa...",37.0,"Action,Adventure,Massively Multiplayer",About This Game PLAYERUNKNOWN'S BATTLEGROUND...,Mature Content Description The developers de...,"Minimum:,Requires a 64-bit processor and opera...","Recommended:,Requires a 64-bit processor and o...",$29.99,
2,https://store.steampowered.com/app/637090/BATT...,app,BATTLETECH,Take command of your own mercenary outfit of '...,"Mixed,(166),- 54% of the 166 user reviews in t...","Mostly Positive,(7,030),- 71% of the 7,030 use...","Apr 24, 2018",Harebrained Schemes,"Paradox Interactive,Paradox Interactive","Mechs,Strategy,Turn-Based,Turn-Based Tactics,S...","Single-player,Multi-player,Online Multi-Player...","English,French,German,Russian",128.0,"Action,Adventure,Strategy",About This Game From original BATTLETECH/Mec...,,"Minimum:,Requires a 64-bit processor and opera...","Recommended:,Requires a 64-bit processor and o...",$39.99,
3,https://store.steampowered.com/app/221100/DayZ/,app,DayZ,The post-soviet country of Chernarus is struck...,"Mixed,(932),- 57% of the 932 user reviews in t...","Mixed,(167,115),- 61% of the 167,115 user revi...","Dec 13, 2018",Bohemia Interactive,"Bohemia Interactive,Bohemia Interactive","Survival,Zombies,Open World,Multiplayer,PvP,Ma...","Multi-player,Online Multi-Player,Steam Worksho...","English,French,Italian,German,Spanish - Spain,...",,"Action,Adventure,Massively Multiplayer",About This Game The post-soviet country of Ch...,,"Minimum:,OS:,Windows 7/8.1 64-bit,Processor:,I...","Recommended:,OS:,Windows 10 64-bit,Processor:,...",$44.99,
4,https://store.steampowered.com/app/8500/EVE_On...,app,EVE Online,EVE Online is a community-driven spaceship MMO...,"Mixed,(287),- 54% of the 287 user reviews in t...","Mostly Positive,(11,481),- 74% of the 11,481 u...","May 6, 2003",CCP,"CCP,CCP","Space,Massively Multiplayer,Sci-fi,Sandbox,MMO...","Multi-player,Online Multi-Player,MMO,Co-op,Onl...","English,German,Russian,French",,"Action,Free to Play,Massively Multiplayer,RPG,...",About This Game,,"Minimum:,OS:,Windows 7,Processor:,Intel Dual C...","Recommended:,OS:,Windows 10,Processor:,Intel i...",Free,


### Preprocessing

I want to use game_description as the content for the content-based recommendation system.
A tf-idf vectorizer extracts features from the game descriptions by transforming them into a matrix of tf-idf features.

The max_df parameter ignores words that appear in more than 80% of the descriptions.
The min_df parameter ignores words that appear in fewer than 2 descriptions.
Both options are there to reduce noise.

In [8]:
from sklearn.feature_extraction.text import TfidfVectorizer

vectorizer = TfidfVectorizer(stop_words='english', max_df=0.8, min_df=2)

tfidf_matrix = vectorizer.fit_transform(games['game_description'])
print(tfidf_matrix.shape)

(37920, 53601)


## Find similar games

We'll use a K-Nearest Neighbors (KNN) model to find similar games. As you might remember from earlier lessons, every KNN model uses a **distance metric** to find the nearest neighbors. In this case we're going to base that metric on the **cosine similarity**, which measures the cosine of the angle between two non-zero vectors of an inner product space. The `cosine` metric in scikit-learn is the corresponding distance: 1 minus the cosine similarity.

### Cosine Similarity

The cosine similarity is a measure of similarity between two vectors $\mathbf{x}$ and $\mathbf{y}$.

$\cos(\mathbf{x},\mathbf{y}) = \frac{\mathbf{x} \cdot \mathbf{y}}{||\mathbf{x}|| \cdot ||\mathbf{y}||}$
<br/>
$\phantom{\cos(\mathbf{x},\mathbf{y})} = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n}(x_i)^2} \sqrt{\sum_{i=1}^{n}(y_i)^2}}$

where $\mathbf{x}$ and $\mathbf{y}$ are vectors, $||\mathbf{x}||$ and $||\mathbf{y}||$ are the norms of $\mathbf{x}$ and $\mathbf{y}$, and $x_i$ and $y_i$ are the tf-idf weights of the $i$th word in the two documents.

For the non-negative tf-idf vectors used here, the cosine similarity has the following properties, it is:

* a **normalized dot product**.
* **independent of the magnitude** of the vectors.
* **zero** if the two vectors are **orthogonal** and **one** if the two vectors point in the **same direction** (for example when they are equal).
* **symmetric**, this means that the similarity between A and B is the same as the similarity between B and A.
* **non-negative**, because tf-idf weights are never negative.
* **bounded** between 0 and 1, this means that the similarity between two tf-idf vectors is always between 0 and 1. For vectors with negative components the general range is -1 to 1.
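
As a small worked example of the formula and the properties above, the sketch below computes the cosine similarity by hand with NumPy and compares it with scikit-learn's `cosine_similarity`. The helper name `cosine` and the three example vectors are assumptions made up for illustration.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def cosine(x, y):
    """Cosine similarity as the normalized dot product of x and y."""
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

# Made-up vectors: y is x scaled by 2, z is orthogonal to both
x = np.array([1.0, 2.0, 0.0])
y = np.array([2.0, 4.0, 0.0])
z = np.array([0.0, 0.0, 3.0])

same_direction = cosine(x, y)   # magnitude does not matter, only direction
orthogonal = cosine(x, z)       # no shared non-zero component
symmetric = cosine(x, y) == cosine(y, x)

# scikit-learn expects 2-D input: one row per vector
sklearn_value = cosine_similarity([x], [y])[0, 0]

print(same_direction, orthogonal, symmetric, sklearn_value)
```

Here `x` and `y` point in the same direction, so their similarity is 1 even though their lengths differ, while `x` and `z` share no non-zero component and get a similarity of 0.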

We're going to compute the **cosine similarity** between different games based on the _tf-idf signature_ of their descriptions, that is, the vector representation of the game description. The higher the cosine similarity, the more similar the games are.

scikit-learn provides a function to compute the cosine similarity between vectors. We could use that function to compute the cosine similarity between the descriptions of the games, but in this case we're going to use a kNN model to find the nearest neighbors of a game. The kNN model has a parameter called `metric` that we can set to `cosine`. It then uses the cosine distance (1 minus the cosine similarity), so the nearest neighbors are the games with the highest cosine similarity.
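
The relation between the `cosine` metric of `NearestNeighbors` and the cosine similarity can be checked on a tiny example. This is a minimal sketch: the matrix `X` is made up for illustration and stands in for the tf-idf matrix.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.neighbors import NearestNeighbors

# Made-up feature matrix: one row per item
X = np.array([
    [1.0, 0.0, 1.0],
    [1.0, 0.0, 0.9],
    [0.0, 1.0, 0.0],
    [0.5, 0.5, 0.5],
])

model = NearestNeighbors(n_neighbors=3, metric='cosine').fit(X)

# Query with the first row; keep it 2-D by indexing with a list
distances, indices = model.kneighbors(X[[0]])

# The cosine metric is a distance, so convert it back to a similarity
similarities = 1.0 - distances
expected = cosine_similarity(X[[0]], X[indices[0]])

print(indices[0], similarities[0])
```

The query row is returned as its own nearest neighbor with a distance of (numerically) zero, which is why a recommendation function has to drop the first result.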


In [9]:
from sklearn.neighbors import NearestNeighbors

def get_content_based_recommendation(name, top_n=10, metric='cosine'):
    # Find the game whose name matches the title (case-insensitive)
    matches = (games['name'].str.lower() == name.lower()).to_numpy()
    if not matches.any():
        return "Title not found in the dataset"

    # Use the row position, not the index label: after dropna() the labels
    # no longer line up with the rows of the tf-idf matrix
    idx = matches.argmax()
    model = NearestNeighbors(n_neighbors=top_n, metric=metric)
    model.fit(tfidf_matrix)
    similar_games = model.kneighbors(tfidf_matrix[idx], return_distance=False)[0]
    # The nearest neighbor is the game itself, so drop it
    similar_games = similar_games.flatten()[1:]

    # Return the remaining top_n - 1 most similar games
    return games.iloc[similar_games]
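
The function above fits a new model on every call. One possible variation, sketched here on a made-up toy catalogue, fits the model once and passes `n_neighbors` at query time. The names `toy_games`, `toy_matrix`, `toy_model` and `recommend` are hypothetical and only serve this illustration; this is not the notebook's own implementation.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestNeighbors

# Hypothetical toy catalogue standing in for steam_games.csv
toy_games = pd.DataFrame({
    'name': ['Mech Arena', 'Mech Tactics', 'Farm Life', 'Demon Hunter'],
    'game_description': [
        'pilot giant mechs in arena battles',
        'command giant mechs in turn based battles',
        'grow crops and raise animals on a quiet farm',
        'hunt demons in a fast shooter',
    ],
})

# Vectorize and fit once, then reuse the fitted model for every query
toy_matrix = TfidfVectorizer(stop_words='english').fit_transform(toy_games['game_description'])
toy_model = NearestNeighbors(n_neighbors=3, metric='cosine').fit(toy_matrix)

def recommend(name, top_n=2):
    # Positional lookup, so the result lines up with the matrix rows
    matches = (toy_games['name'].str.lower() == name.lower()).to_numpy()
    if not matches.any():
        return None
    pos = int(matches.argmax())
    # Ask for one extra neighbor because the query game is returned as well
    neighbours = toy_model.kneighbors(
        toy_matrix[pos:pos + 1], n_neighbors=top_n + 1, return_distance=False
    )[0]
    neighbours = [i for i in neighbours if i != pos][:top_n]
    return toy_games.iloc[neighbours]

print(recommend('mech arena', top_n=1)['name'].tolist())
```

Requesting `top_n + 1` neighbors and filtering out the query position means the caller gets exactly `top_n` games back.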

#### Test1

From experience I can say that the recommended games are indeed good recommendations: five of the nine are other DOOM titles and the rest are hell- or horror-themed action games.

In [10]:
get_content_based_recommendation('DOOM')[['name', 'game_description', 'genre', 'popular_tags', 'release_date']]

Unnamed: 0,name,game_description,genre,popular_tags,release_date
839,Doom 3: BFG Edition,About This Game DOOM 3 BFG Edition is the ult...,Action,"FPS,Horror,Action,Shooter,Classic,Sci-fi,Singl...","Oct 15, 2012"
788,DOOM VFR,"About This Game Developed by id Software, the...",Action,"Violent,Action,Gore,VR,FPS,Shooter,Horror,Sing...","Nov 30, 2017"
366,DOOM Eternal,"About This Game As the DOOM Slayer, you retur...",Action,"Gore,Violent,Action,FPS,Great Soundtrack,Demon...","Nov 22, 2019"
7687,The Haunted: Hells Reach,About This Game All Hell Has Broken Loose!!! ...,"Action,Indie","Action,Indie,Gore,Co-op,Third-Person Shooter,Z...","Oct 24, 2011"
35380,UNLEASH HELL,About This Game UNLEASH HELL is a brutally ha...,"Action,Indie","Action,Indie,Gore,Violent,FPS",TBA
1652,Hell is Other Demons,About This Game A Fast-Paced Bullet Hell Pla...,"Action,Indie","Indie,Action,Pixel Graphics,Platformer,Great S...","May 20, 2019"
2105,DOOM 3 Resurrection of Evil,About This Content The gripping expansion pa...,Action,"Action,FPS,Horror,Sci-fi,Dark,Atmospheric,Shoo...","Apr 3, 2005"
7780,HordeZ,About This Game Available for Arcades on Spr...,"Action,Indie","Action,Indie,FPS,Horror,VR,Shooter,On-Rails Sh...","Apr 29, 2016"
96,Ultimate Doom,About This Game The complete megahit game tha...,Action,"Classic,FPS,Action,1990's,Great Soundtrack,Dem...","Apr 30, 1995"


#### Test2

From experience I can say that the recommended games are indeed good recommendations: most of them are mech-themed or turn-based strategy games.

In [11]:
get_content_based_recommendation('BATTLETECH')[['name', 'game_description', 'genre', 'popular_tags', 'release_date']]


Unnamed: 0,name,game_description,genre,popular_tags,release_date
9859,MechCorp,About This Game MechCorp. is a turn-based-tac...,Strategy,"Strategy,Turn-Based Tactics,Mechs,Hex Grid,Sci...","Aug 2, 2018"
821,MechWarrior Online™ Solaris 7,About This Game MechWarrior Online™ Solaris ...,"Action,Free to Play,Massively Multiplayer,Simu...","Free to Play,Mechs,Multiplayer,Action,Shooter,...","Dec 10, 2015"
24113,Techwars Online,About This Game Techwars Online is a hardcore...,"Action,Indie,Massively Multiplayer,Strategy","Strategy,Massively Multiplayer,Indie,Action,Tu...","Mar 17, 2016"
1463,Override: Mech City Brawl,"About This Game No gears, no glory! Control ...","Action,Indie","Action,Indie,Mechs,Fighting,Local Co-Op,4 Play...","Dec 3, 2018"
2720,Mechs V Kaijus,About This Game In Mechs V Kaijus you take on...,"Action,Indie,Strategy,Early Access","Early Access,Action,Indie,Strategy,Tower Defen...","May 4, 2018"
27670,BATTLETECH Season Pass,About This Content The BATTLETECH Season Pas...,"Action,Adventure,Strategy","Strategy,Action,Adventure","Nov 27, 2018"
6295,Melting World Online,About This Game After the cataclysm that led...,"Adventure,Indie,Massively Multiplayer,RPG,Stra...","Strategy,Indie,Mechs,RPG,Adventure,Turn-Based,...","Oct 4, 2018"
111,Battle Brothers,About This Game Battle Brothers is a turn ba...,"Indie,RPG,Strategy","Strategy,Turn-Based Combat,Medieval,RPG,Turn-B...","Mar 24, 2017"
24035,Dark Horizons: Mechanized Corps,About This Game WARNING: THIS GAME IS CURRENT...,"Action,Indie,Simulation,Early Access","Early Access,Mechs,Action,Indie,Simulation,Ear...","Jul 22, 2014"
