# GAME DATASET
Using the game dataset to build a recommender system using user based collaborative filtering.

## BUSINESS OBJECTIVE
* Maximise profit
* Maximise visibility
* Maximise customer base
* Minimise attrition rate

## CONSTRAINTS
* High competition
* Online piracy

## DATA DICTIONARY

| **slno** | **Name of Feature** | **Description**                  |   **Type**   | **Relevance** |
|:--------:|:-------------------:|:---------------------------------|:-------------|:-------------:|
|     1    | userId              | It is the user ID of the users.  | Nominal      | Relevant      |
|     2    | game                | It is the name of the games.     | Nominal      | Relevant      |
|     3    | rating              | It is the ratings of the games.  | Ratio        | Relevant      |

Importing the required libraries

In [1]:
import pandas as pd
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

Loading the dataset using pandas library.

In [2]:
df0=pd.read_csv(r"D:\360Digitmg\ASSIGNMENTS\Ass10\game.csv")
df=df0.copy()
df.head()

Unnamed: 0,userId,game,rating
0,3,The Legend of Zelda: Ocarina of Time,4.0
1,6,Tony Hawk's Pro Skater 2,5.0
2,8,Grand Theft Auto IV,4.0
3,10,SoulCalibur,4.0
4,11,Grand Theft Auto IV,4.5


From the below code we can get a general idea about the dataset. 

In [3]:
df.shape

(5000, 3)

In [4]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5000 entries, 0 to 4999
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   userId  5000 non-null   int64  
 1   game    5000 non-null   object 
 2   rating  5000 non-null   float64
dtypes: float64(1), int64(1), object(1)
memory usage: 117.3+ KB


from the above information it is clear that there are no null values in the dataset and the shape of the dataset is (51,4).

Checking the number of duplicates in the dataset. 

In [5]:
df.duplicated(keep='first').sum()

0

To get the column names of the dataset. 

In [6]:
df.columns

Index(['userId', 'game', 'rating'], dtype='object')

Using the pivot table function to get a sparse matrix.

In [7]:
pt=df.pivot_table(index='game',columns='userId',values='rating')
pt

userId,1,2,3,5,6,7,8,10,11,12,...,7101,7102,7104,7107,7108,7110,7116,7117,7119,7120
game,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
'Splosion Man,,,,,,,,,,,...,,,,,,,,,,
007: The World is Not Enough,,,,,,,,,,,...,,,,,,,,,,
10 Second Ninja X,,,,,,,,,,,...,,,,,,,,,,
1001 Spikes,,,,,,,,,,,...,,,,,,,,,,
1701 A.D.,,,,,,,,,,,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
inFamous,,,,,,,,,,,...,,,,,,,,,,
inFamous 2,,,,,,,,,,,...,,,,,,,,,,
inFamous: Festival of Blood,,,,,,,,,,,...,,,,,,,,,,
inFamous: Second Son,,,,,,,,,,,...,,,,,,,,,,


Using the fillna function to replace the nan values with 0. 

In [8]:
pt.fillna(0,inplace=True)
pt

userId,1,2,3,5,6,7,8,10,11,12,...,7101,7102,7104,7107,7108,7110,7116,7117,7119,7120
game,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
'Splosion Man,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
007: The World is Not Enough,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
10 Second Ninja X,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1001 Spikes,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1701 A.D.,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
inFamous,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
inFamous 2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
inFamous: Festival of Blood,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
inFamous: Second Son,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


To get the similarity score of the sparse matrix. 

In [9]:
similarity_score=cosine_similarity(pt)
similarity_score.shape

(3438, 3438)

Creating a custom function to get the recommendation on giving the name and number of games to be recommended details. 

In [10]:
def recommend(game_name,topn):
    # fetching the index values
    index= np.where(pt.index==game_name)[0][0]
    # Here the items are sorted on the basis of the similarity scores for each given games index.
    # Enumerate is used to get both the similarity score and the index numbers. 
    similar_items=sorted(list(enumerate(similarity_score[index])),key=lambda x:x[1],reverse=True)[1:topn+1]
    
    # Getting the movie index 
    df_idx  =  [i[0] for i in similar_items]
    df_scores =  [i[1] for i in similar_items]

    df_similar_show = pd.DataFrame(columns=["Titles", "Score"])
    df_similar_show["Titles"] = df.loc[df_idx, "game"]
    df_similar_show["Score"] = df_scores
    df_similar_show.reset_index(inplace = True)  

    return (df_similar_show)

Using the input function and the custom function giving the name of the game and the number of reccomendations required we can get the desired output. 

In [15]:
a=input("Enter the name of the game: ")
b=int(input('Enter the number of recommendations required: '))

recommend(a,b)

Enter the name of the game: 'Splosion Man
Enter the number of recommendations required: 3


Unnamed: 0,index,Titles,Score
0,2785,Suikoden II,0.707107
1,3036,Toy Soldiers,0.336994
2,870,The Witness,0.333333


## CONCLUSION
* Using the recommendation system we can recommend some unknown games which the customer was not going to play but because it was recommended it will generate an interest about the game based on ratings from users of the similar tastes. 
* Good rating games will get more sales and visibility. 