## Tabletop Game Recommendation

### Imports

In [1]:
import pandas as pd

pd.set_option('display.max_columns', None)
pd.set_option('display.max_colwidth', None)
pd.set_option('display.max_rows', 100)

from IPython.display import Image, display, Markdown

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity


### Functions

In [25]:
# Function to show the dimensions, column zero counts, column datatypes, column null counts of the input dataframe
def check_df(df):
    # Display the dimensions of the DataFrame
    display(Markdown("#### DataFrame dimensions"))
    display(df.shape)

    # Add a new line
    display(Markdown("<br>"))

    # Display data types of all columns
    display(Markdown("#### Data Types, zeros and nulls"))
    display(pd.DataFrame({
        "Data type": df.dtypes,
        "Zero counts": (df == 0).sum(),
        "Zero count %": (((df == 0).sum()/df.count())* 100).round(2),
        "Null counts": df.isnull().sum(),
        "Null count %": ((df.isnull().sum())/(df.count()+df.isnull().sum())* 100).round(2)
    }))

# Function which asks the user if they want to select specific game characteristics
def ask_for_preference():
    response = input("Do you have any preferences for game characteristics? (Yes/No): ")
    
    if response.strip().lower() == 'yes':
        print(f"The current number of games for comparison is {len(filtered_final_df)-1}:",end="\n")
        return True
        
    elif response.strip().lower() == 'no':
        print("Ok, no preferences will be used.", end="\n")
        return False
        
    else:
        print("Invalid response. Please enter 'Yes' or 'No'.", end="\n")
        return ask_for_preference()

# Function to display possible games based on user input
def suggest_games(game_list):
    while True:  # Repeat until valid input is received
        user_input = input("Type part of a game name you are looking for: ")
        suggestions = [game for game in game_list if user_input.lower() in game.lower()]
        
        if suggestions:
            print(f"Suggestions based on your input '{user_input}':")
            for game in suggestions:
                print(f"- {game}")
            break  # Exit the loop if suggestions are found
        else:
            print(f"No games found for '{user_input}'. Please try again.", end="\n")

def ask_for_average_rating():
    while True:
        try:
            average_rating = float(input("What average user rating would you like the game to have at least (between 7.0 and 10.0): "))
            
            if 7.0 <= average_rating <= 10.0:
                return average_rating
            else:
                print("Invalid range. Please ensure the rating is a float value between 7.0 and 10.0.", end="\n")
        except ValueError:
            print("Invalid input. Please enter a valid float value between 7.0 and 10.0.", end="\n")

def ask_for_complexity_rating():
    while True:
        try:
            complexity_rating_low_lim = float(input("What lower limit of complexity would you be happy with? (between 0.0 and 5.0): "))
            complexity_rating_up_lim = float(input("What upper limit of complexity would you be happy with? (between 0.0 and 5.0): "))
            
            if 0.0 <= complexity_rating_low_lim <= 5.0 and 0.0 <= complexity_rating_up_lim <= 5.0:
                return complexity_rating_low_lim, complexity_rating_up_lim
            else:
                print("Invalid range. Please ensure the rating is a float value between 0.0 and 5.0.", end="\n")
        except ValueError:
            print("Invalid input. Please enter a valid float value between 0.0 and 5.0.", end="\n")

def ask_for_min_player_num():
    while True:
        try:
            min_player_num = int(input("What minimum player number would you like the game to have (between 1 and 100): "))
            
            if 1 <= min_player_num <= 100:
                return min_player_num
            else:
                print("Invalid range. Please ensure the rating is an integer value between 1 and 100.", end="\n")
        except ValueError:
            print("Invalid input. Please enter a valid integer value between 1 and 100.", end="\n")

def ask_for_max_player_num():
    while True:
        try:
            max_player_num = int(input("What maximum player number would you like the game to have (between 1 and 100): "))
            
            if 1 <= max_player_num <= 100:
                return max_player_num
            else:
                print("Invalid range. Please ensure the rating is an integer value between 1 and 100.", end="\n")
        except ValueError:
            print("Invalid input. Please enter a valid integer value between 1 and 100.", end="\n")

def ask_for_exp_playtime():
    while True:
        try:
            exp_playtime = int(input("What expected playtime would you like the game to have (between 3 and 12000 minutes): "))
            
            if 1 <= exp_playtime <= 12000:
                return exp_playtime
            else:
                print("Invalid range. Please ensure the rating is an integer value between 3 and 12000 minutes.", end="\n")
        except ValueError:
            print("Invalid input. Please enter a valid integer value between 3 and 12000 minutes.", end="\n")

def ask_for_min_age():
    while True:
        try:
            min_age = int(input("What minimum player age would you like the game to have (between 2 and 21 years old): "))
            
            if 2 <= min_age <= 21:
                return min_age
            else:
                print("Invalid range. Please ensure the rating is an integer value between 2 and 21 years old.", end="\n")
        except ValueError:
            print("Invalid input. Please enter a valid integer value between 2 and 21 years old.", end="\n")

def main(final_df, game_list): 
    
    preference_req = ask_for_preference()
    
    suggest_games(game_list)
    while True:  # Loop until a valid game is found
        game_rec = input("Type in the exact game name you found (or any name): ")
    
        # Check if the user input matches any game in the DataFrame
        game_rec_data = final_df.loc[final_df['name'] == game_rec]
        
        if not game_rec_data.empty:  # If game data is found
            break  # Exit the loop if the game is found
        else:  # If game data is not found
            print(f"Game '{game_rec}' not found in the dataset. Please try again.", end="\n")  
            
    if preference_req:
        # Drop the game from final_df
        filtered_final_df = final_df.drop(final_df.loc[final_df['name'] == game_rec].index)

        #'Define the values for each prefered range of characteristics & then use them to filter final_df
        average_rating = ask_for_average_rating()
        filtered_final_df = filtered_final_df.loc[filtered_final_df['average'] >= average_rating]
        print(f"The current number of games for comparison is {len(filtered_final_df)}:",end="\n")
        
        complexity_rating_low_lim , complexity_rating_up_lim = ask_for_complexity_rating()
        filtered_final_df = filtered_final_df.loc[filtered_final_df['average complexity'] >= complexity_rating_low_lim]
        filtered_final_df = filtered_final_df.loc[filtered_final_df['average complexity'] <= complexity_rating_up_lim]
        print(f"The current number of games for comparison is {len(filtered_final_df)}:",end="\n")
        
        min_player_num = ask_for_min_player_num()
        filtered_final_df = filtered_final_df.loc[filtered_final_df['min player number'] >= min_player_num]
        print(f"The current number of games for comparison is {len(filtered_final_df)}:",end="\n")
        
        max_player_num = ask_for_max_player_num()
        filtered_final_df = filtered_final_df.loc[filtered_final_df['max player number'] <= max_player_num]
        print(f"The current number of games for comparison is {len(filtered_final_df)}:",end="\n")
        
        exp_playtime = ask_for_exp_playtime()
        filtered_final_df = filtered_final_df.loc[filtered_final_df['expected play time'] >= exp_playtime]
        print(f"The current number of games for comparison is {len(filtered_final_df)}:",end="\n")
        
        min_age = ask_for_min_age()
        filtered_final_df = filtered_final_df.loc[filtered_final_df['minimum age limit'] >= min_age]
        print(f"The current number of games for comparison is {len(filtered_final_df)}:",end="\n")
        
        # Add the recommended game back to the filtered DataFrame
        filtered_final_df = pd.concat([filtered_final_df, game_rec_data])

    else:
        filtered_final_df = final_df
        print(f"The current number of games being compared is {len(filtered_final_df)-1}:",end="\n")
        
    return filtered_final_df, game_rec


### Creating final dataframe

In [173]:
# Creating final_df dataframe from the previously feature engineered dataf
final_df = pd.read_csv('Games_fe.csv')

In [107]:
# Checking for any null or zero values which would affect recommendation, complexity votes and year published are not used and therefore ignored
check_df(final_df)

#### DataFrame dimensions

(4831, 20)

<br>

#### Data Types, zeros and nulls

Unnamed: 0,Data type,Zero counts,Zero count %,Null counts,Null count %
id,int64,0,0.0,0,0.0
name,object,0,0.0,0,0.0
average,float64,0,0.0,0,0.0
usersrated,int64,0,0.0,0,0.0
number of comments,int64,0,0.0,0,0.0
complexity votes,int64,4,0.08,0,0.0
average complexity,float64,0,0.0,0,0.0
year published,int64,5,0.1,0,0.0
min player number,int64,0,0.0,0,0.0
max player number,int64,0,0.0,0,0.0


## Game Recommendation Generation

In [27]:
# Reloading final_df to refresh dataset for filtering
final_df = pd.read_csv('Games_fe.csv')
game_list = final_df['name'].tolist()

if __name__ == "__main__":
    filtered_final_df, game_rec = main(final_df, game_list)

    if len(filtered_final_df) <=1:
        print("Based on your preferences there are no other games to recommend, please try different preferences")
        sys.exit()
        
    # Tokenize the bag of words based on whitespace separation
    vectorizer = CountVectorizer(token_pattern=r'[^ ]+')

    # Model learns the vocabulary from the text data ('bag of words'), and transforms the text data into a matrix of token counts
    vectorized = vectorizer.fit_transform(filtered_final_df['bag of words'])

    # Cosine similairty calculation, based on matrix of token counts
    similarities = cosine_similarity(vectorized)
    
    # score_df is used to contain the similarity matrix between all the games
    score_df = pd.DataFrame(similarities, columns=filtered_final_df['name'], index=filtered_final_df['name'])

    # Identifying the most similar games to the input game, based on the similarity matrix
    input_game = game_rec

    # Creating a dataframe called 'recommendations' which is ordered based on most similar to input game
    recommendations = pd.DataFrame(score_df.nlargest(11,input_game)[input_game])

    # Sort the DataFrame by similarity score in descending order
    recommendations.sort_values(by=[input_game],ascending = False)
    
    # Exclude the first row and select rows 1 to 11 & rename the 2nd column to 'Similarity score'
    recommendations = recommendations.rename(columns={input_game: 'Similarity score'}).iloc[1:12]

# Display the recommendations
recommendations

Do you have any preferences for game characteristics? (Yes/No):  Yes


The current number of games for comparison is 2450:


Type part of a game name you are looking for:  chess


Suggestions based on your input 'chess':
- Chess
- Bughouse Chess
- Chess960
- Tank Chess


Type in the exact game name you found (or any name):  Chess
What average user rating would you like the game to have at least (between 7.0 and 10.0):  7


The current number of games for comparison is 4830:


What lower limit of complexity would you be happy with? (between 0.0 and 5.0):  2
What upper limit of complexity would you be happy with? (between 0.0 and 5.0):  4


The current number of games for comparison is 3675:


What minimum player number would you like the game to have (between 1 and 100):  2


The current number of games for comparison is 2148:


What maximum player number would you like the game to have (between 1 and 100):  4


The current number of games for comparison is 1413:


What expected playtime would you like the game to have (between 3 and 12000 minutes):  30


The current number of games for comparison is 1369:


What minimum player age would you like the game to have (between 2 and 21 years old):  2


The current number of games for comparison is 1369:


Unnamed: 0_level_0,Similarity score
name,Unnamed: 1_level_1
Mahjong,0.274202
Go,0.269951
Shogi,0.174522
Chess960,0.148159
Bridge,0.143736
Xiangqi,0.134413
Changgi,0.124939
Ordo,0.105593
Jass,0.098773
The Duke,0.093124
