# <center>**Board Game Recommender**</center>

<img src='Board_Game.png'>

## **Table of Contents (Work in progress)**

1. [Problem Statement](#problem)
2. [Data Loading and Exploration](#data-loading)
4. [Data Preprocessing](#data-preprocess)
    - [4.1 Custom Transformer](#custom)
    - [4.2 Numerical Pipelines](#numeric)
    - [4.3 Ordinal Pipeline](#ordinal)
    - [4.4 Column Transformers](#column)
5. [Model Selection and Training](#selection)
    - [5.1 Shortlist Promising Models](#initial)
    - [5.2 Fine Tuning](#fine)
6. [Model Evaluation](#evaluation)
7. [Conclusion](#conclude)
8. [Appendix](#append)

---

## **1. Problem Statement** <a class="anchor" id="problem"></a>

The goal of this project is to build a model that recommends board games, using the [Board Games Dataset](https://www.kaggle.com/datasets/andrewmvd/board-games) from Kaggle. It is the capstone project for my Master of Data Science program at Eastern University.

The dataset author's description of the columns is in the Appendix section.  

## **2. Data Loading and Exploration** <a class="anchor" id="data-loading"></a>

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

In [2]:
%%html
<style>
table {
  float: left;
}
</style>

In [3]:
board_game_data = pd.read_csv("bgg_dataset.csv")

In [4]:
board_game_data.head()

| | ID | Name | Year Published | Min Players | Max Players | Play Time | Min Age | Users Rated | Rating Average | BGG Rank | Complexity Average | Owned Users | Mechanics | Domains |
|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|
| 0 | 174430.0 | Gloomhaven | 2017.0 | 1 | 4 | 120 | 14 | 42055 | 8.79 | 1 | 3.86 | 68323.0 | Action Queue, Action Retrieval, Campaign / Bat... | Strategy Games, Thematic Games |
| 1 | 161936.0 | Pandemic Legacy: Season 1 | 2015.0 | 2 | 4 | 60 | 13 | 41643 | 8.61 | 2 | 2.84 | 65294.0 | Action Points, Cooperative Game, Hand Manageme... | Strategy Games, Thematic Games |
| 2 | 224517.0 | Brass: Birmingham | 2018.0 | 2 | 4 | 120 | 14 | 19217 | 8.66 | 3 | 3.91 | 28785.0 | Hand Management, Income, Loans, Market, Networ... | Strategy Games |
| 3 | 167791.0 | Terraforming Mars | 2016.0 | 1 | 5 | 120 | 12 | 64864 | 8.43 | 4 | 3.24 | 87099.0 | Card Drafting, Drafting, End Game Bonuses, Han... | Strategy Games |
| 4 | 233078.0 | Twilight Imperium: Fourth Edition | 2017.0 | 3 | 6 | 480 | 14 | 13468 | 8.7 | 5 | 4.22 | 16831.0 | Action Drafting, Area Majority / Influence, Ar... | Strategy Games, Thematic Games |


In [5]:
board_game_data.duplicated().sum()

0

In [6]:
board_game_data.shape

(20343, 14)

In [7]:
board_game_data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20343 entries, 0 to 20342
Data columns (total 14 columns):
 #   Column              Non-Null Count  Dtype  
---  ------              --------------  -----  
 0   ID                  20327 non-null  float64
 1   Name                20343 non-null  object 
 2   Year Published      20342 non-null  float64
 3   Min Players         20343 non-null  int64  
 4   Max Players         20343 non-null  int64  
 5   Play Time           20343 non-null  int64  
 6   Min Age             20343 non-null  int64  
 7   Users Rated         20343 non-null  int64  
 8   Rating Average      20343 non-null  float64
 9   BGG Rank            20343 non-null  int64  
 10  Complexity Average  20343 non-null  float64
 11  Owned Users         20320 non-null  float64
 12  Mechanics           18745 non-null  object 
 13  Domains             10184 non-null  object 
dtypes: float64(5), int64(6), object(3)
memory usage: 2.2+ MB


In [8]:
board_game_data.isna().sum()

ID                       16
Name                      0
Year Published            1
Min Players               0
Max Players               0
Play Time                 0
Min Age                   0
Users Rated               0
Rating Average            0
BGG Rank                  0
Complexity Average        0
Owned Users              23
Mechanics              1598
Domains               10159
dtype: int64

In [10]:
board_game_data[board_game_data['ID'].isna()]

| | ID | Name | Year Published | Min Players | Max Players | Play Time | Min Age | Users Rated | Rating Average | BGG Rank | Complexity Average | Owned Users | Mechanics | Domains |
|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|
| 10776 | | Ace of Aces: Jet Eagles | 1990.0 | 2 | 2 | 20 | 10 | 110 | 6.26 | 10778 | 2.0 | | | |
| 10835 | | Die Erben von Hoax | 1999.0 | 3 | 8 | 45 | 12 | 137 | 6.05 | 10837 | 2.0 | | | |
| 11152 | | Rommel in North Africa: The War in the Desert ... | 1986.0 | 2 | 2 | 0 | 12 | 53 | 6.76 | 11154 | 4.0 | | | |
| 11669 | | Migration: A Story of Generations | 2012.0 | 2 | 4 | 30 | 12 | 49 | 7.2 | 11671 | 2.0 | | | |
| 12649 | | Die Insel der steinernen Wachter | 2009.0 | 2 | 4 | 120 | 12 | 49 | 6.73 | 12651 | 3.0 | | | |
| 12764 | | Dragon Ball Z TCG (2014 edition) | 2014.0 | 2 | 2 | 20 | 8 | 33 | 7.03 | 12766 | 2.5 | | | |
| 13282 | | Dwarfest | 2014.0 | 2 | 6 | 45 | 12 | 82 | 6.13 | 13284 | 1.75 | | | |
| 13984 | | Hus | | 2 | 2 | 40 | 0 | 38 | 6.28 | 13986 | 2.0 | | | |
| 14053 | | Contrario 2 | 2006.0 | 2 | 12 | 0 | 14 | 37 | 6.3 | 14055 | 1.0 | | | |
| 14663 | | Warage: Extended Edition | 2017.0 | 2 | 6 | 90 | 10 | 49 | 7.64 | 14665 | 3.0 | | | |


In [11]:
board_game_data[board_game_data['Year Published'].isna()]

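| | ID | Name | Year Published | Min Players | Max Players | Play Time | Min Age | Users Rated | Rating Average | BGG Rank | Complexity Average | Owned Users | Mechanics | Domains |
|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|
| 13984 | | Hus | | 2 | 2 | 40 | 0 | 38 | 6.28 | 13986 | 2.0 | | | |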


In [12]:
board_game_data[board_game_data['Owned Users'].isna()]

| | ID | Name | Year Published | Min Players | Max Players | Play Time | Min Age | Users Rated | Rating Average | BGG Rank | Complexity Average | Owned Users | Mechanics | Domains |
|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|
| 2828 | 202755.0 | Guildhall Fantasy: Fellowship | 2016.0 | 2 | 4 | 45 | 10 | 565 | 7.13 | 2830 | 2.0 | | Hand Management, Take That, Set Collection | |
| 3590 | 196305.0 | Guildhall Fantasy: Alliance | 2016.0 | 2 | 4 | 45 | 10 | 360 | 7.2 | 3592 | 2.14 | | Hand Management, Set Collection, Take That | |
| 3739 | 196306.0 | Guildhall Fantasy: Coalition | 2016.0 | 2 | 4 | 45 | 10 | 336 | 7.19 | 3741 | 2.13 | | Hand Management, Set Collection, Take That | |
| 5807 | 289.0 | Chariot Lords | 1999.0 | 3 | 4 | 360 | 12 | 221 | 6.68 | 5809 | 3.0 | | Area Movement, Variable Player Powers | |
| 9202 | 6813.0 | Operation Market Garden: Descent into Hell | 1985.0 | 2 | 2 | 120 | 12 | 94 | 6.72 | 9204 | 3.0 | | Dice Rolling, Events, Grid Movement, Hexagon G... | |
| 9317 | 139.0 | Hoax | 1981.0 | 3 | 12 | 45 | 10 | 216 | 5.97 | 9319 | 1.38 | | Deduction, Hidden Roles, Voting | |
| 10075 | 266756.0 | Devil Boats: PT Boats in the Solomons | 2021.0 | 1 | 1 | 60 | 14 | 49 | 7.84 | 10077 | 2.83 | | | |
| 10776 | | Ace of Aces: Jet Eagles | 1990.0 | 2 | 2 | 20 | 10 | 110 | 6.26 | 10778 | 2.0 | | | |
| 10835 | | Die Erben von Hoax | 1999.0 | 3 | 8 | 45 | 12 | 137 | 6.05 | 10837 | 2.0 | | | |
| 11152 | | Rommel in North Africa: The War in the Desert ... | 1986.0 | 2 | 2 | 0 | 12 | 53 | 6.76 | 11154 | 4.0 | | | |
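
In the rows displayed above, every game without an `ID` is also missing `Owned Users`, `Mechanics`, and `Domains`. One way to check whether that overlap holds across the whole dataset, and to express the missing counts as percentages, is sketched below. The `toy` frame and its values are hypothetical stand-ins for `board_game_data`, used only so the snippet runs on its own.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame standing in for board_game_data (illustrative values only).
toy = pd.DataFrame(
    {
        "ID": [1.0, np.nan, 3.0, np.nan],
        "Owned Users": [10.0, np.nan, np.nan, np.nan],
        "Domains": ["Wargames", None, None, None],
    }
)

# Share of missing values per column, as a percentage.
missing_pct = toy.isna().mean().mul(100).round(1)
print(missing_pct)

# Do all rows without an ID also lack an owner count?
id_missing = toy["ID"].isna()
ids_imply_no_owners = toy.loc[id_missing, "Owned Users"].isna().all()
print(ids_imply_no_owners)
```

Running the same two steps on `board_game_data` would show whether the pattern seen in the displayed rows applies to all 16 games without an `ID`.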


### EDA on Training Set

In [22]:
board_game_data['Domains'].value_counts()

Domains
Wargames                                          3029
Strategy Games                                    1455
Family Games                                      1340
Abstract Games                                     869
Children's Games                                   708
Thematic Games                                     647
Party Games                                        409
Family Games, Strategy Games                       354
Customizable Games                                 235
Strategy Games, Thematic Games                     217
Thematic Games, Wargames                           139
Family Games, Party Games                          139
Abstract Games, Family Games                       116
Family Games, Thematic Games                       109
Children's Games, Family Games                     105
Strategy Games, Wargames                            99
Abstract Games, Strategy Games                      40
Party Games, Thematic Games                         36
Cu

In [21]:
board_game_data['Mechanics'].value_counts()

Mechanics
Hand Management                                                                                                 432
Hexagon Grid                                                                                                    412
Dice Rolling                                                                                                    372
Roll / Spin and Move                                                                                            369
Tile Placement                                                                                                  285
                                                                                                               ... 
Dice Rolling, Measurement Movement, Pick-up and Deliver, Variable Player Powers, Variable Set-up                  1
Action Points, Dice Rolling, Grid Movement, Modular Board, Variable Phase Order, Variable Player Powers           1
Area Movement, Hidden Movement, Secret Unit Deployment, Team-B
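
Both `Domains` and `Mechanics` store several labels in one comma-separated string, so `value_counts()` counts each combination rather than each label. A minimal sketch of counting individual labels follows, assuming the labels are separated by a comma and a space as in the output above. The `domains` series is a hypothetical sample, not taken from the dataset.

```python
import pandas as pd

# Hypothetical sample in the same comma-separated format as the Domains column.
domains = pd.Series(
    ["Strategy Games, Thematic Games", "Strategy Games", None, "Wargames"],
    name="Domains",
)

# Drop missing entries, split each string into a list of labels,
# give every label its own row, then count how often each label occurs.
label_counts = (
    domains.dropna()
    .str.split(", ")
    .explode()
    .value_counts()
)
print(label_counts)
```

The same chain could be applied to `board_game_data['Domains']` or `board_game_data['Mechanics']` to see how often each individual label appears.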

## **4. Data Preprocessing** <a class="anchor" id="data-preprocess"></a>

## **5. Model Selection and Training** <a class="anchor" id="selection"></a>

### 5.1 Shortlist Promising Models <a class="anchor" id="initial"></a>

### 5.2 Fine Tuning <a class="anchor" id="fine"></a>

## **6. Model Evaluation** <a class="anchor" id="evaluation"></a>

## **7. Conclusion** <a class="anchor" id="conclude"></a>

## **8. Appendix** <a class="anchor" id="append"></a>

The column descriptions from the dataset author's Kaggle post are listed below:

|variable                 |class     |description |
|:---|:---|:-------|
|ID                       |int       | BoardGameGeek ID Number |
|Name                     |character | Board game name  |
|Year Published           |int       | Year published  |
|Min Players              |int       | The minimum suggested number of players to play the game |
|Max Players              |int       | The maximum suggested number of players to play the game |
|Play Time                |int       | Average play time in minutes as suggested by the game creators |
|Min Age                  |int       | Age rating |
|Users Rated              |int       | Number of users who rated the game |
|Rating Average           |float64   | Average of user ratings |
|BGG Rank                 |int       | BoardGameGeek ranking |
|Complexity Average       |float64   | Average of user ratings for complexity from 1 - 5 |
|Owned Users              |int       | Number of users who own the game |
|Mechanics                |character | List of game mechanics for that game |
|Domains                  |character | List of game subgenres |
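
The table lists `ID`, `Year Published`, and `Owned Users` as int, while the `info()` output in Section 2 reports them as `float64`. pandas stores an integer column as float when it contains missing values, because the default integer dtype cannot hold `NaN`. If integer semantics are wanted, one option is pandas' nullable `Int64` dtype, sketched here on hypothetical values. Whether this conversion suits the project is an assumption, not something the notebook states.

```python
import numpy as np
import pandas as pd

# Hypothetical values; the NaN forces pandas to store the column as float64.
ids = pd.Series([174430.0, np.nan, 224517.0], name="ID")
print(ids.dtype)

# The nullable "Int64" extension dtype keeps whole numbers and marks gaps as <NA>.
ids_int = ids.astype("Int64")
print(ids_int.dtype)
print(ids_int.isna().sum())
```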