In [37]:
# Use IPython.display.HTML to define CSS classes for formatting text in Markdown cells, based on
# www.dataquest.io/blog/advanced-jupyter-notebooks-tutorial
from IPython.display import HTML
HTML(
    "<style>\
    span.str {color:#BA2121; font-style:italic; font-weight:bold;}\
    span.num {color:#080; font-style:italic; font-weight:bold;}\
    span.bltn {color:#080; font-weight:bold;}\
    span.op {color:#AA22FF;}\
    span.func {color:#00F;}\
    h3.yt {color:#009900; font-style:italic;}</style>"
)

<h2><b><u>Case Study: 20,000 Board Games</u></b></h2>

<h3><b><i>Background</i></b></h3>
The tabletop gaming industry, which includes board games, experienced a resurgence in popularity during the late 2000s [1]. The global market for board games was estimated at approximately 7.6 billion USD in 2017, with the potential to rise to 12 billion USD by 2023 [2]. As of 2021, Board Game Geek listed 1,288 pages of board games, with 100 games per page [3]. With so many board games now available, selecting one to purchase can be difficult.

<h3><b><i>Goal</i></b></h3>
The goal of this case study is to recommend board games based on specified criteria. For example, a customer wants to play the top-rated board game; which game meets this criterion? You will use attributes such as the rank of the game, the average rating of the game, the number of user ratings, and the minimum and maximum number of players to recommend a game to purchase.

As you work through this case study, you will practice concepts you have learned in DataCamp's <a href="https://learn.datacamp.com/courses/intro-to-python-for-data-science">Introduction to Python</a> and <a href="https://learn.datacamp.com/courses/data-science-for-everyone">Data Science for Everyone</a> courses, such as: storing data as <b><i>Variables</i></b>, working with data structures such as a <b><i>List</i></b>, and importing various <b><i>Python Libraries</i></b>.

![img_1.jpg](attachment:img_1.jpg)

<div style="text-align: center">[4]</div>



<h3><b><i>Data</i></b></h3>
A subset of the <b><i>"20,000 Boardgames Dataset"</i></b> [5] hosted on Kaggle is used in this case study. The <b><i>20,000 Boardgames Dataset</i></b> is based on rankings scraped from boardgamegeek.com on January 13th, 2020 and is released under the CC0 license. To view the full dataset, <a href = "https://www.kaggle.com/extralime/20000-boardgames-dataset">click here</a>.

You will work with the following modified subset of data in the <b><i>Games table </i></b> below. 

<table style="width:75%">
    <tr>
        <th>Id</th>
        <th>Name</th>
        <th>Rank</th>
        <th>Minimum Players</th>
        <th>Maximum Players</th>
        <th>Average Rating</th>
    </tr>
    <tr>
        <td>174430</td>
        <td>Gloomhaven</td>
        <td>1</td>
        <td>1</td>
        <td>4</td>
        <td>8.85</td>
    </tr>    
     <tr>
         <td>167791</td>
        <td>Terraforming Mars</td>
        <td>3</td>
         <td>1</td>
         <td>4</td>
        <td>8.42</td>    
    </tr>
    <tr>
        <td>220308</td>
        <td>Gaia Project</td>
        <td>9</td>
        <td>1</td>
        <td>4</td>
        <td>8.51</td>
    </tr>
     <tr>
        <td>266192</td>
        <td>Wingspan</td>
        <td>27</td>
        <td>1</td>
        <td>5</td>
        <td>8.10</td>    
    </tr>
    <tr>
        <td>237182</td>
        <td>Root</td>
        <td>39</td>
        <td>2</td>
        <td>4</td>
        <td>8.06</td>  
    </tr>
    
</table>

In addition to these 5 games, you will also work with data from the top 20 ranked board games from this dataset.


Sources: 
<ol style = "font-size:8px">
    <li>Shear Fischgrund, Margo. 2020. "The Rise of Board Games in Today’s Tech-dominated Culture". Pittwire. University of Pittsburgh. https://www.pittwire.pitt.edu/news/rise-board-games-today-s-tech-dominated-culture</li>
    <li>Statista. 2021. "Global board games market value from 2017 to 2023". Consumer Goods and FMCG > Toys. https://www.statista.com/statistics/829285/global-board-games-market-value/</li>
    <li>Board Game Geek. 2021. "Browse". https://boardgamegeek.com/browse/boardgame/</li>
    <li>Pxfuel. 2021. Image. https://www.pxfuel.com/en/search?q=board+game</li>
    <li>Lime and ZakWhite. 2021. 20,000 Board Games Dataset: January 13th, 2020. Kaggle. https://www.kaggle.com/extralime/20000-boardgames-dataset</li>
</ol>



<h3>Python Basics</h3>
In the code blocks below you will practice creating <b><i>variables</i></b> that store different values, examine data types such as <span class="str">String</span>, <span class="num">Integer and Float</span>, and work with the built-in functions <span class="bltn">print() and type()</span>.

<h4><u>Creating Variables</u></h4>
Create a variable named <span class="num">id_1</span> and save the <i>"Id"</i> value for the first game in the <b><i>Games table </i></b>. 

In [38]:
# Create a variable named id_1 and save the "Id" value for the first game in the table. 
id_1 = 174430

Create a variable named <span style = "color:#BA2121"><b><i>name_1</i></b></span> and save the <i>"Name"</i> value for the first game in the <b><i>Games table </i></b>. 

In [39]:
# Create a variable named name_1 and store the "Name" of the first game in the table
name_1 = 'Gloomhaven'

Create a variable named <span class="num">rank_1</span> and save the <i>"Rank"</i> value for the first game in the <b><i>Games table</i></b>.

In [40]:
# Create a variable rank_1 and store the "Rank" of the first game in the table
rank_1 = 1

Create a variable named <span class="num">avg_rating_1</span> and save the <i>"Average Rating"</i> value for the first game in the <b><i>Games table</i></b>.

In [41]:
# Create a variable named avg_rating_1 and store the average rating of the first game in the table
avg_rating_1 = 8.85

<h4><u>Examining Variables using Built-In Functions</u></h4>

<span class="bltn">Print</span> the data type of the <span class="num">id_1</span> variable using the <span class="bltn">type</span> function.


In [42]:
# Printing the data type of id_1

print(type(id_1))


<class 'int'>


<span style = "color:#008000"><b>Print</b></span> the value stored in the <span style = "color:#BA2121"><b><i>name_1</i></b></span> variable using the <span style = "color:#008000"><b>print</b></span> function.



In [43]:
# Print name_1
print(name_1)

Gloomhaven


<span style = "color:#008000"><b>Print</b></span> the data type of the <span style = "color:#BA2121"><b><i>name_1</i></b></span> variable using the <span style = "color:#008000"><b>type</b></span> function.

In [44]:
# Print the type of name_1
print(type(name_1))

<class 'str'>


 <span class="bltn">Print</span> the data type of the <span class="num">avg_rating_1</span> variable using the <span class="bltn">type</span> function.

In [45]:
# Print the avg_rating_1 data type
print(type(avg_rating_1))

<class 'float'>


 <span class="bltn">Print</span> the following sentence: <span class="str">"Id_1 is equal to: "</span> <span class="num">id_1</span>, where <span class="num">id_1</span> is replaced by the value stored in that variable.

In [46]:
# Print the following sentence: Id_1 is equal to: id_1.
print('Id_1 is equal to: ' + str(id_1))

Id_1 is equal to: 174430
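
The <span class="bltn">str</span> call above is needed because Python does not concatenate a string and an integer directly. As a minimal sketch, here are two equivalent ways to build the same sentence. The f-string form assumes Python 3.6 or later, and the names `sentence_concat` and `sentence_fstring` are introduced here only for illustration.

```python
# Value copied from the Games table above.
id_1 = 174430

# Option 1: convert the integer to a string, then concatenate.
sentence_concat = 'Id_1 is equal to: ' + str(id_1)

# Option 2: an f-string formats the value inline (Python 3.6+).
sentence_fstring = f'Id_1 is equal to: {id_1}'

print(sentence_concat)
print(sentence_fstring)
print(sentence_concat == sentence_fstring)  # True
```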


<h3>Python Lists</h3>
In the code blocks below you will practice storing, accessing, and manipulating data in <b><i>lists</i></b>. 

<h4><u>Creating a List</u></h4>
Create a variable <b><i>avg_ratings</i></b> that stores a <b><i>list</i></b> containing the first four values in the <i>"Average Rating" </i> column. Store the string <span class="str">"Average Rating" </span> in <b>position 0</b> of the list. 

In [47]:
# Create a variable avg_ratings that stores the average rating for the first four rows of the table above. 
# Include the string "Average Rating" as the first element in this list. 
avg_ratings = ['Average Rating', 8.85, 8.42, 8.51, 8.10]
print(avg_ratings)

['Average Rating', 8.85, 8.42, 8.51, 8.1]


<h4><u>Working with Elements of a List</u></h4>


Create a string containing the following phrase: <span class="str">"The number of elements is: " </span>
and append to the end of it the length of the <b><i>avg_ratings</i></b> list using the <span class="bltn">len and str</span> functions. <span class="bltn">Print</span> the result.

In [48]:
# Print the number of elements in the avg_ratings list. Use the following format: The number of elements is: ______. 
print('The number of elements is: ' + str(len(avg_ratings)))

The number of elements is: 5


Using the <span class="bltn">type</span> function <span class="bltn">print</span> the type of the <b><i> avg_ratings</i></b> variable.

In [49]:
# Print the data type of the avg_ratings variable
print(type(avg_ratings))

<class 'list'>


Using the <span class="bltn">type</span> function <span class="bltn">print</span> the type of the element at <b>position 0</b> in the <b><i> avg_ratings</i></b> variable.

In [50]:
# Print the data type of the first element in the avg_ratings list.
print(type(avg_ratings[0]))

<class 'str'>


Using the <span class="bltn">append</span> method, add the value <span class="num">8.06</span> to the end of the <b><i> avg_ratings</i></b> list.

In [51]:
# Append the average rating for the last game in the table to the avg_ratings list. 
avg_ratings.append(8.06)

<span class="bltn">Print</span> the values stored in the <b><i>avg_ratings</i></b> list.

In [52]:
# Print the avg_ratings list
print(avg_ratings)

['Average Rating', 8.85, 8.42, 8.51, 8.1, 8.06]


Modify the <b><i>avg_ratings</i></b> list so that only <span class="num">floating point </span>values remain. Hint: Remove the value <span class="str">"Average Rating"</span>.

In [53]:
# Modify the avg_ratings list so that only the floating point values remain. 
# Hint: Remove "Average Rating". 
avg_ratings.remove('Average Rating')

<span class="bltn">Print</span> the values stored in the updated <b><i>avg_ratings</i></b> list.

In [54]:
# Print the average ratings list
print(avg_ratings)

[8.85, 8.42, 8.51, 8.1, 8.06]
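
Both <span class="bltn">append</span> and <span class="bltn">remove</span> change the list in place rather than returning a new list. The sketch below replays the steps above on a fresh copy of the list and then checks that only floats remain. The membership guard and the name `all_floats` are illustrative additions, not part of the exercise.

```python
# Rebuild the list as it was first created, label included.
avg_ratings = ['Average Rating', 8.85, 8.42, 8.51, 8.10]

# append() adds one element to the end of the list, in place.
avg_ratings.append(8.06)

# remove() deletes the first element equal to its argument, in place.
avg_ratings.remove('Average Rating')

# Calling remove() again would raise ValueError because the label is gone,
# so guard a repeated removal with a membership test.
if 'Average Rating' in avg_ratings:
    avg_ratings.remove('Average Rating')

# Confirm that every remaining element is a float.
all_floats = all(isinstance(value, float) for value in avg_ratings)

print(avg_ratings)  # [8.85, 8.42, 8.51, 8.1, 8.06]
print(all_floats)   # True
```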


<span class="bltn">Print</span> the <b>third element</b> in the <b><i>avg_ratings</i></b> list. Hint: Remember that Python lists begin with index 0. 

In [55]:
# Print the third element in the avg_ratings list. Hint: Remember Python lists begin with index 0. 
print(avg_ratings[2])

8.51
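
Because indexing starts at 0, the element at position <i>n</i> (counting from one) lives at index <i>n</i> - 1. A short sketch using the same five ratings; the negative index is shown as an extra that the exercise does not require.

```python
# The five average ratings from the Games table.
avg_ratings = [8.85, 8.42, 8.51, 8.10, 8.06]

first = avg_ratings[0]   # index 0 is the first element
third = avg_ratings[2]   # index 2 is the third element
last = avg_ratings[-1]   # negative indexes count back from the end

print(first, third, last)  # 8.85 8.51 8.06
```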


<h3>Functions and Packages</h3>
In the code blocks below, you will practice using <b><i>functions</i></b> and <b><i>packages</i></b>. 

<h4><u>Creating Additional Variables</u></h4>
Run the cell immediately below to initialize the variables needed for the next part of Exercise 1. 

In [56]:
# First run the code block below to store these variables in memory. 
# Each variable represents an attribute or column in the table. 
ids = [174430, 167791, 220308, 266192, 237182]
names = ["Gloomhaven", "Terraforming Mars", "Gaia Project", "Wingspan", "Root"]
ranks = [1, 3, 9, 27, 39]
min_players = [1, 1, 1, 1, 2]
max_players = [4, 4, 4, 5, 4]

<h4><u>Importing Libraries</u></h4>
<span class = "bltn">Import</span> the <b>statistics</b> and <b>math</b> modules.

In [57]:
# Import Python's built in Statistics Module
import statistics

# Import Python's math module
import math

<h4><u>More Functions and Methods</u></h4>
Use the <b>mean</b> function in the statistics module to <span class="bltn">print</span> the mean of the <b><i>avg_ratings</i></b> list. 

In [58]:
# Print the mean of the average game ratings from the table above. 
print(statistics.mean(avg_ratings))

8.388

The <b><i>avg_ratings</i></b> list holds <span class="num">floating point</span> values. First, sort the values in descending order, then select the value at <b>position 0</b>. This value represents the highest rating. Next, use the <b>ceil</b> function in the math module to take the ceiling of the highest rating value and <span class="bltn">print </span>the result. 

In [59]:
# Take the ceiling of the highest rating. Print the result. 
print(math.ceil(sorted(avg_ratings, reverse=True)[0]))

9
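
Sorting in descending order and taking position 0 is one way to find the highest rating; the built-in <span class="bltn">max</span> function gives the same value without sorting. A minimal sketch comparing the two, where `highest_sorted` and `highest_max` are names chosen here for illustration:

```python
import math

# The five average ratings from the Games table.
avg_ratings = [8.85, 8.42, 8.51, 8.10, 8.06]

# Approach used above: sort descending, then take the first element.
# sorted() returns a new list, so avg_ratings itself is left unchanged.
highest_sorted = sorted(avg_ratings, reverse=True)[0]

# Alternative: ask for the maximum directly.
highest_max = max(avg_ratings)

print(highest_sorted == highest_max)  # True
print(math.ceil(highest_max))         # 9
```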


<span class="bltn">Print </span> the number of games that can be played with a single player. Use the built-in method <span class="bltn">count </span> and pass <span class="num">1 </span> as the argument. 

In [60]:
# Print the number of games whose minimum number of players is 1
print(min_players.count(1))

4


<span class="bltn">Print </span> the number of games that allow a maximum of five people to play. Use the built-in method <span class="bltn">count </span> and pass <span class="num">5 </span> as the argument. 

In [61]:
# Print the number of games that allow 5 people to play. 
print(max_players.count(5))

1
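
The <span class="bltn">count</span> method only counts exact matches. For a condition such as "at least four players", one possible approach is to count with a generator expression instead. The sketch below uses the `max_players` list defined earlier; `exactly_five` and `at_least_four` are illustrative names.

```python
# Maximum players for the five games in the Games table.
max_players = [4, 4, 4, 5, 4]

# count() tallies elements equal to the argument.
exactly_five = max_players.count(5)

# For any other condition, add 1 for each element that satisfies it.
at_least_four = sum(1 for players in max_players if players >= 4)

print(exactly_five)   # 1
print(at_least_four)  # 5
```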


Use the <b><i>ranks</i></b> list and the built-in method <span class="bltn">index </span> to get the index of the game with the rank of 9 and store it in the variable <span class="num">ninth_rank</span>. Use the index in <span class="num">ninth_rank</span> to access the value stored in <b><i>names</i></b> to <span class="bltn">print</span> the name of the game with the ninth ranking.

In [62]:
# Find the index of the game with rank 9 and use it to print its name. 
ninth_rank = ranks.index(9)

# Print the name of the 9th ranked game. 
print(names[ninth_rank])

Gaia Project
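
The lookup above works because <b><i>ranks</i></b> and <b><i>names</i></b> are parallel lists: the same index refers to the same game in both. As a sketch of an alternative, the two lists can be paired with <span class="bltn">zip</span> and turned into a dictionary keyed by rank. The name `rank_to_name` is hypothetical, and this assumes each rank appears only once.

```python
# Parallel lists: position i in each list describes the same game.
names = ["Gloomhaven", "Terraforming Mars", "Gaia Project", "Wingspan", "Root"]
ranks = [1, 3, 9, 27, 39]

# Approach used above: find the position of rank 9, then reuse that position.
ninth_rank = ranks.index(9)
name_by_index = names[ninth_rank]

# Alternative: pair each rank with its name and look the rank up directly.
rank_to_name = dict(zip(ranks, names))
name_by_dict = rank_to_name[9]

print(name_by_index)  # Gaia Project
print(name_by_dict)   # Gaia Project
```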


<h3> Use Case </h3>
The code blocks below show how the full dataset can be used with built-in Python modules to answer questions about the data. This code mostly uses modules from Python's standard library. In later exercises, we will use add-on packages such as Pandas instead. The following questions will be answered:

<ul>
    <li>For the top 20 games, which game had the most users rate it? What is the average rating of this game and what is its rank?</li>
    <li>For the top 20 games, which games can be played by more than 4 players at once? What is the maximum number of players for these games?</li>
</ul>

<h4><u>Reading in the Data</u></h4>

In [63]:
# Import Python's built-in csv module, which is used below to read the csv file
import csv

# Import NumPy
import numpy as np

game_data = []

# Read in csv file to a list.
with open("Data/board_games_20.csv", newline='') as csvfile:
    game_reader = csv.reader(csvfile, delimiter=',', quotechar='|')
    for row in game_reader:
        game_data.append(list(row))
        
# Remove attribute names from the data
game_data = game_data[1:]

print(len(game_data))

# Restrict the data to only keep columns relevant to answering the use case questions. 
# In this case the attributes are: objectid, name, rank, minplayers, maxplayers, usersrated, and average. 
selected_games = []
for game in game_data:
    # Column positions: 0 objectid, 1 name, 3 rank, 4 minplayers, 5 maxplayers, 14 usersrated, 15 average
    game_info = [int(game[0]), game[1], int(game[3]), int(game[4]), int(game[5]),
                 int(game[14]), float(game[15])]
    selected_games.append(game_info)



20
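
The cell above selects columns by position, which is compact but easy to get wrong. When a CSV file has a header row, <b>csv.DictReader</b> allows columns to be selected by name instead. The sketch below is illustrative only: it reads a small in-memory sample rather than the real file, the sample values come from the Games table, and the header names are assumed to match the attribute names listed in the comment above.

```python
import csv
import io

# Hypothetical in-memory sample standing in for the CSV file.
# Header names are assumed; values are taken from the Games table.
sample_csv = io.StringIO(
    "objectid,name,rank,minplayers,maxplayers,average\n"
    "174430,Gloomhaven,1,1,4,8.85\n"
    "220308,Gaia Project,9,1,4,8.51\n"
    "237182,Root,39,2,4,8.06\n"
)

games = []
# DictReader uses the first row as keys, so each row is a dict of strings.
for row in csv.DictReader(sample_csv):
    games.append({
        "objectid": int(row["objectid"]),
        "name": row["name"],
        "rank": int(row["rank"]),
        "minplayers": int(row["minplayers"]),
        "maxplayers": int(row["maxplayers"]),
        "average": float(row["average"]),
    })

print(len(games))                          # 3
print(games[1]["name"], games[1]["rank"])  # Gaia Project 9
```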


<h4><u>Question 1</u></h4>
Which game had the most users rate it? What is the average rating of this game and what is its rank? Convert the usersrated values to a <b><i>numpy array </i></b>. 

In [64]:
# Question 1: Which game had the most users rate it? What is the average rating of this game and what is its rank?
# Convert the usersrated values to a numpy array
user_ratings = np.asarray([game[5] for game in selected_games])

# Print the array's data type
print(user_ratings.dtype)

# Store the index of the most rated game 
most_rated_idx = user_ratings.argmax()

# Print the name of the game that had the most user ratings
print("Game with the most user ratings is : " + selected_games[most_rated_idx][1])
print("Number of user ratings: " + str(selected_games[most_rated_idx][5]))
print("Average rating: " + str(selected_games[most_rated_idx][6]))
print("Rank is: " + str(selected_games[most_rated_idx][2]))


int32
Game with the most user ratings is : Terraforming Mars
Number of user ratings: 48339
Average rating: 8.42299
Rank is: 3
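
The <b>argmax</b> method returns the position of the largest value, which is then used to index back into the list of games. For comparison, here is a sketch of the same lookup in plain Python. The rows in `toy_games` are made up for illustration and only mimic the column layout of `selected_games`.

```python
# Hypothetical rows in the same layout as selected_games:
# [objectid, name, rank, minplayers, maxplayers, usersrated, average]
toy_games = [
    [1, "Game A", 1, 1, 4, 300, 8.9],
    [2, "Game B", 2, 2, 5, 950, 8.4],
    [3, "Game C", 3, 1, 6, 120, 8.1],
]

# Pull out the usersrated column (position 5).
user_ratings = [game[5] for game in toy_games]

# Position of the largest value: a plain-Python stand-in for argmax().
most_rated_idx = max(range(len(user_ratings)), key=lambda i: user_ratings[i])

# Or skip the index and ask for the whole row with the largest usersrated value.
most_rated_game = max(toy_games, key=lambda game: game[5])

print(most_rated_idx)                # 1
print(toy_games[most_rated_idx][1])  # Game B
print(most_rated_game[1])            # Game B
```

If two games were tied for the most ratings, both versions would return the first one in the list.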


<h4><u>Question 2</u></h4>
Which games can be played by more than 4 players at once? 


In [65]:
# Question 2: Which games can be played by more than 4 players at once? 
# Print the names and maximum number of players for games in which more than 4 players can play at one time. 
print([[game[1],game[4]] for game in selected_games if game[4] > 4])

[['Terraforming Mars', 5], ['Twilight Imperium (Fourth Edition)', 6], ['Scythe', 5], ['Terra Mystica', 5], ['Concordia', 5], ['Viticulture Essential Edition', 6]]


<h3 class = "yt">Your Turn</h3>
Using the code above as a hint, answer the questions below. Make sure to post the code for each question in its own cell. 

<ol>
    <li>Which board games can be played by a single player? Print the names of the games that match this criterion. </li>
    <li>What is the total number of single player board games?</li>
    <li>What is the mean of the average ratings for all board games?</li>
    <li>Which board games have an average user rating higher than the mean average rating for the dataset?</li>
</ol>
    

<h3 class = "yt">Your Turn 1:  </h3>
Which board games can be played by a single player? Print the names of the games that match this criterion.

In [66]:
# Answer to Your Turn 1 
print([[game[1],game[3]] for game in selected_games if game[3] <= 1])

[['Gloomhaven', 1], ['Terraforming Mars', 1], ['Gaia Project', 1], ['Scythe', 1], ['Spirit Island', 1], ['The 7th Continent', 1], ['Viticulture Essential Edition', 1]]


<h3 class = "yt">Your Turn 2: </h3>
What is the total number of single player board games?

In [95]:
# Answer to Your Turn 2
count = 0
for game in selected_games:
    if game[3] == 1:
        count += 1
print('There are a total of', count,'single player board games.')

There are a total of 7 single player board games.


<h3 class = "yt">Your Turn 3: </h3>
What is the mean of the average ratings for all board games?

In [86]:
# Answer to Your Turn 3
# Use the name total rather than sum so the built-in sum() function is not shadowed.
total = 0
for game in selected_games:
    total += game[6]
avg = total / len(selected_games)
print('The average rating of all board games is', avg,'.')

The average rating of all board games is 8.3706215 .
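
The loop above accumulates a running total and divides by the number of games. The <b>statistics</b> module imported earlier can compute the same quantity in one call. A minimal sketch on the five ratings from the Games table (not the full 20-game dataset); the two results are compared with <b>math.isclose</b> because floating point rounding can differ slightly between the two methods.

```python
import math
import statistics

# The five average ratings from the Games table.
avg_ratings = [8.85, 8.42, 8.51, 8.10, 8.06]

# Manual approach: accumulate a total, then divide by the count.
total = 0
for rating in avg_ratings:
    total += rating
manual_mean = total / len(avg_ratings)

# Library approach: one call to statistics.mean().
library_mean = statistics.mean(avg_ratings)

# Compare with a tolerance rather than == to allow for rounding differences.
print(math.isclose(manual_mean, library_mean))  # True
```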


<h3 class = "yt">Your Turn 4: </h3>
Which board games have an average user rating higher than the mean average rating for the dataset?

In [87]:
# Answers to Your Turn 4
print([[game[1],game[6]] for game in selected_games if game[6] > avg])

[['Gloomhaven', 8.85292], ['Pandemic Legacy  Season 1', 8.62499], ['Terraforming Mars', 8.42299], ['Through the Ages  A New Story of Civilization', 8.49419], ['Brass  Birmingham', 8.62031], ['Twilight Imperium (Fourth Edition)', 8.68965], ['Star Wars  Rebellion', 8.42602], ['Gaia Project', 8.50686], ['War of the Ring (Second Edition)', 8.44938]]
