## BGG Data Cleaning

### Imports

In [5]:
import numpy as np
import pandas as pd

pd.set_option('display.max_columns', None)
pd.set_option('display.max_colwidth', None)
pd.set_option('display.max_rows', 100)

from IPython.display import Image, display, Markdown

import nltk
from nltk.tokenize import word_tokenize

### Functions

In [158]:
# Function to show the dimensions of the input dataframe, plus the data type, zero count and null count (with percentages) of each column
def check_df(df):
    # Display the dimensions of the DataFrame
    display(Markdown("#### DataFrame dimensions"))
    display(df.shape)

    # Add a new line
    display(Markdown("<br>"))

    # Display data types of all columns
    display(Markdown("#### Data Types, zeros and nulls"))
    display(pd.DataFrame({
        "Data type": df.dtypes,
        "Zero counts": (df == 0).sum(),
        "Zero count %": (((df == 0).sum()/df.count())* 100).round(2),
        "Null counts": df.isnull().sum(),
        "Null count %": ((df.isnull().sum())/(df.count()+df.isnull().sum())* 100).round(2)
    }))

# Function to find all games in a dataframe which contain a term, and then take the average of the numerical values of a selected column for all returned games
def search_and_average(df, search_column, search_term, avg_column):
    # Filter the rows where the search term is found in the search column
    filtered_df = df[df[search_column].astype(str).str.contains(search_term, case=False, na=False)]
    
    # Exclude rows where the avg_column contains zero
    filtered_df = filtered_df[filtered_df[avg_column] != 0]
    
    # Calculate the mean of avg_column for the filtered rows
    avg_value = filtered_df[avg_column].mean()
    
    return avg_value
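
As a quick illustration of how `search_and_average` behaves, the sketch below repeats the function so that it runs on its own and applies it to a small made-up dataframe. The names `toy_df` and `series_avg` and all of the values are hypothetical and are not taken from the BGG data; the point is only to show that rows with a zero in the averaged column are left out of the mean.

```python
import pandas as pd


def search_and_average(df, search_column, search_term, avg_column):
    # Same logic as the notebook function, repeated so this cell runs on its own
    filtered_df = df[df[search_column].astype(str).str.contains(search_term, case=False, na=False)]
    filtered_df = filtered_df[filtered_df[avg_column] != 0]
    return filtered_df[avg_column].mean()


# Hypothetical toy data: three games from one series and one unrelated game
toy_df = pd.DataFrame({
    "name": ["Unlock!: A", "Unlock!: B", "Unlock!: C", "Other Game"],
    "average complexity": [2.0, 2.5, 0.0, 4.0],
})

# "Unlock!: C" has a zero (missing) complexity, so it is ignored in the mean
series_avg = search_and_average(toy_df, "name", "Unlock!:", "average complexity")
print(round(series_avg, 2))  # 2.25
```

Note that `str.contains` treats the search term as a regular expression by default, so a term containing characters such as `.` or `(` may match more than intended.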

### Creating dataframe copy for cleaning

In [154]:
# Creating a copy of saved_df to safeguard against data loss from any mistakes whilst cleaning the data
clean_df = saved_df.copy()

In [166]:
# Checking the null & zero values
check_df(clean_df)

#### DataFrame dimensions

(4981, 19)

<br>

#### Data Types, zeros and nulls

Unnamed: 0,Data type,Zero counts,Zero count %,Null counts,Null count %
id,int64,0,0.0,0,0.0
name,object,0,0.0,0,0.0
average,float64,0,0.0,0,0.0
usersrated,int64,0,0.0,0,0.0
number of comments,int64,0,0.0,0,0.0
complexity votes,int64,4,0.08,0,0.0
average complexity,float64,4,0.08,0,0.0
year published,int64,6,0.12,0,0.0
min player number,int64,4,0.08,0,0.0
max player number,int64,8,0.16,0,0.0


In [38]:
# Removing all games with null values in the category, mechanism or game designer columns
clean_df.dropna(subset=['category','mechanism','game designer'], inplace=True)

# Checking games with null values in these columns have been removed
null_count(clean_df)

Unnamed: 0,null counts,% of nulls
id,0,0.0
name,0,0.0
average,0,0.0
usersrated,0,0.0
number of comments,0,0.0
complexity votes,0,0.0
average complexity,0,0.0
year published,0,0.0
min player number,0,0.0
max player number,0,0.0


### Updating complexity values

Complexity votes and average complexity

Four games have no complexity votes and therefore no average complexity; these values are treated as missing at random.
To mitigate this, each game is given the average complexity of other entries in the same game series (calculated values listed below).

- 331953 Unlock!: Timeless Adventures – Verloren im Zeitstrudel! - 2.14
- 327913 Unlock!: Timeless Adventures – Arsène Lupin und der große weiße Diamant - 2.14
- 202096 Marvel Dice Masters: Iron Man and War Machine Starter Set - 2.45
- 347747 Mythic Mischief: Headmaster's Box - 2.89

In [161]:
# Calculating average values based on other games which match with a specific search term, excluding games with zero values
game_input = "Unlock!:"

# Average complexity from other similar games
round(search_and_average(clean_df,"name",game_input,"average complexity"),2)

2.14

In [310]:
# Updating games in clean_df with no complexity rating, using average of other similar games

clean_df.loc[clean_df['id']==331953, ['average complexity']] = 2.14
clean_df.loc[clean_df['id']==327913, ['average complexity']] = 2.14
clean_df.loc[clean_df['id']==202096, ['average complexity']] = 2.45
clean_df.loc[clean_df['id']==347747 , ['average complexity']] = 2.89
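
The same kind of update can also be driven from a single mapping of id to revised value, which keeps the list of corrections in one place. The sketch below is one possible approach rather than the method used in this notebook; it works on a small hypothetical dataframe (`toy_df`) and the name `complexity_updates` is an assumption made for illustration.

```python
import pandas as pd

# Hypothetical toy data: three games with a missing (zero) complexity and one untouched game
toy_df = pd.DataFrame({
    "id": [331953, 327913, 202096, 1],
    "average complexity": [0.0, 0.0, 0.0, 3.1],
})

# Mapping of game id -> revised average complexity (values as quoted in the list above)
complexity_updates = {331953: 2.14, 327913: 2.14, 202096: 2.45}

# map() returns NaN for ids that are not in the mapping,
# so fillna() falls back to the existing value for those rows
new_values = toy_df["id"].map(complexity_updates)
toy_df["average complexity"] = new_values.fillna(toy_df["average complexity"])

print(toy_df)
```

Rows whose id is not in the mapping keep their existing value, so the mapping only needs to list the games being corrected.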

In [195]:
# Checking average complexity values have been updated
clean_df.loc[clean_df['id'].isin([331953,327913,202096,347747]),['id','name','average complexity']]

Unnamed: 0,id,name,average complexity
4944,347747,Mythic Mischief: Headmaster's Box,2.89
4483,202096,Marvel Dice Masters: Iron Man and War Machine Starter Set,2.45
3805,331953,Unlock!: Timeless Adventures – Verloren im Zeitstrudel!,2.14
4049,327913,Unlock!: Timeless Adventures – Arsène Lupin und der große weiße Diamant,2.14


### Updating player number values

Min & max player numbers missing  

Revised values are quoted below (min - max), first based on BGG or publisher website, then game box (if shown), then based on BGG community:
- 37301 Decktet 1 - 1 (website)
- 248641 Monsterpocalypse Miniatures Game 2 - 2 (website)
- 1585 Burma: The Campaign in Northern Burma, 1944 1 - 3 (website)
- 62214 Aspern-Essling 1809 1 - 2 (community)
- 150012 No Retreat!: Polish & French Fronts 2 - 2 (community)
- 135796 Next War: Taiwan 1 - 2 (community)
- 85204 Kings of War 2 - 2 (website)
- 170669 Old School Tactical: Volume 1 - Fighting on the Eastern Front 1941/42 1 - 2 (community)
- 163097 Beyond the Rhine: The Campaign for Northwest Europe 1 - 3 (website)
- 278373 Twisty Little Passages 1 - 1 (community)

In [326]:
# Saving the IDs of the games with a zero min or max player number
zero_min_max_players_ids = clean_df.loc[(clean_df['max player number']==0)|(clean_df['min player number']==0),'id'].tolist()

In [314]:
# Updating games in clean_df with missing min and max player numbers
clean_df.loc[clean_df['id']==37301, ['min player number','max player number']] = [1,1]
clean_df.loc[clean_df['id']==248641, ['min player number','max player number']] = [2,2]
clean_df.loc[clean_df['id']==1585, ['min player number','max player number']] = [1,3]
clean_df.loc[clean_df['id']==62214, ['min player number','max player number']] = [1,2]
clean_df.loc[clean_df['id']==150012, ['min player number','max player number']] = [2,2]
clean_df.loc[clean_df['id']==135796, ['min player number','max player number']] = [1,2]
clean_df.loc[clean_df['id']==85204, ['min player number','max player number']] = [2,2]
clean_df.loc[clean_df['id']==170669, ['min player number','max player number']] = [1,2]
clean_df.loc[clean_df['id']==163097, ['min player number','max player number']] = [1,3]
clean_df.loc[clean_df['id']==278373, ['min player number','max player number']] = [1,1]

In [328]:
# Checking min and max player values have been updated
clean_df.loc[clean_df['id'].isin(zero_min_max_players_ids),['id','name','min player number','max player number']]

Unnamed: 0,id,name,min player number,max player number


Extreme Player numbers (greater than 12 players)

Revised values are quoted below (max), first based on BGG or publisher website, then game box (if shown), then based on BGG community:

- 60815 Black Powder: Second Edition - 6 (community)
- 298586 Demeter - 6 (community)
- 360471 Aquamarine - 8 (community)
- 319604 Ricochet: A la poursuite du Comte courant - 3 (community)
- 330664 Varuna - 4 (community)
- 390903 Villagers of the Oak Dell - 4 (community)
- 322045 Cartographers Heroes: Collector's Edition - 100 (website)
- 362121 Sunshine City - 100 (website)
- 388329 Waypoints - 100 (website)
- 263918 Cartographers - 100 (website)
- 233867 Welcome To... - 100 (website)
- 315767 Cartographers Heroes - 100 (website)
- 350736 Voyages - 100 (website)
- 308565 Roll n Cook - 99 (website)
- 245389 Word Slam Family - 99 (website)
- 208754 Wings of Glory: Tripods & Triplanes - 99 (website)
- 302344 Scooby-Doo: Escape from the Haunted Mansion - 5 (community)
- 236370 Blood Red Skies: Battle of Britain - 6 (community)
- 317434 Exit: The Game – Advent Calendar: The Mystery of the Ice Cave - 3 (community)
- 378833 Happy Campers - 4 (community)
- 343322 Exit: The Game – Advent Calendar: The Hunt for the Golden Book - 3 (community)
- 51 Ricochet Robots - 2 (community)
- 30618 Eat Poop You Cat - 13 (community)
- 139771 Star Trek: Attack Wing - 6 (community)
- 286428 Wits & Wagers: It's Vegas, Baby! - 21 (community)
- 186279 Finska Mini - 6 (community)
- 155689 Dungeons & Dragons: Attack Wing - 4 (community)
- 4985 Warmaster - 3 (community)
- 233194 Banned Words - 8 (community)
- 122913 Samurai Battles - 2 (community)
- 216710 Wings of Glory: WW2 Battle of Britain Starter Set - 4 (website)
- 304620 World of Tanks Miniatures Game - 4 (community)
- 210295 Lightseekers - 5 (community)
- 18615 Warmaster Ancients - 10 (community)
- 297234 Taskmaster: The Board Game - 6 (community)
- 362164 Pioneer Rails - 80 (website)
- 152242 Ultimate Werewolf: Deluxe Edition - 75 (website)
- 344415 Trek 12: Amazonia - 50 (box)
- 168680 The Werewolves of Miller's Hollow: The Pact - 18 (box)
- 59335 Wherewolf - 24 (community)
- 22303 Celebrities - 12 (community)
- 3553 Close Action: The Age of Fighting Sail Vol. 1 - 20 (website)
- 235113 30 Seconds: Everyday Life - 12 (community)
- 236475 The Werewolves of Miller's Hollow: Best Of - 18 (box)
- 311920 Ultimate Werewolf: Extreme - 25 (website)
- 164236 Witch Hunt - 22 (website)
- 240980 Blood on the Clocktower - 20 (website)
- 283152 Monikers: Serious Nonsense - 16 (box)
- 10501 Canvas Eagles: War in the Skies 1914 - 1918 - 12 (community)
- 195709 Monikers: Something Something - 16 (box)
- 2470 The Extraordinary Adventures of Baron Munchausen - 10 (community)
- 130705 Super Big Boggle - 8 (community)
- 283151 Monikers: Classics - 16 (box)
- 179448 Monikers: Shmonikers - 16 (box)
- 255249 Monikers: More Monikers - 16 (box)
- 245422 Werewords Deluxe Edition - 20 (box)
- 221248 Monikers: The Shut Up & Sit Down Nonsense Box - 16 (box)
- 36553 Time's Up! Title Recall! - 12 (community)
- 180845 Wibbell++ - 10 (website)
- 184424 Mega Civilization - 18 (website)
- 1353 Time's Up! - 4 (box)
- 37141 Time's Up! Deluxe - 4 (box)
- 262341 TTMC: Tu te mets combien? - 16 (box)
- 206715 Ultimate Werewolf Legacy - 16 (community)
- 22 Magic Realm - 8 (community)
- 225167 Human Punishment: Social Deduction 2.0 - 16 (box)
- 383789 CDSK - 10 (community)
- 156546 Monikers - 16 (box)
- 23604 The World Cup Game - 16 (community)


In [318]:
# Saving the IDs of the games with a max player number greater than 12
extreme_max_player_ids = clean_df.loc[clean_df['max player number']>12,'id'].tolist()

In [320]:
# Updating games in clean_df with extreme max player numbers
clean_df.loc[clean_df['id']== 60815, ['max player number']] = 6
clean_df.loc[clean_df['id']== 298586, ['max player number']] = 6
clean_df.loc[clean_df['id']== 360471, ['max player number']] = 8
clean_df.loc[clean_df['id']== 319604 , ['max player number']] = 3
clean_df.loc[clean_df['id']== 330664, ['max player number']] = 4
clean_df.loc[clean_df['id']== 390903 , ['max player number']] = 4
clean_df.loc[clean_df['id']== 322045, ['max player number']] = 100
clean_df.loc[clean_df['id']== 362121, ['max player number']] = 100
clean_df.loc[clean_df['id']== 388329, ['max player number']] = 100
clean_df.loc[clean_df['id']== 263918, ['max player number']] = 100
clean_df.loc[clean_df['id']== 233867, ['max player number']] = 100
clean_df.loc[clean_df['id']== 315767, ['max player number']] = 100
clean_df.loc[clean_df['id']== 350736, ['max player number']] = 100
clean_df.loc[clean_df['id']== 308565, ['max player number']] = 99
clean_df.loc[clean_df['id']== 245389, ['max player number']] = 99
clean_df.loc[clean_df['id']== 208754, ['max player number']] = 99
clean_df.loc[clean_df['id']== 302344, ['max player number']] = 5
clean_df.loc[clean_df['id']== 236370, ['max player number']] = 6
clean_df.loc[clean_df['id']== 317434, ['max player number']] = 3
clean_df.loc[clean_df['id']== 378833, ['max player number']] = 4
clean_df.loc[clean_df['id']== 343322, ['max player number']] = 3
clean_df.loc[clean_df['id']== 51, ['max player number']] = 2
clean_df.loc[clean_df['id']== 30618, ['max player number']] = 13
clean_df.loc[clean_df['id']== 139771, ['max player number']] = 6
clean_df.loc[clean_df['id']== 286428, ['max player number']] = 21
clean_df.loc[clean_df['id']== 186279 , ['max player number']] = 6
clean_df.loc[clean_df['id']== 155689, ['max player number']] = 4
clean_df.loc[clean_df['id']== 4985, ['max player number']] = 3
clean_df.loc[clean_df['id']== 233194, ['max player number']] = 8
clean_df.loc[clean_df['id']== 122913, ['max player number']] = 2
clean_df.loc[clean_df['id']== 216710, ['max player number']] = 4
clean_df.loc[clean_df['id']== 304620, ['max player number']] = 4
clean_df.loc[clean_df['id']== 210295, ['max player number']] = 5
clean_df.loc[clean_df['id']== 18615, ['max player number']] = 10
clean_df.loc[clean_df['id']== 297234, ['max player number']] = 6
clean_df.loc[clean_df['id']== 362164, ['max player number']] = 80
clean_df.loc[clean_df['id']== 152242, ['max player number']] = 75
clean_df.loc[clean_df['id']== 344415, ['max player number']] = 50
clean_df.loc[clean_df['id']== 168680, ['max player number']] = 18
clean_df.loc[clean_df['id']== 59335, ['max player number']] = 24
clean_df.loc[clean_df['id']== 22303, ['max player number']] = 12
clean_df.loc[clean_df['id']== 3553, ['max player number']] = 20
clean_df.loc[clean_df['id']== 235113, ['max player number']] = 12
clean_df.loc[clean_df['id']== 236475, ['max player number']] = 18
clean_df.loc[clean_df['id']== 311920, ['max player number']] = 25
clean_df.loc[clean_df['id']== 164236, ['max player number']] = 22
clean_df.loc[clean_df['id']== 240980, ['max player number']] = 20
clean_df.loc[clean_df['id']== 283152, ['max player number']] = 16
clean_df.loc[clean_df['id']== 10501, ['max player number']] = 12
clean_df.loc[clean_df['id']== 195709, ['max player number']] = 16
clean_df.loc[clean_df['id']== 2470, ['max player number']] = 10
clean_df.loc[clean_df['id']== 130705, ['max player number']] = 8
clean_df.loc[clean_df['id']== 283151, ['max player number']] = 16
clean_df.loc[clean_df['id']== 179448, ['max player number']] = 16
clean_df.loc[clean_df['id']== 255249, ['max player number']] = 16
clean_df.loc[clean_df['id']== 245422, ['max player number']] = 20
clean_df.loc[clean_df['id']== 221248, ['max player number']] = 16
clean_df.loc[clean_df['id']== 36553, ['max player number']] = 12
clean_df.loc[clean_df['id']== 180845, ['max player number']] = 10
clean_df.loc[clean_df['id']== 184424, ['max player number']] = 18
clean_df.loc[clean_df['id']== 1353, ['max player number']] = 4
clean_df.loc[clean_df['id']== 37141, ['max player number']] = 4
clean_df.loc[clean_df['id']== 262341, ['max player number']] = 16
clean_df.loc[clean_df['id']== 206715, ['max player number']] = 16
clean_df.loc[clean_df['id']== 22, ['max player number']] = 8
clean_df.loc[clean_df['id']== 225167, ['max player number']] = 16
clean_df.loc[clean_df['id']== 383789, ['max player number']] = 10
clean_df.loc[clean_df['id']== 156546, ['max player number']] = 16
clean_df.loc[clean_df['id']== 23604, ['max player number']] = 16

In [322]:
# Checking extreme max player values have been updated
clean_df.loc[clean_df['id'].isin(extreme_max_player_ids),['id','name','max player number']]

Unnamed: 0,id,name,max player number
147,263918,Cartographers,100
179,233867,Welcome To...,100
828,51,Ricochet Robots,2
1380,22,Magic Realm,8
687,1353,Time's Up!,4
176,240980,Blood on the Clocktower,20
1563,139771,Star Trek: Attack Wing,6
1309,30618,Eat Poop You Cat,13
2315,23604,The World Cup Game,16
602,36553,Time's Up! Title Recall!,12


### Updating missing play time

Missing play time

Rules for revised play time values:
- Round values up to the nearest 5 minutes
- If there is no expected play time, take the average of the min and max play times
- If there is no max play time, use the min play time
- If the min, max and expected play times are all missing and other versions exist, take the average of the other editions/versions
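
The first three rules can be expressed as a small helper. This is a minimal sketch under stated assumptions, not code from the notebook: the function names `round_up_to_5` and `fill_play_times` are hypothetical, a value of zero is taken to mean "missing", and the min play time is assumed to be present. The fourth rule (averaging other editions) needs a manual lookup and is not covered.

```python
import math


def round_up_to_5(minutes):
    # Round a duration up to the nearest multiple of 5 minutes
    return int(math.ceil(minutes / 5) * 5)


def fill_play_times(min_time, max_time, expected_time):
    # Zero is treated as "missing"; min_time is assumed to be known
    if max_time == 0:
        # No max play time: use the min play time
        max_time = min_time
    if expected_time == 0:
        # No expected play time: average of the min and max play times
        expected_time = (min_time + max_time) / 2
    return tuple(round_up_to_5(t) for t in (min_time, max_time, expected_time))


print(fill_play_times(20, 360, 0))  # (20, 360, 190)
print(fill_play_times(60, 0, 0))    # (60, 60, 60)
print(fill_play_times(40, 55, 0))   # (40, 55, 50)
```

The max play time is filled before the expected play time so that a game with only a min play time ends up with the same value in all three fields.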

Exceptions:
- Chess - values based on estimates from chess.com
- Disney Lorcana - values based on the Wikipedia article
- Under the Lily Banners, 1914: Offensive à outrance and Iron Curtain: Central Europe, 1945-1989 - values based on similar war games
- Digimon Card Game, One Piece Card Game, The Genius Star - no evidence found, so an initial guess was taken (to be revised if played) - 30 / 60 / 45

In [124]:
# Calculating average values based on other games which match with a specific search term, excluding games with zero values
game_input = "Sherlock Aquelarre: "

# min play time average
display(search_and_average(clean_df,'name',game_input,'min play time'))

# max play time average
display(search_and_average(clean_df,'name',game_input,'max play time'))

# Average of the min and max play time averages
display((search_and_average(clean_df,'name',game_input,'max play time') + search_and_average(clean_df,'name',game_input,'min play time'))/2)

60.0

nan

nan

Revised playtime values (min / max / expected) in minutes:

- 171	Chess - 10 / 120 / 30
- 242705	Aeon Trespass: Odyssey - 90 / 90 / 90
- 313889	Hoplomachus: Victorum - 40 / 55 / 50
- 296108	Terraforming Mars: The Dice Game - 45 / 60 / 55
- 1540	BattleTech - 120 / 120 / 120
- 369646	Disney Lorcana - 30 / 45 / 40
- 357746	Disney Sorcerer's Arena: Epic Alliances Core Set - 35 / 35 / 35
- 347900	Tin Helm - 30 / 30 / 30
- 29285	Case Blue - 90 / 90 / 90
- 17651	Under the Lily Banners - 120 / 360 / 240
- 142889	Enemy Coast Ahead: Operation Chastise – The Dambuster Raid - 20 / 360 / 190
- 249750	Brazen Chariots: Battles for Tobruk, 1941 - 120 / 360 / 240
- 368036	Unlock!: Short Adventures – The Flight of the Angel - 50 / 65 / 60
- 367820	Dungeons & Dragons: The Yawning Portal - 65 / 85 / 75
- 308368	Digimon Card Game - 30 / 60 / 45
- 172844	Charms - 60 / 60 / 60
- 173574	1836Jr - 180 / 180 / 180
- 368040	Unlock!: Short Adventures – In Pursuit of Cabrakan - 50 / 65 / 60
- 394961	Penny Black - 20 / 20 / 20
- 362505	One Piece Card Game - 30 / 45 / 40
- 46669	1914: Offensive à outrance - 120 / 360 / 240
- 273655	Shadows of Brimstone: Gates of Valhalla - 100 / 155 / 130
- 238181	Kamigami Battles: Battle of the Nine Realms - 45 / 45 / 45
- 341080	Warhammer Age of Sigmar (Third Edition) - 90 / 90 / 90
- 340722	Smart10: Family - 20 / 30 / 25
- 310726	Iron Curtain: Central Europe, 1945-1989 - 120 / 360 / 240
- 359835	Dungeons & Dragons: Onslaught - 65 / 85 / 75
- 328862	Looney Tunes Mayhem - 30 / 30 / 30
- 329812	The Genius Star - 30 / 60 / 45
- 353289	Sherlock Aquelarre: El Mercader - 60 / 60 / 60

In [330]:
# Saving the IDs of the games with a zero min, max or expected play time
zero_min_max_expected_playtimes_ids = clean_df.loc[(clean_df['min play time']==0)|(clean_df['max play time']==0)|(clean_df['expected play time']==0),'id'].tolist()

In [332]:
# Updating min, max & expected play time values (minutes)
clean_df.loc[clean_df['id']== 171, ['min play time','max play time','expected play time']] = [10,120,30]
clean_df.loc[clean_df['id']== 242705, ['min play time','max play time','expected play time']] = [ 90,90,90 ]
clean_df.loc[clean_df['id']== 313889, ['min play time','max play time','expected play time']] = [ 40,55,50 ]
clean_df.loc[clean_df['id']== 296108, ['min play time','max play time','expected play time']] = [ 45,60,55 ]
clean_df.loc[clean_df['id']== 1540, ['min play time','max play time','expected play time']] = [ 120,120,120 ]
clean_df.loc[clean_df['id']== 369646, ['min play time','max play time','expected play time']] = [ 30,45,40 ]
clean_df.loc[clean_df['id']== 357746, ['min play time','max play time','expected play time']] = [ 35,35,35 ]
clean_df.loc[clean_df['id']== 347900, ['min play time','max play time','expected play time']] = [ 30,30,30 ]
clean_df.loc[clean_df['id']== 29285, ['min play time','max play time','expected play time']] = [ 90,90,90 ]
clean_df.loc[clean_df['id']== 17651, ['min play time','max play time','expected play time']] = [ 120,360,240 ]
clean_df.loc[clean_df['id']== 142889, ['min play time','max play time','expected play time']] = [ 20,360,190 ]
clean_df.loc[clean_df['id']== 249750, ['min play time','max play time','expected play time']] = [ 120,360,240 ]
clean_df.loc[clean_df['id']== 368036, ['min play time','max play time','expected play time']] = [ 50,65,60 ]
clean_df.loc[clean_df['id']== 367820, ['min play time','max play time','expected play time']] = [ 65,85,75 ]
clean_df.loc[clean_df['id']== 308368, ['min play time','max play time','expected play time']] = [ 30,60,45 ]
clean_df.loc[clean_df['id']== 172844, ['min play time','max play time','expected play time']] = [ 60,60,60 ]
clean_df.loc[clean_df['id']== 173574, ['min play time','max play time','expected play time']] = [ 180,180,180 ]
clean_df.loc[clean_df['id']== 368040, ['min play time','max play time','expected play time']] = [ 50,65,60 ]
clean_df.loc[clean_df['id']== 394961, ['min play time','max play time','expected play time']] = [ 20,20,20 ]
clean_df.loc[clean_df['id']== 362505, ['min play time','max play time','expected play time']] = [ 30,45,40 ]
clean_df.loc[clean_df['id']== 46669, ['min play time','max play time','expected play time']] = [ 120,360,240 ]
clean_df.loc[clean_df['id']== 273655, ['min play time','max play time','expected play time']] = [ 100,155,130 ]
clean_df.loc[clean_df['id']== 238181, ['min play time','max play time','expected play time']] = [ 45,45,45 ]
clean_df.loc[clean_df['id']== 341080, ['min play time','max play time','expected play time']] = [ 90,90,90 ]
clean_df.loc[clean_df['id']== 340722, ['min play time','max play time','expected play time']] = [ 20,30,25 ]
clean_df.loc[clean_df['id']== 310726, ['min play time','max play time','expected play time']] = [ 120,360,240 ]
clean_df.loc[clean_df['id']== 359835, ['min play time','max play time','expected play time']] = [ 65,85,75 ]
clean_df.loc[clean_df['id']== 328862, ['min play time','max play time','expected play time']] = [ 30,30,30 ]
clean_df.loc[clean_df['id']== 329812, ['min play time','max play time','expected play time']] = [ 30,60,45 ]
clean_df.loc[clean_df['id']== 353289, ['min play time','max play time','expected play time']] = [ 60,60,60 ]

In [263]:
# Checking min, max and expected play time values have been updated
clean_df.loc[clean_df['id'].isin(zero_min_max_expected_playtimes_ids),['id','name','min play time','max play time','expected play time']]

Unnamed: 0,id,name,min play time,max play time,expected play time
448,171,Chess,10,120,30
1222,1540,BattleTech,120,120,120
641,242705,Aeon Trespass: Odyssey,90,90,90
1212,296108,Terraforming Mars: The Dice Game,45,60,55
1027,313889,Hoplomachus: Victorum,40,55,50
2696,29285,Case Blue,90,90,90
2803,17651,Under the Lily Banners,120,360,240
1360,369646,Disney Lorcana,30,45,40
2897,142889,Enemy Coast Ahead: Operation Chastise – The Dambuster Raid,20,360,190
3911,46669,1914: Offensive à outrance,120,360,240


### Updating minimum age limit

Missing minimum age limit  

If a game has no minimum age limit, assume it is the maximum (21). These values are to be revised once the game has been played or researched in more depth.

In [334]:
# Updating all minimum age limit values for games with zero minimum age limit to 21
clean_df.loc[clean_df['minimum age limit']==0,'minimum age limit'] = 21

In [336]:
# Checking that the update has been successful by searching for any remaining games with a zero minimum age limit
clean_df.loc[clean_df['minimum age limit']==0,['minimum age limit']]

Unnamed: 0,minimum age limit
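
Because the value of 21 is a placeholder to be revised later, it may help to record which rows were filled in, since they are otherwise indistinguishable from games that genuinely carry a 21+ limit. The sketch below shows one way to do this on a hypothetical toy dataframe; the flag column name `age limit imputed` is an assumption and does not exist in the dataset.

```python
import pandas as pd

# Hypothetical toy data: two games with a missing (zero) minimum age limit
toy_df = pd.DataFrame({
    "id": [1, 2, 3],
    "minimum age limit": [0, 8, 0],
})

# Flag the rows that are about to be filled in (hypothetical helper column)
toy_df["age limit imputed"] = toy_df["minimum age limit"] == 0

# Apply the same fallback as above: missing limits become 21
toy_df.loc[toy_df["age limit imputed"], "minimum age limit"] = 21

# The flagged rows can be listed later when revising the placeholder values
print(toy_df.loc[toy_df["age limit imputed"], ["id", "minimum age limit"]])
```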
