# Assessing and Cleaning Popular Video Game Data

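## Analysis Goal

The goal of this analysis is to use data on popular video games from 1980 to 2023 to find out which game genres players generally prefer, looking at each genre along several dimensions such as the number of users and the number of reviews.

The purpose of this project is to practice assessing how clean and tidy a dataset is and, based on that assessment, to clean the data so that it is ready for the next stage of analysis.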

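## Introduction

The dataset contains a list of video games from 1980 to 2023, with information such as release date, user review ratings, and critic review ratings. A game may be released on more than one platform, and platforms differ in how many players use them; the games included were selected as popular titles by combining several kinds of ratings.

The columns have the following meanings:
- `Title`: title of the game
- `Release Date`: release date of the game's first version
- `Team`: development team of the game
- `Rating`: average rating
- `Times Listed`: number of users who listed this game
- `Number of Reviews`: number of reviews submitted by users
- `Genres`: all genres the game belongs to
- `Summary`: summary/overview provided by the team
- `Reviews`: user reviews/comments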

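## Reading the Data

Import the libraries needed for the analysis, then use Pandas' `read_csv` function to parse the contents of the raw data file "games.csv" into a DataFrame and assign it to the variable `original_data`.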

In [1]:
import pandas as pd

In [2]:
original_data = pd.read_csv("games.csv")

In [3]:
original_data.head()

Unnamed: 0.1,Unnamed: 0,Title,Release Date,Team,Rating,Times Listed,Number of Reviews,Genres,Summary,Reviews,Plays,Playing,Backlogs,Wishlist
0,0,Elden Ring,"Feb 25, 2022","['Bandai Namco Entertainment', 'FromSoftware']",4.5,3.9K,3.9K,"['Adventure', 'RPG']","Elden Ring is a fantasy, action and open world...","[""The first playthrough of elden ring is one o...",17K,3.8K,4.6K,4.8K
1,1,Hades,"Dec 10, 2019",['Supergiant Games'],4.3,2.9K,2.9K,"['Adventure', 'Brawler', 'Indie', 'RPG']",A rogue-lite hack and slash dungeon crawler in...,['convinced this is a roguelike for people who...,21K,3.2K,6.3K,3.6K
2,2,The Legend of Zelda: Breath of the Wild,"Mar 03, 2017","['Nintendo', 'Nintendo EPD Production Group No...",4.4,4.3K,4.3K,"['Adventure', 'RPG']",The Legend of Zelda: Breath of the Wild is the...,['This game is the game (that is not CS:GO) th...,30K,2.5K,5K,2.6K
3,3,Undertale,"Sep 15, 2015","['tobyfox', '8-4']",4.2,3.5K,3.5K,"['Adventure', 'Indie', 'RPG', 'Turn Based Stra...","A small child falls into the Underground, wher...",['soundtrack is tied for #1 with nier automata...,28K,679,4.9K,1.8K
4,4,Hollow Knight,"Feb 24, 2017",['Team Cherry'],4.4,3K,3K,"['Adventure', 'Indie', 'Platform']",A 2D metroidvania with an emphasis on close co...,"[""this games worldbuilding is incredible, with...",21K,2.4K,8.3K,2.3K


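## Assessing the Data

In this part, I assess the data contained in the `original_data` DataFrame created in the previous section.

The assessment covers two aspects, structure and content, that is, tidiness and cleanliness. Structural problems are violations of the three criteria "each column is a variable, each row is an observation, each cell is a value"; content problems include missing data, duplicated data, and invalid data.

### Assessing Tidiness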

In [4]:
original_data.sample(10)

Unnamed: 0.1,Unnamed: 0,Title,Release Date,Team,Rating,Times Listed,Number of Reviews,Genres,Summary,Reviews,Plays,Playing,Backlogs,Wishlist
1431,1431,Silent Hill 2: Restless Dreams,"Dec 19, 2001","['Team Silent', 'Konami']",4.5,90,90,['Adventure'],Silent Hill 2: Restless Dreams is an updated v...,"['*born from a wish', 'pog', 'Best psychologic...",514,14,130,81
667,667,Wanted: Dead,"Feb 14, 2023","['Soleil', '110 Industries']",2.6,81,81,"['Adventure', 'Brawler', 'Shooter']","Wanted: Dead promises ""spectacular melee comba...","[""It's really weird that the narrative around ...",100,15,94,212
531,531,Metal: Hellsinger,"Sep 15, 2022","['Level Infinite', 'The Outsiders']",3.7,322,322,"['Adventure', 'Music', 'Shooter']",Strike terror into the hearts of demons and de...,"['Should have won best OST 2022', ""May revisit...",1.2K,97,419,424
1084,1084,Kirby: Planet Robobot,"Apr 28, 2016","['HAL Laboratory', 'Nintendo']",4.2,669,669,"['Adventure', 'Platform']","""Harness the power of a mysterious mech to sto...","['Remaster plz', 'Kirby can (yet again) do no ...",3.6K,82,1K,644
953,953,Uncharted 2: Among Thieves,"Oct 13, 2009","['Naughty Dog', 'Sony Computer Entertainment']",3.9,1.4K,1.4K,"['Adventure', 'Platform', 'Shooter']","In the sequel to Drake's Fortune, Nathan Drake...","[""Infinitely better than the first game! The p...",12K,96,1.5K,515
675,675,Sonic Advance,"Dec 20, 2001","['Sega', 'Dimps']",3.2,446,446,['Platform'],Sonic Advance is notable for being the first S...,"['Disappointing in every way. Mediocre music, ...",4K,26,508,212
786,786,Persona 5 Royal,"Oct 31, 2019","['Atlus USA', 'Atlus']",4.4,2.7K,2.7K,"['Adventure', 'RPG', 'Turn Based Strategy']",An enhanced version of Persona 5 with some new...,"['Verdadeiro goty 2017, zelda é o caralho. Vai...",12K,2.3K,5.1K,3K
743,743,Fallout 3: Game of the Year Edition,"Oct 13, 2009","['Bethesda Game Studios', 'Bethesda Softworks']",3.6,188,188,"['Adventure', 'RPG', 'Shooter']",Prepare for the Future with Fallout 3: Game of...,['O primeiro jogo da franquia do Fallout a ser...,2.6K,67,694,123
131,131,Titanfall 2,"Oct 28, 2016","['Respawn Entertainment', 'Electronic Arts']",4.1,1.3K,1.3K,"['Adventure', 'Shooter']",Titanfall 2 will deliver a crafted experience ...,"['iyi', ""It's a short game and all but I had f...",12K,163,2.1K,656
1499,1499,WWE SmackDown vs. Raw 2008,"Nov 09, 2007","['THQ', ""YUKE'S Co., Ltd.""]",3.1,62,62,"['Fighting', 'Sport']",The 2008 edition in the Smackdown vs. Raw seri...,['The earliest memory I have of this game was ...,1.4K,1,34,16


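Judging from the 10 sampled rows, the data satisfies "each column is a variable, each row is an observation, each cell is a value". Each row describes one game (its title, summary, and so on) and each column is a variable about the game, so there are no structural problems.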
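
If a later step needs one genre per row, for example to count games per genre, the `Genres` cells can be unpacked. The following is a minimal sketch on a toy DataFrame, assuming that each `Genres` cell is stored as the string form of a Python list, as the sampled output suggests; `toy`, `long_form`, and `genre_counts` are illustrative names and this step is not part of the cleaning performed in this notebook.

```python
import ast

import pandas as pd

# Toy stand-in with two rows copied from the sample output above.
toy = pd.DataFrame({
    "Title": ["Elden Ring", "Hollow Knight"],
    "Genres": ["['Adventure', 'RPG']", "['Adventure', 'Indie', 'Platform']"],
})

# Assumption: each cell is the string form of a list, so literal_eval
# can turn it back into a real Python list without executing code.
toy["Genres"] = toy["Genres"].map(ast.literal_eval)

# explode() produces one row per (game, genre) pair.
long_form = toy.explode("Genres", ignore_index=True)

# Number of games listed under each genre.
genre_counts = long_form["Genres"].value_counts()
print(genre_counts)
```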

### Assessing Cleanliness

In [5]:
original_data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1512 entries, 0 to 1511
Data columns (total 14 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   Unnamed: 0         1512 non-null   int64  
 1   Title              1512 non-null   object 
 2   Release Date       1512 non-null   object 
 3   Team               1511 non-null   object 
 4   Rating             1499 non-null   float64
 5   Times Listed       1512 non-null   object 
 6   Number of Reviews  1512 non-null   object 
 7   Genres             1512 non-null   object 
 8   Summary            1511 non-null   object 
 9   Reviews            1512 non-null   object 
 10  Plays              1512 non-null   object 
 11  Playing            1512 non-null   object 
 12  Backlogs           1512 non-null   object 
 13  Wishlist           1512 non-null   object 
dtypes: float64(1), int64(1), object(12)
memory usage: 165.5+ KB


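According to the output, the data has 1512 observations, and the variables `Team`, `Rating`, and `Summary` have missing values.
The column name `Unnamed: 0` is not descriptive and should be renamed.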
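
The missing-value counts can also be read off directly instead of being derived from the non-null counts in `info()`. The following is a minimal sketch on a toy DataFrame with made-up values; `toy` and `missing_counts` are illustrative names.

```python
import pandas as pd

# Toy stand-in for the real data; the values are made up for illustration.
toy = pd.DataFrame({
    "Title": ["Game A", "Game B", "Game C"],
    "Team": ["['Team X']", None, "['Team Z']"],
    "Rating": [4.5, None, None],
})

# isnull() marks missing cells; summing the booleans counts them per column.
missing_counts = toy.isnull().sum()

# Keep only the columns that actually contain missing values.
missing_counts = missing_counts[missing_counts > 0]
print(missing_counts)
```

Run against `original_data`, the same two lines would be expected to list `Team`, `Rating`, and `Summary`, matching the `info()` output above.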

#### Assessing Missing Data

Knowing that `Team` has missing values, we extract the observations where it is missing and examine them.

In [6]:
original_data[original_data["Team"].isnull()]

Unnamed: 0.1,Unnamed: 0,Title,Release Date,Team,Rating,Times Listed,Number of Reviews,Genres,Summary,Reviews,Plays,Playing,Backlogs,Wishlist
1245,1245,NEET Girl Date Night,"Oct 21, 2022",,2.7,21,21,['Visual Novel'],Your friend sets you up on a date with his NEE...,"['this sucked. ""Omg she is literally me"" is no...",106,1,44,42


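Only one game is missing its `Team` value.
Since our goal is to find the genres players like, this missing value does not affect the analysis, and the row's other variables are complete, so the row should be kept.

In addition, `Summary` has exactly one missing value; we extract that observation.
A missing `Summary` does not affect our analysis goal, but this row is also missing its `Rating`, which we assess next.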

In [7]:
original_data[original_data["Summary"].isnull()]

Unnamed: 0.1,Unnamed: 0,Title,Release Date,Team,Rating,Times Listed,Number of Reviews,Genres,Summary,Reviews,Plays,Playing,Backlogs,Wishlist
649,649,Death Stranding 2,releases on TBD,['Kojima Productions'],,105,105,"['Adventure', 'Shooter']",,[],3,0,209,644


`Rating` also has missing values; we extract those observations in the same way.

In [8]:
original_data[original_data["Rating"].isnull()]

Unnamed: 0.1,Unnamed: 0,Title,Release Date,Team,Rating,Times Listed,Number of Reviews,Genres,Summary,Reviews,Plays,Playing,Backlogs,Wishlist
587,587,Final Fantasy XVI,"Jun 22, 2023","['Square Enix', 'Square Enix Creative Business...",,422,422,['RPG'],Final Fantasy XVI is an upcoming action role-p...,[],37,10,732,2.4K
649,649,Death Stranding 2,releases on TBD,['Kojima Productions'],,105,105,"['Adventure', 'Shooter']",,[],3,0,209,644
713,713,Final Fantasy VII Rebirth,"Dec 31, 2023",['Square Enix'],,192,192,[],This next standalone chapter in the FINAL FANT...,[],20,3,354,1.1K
719,719,Lies of P,"Aug 01, 2023","['NEOWIZ', 'Round8 Studio']",,175,175,['RPG'],"Inspired by the familiar story of Pinocchio, L...",[],5,0,260,939
726,726,Judas,"Mar 31, 2025",['Ghost Story Games'],,90,90,"['Adventure', 'Shooter']",A disintegrating starship. A desperate escape ...,[],1,0,92,437
746,746,Like a Dragon Gaiden: The Man Who Erased His Name,"Dec 31, 2023","['Ryū Ga Gotoku Studios', 'Sega']",,118,118,"['Adventure', 'Brawler', 'RPG']",This game covers Kiryu's story between Yakuza ...,[],2,1,145,588
972,972,The Legend of Zelda: Tears of the Kingdom,"May 12, 2023","['Nintendo', 'Nintendo EPD Production Group No...",,581,581,"['Adventure', 'RPG']",The Legend of Zelda: Tears of the Kingdom is t...,[],72,6,1.6K,5.4K
1130,1130,Star Wars Jedi: Survivor,"Apr 28, 2023","['Respawn Entertainment', 'Electronic Arts']",,250,250,['Adventure'],The story of Cal Kestis continues in Star Wars...,[],13,2,367,1.4K
1160,1160,We Love Katamari Reroll + Royal Reverie,"Jun 02, 2023","['Bandai Namco Entertainment', 'MONKEYCRAFT Co...",,51,51,"['Adventure', 'Puzzle']",We Love Katamari Reroll + Royal Reverie is a r...,[],3,0,74,291
1202,1202,Earthblade,"Dec 31, 2024",['Extremely OK Games'],,83,83,"['Adventure', 'Indie', 'RPG']","You are Névoa, an enigmatic child of Fate retu...",[],0,1,103,529


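13 games are missing their `Rating` value.

Judging from the output, the missing `Rating` values cannot all be put down to a release at the end of 2023: the rows shown include release dates such as "May 12, 2023" and "Mar 31, 2025", as well as "releases on TBD", so other circumstances are involved. Because the rating is one of the key indicators for analysing which genres players like, these rows carry no usable information for our purpose and can be deleted later.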
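
One way to check how the missing ratings relate to release dates is to parse `Release Date` and compare it with a cutoff. The following is a minimal sketch on a toy DataFrame; the `cutoff` date is a hypothetical stand-in, since the date on which the dataset was collected is not stated here, and `toy`, `parsed`, and `unreleased` are illustrative names.

```python
import pandas as pd

# Toy rows using release-date strings that appear in the outputs above.
toy = pd.DataFrame({
    "Title": ["Elden Ring", "Final Fantasy XVI", "Death Stranding 2", "Judas"],
    "Release Date": ["Feb 25, 2022", "Jun 22, 2023", "releases on TBD", "Mar 31, 2025"],
    "Rating": [4.5, None, None, None],
})

# errors="coerce" turns unparseable strings such as "releases on TBD" into NaT.
parsed = pd.to_datetime(toy["Release Date"], format="%b %d, %Y", errors="coerce")

# Hypothetical cutoff: an assumed collection date, not taken from the dataset.
cutoff = pd.Timestamp("2023-04-01")

# A game counts as unreleased if its date is unknown or later than the cutoff.
unreleased = parsed.isna() | (parsed > cutoff)
missing_rating = toy["Rating"].isnull()

# Missing ratings that coincide with unreleased games, and those that do not.
explained = int((missing_rating & unreleased).sum())
unexplained = int((missing_rating & ~unreleased).sum())
print(explained, unexplained)
```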

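#### Assessing Duplicate Data

Judging from the meaning of the variables, `Title` and `Team` together act as the identifier of a game. A team can publish several games, so `Team` may repeat, and game titles may repeat as well, but the two should not repeat at the same time.

For this dataset we therefore need to check whether there are rows in which both `Title` and `Team` are duplicated.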

In [10]:
original_data[original_data.duplicated(subset = ['Title','Team'])]

Unnamed: 0.1,Unnamed: 0,Title,Release Date,Team,Rating,Times Listed,Number of Reviews,Genres,Summary,Reviews,Plays,Playing,Backlogs,Wishlist
326,326,Elden Ring,"Feb 25, 2022","['Bandai Namco Entertainment', 'FromSoftware']",4.5,3.9K,3.9K,"['Adventure', 'RPG']","Elden Ring is a fantasy, action and open world...","[""The first playthrough of elden ring is one o...",17K,3.8K,4.6K,4.8K
327,327,Hades,"Dec 10, 2019",['Supergiant Games'],4.3,2.9K,2.9K,"['Adventure', 'Brawler', 'Indie', 'RPG']",A rogue-lite hack and slash dungeon crawler in...,['convinced this is a roguelike for people who...,21K,3.2K,6.3K,3.6K
328,328,The Legend of Zelda: Breath of the Wild,"Mar 03, 2017","['Nintendo', 'Nintendo EPD Production Group No...",4.4,4.3K,4.3K,"['Adventure', 'RPG']",The Legend of Zelda: Breath of the Wild is the...,['This game is the game (that is not CS:GO) th...,30K,2.5K,5K,2.6K
329,329,Undertale,"Sep 15, 2015","['tobyfox', '8-4']",4.2,3.5K,3.5K,"['Adventure', 'Indie', 'RPG', 'Turn Based Stra...","A small child falls into the Underground, wher...",['soundtrack is tied for #1 with nier automata...,28K,679,4.9K,1.8K
330,330,Hollow Knight,"Feb 24, 2017",['Team Cherry'],4.4,3K,3K,"['Adventure', 'Indie', 'Platform']",A 2D metroidvania with an emphasis on close co...,"[""this games worldbuilding is incredible, with...",21K,2.4K,8.3K,2.3K
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1269,1269,Final Fantasy XIII-2,"Dec 15, 2011",['Square Enix'],3.3,482,482,"['Adventure', 'RPG']",FINAL FANTASY XIII-2 is created with the aim o...,"[""Oh boy. Playing the XIII series is looking m...",2.3K,58,1.4K,449
1270,1270,Agar.io,"Apr 28, 2015","['Miniclip.com', 'Matheus Valadares']",2.2,81,81,"['Indie', 'Strategy']",Agar.io is a Massively-multiplayer top-down st...,"['""A Ganância que te move... É a mesma que te ...",4.4K,8,40,12
1271,1271,Fatal Frame II: Crimson Butterfly,"Nov 27, 2003","['Tecmo Co., Ltd.', 'Ubisoft Entertainment']",4.2,398,398,['Adventure'],Crimson Butterfly is the second installment in...,['Pretty cool albeit a bit similar to the firs...,1K,38,690,513
1282,1282,Super Mario Sunshine,"Sep 18, 2020","['Nintendo EAD', 'Nintendo']",3.7,19,19,"['Adventure', 'Platform']",A port of Super Mario Sunshine included in Sup...,['What an amazing remaster of an already amazi...,340,6,83,14


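According to the output, there are 398 rows whose `Title` and `Team` values are both duplicated. Although the analysis goal does not involve numerical calculations on ratings, play counts, or review counts within each genre, duplicated rows make the dataset redundant, so they need to be deleted later.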
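
The number of duplicates can be counted rather than read from the displayed table, and the `keep` parameter decides which copies are flagged. The following is a minimal sketch on a toy DataFrame with placeholder titles and teams; `toy`, `key`, `n_extra`, and `n_involved` are illustrative names.

```python
import pandas as pd

# Placeholder data: "Game A" by "Team X" appears twice; "Game B" appears
# under two different teams and is therefore not a duplicate.
toy = pd.DataFrame({
    "Title": ["Game A", "Game A", "Game B", "Game B", "Game C"],
    "Team": ["['Team X']", "['Team X']", "['Team Y']", "['Team Z']", "['Team W']"],
})
key = ["Title", "Team"]

# Default keep="first": only the later copies are flagged.
n_extra = int(toy.duplicated(subset=key).sum())

# keep=False: every row that belongs to a duplicated group is flagged.
n_involved = int(toy.duplicated(subset=key, keep=False).sum())

# Dropping with keep="first" removes exactly the n_extra later copies.
deduped = toy.drop_duplicates(subset=key, keep="first")
print(n_extra, n_involved, len(deduped))
```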

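#### Assessing Inconsistent Data

In this dataset there is no case of several different values referring to the same game publisher, so no assessment is needed.

#### Assessing Invalid or Erroneous Data

The DataFrame's `describe` method gives a quick overview of the statistics of the numeric columns.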

In [16]:
original_data.describe()

Unnamed: 0.1,Unnamed: 0,Rating
count,1512.0,1499.0
mean,755.5,3.719346
std,436.621117,0.532608
min,0.0,0.7
25%,377.75,3.4
50%,755.5,3.8
75%,1133.25,4.1
max,1511.0,4.8


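According to the output, `Rating` ranges from 0.7 to 4.8 and shows no anomalies or negative values so far, so we can broadly conclude that the dataset contains no invalid or erroneous data.

In addition, `Reviews` is of little use for our analysis goal, so to keep the dataset clear and concise it can be deleted.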
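
Note that `describe` only summarizes numeric columns, and per the `info()` output the count columns (`Times Listed`, `Number of Reviews`, `Plays`, `Playing`, `Backlogs`, `Wishlist`) are stored as strings such as "3.9K". The following is a minimal sketch of a converter, assuming that "K" (thousands) is the only suffix in the data, which is what the rows shown here suggest but has not been verified for the whole file; `parse_count` is a hypothetical helper, not part of this notebook.

```python
def parse_count(text: str) -> int:
    """Convert a count string such as '3.9K' or '679' to an integer.

    Assumption: 'K' (thousands) is the only suffix that occurs.
    """
    text = text.strip()
    if text.upper().endswith("K"):
        # round() guards against floating-point error, e.g. 8.3 * 1000.
        return int(round(float(text[:-1]) * 1000))
    return int(text)


# Values taken from the sample output above.
examples = ["3.9K", "17K", "679", "0", "8.3K"]
converted = [parse_count(value) for value in examples]
print(converted)
```

Applied with something like `cleaned_data["Plays"].map(parse_count)`, this would let `describe` cover those columns as well; any value that does not fit the assumed pattern raises a `ValueError` instead of being converted silently.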

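## Cleaning the Data

Based on the conclusions of the assessment above, the cleaning steps are:
- Correct the column name `Unnamed: 0`
- Delete the observations with a missing `Rating`
- Delete the observations in which both `Title` and `Team` are duplicated
- Delete the `Reviews` column

To keep the cleaned data separate from the raw data, we create a new variable `cleaned_data` as a copy of `original_data`. All subsequent cleaning steps are applied to `cleaned_data`.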

In [21]:
cleaned_data = original_data.copy()
cleaned_data.head()

Unnamed: 0.1,Unnamed: 0,Title,Release Date,Team,Rating,Times Listed,Number of Reviews,Genres,Summary,Reviews,Plays,Playing,Backlogs,Wishlist
0,0,Elden Ring,"Feb 25, 2022","['Bandai Namco Entertainment', 'FromSoftware']",4.5,3.9K,3.9K,"['Adventure', 'RPG']","Elden Ring is a fantasy, action and open world...","[""The first playthrough of elden ring is one o...",17K,3.8K,4.6K,4.8K
1,1,Hades,"Dec 10, 2019",['Supergiant Games'],4.3,2.9K,2.9K,"['Adventure', 'Brawler', 'Indie', 'RPG']",A rogue-lite hack and slash dungeon crawler in...,['convinced this is a roguelike for people who...,21K,3.2K,6.3K,3.6K
2,2,The Legend of Zelda: Breath of the Wild,"Mar 03, 2017","['Nintendo', 'Nintendo EPD Production Group No...",4.4,4.3K,4.3K,"['Adventure', 'RPG']",The Legend of Zelda: Breath of the Wild is the...,['This game is the game (that is not CS:GO) th...,30K,2.5K,5K,2.6K
3,3,Undertale,"Sep 15, 2015","['tobyfox', '8-4']",4.2,3.5K,3.5K,"['Adventure', 'Indie', 'RPG', 'Turn Based Stra...","A small child falls into the Underground, wher...",['soundtrack is tied for #1 with nier automata...,28K,679,4.9K,1.8K
4,4,Hollow Knight,"Feb 24, 2017",['Team Cherry'],4.4,3K,3K,"['Adventure', 'Indie', 'Platform']",A 2D metroidvania with an emphasis on close co...,"[""this games worldbuilding is incredible, with...",21K,2.4K,8.3K,2.3K


Correct the column name `Unnamed: 0`.

In [23]:
cleaned_data = cleaned_data.rename(columns={'Unnamed: 0':'Unnamed'})
cleaned_data

Unnamed: 0,Unnamed,Title,Release Date,Team,Rating,Times Listed,Number of Reviews,Genres,Summary,Reviews,Plays,Playing,Backlogs,Wishlist
0,0,Elden Ring,"Feb 25, 2022","['Bandai Namco Entertainment', 'FromSoftware']",4.5,3.9K,3.9K,"['Adventure', 'RPG']","Elden Ring is a fantasy, action and open world...","[""The first playthrough of elden ring is one o...",17K,3.8K,4.6K,4.8K
1,1,Hades,"Dec 10, 2019",['Supergiant Games'],4.3,2.9K,2.9K,"['Adventure', 'Brawler', 'Indie', 'RPG']",A rogue-lite hack and slash dungeon crawler in...,['convinced this is a roguelike for people who...,21K,3.2K,6.3K,3.6K
2,2,The Legend of Zelda: Breath of the Wild,"Mar 03, 2017","['Nintendo', 'Nintendo EPD Production Group No...",4.4,4.3K,4.3K,"['Adventure', 'RPG']",The Legend of Zelda: Breath of the Wild is the...,['This game is the game (that is not CS:GO) th...,30K,2.5K,5K,2.6K
3,3,Undertale,"Sep 15, 2015","['tobyfox', '8-4']",4.2,3.5K,3.5K,"['Adventure', 'Indie', 'RPG', 'Turn Based Stra...","A small child falls into the Underground, wher...",['soundtrack is tied for #1 with nier automata...,28K,679,4.9K,1.8K
4,4,Hollow Knight,"Feb 24, 2017",['Team Cherry'],4.4,3K,3K,"['Adventure', 'Indie', 'Platform']",A 2D metroidvania with an emphasis on close co...,"[""this games worldbuilding is incredible, with...",21K,2.4K,8.3K,2.3K
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1507,1507,Back to the Future: The Game,"Dec 22, 2010",['Telltale Games'],3.2,94,94,"['Adventure', 'Point-and-Click']",Back to the Future: The Game is one of Telltal...,['Very enjoyable game. The story adds onto the...,763,5,223,67
1508,1508,Team Sonic Racing,"May 21, 2019","['Sumo Digital', 'Sega']",2.9,264,264,"['Arcade', 'Racing']",Team Sonic Racing combines the best elements o...,"['jogo morto mas bom', 'not my cup of tea', ""C...",1.5K,49,413,107
1509,1509,Dragon's Dogma,"May 22, 2012",['Capcom'],3.7,210,210,"['Brawler', 'RPG']","Set in a huge open world, Dragon’s Dogma: Dark...","['Underrated.', 'A grandes rasgos, es como un ...",1.1K,45,487,206
1510,1510,Baldur's Gate 3,"Oct 06, 2020",['Larian Studios'],4.1,165,165,"['Adventure', 'RPG', 'Strategy', 'Tactical', '...","An ancient evil has returned to Baldur's Gate,...",['Bu türe bu oyunla girmeye çalışmak hataydı s...,269,79,388,602


Delete the observations with a missing `Rating`, then check how many missing values remain in that column:

In [24]:
cleaned_data.dropna(subset = ['Rating'],inplace = True)

In [26]:
cleaned_data['Rating'].isnull().sum()

0

Delete the observations in which both `Title` and `Team` are duplicated, then check whether any duplicates remain:

In [27]:
cleaned_data = cleaned_data.drop_duplicates(subset = ['Title','Team'],keep='first')

In [29]:
cleaned_data[cleaned_data.duplicated(subset = ['Title','Team'])]

Unnamed: 0,Unnamed,Title,Release Date,Team,Rating,Times Listed,Number of Reviews,Genres,Summary,Reviews,Plays,Playing,Backlogs,Wishlist


Delete the `Reviews` column.

In [31]:
cleaned_data = cleaned_data.drop('Reviews',axis = 1)
cleaned_data

Unnamed: 0,Unnamed,Title,Release Date,Team,Rating,Times Listed,Number of Reviews,Genres,Summary,Plays,Playing,Backlogs,Wishlist
0,0,Elden Ring,"Feb 25, 2022","['Bandai Namco Entertainment', 'FromSoftware']",4.5,3.9K,3.9K,"['Adventure', 'RPG']","Elden Ring is a fantasy, action and open world...",17K,3.8K,4.6K,4.8K
1,1,Hades,"Dec 10, 2019",['Supergiant Games'],4.3,2.9K,2.9K,"['Adventure', 'Brawler', 'Indie', 'RPG']",A rogue-lite hack and slash dungeon crawler in...,21K,3.2K,6.3K,3.6K
2,2,The Legend of Zelda: Breath of the Wild,"Mar 03, 2017","['Nintendo', 'Nintendo EPD Production Group No...",4.4,4.3K,4.3K,"['Adventure', 'RPG']",The Legend of Zelda: Breath of the Wild is the...,30K,2.5K,5K,2.6K
3,3,Undertale,"Sep 15, 2015","['tobyfox', '8-4']",4.2,3.5K,3.5K,"['Adventure', 'Indie', 'RPG', 'Turn Based Stra...","A small child falls into the Underground, wher...",28K,679,4.9K,1.8K
4,4,Hollow Knight,"Feb 24, 2017",['Team Cherry'],4.4,3K,3K,"['Adventure', 'Indie', 'Platform']",A 2D metroidvania with an emphasis on close co...,21K,2.4K,8.3K,2.3K
...,...,...,...,...,...,...,...,...,...,...,...,...,...
1507,1507,Back to the Future: The Game,"Dec 22, 2010",['Telltale Games'],3.2,94,94,"['Adventure', 'Point-and-Click']",Back to the Future: The Game is one of Telltal...,763,5,223,67
1508,1508,Team Sonic Racing,"May 21, 2019","['Sumo Digital', 'Sega']",2.9,264,264,"['Arcade', 'Racing']",Team Sonic Racing combines the best elements o...,1.5K,49,413,107
1509,1509,Dragon's Dogma,"May 22, 2012",['Capcom'],3.7,210,210,"['Brawler', 'RPG']","Set in a huge open world, Dragon’s Dogma: Dark...",1.1K,45,487,206
1510,1510,Baldur's Gate 3,"Oct 06, 2020",['Larian Studios'],4.1,165,165,"['Adventure', 'RPG', 'Strategy', 'Tactical', '...","An ancient evil has returned to Baldur's Gate,...",269,79,388,602
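
The four cleaning steps above can also be written as one chained function, which makes them easy to rerun on a fresh copy of the raw data. The following is a minimal sketch on a toy DataFrame with placeholder values; `clean_games` is a hypothetical helper that mirrors the steps in this section, not a replacement for them.

```python
import pandas as pd


def clean_games(raw: pd.DataFrame) -> pd.DataFrame:
    """Apply the cleaning steps from this section and return a new DataFrame."""
    return (
        raw.rename(columns={"Unnamed: 0": "Unnamed"})              # fix column name
        .dropna(subset=["Rating"])                                 # drop missing ratings
        .drop_duplicates(subset=["Title", "Team"], keep="first")   # drop repeated games
        .drop(columns=["Reviews"])                                 # drop unused column
    )


# Placeholder data: one duplicate, one missing Rating, one missing Team.
raw = pd.DataFrame({
    "Unnamed: 0": [0, 1, 2, 3],
    "Title": ["Game A", "Game A", "Game B", "Game C"],
    "Team": ["['Team X']", "['Team X']", "['Team Y']", None],
    "Rating": [4.0, 4.0, None, 3.5],
    "Reviews": ["[]", "[]", "[]", "[]"],
})

cleaned = clean_games(raw)
print(cleaned)
```

Each method in the chain returns a new DataFrame, so `raw` is left unchanged, which serves the same purpose as the explicit `copy()` used above.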


## Saving the Cleaned Data

With the cleaning finished, save the clean and tidy data to a new file named `games_cleaned.csv`.

In [33]:
cleaned_data.head()

Unnamed: 0,Unnamed,Title,Release Date,Team,Rating,Times Listed,Number of Reviews,Genres,Summary,Plays,Playing,Backlogs,Wishlist
0,0,Elden Ring,"Feb 25, 2022","['Bandai Namco Entertainment', 'FromSoftware']",4.5,3.9K,3.9K,"['Adventure', 'RPG']","Elden Ring is a fantasy, action and open world...",17K,3.8K,4.6K,4.8K
1,1,Hades,"Dec 10, 2019",['Supergiant Games'],4.3,2.9K,2.9K,"['Adventure', 'Brawler', 'Indie', 'RPG']",A rogue-lite hack and slash dungeon crawler in...,21K,3.2K,6.3K,3.6K
2,2,The Legend of Zelda: Breath of the Wild,"Mar 03, 2017","['Nintendo', 'Nintendo EPD Production Group No...",4.4,4.3K,4.3K,"['Adventure', 'RPG']",The Legend of Zelda: Breath of the Wild is the...,30K,2.5K,5K,2.6K
3,3,Undertale,"Sep 15, 2015","['tobyfox', '8-4']",4.2,3.5K,3.5K,"['Adventure', 'Indie', 'RPG', 'Turn Based Stra...","A small child falls into the Underground, wher...",28K,679,4.9K,1.8K
4,4,Hollow Knight,"Feb 24, 2017",['Team Cherry'],4.4,3K,3K,"['Adventure', 'Indie', 'Platform']",A 2D metroidvania with an emphasis on close co...,21K,2.4K,8.3K,2.3K


In [34]:
cleaned_data.to_csv("games_cleaned.csv", index=False)

In [35]:
pd.read_csv("games_cleaned.csv").head()

Unnamed: 0,Unnamed,Title,Release Date,Team,Rating,Times Listed,Number of Reviews,Genres,Summary,Plays,Playing,Backlogs,Wishlist
0,0,Elden Ring,"Feb 25, 2022","['Bandai Namco Entertainment', 'FromSoftware']",4.5,3.9K,3.9K,"['Adventure', 'RPG']","Elden Ring is a fantasy, action and open world...",17K,3.8K,4.6K,4.8K
1,1,Hades,"Dec 10, 2019",['Supergiant Games'],4.3,2.9K,2.9K,"['Adventure', 'Brawler', 'Indie', 'RPG']",A rogue-lite hack and slash dungeon crawler in...,21K,3.2K,6.3K,3.6K
2,2,The Legend of Zelda: Breath of the Wild,"Mar 03, 2017","['Nintendo', 'Nintendo EPD Production Group No...",4.4,4.3K,4.3K,"['Adventure', 'RPG']",The Legend of Zelda: Breath of the Wild is the...,30K,2.5K,5K,2.6K
3,3,Undertale,"Sep 15, 2015","['tobyfox', '8-4']",4.2,3.5K,3.5K,"['Adventure', 'Indie', 'RPG', 'Turn Based Stra...","A small child falls into the Underground, wher...",28K,679,4.9K,1.8K
4,4,Hollow Knight,"Feb 24, 2017",['Team Cherry'],4.4,3K,3K,"['Adventure', 'Indie', 'Platform']",A 2D metroidvania with an emphasis on close co...,21K,2.4K,8.3K,2.3K
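
The `index=False` argument matters for the round trip: with the default `index=True`, `to_csv` writes the row index as an unnamed first column, which `read_csv` then loads as `Unnamed: 0`. This may be how the `Unnamed: 0` column in "games.csv" came about, although that is a guess. The following is a minimal sketch using in-memory buffers and placeholder data; `toy` and the buffer names are illustrative.

```python
import io

import pandas as pd

# Placeholder data for illustration.
toy = pd.DataFrame({"Title": ["Game A", "Game B"], "Rating": [4.0, 3.5]})

# Default behaviour: the index is written as an extra, unnamed column.
with_index = io.StringIO()
toy.to_csv(with_index)
with_index.seek(0)
reloaded_with_index = pd.read_csv(with_index)

# index=False: only the real columns are written.
without_index = io.StringIO()
toy.to_csv(without_index, index=False)
without_index.seek(0)
reloaded_without_index = pd.read_csv(without_index)

print(list(reloaded_with_index.columns))
print(list(reloaded_without_index.columns))
```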
