# 1. Introduction

The video game data extracted from the [RAWG database](https://rawg.io/) is large, at about 1.04 GB. It therefore needs to be trimmed so that it contains only the information relevant to the analysis.

## 1.1 Set Up
Import the necessary libraries and CSV file.

In [1]:
import numpy as np
import pandas as pd

# import warnings
# warnings.filterwarnings("ignore")
# from google.colab import files

(If using Google Colab) Upload the saved CSV file when starting a new session. <br>
This is not recommended: the upload took about 3.5 hours, which is longer than extracting the data.

In [None]:
# if using Google Colab
# uploaded = files.upload()

Saving data.csv to data.csv


Otherwise, read the CSV file as usual:

In [2]:
df = pd.read_csv("../Data/data.csv", index_col=0)

# or
# if there is no unnamed index column
# df = pd.read_csv("../Data/data.csv")
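
Because the full file is about 1.04 GB, one way to reduce memory use is to load only the columns that will be kept. The sketch below is an illustration under assumptions, not part of the original notebook: `sample_csv` is a made-up in-memory stand-in for `data.csv`, and `wanted` is a hypothetical keep-list.

```python
import io
import pandas as pd

# Made-up stand-in for ../Data/data.csv (the first column is the unnamed index)
sample_csv = io.StringIO(
    ",id,slug,name,rating,metacritic\n"
    "0,3498,grand-theft-auto-v,Grand Theft Auto V,4.48,97.0\n"
    "1,4200,portal-2,Portal 2,4.61,95.0\n"
)

# Hypothetical keep-list: only these columns are parsed and held in memory
wanted = ["id", "name", "rating", "metacritic"]

small_df = pd.read_csv(sample_csv, usecols=wanted)
print(small_df.columns.tolist())
```

For the real file, the same idea would be `pd.read_csv("../Data/data.csv", usecols=wanted)`. Columns outside the keep-list, including the unnamed index column, are skipped while the file is read.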



# 2. Exploratory Data Analysis

## 2.1 Data Processing
Looking into the data:

In [3]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 413320 entries, 0 to 39
Data columns (total 40 columns):
 #   Column                   Non-Null Count   Dtype  
---  ------                   --------------   -----  
 0   id                       413320 non-null  int64  
 1   slug                     413318 non-null  object 
 2   name                     413318 non-null  object 
 3   released                 390099 non-null  object 
 4   tba                      413320 non-null  bool   
 5   background_image         395192 non-null  object 
 6   rating                   413320 non-null  float64
 7   rating_top               413320 non-null  int64  
 8   ratings                  413320 non-null  object 
 9   ratings_count            413320 non-null  int64  
 10  reviews_text_count       413320 non-null  int64  
 11  added                    413320 non-null  int64  
 12  metacritic               4113 non-null    float64
 13  playtime                 413320 non-null  int64  
 14  suggesti

This dataset has 40 columns. Their names and descriptions are as follows (a `?` marks a field whose meaning I am unsure of):

* **id** - Unique identifier

* **slug** - The game's name?

* **name** - The name of the game

* **released** - Release date of game in YYYY-MM-DD

* **tba** - ?

* **background_image** - Link to image

* **rating** - Ratings out of 5

* **rating_top** - ?

* **ratings** - list?

* **ratings_count** - Number of ratings given

* **reviews_text_count** - Number of text reviews

* **added** - ?

* **metacritic** - Metacritic's official score out of 100

* **playtime** - Suggested playtime in hours?

* **suggestions_count** - Number of suggestions?

* **user_game** - Binary indicator if user owns the game?

* **reviews_count** - Number of reviews given

* **saturated_color** - ?

* **dominant_color** - ?

* **platforms** - Platforms of the game (e.g. PC, PS4), list

* **parent_platforms** - list

* **genres** - The genre(s) of the game, list

* **stores** - Digital store page, list

* **tags** - Tags of the game, list

* **short_screenshots** - Preview screenshots of the game?, list

* **added_by_status.yet** - Number of gamers who haven't played the game, float

* **added_by_status.owned** - Number of gamers who owned the game, float

* **added_by_status.beaten** - Number of gamers who have beaten the game, float

* **added_by_status.toplay** - Number of gamers who intend to play the game, float

* **added_by_status.dropped** - Number of gamers who stopped playing the game, float

* **added_by_status.playing** - Number of gamers who are currently playing the game, float

* **clip.clip** - Likely to be a short video clip

* **clip.clips.320** - Likely to be a short video clip

* **clip.clips.640** - Likely to be a short video clip

* **clip.clips.full** - Likely to be a short video clip

* **clip.video** - Likely to be a short video clip

* **clip.preview** - Likely to be a short video clip

* **clip** - Likely to be a short video clip

* **community_rating** - ?

* **added_by_status** - ?

I will remove unnecessary columns to save memory and reduce noise. These include the `added_by_status` columns, since their counts cover only users of the RAWG database.

In [4]:
# axis=1 drops columns (axis=0 would drop rows)
# inplace=True modifies df directly and returns None
df.drop(labels=[# 'Unnamed: 0',
                'slug',
                'tba',
                'background_image',
                'user_game',
                'saturated_color',
                'dominant_color',
                'stores',
                'short_screenshots',
                'clip.clip',
                'clip.clips.320',
                'clip.clips.640',
                'clip.clips.full',
                'clip.video',
                'clip.preview',
                'clip',
                'community_rating',
                'added_by_status',
                'added_by_status.yet',
                'added_by_status.owned',
                'added_by_status.beaten',
                'added_by_status.toplay',
                'added_by_status.dropped',
                'added_by_status.playing'],
        axis=1, inplace=True)
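
As an alternative to listing every label by hand, the `clip` and `added_by_status` families can be collected by prefix. This is a minimal sketch on a made-up frame (`toy` and its values are hypothetical); it assumes that every column starting with one of those prefixes is unwanted.

```python
import pandas as pd

# Toy frame using a few column names from the dataset (values are made up)
toy = pd.DataFrame({
    "id": [1, 2],
    "name": ["A", "B"],
    "added": [10, 20],
    "clip": [None, None],
    "clip.video": [None, None],
    "added_by_status.owned": [3.0, 5.0],
})

# Collect every column that starts with one of the unwanted prefixes
prefixes = ("clip", "added_by_status")
to_drop = [col for col in toy.columns if col.startswith(prefixes)]

# errors="ignore" skips labels that are absent (here 'slug') instead of raising KeyError
trimmed = toy.drop(columns=to_drop + ["slug"], errors="ignore")
print(trimmed.columns.tolist())
```

Note that the `added` column survives, because it does not start with `added_by_status`. Without `inplace=True`, `drop` returns a new frame and leaves the original untouched.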

In [5]:
print(f'{df.shape[0]} entries and {df.shape[1]} columns in the extracted dataset.')

413320 entries and 17 columns in the extracted dataset.


The dataframe is now down to 17 columns.

In [6]:
# Create a backup copy
df2 = df.copy()
df2

Unnamed: 0,id,name,released,rating,rating_top,ratings,ratings_count,reviews_text_count,added,metacritic,playtime,suggestions_count,reviews_count,platforms,parent_platforms,genres,tags
0,3498,Grand Theft Auto V,2013-09-17,4.48,5,"[{'id': 5, 'title': 'exceptional', 'count': 22...",3796,21,12383,97.0,68,422,3832,"[{'platform': {'id': 187, 'name': 'PlayStation...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","[{'id': 4, 'name': 'Action', 'slug': 'action',...","[{'id': 40836, 'name': 'Full controller suppor..."
1,4200,Portal 2,2011-04-19,4.61,5,"[{'id': 5, 'title': 'exceptional', 'count': 22...",3229,16,10880,95.0,11,596,3253,"[{'platform': {'id': 16, 'name': 'PlayStation ...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","[{'id': 2, 'name': 'Shooter', 'slug': 'shooter...","[{'id': 40833, 'name': 'Captions available', '..."
2,3328,The Witcher 3: Wild Hunt,2015-05-18,4.67,5,"[{'id': 5, 'title': 'exceptional', 'count': 27...",3486,36,10603,93.0,52,680,3535,"[{'platform': {'id': 4, 'name': 'PC', 'slug': ...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","[{'id': 4, 'name': 'Action', 'slug': 'action',...","[{'id': 40836, 'name': 'Full controller suppor..."
3,5286,Tomb Raider (2013),2013-03-05,4.06,4,"[{'id': 4, 'title': 'recommended', 'count': 13...",2252,6,9844,86.0,11,680,2266,"[{'platform': {'id': 4, 'name': 'PC', 'slug': ...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","[{'id': 4, 'name': 'Action', 'slug': 'action',...","[{'id': 40836, 'name': 'Full controller suppor..."
4,5679,The Elder Scrolls V: Skyrim,2011-11-11,4.41,5,"[{'id': 5, 'title': 'exceptional', 'count': 15...",2719,10,9717,94.0,44,626,2734,"[{'platform': {'id': 4, 'name': 'PC', 'slug': ...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","[{'id': 4, 'name': 'Action', 'slug': 'action',...","[{'id': 97, 'name': 'Action RPG', 'slug': 'act..."
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
35,2801,Rainbow Moon PS4 Upgrade,2016-02-16,0.00,0,[],0,0,0,,0,0,0,"[{'platform': {'id': 18, 'name': 'PlayStation ...","[{'platform': {'id': 2, 'name': 'PlayStation',...",[],"[{'id': 25, 'name': 'Space', 'slug': 'space', ..."
36,2797,Word Mage,2014-02-20,0.00,0,[],0,0,0,,0,81,0,"[{'platform': {'id': 3, 'name': 'iOS', 'slug':...","[{'platform': {'id': 4, 'name': 'iOS', 'slug':...","[{'id': 5, 'name': 'RPG', 'slug': 'role-playin...","[{'id': 24, 'name': 'RPG', 'slug': 'rpg', 'lan..."
37,2795,Over The Net Beach Volley,2010-02-12,0.00,0,[],0,0,0,,0,122,0,"[{'platform': {'id': 3, 'name': 'iOS', 'slug':...","[{'platform': {'id': 4, 'name': 'iOS', 'slug':...","[{'id': 14, 'name': 'Simulation', 'slug': 'sim...","[{'id': 402, 'name': 'Training', 'slug': 'trai..."
38,2790,Little Luca,2013-05-23,0.00,0,[],0,0,0,,0,39,0,"[{'platform': {'id': 3, 'name': 'iOS', 'slug':...","[{'platform': {'id': 4, 'name': 'iOS', 'slug':...","[{'id': 3, 'name': 'Adventure', 'slug': 'adven...","[{'id': 136, 'name': 'Music', 'slug': 'music',..."


Since I will be considering `rating` as one of the features to predict Metacritic scores, let's have a look at the titles with a `rating` of 0.00:

In [7]:
temp_df = df2[df2.rating == 0.00]
temp_df

Unnamed: 0,id,name,released,rating,rating_top,ratings,ratings_count,reviews_text_count,added,metacritic,playtime,suggestions_count,reviews_count,platforms,parent_platforms,genres,tags
13,22128,Interkosmos,2017-04-25,0.0,0,"[{'id': 4, 'title': 'recommended', 'count': 1,...",3,0,811,,1,220,3,"[{'platform': {'id': 4, 'name': 'PC', 'slug': ...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","[{'id': 4, 'name': 'Action', 'slug': 'action',...","[{'id': 69, 'name': 'Action-Adventure', 'slug'..."
8,23628,Will Glow the Wisp,2017-09-12,0.0,0,"[{'id': 3, 'title': 'meh', 'count': 3, 'percen...",4,0,557,,1,138,4,"[{'platform': {'id': 4, 'name': 'PC', 'slug': ...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","[{'id': 4, 'name': 'Action', 'slug': 'action',...","[{'id': 45, 'name': '2D', 'slug': '2d', 'langu..."
19,18399,Dark Future: Blood Red States,2019-05-15,0.0,1,"[{'id': 1, 'title': 'skip', 'count': 3, 'perce...",5,0,532,,4,502,5,"[{'platform': {'id': 4, 'name': 'PC', 'slug': ...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","[{'id': 4, 'name': 'Action', 'slug': 'action',...","[{'id': 40836, 'name': 'Full controller suppor..."
29,457916,Cyberpunk 2077 Goodies Collection,,0.0,0,"[{'id': 4, 'title': 'recommended', 'count': 2,...",3,0,480,,0,281,3,"[{'platform': {'id': 4, 'name': 'PC', 'slug': ...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...",[],"[{'id': 226, 'name': 'Cyberpunk', 'slug': 'cyb..."
18,10226,i saw her standing there,2018-05-31,0.0,1,"[{'id': 1, 'title': 'skip', 'count': 3, 'perce...",5,0,434,,3,191,5,"[{'platform': {'id': 5, 'name': 'macOS', 'slug...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","[{'id': 4, 'name': 'Action', 'slug': 'action',...","[{'id': 40836, 'name': 'Full controller suppor..."
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
35,2801,Rainbow Moon PS4 Upgrade,2016-02-16,0.0,0,[],0,0,0,,0,0,0,"[{'platform': {'id': 18, 'name': 'PlayStation ...","[{'platform': {'id': 2, 'name': 'PlayStation',...",[],"[{'id': 25, 'name': 'Space', 'slug': 'space', ..."
36,2797,Word Mage,2014-02-20,0.0,0,[],0,0,0,,0,81,0,"[{'platform': {'id': 3, 'name': 'iOS', 'slug':...","[{'platform': {'id': 4, 'name': 'iOS', 'slug':...","[{'id': 5, 'name': 'RPG', 'slug': 'role-playin...","[{'id': 24, 'name': 'RPG', 'slug': 'rpg', 'lan..."
37,2795,Over The Net Beach Volley,2010-02-12,0.0,0,[],0,0,0,,0,122,0,"[{'platform': {'id': 3, 'name': 'iOS', 'slug':...","[{'platform': {'id': 4, 'name': 'iOS', 'slug':...","[{'id': 14, 'name': 'Simulation', 'slug': 'sim...","[{'id': 402, 'name': 'Training', 'slug': 'trai..."
38,2790,Little Luca,2013-05-23,0.0,0,[],0,0,0,,0,39,0,"[{'platform': {'id': 3, 'name': 'iOS', 'slug':...","[{'platform': {'id': 4, 'name': 'iOS', 'slug':...","[{'id': 3, 'name': 'Adventure', 'slug': 'adven...","[{'id': 136, 'name': 'Music', 'slug': 'music',..."


In [8]:
temp_df['name'].tolist()

['Interkosmos',
 'Will Glow the Wisp',
 'Dark Future: Blood Red States',
 'Cyberpunk 2077 Goodies Collection',
 'i saw her standing there',
 'MO:Astray',
 "Flora's Fruit Farm",
 'Avernum 3: Ruined World',
 'Two Brothers',
 'Egyptian Senet',
 'Kabitis',
 'AI War 2',
 "The King's Bird",
 'Unforgiving Trials: The Space Crusade',
 'Sideway New York',
 'TeraBlaster',
 'Indie Game Battle',
 'Labyrinthine Dreams',
 'Max Gentlemen',
 'Doorways: The Underworld',
 'Wild Island Quest',
 'Crucible',
 'Starward Rogue',
 'Data Jammers: FastForward',
 'Airstrike HD',
 'Blackbay Asylum',
 'Mobile Astro',
 'Remnants of Naezith',
 'Lost Chronicles of Zerzura',
 'Ignite',
 'Alpha Polaris: A Horror Adventure Game',
 'Alex Hunter - Lord of the Mind Platinum Edition',
 'Yargis - Space Melee',
 'Pixel Puzzles 2: Birds',
 'Cubemen 2',
 'Victory: The Age of Racing',
 'Unfortunate Spacemen',
 'Avalanche 2: Super Avalanche',
 'Defend The Highlands',
 'Remnants Of Isolation',
 'Razor2: Hidden Skies',
 "Tobe's Ver

Most of these titles are relatively obscure, which could explain why their `rating` was never filled in and is stored as 0.00. Nevertheless, I will simply remove these titles from the main dataframe, as I do not want features with missing data.

In [9]:
df3 = df2[df2.rating != 0.00]
df3

Unnamed: 0,id,name,released,rating,rating_top,ratings,ratings_count,reviews_text_count,added,metacritic,playtime,suggestions_count,reviews_count,platforms,parent_platforms,genres,tags
0,3498,Grand Theft Auto V,2013-09-17,4.48,5,"[{'id': 5, 'title': 'exceptional', 'count': 22...",3796,21,12383,97.0,68,422,3832,"[{'platform': {'id': 187, 'name': 'PlayStation...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","[{'id': 4, 'name': 'Action', 'slug': 'action',...","[{'id': 40836, 'name': 'Full controller suppor..."
1,4200,Portal 2,2011-04-19,4.61,5,"[{'id': 5, 'title': 'exceptional', 'count': 22...",3229,16,10880,95.0,11,596,3253,"[{'platform': {'id': 16, 'name': 'PlayStation ...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","[{'id': 2, 'name': 'Shooter', 'slug': 'shooter...","[{'id': 40833, 'name': 'Captions available', '..."
2,3328,The Witcher 3: Wild Hunt,2015-05-18,4.67,5,"[{'id': 5, 'title': 'exceptional', 'count': 27...",3486,36,10603,93.0,52,680,3535,"[{'platform': {'id': 4, 'name': 'PC', 'slug': ...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","[{'id': 4, 'name': 'Action', 'slug': 'action',...","[{'id': 40836, 'name': 'Full controller suppor..."
3,5286,Tomb Raider (2013),2013-03-05,4.06,4,"[{'id': 4, 'title': 'recommended', 'count': 13...",2252,6,9844,86.0,11,680,2266,"[{'platform': {'id': 4, 'name': 'PC', 'slug': ...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","[{'id': 4, 'name': 'Action', 'slug': 'action',...","[{'id': 40836, 'name': 'Full controller suppor..."
4,5679,The Elder Scrolls V: Skyrim,2011-11-11,4.41,5,"[{'id': 5, 'title': 'exceptional', 'count': 15...",2719,10,9717,94.0,44,626,2734,"[{'platform': {'id': 4, 'name': 'PC', 'slug': ...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","[{'id': 4, 'name': 'Action', 'slug': 'action',...","[{'id': 97, 'name': 'Action RPG', 'slug': 'act..."
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3,52200,Cheapshot,,4.83,5,"[{'id': 5, 'title': 'exceptional', 'count': 5,...",4,0,4,,0,102,6,"[{'platform': {'id': 3, 'name': 'iOS', 'slug':...","[{'platform': {'id': 4, 'name': 'iOS', 'slug':...","[{'id': 59, 'name': 'Massively Multiplayer', '...","[{'id': 40932, 'name': 'AR support', 'slug': '..."
34,387309,Stay Out,,3.33,5,"[{'id': 5, 'title': 'exceptional', 'count': 3,...",6,0,3,,0,481,6,"[{'platform': {'id': 4, 'name': 'PC', 'slug': ...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","[{'id': 4, 'name': 'Action', 'slug': 'action',...","[{'id': 40837, 'name': 'In-App Purchases', 'sl..."
36,387303,Ray Eager,2019-11-11,4.83,5,"[{'id': 5, 'title': 'exceptional', 'count': 5,...",5,0,3,,0,258,6,"[{'platform': {'id': 4, 'name': 'PC', 'slug': ...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","[{'id': 4, 'name': 'Action', 'slug': 'action',...","[{'id': 40834, 'name': 'Commentary available',..."
37,330541,Stalker Online,2019-10-15,2.83,3,"[{'id': 3, 'title': 'meh', 'count': 2, 'percen...",5,0,3,,0,624,6,"[{'platform': {'id': 4, 'name': 'PC', 'slug': ...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","[{'id': 4, 'name': 'Action', 'slug': 'action',...","[{'id': 14, 'name': 'Early Access', 'slug': 'e..."
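
The two filters above split the data on the placeholder value `0.00`. The sketch below repeats that boolean-mask pattern on a made-up frame (`toy` and its values are hypothetical). It also shows an alternative which assumes that a `rating` of 0 means "not rated": the placeholder is marked as missing first and then dropped.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "name": ["Rated game", "Unrated game", "Another rated game"],
    "rating": [4.48, 0.0, 2.83],
})

# Boolean mask: True where the rating is the 0.00 placeholder
unrated_mask = toy["rating"] == 0.0
print(unrated_mask.sum(), "unrated title(s)")

# Same pattern as df2[df2.rating != 0.00]; .copy() returns an independent frame
rated = toy[~unrated_mask].copy()

# Alternative: treat 0.0 as missing, then drop rows without a rating
rated_alt = toy.replace({"rating": {0.0: np.nan}}).dropna(subset=["rating"])
```

Both approaches keep the original row labels, so a `reset_index(drop=True)` may be useful if a clean 0-based index is needed afterwards.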


Save the trimmed dataframe for further use.

In [15]:
df3.to_csv('../Data/data2.csv',index=False)

In [None]:
# if using Google Colab
# files.download("data2.csv")

# 3. Expand Listed Data in Columns
Some columns need to be expanded: each of their cells holds a single list, and the information inside would be more useful as separate dataframe columns.

## 3.1 Quick Checks on Dataframe

In [23]:
df3 = pd.read_csv('../Data/data2.csv')

In [24]:
# Look for missing data and check dtypes
df3.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10896 entries, 0 to 10895
Data columns (total 17 columns):
 #   Column              Non-Null Count  Dtype  
---  ------              --------------  -----  
 0   id                  10896 non-null  int64  
 1   name                10896 non-null  object 
 2   released            10744 non-null  object 
 3   rating              10896 non-null  float64
 4   rating_top          10896 non-null  int64  
 5   ratings             10896 non-null  object 
 6   ratings_count       10896 non-null  int64  
 7   reviews_text_count  10896 non-null  int64  
 8   added               10896 non-null  int64  
 9   metacritic          3305 non-null   float64
 10  playtime            10896 non-null  int64  
 11  suggestions_count   10896 non-null  int64  
 12  reviews_count       10896 non-null  int64  
 13  platforms           10896 non-null  object 
 14  parent_platforms    10896 non-null  object 
 15  genres              10896 non-null  object 
 16  tags

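Apart from plain strings such as `name` and `released`, the columns with the dtype `object` (`ratings`, `platforms`, `parent_platforms`, `genres`, `tags`) hold lists, which are stored as strings after the round trip through CSV.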
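
One way to find such columns programmatically is to test whether every value looks like a bracketed string. This is only a heuristic sketch: `toy` is a made-up frame and `looks_like_list` is a hypothetical helper, not something from the original notebook.

```python
import pandas as pd

toy = pd.DataFrame({
    "id": [4200],
    "name": ["Portal 2"],
    "released": ["2011-04-19"],
    "genres": ["[{'id': 2, 'name': 'Shooter'}]"],
    "tags": ["[]"],
})

def looks_like_list(value):
    """True for strings of the form "[...]" (the stringified lists in this dataset)."""
    return isinstance(value, str) and value.startswith("[") and value.endswith("]")

list_like = []
for col in toy.columns:
    values = toy[col].dropna()
    # A column qualifies only if it has values and all of them look like lists
    if len(values) > 0 and values.map(looks_like_list).all():
        list_like.append(col)

print(list_like)
```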

## 3.2 Expanding "List" Columns

The following columns are the ones to be "expanded":
- `ratings`
- `platforms`
- `parent_platforms`
- `genres`
- `tags`

### 3.2.1 Platforms

In [25]:
df3['platforms']

0        [{'platform': {'id': 187, 'name': 'PlayStation...
1        [{'platform': {'id': 16, 'name': 'PlayStation ...
2        [{'platform': {'id': 4, 'name': 'PC', 'slug': ...
3        [{'platform': {'id': 4, 'name': 'PC', 'slug': ...
4        [{'platform': {'id': 4, 'name': 'PC', 'slug': ...
                               ...                        
10891    [{'platform': {'id': 3, 'name': 'iOS', 'slug':...
10892    [{'platform': {'id': 4, 'name': 'PC', 'slug': ...
10893    [{'platform': {'id': 4, 'name': 'PC', 'slug': ...
10894    [{'platform': {'id': 4, 'name': 'PC', 'slug': ...
10895    [{'platform': {'id': 21, 'name': 'Android', 's...
Name: platforms, Length: 10896, dtype: object

As mentioned earlier, each cell in an unexpanded column is a string that represents a list of dictionaries. It has to be parsed back into Python objects before the values can be accessed.

In [26]:
# check the list in the 1st row of the dataframe
df3['platforms'][0]

"[{'platform': {'id': 187, 'name': 'PlayStation 5', 'slug': 'playstation5', 'image': None, 'year_end': None, 'year_start': 2020, 'games_count': 64, 'image_background': 'https://media.rawg.io/media/games/34b/34b1f1850a1c06fd971bc6ab3ac0ce0e.jpg'}, 'released_at': '2013-09-17', 'requirements_en': None, 'requirements_ru': None}, {'platform': {'id': 4, 'name': 'PC', 'slug': 'pc', 'image': None, 'year_end': None, 'year_start': None, 'games_count': 236016, 'image_background': 'https://media.rawg.io/media/games/8d6/8d69eb6c32ed6acfd75f82d532144993.jpg'}, 'released_at': '2013-09-17', 'requirements_en': {'minimum': 'Minimum:OS: Windows 10 64 Bit, Windows 8.1 64 Bit, Windows 8 64 Bit, Windows 7 64 Bit Service Pack 1, Windows Vista 64 Bit Service Pack 2* (*NVIDIA video card recommended if running Vista OS)Processor: Intel Core 2 Quad CPU Q6600 @ 2.40GHz (4 CPUs) / AMD Phenom 9850 Quad-Core Processor (4 CPUs) @ 2.5GHzMemory: 4 GB RAMGraphics: NVIDIA 9800 GT 1GB / AMD HD 4870 1GB (DX 10, 10.1, 11)St

Use the `ast` (Abstract Syntax Trees) module, which helps process trees of the Python abstract syntax grammar. <br>

`ast.literal_eval()` safely evaluates an expression node or a string containing a Python literal or container display. The string or node provided may only consist of the following Python literal structures: <br>
strings, bytes, numbers, tuples, lists, dicts, sets, booleans, and `None`.
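
To illustrate the "safe" part, the sketch below parses a shortened, made-up cell value (`raw` is hypothetical, shaped like the `platforms` column) and then shows that input which is not a plain literal is rejected rather than executed.

```python
import ast

# A shortened, made-up cell value in the same shape as the `platforms` column
raw = "[{'platform': {'id': 4, 'name': 'PC'}, 'requirements_en': None}]"

parsed = ast.literal_eval(raw)
print(type(parsed).__name__, parsed[0]["platform"]["name"])

# Anything that is not a plain literal (e.g. a function call) raises ValueError
rejected = False
try:
    ast.literal_eval("__import__('os').getcwd()")
except ValueError:
    rejected = True
print("call rejected:", rejected)
```

`json.loads` would not work on these strings, because they use single quotes and `None` rather than JSON's double quotes and `null`.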

I will do a simple test for 1 row, say the second row:

In [27]:
import ast
ast.literal_eval(df3['platforms'][1])

[{'platform': {'id': 16,
   'name': 'PlayStation 3',
   'slug': 'playstation3',
   'image': None,
   'year_end': None,
   'year_start': None,
   'games_count': 3601,
   'image_background': 'https://media.rawg.io/media/games/1bb/1bb86c35ffa3eb0d299b01a7c65bf908.jpg'},
  'released_at': '2011-04-19',
  'requirements_en': None,
  'requirements_ru': None},
 {'platform': {'id': 4,
   'name': 'PC',
   'slug': 'pc',
   'image': None,
   'year_end': None,
   'year_start': None,
   'games_count': 236014,
   'image_background': 'https://media.rawg.io/media/games/7cf/7cfc9220b401b7a300e409e539c9afd5.jpg'},
  'released_at': '2011-04-19',
  'requirements_en': None,
  'requirements_ru': {'minimum': 'Core 2 Duo/Athlon X2 2 ГГц,1 Гб памяти,GeForce 7600/Radeon X800,10 Гб на винчестере,интернет-соединение',
   'recommended': 'Core 2 Duo/Athlon X2 2.5 ГГц,2 Гб памяти,GeForce GTX 280/Radeon HD 2600,10 Гб на винчестере,интернет-соединение'}},
 {'platform': {'id': 14,
   'name': 'Xbox 360',
   'slug': 'xbox3

It works for a single row! Now to compile all the platforms of the second row into a single list:

In [28]:
single_game_platform = []

# loop through every entry of the list in the second row of platforms
for platform in ast.literal_eval(df3['platforms'][1]):
    # print the platform name
    print(platform['platform']['name'])
    # add the name to the resultant list, replacing spaces with underscores
    single_game_platform.append(platform['platform']['name'].replace(" ", "_"))

single_game_platform

PlayStation 3
PC
Xbox 360
Linux
macOS


['PlayStation_3', 'PC', 'Xbox_360', 'Linux', 'macOS']

That's for one single row. Now to compile for every row in the dataframe:

In [29]:
# apply ast.literal_eval on every row's platform
df3['platforms'] = df3['platforms'].apply(ast.literal_eval)

In [30]:
platform_column = []

for platform in df3['platforms']:
    # initialize an empty list for each row
    platform_list = []
    # loop through the list of console names
    for console in platform:
        platform_list.append(console['platform']['name'].replace(" ", "_"))
    
    # add the list into the final platform_column list
    platform_column.append(platform_list)

# each item in this list is a list of consoles for one game
platform_column

[['PlayStation_5',
  'PC',
  'PlayStation_4',
  'PlayStation_3',
  'Xbox_360',
  'Xbox_One'],
 ['PlayStation_3', 'PC', 'Xbox_360', 'Linux', 'macOS'],
 ['PC', 'Xbox_One', 'Nintendo_Switch', 'PlayStation_4'],
 ['PC', 'PlayStation_4', 'PlayStation_3', 'Xbox_360', 'Xbox_One', 'macOS'],
 ['PC', 'PlayStation_3', 'Xbox_360', 'Nintendo_Switch'],
 ['PC', 'Xbox_360'],
 ['PC', 'PlayStation_3', 'Xbox_360', 'PlayStation_4', 'macOS', 'Xbox_One'],
 ['Linux', 'PC', 'PlayStation_3', 'PlayStation_4', 'Xbox_360', 'Xbox_One'],
 ['macOS', 'Android', 'PC', 'Linux', 'PlayStation_3', 'Xbox_360'],
 ['Android',
  'Xbox_360',
  'macOS',
  'iOS',
  'Linux',
  'Xbox_One',
  'PlayStation_4',
  'PC',
  'PlayStation_3'],
 ['PC', 'Xbox_360', 'PlayStation_3'],
 ['PC',
  'iOS',
  'Linux',
  'macOS',
  'Android',
  'Xbox_One',
  'Xbox_360',
  'PlayStation_3',
  'PlayStation_4',
  'Nintendo_Switch',
  'PS_Vita'],
 ['Xbox_One',
  'PC',
  'PlayStation_4',
  'PlayStation_3',
  'macOS',
  'iOS',
  'Xbox_360'],
 ['Android', 'L

In [31]:
len(platform_column)

10896

There is one list for each of the 10896 rows, so no row was skipped!
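
A length check confirms that there is one list per row, but not that every list is non-empty. The sketch below wraps the parse-and-extract steps in a helper and then looks for rows without any platform. `extract_names` and the sample `cells` are hypothetical and only mirror the shape of the `platforms` column.

```python
import ast

def extract_names(cell, key="platform"):
    """Return the nested names of one cell, with spaces replaced by underscores."""
    # Accept both the raw string and an already-parsed list
    entries = ast.literal_eval(cell) if isinstance(cell, str) else cell
    return [entry[key]["name"].replace(" ", "_") for entry in entries]

# Made-up cells in the same shape as the `platforms` column
cells = [
    "[{'platform': {'id': 16, 'name': 'PlayStation 3'}}, {'platform': {'id': 4, 'name': 'PC'}}]",
    "[]",
]

names_per_game = [extract_names(cell) for cell in cells]
empty_rows = [i for i, names in enumerate(names_per_game) if not names]

print(names_per_game)
print("rows without any platform:", empty_rows)
```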

Next, I will create a temporary dataframe with each console value regarded as a separate column for each row (converting a list into a series of columns):

In [32]:
# to create platform column to be installed into the main dataframe
temp = pd.DataFrame(platform_column, columns=['1','2','3','4','5','6','7','8','9','10','11','12','13','14','15','16','17','18','19','20','21','22'])

The code above only works with exactly 22 column names, because the longest list in `platform_column` has 22 values. Each position in a list needs a column of its own, and shorter lists are padded with missing values.
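
Rather than hard-coding the column names, the width can be derived from the data. This is a minimal sketch with a made-up stand-in for `platform_column`; the names `max_len` and `column_names` are introduced here for illustration only.

```python
import pandas as pd

# Made-up stand-in for platform_column
platform_column = [
    ["PlayStation_3", "PC", "Xbox_360"],
    ["PC"],
    ["iOS", "Android"],
]

# The longest list decides how many columns are needed
max_len = max(len(names) for names in platform_column)
column_names = [str(i) for i in range(1, max_len + 1)]

# Shorter lists are padded with missing values on the right
temp = pd.DataFrame(platform_column, columns=column_names)
print(temp.shape)
```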

In [33]:
# one column per list position; shorter lists are padded with missing values
temp

Unnamed: 0,1,2,3,4,5,6,7,8,9,10,...,13,14,15,16,17,18,19,20,21,22
0,PlayStation_5,PC,PlayStation_4,PlayStation_3,Xbox_360,Xbox_One,,,,,...,,,,,,,,,,
1,PlayStation_3,PC,Xbox_360,Linux,macOS,,,,,,...,,,,,,,,,,
2,PC,Xbox_One,Nintendo_Switch,PlayStation_4,,,,,,,...,,,,,,,,,,
3,PC,PlayStation_4,PlayStation_3,Xbox_360,Xbox_One,macOS,,,,,...,,,,,,,,,,
4,PC,PlayStation_3,Xbox_360,Nintendo_Switch,,,,,,,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
10891,iOS,,,,,,,,,,...,,,,,,,,,,
10892,PC,,,,,,,,,,...,,,,,,,,,,
10893,PC,,,,,,,,,,...,,,,,,,,,,
10894,PC,,,,,,,,,,...,,,,,,,,,,


Merge all these 22 columns into a single column:

In [34]:
# merge all 22 columns into 1, skipping the missing values in each row
temp['platforms2'] = temp.apply(lambda row: ', '.join(row.dropna().astype(str)), axis=1)

# remove the 22 original columns, keeping only platforms2
temp.drop(columns=temp.columns[:22], inplace=True)
temp

Unnamed: 0,platforms2
0,"PlayStation_5, PC, PlayStation_4, PlayStation_..."
1,"PlayStation_3, PC, Xbox_360, Linux, macOS"
2,"PC, Xbox_One, Nintendo_Switch, PlayStation_4"
3,"PC, PlayStation_4, PlayStation_3, Xbox_360, Xb..."
4,"PC, PlayStation_3, Xbox_360, Nintendo_Switch"
...,...
10891,iOS
10892,PC
10893,PC
10894,PC
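
The same comma-separated strings can also be built directly from the lists, without the intermediate 22-column dataframe. This is a sketch of an alternative route rather than the approach used in this notebook, and the sample `platform_column` is made up.

```python
# Made-up stand-in for platform_column
platform_column = [
    ["PlayStation_3", "PC", "Xbox_360", "Linux", "macOS"],
    ["PC"],
    [],
]

# Join each game's platform names into one string; an empty list gives ""
platforms2 = [", ".join(names) for names in platform_column]
print(platforms2)
```

Since the list has one entry per row, it could be assigned as a new column, provided it is still in the same row order as the dataframe.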


Add the newly-created column into the main dataframe:

In [35]:
df4 = df3.copy()
df4 = pd.concat([df4, temp], axis=1)
df4

Unnamed: 0,id,name,released,rating,rating_top,ratings,ratings_count,reviews_text_count,added,metacritic,playtime,suggestions_count,reviews_count,platforms,parent_platforms,genres,tags,platforms2
0,3498,Grand Theft Auto V,2013-09-17,4.48,5,"[{'id': 5, 'title': 'exceptional', 'count': 22...",3796,21,12383,97.0,68,422,3832,"[{'platform': {'id': 187, 'name': 'PlayStation...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","[{'id': 4, 'name': 'Action', 'slug': 'action',...","[{'id': 40836, 'name': 'Full controller suppor...","PlayStation_5, PC, PlayStation_4, PlayStation_..."
1,4200,Portal 2,2011-04-19,4.61,5,"[{'id': 5, 'title': 'exceptional', 'count': 22...",3229,16,10880,95.0,11,596,3253,"[{'platform': {'id': 16, 'name': 'PlayStation ...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","[{'id': 2, 'name': 'Shooter', 'slug': 'shooter...","[{'id': 40833, 'name': 'Captions available', '...","PlayStation_3, PC, Xbox_360, Linux, macOS"
2,3328,The Witcher 3: Wild Hunt,2015-05-18,4.67,5,"[{'id': 5, 'title': 'exceptional', 'count': 27...",3486,36,10603,93.0,52,680,3535,"[{'platform': {'id': 4, 'name': 'PC', 'slug': ...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","[{'id': 4, 'name': 'Action', 'slug': 'action',...","[{'id': 40836, 'name': 'Full controller suppor...","PC, Xbox_One, Nintendo_Switch, PlayStation_4"
3,5286,Tomb Raider (2013),2013-03-05,4.06,4,"[{'id': 4, 'title': 'recommended', 'count': 13...",2252,6,9844,86.0,11,680,2266,"[{'platform': {'id': 4, 'name': 'PC', 'slug': ...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","[{'id': 4, 'name': 'Action', 'slug': 'action',...","[{'id': 40836, 'name': 'Full controller suppor...","PC, PlayStation_4, PlayStation_3, Xbox_360, Xb..."
4,5679,The Elder Scrolls V: Skyrim,2011-11-11,4.41,5,"[{'id': 5, 'title': 'exceptional', 'count': 15...",2719,10,9717,94.0,44,626,2734,"[{'platform': {'id': 4, 'name': 'PC', 'slug': ...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","[{'id': 4, 'name': 'Action', 'slug': 'action',...","[{'id': 97, 'name': 'Action RPG', 'slug': 'act...","PC, PlayStation_3, Xbox_360, Nintendo_Switch"
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
10891,52200,Cheapshot,,4.83,5,"[{'id': 5, 'title': 'exceptional', 'count': 5,...",4,0,4,,0,102,6,"[{'platform': {'id': 3, 'name': 'iOS', 'slug':...","[{'platform': {'id': 4, 'name': 'iOS', 'slug':...","[{'id': 59, 'name': 'Massively Multiplayer', '...","[{'id': 40932, 'name': 'AR support', 'slug': '...",iOS
10892,387309,Stay Out,,3.33,5,"[{'id': 5, 'title': 'exceptional', 'count': 3,...",6,0,3,,0,481,6,"[{'platform': {'id': 4, 'name': 'PC', 'slug': ...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","[{'id': 4, 'name': 'Action', 'slug': 'action',...","[{'id': 40837, 'name': 'In-App Purchases', 'sl...",PC
10893,387303,Ray Eager,2019-11-11,4.83,5,"[{'id': 5, 'title': 'exceptional', 'count': 5,...",5,0,3,,0,258,6,"[{'platform': {'id': 4, 'name': 'PC', 'slug': ...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","[{'id': 4, 'name': 'Action', 'slug': 'action',...","[{'id': 40834, 'name': 'Commentary available',...",PC
10894,330541,Stalker Online,2019-10-15,2.83,3,"[{'id': 3, 'title': 'meh', 'count': 2, 'percen...",5,0,3,,0,624,6,"[{'platform': {'id': 4, 'name': 'PC', 'slug': ...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","[{'id': 4, 'name': 'Action', 'slug': 'action',...","[{'id': 14, 'name': 'Early Access', 'slug': 'e...",PC


In [None]:
# df4.to_csv('../Data/data3.csv', index=False)

In [None]:
# df4 = pd.read_csv('../Data/data3.csv')

### 3.2.2 Ratings

In [36]:
# check the array in the 1st row of the dataframe
df4['ratings'][0]

"[{'id': 5, 'title': 'exceptional', 'count': 2269, 'percent': 59.21}, {'id': 4, 'title': 'recommended', 'count': 1273, 'percent': 33.22}, {'id': 3, 'title': 'meh', 'count': 226, 'percent': 5.9}, {'id': 1, 'title': 'skip', 'count': 64, 'percent': 1.67}]"

`id` encodes the sentiment of each rating category: <br>
- 1 = skip
- 3 = meh
- 4 = recommended
- 5 = exceptional

The `id` and `count` of each entry therefore need to be extracted from this list.

Similarly, do a simple test for one single row:

In [37]:
import ast

single_game_ratings_id = []

for ratings in ast.literal_eval(df4['ratings'][1]):
    print(ratings['id'])
    single_game_ratings_id.append(ratings['id'])

single_game_ratings_id

5
4
3
1


[5, 4, 3, 1]

Now for the entire `ratings` column:

In [38]:
# apply ast.literal_eval on every row's ratings
df4['ratings'] = df4['ratings'].apply(ast.literal_eval)

In [39]:
ratings_id_column = []

for rating in df4['ratings']:
    # initialize an empty list for each row
    ratings_id_list = []
    # loop through the list of ratings
    # (the loop variable is named `entry` to avoid shadowing the built-in `id`)
    for entry in rating:
        ratings_id_list.append(entry['id'])
    
    # add the list into the final ratings_id_column list
    ratings_id_column.append(ratings_id_list)

# each item in this list is a list of rating ids for one game
ratings_id_column

[[5, 4, 3, 1],
 [5, 4, 3, 1],
 [5, 4, 3, 1],
 [4, 5, 3, 1],
 [5, 4, 3, 1],
 [4, 5, 3, 1],
 [4, 5, 3, 1],
 [5, 4, 3, 1],
 [5, 4, 3, 1],
 [5, 4, 3, 1],
 [4, 3, 5, 1],
 [4, 5, 3, 1],
 [5, 4, 3, 1],
 [5, 4, 3, 1],
 [4, 3, 5, 1],
 [5, 4, 3, 1],
 [5, 4, 3, 1],
 [4, 3, 1, 5],
 [4, 3, 5, 1],
 [4, 5, 3, 1],
 [3, 4, 1, 5],
 [5, 4, 3, 1],
 [5, 4, 3, 1],
 [4, 5, 3, 1],
 [5, 4, 3, 1],
 [4, 3, 5, 1],
 [4, 3, 1, 5],
 [4, 3, 5, 1],
 [4, 5, 3, 1],
 [5, 4, 3, 1],
 [4, 5, 3, 1],
 [4, 5, 3, 1],
 [4, 5, 3, 1],
 [5, 4, 3, 1],
 [4, 5, 3, 1],
 [4, 5, 3, 1],
 [4, 5, 3, 1],
 [5, 4, 3, 1],
 [4, 5, 3, 1],
 [4, 5, 3, 1],
 [5, 4, 3, 1],
 [4, 5, 3, 1],
 [4, 5, 3, 1],
 [5, 4, 3, 1],
 [4, 3, 1, 5],
 [4, 3, 5, 1],
 [4, 5, 3, 1],
 [4, 5, 3, 1],
 [4, 5, 3, 1],
 [4, 5, 3, 1],
 [4, 5, 3, 1],
 [5, 4, 3, 1],
 [4, 5, 3, 1],
 [4, 3, 5, 1],
 [4, 5, 3, 1],
 [5, 4, 3, 1],
 [5, 4, 3, 1],
 [4, 5, 3, 1],
 [5, 4, 3, 1],
 [5, 4, 3, 1],
 [5, 4, 3, 1],
 [5, 4, 3, 1],
 [4, 3, 5, 1],
 [4, 5, 3, 1],
 [4, 3, 1, 5],
 [5, 4, 3, 1],
 [4, 5, 3,

The `id` values are not in a consistent order from row to row, and some rows have only 3 `id` values instead of 4.
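
One way to cope with both the inconsistent order and the missing entries is to key each row's counts by `id` and fill the gaps. The sketch below assumes that a missing `id` means nobody chose that category, so its count is set to 0. `counts_by_id` and `RATING_IDS` are hypothetical names, and the second sample cell is made up.

```python
import ast

RATING_IDS = (5, 4, 3, 1)  # exceptional, recommended, meh, skip

def counts_by_id(cell):
    """Map every rating id to its count; ids absent from the cell get 0 (an assumption)."""
    entries = ast.literal_eval(cell) if isinstance(cell, str) else cell
    found = {entry["id"]: entry["count"] for entry in entries}
    return {rating_id: found.get(rating_id, 0) for rating_id in RATING_IDS}

cells = [
    # shortened from the first row shown earlier
    "[{'id': 5, 'count': 2269}, {'id': 4, 'count': 1273}, {'id': 3, 'count': 226}, {'id': 1, 'count': 64}]",
    # made-up row with only three ids, in a different order
    "[{'id': 4, 'count': 2}, {'id': 1, 'count': 1}, {'id': 3, 'count': 1}]",
]

rows = [counts_by_id(cell) for cell in cells]
print(rows)
```

Every row then has the same four keys in the same order, which makes it straightforward to turn them into four fixed columns.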

In [40]:
len(ratings_id_column)

10896

There is one list per row, good! <br>
Now to create a separate column with `count`:

In [41]:
ratings_count_column = []

for rating in df4['ratings']:
    # initialize an empty list for each row
    ratings_count_list = []
    # loop through the list of ratings
    for entry in rating:
        ratings_count_list.append(entry['count'])
    
    # add the list into the final ratings_count_column list
    ratings_count_column.append(ratings_count_list)

# each item in this list is a list of counts for one game
ratings_count_column

[[2269, 1273, 226, 64],
 [2266, 841, 85, 61],
 [2757, 555, 144, 79],
 [1367, 579, 259, 61],
 [1539, 899, 240, 56],
 [1005, 589, 223, 59],
 [969, 649, 329, 75],
 [1379, 822, 222, 64],
 [1663, 922, 102, 48],
 [959, 831, 312, 144],
 [883, 501, 283, 185],
 [1065, 703, 195, 69],
 [1125, 772, 142, 60],
 [1406, 643, 127, 59],
 [795, 438, 346, 155],
 [1860, 504, 157, 80],
 [1180, 817, 159, 50],
 [620, 336, 137, 124],
 [958, 546, 385, 117],
 [746, 682, 178, 44],
 [496, 446, 371, 228],
 [1958, 489, 98, 88],
 [678, 560, 133, 52],
 [913, 453, 214, 68],
 [731, 608, 221, 77],
 [727, 465, 227, 106],
 [329, 175, 106, 102],
 [509, 377, 179, 147],
 [647, 312, 188, 82],
 [1173, 781, 199, 60],
 [641, 484, 92, 43],
 [557, 392, 204, 69],
 [941, 275, 273, 57],
 [595, 437, 65, 34],
 [925, 356, 219, 42],
 [677, 279, 198, 62],
 [578, 406, 136, 64],
 [643, 400, 44, 39],
 [727, 595, 169, 36],
 [609, 362, 134, 55],
 [1510, 617, 106, 63],
 [335, 154, 132, 98],
 [731, 388, 174, 37],
 [803, 579, 114, 56],
 [230, 156,

Combine these two lists with their corresponding `id` and `count` using `zip()`:

In [42]:
combined = []

for i in range(len(ratings_count_column)):
    combined.append(list(zip(ratings_id_column[i], ratings_count_column[i])))

combined

[[(5, 2269), (4, 1273), (3, 226), (1, 64)],
 [(5, 2266), (4, 841), (3, 85), (1, 61)],
 [(5, 2757), (4, 555), (3, 144), (1, 79)],
 [(4, 1367), (5, 579), (3, 259), (1, 61)],
 [(5, 1539), (4, 899), (3, 240), (1, 56)],
 [(4, 1005), (5, 589), (3, 223), (1, 59)],
 [(4, 969), (5, 649), (3, 329), (1, 75)],
 [(5, 1379), (4, 822), (3, 222), (1, 64)],
 [(5, 1663), (4, 922), (3, 102), (1, 48)],
 [(5, 959), (4, 831), (3, 312), (1, 144)],
 [(4, 883), (3, 501), (5, 283), (1, 185)],
 [(4, 1065), (5, 703), (3, 195), (1, 69)],
 [(5, 1125), (4, 772), (3, 142), (1, 60)],
 [(5, 1406), (4, 643), (3, 127), (1, 59)],
 [(4, 795), (3, 438), (5, 346), (1, 155)],
 [(5, 1860), (4, 504), (3, 157), (1, 80)],
 [(5, 1180), (4, 817), (3, 159), (1, 50)],
 [(4, 620), (3, 336), (1, 137), (5, 124)],
 [(4, 958), (3, 546), (5, 385), (1, 117)],
 [(4, 746), (5, 682), (3, 178), (1, 44)],
 [(3, 496), (4, 446), (1, 371), (5, 228)],
 [(5, 1958), (4, 489), (3, 98), (1, 88)],
 [(5, 678), (4, 560), (3, 133), (1, 52)],
 [(4, 913), (5,

In [43]:
# check for missing values
len(combined)

10896

All 10896 rows are accounted for!
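
The two intermediate lists and the `zip()` step can probably be collapsed into a single pass. This is only a sketch, assuming each parsed rating is a dict with `id` and `count` keys as in the rows shown earlier; `sample_ratings` is a made-up stand-in for `df4['ratings']`:

```python
# Toy stand-in for the parsed df4['ratings'] column (illustrative values)
sample_ratings = [
    [{'id': 5, 'title': 'exceptional', 'count': 10},
     {'id': 4, 'title': 'recommended', 'count': 7}],
    [{'id': 3, 'title': 'meh', 'count': 2}],
]

# Build the (id, count) pairs directly from each rating dict
pairs = [[(r['id'], r['count']) for r in row] for row in sample_ratings]

print(pairs)
```

The result has the same shape as `combined`: one list of `(id, count)` tuples per game.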

Now sort each row's tuples so that the lowest `id` comes first. The `id` is the first element of each tuple (index 0), so that is the sort key.

In [44]:
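# build a sorted copy of each row so that `combined` itself is left unchanged
# (list.copy() is shallow: the inner lists would be shared and sorted in place)
combined_sorted = [sorted(row, key=lambda tup: tup[0]) for row in combined]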

combined_sorted

[[(1, 64), (3, 226), (4, 1273), (5, 2269)],
 [(1, 61), (3, 85), (4, 841), (5, 2266)],
 [(1, 79), (3, 144), (4, 555), (5, 2757)],
 [(1, 61), (3, 259), (4, 1367), (5, 579)],
 [(1, 56), (3, 240), (4, 899), (5, 1539)],
 [(1, 59), (3, 223), (4, 1005), (5, 589)],
 [(1, 75), (3, 329), (4, 969), (5, 649)],
 [(1, 64), (3, 222), (4, 822), (5, 1379)],
 [(1, 48), (3, 102), (4, 922), (5, 1663)],
 [(1, 144), (3, 312), (4, 831), (5, 959)],
 [(1, 185), (3, 501), (4, 883), (5, 283)],
 [(1, 69), (3, 195), (4, 1065), (5, 703)],
 [(1, 60), (3, 142), (4, 772), (5, 1125)],
 [(1, 59), (3, 127), (4, 643), (5, 1406)],
 [(1, 155), (3, 438), (4, 795), (5, 346)],
 [(1, 80), (3, 157), (4, 504), (5, 1860)],
 [(1, 50), (3, 159), (4, 817), (5, 1180)],
 [(1, 137), (3, 336), (4, 620), (5, 124)],
 [(1, 117), (3, 546), (4, 958), (5, 385)],
 [(1, 44), (3, 178), (4, 746), (5, 682)],
 [(1, 371), (3, 496), (4, 446), (5, 228)],
 [(1, 88), (3, 98), (4, 489), (5, 1958)],
 [(1, 52), (3, 133), (4, 560), (5, 678)],
 [(1, 68), (3, 
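
One detail worth keeping in mind when sorting nested lists: `list.copy()` copies only the outer list, so the inner lists are still shared with the original. The sketch below uses made-up data to show the difference between sorting a shallow copy in place and building new sorted rows with `sorted()`:

```python
# Sorting in place through a shallow copy also changes the original
nested = [[(5, 2), (1, 9)], [(4, 1), (3, 6)]]
shallow = nested.copy()              # new outer list, same inner lists
for row in shallow:
    row.sort(key=lambda tup: tup[0])

# sorted() returns a new list per row, so the original stays untouched
original = [[(5, 2), (1, 9)], [(4, 1), (3, 6)]]
independent = [sorted(row, key=lambda tup: tup[0]) for row in original]

print(nested[0])       # sorted, even though only `shallow` was modified
print(original[0])     # still in its original order
print(independent[0])
```

This does not matter if the unsorted list is never used again, but it can be surprising when it is.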

Create a small dataframe with the ratings information (to be added to the main dataframe):

In [45]:
# creates a ratings dataframe to be merged into the main dataframe
temp_ratings = pd.DataFrame(combined_sorted, columns=['1st', '2nd', '3rd', '4th'])
temp_ratings

Unnamed: 0,1st,2nd,3rd,4th
0,"(1, 64)","(3, 226)","(4, 1273)","(5, 2269)"
1,"(1, 61)","(3, 85)","(4, 841)","(5, 2266)"
2,"(1, 79)","(3, 144)","(4, 555)","(5, 2757)"
3,"(1, 61)","(3, 259)","(4, 1367)","(5, 579)"
4,"(1, 56)","(3, 240)","(4, 899)","(5, 1539)"
...,...,...,...,...
10891,"(4, 1)","(5, 5)",,
10892,"(1, 2)","(3, 1)","(5, 3)",
10893,"(4, 1)","(5, 5)",,
10894,"(1, 2)","(3, 2)","(4, 1)","(5, 1)"


In [46]:
# merges all 4 columns into 1, skipping the empty cells of games with fewer ratings
temp_ratings['ratings2'] = temp_ratings.apply(lambda x: ', '.join(x.dropna().astype(str)), axis=1)

# remove the 4 original columns, which are no longer needed
temp_ratings.drop(columns=['1st', '2nd', '3rd', '4th'], inplace=True)
temp_ratings

Unnamed: 0,ratings2
0,"(1, 64), (3, 226), (4, 1273), (5, 2269)"
1,"(1, 61), (3, 85), (4, 841), (5, 2266)"
2,"(1, 79), (3, 144), (4, 555), (5, 2757)"
3,"(1, 61), (3, 259), (4, 1367), (5, 579)"
4,"(1, 56), (3, 240), (4, 899), (5, 1539)"
...,...
10891,"(4, 1), (5, 5)"
10892,"(1, 2), (3, 1), (5, 3)"
10893,"(4, 1), (5, 5)"
10894,"(1, 2), (3, 2), (4, 1), (5, 1)"
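
To see why games with fewer ratings end up with shorter strings rather than trailing `None` entries, here is a small sketch of the same `dropna()` and `join` step on made-up rows (`toy` and its column names are illustrative only):

```python
import pandas as pd

# Two games: one with three (id, count) pairs, one with two
rows = [[(1, 2), (3, 1), (5, 3)], [(4, 1), (5, 5)]]
toy = pd.DataFrame(rows, columns=['1st', '2nd', '3rd'])

# For each row: drop the empty cells, turn the tuples into text, join with commas
merged = toy.apply(lambda x: ', '.join(x.dropna().astype(str)), axis=1)

print(merged.tolist())
```

The shorter row is padded with `None` when the dataframe is built, and `dropna()` removes that padding again before the join.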


### 3.2.3 Genres

In [47]:
# check the array in the 1st row of the dataframe
df4['genres'][0]

"[{'id': 4, 'name': 'Action', 'slug': 'action', 'games_count': 90029, 'image_background': 'https://media.rawg.io/media/games/c4b/c4b0cab189e73432de3a250d8cf1c84e.jpg'}, {'id': 3, 'name': 'Adventure', 'slug': 'adventure', 'games_count': 62681, 'image_background': 'https://media.rawg.io/media/games/2ad/2add0a457df49ab5efe2b5f754ef924f.jpg'}]"

In [48]:
import ast

# apply ast.literal_eval on every row's genres
df4['genres'] = df4['genres'].apply(ast.literal_eval)

In [49]:
genres_column = []

for genres in df4['genres']:
    # initialize an empty list for each row
    genre_list = []
    # loop through the list of genre names
    for name in genres:
        genre_list.append(name['name'])

    # add the list into the final genres_column list
    genres_column.append(genre_list)

# each item in this list is a list of genres for one game
genres_column

[['Action', 'Adventure'],
 ['Shooter', 'Puzzle'],
 ['Action', 'Adventure', 'RPG'],
 ['Action', 'Adventure'],
 ['Action', 'RPG'],
 ['Action', 'Shooter'],
 ['Action', 'Shooter', 'RPG'],
 ['Action', 'Shooter'],
 ['Action', 'Adventure', 'Puzzle'],
 ['Adventure'],
 ['Action', 'Shooter'],
 ['Action', 'Adventure', 'Indie', 'Puzzle', 'Platformer'],
 ['Action', 'Shooter'],
 ['Action', 'Shooter'],
 ['Action', 'Shooter'],
 ['Action', 'Adventure'],
 ['Action', 'Shooter'],
 ['Action', 'Shooter'],
 ['Action', 'RPG'],
 ['Action', 'Adventure'],
 ['Action', 'Massively Multiplayer'],
 ['Action', 'Adventure'],
 ['Action', 'Adventure'],
 ['Action', 'Sports', 'Racing', 'Indie'],
 ['Action', 'Shooter'],
 ['Action', 'Shooter', 'Massively Multiplayer'],
 ['Action'],
 ['Action', 'Shooter', 'Massively Multiplayer'],
 ['Action', 'Shooter'],
 ['Action', 'Adventure', 'RPG'],
 ['RPG'],
 ['Action', 'Indie', 'Platformer'],
 ['Action', 'RPG'],
 ['Action', 'Shooter'],
 ['Action', 'Platformer'],
 ['Action', 'Shooter'],


In [50]:
len(genres_column)

10896

All 10896 rows are filled!

Create the small dataframe to be added into the main dataframe:

In [51]:
# creates a genres dataframe to be merged into the main dataframe
temp_genres = pd.DataFrame(genres_column, columns=[str(i) for i in range(1, 20)])
temp_genres

Unnamed: 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19
0,Action,Adventure,,,,,,,,,,,,,,,,,
1,Shooter,Puzzle,,,,,,,,,,,,,,,,,
2,Action,Adventure,RPG,,,,,,,,,,,,,,,,
3,Action,Adventure,,,,,,,,,,,,,,,,,
4,Action,RPG,,,,,,,,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
10891,Massively Multiplayer,,,,,,,,,,,,,,,,,,
10892,Action,RPG,Massively Multiplayer,Indie,,,,,,,,,,,,,,,
10893,Action,Massively Multiplayer,,,,,,,,,,,,,,,,,
10894,Action,RPG,Massively Multiplayer,Indie,,,,,,,,,,,,,,,


The most genres assigned to any single game is 19, so the `temp_genres` dataframe has 19 columns.
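
The column count does not have to be hard-coded. As a minimal sketch, using a small made-up list (`genres_sample`) in place of `genres_column`, it could be derived from the data:

```python
# Toy stand-in for genres_column (illustrative values)
genres_sample = [
    ['Action', 'Adventure'],
    ['RPG'],
    ['Action', 'Shooter', 'Massively Multiplayer'],
]

# The widest row decides how many columns are needed
max_genres = max(len(genres) for genres in genres_sample)

# Column names '1', '2', ..., up to the widest row
column_names = [str(i) for i in range(1, max_genres + 1)]

print(max_genres, column_names)
```

Applied to `genres_column`, `max_genres` should come out as 19 if the statement above holds.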

In [52]:
# merges all 19 columns into 1
temp_genres['genres2'] = temp_genres.apply(lambda x: ', '.join(x.dropna().astype(str)), axis=1)

# remove the 19 original columns, which are no longer needed
temp_genres.drop(columns=temp_genres.columns[:19], inplace=True)
temp_genres

Unnamed: 0,genres2
0,"Action, Adventure"
1,"Shooter, Puzzle"
2,"Action, Adventure, RPG"
3,"Action, Adventure"
4,"Action, RPG"
...,...
10891,Massively Multiplayer
10892,"Action, RPG, Massively Multiplayer, Indie"
10893,"Action, Massively Multiplayer"
10894,"Action, RPG, Massively Multiplayer, Indie"


### 3.2.4 Tags
Same process for `tags`.

In [53]:
df4['tags'][0]

"[{'id': 40836, 'name': 'Full controller support', 'slug': 'full-controller-support', 'language': 'eng', 'games_count': 8589, 'image_background': 'https://media.rawg.io/media/games/157/15742f2f67eacff546738e1ab5c19d20.jpg'}, {'id': 40847, 'name': 'Steam Achievements', 'slug': 'steam-achievements', 'language': 'eng', 'games_count': 18291, 'image_background': 'https://media.rawg.io/media/games/6cd/6cd653e0aaef5ff8bbd295bf4bcb12eb.jpg'}, {'id': 13, 'name': 'Atmospheric', 'slug': 'atmospheric', 'language': 'eng', 'games_count': 8429, 'image_background': 'https://media.rawg.io/media/games/81b/81b138691f027ed1f8720758daa0d895.jpg'}, {'id': 123, 'name': 'Comedy', 'slug': 'comedy', 'language': 'eng', 'games_count': 4390, 'image_background': 'https://media.rawg.io/media/games/c89/c89ca70716080733d03724277df2c6c7.jpg'}, {'id': 18, 'name': 'Co-op', 'slug': 'co-op', 'language': 'eng', 'games_count': 5254, 'image_background': 'https://media.rawg.io/media/games/48c/48cb04ca483be865e3a83119c94e6097.j

In [54]:
import ast

# apply ast.literal_eval on every row's tags
df4['tags'] = df4['tags'].apply(ast.literal_eval)

In [55]:
tags_column = []

for tags in df4['tags']:
    # initialize an empty list for each row
    tags_list = []
    # loop through the list of tag names
    for name in tags:
        tags_list.append(name['name'])

    # add the list into the final tags_column list
    tags_column.append(tags_list)

# each item in this list is a list of tags for one game
tags_column

[['Full controller support',
  'Steam Achievements',
  'Atmospheric',
  'Comedy',
  'Co-op',
  'Crime',
  'First-Person',
  'Funny',
  'Great Soundtrack',
  'Moddable',
  'Multiplayer',
  'Open World',
  'RPG',
  'Sandbox',
  'Singleplayer',
  'Third Person',
  'Third-Person Shooter',
  'cooperative'],
 ['Captions available',
  'Commentary available',
  'Full controller support',
  'Includes level editor',
  'Steam Achievements',
  'Steam Cloud',
  'Atmospheric',
  'Steam Workshop',
  'Comedy',
  'Co-op',
  'Female Protagonist',
  'First-Person',
  'FPS',
  'Funny',
  'Local Co-Op',
  'Multiplayer',
  'Online Co-Op',
  'Science',
  'Sci-fi',
  'Singleplayer',
  'Space',
  'Story Rich',
  'cooperative',
  'steam-trading-cards',
  'stats'],
 ['Full controller support',
  'Action RPG',
  'Atmospheric',
  'Choices Matter',
  'Dark',
  'Dark Fantasy',
  'Fantasy',
  'Great Soundtrack',
  'Magic',
  'Mature',
  'Medieval',
  'Multiple Endings',
  'Nudity',
  'Open World',
  'RPG',
  'Sandbox

In [56]:
# check for missing values
len(tags_column)

10896

All 10896 rows are filled! :)

In [57]:
# there will be 55 columns in this df

# custom function to create a list of column names '1', '2', ..., 'x' (inclusive)
def number_list_gen(x):
    return [str(i) for i in range(1, x + 1)]

In [58]:
# creates a tags dataframe to be merged into the main dataframe
temp_tags = pd.DataFrame(tags_column, columns=number_list_gen(55))
temp_tags

Unnamed: 0,1,2,3,4,5,6,7,8,9,10,...,46,47,48,49,50,51,52,53,54,55
0,Full controller support,Steam Achievements,Atmospheric,Comedy,Co-op,Crime,First-Person,Funny,Great Soundtrack,Moddable,...,,,,,,,,,,
1,Captions available,Commentary available,Full controller support,Includes level editor,Steam Achievements,Steam Cloud,Atmospheric,Steam Workshop,Comedy,Co-op,...,,,,,,,,,,
2,Full controller support,Action RPG,Atmospheric,Choices Matter,Dark,Dark Fantasy,Fantasy,Great Soundtrack,Magic,Mature,...,,,,,,,,,,
3,Full controller support,Action-Adventure,Atmospheric,Cinematic,Classic,Dinosaurs,Exploration,Female Protagonist,Lara Croft,Multiplayer,...,,,,,,,,,,
4,Action RPG,Partial Controller Support,Steam Achievements,Steam Cloud,Atmospheric,Steam Workshop,Character Customization,Dark Fantasy,Dragons,Fantasy,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
10891,AR support,,,,,,,,,,...,,,,,,,,,,
10892,In-App Purchases,Early Access,Free to Play,RPG,Violent,Online multiplayer,mmo,,,,...,,,,,,,,,,
10893,Commentary available,Includes level editor,Steam Achievements,Steam Leaderboards,Early Access,Multiplayer,Online multiplayer,mmo,stats,,...,,,,,,,,,,
10894,Early Access,Free to Play,RPG,Violent,Online multiplayer,mmo,,,,,...,,,,,,,,,,


The most tags assigned to any single game is 55, so the `temp_tags` dataframe has 55 columns.

In [59]:
# merges all 55 columns into 1
temp_tags['tags2'] = temp_tags.apply(lambda x: ', '.join(x.dropna().astype(str)), axis=1)

# remove the 55 original columns, which are no longer needed
temp_tags.drop(columns=temp_tags.columns[:55], inplace=True)
temp_tags

Unnamed: 0,tags2
0,"Full controller support, Steam Achievements, A..."
1,"Captions available, Commentary available, Full..."
2,"Full controller support, Action RPG, Atmospher..."
3,"Full controller support, Action-Adventure, Atm..."
4,"Action RPG, Partial Controller Support, Steam ..."
...,...
10891,AR support
10892,"In-App Purchases, Early Access, Free to Play, ..."
10893,"Commentary available, Includes level editor, S..."
10894,"Early Access, Free to Play, RPG, Violent, Onli..."
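
Since each game's tags are already a plain Python list, the wide intermediate dataframe is not strictly needed. A possible shortcut, sketched here on made-up data (`tags_sample` stands in for `tags_column`), joins each list directly:

```python
import pandas as pd

# Toy stand-in for tags_column (illustrative values)
tags_sample = [
    ['Early Access', 'Free to Play', 'RPG'],
    ['AR support'],
    [],   # a game without tags becomes an empty string
]

# One comma-separated string per game
tags2 = pd.Series([', '.join(tags) for tags in tags_sample], name='tags2')

print(tags2.tolist())
```

This should give the same strings as the merge above, without having to know the maximum number of tags in advance.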


Now to merge the `temp_ratings`, `temp_genres` and `temp_tags` dataframes into the main dataframe!

In [60]:
df5 = df4.copy()
df5 = pd.concat([df5, temp_ratings, temp_genres, temp_tags], axis=1)
df5

Unnamed: 0,id,name,released,rating,rating_top,ratings,ratings_count,reviews_text_count,added,metacritic,...,suggestions_count,reviews_count,platforms,parent_platforms,genres,tags,platforms2,ratings2,genres2,tags2
0,3498,Grand Theft Auto V,2013-09-17,4.48,5,"[{'id': 5, 'title': 'exceptional', 'count': 22...",3796,21,12383,97.0,...,422,3832,"[{'platform': {'id': 187, 'name': 'PlayStation...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","[{'id': 4, 'name': 'Action', 'slug': 'action',...","[{'id': 40836, 'name': 'Full controller suppor...","PlayStation_5, PC, PlayStation_4, PlayStation_...","(1, 64), (3, 226), (4, 1273), (5, 2269)","Action, Adventure","Full controller support, Steam Achievements, A..."
1,4200,Portal 2,2011-04-19,4.61,5,"[{'id': 5, 'title': 'exceptional', 'count': 22...",3229,16,10880,95.0,...,596,3253,"[{'platform': {'id': 16, 'name': 'PlayStation ...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","[{'id': 2, 'name': 'Shooter', 'slug': 'shooter...","[{'id': 40833, 'name': 'Captions available', '...","PlayStation_3, PC, Xbox_360, Linux, macOS","(1, 61), (3, 85), (4, 841), (5, 2266)","Shooter, Puzzle","Captions available, Commentary available, Full..."
2,3328,The Witcher 3: Wild Hunt,2015-05-18,4.67,5,"[{'id': 5, 'title': 'exceptional', 'count': 27...",3486,36,10603,93.0,...,680,3535,"[{'platform': {'id': 4, 'name': 'PC', 'slug': ...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","[{'id': 4, 'name': 'Action', 'slug': 'action',...","[{'id': 40836, 'name': 'Full controller suppor...","PC, Xbox_One, Nintendo_Switch, PlayStation_4","(1, 79), (3, 144), (4, 555), (5, 2757)","Action, Adventure, RPG","Full controller support, Action RPG, Atmospher..."
3,5286,Tomb Raider (2013),2013-03-05,4.06,4,"[{'id': 4, 'title': 'recommended', 'count': 13...",2252,6,9844,86.0,...,680,2266,"[{'platform': {'id': 4, 'name': 'PC', 'slug': ...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","[{'id': 4, 'name': 'Action', 'slug': 'action',...","[{'id': 40836, 'name': 'Full controller suppor...","PC, PlayStation_4, PlayStation_3, Xbox_360, Xb...","(1, 61), (3, 259), (4, 1367), (5, 579)","Action, Adventure","Full controller support, Action-Adventure, Atm..."
4,5679,The Elder Scrolls V: Skyrim,2011-11-11,4.41,5,"[{'id': 5, 'title': 'exceptional', 'count': 15...",2719,10,9717,94.0,...,626,2734,"[{'platform': {'id': 4, 'name': 'PC', 'slug': ...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","[{'id': 4, 'name': 'Action', 'slug': 'action',...","[{'id': 97, 'name': 'Action RPG', 'slug': 'act...","PC, PlayStation_3, Xbox_360, Nintendo_Switch","(1, 56), (3, 240), (4, 899), (5, 1539)","Action, RPG","Action RPG, Partial Controller Support, Steam ..."
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
10891,52200,Cheapshot,,4.83,5,"[{'id': 5, 'title': 'exceptional', 'count': 5,...",4,0,4,,...,102,6,"[{'platform': {'id': 3, 'name': 'iOS', 'slug':...","[{'platform': {'id': 4, 'name': 'iOS', 'slug':...","[{'id': 59, 'name': 'Massively Multiplayer', '...","[{'id': 40932, 'name': 'AR support', 'slug': '...",iOS,"(4, 1), (5, 5)",Massively Multiplayer,AR support
10892,387309,Stay Out,,3.33,5,"[{'id': 5, 'title': 'exceptional', 'count': 3,...",6,0,3,,...,481,6,"[{'platform': {'id': 4, 'name': 'PC', 'slug': ...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","[{'id': 4, 'name': 'Action', 'slug': 'action',...","[{'id': 40837, 'name': 'In-App Purchases', 'sl...",PC,"(1, 2), (3, 1), (5, 3)","Action, RPG, Massively Multiplayer, Indie","In-App Purchases, Early Access, Free to Play, ..."
10893,387303,Ray Eager,2019-11-11,4.83,5,"[{'id': 5, 'title': 'exceptional', 'count': 5,...",5,0,3,,...,258,6,"[{'platform': {'id': 4, 'name': 'PC', 'slug': ...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","[{'id': 4, 'name': 'Action', 'slug': 'action',...","[{'id': 40834, 'name': 'Commentary available',...",PC,"(4, 1), (5, 5)","Action, Massively Multiplayer","Commentary available, Includes level editor, S..."
10894,330541,Stalker Online,2019-10-15,2.83,3,"[{'id': 3, 'title': 'meh', 'count': 2, 'percen...",5,0,3,,...,624,6,"[{'platform': {'id': 4, 'name': 'PC', 'slug': ...","[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","[{'id': 4, 'name': 'Action', 'slug': 'action',...","[{'id': 14, 'name': 'Early Access', 'slug': 'e...",PC,"(1, 2), (3, 2), (4, 1), (5, 1)","Action, RPG, Massively Multiplayer, Indie","Early Access, Free to Play, RPG, Violent, Onli..."
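
`pd.concat(..., axis=1)` lines rows up by index label, not by position. The merge above works because the temporary dataframes use the default 0, 1, 2, ... index and, judging by the output, `df4` does too (an assumption about `df4`). The sketch below, on made-up data, shows what could happen otherwise and one way to guard against it:

```python
import pandas as pd

# A main dataframe whose index has gaps (e.g. after filtering rows)
main = pd.DataFrame({'name': ['A', 'B', 'C']}, index=[0, 2, 5])
# A helper dataframe with the default index 0, 1, 2
extra = pd.DataFrame({'genres2': ['Action', 'RPG', 'Puzzle']})

# Index labels differ, so rows are not paired up and NaN values appear
misaligned = pd.concat([main, extra], axis=1)

# Resetting the index first pairs the rows by position
aligned = pd.concat([main.reset_index(drop=True), extra], axis=1)

print(misaligned)
print(aligned)
```

Checking that the merged dataframe has the same number of rows as `df4` is a cheap way to confirm the alignment.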


# 4. Clean Up Main Dataframe

In [61]:
# removes unwanted unexpanded columns
df5.drop(labels=['ratings', 'platforms', 'genres', 'tags'], axis=1, inplace=True)

# rename newly-created columns for better readability
df5.rename(columns={'platforms2': 'platforms', 'ratings2': 'ratings',
                    'genres2': 'genres', 'tags2': 'tags'}, inplace=True)

df5.head()

Unnamed: 0,id,name,released,rating,rating_top,ratings_count,reviews_text_count,added,metacritic,playtime,suggestions_count,reviews_count,parent_platforms,platforms,ratings,genres,tags
0,3498,Grand Theft Auto V,2013-09-17,4.48,5,3796,21,12383,97.0,68,422,3832,"[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","PlayStation_5, PC, PlayStation_4, PlayStation_...","(1, 64), (3, 226), (4, 1273), (5, 2269)","Action, Adventure","Full controller support, Steam Achievements, A..."
1,4200,Portal 2,2011-04-19,4.61,5,3229,16,10880,95.0,11,596,3253,"[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","PlayStation_3, PC, Xbox_360, Linux, macOS","(1, 61), (3, 85), (4, 841), (5, 2266)","Shooter, Puzzle","Captions available, Commentary available, Full..."
2,3328,The Witcher 3: Wild Hunt,2015-05-18,4.67,5,3486,36,10603,93.0,52,680,3535,"[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","PC, Xbox_One, Nintendo_Switch, PlayStation_4","(1, 79), (3, 144), (4, 555), (5, 2757)","Action, Adventure, RPG","Full controller support, Action RPG, Atmospher..."
3,5286,Tomb Raider (2013),2013-03-05,4.06,4,2252,6,9844,86.0,11,680,2266,"[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","PC, PlayStation_4, PlayStation_3, Xbox_360, Xb...","(1, 61), (3, 259), (4, 1367), (5, 579)","Action, Adventure","Full controller support, Action-Adventure, Atm..."
4,5679,The Elder Scrolls V: Skyrim,2011-11-11,4.41,5,2719,10,9717,94.0,44,626,2734,"[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","PC, PlayStation_3, Xbox_360, Nintendo_Switch","(1, 56), (3, 240), (4, 899), (5, 1539)","Action, RPG","Action RPG, Partial Controller Support, Steam ..."


Save the newly-processed dataframe.

In [62]:
df5.to_csv('../Data/data4.csv', index=False)

In [None]:
df5 = pd.read_csv('../Data/data4.csv')
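
After the round trip through CSV, the `ratings` column holds plain text such as `(1, 64), (3, 226)`. If the `(id, count)` pairs are needed as numbers again later, they could be parsed back with `ast.literal_eval`. This is a sketch only; `parse_ratings` is a hypothetical helper, not part of the notebook:

```python
import ast

def parse_ratings(text):
    """Turn a string like '(1, 64), (3, 226)' back into a list of tuples."""
    # Wrapping in brackets keeps a single pair such as '(4, 1)' from being
    # read as one flat tuple (4, 1) instead of a list holding one pair
    return list(ast.literal_eval('[' + text + ']'))

print(parse_ratings('(1, 64), (3, 226), (4, 1273), (5, 2269)'))
print(parse_ratings('(3, 5)'))
```

It could be applied with `df5['ratings'].apply(parse_ratings)`, assuming the column has no missing values.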

# 5. Add in Additional Information

## 5.1 Check for Missing Data

In [63]:
df5.isnull().sum()

id                       0
name                     0
released               152
rating                   0
rating_top               0
ratings_count            0
reviews_text_count       0
added                    0
metacritic            7591
playtime                 0
suggestions_count        0
reviews_count            0
parent_platforms         0
platforms                0
ratings                  0
genres                   0
tags                     0
dtype: int64

There are still 152 missing `released` dates and 7591 missing `metacritic` scores.
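
Out of 10896 rows, that is roughly 1.4% of release dates and 69.7% of Metacritic scores. The shares can also be computed directly, as in this minimal sketch on a made-up dataframe (`toy` is illustrative only):

```python
import pandas as pd

toy = pd.DataFrame({
    'released': ['2013-09-17', None, '2011-04-19', '2015-05-18'],
    'metacritic': [97.0, None, None, 93.0],
})

# isnull() gives True/False per cell; the column mean is the share of missing values
missing_share = toy.isnull().mean() * 100

print(missing_share)
```

With so many scores missing, dropping every incomplete row would discard most of the games, which is presumably why the `dropna` call below is left commented out.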

In [None]:
# Removes rows with missing data
# df5.dropna(inplace=True)

## 5.2 Add in Year of Release

In [64]:
# filter out rows with a missing release date
# (.copy() makes df5 an independent dataframe, so the assignments below don't raise SettingWithCopyWarning)
df5 = df5[df5['released'].notnull()].copy()

# convert dtype
df5['released'] = pd.to_datetime(df5['released'])
df5['released'].dtype

dtype('<M8[ns]')

In [65]:
# create Year column
df5['Year'] = df5['released'].dt.year
df5


Unnamed: 0,id,name,released,rating,rating_top,ratings_count,reviews_text_count,added,metacritic,playtime,suggestions_count,reviews_count,parent_platforms,platforms,ratings,genres,tags,Year
0,3498,Grand Theft Auto V,2013-09-17,4.48,5,3796,21,12383,97.0,68,422,3832,"[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","PlayStation_5, PC, PlayStation_4, PlayStation_...","(1, 64), (3, 226), (4, 1273), (5, 2269)","Action, Adventure","Full controller support, Steam Achievements, A...",2013
1,4200,Portal 2,2011-04-19,4.61,5,3229,16,10880,95.0,11,596,3253,"[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","PlayStation_3, PC, Xbox_360, Linux, macOS","(1, 61), (3, 85), (4, 841), (5, 2266)","Shooter, Puzzle","Captions available, Commentary available, Full...",2011
2,3328,The Witcher 3: Wild Hunt,2015-05-18,4.67,5,3486,36,10603,93.0,52,680,3535,"[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","PC, Xbox_One, Nintendo_Switch, PlayStation_4","(1, 79), (3, 144), (4, 555), (5, 2757)","Action, Adventure, RPG","Full controller support, Action RPG, Atmospher...",2015
3,5286,Tomb Raider (2013),2013-03-05,4.06,4,2252,6,9844,86.0,11,680,2266,"[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","PC, PlayStation_4, PlayStation_3, Xbox_360, Xb...","(1, 61), (3, 259), (4, 1367), (5, 579)","Action, Adventure","Full controller support, Action-Adventure, Atm...",2013
4,5679,The Elder Scrolls V: Skyrim,2011-11-11,4.41,5,2719,10,9717,94.0,44,626,2734,"[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","PC, PlayStation_3, Xbox_360, Nintendo_Switch","(1, 56), (3, 240), (4, 899), (5, 1539)","Action, RPG","Action RPG, Partial Controller Support, Steam ...",2011
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
10888,51427,Tekken (mobile),2017-08-18,2.67,4,4,2,5,,0,279,6,"[{'platform': {'id': 4, 'name': 'iOS', 'slug':...","iOS, Android","(1, 2), (3, 2), (4, 2)",Fighting,"Story, versus, friends, battle, build, combat,...",2017
10889,31876,Bad Dudes,2018-03-21,3.17,3,6,0,5,,0,166,6,"[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","PC, Atari_ST, Commodore_/_Amiga, Apple_II, NES...","(3, 5), (4, 1)",Action,Multiplayer,2018
10893,387303,Ray Eager,2019-11-11,4.83,5,5,0,3,,0,258,6,"[{'platform': {'id': 1, 'name': 'PC', 'slug': ...",PC,"(4, 1), (5, 5)","Action, Massively Multiplayer","Commentary available, Includes level editor, S...",2019
10894,330541,Stalker Online,2019-10-15,2.83,3,5,0,3,,0,624,6,"[{'platform': {'id': 1, 'name': 'PC', 'slug': ...",PC,"(1, 2), (3, 2), (4, 1), (5, 1)","Action, RPG, Massively Multiplayer, Indie","Early Access, Free to Play, RPG, Violent, Onli...",2019
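
If the games without a release date should be kept rather than filtered out, the year can still be extracted. This is an alternative sketch on made-up data, not what the notebook does: missing dates become `NaT`, and the nullable `Int64` dtype keeps the years as whole numbers.

```python
import pandas as pd

toy = pd.DataFrame({
    'name': ['A', 'B', 'C'],
    'released': ['2013-09-17', None, '2019-11-11'],
})

# Unparseable or missing dates become NaT instead of raising an error
toy['released'] = pd.to_datetime(toy['released'], errors='coerce')

# .dt.year yields floats when NaT is present; Int64 stores them as integers with <NA>
toy['Year'] = toy['released'].dt.year.astype('Int64')

print(toy)
```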


In [66]:
year = df5.groupby("Year")

In [67]:
# To see exact amount of games released in each year
# Sorted by Year
year.size()

Year
1962       1
1971       1
1972       1
1976       1
1977       1
1978       2
1979       4
1980       5
1981       7
1982      10
1983      13
1984      19
1985      23
1986      23
1987      34
1988      25
1989      31
1990      47
1991      64
1992      76
1993     109
1994     109
1995     117
1996     106
1997     123
1998     138
1999     190
2000     165
2001     165
2002     160
2003     213
2004     210
2005     228
2006     240
2007     258
2008     323
2009     407
2010     446
2011     418
2012     537
2013     618
2014     852
2015    1012
2016    1141
2017     872
2018     615
2019     456
2020     127
2021       1
dtype: int64

To communicate this information more clearly and efficiently, we will plot it as graphs in the next part.
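
The same per-year counts can also be obtained without a `groupby` object. A small sketch on made-up years (`years` stands in for `df5['Year']`):

```python
import pandas as pd

years = pd.Series([2013, 2011, 2015, 2013, 2011, 2013], name='Year')

# Count how often each year occurs, then order by year rather than by count
per_year = years.value_counts().sort_index()

# The year with the most releases
peak_year = per_year.idxmax()

print(per_year)
print(peak_year)
```

On the real data, `idxmax()` should point to 2016, the year with 1141 games in the table above.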

## 5.3 Add in Metacritic Score Rank

In [68]:
df6 = df5.copy()

# sort by descending metacritic score (rows without a score are placed last)
df6.sort_values(by='metacritic', ascending=False, inplace=True)
df6.reset_index(drop=True, inplace=True)

# insert new rank column
df6.insert(0, 'rank', range(1, 1+len(df6)))

df6

Unnamed: 0,rank,id,name,released,rating,rating_top,ratings_count,reviews_text_count,added,metacritic,playtime,suggestions_count,reviews_count,parent_platforms,platforms,ratings,genres,tags,Year
0,1,25097,The Legend of Zelda: Ocarina of Time,1998-11-21,4.39,5,486,3,872,99.0,7,350,490,"[{'platform': {'id': 7, 'name': 'Nintendo', 's...","GameCube, Wii, Nintendo_64, Wii_U","(1, 31), (3, 32), (4, 110), (5, 317)","Action, Adventure, RPG",Singleplayer,1998
1,2,54751,Soulcalibur,1998-07-30,4.38,5,47,0,167,98.0,6,578,47,"[{'platform': {'id': 3, 'name': 'Xbox', 'slug'...","iOS, Android, Xbox_360, Dreamcast, Xbox_One","(3, 5), (4, 19), (5, 23)","Action, Fighting",2 players,1998
2,3,3498,Grand Theft Auto V,2013-09-17,4.48,5,3796,21,12383,97.0,68,422,3832,"[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","PlayStation_5, PC, PlayStation_4, PlayStation_...","(1, 64), (3, 226), (4, 1273), (5, 2269)","Action, Adventure","Full controller support, Steam Achievements, A...",2013
3,4,22511,The Legend of Zelda: Breath of the Wild,2017-03-02,4.56,5,1711,25,3284,97.0,121,344,1743,"[{'platform': {'id': 7, 'name': 'Nintendo', 's...","Nintendo_Switch, Wii_U","(1, 81), (3, 91), (4, 268), (5, 1303)","Action, Adventure, RPG","Action-Adventure, Open World, RPG, Sandbox, ex...",2017
4,5,27036,Super Mario Galaxy 2,2010-05-23,4.37,5,222,2,420,97.0,18,326,226,"[{'platform': {'id': 7, 'name': 'Nintendo', 's...","Wii, Wii_U","(1, 15), (3, 14), (4, 54), (5, 143)",Platformer,"Solo, princess, Gravity, mario, collect, baby,...",2010
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
10739,10740,51427,Tekken (mobile),2017-08-18,2.67,4,4,2,5,,0,279,6,"[{'platform': {'id': 4, 'name': 'iOS', 'slug':...","iOS, Android","(1, 2), (3, 2), (4, 2)",Fighting,"Story, versus, friends, battle, build, combat,...",2017
10740,10741,31876,Bad Dudes,2018-03-21,3.17,3,6,0,5,,0,166,6,"[{'platform': {'id': 1, 'name': 'PC', 'slug': ...","PC, Atari_ST, Commodore_/_Amiga, Apple_II, NES...","(3, 5), (4, 1)",Action,Multiplayer,2018
10741,10742,387303,Ray Eager,2019-11-11,4.83,5,5,0,3,,0,258,6,"[{'platform': {'id': 1, 'name': 'PC', 'slug': ...",PC,"(4, 1), (5, 5)","Action, Massively Multiplayer","Commentary available, Includes level editor, S...",2019
10742,10743,330541,Stalker Online,2019-10-15,2.83,3,5,0,3,,0,624,6,"[{'platform': {'id': 1, 'name': 'PC', 'slug': ...",PC,"(1, 2), (3, 2), (4, 1), (5, 1)","Action, RPG, Massively Multiplayer, Indie","Early Access, Free to Play, RPG, Violent, Onli...",2019


In [69]:
df6.to_csv('../Data/data5.csv', index=False)
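
Note that the positional rank above gives different ranks to games with the same score (the three games with a score of 97 get ranks 3, 4 and 5) and also numbers the games that have no score. If tied games should share a rank, `Series.rank` is one option. This is an alternative sketch on made-up data, not the method used in this notebook:

```python
import pandas as pd

toy = pd.DataFrame({
    'name': ['A', 'B', 'C', 'D'],
    'metacritic': [97.0, 99.0, 97.0, None],
})

# method='min': tied scores share the best rank of their group
# games without a score keep a missing rank
toy['rank'] = toy['metacritic'].rank(method='min', ascending=False).astype('Int64')

# Order by rank; missing ranks go last
toy = toy.sort_values('rank').reset_index(drop=True)

print(toy)
```

Which behaviour is preferable depends on how the rank is used later, for example whether it needs to be unique per game.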