<a href="https://colab.research.google.com/github/Jcc329/Jessica_DATA606/blob/main/Raw_data/Accessing_Steam_APIs.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Data 606 - Data Science Capstone
### Jessica Conroy

Project Stage: Data Acquisition

This notebook aims to access and request data from the Steam API and Steamspy API. 

### Accessing Steam Data Process

The first call to the steam API gets a list of all games currently or soon to be available on the Steam service.

This list is then converted into a pandas dataframe and cleaned by removing as many blank, test, or beta games as possible based on the name of the game. This is so that the final dataset doesn't contain new games that don't have enough review information, or 'games' that were created without any associated data (for example, by someone testing how to use the platform).

The final dataframe is then passed to a function define below. That function randomizes the dataframe using sklearn shuffle and then impliments 3 api calls for each appid in the list, adding the data for that game to a dictionary. The first API requests the general steam data, the second requests the top 20 reviews and associated review metadata, the third requests supplementary data available from the steamspy API. 

This loop runs for 6 hours and then ends. The goal being to collect a large random sample of games that I can then analyze while keeping in mind time limitations and rate limits.

The function then converts the final dictionary into a dataframe and returns that dataframe.

### Saving the data

Output data is saved as a CSV to my local machine.

### Primary Analysis

Basic descriptive statistics are run.

### Data cleaning

The final dataset contains several columns with many values (subdictionaries).
To handle these, I will identify all columns containing desired data, remove unneccessary columns and use the apply function to convert the multidimentional columns into their own dataframes that can be appened back onto the original dataframe. 

#### Text Cleaning

Any text data will undergo additional cleaning to prepare it for analysis, including converting the text to lowercase, removing symbols and punctuation, and generally tidying the data.

This concludes the goals of this notebook. A cleaned dataset will be save and the next stage of EDA will occur in the next notebook in this series. 

### Sources

Inspiration came from https://nik-davis.github.io/posts/2019/steam-data-collection/ 


In [1]:
!pip install steamspypi

Collecting steamspypi
  Downloading steamspypi-1.1.1-py3-none-any.whl (11 kB)
Installing collected packages: steamspypi
Successfully installed steamspypi-1.1.1


In [2]:
# standard library imports
import csv
import datetime as dt
import json
import os
import statistics
import time

# third-party imports
import numpy as np
import pandas as pd
import requests
import steamspypi
from sklearn.utils import shuffle

pd.set_option("max_columns", 100)
pd.set_option('display.max_rows', None)

# Stage 1: Collect all Game IDs and Clean

In [10]:
#Get all game ids and names
#URL call found here: https://partner.steamgames.com/doc/webapi/ISteamApps
URL = 'https://api.steampowered.com/ISteamApps/GetAppList/v2/'

response = requests.get(url=URL)
json_data = response.json()
GameIDs = pd.DataFrame.from_dict(json_data['applist']['apps'])
#Clean up the dataframe to remove empty strings and test/demo games
GameIDs['name'] = GameIDs['name'].str.strip()
GameIDs['name'] = GameIDs['name'].str.lower()
GameIDs = GameIDs[GameIDs['name'].isin(['','pieterw test app76 ( 216938 )','test2','test3', 'tidewoken public test', 
                                        'now testing: 407', 'test re(quietmansion1 special teaser)', '<h1>test</h1>', 
                                        'test', 'test project', 'steamvr performance test', 'testcontent', 'vrq test'
                                        ]) == False]
GameIDs = GameIDs[GameIDs['name'].str.contains('playtest')==False]
GameIDs = GameIDs[GameIDs['name'].str.contains('closed testing')==False]
GameIDs = GameIDs[GameIDs['name'].str.contains('testapp')==False]
GameIDs = GameIDs[GameIDs['name'].str.contains(' test ')==False]
GameIDs = GameIDs[GameIDs['name'].str.contains('betatest')==False]
GameIDs = GameIDs[GameIDs['name'].str.contains('test server')==False]
GameIDs = GameIDs[GameIDs['name'].str.contains('beta test')==False]
GameIDs = GameIDs[GameIDs['name'].str.contains('tidewoken public test')==False]
GameIDs = GameIDs[GameIDs['name'].str.contains('open test')==False]
GameIDs = GameIDs[GameIDs['name'].str.contains('dev test')==False]
GameIDs = GameIDs[GameIDs['name'].str.contains('- test')==False]
GameIDs = GameIDs[GameIDs['name'].str.contains('feature test')==False]
GameIDs = GameIDs[GameIDs['name'].str.contains('technical test')==False]
GameIDs = GameIDs[GameIDs['name'].str.contains('early access testing')==False]
GameIDs = GameIDs[GameIDs['name'].str.contains('_test')==False]
GameIDs = GameIDs[GameIDs['name'].str.contains(' demo')==False]
GameIDs = GameIDs[GameIDs['name'].str.contains('public test')==False]


In [None]:
GameIDs.shape

(125752, 2)

# Stage 2: Gather data for a Sample of the games

In [None]:
#Create function to collect data from APIs
def CollectSteamData(GameIDDF):
    '''
    input: dataframe containing IDs and names of games 
    output: dataframe containing all api data from a random sample of the games
    '''
    #Steam API 1: primary game data
    #https://stackoverflow.com/questions/69512319/steam-api-to-get-game-info
    #Steam API 2: Review data
    #https://partner.steamgames.com/doc/store/getreviews
    #Steamspy API: Supplemental usage and cost data
    # https://pypi.org/project/steamspypi/
    # https://steamspy.com/api.php
    
    #Randomize the data frame
    IDs = shuffle(GameIDDF)
    GameDict = {}
    starttime = time.time()
    for appid in IDs['appid']:
        try:
            gameURL = 'http://store.steampowered.com/api/appdetails?appids=' + str(appid)
            response = requests.get(url=gameURL)
            json_data = response.json()
            GameData = json_data[str(appid)]['data']
            time.sleep(1) # 1 second rate limit on API calls
            reviewURL = 'http://store.steampowered.com/appreviews/' + str(appid) + '?json=1'
            response = requests.get(url=reviewURL)
            json_data = response.json()
            ReviewScore = json_data['query_summary']['review_score']
            ReviewScoreDesc = json_data['query_summary']['review_score_desc']
            reviewText = ''
            for review in json_data['reviews']:
                reviewText = reviewText + review['review']
            
            ReviewDict = {'Review Score':ReviewScore, 'Review Score Description': ReviewScoreDesc, 'Top Reviews by Upvotes':reviewText}

            data_request = dict()
            data_request['request'] = 'appdetails'
            data_request['appid'] = str(appid)
            steamspydata = steamspypi.download(data_request)

            # Combine all three json dictionaries and convert to dataframe
            GameData.update(ReviewDict)
            GameData.update(steamspydata)
            time.sleep(1) # 1 second rate limit on API calls

        except: #games that do not have any associated data or other failed api calls
            time.sleep(1)
        endtime = time.time()
        elapsedtime = (endtime-starttime)/60
        if elapsedtime >= 360: #If Greater than or equal to 6 hours, then end
            break
        #add all data for current app loop to GameDict
        GameDict.update({str(appid): GameData})
    #Convert to Dataframe
    GameDF = pd.DataFrame.from_dict(GameDict, orient='index')

    return GameDF

In [None]:
Sample_Game_Data = CollectSteamData(GameIDs)

In [None]:
from google.colab import files
Sample_Game_Data.to_csv('RawSteamGameData.csv') 
files.download('RawSteamGameData.csv')

<IPython.core.display.Javascript object>

<IPython.core.display.Javascript object>

# Step 3: Explore and clean data
I ended up with 7,309 games and 62 fields upon initial data extraction. 

Several fields were dropped due to high number of nulls while others were dropped because they represented duplicate data or weren't relevant. 

For the remaining columns, fields containing multiple data values were expanded into individual columns. 

Only about 250 games in the sample had metacritic scores. I will therefore work primarily with review scores for predicting success, which are based on weighted scores of users leaving reviews.

In [None]:
Sample_Game_Data.shape

(7309, 62)

In [None]:
Sample_Game_Data.columns

Index(['type', 'name', 'steam_appid', 'required_age', 'is_free',
       'detailed_description', 'about_the_game', 'short_description',
       'supported_languages', 'header_image', 'website', 'pc_requirements',
       'mac_requirements', 'linux_requirements', 'developers', 'publishers',
       'price_overview', 'packages', 'package_groups', 'platforms',
       'categories', 'genres', 'screenshots', 'movies', 'release_date',
       'support_info', 'background', 'content_descriptors', 'Review Score',
       'Review Score Description', 'Top Reviews by Upvotes', 'appid',
       'developer', 'publisher', 'score_rank', 'positive', 'negative',
       'userscore', 'owners', 'average_forever', 'average_2weeks',
       'median_forever', 'median_2weeks', 'price', 'initialprice', 'discount',
       'ccu', 'languages', 'genre', 'tags', 'fullgame', 'reviews',
       'achievements', 'legal_notice', 'dlc', 'controller_support',
       'recommendations', 'ext_user_account_notice', 'demos', 'metacritic'

In [None]:
Sample_Game_Data.describe(include='all')

Unnamed: 0,type,name,steam_appid,required_age,is_free,detailed_description,about_the_game,short_description,supported_languages,header_image,website,pc_requirements,mac_requirements,linux_requirements,developers,publishers,price_overview,packages,package_groups,platforms,categories,genres,screenshots,movies,release_date,support_info,background,content_descriptors,Review Score,Review Score Description,Top Reviews by Upvotes,appid,developer,publisher,score_rank,positive,negative,userscore,owners,average_forever,average_2weeks,median_forever,median_2weeks,price,initialprice,discount,ccu,languages,genre,tags,fullgame,reviews,achievements,legal_notice,dlc,controller_support,recommendations,ext_user_account_notice,demos,metacritic,drm_notice,alternate_appid
count,7309,7308.0,7309.0,7309.0,7309,7309.0,7309.0,7309.0,7043,7309,4022,7309,7309,7309,6632,7309,5283,5366,7309,7309,7003,6507,6652,4338,7309,7309,7309.0,7309,7291.0,7291,7291.0,7291.0,7291.0,7291.0,7291.0,7291.0,7291.0,7291.0,7291,7291.0,7291.0,7291.0,7291.0,6779.0,6779.0,6779.0,7291.0,6779,7290.0,7291,3002,564,1809,2703,533,2205,814,66,391,250,40,1.0
unique,10,6712.0,,12.0,2,6084.0,6084.0,6526.0,1429,6661,2807,5049,1245,796,4472,3731,255,4928,4908,5,1314,685,6115,3978,2501,4306,6116.0,620,,19,3442.0,,4074.0,3364.0,4.0,,,,11,,,,,146.0,66.0,37.0,,1145,622.0,3336,1512,531,1652,2018,504,1,539,55,361,235,15,1.0
top,game,,,0.0,False,,,,English<strong>*</strong><br><strong>*</strong...,https://cdn.akamai.steamstatic.com/steam/apps/...,http://www.fantasygrounds.com,[],[],[],[TigerQiuQiu],[],"{'currency': 'USD', 'initial': 99, 'final': 99...",[130890],[],"{'windows': True, 'mac': False, 'linux': False}","[{'id': 2, 'description': 'Single-player'}]","[{'id': '1', 'description': 'Action'}]","[{'id': 0, 'path_thumbnail': 'https://cdn.akam...","[{'id': 256786325, 'name': 'Soul of Empress Tr...","{'coming_soon': False, 'date': ''}","{'url': '', 'email': ''}",,"{'ids': [], 'notes': None}",,No user reviews,,,,,,,,,"0 .. 20,000",,,,,0.0,0.0,0.0,,English,,[],"{'appid': '252690', 'name': 'Fantasy Grounds C...","“Spot on acting, intelligent music and an art ...",{'total': 0},© 2015 UBISOFT ENTERTAINMENT. ALL RIGHTS RESER...,"[579510, 881830]",full,{'total': 108},PlayFab (Supports Linking to Steam Account),"[{'appid': 1196960, 'description': ''}]","{'score': 79, 'url': 'https://www.metacritic.c...",Denuvo Anti-tamper<br>5 different PC within a ...,243580.0
freq,4089,7.0,,7038.0,6676,645.0,645.0,132.0,1949,11,119,735,4636,5213,151,1311,707,8,1980,5234,1138,353,5,4,94,806,657.0,6347,,3548,3548.0,,1144.0,1774.0,7276.0,,,,6382,,,,,1350.0,1350.0,6325.0,,3424,1272.0,3150,119,4,37,74,2,2205,11,3,2,3,14,1.0
mean,,,999509.7,,,,,,,,,,,,,,,,,,,,,,,,,,1.545878,,,999742.8,,,,813.404197,109.467151,0.169387,,64.316692,3.386778,57.441092,3.637361,,,,44.250171,,,,,,,,,,,,,,,
std,,,496969.8,,,,,,,,,,,,,,,,,,,,,,,,,,2.829286,,,497020.7,,,,14050.711455,2164.124272,3.767921,,1085.400584,55.440596,1050.501621,59.419759,,,,1134.643738,,,,,,,,,,,,,,,
min,,,70.0,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,,,70.0,,,,0.0,0.0,0.0,,0.0,0.0,0.0,0.0,,,,0.0,,,,,,,,,,,,,,,
25%,,,595754.0,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,,,595972.0,,,,0.0,0.0,0.0,,0.0,0.0,0.0,0.0,,,,0.0,,,,,,,,,,,,,,,
50%,,,962960.0,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,,,962990.0,,,,1.0,0.0,0.0,,0.0,0.0,0.0,0.0,,,,0.0,,,,,,,,,,,,,,,
75%,,,1427740.0,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,,,1428240.0,,,,20.0,6.0,0.0,,0.0,0.0,0.0,0.0,,,,0.0,,,,,,,,,,,,,,,


In [34]:
GameData.isnull().sum()

Unnamed: 0                     0
type                           0
name                           8
steam_appid                    0
required_age                   0
is_free                        0
detailed_description         645
about_the_game               645
short_description            132
supported_languages          266
header_image                   0
website                     3306
pc_requirements                0
mac_requirements               0
linux_requirements             0
developers                   677
publishers                     0
price_overview              2026
packages                    1943
package_groups                 0
platforms                      0
categories                   306
genres                       802
screenshots                  657
movies                      2971
release_date                   0
support_info                   0
background                   657
content_descriptors            0
Review Score                  18
Review Sco

In [4]:
#Expand columns containing multiple datapoints
#https://stackoverflow.com/questions/38231591/split-explode-a-column-of-dictionaries-into-separate-columns-with-pandas

GameData = pd.read_csv('RawSteamGameData.csv',  engine='python')

In [5]:
GameData.columns

Index(['Unnamed: 0', 'type', 'name', 'steam_appid', 'required_age', 'is_free',
       'detailed_description', 'about_the_game', 'short_description',
       'supported_languages', 'header_image', 'website', 'pc_requirements',
       'mac_requirements', 'linux_requirements', 'developers', 'publishers',
       'price_overview', 'packages', 'package_groups', 'platforms',
       'categories', 'genres', 'screenshots', 'movies', 'release_date',
       'support_info', 'background', 'content_descriptors', 'Review Score',
       'Review Score Description', 'Top Reviews by Upvotes', 'appid',
       'developer', 'publisher', 'score_rank', 'positive', 'negative',
       'userscore', 'owners', 'average_forever', 'average_2weeks',
       'median_forever', 'median_2weeks', 'price', 'initialprice', 'discount',
       'ccu', 'languages', 'genre', 'tags', 'fullgame', 'reviews',
       'achievements', 'legal_notice', 'dlc', 'controller_support',
       'recommendations', 'ext_user_account_notice', 'demos'

In [146]:
GameData.head(10)

Unnamed: 0.1,Unnamed: 0,type,name,steam_appid,required_age,is_free,detailed_description,about_the_game,short_description,supported_languages,header_image,website,pc_requirements,mac_requirements,linux_requirements,developers,publishers,price_overview,packages,package_groups,platforms,categories,genres,screenshots,movies,release_date,support_info,background,content_descriptors,Review Score,Review Score Description,Top Reviews by Upvotes,appid,developer,publisher,score_rank,positive,negative,userscore,owners,average_forever,average_2weeks,median_forever,median_2weeks,price,initialprice,discount,ccu,languages,genre,tags,fullgame,reviews,achievements,legal_notice,dlc,controller_support,recommendations,ext_user_account_notice,demos,metacritic,drm_notice,alternate_appid
0,1212600,game,Budo War Girl: maid of desire,1212600,0,False,"Young madam, AI beast girl, Chinese girl, Insa...","Young madam, AI beast girl, Chinese girl, Insa...",you will get 500 diamonds to start the shop.Ma...,"English, Japanese<strong>*</strong>, Simplifie...",https://cdn.akamai.steamstatic.com/steam/apps/...,,{'minimum': '<strong>Minimum:</strong><br><ul ...,[],[],['武道戰姬製作委員會'],['武道戰姬製作委員會'],"{'currency': 'USD', 'initial': 699, 'final': 6...",[419350],"[{'name': 'default', 'title': 'Buy Budo War Gi...","{'windows': True, 'mac': False, 'linux': False}","[{'id': 2, 'description': 'Single-player'}]","[{'id': '1', 'description': 'Action'}, {'id': ...","[{'id': 0, 'path_thumbnail': 'https://cdn.akam...","[{'id': 256770061, 'name': 'video', 'thumbnail...","{'coming_soon': False, 'date': 'Jan 23, 2020'}","{'url': '', 'email': 'liuyueyanfeng@vip.qq.com'}",https://cdn.akamai.steamstatic.com/steam/apps/...,"{'ids': [1, 3, 5], 'notes': 'The content of th...",0.0,5 user reviews,[h1]At a Glance[/h1]\n[table]\n [tr]\n ...,1212600.0,武道戰姬製作委員會,武道戰姬製作委員會,,0.0,0.0,0.0,"0 .. 20,000",0.0,0.0,0.0,0.0,699.0,699.0,0.0,0.0,"English, Japanese, Simplified Chinese, Traditi...","Action, Casual, Indie, RPG, Simulation",[],,,,,,,,,,,,
1,1453070,dlc,Sinister Halloween - Asylum DLC,1453070,0,False,During the events on Halloween night in Sinist...,During the events on Halloween night in Sinist...,"As a journalist, you receive a secret email fr...",English<strong>*</strong><br><strong>*</strong...,https://cdn.akamai.steamstatic.com/steam/apps/...,https://celeritasgames.mailchimpsites.com/sini...,{'minimum': '<strong>Minimum:</strong><br><ul ...,{'minimum': '<strong>Minimum:</strong><br><ul ...,{'minimum': '<strong>Minimum:</strong><br><ul ...,['Celeritas Games'],['Celeritas Games'],"{'currency': 'USD', 'initial': 399, 'final': 3...",[511374],"[{'name': 'default', 'title': 'Buy Sinister Ha...","{'windows': True, 'mac': False, 'linux': False}","[{'id': 2, 'description': 'Single-player'}, {'...","[{'id': '1', 'description': 'Action'}, {'id': ...","[{'id': 0, 'path_thumbnail': 'https://cdn.akam...","[{'id': 256808554, 'name': 'Asylum Trailer', '...","{'coming_soon': False, 'date': 'Oct 30, 2020'}","{'url': '', 'email': 'sinisterhalloween@outloo...",https://cdn.akamai.steamstatic.com/steam/apps/...,"{'ids': [2, 5], 'notes': 'Dead corpses\r\nBloo...",0.0,6 user reviews,So I wasn't sure if I should give this a posit...,1453070.0,Celeritas Games,Celeritas Games,,0.0,0.0,0.0,"0 .. 20,000",0.0,0.0,0.0,0.0,399.0,399.0,0.0,0.0,English,"Action, Adventure, Indie",[],"{'appid': '747690', 'name': 'Sinister Halloween'}",,,,,,,,,,,
2,384441,dlc,Sinister Halloween - Asylum DLC,1453070,0,False,During the events on Halloween night in Sinist...,During the events on Halloween night in Sinist...,"As a journalist, you receive a secret email fr...",English<strong>*</strong><br><strong>*</strong...,https://cdn.akamai.steamstatic.com/steam/apps/...,https://celeritasgames.mailchimpsites.com/sini...,{'minimum': '<strong>Minimum:</strong><br><ul ...,{'minimum': '<strong>Minimum:</strong><br><ul ...,{'minimum': '<strong>Minimum:</strong><br><ul ...,['Celeritas Games'],['Celeritas Games'],"{'currency': 'USD', 'initial': 399, 'final': 3...",[511374],"[{'name': 'default', 'title': 'Buy Sinister Ha...","{'windows': True, 'mac': False, 'linux': False}","[{'id': 2, 'description': 'Single-player'}, {'...","[{'id': '1', 'description': 'Action'}, {'id': ...","[{'id': 0, 'path_thumbnail': 'https://cdn.akam...","[{'id': 256808554, 'name': 'Asylum Trailer', '...","{'coming_soon': False, 'date': 'Oct 30, 2020'}","{'url': '', 'email': 'sinisterhalloween@outloo...",https://cdn.akamai.steamstatic.com/steam/apps/...,"{'ids': [2, 5], 'notes': 'Dead corpses\r\nBloo...",0.0,6 user reviews,So I wasn't sure if I should give this a posit...,1453070.0,Celeritas Games,Celeritas Games,,0.0,0.0,0.0,"0 .. 20,000",0.0,0.0,0.0,0.0,399.0,399.0,0.0,0.0,English,"Action, Adventure, Indie",[],"{'appid': '747690', 'name': 'Sinister Halloween'}",,,,,,,,,,,
3,1644840,game,Kittens and Yarn,1644840,0,False,<h1>🐾 Also Recommended for You: 🐾</h1><p><a hr...,Lots of cuteness and mess! That's exactly what...,Move the pieces of place and untangle the wool...,"English, Portuguese - Brazil, Simplified Chinese",https://cdn.akamai.steamstatic.com/steam/apps/...,,{'minimum': '<strong>Minimum:</strong><br><ul ...,[],[],['Pinel Games'],['Pinel Games'],"{'currency': 'USD', 'initial': 199, 'final': 1...",[584471],"[{'name': 'default', 'title': 'Buy Kittens and...","{'windows': True, 'mac': False, 'linux': False}","[{'id': 2, 'description': 'Single-player'}, {'...","[{'id': '4', 'description': 'Casual'}]","[{'id': 0, 'path_thumbnail': 'https://cdn.akam...","[{'id': 256838348, 'name': 'Kittens and Yarn T...","{'coming_soon': False, 'date': 'Jun 17, 2021'}","{'url': 'https://pinelgames.com', 'email': 'co...",https://cdn.akamai.steamstatic.com/steam/apps/...,"{'ids': [], 'notes': None}",8.0,Very Positive,A sweet little time waster when you just want ...,1644840.0,Pinel Games,Pinel Games,,94.0,5.0,0.0,"0 .. 20,000",0.0,0.0,0.0,0.0,199.0,199.0,0.0,0.0,"English, Portuguese - Brazil, Simplified Chinese",Casual,"{'Cats': 113, 'Cute': 107, 'Minimalist': 102, ...",,“This game had me smiling during a tough time ...,"{'total': 8, 'highlighted': [{'name': 'Meet Fl...",,,,,,,,,
4,429040,game,Furfly,429040,0,False,"Furfly is a fascinating, black and white world...","Furfly is a fascinating, black and white world...",Help the little furry friend to escape the evi...,English<strong>*</strong><br><strong>*</strong...,https://cdn.akamai.steamstatic.com/steam/apps/...,http://www.smd-gaming-studio.com/games/furfly/,{'minimum': '<strong>Minimum:</strong><br><ul ...,{'minimum': '<strong>Minimum:</strong><br><ul ...,{'minimum': '<strong>Minimum:</strong><br><ul ...,['SMD Gaming Studio'],['SMD Gaming Studio'],"{'currency': 'USD', 'initial': 499, 'final': 4...",[88788],"[{'name': 'default', 'title': 'Buy Furfly', 'd...","{'windows': True, 'mac': False, 'linux': False}","[{'id': 2, 'description': 'Single-player'}, {'...","[{'id': '1', 'description': 'Action'}, {'id': ...","[{'id': 0, 'path_thumbnail': 'https://cdn.akam...","[{'id': 256659195, 'name': 'Furfly Trailer', '...","{'coming_soon': False, 'date': 'Dec 18, 2015'}",{'url': 'http://www.smd-gaming-studio.com/cont...,https://cdn.akamai.steamstatic.com/steam/apps/...,"{'ids': [], 'notes': None}",0.0,9 user reviews,This is actually a very tough game to play. Th...,429040.0,SMD Gaming Studio,SMD Gaming Studio,,16.0,4.0,0.0,"0 .. 20,000",0.0,0.0,0.0,0.0,499.0,499.0,0.0,0.0,English,"Action, Adventure, Casual, Indie","{'2D': 289, 'Time Attack': 283, 'Difficult': 2...",,"7.3 (GOOD) – <a href=""http://ro.ign.com/furfly...",,2015 - SMD Gaming Studio - Developed by Serban...,,,,,,,,
5,80670,movie,ARMA II - CDF Trailer (ESRB),80670,0,False,,,,,https://cdn.akamai.steamstatic.com/steam/apps/...,,[],[],[],,[''],,,[],"{'windows': True, 'mac': False, 'linux': False}",,,,"[{'id': 80670, 'name': 'ARMA II - CDF Trailer ...","{'coming_soon': False, 'date': ''}","{'url': '', 'email': ''}",,"{'ids': [], 'notes': None}",0.0,No user reviews,,80670.0,,,,1731.0,357.0,0.0,"0 .. 20,000",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,,"{'Simulation': 523, 'Military': 429, 'Action':...",,,,,,,,,,,,
6,1581518,dlc,Dominion - Adventures,1581518,0,False,"<img src=""https://cdn.akamai.steamstatic.com/s...","<img src=""https://cdn.akamai.steamstatic.com/s...",This is the 9th expansion to the game of Domin...,English<strong>*</strong><br><strong>*</strong...,https://cdn.akamai.steamstatic.com/steam/apps/...,,{'minimum': '<strong>Minimum:</strong><br><ul ...,[],[],['Temple Gates Games'],['Temple Gates Games'],"{'currency': 'USD', 'initial': 999, 'final': 9...",[559542],"[{'name': 'default', 'title': 'Buy Dominion - ...","{'windows': True, 'mac': False, 'linux': False}","[{'id': 1, 'description': 'Multi-player'}, {'i...","[{'id': '2', 'description': 'Strategy'}]","[{'id': 0, 'path_thumbnail': 'https://cdn.akam...",,"{'coming_soon': False, 'date': 'Oct 4, 2021'}","{'url': '', 'email': 'info@templegatesgames.com'}",https://cdn.akamai.steamstatic.com/steam/apps/...,"{'ids': [], 'notes': None}",0.0,No user reviews,,1581518.0,Temple Gates Games,Temple Gates Games,,0.0,0.0,0.0,"0 .. 20,000",0.0,0.0,0.0,0.0,999.0,999.0,0.0,0.0,English,Strategy,[],"{'appid': '1131620', 'name': 'Dominion'}",“I tremendously enjoy the event cards. Some of...,,,,,,,,,,
7,1153530,game,Old School Maze,1153530,0,False,"Old School Maze - a series of mazes, made in t...","Old School Maze - a series of mazes, made in t...",Get out of the maze of nostalgia.,English<strong>*</strong><br><strong>*</strong...,https://cdn.akamai.steamstatic.com/steam/apps/...,,{'minimum': '<strong>Minimum:</strong><br><ul ...,[],[],['CSM'],['W. T. B.'],,,[],"{'windows': True, 'mac': False, 'linux': False}","[{'id': 2, 'description': 'Single-player'}]","[{'id': '25', 'description': 'Adventure'}, {'i...","[{'id': 0, 'path_thumbnail': 'https://cdn.akam...",,"{'coming_soon': True, 'date': 'Sep 25, 2019'}","{'url': '', 'email': 'excellente23@gmail.com'}",https://cdn.akamai.steamstatic.com/steam/apps/...,"{'ids': [], 'notes': None}",0.0,No user reviews,,1153530.0,CSM,W. T. B.,,0.0,0.0,0.0,"0 .. 20,000",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,English,"Adventure, Casual, Indie, Early Access",[],,,,,[1154870],,,,,,,
8,686200,game,Door Kickers: Action Squad,686200,0,False,<strong>Door Kickers: Action Squad</strong> is...,<strong>Door Kickers: Action Squad</strong> is...,"Rescue hostages, disarm bombs and save the day...","English<strong>*</strong>, French, German, Spa...",https://cdn.akamai.steamstatic.com/steam/apps/...,https://inthekillhouse.com/actionsquad/,{'minimum': '<strong>Minimum:</strong><br><ul ...,[],[],"['PixelShard', 'KillHouse Games']",['KillHouse Games'],"{'currency': 'USD', 'initial': 1399, 'final': ...","[196428, 230324]","[{'name': 'default', 'title': 'Buy Door Kicker...","{'windows': True, 'mac': False, 'linux': False}","[{'id': 2, 'description': 'Single-player'}, {'...","[{'id': '1', 'description': 'Action'}, {'id': ...","[{'id': 0, 'path_thumbnail': 'https://cdn.akam...","[{'id': 256733266, 'name': 'Halloween Update',...","{'coming_soon': False, 'date': 'Sep 10, 2018'}","{'url': 'https://inthekillhouse.com/contact', ...",https://cdn.akamai.steamstatic.com/steam/apps/...,"{'ids': [], 'notes': None}",8.0,Very Positive,It is a very good and fun game! \n\nYou can pl...,686200.0,"PixelShard, KillHouse Games",KillHouse Games,,6839.0,377.0,0.0,"200,000 .. 500,000",320.0,43.0,182.0,72.0,279.0,1399.0,80.0,516.0,"English, French, German, Spanish - Spain, Russ...","Action, Casual, Indie, Simulation, Strategy","{'Action': 197, 'Pixel Graphics': 196, 'Co-op'...",,“Door Kickers: Action Squad is like a 2-player...,"{'total': 51, 'highlighted': [{'name': 'Heatin...",,"[1788610, 926800, 1286400]",full,{'total': 6225},,,,,
9,1637530,dlc,Crossout - The Creation,1637530,0,False,— Armored car “First prototype”;<br>— Unique p...,— Armored car “First prototype”;<br>— Unique p...,— Armored car “First prototype”; — Unique port...,"English, Russian, German, Spanish - Spain, Jap...",https://cdn.akamai.steamstatic.com/steam/apps/...,http://crossout.net/,{'minimum': '<strong>Minimum:</strong><br><ul ...,[],[],['Targem Games'],['Gaijin Distribution KFT'],"{'currency': 'USD', 'initial': 6999, 'final': ...",[581612],"[{'name': 'default', 'title': 'Buy Crossout - ...","{'windows': True, 'mac': False, 'linux': False}","[{'id': 1, 'description': 'Multi-player'}, {'i...","[{'id': '1', 'description': 'Action'}, {'id': ...","[{'id': 0, 'path_thumbnail': 'https://cdn.akam...","[{'id': 256846504, 'name': 'Crossout — The Cre...","{'coming_soon': False, 'date': 'Aug 9, 2021'}","{'url': 'http://support.gaijinent.com', 'email...",https://cdn.akamai.steamstatic.com/steam/apps/...,"{'ids': [], 'notes': None}",0.0,No user reviews,,1637530.0,Targem Games,Gaijin Distribution KFT,,0.0,0.0,0.0,"0 .. 20,000",0.0,0.0,0.0,0.0,6999.0,6999.0,0.0,0.0,"English, Russian, German, Spanish - Spain, Jap...","Action, Free to Play, Massively Multiplayer, R...",[],"{'appid': '386180', 'name': 'Crossout'}",,,Developed Targem Games and published by Gaijin...,,full,,,,,,


In [6]:
#drop unncessary fields
# Dropping 'drm_notice', 'alternate_appid','score_rank', 'ext_user_account_notice', 'demos', 'dlc',  high number of missing data
# Dropping 'legal_notice', 'header_image', 'website', 'pc_requirements', 'packages','mac_requirements', 'linux_requirements', 'screenshots', 'movies', due to irrelevance
#Dropping duplicated developer, publisher


GameData = GameData[['Unnamed: 0', 'type', 'name', 'steam_appid', 'required_age', 'is_free',
       'detailed_description', 'about_the_game', 'short_description',
       'supported_languages', 'developers', 'publishers',
       'price_overview', 'packages', 'platforms',
       'categories', 'genres', 'release_date','content_descriptors', 'Review Score',
       'Review Score Description', 'Top Reviews by Upvotes', 'appid',
       'positive', 'negative',
       'userscore', 'owners', 'average_forever', 'average_2weeks',
       'median_forever', 'median_2weeks', 'price', 'initialprice', 'discount',
       'ccu', 'languages', 'genre', 'tags', 'fullgame', 'reviews',
       'achievements', 'dlc', 'controller_support',
       'recommendations','metacritic'
       ]]

In [7]:
 #handle dictionaries


#  'price_overview', 'platforms', 'categories', 'genres','release_date', 'recommendations'
# #lists
# 'genre', 'tags'
#  #convert acheivements to has or doesn't have

#  dlc = downloadable content

In [8]:
GameData.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7309 entries, 0 to 7308
Data columns (total 45 columns):
 #   Column                    Non-Null Count  Dtype  
---  ------                    --------------  -----  
 0   Unnamed: 0                7309 non-null   int64  
 1   type                      7309 non-null   object 
 2   name                      7301 non-null   object 
 3   steam_appid               7309 non-null   int64  
 4   required_age              7309 non-null   object 
 5   is_free                   7309 non-null   bool   
 6   detailed_description      6664 non-null   object 
 7   about_the_game            6664 non-null   object 
 8   short_description         7177 non-null   object 
 9   supported_languages       7043 non-null   object 
 10  developers                6632 non-null   object 
 11  publishers                7309 non-null   object 
 12  price_overview            5283 non-null   object 
 13  packages                  5366 non-null   object 
 14  platform

In [9]:
# create a function to convert the lists of dictionaries stored as strings to dictionaries for processing

def makelist(DictStrList):
    ''' 
    Takes list of dictionaries stored as a string and converts to a list of dictionaries
    '''
    try:    
        if len(DictStrList.split(', '))==2:
            x = eval(DictStrList)
        else:
            x = list(eval(DictStrList))
    except: 
        x = DictStrList
        # print(DictStrList)
    return x

In [10]:
# Handle categories
GameData['categories'] = GameData['categories'].apply(makelist)
GameData = GameData.explode('categories')
# categories = GameData['categories'].apply(pd.Series)
GameData = pd.concat([GameData.drop(['categories'], axis=1), GameData['categories'].apply(pd.Series)], axis=1)

#pivot so each category has it's own column
# GameData = GameData.pivot()

In [13]:
GameData.reset_index().pivot(columns = ['description'], values = ['id'])

In [259]:
#handle genres



count    0.0
mean     NaN
std      NaN
min      NaN
25%      NaN
50%      NaN
75%      NaN
max      NaN
Name: 0, dtype: float64

In [198]:
GameData['genres'] = GameData['genres'].str.strip('][')
GameData['genres'] = GameData['genres'].apply(makelist)
GameData.head()

Unnamed: 0.1,Unnamed: 0,type,name,steam_appid,required_age,is_free,detailed_description,about_the_game,short_description,supported_languages,header_image,website,pc_requirements,mac_requirements,linux_requirements,developers,publishers,price_overview,packages,package_groups,platforms,categories,genres,screenshots,movies,release_date,support_info,background,content_descriptors,Review Score,Review Score Description,Top Reviews by Upvotes,appid,developer,publisher,score_rank,positive,negative,userscore,owners,average_forever,average_2weeks,median_forever,median_2weeks,price,initialprice,discount,ccu,languages,genre,tags,fullgame,reviews,achievements,legal_notice,dlc,controller_support,recommendations,ext_user_account_notice,demos,metacritic,drm_notice,alternate_appid
0,1212600,game,Budo War Girl: maid of desire,1212600,0,False,"Young madam, AI beast girl, Chinese girl, Insa...","Young madam, AI beast girl, Chinese girl, Insa...",you will get 500 diamonds to start the shop.Ma...,"English, Japanese<strong>*</strong>, Simplifie...",https://cdn.akamai.steamstatic.com/steam/apps/...,,{'minimum': '<strong>Minimum:</strong><br><ul ...,[],[],['武道戰姬製作委員會'],['武道戰姬製作委員會'],"{'currency': 'USD', 'initial': 699, 'final': 6...",[419350],"[{'name': 'default', 'title': 'Buy Budo War Gi...","{'windows': True, 'mac': False, 'linux': False}","[{'id': 2, 'description': 'Single-player'}]","[{'id': '1', 'description': 'Action'}, {'id': ...","[{'id': 0, 'path_thumbnail': 'https://cdn.akam...","[{'id': 256770061, 'name': 'video', 'thumbnail...","{'coming_soon': False, 'date': 'Jan 23, 2020'}","{'url': '', 'email': 'liuyueyanfeng@vip.qq.com'}",https://cdn.akamai.steamstatic.com/steam/apps/...,"{'ids': [1, 3, 5], 'notes': 'The content of th...",0.0,5 user reviews,[h1]At a Glance[/h1]\n[table]\n [tr]\n ...,1212600.0,武道戰姬製作委員會,武道戰姬製作委員會,,0.0,0.0,0.0,"0 .. 20,000",0.0,0.0,0.0,0.0,699.0,699.0,0.0,0.0,"English, Japanese, Simplified Chinese, Traditi...","Action, Casual, Indie, RPG, Simulation",[],,,,,,,,,,,,
1,1453070,dlc,Sinister Halloween - Asylum DLC,1453070,0,False,During the events on Halloween night in Sinist...,During the events on Halloween night in Sinist...,"As a journalist, you receive a secret email fr...",English<strong>*</strong><br><strong>*</strong...,https://cdn.akamai.steamstatic.com/steam/apps/...,https://celeritasgames.mailchimpsites.com/sini...,{'minimum': '<strong>Minimum:</strong><br><ul ...,{'minimum': '<strong>Minimum:</strong><br><ul ...,{'minimum': '<strong>Minimum:</strong><br><ul ...,['Celeritas Games'],['Celeritas Games'],"{'currency': 'USD', 'initial': 399, 'final': 3...",[511374],"[{'name': 'default', 'title': 'Buy Sinister Ha...","{'windows': True, 'mac': False, 'linux': False}","[{'id': 2, 'description': 'Single-player'}, {'...","[{'id': '1', 'description': 'Action'}, {'id': ...","[{'id': 0, 'path_thumbnail': 'https://cdn.akam...","[{'id': 256808554, 'name': 'Asylum Trailer', '...","{'coming_soon': False, 'date': 'Oct 30, 2020'}","{'url': '', 'email': 'sinisterhalloween@outloo...",https://cdn.akamai.steamstatic.com/steam/apps/...,"{'ids': [2, 5], 'notes': 'Dead corpses\r\nBloo...",0.0,6 user reviews,So I wasn't sure if I should give this a posit...,1453070.0,Celeritas Games,Celeritas Games,,0.0,0.0,0.0,"0 .. 20,000",0.0,0.0,0.0,0.0,399.0,399.0,0.0,0.0,English,"Action, Adventure, Indie",[],"{'appid': '747690', 'name': 'Sinister Halloween'}",,,,,,,,,,,
2,384441,dlc,Sinister Halloween - Asylum DLC,1453070,0,False,During the events on Halloween night in Sinist...,During the events on Halloween night in Sinist...,"As a journalist, you receive a secret email fr...",English<strong>*</strong><br><strong>*</strong...,https://cdn.akamai.steamstatic.com/steam/apps/...,https://celeritasgames.mailchimpsites.com/sini...,{'minimum': '<strong>Minimum:</strong><br><ul ...,{'minimum': '<strong>Minimum:</strong><br><ul ...,{'minimum': '<strong>Minimum:</strong><br><ul ...,['Celeritas Games'],['Celeritas Games'],"{'currency': 'USD', 'initial': 399, 'final': 3...",[511374],"[{'name': 'default', 'title': 'Buy Sinister Ha...","{'windows': True, 'mac': False, 'linux': False}","[{'id': 2, 'description': 'Single-player'}, {'...","[{'id': '1', 'description': 'Action'}, {'id': ...","[{'id': 0, 'path_thumbnail': 'https://cdn.akam...","[{'id': 256808554, 'name': 'Asylum Trailer', '...","{'coming_soon': False, 'date': 'Oct 30, 2020'}","{'url': '', 'email': 'sinisterhalloween@outloo...",https://cdn.akamai.steamstatic.com/steam/apps/...,"{'ids': [2, 5], 'notes': 'Dead corpses\r\nBloo...",0.0,6 user reviews,So I wasn't sure if I should give this a posit...,1453070.0,Celeritas Games,Celeritas Games,,0.0,0.0,0.0,"0 .. 20,000",0.0,0.0,0.0,0.0,399.0,399.0,0.0,0.0,English,"Action, Adventure, Indie",[],"{'appid': '747690', 'name': 'Sinister Halloween'}",,,,,,,,,,,
3,1644840,game,Kittens and Yarn,1644840,0,False,<h1>🐾 Also Recommended for You: 🐾</h1><p><a hr...,Lots of cuteness and mess! That's exactly what...,Move the pieces of place and untangle the wool...,"English, Portuguese - Brazil, Simplified Chinese",https://cdn.akamai.steamstatic.com/steam/apps/...,,{'minimum': '<strong>Minimum:</strong><br><ul ...,[],[],['Pinel Games'],['Pinel Games'],"{'currency': 'USD', 'initial': 199, 'final': 1...",[584471],"[{'name': 'default', 'title': 'Buy Kittens and...","{'windows': True, 'mac': False, 'linux': False}","[{'id': 2, 'description': 'Single-player'}, {'...","{'id': '4', 'description': 'Casual'}","[{'id': 0, 'path_thumbnail': 'https://cdn.akam...","[{'id': 256838348, 'name': 'Kittens and Yarn T...","{'coming_soon': False, 'date': 'Jun 17, 2021'}","{'url': 'https://pinelgames.com', 'email': 'co...",https://cdn.akamai.steamstatic.com/steam/apps/...,"{'ids': [], 'notes': None}",8.0,Very Positive,A sweet little time waster when you just want ...,1644840.0,Pinel Games,Pinel Games,,94.0,5.0,0.0,"0 .. 20,000",0.0,0.0,0.0,0.0,199.0,199.0,0.0,0.0,"English, Portuguese - Brazil, Simplified Chinese",Casual,"{'Cats': 113, 'Cute': 107, 'Minimalist': 102, ...",,“This game had me smiling during a tough time ...,"{'total': 8, 'highlighted': [{'name': 'Meet Fl...",,,,,,,,,
4,429040,game,Furfly,429040,0,False,"Furfly is a fascinating, black and white world...","Furfly is a fascinating, black and white world...",Help the little furry friend to escape the evi...,English<strong>*</strong><br><strong>*</strong...,https://cdn.akamai.steamstatic.com/steam/apps/...,http://www.smd-gaming-studio.com/games/furfly/,{'minimum': '<strong>Minimum:</strong><br><ul ...,{'minimum': '<strong>Minimum:</strong><br><ul ...,{'minimum': '<strong>Minimum:</strong><br><ul ...,['SMD Gaming Studio'],['SMD Gaming Studio'],"{'currency': 'USD', 'initial': 499, 'final': 4...",[88788],"[{'name': 'default', 'title': 'Buy Furfly', 'd...","{'windows': True, 'mac': False, 'linux': False}","[{'id': 2, 'description': 'Single-player'}, {'...","[{'id': '1', 'description': 'Action'}, {'id': ...","[{'id': 0, 'path_thumbnail': 'https://cdn.akam...","[{'id': 256659195, 'name': 'Furfly Trailer', '...","{'coming_soon': False, 'date': 'Dec 18, 2015'}",{'url': 'http://www.smd-gaming-studio.com/cont...,https://cdn.akamai.steamstatic.com/steam/apps/...,"{'ids': [], 'notes': None}",0.0,9 user reviews,This is actually a very tough game to play. Th...,429040.0,SMD Gaming Studio,SMD Gaming Studio,,16.0,4.0,0.0,"0 .. 20,000",0.0,0.0,0.0,0.0,499.0,499.0,0.0,0.0,English,"Action, Adventure, Casual, Indie","{'2D': 289, 'Time Attack': 283, 'Difficult': 2...",,"7.3 (GOOD) – <a href=""http://ro.ign.com/furfly...",,2015 - SMD Gaming Studio - Developed by Serban...,,,,,,,,


In [41]:
price_data = GameData['price_overview'].apply(pd.Series)
platforms = GameData['platforms'].apply(pd.Series)
categories = pd.DataFrame(GameData['categories'].tolist())
categories

Unnamed: 0,0
0,"[{'id': 2, 'description': 'Single-player'}]"
1,"[{'id': 2, 'description': 'Single-player'}, {'..."
2,"[{'id': 2, 'description': 'Single-player'}, {'..."
3,"[{'id': 2, 'description': 'Single-player'}, {'..."
4,"[{'id': 2, 'description': 'Single-player'}, {'..."
5,
6,"[{'id': 1, 'description': 'Multi-player'}, {'i..."
7,"[{'id': 2, 'description': 'Single-player'}]"
8,"[{'id': 2, 'description': 'Single-player'}, {'..."
9,"[{'id': 1, 'description': 'Multi-player'}, {'i..."
